Lecture 5 part B

Classic CNN Architectures

Dana Erlich 30/04/2018

Outline
• Backpropagation of convolution
• Objectives and Introduction
• LeNet-5
• AlexNet
• VGG
• GoogleNet
• ResNet
Backpropagation of convolution

Slide taken from Forward And Backpropagation in Convolutional Neural Network. - Medium
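To calculate the gradients of the error ‘E’ with respect to the
filter ‘F’, the following equation needs to be solved, where O is the
output of convolving the input X with F (stride 1, no padding,
O_{ij} = \sum_{a}\sum_{b} X_{(i+a-1)(j+b-1)} F_{ab}):

\frac{\partial E}{\partial F_{ab}} = \sum_{i}\sum_{j} \frac{\partial E}{\partial O_{ij}} \frac{\partial O_{ij}}{\partial F_{ab}}

Which evaluates to:

\frac{\partial E}{\partial F_{ab}} = \sum_{i}\sum_{j} \frac{\partial E}{\partial O_{ij}} X_{(i+a-1)(j+b-1)}

If we look closely, the previous equation can be written in the
form of our convolution operation: the gradient with respect to F is
the convolution of the input X with the output gradient \partial E / \partial O.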

Similarly, we can find the gradients of the error ‘E’ with
respect to the input matrix ‘X’, where O is the output of the
convolution:

\frac{\partial E}{\partial X_{pq}} = \sum_{i}\sum_{j} \frac{\partial E}{\partial O_{ij}} F_{(p-i+1)(q-j+1)}
(terms whose filter indices fall outside the filter are dropped)

The previous computation can be obtained by a different
type of convolution operation known as full convolution.

In order to obtain the gradients of the input matrix, we need
to rotate the filter by 180 degrees and calculate the full
convolution of the rotated filter with the gradients of the
error with respect to the output.

Rotating the filter by 180 degrees (flip along x, then along y):

F11 F12    flip x    F12 F11    flip y    F22 F21
F21 F22   ------->   F22 F21   ------->   F12 F11

Slide taken from Forward And Backpropagation in Convolutional Neural Network. - Medium
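
As a rough numerical illustration of the two gradient rules above, the sketch below uses plain Python lists for a single-channel, stride-1 convolution without padding (computed as cross-correlation, which is an assumption about the convention). The helper names `conv2d_valid`, `rot180` and `full_conv2d`, and the values of `X`, `F` and `dO`, are made up for the example.

```python
def conv2d_valid(x, f):
    """Stride-1 convolution without padding (cross-correlation form)."""
    n, m = len(x), len(x[0])
    k, l = len(f), len(f[0])
    return [[sum(x[i + a][j + b] * f[a][b] for a in range(k) for b in range(l))
             for j in range(m - l + 1)]
            for i in range(n - k + 1)]


def rot180(f):
    """Rotate a filter by 180 degrees (flip rows, then columns)."""
    return [row[::-1] for row in f[::-1]]


def full_conv2d(g, f):
    """Full convolution: zero-pad g by (filter size - 1) on every side first."""
    k, l = len(f), len(f[0])
    width = len(g[0]) + 2 * (l - 1)
    padded = ([[0.0] * width for _ in range(k - 1)]
              + [[0.0] * (l - 1) + list(row) + [0.0] * (l - 1) for row in g]
              + [[0.0] * width for _ in range(k - 1)])
    return conv2d_valid(padded, f)


X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]   # 3x3 input
F = [[1.0, 0.0], [-1.0, 2.0]]                             # 2x2 filter
dO = [[1.0, -1.0], [0.5, 2.0]]                            # made-up dE/dO (2x2)

dF = conv2d_valid(X, dO)          # gradient w.r.t. the filter
dX = full_conv2d(dO, rot180(F))   # gradient w.r.t. the input
print(dF)
print(dX)
```

Note that `dF` has the shape of the filter and `dX` has the shape of the input, which is a quick sanity check on both rules.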
Backpropagation of max pooling
Suppose you have a matrix M of four elements:

a b
c d
and maxpool(M) returns d.
Then, the maxpool function really only depends on d.
So, the derivative of maxpool relative to d is 1, and its
derivative relative to a,b,c is zero. So you
backpropagate 1 to the unit corresponding to d, and
you backpropagate zero for the other units.
Slide taken from Forward And Backpropagation in Convolutional Neural Network. - Medium
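
The routing rule for max pooling can be sketched for a single 2x2 window as follows. The function name `maxpool2x2_backward` is illustrative, and sending the gradient to the first maximum when there is a tie is an assumption of this sketch.

```python
def maxpool2x2_backward(window, upstream_grad):
    """Send the upstream gradient to the max element; all others get zero."""
    flat = [v for row in window for v in row]
    winner = flat.index(max(flat))   # first max wins on ties (assumption)
    grads = [0.0] * 4
    grads[winner] = upstream_grad
    return [grads[0:2], grads[2:4]]


M = [[1.0, 3.0],
     [2.0, 7.0]]                     # a, b, c, d with d the largest
grad_M = maxpool2x2_backward(M, 1.0)
print(grad_M)
```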
Objectives
• We will examine classic CNN architectures
with the goal of:
- Gaining intuition for building CNNs
- Reusing CNN architectures
LeNet-5
• Gradient Based Learning Applied To Document
Recognition - Y. Lecun, L. Bottou, Y. Bengio, P. Haffner;
1998
• Helped establish how we use CNNs today
• Replaced manual feature extraction

[LeCun et al., 1998]

LeNet-5
32x32x1
-> conv 5x5, s=1 -> 28x28x6
-> avg pool f=2, s=2 -> 14x14x6
-> conv 5x5, s=1 -> 10x10x16
-> avg pool f=2, s=2 -> 5x5x16
-> FC 120 -> FC 84 -> ŷ (10 outputs)

Reminder:
Output size = (N+2P-F)/stride + 1
This slide is taken from Andrew Ng [LeCun et al., 1998]
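
The reminder formula can be applied layer by layer to reproduce the LeNet-5 spatial sizes on this slide. This is a small sketch; the function name `conv_output_size` is illustrative, and integer division is assumed for cases where the stride does not divide evenly.

```python
def conv_output_size(n, f, stride=1, pad=0):
    """Output size = (N + 2P - F) / stride + 1."""
    return (n + 2 * pad - f) // stride + 1


# (filter size, stride) for conv, avg pool, conv, avg pool
lenet_layers = [(5, 1), (2, 2), (5, 1), (2, 2)]

size = 32
trace = [size]
for f, s in lenet_layers:
    size = conv_output_size(size, f, stride=s)
    trace.append(size)
print(trace)
```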
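LeNet-5
• Only 60K parameters
• As we go deeper in the network, the height and width shrink
(32 -> 28 -> 14 -> 10 -> 5) while the number of channels grows
(1 -> 6 -> 16)
• General structure:
conv->pool->conv->pool->FC->FC->output

• Different filters look at different channels

• Sigmoid and Tanh nonlinearity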

[LeCun et al., 1998]
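
A rough way to see where the 60K figure comes from is to count weights and biases for the layer sizes shown earlier. This sketch assumes every conv filter spans all input channels, one bias per unit, and no parameters in the pooling layers; the original network differs in some details, so the count is only approximate.

```python
# (name, inputs per unit, number of units)
layers = [
    ("conv1", 5 * 5 * 1, 6),
    ("conv2", 5 * 5 * 6, 16),
    ("fc1", 5 * 5 * 16, 120),
    ("fc2", 120, 84),
    ("out", 84, 10),
]

# weights + one bias per unit
per_layer = {name: fan_in * units + units for name, fan_in, units in layers}
total = sum(per_layer.values())
print(per_layer)
print(total)
```

Most of the parameters sit in the first fully connected layer, not in the convolutions.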

AlexNet
• ImageNet Classification with Deep Convolutional
Neural Networks - Alex Krizhevsky, Ilya Sutskever,
Geoffrey E. Hinton; 2012
• Facilitated by GPUs, highly optimized convolution
implementation and large datasets (ImageNet)
• One of the largest CNNs at the time
• Has 60 million parameters, compared to the 60K
parameters of LeNet-5

[Krizhevsky et al., 2012]

ImageNet Large Scale Visual Recognition
Challenge (ILSVRC) winners

• The annual “Olympics” of computer vision.

• Teams from across the world compete to see who has the
best computer vision model for tasks such as classification,
localization, detection, and more.

• 2012 was the first year a CNN won the challenge, achieving
a top 5 test error rate of 15.3%.

• The next best entry achieved an error of 26.2%.

ImageNet Large Scale Visual Recognition
Challenge (ILSVRC) winners

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9.
Architecture AlexNet
Layers: CONV1, MAX POOL1, NORM1, CONV2, MAX POOL2, NORM2,
CONV3, CONV4, CONV5, MAX POOL3, FC6, FC7, FC8
• Input: 227x227x3 images (224x224 before padding)
• First layer (CONV1): 96 11x11 filters applied at stride 4
• Output volume size?
(N-F)/s+1 = (227-11)/4+1 = 55 -> [55x55x96]
• Number of parameters in this layer?
(11*11*3)*96 = 35K
Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Krizhevsky et al., 2012]
AlexNet

[Krizhevsky et al., 2012]

Architecture AlexNet
• Input: 227x227x3 images (224x224 before padding)
• After CONV1: 55x55x96
• Second layer (MAX POOL1): 3x3 filters applied at stride 2
• Output volume size?
(N-F)/s+1 = (55-3)/2+1 = 27 -> [27x27x96]
• Number of parameters in this layer?
0!
Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Krizhevsky et al., 2012]
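
The arithmetic on the two AlexNet slides above can be checked with a few lines. This is only a sketch: `out_size` is an illustrative helper, and biases are left out of the weight count, as on the slide.

```python
def out_size(n, f, stride, pad=0):
    """(N + 2P - F) / stride + 1."""
    return (n + 2 * pad - f) // stride + 1


# CONV1: 96 filters of 11x11x3, stride 4, no padding
conv1_out = out_size(227, 11, 4)
conv1_weights = 11 * 11 * 3 * 96

# MAX POOL1: 3x3 window, stride 2, nothing to learn
pool1_out = out_size(conv1_out, 3, 2)
pool1_weights = 0

print(conv1_out, conv1_weights, pool1_out, pool1_weights)
```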
AlexNet
227x227x3
-> conv 11x11, s=4, P=0 -> 55x55x96
-> max pool 3x3, s=2 -> 27x27x96
-> conv 5x5, s=1, P=2 -> 27x27x256
-> max pool 3x3, s=2 -> 13x13x256
-> conv 3x3, s=1, P=1 -> 13x13x384
-> conv 3x3, s=1, P=1 -> 13x13x384
-> conv 3x3, s=1, P=1 -> 13x13x256
-> max pool 3x3, s=2 -> 6x6x256

This slide is taken from Andrew Ng [Krizhevsky et al., 2012]

AlexNet

... -> FC 4096 -> FC 4096 -> Softmax 1000

This slide is taken from Andrew Ng [Krizhevsky et al., 2012]

AlexNet
Details/Retrospectives:
• first use of ReLU
• used Norm layers (not common anymore)
• heavy data augmentation
• dropout 0.5
• batch size 128
• 7 CNN ensemble

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Krizhevsky et al., 2012]
AlexNet
• Trained on GTX 580 GPU with only 3 GB of memory.
• Network spread across 2 GPUs, half the neurons (feature
maps) on each GPU.
• CONV1, CONV2, CONV4, CONV5:
Connections only with feature maps on same GPU.
• CONV3, FC6, FC7, FC8:
Connections with all feature maps in preceding layer,
communication across GPUs.

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Krizhevsky et al., 2012]
AlexNet

AlexNet was the coming-out party for CNNs in the computer
vision community. This was the first time a model performed
so well on the historically difficult ImageNet dataset. This
paper illustrated the benefits of CNNs and backed them up
with record-breaking performance in the competition.

[Krizhevsky et al., 2012]

VGGNet
• Very Deep Convolutional Networks For Large Scale
Image Recognition - Karen Simonyan and Andrew
Zisserman; 2015
• The runner-up at the ILSVRC 2014 competition
• Significantly deeper than AlexNet
• About 140 million parameters (138M for VGG16)

[Simonyan and Zisserman, 2014]

VGGNet
• Smaller filters
Only 3x3 CONV filters, stride 1, pad 1
and 2x2 MAX POOL, stride 2
• Deeper network
AlexNet: 8 layers
VGGNet: 16 - 19 layers
• ZFNet: 11.7% top 5 error in ILSVRC’13
• VGGNet: 7.3% top 5 error in ILSVRC’14

VGG16 layers:
Input
-> 2 x [3x3 conv, 64] -> Pool 1/2
-> 2 x [3x3 conv, 128] -> Pool 1/2
-> 3 x [3x3 conv, 256] -> Pool 1/2
-> 3 x [3x3 conv, 512] -> Pool 1/2
-> 3 x [3x3 conv, 512] -> Pool 1/2
-> FC 4096 -> FC 4096 -> FC 1000 -> Softmax

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Simonyan and Zisserman, 2014]
VGGNet
• Why use smaller filters? (3x3 conv)
A stack of three 3x3 conv (stride 1) layers has the same effective
receptive field as one 7x7 conv layer,
but it is deeper, with more non-linearities,
and has fewer parameters: 3 * (3^2 C^2) = 27 C^2 vs. 7^2 C^2 = 49 C^2
for C channels per layer.

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Simonyan and Zisserman, 2014]
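
Both claims about stacked 3x3 filters can be checked with a short sketch. The function names are illustrative, the channel count `C = 256` is an arbitrary example value, and biases are ignored.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Each extra stride-1 conv layer widens the receptive field by kernel - 1."""
    rf = 1
    for _ in range(num_layers):
        rf += kernel - 1
    return rf


def stack_weights(num_layers, kernel, channels):
    """Weights in a stack of conv layers with C input and C output channels."""
    return num_layers * kernel * kernel * channels * channels


C = 256                                   # example channel count (assumption)
rf = stacked_receptive_field(3)           # three 3x3 layers
three_3x3 = stack_weights(3, 3, C)        # 27 * C^2
one_7x7 = stack_weights(1, 7, C)          # 49 * C^2
print(rf, three_3x3, one_7x7)
```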
Reminder: Receptive Field
[Figure: three stacked conv layers]

Input memory: 224*224*3=150K params: 0
3x3 conv, 64 memory: 224*224*64=3.2M params: (3*3*3)*64 = 1,728
3x3 conv, 64 memory: 224*224*64=3.2M params: (3*3*64)*64 = 36,864
Pool memory: 112*112*64=800K params: 0
3x3 conv, 128 memory: 112*112*128=1.6M params: (3*3*64)*128 = 73,728
3x3 conv, 128 memory: 112*112*128=1.6M params: (3*3*128)*128 = 147,456
Pool memory: 56*56*128=400K params: 0
3x3 conv, 256 memory: 56*56*256=800K params: (3*3*128)*256 = 294,912
3x3 conv, 256 memory: 56*56*256=800K params: (3*3*256)*256 = 589,824
3x3 conv, 256 memory: 56*56*256=800K params: (3*3*256)*256 = 589,824
Pool memory: 28*28*256=200K params: 0
3x3 conv, 512 memory: 28*28*512=400K params: (3*3*256)*512 = 1,179,648
3x3 conv, 512 memory: 28*28*512=400K params: (3*3*512)*512 = 2,359,296
3x3 conv, 512 memory: 28*28*512=400K params: (3*3*512)*512 = 2,359,296
Pool memory: 14*14*512=100K params: 0
3x3 conv, 512 memory: 14*14*512=100K params: (3*3*512)*512 = 2,359,296
3x3 conv, 512 memory: 14*14*512=100K params: (3*3*512)*512 = 2,359,296
3x3 conv, 512 memory: 14*14*512=100K params: (3*3*512)*512 = 2,359,296
Pool memory: 7*7*512=25K params: 0
FC 4096 memory: 4096 params: 7*7*512*4096 = 102,760,448
FC 4096 memory: 4096 params: 4096*4096 = 16,777,216
FC 1000 memory: 1000 params: 4096*1000 = 4,096,000

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Simonyan and Zisserman, 2014]
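
The per-layer parameter counts in the table can be summed with a short loop. This sketch follows the table's convention of counting weights only (no biases); the `cfg` list is simply the layer sequence from the table written out as data.

```python
# Output channels of each 3x3 conv layer, with "pool" marking a 2x2 max pool
cfg = [64, 64, "pool", 128, 128, "pool", 256, 256, 256, "pool",
       512, 512, 512, "pool", 512, 512, 512, "pool"]

in_ch, side, conv_weights = 3, 224, 0
for item in cfg:
    if item == "pool":
        side //= 2                         # pooling halves height and width
    else:
        conv_weights += 3 * 3 * in_ch * item
        in_ch = item

fc_weights = side * side * in_ch * 4096 + 4096 * 4096 + 4096 * 1000
total_weights = conv_weights + fc_weights
print(conv_weights, fc_weights, total_weights)
```

Almost 90% of the weights are in the three fully connected layers, with the first one alone holding about 103M.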
VGGNet
VGG16:
TOTAL memory: 24M * 4 bytes ~= 96MB / image
TOTAL params: 138M parameters
(layer stack as listed in the table above: 13 conv layers, 5 pools,
3 FC layers, softmax)

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Simonyan and Zisserman, 2014]
VGGNet
Details/Retrospectives :
• ILSVRC’14 2nd in classification, 1st in localization
• Similar training procedure as AlexNet
• No Local Response Normalisation (LRN)
• Use VGG16 or VGG19 (VGG19 only slightly better, more
memory)
• Use ensembles for best results
• FC7 features generalize well to other tasks
• Trained on 4 Nvidia Titan Black GPUs for two to three weeks.

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Simonyan and Zisserman, 2014]
VGGNet

VGG Net reinforced the notion that convolutional neural
networks have to have a deep network of layers in order for
this hierarchical representation of visual data to work.
Keep it deep.
Keep it simple.

[Simonyan and Zisserman, 2014]

GoogleNet
• Going Deeper with Convolutions - Christian Szegedy et
al.; 2015
• ILSVRC 2014 competition winner
• Also significantly deeper than AlexNet
• 12x fewer parameters than AlexNet
• Focused on computational efficiency

[Szegedy et al., 2014]

GoogleNet
• 22 layers
• Efficient “Inception” module - strayed from
the general approach of simply stacking conv
and pooling layers on top of each other in a
sequential structure
• No FC layers
• Only 5 million parameters!
• ILSVRC’14 classification winner (6.7% top 5
error)

[Szegedy et al., 2014]

GoogleNet
“Inception module”: design a good local network topology (network within
a network) and then stack these modules on top of each other
Previous layer -> four parallel branches:
  1x1 convolution
  1x1 convolution -> 3x3 convolution
  1x1 convolution -> 5x5 convolution
  3x3 max pooling -> 1x1 convolution
-> Filter concatenation

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Szegedy et al., 2014]
GoogleNet
Naïve Inception Model
• Apply parallel filter operations on the input :
• Multiple receptive field sizes for convolution (1x1, 3x3, 5x5)
• Pooling operation (3x3)
• Concatenate all filter outputs together depth-wise
Previous layer -> four parallel branches:
  1x1 convolution
  3x3 convolution
  5x5 convolution
  3x3 max pooling
-> Filter concatenation
Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Szegedy et al., 2014]
GoogleNet
• What’s the problem with this?
High computational complexity


Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Szegedy et al., 2014]
GoogleNet
Example: previous layer output is 28x28x256
• Output volume sizes:
1x1 conv, 128: 28x28x128
3x3 conv, 192: 28x28x192
5x5 conv, 96: 28x28x96
3x3 pool: 28x28x256
• What is output size after filter concatenation?
28x28x(128+192+96+256) = 28x28x672
Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Szegedy et al., 2014]
GoogleNet
• Number of convolution operations (previous layer: 28x28x256):
1x1 conv, 128: 28x28x128x1x1x256
3x3 conv, 192: 28x28x192x3x3x256
5x5 conv, 96: 28x28x96x5x5x256
Total: 854M ops
Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Szegedy et al., 2014]
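
The naive module's numbers can be reproduced by multiplying out each branch. In this sketch the `branches` list is just the example configuration from the slides, and one "op" is counted as one multiply, which is an assumption about how the slide counts.

```python
H = W = 28
in_depth = 256

# (kernel size, number of filters) for the three conv branches
branches = [(1, 128), (3, 192), (5, 96)]

# Each output value needs kernel * kernel * in_depth multiplies
conv_ops = sum(H * W * n * k * k * in_depth for k, n in branches)

# The pooling branch keeps all 256 input channels
out_depth = sum(n for _, n in branches) + in_depth

print(conv_ops, out_depth)
```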
GoogleNet
• Very expensive compute!
• Pooling layer also preserves feature
depth, which means total depth after
concatenation can only grow at every layer.

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Szegedy et al., 2014]
GoogleNet
• Solution: “bottleneck” layers that use 1x1 convolutions to
reduce feature depth (from previous hour).

Previous layer -> four parallel branches:
  1x1 convolution
  1x1 convolution -> 3x3 convolution
  1x1 convolution -> 5x5 convolution
  3x3 max pooling -> 1x1 convolution
-> Filter concatenation

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Szegedy et al., 2014]
GoogleNet
• Number of convolution operations (previous layer: 28x28x256):
1x1 conv, 64: 28x28x64x1x1x256 (reduction before the 3x3 conv)
1x1 conv, 64: 28x28x64x1x1x256 (reduction before the 5x5 conv)
1x1 conv, 128: 28x28x128x1x1x256
3x3 conv, 192: 28x28x192x3x3x64
5x5 conv, 96: 28x28x96x5x5x64
1x1 conv, 64: 28x28x64x1x1x256 (after the 3x3 max pooling)
Total: about 271M ops
• Compared to 854M ops for naive version

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Szegedy et al., 2014]
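
The products listed for the bottleneck version can be summed the same way. This sketch assumes that the 3x3 and 5x5 convolutions each see the 64-channel output of their 1x1 reduction, and it counts one multiply as one op; the `layers` list mirrors the lines above.

```python
H = W = 28

# (kernel size, number of filters, input depth)
layers = [
    (1, 64, 256),    # 1x1 reduction before the 3x3 conv
    (1, 64, 256),    # 1x1 reduction before the 5x5 conv
    (1, 128, 256),   # plain 1x1 branch
    (3, 192, 64),    # 3x3 conv on the reduced input
    (5, 96, 64),     # 5x5 conv on the reduced input
    (1, 64, 256),    # 1x1 conv after the 3x3 max pooling
]

bottleneck_ops = sum(H * W * n * k * k * d for k, n, d in layers)
out_depth = 128 + 192 + 96 + 64   # depth after filter concatenation
print(bottleneck_ops, out_depth)
```

Besides the lower operation count, the concatenated depth is also smaller than in the naive module, because the pooling branch no longer passes all 256 channels through.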
GoogleNet
Details/Retrospectives :
• Deeper networks, with computational efficiency
• 22 layers
• Efficient “Inception” module
• No FC layers
• 12x fewer params than AlexNet
• ILSVRC’14 classification winner (6.7% top 5 error)

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [Szegedy et al., 2014]
GoogleNet

Introduced the idea that CNN layers didn’t always have to be
stacked up sequentially. Coming up with the Inception
module, the authors showed that a creative structuring of
layers can lead to improved performance and
computational efficiency.

[Szegedy et al., 2014]

ResNet
• Deep Residual Learning for Image Recognition -
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun;
2015
• Extremely deep network – 152 layers
• Deeper neural networks are more difficult to train.
• Deep networks suffer from vanishing and
exploding gradients.
• Present a residual learning framework to ease the
training of networks that are substantially deeper
than those used previously.
[He et al., 2015]
ResNet
• ILSVRC’15 classification winner (3.57% top 5
error; humans generally hover around a 5-10% error rate)
• Swept all classification and detection
competitions in ILSVRC’15 and COCO’15!

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [He et al., 2015]
ResNet
• What happens when we continue stacking deeper layers on a
convolutional neural network?

• A 56-layer model performs worse than a 20-layer model on both training and test error

-> The deeper model performs worse (not caused by overfitting)!

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [He et al., 2015]
ResNet
• Hypothesis: The problem is an optimization problem. Very
deep networks are harder to optimize.
• Solution: Use network layers to fit residual mapping instead
of directly trying to fit a desired underlying mapping.

• We will use skip connections allowing us to take the activation
from one layer and feed it into another layer, much deeper
into the network.
• Use layers to fit residual F(x) = H(x) – x
instead of H(x) directly

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [He et al., 2015]
ResNet
Residual Block
Input x goes through conv-relu-conv series and gives us F(x).
That result is then added to the original input x. Let’s call that
H(x) = F(x) + x.
In traditional CNNs, H(x) would just be equal to F(x). So, instead
of just computing that transformation (straight from x to F(x)),
we’re computing the term that we have to add, F(x), to the
input, x.

[He et al., 2015]

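ResNet
Short cut / skip connection:
a^{[l]} -> Linear -> ReLU -> a^{[l+1]} -> Linear -> ReLU -> a^{[l+2]}
(the shortcut carries a^{[l]} forward and adds it before the second ReLU)

z^{[l+1]} = W^{[l+1]} a^{[l]} + b^{[l+1]}
a^{[l+1]} = g(z^{[l+1]})
z^{[l+2]} = W^{[l+2]} a^{[l+1]} + b^{[l+2]}
a^{[l+2]} = g(z^{[l+2]})              (plain network)
a^{[l+2]} = g(z^{[l+2]} + a^{[l]})    (with the skip connection)
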
[He et al., 2015]
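
A minimal sketch of a residual block in the Linear -> ReLU -> Linear -> ReLU form, using plain Python lists in place of tensors. The helper names and the tiny 2-unit layer are illustrative, and the block assumes input and output have the same size so that the addition is defined.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def linear(W, a, b):
    """Compute W a + b for a weight matrix W (list of rows) and vectors a, b."""
    return [sum(w * x for w, x in zip(row, a)) + bias
            for row, bias in zip(W, b)]


def residual_block(a_l, W1, b1, W2, b2):
    a_l1 = relu(linear(W1, a_l, b1))       # first Linear + ReLU
    z_l2 = linear(W2, a_l1, b2)            # second Linear
    # Skip connection: add the block input before the final ReLU
    return relu([z + skip for z, skip in zip(z_l2, a_l)])


a_l = [1.0, 2.0]
zero_W = [[0.0, 0.0], [0.0, 0.0]]
zero_b = [0.0, 0.0]
out = residual_block(a_l, zero_W, zero_b, zero_W, zero_b)
print(out)
```

With all weights at zero the block simply passes its (non-negative) input through, which hints at why extra residual blocks need not make a deeper network worse than a shallower one.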
ResNet
Full ResNet architecture:
• Stack residual blocks
• Every residual block has two 3x3 conv layers
• Periodically, double # of filters and
downsample spatially using stride 2 (in each
dimension)
• Additional conv layer at the beginning
• No FC layers at the end (only FC 1000 to
output classes)

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [He et al., 2015]
ResNet
• Total depths of 34, 50, 101, or 152 layers for
ImageNet
• For deeper networks (ResNet-50+), use
“bottleneck” layer to improve efficiency
(similar to GoogLeNet)

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [He et al., 2015]
ResNet
Experimental Results:
• Able to train very deep networks without degrading
• Deeper networks now achieve lower training errors as
expected

Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [He et al., 2015]
ResNet

As of this lecture, ResNet is the best CNN architecture that we
have, and a great innovation built on the idea of residual learning.
Its top 5 error is even lower than typical human error rates!

[He et al., 2015]

Accuracy comparison


Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9.
Forward pass time and power consumption


Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9.
Summary
• LeNet-5
• AlexNet
• VGG
• GoogleNet – Inception module
• ResNet – Residual block
References
• Gradient-based learning applied to document recognition; Yann
LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner; 1998
• ImageNet Classification with Deep Convolutional Neural Networks -
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton; 2012
• Very Deep Convolutional Networks For Large Scale Image
Recognition - Karen Simonyan and Andrew Zisserman; 2015
• Going Deeper with Convolutions - Christian Szegedy et al.; 2015
• Deep Residual Learning for Image Recognition - Kaiming He,
Xiangyu Zhang, Shaoqing Ren, Jian Sun; 2015
• Stanford CS231- Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9
• Coursera, Machine Learning course by Andrew Ng.
References
• The 9 Deep Learning Papers You Need To Know About
(Understanding CNNs Part 3) by Adit Deshpande https://
[Link]/[Link]/The-9-Deep-Learnin
[Link]
• CNNs Architectures: LeNet, AlexNet, VGG, GoogLeNet, ResNet and
more … By Siddharth Das [Link]
siddharthdas_32104/cnns-architectures-lenet-alexnet-vgg-googlene
t-resnet-and-more-666091488df5
• Slide taken from Forward And Backpropagation in Convolutional
Neural Network. – Medium , By Sujit Rai
[Link]
n-in-convolutional-neural-network-4dfa96d7b37e
Thank You.
