LectureNote CNN

The document discusses the limitations of flattening image matrices for neural networks, highlighting issues like excessive parameters and loss of spatial relationships. It introduces convolutional layers and filters as a solution, allowing networks to learn features more efficiently while maintaining local pixel relationships. Additionally, it covers pooling layers, CNN architecture, and advanced techniques like residual connections and batch normalization to improve model performance.

Uploaded by

armagangulal.561

Last lecture, we flattened the image matrix into a long vector and fed it to a dense layer. This approach has several shortcomings:
• We need to learn “too many” parameters
  • Flattening a 3024 × 3024-pixel color image (from your phone) and connecting it to a single 100-neuron dense layer generates approximately 2.7 billion parameters (3024 × 3024 × 3 inputs × 100 neurons).
  • This is computationally demanding, very data-hungry, and increases the risk of overfitting
• We lose the local spatial adjacency relationships between pixels that define features of the image.
• We don’t learn once and reuse repeatedly
  • If a feature of the image (e.g., a vertical line or a circle) appears in different places in the image, the network should “learn it once and use it again and again” rather than learn it separately each time.
Convolutional layers were developed to address these shortcomings.
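The parameter count for the flattened phone image can be checked with a few lines of arithmetic. The sketch below also compares it with a convolutional layer; the choice of 100 filters of size 3 × 3 is a hypothetical one made only for this comparison and does not come from the lecture.

```python
# Dense layer on a flattened 3024 x 3024 RGB image
height, width, channels = 3024, 3024, 3
neurons = 100

inputs = height * width * channels            # one input per pixel per color channel
dense_params = inputs * neurons + neurons     # weights plus one bias per neuron

# Hypothetical comparison: 100 filters of size 3 x 3 spanning all 3 channels.
# The count does not depend on the image size because each filter is reused
# at every position of the image.
filters, kernel = 100, 3
conv_params = filters * (kernel * kernel * channels + 1)  # +1 bias per filter

print(f"dense layer: {dense_params:,} parameters")  # 2,743,372,900
print(f"conv layer:  {conv_params:,} parameters")   # 2,800
```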
Convolutional Filters
A convolutional filter is a small square matrix of numbers

By choosing the numbers in a filter carefully and “applying” the filter to an image, different features of the image can be detected
Convolution operation

$$
\begin{bmatrix}
1 & 2 & 3 & 0 & 1 \\
0 & 1 & 2 & 3 & 2 \\
1 & 2 & 1 & 0 & 0 \\
0 & 1 & 3 & 1 & 2 \\
2 & 0 & 1 & 3 & 1
\end{bmatrix}
*
\begin{bmatrix}
1 & 0 & -1 \\
1 & 0 & -1 \\
1 & 0 & -1
\end{bmatrix}
=
\begin{bmatrix}
-4 & 2 & 3 \\
-5 & 0 & 2 \\
-2 & -1 & 2
\end{bmatrix}
$$

• Take a 3x3 region of the input
• Multiply element-wise with the filter
• Sum the values
• Move the kernel (stride=1) and repeat

Sliding the filter over the nine positions, left to right and top to bottom:

Output row 1
(1×1 + 2×0 + 3×(-1)) + (0×1 + 1×0 + 2×(-1)) + (1×1 + 2×0 + 1×(-1)) = -4
(2×1 + 3×0 + 0×(-1)) + (1×1 + 2×0 + 3×(-1)) + (2×1 + 1×0 + 0×(-1)) = 2
(3×1 + 0×0 + 1×(-1)) + (2×1 + 3×0 + 2×(-1)) + (1×1 + 0×0 + 0×(-1)) = 3

Output row 2
(0×1 + 1×0 + 2×(-1)) + (1×1 + 2×0 + 1×(-1)) + (0×1 + 1×0 + 3×(-1)) = -5
(1×1 + 2×0 + 3×(-1)) + (2×1 + 1×0 + 0×(-1)) + (1×1 + 3×0 + 1×(-1)) = 0
(2×1 + 3×0 + 2×(-1)) + (1×1 + 0×0 + 0×(-1)) + (3×1 + 1×0 + 2×(-1)) = 2

Output row 3
(1×1 + 2×0 + 1×(-1)) + (0×1 + 1×0 + 3×(-1)) + (2×1 + 0×0 + 1×(-1)) = -2
(2×1 + 1×0 + 0×(-1)) + (1×1 + 3×0 + 1×(-1)) + (0×1 + 1×0 + 3×(-1)) = -1
(1×1 + 0×0 + 0×(-1)) + (3×1 + 1×0 + 2×(-1)) + (1×1 + 3×0 + 1×(-1)) = 2
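The four steps (take a region, multiply element-wise, sum, move the kernel) can be sketched in plain Python. This is a minimal illustration with no padding, using the 5 × 5 input and the 3 × 3 filter from the example; the function name `conv2d_valid` is made up for this sketch. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide a square kernel over a 2D image (no padding) and return the feature map."""
    n_rows, n_cols = len(image), len(image[0])
    k = len(kernel)
    feature_map = []
    for top in range(0, n_rows - k + 1, stride):
        out_row = []
        for left in range(0, n_cols - k + 1, stride):
            # Multiply the k x k region element-wise with the kernel and sum
            total = sum(
                image[top + i][left + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            out_row.append(total)
        feature_map.append(out_row)
    return feature_map


image = [
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 2],
    [1, 2, 1, 0, 0],
    [0, 1, 3, 1, 2],
    [2, 0, 1, 3, 1],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)
# feature_map[0][0] is -4, the value of the first worked step
```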
Output Size
• Output size = (N - F + 2P)/S + 1
• Example: (5 - 3 + 0)/1 + 1 = 3
• N: Input Size
• F: Filter Size
• P: Padding
• S: Stride (Step Size)
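The output size formula is easy to wrap in a helper. In the sketch below, the function name is hypothetical, and rounding down when the stride does not divide evenly is an assumption (it is what common frameworks do, but the slide does not say).

```python
def conv_output_size(n, f, p=0, s=1):
    """Output size along one dimension: (N - F + 2P) / S + 1, rounded down (assumption)."""
    return (n - f + 2 * p) // s + 1


print(conv_output_size(5, 3))            # 3, the example from the slide
print(conv_output_size(5, 3, p=1))       # 5, padding of 1 keeps the size unchanged
print(conv_output_size(5, 3, p=0, s=2))  # 2, a stride of 2 roughly halves the size
```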
Convolutional Layers
A convolutional layer is composed of one or more convolutional filters

 1  1  1
 0  0  0
-1 -1 -1

Each filter can be thought of as a specialist for detecting a particular feature (e.g., a horizontal line, an arc, a vertical line)
Applying a Convolutional Layer to a color image

If we had instead applied f filters, the output would be a tensor with shape 4 x 4 x f

Image source: [Link]
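To see where the 4 x 4 x f shape comes from, here is a small sketch of a convolutional layer applied to a color image. The figure itself is not available, so the 6 x 6 x 3 input, the 3 x 3 filter size, and f = 5 are assumptions chosen to reproduce a 4 x 4 output; the values are all ones for simplicity. Each filter spans all three color channels and produces one output channel.

```python
def conv_layer(image, filters):
    """Apply each k x k x C filter to an H x W x C image (no padding, stride 1).

    Returns a nested list of shape H' x W' x len(filters).
    """
    h, w, c = len(image), len(image[0]), len(image[0][0])
    k = len(filters[0])
    out_h, out_w = h - k + 1, w - k + 1
    output = [[[0.0] * len(filters) for _ in range(out_w)] for _ in range(out_h)]
    for f_idx, filt in enumerate(filters):
        for top in range(out_h):
            for left in range(out_w):
                # Sum over the k x k window AND over all input channels
                output[top][left][f_idx] = sum(
                    image[top + i][left + j][ch] * filt[i][j][ch]
                    for i in range(k)
                    for j in range(k)
                    for ch in range(c)
                )
    return output


# Hypothetical input: 6 x 6 RGB image; hypothetical layer: f = 5 filters of 3 x 3 x 3
image = [[[1.0] * 3 for _ in range(6)] for _ in range(6)]
filters = [[[[1.0] * 3 for _ in range(3)] for _ in range(3)] for _ in range(5)]

out = conv_layer(image, filters)
out_shape = (len(out), len(out[0]), len(out[0][0]))
print(out_shape)  # (4, 4, 5)
```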

• These filters seem excellent, but how are we supposed to come up with the numbers in each filter?
• In fact, convolutional filters used to be designed by hand. Computer vision researchers invested a lot of effort in devising filters that could detect various types of image features
• As we figured out how to train deep networks with lots of weights, a big idea emerged: think of the numbers in the filter as weights and simply learn them from the data, just like we learn all the other weights
• This is possible because a convolutional filter is just a neuron: it computes a weighted sum of its inputs
• Therefore, our entire machinery (neurons, layers, loss functions, gradient descent) is perfectly applicable
• As a result, a network with many convolutional layers can learn increasingly complex features
Demos

• [Link]

• [Link]
s/demo/[Link]
Pooling Layers
• Pooling layers (also called down-sampling or subsampling layers) reduce the size of the tensor coming out of a convolutional layer
• In average pooling, we take the average of each 2x2 box; in max pooling, we take the maximum of each box
[Link]
• Max pooling acts like an “OR” condition: if a feature exists anywhere in its input, max pooling will pick it up, i.e., max pooling acts like a feature detector
• Since successive convolutional layers can “see” more and more of the original input image, the max-pooling layers that follow them can detect whether a feature exists in more and more of the original input image as well
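A minimal sketch of 2x2 pooling with non-overlapping boxes follows. The 4 x 4 feature map is made up for illustration, and the function name `pool2d` is hypothetical.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Down-sample a 2D feature map using non-overlapping size x size boxes."""
    if mode == "max":
        reduce = max
    else:
        reduce = lambda values: sum(values) / len(values)
    pooled = []
    for top in range(0, len(feature_map) - size + 1, size):
        out_row = []
        for left in range(0, len(feature_map[0]) - size + 1, size):
            box = [
                feature_map[top + i][left + j]
                for i in range(size)
                for j in range(size)
            ]
            out_row.append(reduce(box))
        pooled.append(out_row)
    return pooled


# Made-up 4 x 4 feature map
fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 2],
    [2, 2, 3, 4],
]

max_pooled = pool2d(fmap, mode="max")  # [[4, 2], [2, 5]]
avg_pooled = pool2d(fmap, mode="avg")  # [[2.5, 1.0], [1.25, 3.5]]
print(max_pooled)
print(avg_pooled)
```

Both outputs are 2 x 2: each dimension of the tensor is halved. Max pooling keeps the strongest response in each box, which is the “OR” behaviour described above.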
The architecture of a basic CNN

[Link]
The architecture of a CNN

Each convolutional block typically has 1-2 convolutional layers followed by a pooling layer

The final tensor gets flattened into a long vector and sent through 0 or more hidden layers to the output layer

[Link]
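To make the block structure concrete, the sketch below traces tensor shapes through two convolutional blocks and the final flattening step. All sizes are assumptions chosen for illustration (a 28 x 28 grayscale input, 3x3 filters without padding, 32 then 64 filters, 2x2 pooling); none of them come from the lecture figures.

```python
def conv_shape(shape, filters, kernel=3):
    """Shape after a convolution with no padding and stride 1."""
    h, w, _ = shape
    return (h - kernel + 1, w - kernel + 1, filters)


def pool_shape(shape, size=2):
    """Shape after non-overlapping size x size pooling."""
    h, w, c = shape
    return (h // size, w // size, c)


shape = (28, 28, 1)          # hypothetical grayscale input
trace = [shape]
for filters in (32, 64):     # two convolutional blocks: conv followed by pooling
    shape = conv_shape(shape, filters)
    trace.append(shape)
    shape = pool_shape(shape)
    trace.append(shape)

flattened = shape[0] * shape[1] * shape[2]  # length of the vector fed to the dense layers

for step in trace:
    print(step)
print("flattened length:", flattened)  # 1600
```

The spatial size shrinks while the number of channels grows, so the flattened vector is far shorter than a flattened raw image would be for larger inputs.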
ConvNet architecture patterns
• The modularity-hierarchy-reuse formula for
model architecture
• An overview of standard best practices for
building ConvNets: residual connections,
batch normalization, and depthwise
separable convolutions
• Ongoing design trends for computer vision
models
• Deep learning model architecture is
primarily about making clever use of
modularity, hierarchy, and reuse.
• You’ll notice that all popular ConvNet
architectures are not only structured into
layers, they’re structured into repeated
groups of layers (called blocks or modules).
• Deeper hierarchies are intrinsically good
because they encourage feature reuse and,
therefore, abstraction.
• In general, a deep stack of narrow layers
performs better than a shallow stack of large
layers. However, there’s a limit to how deep
you can stack layers: the problem
of vanishing gradients.
• This leads us to our first essential model
architecture pattern: residual connections.
Residual connections
• Consider the game of telephone, where an initial message is whispered in the ear of a player, who then whispers it in the ear of the next player, and so on. The final message ends up bearing little resemblance to its original version.
• As it happens, backpropagation in a sequential deep learning model is pretty similar to the game of telephone. You’ve got a chain of functions, like this one:
y = f4(f3(f2(f1(x))))
• The name of the game is to adjust the parameters of each function in the chain based on the error recorded on the output of f4 (the loss of the model). To adjust f1, you’ll need to percolate error information through f2, f3, and f4.
• However, each successive function in the chain introduces some amount of noise in the process. If your function chain is too deep, this noise starts overwhelming gradient information, and backpropagation stops working. Your model won’t train at all. This is called the vanishing gradients problem.
• The fix is simple: just force each function in the
chain to be nondestructive — to retain a
noiseless version of the information contained in
the previous input.
• The easiest way to implement this is called
a residual connection. It’s dead easy: just add the
input of a layer or block of layers back to its
output (see figure 9.3).
• The residual connection acts as an information
shortcut around destructive or noisy blocks (such
as blocks that contain ReLU activations or
dropout layers), enabling error gradient
information from early layers to propagate
noiselessly through a deep network.
• This technique was introduced in 2015 with the
ResNet family of models (developed by He et al.
at Microsoft).[1]
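A toy scalar illustration of why adding the input back helps: the gradient of a chain is the product of the local derivatives, and for a residual block y = x + f(x) the local derivative is 1 + f'(x), so the identity path always passes gradient through. The tanh "layers" with weight 0.5 and the depth of 20 are assumptions made only for this sketch; real networks are not scalar chains.

```python
import math


def layer(x, w=0.5):
    """A toy scalar layer (assumption for illustration)."""
    return math.tanh(w * x)


def layer_grad(x, w=0.5):
    """Derivative of the toy layer with respect to its input."""
    return w * (1.0 - math.tanh(w * x) ** 2)


def chain_gradient(x, depth, residual):
    """Gradient of the output with respect to the input of a depth-layer chain."""
    grad = 1.0
    for _ in range(depth):
        local = layer_grad(x)
        if residual:
            grad *= 1.0 + local   # d/dx [x + f(x)] = 1 + f'(x)
            x = x + layer(x)
        else:
            grad *= local         # d/dx f(x) = f'(x), at most 0.5 here
            x = layer(x)
    return grad


plain = chain_gradient(1.0, depth=20, residual=False)
with_residual = chain_gradient(1.0, depth=20, residual=True)
print(f"plain chain:    {plain:.3e}")
print(f"residual chain: {with_residual:.3e}")
```

In the plain chain the gradient shrinks by at least a factor of two per layer and is tiny after 20 layers; in the residual chain it never drops below 1.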
• Note that adding the input back to the output of a block implies that the output should have the same shape as the input. This is not the case if your block includes convolutional layers with an increased number of filters or a max pooling layer. In such cases, use a 1 × 1 Conv2D layer with no activation to linearly project the residual to the desired output shape.
• You’d typically use padding="same" in the convolution layers in your target block to avoid spatial downsampling due to padding, and you’d use strides in the residual projection to match any downsampling caused by a max pooling layer.
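A 1 × 1 convolution with no activation is just a linear map applied to the channels of every pixel, which is why it can change the number of channels of the residual. The plain-Python sketch below illustrates the shape matching only; the tensor sizes, the projection weights, and the helper names are all hypothetical, and in Keras the projection would be a Conv2D layer with a kernel size of 1 whose weights are learned.

```python
def pointwise_conv(x, weights):
    """1 x 1 convolution, no activation: per-pixel linear map over channels.

    x: H x W x C_in nested lists; weights: C_in x C_out nested lists.
    """
    c_out = len(weights[0])
    return [
        [
            [sum(pixel[c] * weights[c][o] for c in range(len(pixel))) for o in range(c_out)]
            for pixel in row
        ]
        for row in x
    ]


def add_tensors(a, b):
    """Element-wise sum of two tensors with identical shapes."""
    return [
        [[va + vb for va, vb in zip(pa, pb)] for pa, pb in zip(ra, rb)]
        for ra, rb in zip(a, b)
    ]


def shape(t):
    return (len(t), len(t[0]), len(t[0][0]))


# Hypothetical block: the input has 2 channels, the block output has 3
x = [[[1.0, 2.0] for _ in range(4)] for _ in range(4)]
block_output = [[[0.5, 0.5, 0.5] for _ in range(4)] for _ in range(4)]

# Hypothetical projection weights (C_in = 2, C_out = 3)
projection = [
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
]

residual = pointwise_conv(x, projection)  # 4 x 4 x 3, same shape as block_output
y = add_tensors(block_output, residual)   # the residual connection
print(shape(x), shape(residual), shape(y))
```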
Batch normalization
• It’s a type of layer (BatchNormalization in Keras)
introduced in 2015 by Ioffe and Szegedy;[2] it can
adaptively normalize data even as the mean and
variance change over time during training.
• During training, it uses the mean and variance of the current batch of data to normalize samples, and during inference (when a big enough batch of representative data may not be available), it uses an exponential moving average of the batchwise mean and variance of the data seen during training.
• In practice, the main effect of batch
normalization appears to be that it helps with
gradient propagation — much like residual
connections — and thus allows for deeper
networks.
• Some very deep networks can only be trained if
they include multiple BatchNormalization layers.
• For instance, batch normalization is used
liberally in many of the advanced ConvNet
architectures that come packaged with Keras,
such as ResNet50, EfficientNet, and Xception.
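The training versus inference behaviour can be sketched for a single scalar feature. This is a simplified illustration, not the Keras implementation: the learned scale and offset parameters are omitted, and the momentum and epsilon values are illustrative assumptions.

```python
class ToyBatchNorm:
    """Batch normalization for one scalar feature (no learned scale or offset)."""

    def __init__(self, momentum=0.5, epsilon=1e-3):
        self.momentum = momentum
        self.epsilon = epsilon
        self.moving_mean = 0.0
        self.moving_var = 1.0

    def __call__(self, batch, training):
        if training:
            # Use the statistics of the current batch...
            mean = sum(batch) / len(batch)
            var = sum((v - mean) ** 2 for v in batch) / len(batch)
            # ...and update the exponential moving averages for later inference
            self.moving_mean = self.momentum * self.moving_mean + (1 - self.momentum) * mean
            self.moving_var = self.momentum * self.moving_var + (1 - self.momentum) * var
        else:
            # At inference, rely on the averages accumulated during training
            mean, var = self.moving_mean, self.moving_var
        return [(v - mean) / (var + self.epsilon) ** 0.5 for v in batch]


bn = ToyBatchNorm()
train_out = bn([1.0, 2.0, 3.0], training=True)   # normalized with the batch mean 2.0
infer_out = bn([2.0], training=False)            # normalized with the moving mean 1.0
print(train_out)
print(infer_out, bn.moving_mean, bn.moving_var)
```

A single sample can be normalized at inference time because no batch statistics are needed at that point.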
• The BatchNormalization layer can be used after any layer (Dense, Conv2D, and so on).
• Both Dense and Conv2D involve a “bias vector,” a learned variable whose purpose is to make the layer affine rather than purely linear. For instance, Conv2D returns, schematically, y = conv(x, kernel) + bias, and Dense returns y = dot(x, kernel) + bias. Because the normalization step will take care of centering the layer’s output on zero, the bias vector is no longer needed when using BatchNormalization, and the layer can be created without it via the option use_bias=False. This makes the layer slightly leaner.
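The redundancy of the bias can be checked numerically: adding a constant to every value in a batch shifts the mean by the same constant and leaves the variance unchanged, so the normalized values are identical. The sketch below uses made-up pre-activation values and a made-up bias for one scalar feature, and it omits the learned scale and offset of the normalization layer.

```python
def normalize(batch, epsilon=1e-3):
    """Center on the batch mean and scale by the batch standard deviation."""
    mean = sum(batch) / len(batch)
    var = sum((v - mean) ** 2 for v in batch) / len(batch)
    return [(v - mean) / (var + epsilon) ** 0.5 for v in batch]


pre_activations = [0.2, -1.3, 0.7, 2.4]   # made-up outputs of dot(x, kernel)
bias = 5.0                                # made-up bias value

without_bias = normalize(pre_activations)
with_bias = normalize([v + bias for v in pre_activations])

# The bias is cancelled by the mean subtraction (up to floating-point rounding)
same = all(abs(a - b) < 1e-9 for a, b in zip(without_bias, with_bias))
print(same)  # True
```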
Depthwise separable convolutions

What if we told you that there’s a layer you can use as a drop-in
replacement for Conv2D that will make your model smaller (fewer
trainable weight parameters), leaner (fewer floating-point
operations), and cause it to perform a few percentage points
better on its task? That is precisely what the depthwise separable
convolution layer does (SeparableConv2D in Keras). This layer
performs a spatial convolution on each channel of its input,
independently, before mixing output channels via a pointwise
convolution (a 1 × 1 convolution), as shown in figure 9.4.
• Consider a regular convolution operation with a
3 x 3 window, 64 input channels, and 64 output
channels. It uses 3 × 3 × 64 × 64 = 36,864
trainable parameters, and when you apply it to
an image, it runs a number of floating-point
operations that is proportional to this parameter
count.
• Meanwhile, consider an equivalent depthwise
separable convolution: it only involves 3 × 3 × 64
+ 64 × 64 = 4,672 trainable parameters and
proportionally fewer floating-point operations.
This efficiency improvement only increases as
the number of filters or the size of the
convolution windows gets larger.
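The two parameter counts above follow directly from the layer shapes, and a short sketch makes it easy to try other sizes. Biases are ignored, as in the text; the function names are hypothetical, and the second configuration (5 x 5 window, 128 channels) is an extra example not taken from the lecture.

```python
def conv_params(kernel, c_in, c_out):
    """Regular convolution: one kernel x kernel x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Depthwise separable convolution: one kernel x kernel spatial filter per
    input channel, followed by a 1 x 1 pointwise convolution mixing channels."""
    return kernel * kernel * c_in + c_in * c_out


regular = conv_params(3, 64, 64)               # 36,864
separable = separable_conv_params(3, 64, 64)   # 4,672
print(regular, separable, round(regular / separable, 1))

# Extra example: the saving grows with a larger window and more filters
regular_big = conv_params(5, 128, 128)
separable_big = separable_conv_params(5, 128, 128)
print(regular_big, separable_big, round(regular_big / separable_big, 1))
```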
