
Optimizing CNN Hyperparameters Guide

The document discusses key considerations for designing Convolutional Neural Networks (CNNs), emphasizing the lack of a one-size-fits-all approach for selecting kernel sizes, output maps, and layers. It highlights the importance of transfer learning and feature extraction in improving model performance, especially when dealing with small datasets. Additionally, it addresses the necessity of data augmentation to enhance training data for deep learning models, thereby improving their effectiveness.

Uploaded by

aimabatool112

Convolutional Neural Network

How can I decide the kernel size, output maps and layers of a CNN?
Unfortunately, there is no general answer to this question. No
principled method for determining these hyperparameters is known.
• Deeper networks often perform better, at the cost of needing more data
and making learning more complex.
• Initially use fewer filters, then gradually increase the number while
monitoring how the error rate changes.
• Very small filter sizes capture very fine details of the image. A bigger
filter size, on the other hand, leaves out minute details in the
image.
• Start with a modest number of layers and increase the number
while measuring your performance on a held-out validation set.
• A conventional approach is to look for similar problems and deep learning
architectures that have already been shown to work. Then a suitable
architecture can be developed by experimentation.
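To make the trade-offs above concrete, the sketch below computes how kernel size and the number of output maps change the output size and the parameter count of a convolutional layer. It uses the standard formulas (output size = floor((n + 2p - k) / s) + 1, parameters = (k * k * input channels + 1) * output maps). The three-layer stack, the 32 x 32 RGB input and the function names are illustrative assumptions, not part of the original slides.

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Spatial output size of a k x k convolution over an n x n input."""
    if n + 2 * padding < k:
        raise ValueError("kernel is larger than the padded input")
    return (n + 2 * padding - k) // stride + 1


def conv_param_count(k, in_channels, out_maps):
    """Learnable parameters: one k x k x in_channels filter plus one bias per output map."""
    return (k * k * in_channels + 1) * out_maps


# Hypothetical stack: (kernel size, output maps), with filter counts growing with depth.
stack = [(3, 16), (3, 32), (5, 64)]

size, channels = 32, 3  # assumed 32 x 32 RGB input
summary = []
for k, out_maps in stack:
    size = conv_output_size(size, k)
    params = conv_param_count(k, channels, out_maps)
    summary.append((k, out_maps, size, params))
    channels = out_maps  # the output maps become the next layer's input channels

for k, out_maps, out_size, params in summary:
    print(f"{k}x{k} kernel, {out_maps} maps -> {out_size}x{out_size} output, {params} parameters")
```

Even in this small example the parameter count grows quickly with kernel size and the number of output maps, which is one reason to start small and grow the network while watching the error rate.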
Transfer Learning
• Transfer learning is a machine learning method
where a model developed for a task is reused as the
starting point for a model on a second task.
• A neural network learns knowledge from one task and
applies that knowledge to another task.
• Feature Extraction
• Another approach is to use deep learning to discover the best
representation of your problem, which means finding the most important
features. This approach is also known as representation learning and can
often result in much better performance than can be obtained with
hand-designed representations.
Important Point
• Remember that
• Early layers in a deep learning model identify
simple shapes
• Later layers identify more complex patterns by
extracting more abstract features
• The last layers perform classification
• Most layers of a pretrained deep neural network are
reusable because most computer vision
problems share similar low-level patterns
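As a rough illustration of the "simple shapes" that early layers respond to, the sketch below applies a hand-made vertical-edge kernel (a Sobel kernel) to a toy image. In a real CNN the kernel weights are learned rather than written by hand; the image, the kernel choice and the function name are assumptions made for this example.

```python
def convolve2d_valid(image, kernel):
    """Slide the kernel over the image with no padding and stride 1.

    Like CNN layers, this computes a cross-correlation (the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 5 x 6 image: dark left half (0) and bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]

# Hand-made vertical-edge detector; a learned early-layer filter may look similar.
vertical_edge = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

response = convolve2d_valid(image, vertical_edge)
for row in response:
    print(row)  # large values only where the dark-to-bright edge falls inside the window
```

The response is zero over the flat regions and large only around the edge. Such low-level patterns appear in most images, which is why the early layers of a pretrained model tend to transfer well.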
Transfer Learning

Download pretrained model weights

Small Dataset
• Remove the last fully connected layer
• Add your own fully connected layer
• Freeze all layers except the new fully connected layer
Larger Dataset
• Freeze fewer layers and train the later layers
• Retrain the output layer (as you have fewer classes)
Very Large Dataset
• Download the pretrained model and the weights
• Retrain the model (all layers)
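The three recipes above can be summarized as a freezing plan. The sketch below is framework-agnostic and purely illustrative: the function name `plan_fine_tuning`, the regime labels, the layer names and the choice of unfreezing two extra layers for the "larger" case are all assumptions. It also assumes the final layer has already been replaced by a new classifier head.

```python
def plan_fine_tuning(layer_names, regime, unfreeze_last=2):
    """Return {layer name: trainable?} for a pretrained model.

    Assumes the last entry of layer_names is the new fully connected layer.
    """
    n = len(layer_names)
    if regime == "small":
        trainable_from = n - 1  # train only the new fully connected layer
    elif regime == "larger":
        # also train the last few pretrained layers (hypothetical default: 2)
        trainable_from = max(0, n - 1 - unfreeze_last)
    elif regime == "very_large":
        trainable_from = 0  # retrain every layer, starting from the pretrained weights
    else:
        raise ValueError(f"unknown regime: {regime!r}")
    return {name: i >= trainable_from for i, name in enumerate(layer_names)}


# Hypothetical layer names for a small pretrained network.
layers = ["conv1", "conv2", "conv3", "conv4", "new_fc"]
for regime in ("small", "larger", "very_large"):
    plan = plan_fine_tuning(layers, regime)
    frozen = [name for name, trainable in plan.items() if not trainable]
    print(regime, "-> frozen layers:", frozen)
```

In practice a plan like this would be applied with the framework's own mechanism, for example setting `requires_grad = False` on a layer's parameters in PyTorch or `layer.trainable = False` in Keras.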
Transfer Learning with Image Data

Three examples of models used for image processing of this type include:
• Oxford VGG Model
• Google Inception Model
• Microsoft ResNet Model

Transfer Learning with Language Data

Two examples of models of this type include:
• Google’s word2vec Model
• Stanford’s GloVe Model
Data Augmentation
Image Augmentation for Deep Learning
• Deep networks need large amounts of training data to
achieve good performance. To build a powerful image
classifier using very little training data, image
augmentation is usually required to boost the
performance of deep networks.
• Image augmentation artificially creates training images
by applying one processing operation, or a combination of
several, such as random rotations, shifts,
shears and flips.
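A minimal sketch of this idea, using only the Python standard library and images stored as lists of rows. It covers flips, 90-degree rotation and horizontal shifts; shear and arbitrary-angle rotation are left out because they need interpolation. The function names, the probabilities and the toy 3 x 3 image are assumptions for illustration, and a real pipeline would normally use an image library instead.

```python
import random


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def shift_horizontal(image, pixels, fill=0):
    """Shift right by `pixels` columns (left if negative), padding with `fill`."""
    width = len(image[0])
    pixels = max(-width, min(width, pixels))
    if pixels >= 0:
        return [[fill] * pixels + row[: width - pixels] for row in image]
    return [row[-pixels:] + [fill] * (-pixels) for row in image]


def augment(image, rng):
    """Apply a random combination of the operations above to a copy of the image."""
    out = [row[:] for row in image]
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    if rng.random() < 0.5:
        out = rotate90(out)
    return shift_horizontal(out, rng.randint(-1, 1))


rng = random.Random(0)  # fixed seed so the run is repeatable
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
augmented = [augment(image, rng) for _ in range(4)]
for variant in augmented:
    print(variant)
```

Each call to `augment` yields a new variant of the same image with the same label, so a small dataset can be expanded many times over during training.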