DenseNet Architecture and Advantages

The document describes DenseNet, a convolutional neural network architecture where each layer is directly connected to every other layer in a feed-forward fashion. DenseNet uses dense blocks where the output of each layer is concatenated with the outputs of preceding layers. This facilitates strong gradient flow and parameter efficiency. The architecture achieved state-of-the-art results on CIFAR, SVHN and ImageNet datasets using relatively fewer parameters than standard convolutional networks.

Uploaded by

Fahad Raza

DENSELY CONNECTED CONVOLUTIONAL NETWORKS

Presentation by:

Maria Waheed (l1f18bscs0460)

Farrukh Alam Virk (l1f18bscs0424)

WHAT IS COVERED IN THIS PRESENTATION

• Dense Block
• DenseNet Architecture
• Advantages of DenseNet
• CIFAR & SVHN Small-Scale Dataset Results
• ImageNet Large-Scale Dataset Results
• Further Analysis on Feature Reuse
STANDARD CONNECTIVITY

Dense Block:
A Dense Block is a module used in convolutional neural networks that connects all layers (with matching feature-map sizes) directly with each other. To preserve the feed-forward nature, each layer obtains additional inputs from all preceding layers and passes on its own feature maps to all subsequent layers.
In a standard ConvNet, the input image goes through multiple convolutions to obtain high-level features.
RESNET CONNECTIVITY
Identity mappings promote gradient propagation.

In ResNet, identity mappings are proposed to promote gradient propagation, and features are combined by element-wise addition. A ResNet can be viewed as an algorithm with a state that is passed from one ResNet module to the next.
DENSE ARCHITECTURE
DENSE CONNECTIVITY

C : Channel-wise concatenation

In DenseNet, each layer obtains additional inputs from all preceding layers and passes on its own feature maps to all subsequent layers. Concatenation is used, so each layer receives the “collective knowledge” of all preceding layers.
DENSE AND SLIM

[Figure: concatenation (C) during forward propagation through a dense block; each layer contributes k channels]

Each layer is slim, adding only k channels, so DenseNet has higher computational efficiency and memory efficiency.
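
The difference between ResNet's element-wise addition and DenseNet's channel-wise concatenation can be sketched in a few lines of Python. This is a toy illustration, not the authors' code: each channel is reduced to a single number, the helper names are made up for this example, and the values k0 = 16 (channels entering the block) and k = 12 are assumptions.

```python
def add_features(a, b):
    """ResNet-style merge: element-wise addition, channel count unchanged."""
    if len(a) != len(b):
        raise ValueError("addition needs equal channel counts")
    return [x + y for x, y in zip(a, b)]


def concat_features(*tensors):
    """DenseNet-style merge: channel-wise concatenation, channel counts add up."""
    merged = []
    for t in tensors:
        merged.extend(t)
    return merged


def dense_input_channels(k0, k, layer_index):
    """Channels seen by layer `layer_index` (1-based) inside a dense block.

    k0 is the number of channels entering the block (an assumed value here),
    k is the growth rate: every earlier layer contributed k channels.
    """
    return k0 + k * (layer_index - 1)


x0 = [1.0, 2.0]          # toy input with 2 channels
x1 = [0.5, 0.5]          # toy output of one layer, also 2 channels
summed = add_features(x0, x1)       # still 2 channels
stacked = concat_features(x0, x1)   # 4 channels
channels_at_layer_5 = dense_input_channels(k0=16, k=12, layer_index=5)
```

With addition the width of the state stays fixed, while with concatenation the input of each layer grows by k channels per preceding layer, which is why each individual layer can stay narrow.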
DenseNet Architecture:
Basic DenseNet Composition Layer:
Each composition layer applies pre-activation Batch Norm (BN) and ReLU, followed by a 3×3 Conv that outputs feature maps with k channels; for example, it transforms x0, x1, x2, x3 into x4. This idea comes from Pre-Activation ResNet.

[Figure: x0, x1, x2, x3 → Batch Norm → ReLU → Convolution (3x3) → x4 (k channels)]

x5 = h5([x0, …, x4])
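
The recurrence above, where each layer reads the concatenation of all earlier outputs, can be sketched as a toy forward pass. This is a minimal sketch under loud assumptions: every channel is a single number, Batch Norm is omitted, and the 3×3 convolution is replaced by a random linear mix of channels. The names `make_layer` and `dense_block` are hypothetical.

```python
import random


def relu(values):
    return [max(0.0, v) for v in values]


def make_layer(in_channels, k, rng):
    """Build a toy h_l that maps `in_channels` numbers to k new numbers.

    Stand-in for BN-ReLU-Conv: ReLU followed by a random linear mix
    (no Batch Norm, no spatial convolution).
    """
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_channels)]
               for _ in range(k)]

    def h(features):
        activated = relu(features)
        return [sum(w * a for w, a in zip(row, activated)) for row in weights]

    return h


def dense_block(x0, num_layers, k, seed=0):
    """Run x_l = h_l([x0, ..., x_{l-1}]) and return the full concatenation."""
    rng = random.Random(seed)
    features = list(x0)                      # running concatenation
    for _ in range(num_layers):
        h = make_layer(len(features), k, rng)
        features = features + h(features)    # append k new channels
    return features


x0 = [1.0, -2.0, 0.5]
out = dense_block(x0, num_layers=4, k=2)
```

The output keeps the original input channels untouched at the front and appends k channels per layer, so here it has 3 + 4 × 2 = 11 channels.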
DenseNet-B (Bottleneck Layers):
To reduce model complexity and size, a BN-ReLU-1×1 Conv is applied before the BN-ReLU-3×3 Conv.

[Figure: x0, …, x4 (l×k channels) → Batch Norm → ReLU → Convolution (1x1) (4×k channels) → Batch Norm → ReLU → Convolution (3x3) (k channels)]

Higher parameter and computational efficiency
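
To see where the saving comes from, the weight counts of a plain 3×3 layer and a bottleneck layer can be compared directly. This is a rough sketch that assumes the layer input has l×k channels (as in the figure) and ignores biases and Batch Norm parameters; the function names and the example values are illustrative.

```python
def conv_weights(in_ch, out_ch, kernel):
    """Number of weights in a kernel x kernel convolution (no bias)."""
    return in_ch * out_ch * kernel * kernel


def plain_layer_weights(l, k):
    """BN-ReLU-3x3 Conv: l*k input channels -> k output channels."""
    return conv_weights(l * k, k, 3)


def bottleneck_layer_weights(l, k):
    """BN-ReLU-1x1 Conv (l*k -> 4k) followed by BN-ReLU-3x3 Conv (4k -> k)."""
    return conv_weights(l * k, 4 * k, 1) + conv_weights(4 * k, k, 3)


k = 12  # growth rate used in some of the reported models
deep_plain = plain_layer_weights(20, k)
deep_bottleneck = bottleneck_layer_weights(20, k)
shallow_plain = plain_layer_weights(4, k)
shallow_bottleneck = bottleneck_layer_weights(4, k)
```

Under these assumptions the bottleneck only pays off once the concatenated input is wide enough (here from l = 8 onward); for the first few layers of a block the plain 3×3 layer has fewer weights.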
MULTIPLE DENSE BLOCKS WITH TRANSITION LAYERS:
A 1×1 Conv followed by 2×2 average pooling is used as the transition layer between two contiguous dense blocks.

Feature-map sizes are the same within a dense block, so that the feature maps can be concatenated together easily.

At the end of the last dense block, global average pooling is performed and then a softmax classifier is attached.
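
The two pooling operations mentioned above can be sketched in plain Python on a single feature map. This is only an illustration: it assumes a stride of 2 and even height and width, and the function names are made up for this example.

```python
def avg_pool_2x2(fmap):
    """2x2 average pooling with stride 2 on one feature map (list of rows).

    Assumes the height and width are even.
    """
    height, width = len(fmap), len(fmap[0])
    return [
        [
            (fmap[i][j] + fmap[i][j + 1] + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
            for j in range(0, width, 2)
        ]
        for i in range(0, height, 2)
    ]


def global_avg_pool(fmap):
    """Collapse one feature map to a single number (its mean)."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


fmap = [
    [1.0, 3.0, 0.0, 2.0],
    [5.0, 7.0, 4.0, 6.0],
    [2.0, 2.0, 8.0, 8.0],
    [2.0, 2.0, 0.0, 0.0],
]
pooled = avg_pool_2x2(fmap)          # 4x4 -> 2x2
summary = global_avg_pool(fmap)      # 4x4 -> one number
```

The transition pooling halves the height and width between blocks, while global average pooling reduces each final feature map to one value before the classifier.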

[Figure: three dense blocks (Dense Block 1, Dense Block 2, Dense Block 3) separated by Convolution and Pooling layers, followed by a Linear layer and the Output. Pooling reduces feature map sizes; feature map sizes match within each block.]
DENSENETS-B
DenseNets-B are regular DenseNets that use a 1x1 convolution to reduce the number of feature maps before the 3x3 convolution, which improves computational efficiency. The B stands for the bottleneck layer, already familiar from the work on ResNets.
DenseNet-BC (Further Compression):
• If a dense block contains m feature maps, the transition layer generates θm output feature maps, where 0<θ≤1 is referred to as the compression factor.
• When θ=1, the number of feature maps across transition layers remains unchanged. A DenseNet with θ<1 is referred to as DenseNet-C, and θ=0.5 is used in the experiments.
• When both bottleneck layers and transition layers with θ<1 are used, the model is referred to as DenseNet-BC.
• Finally, DenseNets with and without B/C, and with different numbers of layers L and growth rates k, are trained.

• DenseNets-C are a further incremental step beyond DenseNets-B, for the cases where we would like to reduce the number of output feature maps. The compression factor θ determines this reduction: instead of m feature maps at a transition layer, we will have θm, with θ in the range (0, 1]. The network remains unchanged when θ=1 and is a DenseNet-C when θ<1 (a DenseNet-BC if bottleneck layers are used as well).
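
The effect of the compression factor on the channel count can be traced block by block. This is a sketch under stated assumptions: the initial channel count k0 = 24 and the three blocks of 16 layers are example values, the product θm is rounded down, and `trace_channels` is a hypothetical helper.

```python
import math


def trace_channels(k0, k, layers_per_block, theta):
    """Return (stage, channels) pairs after each dense block and transition."""
    trace = []
    channels = k0
    last = len(layers_per_block) - 1
    for i, num_layers in enumerate(layers_per_block):
        channels += num_layers * k            # each layer adds k feature maps
        trace.append(("dense", channels))
        if i < last:                          # no transition after the last block
            channels = math.floor(theta * channels)
            trace.append(("transition", channels))
    return trace


compressed = trace_channels(k0=24, k=12, layers_per_block=[16, 16, 16], theta=0.5)
uncompressed = trace_channels(k0=24, k=12, layers_per_block=[16, 16, 16], theta=1.0)
```

With θ=0.5 each transition halves the number of feature maps handed to the next block, so the final block ends with 342 channels instead of 600 in this example.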
ADVANTAGES OF DENSENET
ADVANTAGE 1: STRONG GRADIENT FLOW

[Figure: the error signal flows directly back to earlier layers]

The error signal can be propagated to earlier layers more directly. This is a kind of implicit deep supervision, as earlier layers can get direct supervision from the final classification layer.
ADVANTAGE 2: PARAMETER & COMPUTATIONAL EFFICIENCY
For each layer, the number of parameters in ResNet is directly proportional to C×C, while the number of parameters in DenseNet is directly proportional to l×k×k.

[Figure: ResNet connectivity: layer hl maps C input channels to C output channels (correlated features); #parameters: O(C×C). DenseNet connectivity: layer hl maps l×k input channels to k output channels (diversified features); #parameters: O(l×k×k), where k is the growth rate and k<<C.]
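
The two scaling rules can be turned into a rough count over a whole stack of layers. The numbers below are illustrative assumptions only (C = 64, k = 12, 12 layers, 3×3 kernels, no biases or Batch Norm parameters) and do not correspond to any model reported in the slides.

```python
def resnet_layer_weights(C, kernel=3):
    """One C-to-C convolution: proportional to C x C."""
    return C * C * kernel * kernel


def densenet_layer_weights(l, k, kernel=3):
    """The l-th dense layer (l*k inputs, k outputs): proportional to l x k x k."""
    return l * k * k * kernel * kernel


C, k, depth = 64, 12, 12   # illustrative values, not taken from the slides
resnet_total = depth * resnet_layer_weights(C)
densenet_total = sum(densenet_layer_weights(l, k) for l in range(1, depth + 1))
```

Even though the dense layers see a growing input, the small growth rate keeps the total well below the fixed-width stack in this example.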
ADVANTAGE 3: MAINTAINS LOW COMPLEXITY FEATURES
Standard Connectivity:
The classifier uses only the most complex (high-level) features:

y = w4h4(x)

[Figure: x → h1(x) → h2(x) → h3(x) → h4(x) → classifier, with increasingly complex features]
Dense Connectivity:
The classifier uses features of all complexity levels:

y = w0x + w1h1(x) + w2h2(x) + w3h3(x) + w4h4(x)

[Figure: x, h1(x), h2(x), h3(x), h4(x) are concatenated (C) and fed to the classifier, with increasingly complex features]

In DenseNet, the classifier uses features of all complexity levels. This tends to give smoother decision boundaries. It also explains why DenseNet performs well when training data is insufficient.
RESULTS
RESULTS ON CIFAR-10

[Figure: test error (%) on CIFAR-10, with and without data augmentation, for ResNet (110 layers, 1.7 M), ResNet (1001 layers, 10.2 M), DenseNet (100 layers, 0.8 M), DenseNet (250 layers, 15.3 M) and the previous SOTA]
With data augmentation (C10+), test error:
• Small-size ResNet-110: 6.41%
• Large-size ResNet-1001 (10.2M parameters): 4.62%
• Previous state of the art (SOTA): 4.2%
• Small-size DenseNet-BC (L=100, k=12) (only 0.8M parameters): 4.5%
• Large-size DenseNet (L=250, k=24): 3.6%

Without data augmentation (C10), test error:
• Small-size ResNet-110: 11.26%
• Large-size ResNet-1001 (10.2M parameters): 10.56%
• Previous state of the art (SOTA): 7.3%
• Small-size DenseNet-BC (L=100, k=12) (only 0.8M parameters): 5.9%
• Large-size DenseNet (L=250, k=24): 5.2%
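
The parameter-efficiency claim can be checked with simple arithmetic on the C10+ figures listed above. The snippet only restates those numbers; the variable names are made up for this example.

```python
# (parameters in millions, test error in %) with data augmentation (C10+)
c10_plus = {
    "ResNet-1001": (10.2, 4.62),
    "DenseNet-BC (L=100, k=12)": (0.8, 4.5),
}

resnet_params, resnet_err = c10_plus["ResNet-1001"]
dense_params, dense_err = c10_plus["DenseNet-BC (L=100, k=12)"]

param_ratio = round(resnet_params / dense_params, 2)   # how many times fewer parameters
error_gap = round(resnet_err - dense_err, 2)           # percentage points in DenseNet's favour
```

On these figures the small DenseNet-BC uses roughly 13 times fewer parameters than ResNet-1001 while reaching a slightly lower test error.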
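RESULTS ON CIFAR-100

[Figure: test error (%) on CIFAR-100, with and without data augmentation, for ResNet (110 layers, 1.7 M), ResNet (1001 layers, 10.2 M), DenseNet (100 layers, 0.8 M), DenseNet (250 layers, 15.3 M) and the previous SOTA]

With data augmentation, test error (bar labels read from the figure):
• ResNet (110 layers, 1.7 M): 27.22%
• ResNet (1001 layers, 10.2 M): 22.71%
• Previous SOTA: 20.5%
• DenseNet (100 layers, 0.8 M): 22.3%
• DenseNet (250 layers, 15.3 M): 17.6%

Without data augmentation, test error (bar labels read from the figure):
• ResNet (110 layers, 1.7 M): 35.58%
• ResNet (1001 layers, 10.2 M): 33.47%
• Previous SOTA: 28.2%
• DenseNet (100 layers, 0.8 M): 24.2%
• DenseNet (250 layers, 15.3 M): 19.6%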
DETAILED RESULTS:

SVHN is the Street View House Numbers dataset. In the results table, blue marks the best result. DenseNet-BC does not achieve a better result than the basic DenseNet on SVHN; the authors argue that SVHN is a relatively easy task, and extremely deep models may overfit the training set.
RESULTS ON IMAGENET

[Figure: two panels showing Top-1 error (%) versus # Parameters (M) and versus GFLOPs, comparing DenseNet-121, DenseNet-169, DenseNet-201, DenseNet-264 and DenseNet-264 (k=48) with ResNet-34, ResNet-50, ResNet-101 and ResNet-152]

Top-1: 20.27%
Top-5: 5.17%
MULTI-SCALE DENSENET (Preview)

[Figure: Classifier 1, Classifier 2, Classifier 3, Classifier 4, … attached along the network]

Classifier 1: cat: 0.2 (0.2 ≱ threshold)
Classifier 2: cat: 0.4 (0.4 ≱ threshold)
Classifier 3: cat: 0.6 (0.6 > threshold)
[Figure: a test input passes through Classifier 1, Classifier 2, Classifier 3, Classifier 4, …, with “easy” examples at the early classifiers and “hard” examples at the later ones]

Inference Speed:
~ 2.6x faster than ResNets
~ 1.3x faster than DenseNets
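
The thresholding idea in this preview can be sketched as an early-exit loop. This is a hedged illustration, not the actual multi-scale DenseNet code: the function name, the threshold value 0.5 and the confidence of the fourth classifier are assumptions; only the first three confidences come from the slide.

```python
def early_exit(classifier_outputs, threshold):
    """Return (classifier index, label, confidence) of the first confident exit.

    classifier_outputs is a list of (label, confidence) pairs ordered from the
    first (cheapest) classifier to the last. If no classifier exceeds the
    threshold, the prediction of the last classifier is used.
    """
    for index, (label, confidence) in enumerate(classifier_outputs, start=1):
        if confidence > threshold:
            return index, label, confidence
    label, confidence = classifier_outputs[-1]
    return len(classifier_outputs), label, confidence


# 0.2, 0.4, 0.6 follow the slide; 0.9 and the threshold 0.5 are assumed values.
outputs = [("cat", 0.2), ("cat", 0.4), ("cat", 0.6), ("cat", 0.9)]
exit_point = early_exit(outputs, threshold=0.5)
```

In this example the input leaves the network at Classifier 3, so the computation for the later classifiers is never spent on it.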
CONVOLUTIONAL NETWORKS
LeNet, AlexNet, VGG, Inception, ResNet

