CNN Architectures — LeNet,
AlexNet, VGG, GoogLeNet and
ResNet
Prabhu Raghav
Mar 15, 2018
In my previous blog post, I explained my understanding of Convolutional Neural Networks (CNNs). In this post, I am going to detail convolution parameters and the various CNN architectures used in the ImageNet challenge. ImageNet has run an annual competition in visual recognition (ILSVRC, the "ImageNet Large Scale Visual Recognition Challenge") since 2010, where participants are provided with about 1.4 million images. Below are some popular CNN architectures that won ILSVRC competitions.
LeNet-5
AlexNet
VGGNet
GoogLeNet
ResNet
Figure 1 : ILSVRC
Before delving into the architectures listed above, let us see why convolution and pooling layers are placed in front of FC (fully connected) layers, and how the outputs of convolution and pooling layers are calculated in order to filter the essential features from an input image.
Why are convolution layers used before the fully connected layers?
Generally, convolution and pooling layers act as giant filters. Imagine a 224 x 224 x 3 pixel image connected directly to an FC layer as the first hidden layer with 100,000 perceptrons: the total number of connections would be 224 * 224 * 3 * 100,000 = 15,052,800,000, roughly 15 billion parameters, which is impractical to process. Convolution and max pooling layers help to discard features of the image that may not be required for training.
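As a quick back-of-the-envelope check of that arithmetic (the convolution comparison is my own illustration, using 96 filters of size 11 x 11 x 3 as an example, not a figure from the post):

```python
# Fully connected first layer on a 224 x 224 x 3 image with 100,000 units
fc_weights = 224 * 224 * 3 * 100_000
print(f"{fc_weights:,}")   # 15,052,800,000 -> about 15 billion weights

# Compare with a convolution layer of 96 filters of size 11 x 11 x 3:
# the same small set of weights is shared across every image position.
conv_weights = 96 * 11 * 11 * 3
print(f"{conv_weights:,}")  # 34,848 weights
```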
Convolution Parameters
A convolution layer accepts a volume of size W x H x D and requires four hyperparameters:
Number of filters → K
Spatial Extent → Fw, Fh (Filter width, Filter height)
Stride → Sw, Sh (Stride width, Stride height)
Padding → P
To calculate the output size of a convolution layer (the receptive field calculation), the formula is:
OM = (IM - F + 2P) / S + 1
To calculate the output size of a pooling layer, the formula is:
OM = (IM - F) / S + 1
where
OM → Output Matrix
IM → Input Matrix
P → Padding
F → Filter
S → Stride
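As a small illustration, these two formulas can be written as one-line Python helpers (the function names are mine):

```python
def conv_output_size(im, f, p, s):
    """Convolution output size: OM = (IM - F + 2P) / S + 1."""
    return (im - f + 2 * p) // s + 1

def pool_output_size(im, f, s):
    """Pooling output size: OM = (IM - F) / S + 1."""
    return (im - f) // s + 1

# Example: a 32 x 32 input with a 5 x 5 filter, no padding, stride 1,
# followed by 2 x 2 pooling with stride 2
print(conv_output_size(32, 5, 0, 1))  # 28
print(pool_output_size(28, 2, 2))     # 14
```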
By applying the above receptive field and pooling formulas, the convolution, pooling and feature map output sizes are derived. The key operations in a CNN:
Figure 2 : CNN Key Operation (Source : R.Fergus, Y.LeCun)
LeNet-5
This classical neural network architecture was successfully applied to recognising handwritten digit patterns from MNIST. Below is the LeNet-5 architecture model.
Figure 3 : LeNet-5 Architecture
LeNet-5 receives an input image of 32 x 32 x 1 (a greyscale image), and its goal is to recognise handwritten digit patterns. The first convolution layer uses a 5 x 5 filter with a stride of 1. Applying the above receptive field calculation formula gives an output volume of 28 x 28. The derivation is below:
W x H → 32 x 32 (width x height)
F(w x h) → 5 x 5 (filter)
S → 1 (stride)
P → 0 (padding)
OM = (32 - 5 + 2 * 0) / 1 + 1 = 28
The next layer is a pooling layer. To calculate its output in the above LeNet-5 architecture, the derivation is as follows:
IM → 28 (input matrix, i.e. the convolution output volume derived above)
F → 2 x 2 (pooling filter)
S → 2 (stride)
OM = (28 - 2) / 2 + 1 = 14, giving 14 x 14 feature maps.
Finally, the network reaches a fully connected (FC) layer with 120 nodes, followed by another FC layer with 84 nodes. It uses sigmoid or tanh nonlinearity functions. The output variable Yhat takes 10 possible values, for the digits 0 to 9. The network is trained on the MNIST digit dataset with 60K training examples.
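As a rough sketch of the LeNet-5 stack described above, here is a PyTorch illustration; the 6 and 16 feature maps in the two convolution stages are assumptions based on the standard LeNet-5 configuration, and tanh stands in for the sigmoid/tanh choice mentioned above:

```python
import torch
import torch.nn as nn

class LeNet5(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, stride=1),   # 32x32 -> 28x28
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=2, stride=2),      # 28x28 -> 14x14
            nn.Conv2d(6, 16, kernel_size=5, stride=1),  # 14x14 -> 10x10
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=2, stride=2),      # 10x10 -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120),  # FC layer with 120 nodes
            nn.Tanh(),
            nn.Linear(120, 84),          # FC layer with 84 nodes
            nn.Tanh(),
            nn.Linear(84, 10),           # 10 output digits
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Quick shape check on a dummy 32 x 32 greyscale batch
print(LeNet5()(torch.zeros(1, 1, 32, 32)).shape)  # torch.Size([1, 10])
```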
AlexNet
AlexNet starts with 227 x 227 x 3 images, and its first convolution layer applies 96 filters of size 11 x 11 with a stride of 4. The output volume reduces to 55 x 55. The next layer is a pooling layer which applies 3 x 3 max pooling with a stride of 2. The network continues in this fashion and finally reaches an FC layer with 9216 nodes, followed by two more FC layers with 4096 nodes each. At the end, it uses a softmax function over 1000 output classes. In total it has about 60 million parameters.
Figure 4 : AlexNet Architecture
Some of the highlights of the AlexNet architecture:
It uses the ReLU activation function instead of sigmoid or tanh, which made training more than 5 times faster at the same accuracy.
It uses "dropout" (with a ratio of 0.5) to deal with overfitting, at the cost of roughly doubling the training time.
More data and a bigger model: 7 hidden layers, 650K units and 60M parameters.
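To sanity-check the size arithmetic above, here is a minimal PyTorch sketch of just the first convolution and pooling stage (layer sizes taken from the description above; the rest of the network is omitted):

```python
import torch
import torch.nn as nn

# First AlexNet stage: 96 filters of 11 x 11 with stride 4,
# followed by 3 x 3 max pooling with stride 2.
stage1 = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4),  # 227 -> (227 - 11)/4 + 1 = 55
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),       # 55  -> (55 - 3)/2 + 1  = 27
)

x = torch.zeros(1, 3, 227, 227)
print(stage1(x).shape)  # torch.Size([1, 96, 27, 27])
```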
VGG-16
VGG-16 is a simpler architecture model, since it does not use many hyperparameters. It always uses 3 x 3 filters with a stride of 1 and SAME padding in its convolution layers, and 2 x 2 max pooling with a stride of 2.
Figure 5 : VGG-16 → Source
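As a sketch of this repeated pattern (my own PyTorch illustration, assuming the standard VGG-16 channel widths for the first two blocks):

```python
import torch.nn as nn

# One VGG-style block: repeated 3 x 3 convs with stride 1 and SAME
# padding, closed by a 2 x 2 max pool with stride 2.
def vgg_block(in_channels, out_channels, num_convs):
    layers = []
    for _ in range(num_convs):
        layers += [nn.Conv2d(in_channels, out_channels, kernel_size=3,
                             stride=1, padding=1),  # padding=1 keeps H x W
                   nn.ReLU()]
        in_channels = out_channels
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))  # halves H and W
    return nn.Sequential(*layers)

# VGG-16 stacks five such blocks, e.g. the first two:
block1 = vgg_block(3, 64, num_convs=2)    # 224 -> 112
block2 = vgg_block(64, 128, num_convs=2)  # 112 -> 56
```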
GoogLeNet
GoogLeNet, the winner of ILSVRC 2014, is built from so-called Inception modules. It goes deeper via parallel paths with different receptive field sizes, and it achieved a top-5 error rate of 6.67%.
Figure 6 : GoogLeNet Inception Module
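As an illustration of the parallel-path idea, here is a hedged PyTorch sketch of a single Inception module; the branch widths are my assumptions, roughly following one of the GoogLeNet modules rather than values from the post:

```python
import torch
import torch.nn as nn

# Four parallel branches with different receptive fields, whose outputs
# are concatenated along the channel dimension.
class InceptionModule(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.branch1 = nn.Sequential(nn.Conv2d(in_ch, 64, 1), nn.ReLU())
        self.branch2 = nn.Sequential(nn.Conv2d(in_ch, 96, 1), nn.ReLU(),
                                     nn.Conv2d(96, 128, 3, padding=1), nn.ReLU())
        self.branch3 = nn.Sequential(nn.Conv2d(in_ch, 16, 1), nn.ReLU(),
                                     nn.Conv2d(16, 32, 5, padding=2), nn.ReLU())
        self.branch4 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                     nn.Conv2d(in_ch, 32, 1), nn.ReLU())

    def forward(self, x):
        # Every branch preserves H x W, so the outputs can be concatenated.
        return torch.cat([self.branch1(x), self.branch2(x),
                          self.branch3(x), self.branch4(x)], dim=1)

x = torch.zeros(1, 192, 28, 28)
print(InceptionModule(192)(x).shape)  # torch.Size([1, 256, 28, 28])
```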
This architecture is 22 layers deep. It reduces the number of parameters from 60 million (AlexNet) to about 4 million. An alternative view of the architecture in tabular format is below.
Table 1 : GoogLeNet Architecture Tabular View. Source
ResNet (2015)
The winner of ILSVRC 2015 was the Residual Neural Network (ResNet) by Kaiming He et al. This architecture introduced a concept called "skip connections". Typically, the input passes through two linear transformations (weight layers) with a ReLU activation. In a residual network, the input is also copied directly to the output of the second transformation, and the sum is passed through the final ReLU.
Figure 7 : ResNet Architecture comparison with plain. Source
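Here is a minimal PyTorch sketch of that residual block (batch normalisation and channel changes are omitted for brevity; the 3 x 3 filter size is my assumption):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Two convolutions, with the input added back ("skip connection")
# before the final ReLU.
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        out = F.relu(self.conv1(x))  # first transformation + ReLU
        out = self.conv2(out)        # second transformation
        return F.relu(out + x)       # add the skipped input, then ReLU
```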
The overall summary table of the various CNN architecture models is below.
Table 2 : ILSVRC competition CNN architecture models