Convolutional Neural Networks Guide

Deep Learning – BCSE332L

Convolutional Neural Networks

Dr. R. Bhargavi
Professor
SCOPE
VIT University

Computer Vision - Applications
• Image classification (for example, malignant/benign)
• Object detection
• Style transfer

Working with Images - Fully connected DNN
• A fully connected DNN/MLP takes only a flat (tabular) feature vector as input, so an image must be flattened first.
• It does not work well with images because it relies heavily on specific pixel positions. Any positional variation (for example, a shifted object) can therefore result in misclassification (example shown in the figure below).
• Features are treated as independent of each other, so the spatial relationship between neighboring pixels is ignored.
• A traditional fully connected DNN has a huge number of learnable parameters.
• Example: how many parameters are needed for images of size 1024 x 1024 x 3 with 2 hidden layers of size 1000?
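The question on the slide can be worked out with a few lines of arithmetic. The sketch below counts weights and biases for the two hidden layers only; the helper name `dense_params` is illustrative, and the output layer is left out because the slide does not specify its size.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Parameters of one fully connected layer: weights plus one bias per unit."""
    return n_in * n_out + n_out

n_inputs = 1024 * 1024 * 3               # flattened 1024 x 1024 x 3 image
hidden1 = dense_params(n_inputs, 1000)   # input -> first hidden layer
hidden2 = dense_params(1000, 1000)       # first -> second hidden layer
total = hidden1 + hidden2

print(f"first hidden layer : {hidden1:,}")
print(f"second hidden layer: {hidden2:,}")
print(f"total              : {total:,}")
```

Almost all of the roughly 3.1 billion parameters sit in the first layer, which is connected to every input pixel.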

Working with Images – DNN (cont…)
[Figure: an input image is flattened into a vector and fed to a fully connected DNN]

Working with Images - CNN
• A CNN performs automatic feature extraction.
• This allows us to feed images directly instead of extracting features manually.
• Convolutional layers are responsible for feature extraction.
• Convolutional layers take locality into account.
• Because the conv layers learn the representations, the name representation learning is also used for DNNs.

Convolutional Neural Network - Architecture
[Figure: convolutional layers extract abstract features, followed by FC layers for classification]

CNN – Architecture (cont…)
[Figure: a stack of Convolution, Pooling, and Fully Connected layers, with the trainable layers marked]

Convolutional Layer
• The convolutional layer is the core building block of a convolutional network.
• A conv layer's parameters consist of a set of learnable filters.
• Local connectivity: each neuron is connected only to a small region of the input.
[Figure: a 2 x 2 input patch (x1, x2, x3, x4) is combined with the filter weights (w1, w2, w3, w4) to give one value z in the feature map; the patch is the receptive field of that neuron in the feature map]
$z = b + \sum_{i} w_i x_i$

Convolutional Layer (cont…)
• Parameter sharing / weight sharing: within one conv layer, the same filter is used across the entire image.
• Rationale: if detecting a horizontal edge is important at some location in the image, it should intuitively be useful at other locations as well, due to the translationally invariant structure of images. There is therefore no need to relearn how to detect a horizontal edge at every distinct location in the conv layer output volume.
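A small NumPy experiment can illustrate why one shared filter is enough. In this sketch (the function name `conv2d_valid` and the edge-detector kernel are illustrative choices, not from the slides), shifting the image one pixel to the right shifts the feature map by the same amount, so the same weights detect the same pattern at the new location.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one kernel over the image (cross-correlation, no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.random((6, 6))

# A simple horizontal-edge detector, shared across all positions.
kernel = np.array([[ 1.0,  1.0,  1.0],
                   [ 0.0,  0.0,  0.0],
                   [-1.0, -1.0, -1.0]])

# Shift the image one pixel to the right (new first column is zero).
shifted = np.zeros_like(image)
shifted[:, 1:] = image[:, :-1]

out = conv2d_valid(image, kernel)
out_shifted = conv2d_valid(shifted, kernel)

# The feature map shifts by one column as well.
shift_preserved = np.allclose(out_shifted[:, 1:], out[:, :-1])
print(shift_preserved)
```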

Convolution Operation
[Figure: a kernel slides over the input to produce the feature map]
Output size is given by (nh – kh + 1) x (nw – kw + 1), where (nh x nw) is the size (height and width) of the input tensor and (kh x kw) is the size of the kernel.
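As a rough check of the output-size formula, the operation can be sketched with NumPy's `sliding_window_view`. The function name `corr2d` and the 3 x 3 input with a 2 x 2 kernel are illustrative values, not taken from the slides. Like most deep learning libraries, the sketch computes cross-correlation (the kernel is not flipped).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def corr2d(x, kernel):
    """Valid 2D cross-correlation: no padding, stride 1."""
    windows = sliding_window_view(x, kernel.shape)   # (nh-kh+1, nw-kw+1, kh, kw)
    return np.einsum("ijkl,kl->ij", windows, kernel)

x = np.arange(9).reshape(3, 3)        # [[0,1,2],[3,4,5],[6,7,8]]
kernel = np.array([[0, 1],
                   [2, 3]])

feature_map = corr2d(x, kernel)
expected_shape = (x.shape[0] - kernel.shape[0] + 1,
                  x.shape[1] - kernel.shape[1] + 1)

print(feature_map)          # [[19 25] [37 43]]
print(expected_shape)       # (2, 2)
```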

Convolutions with Multiple Channels
1 input channel and multiple output channels
• Use multiple kernels.
• Each kernel produces one output channel.
• The same convolution operation is used for each of the output channels.
• Each kernel learns its own parameters, so different kernels act as different filters.

Convolutions with Multiple Channels (cont…)
Multiple input channels (3 channels) and a single output channel
[Figure: a 3-channel input convolved with one 3-channel kernel, giving a single output channel]

Convolutions with Multiple Channels (cont…)
Multiple input channels (3 channels) and multiple output channels
• Input: 3 channels
• Kernel 1 to Kernel 5: 3 channels each
• Output: 5 channels
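The 3-channel input and five 3-channel kernels from the slide can be sketched in NumPy as follows. The 8 x 8 spatial size, the 3 x 3 kernel size, the random values, and the function name `conv2d_multi` are assumptions made only for illustration. Each kernel spans all input channels, and its per-channel products are summed into one output channel.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_multi(x, kernels, bias):
    """x: (C_in, H, W); kernels: (C_out, C_in, kh, kw); bias: (C_out,).
    Valid cross-correlation with stride 1."""
    kh, kw = kernels.shape[2:]
    # windows has shape (C_in, H-kh+1, W-kw+1, kh, kw)
    windows = sliding_window_view(x, (kh, kw), axis=(1, 2))
    # Sum over input channels (c) and kernel positions (k, l).
    out = np.einsum("chwkl,ockl->ohw", windows, kernels)
    return out + bias[:, None, None]

rng = np.random.default_rng(0)
x = rng.random((3, 8, 8))            # 3 input channels
kernels = rng.random((5, 3, 3, 3))   # 5 kernels, each with 3 channels
bias = rng.random(5)

y = conv2d_multi(x, kernels, bias)
n_params = kernels.size + bias.size  # 5 * (3*3*3) + 5

print(y.shape)      # (5, 6, 6)
print(n_params)     # 140
```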

Padding
• Without padding (aka valid convolution), each convolution operation reduces the size of the output.
• Some pixels (for example, the corner ones) are used least, whereas others are used more often.
• If the input is of size n x n, the filter size is f x f, and the padding size is p, then the output size will be (n + 2p – f + 1) x (n + 2p – f + 1).
• If the output size is the same as the input size, it is called same padding.

Example (padding size = 1)

Input (5 x 5, shown with one ring of zero padding):
0 0 0 0 0 0 0
0 1 0 2 2 1 0
0 2 1 1 2 1 0
0 2 1 1 1 1 0
0 0 1 1 2 2 0
0 2 2 1 1 1 0
0 0 0 0 0 0 0

Kernel (3 x 3):
-1  0  0
-1  1  0
 0  1  0

Output (5 x 5; only the first two rows and the first column are shown):
 3  0  3  2  0
 4 -1  1  0 -2
 2  …
 2  …
 2  …
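The padding example above can be reproduced with a short NumPy sketch. The function name `conv2d` is illustrative; the input and kernel values are the ones on the slide, and the operation is cross-correlation (no kernel flip), which is what matches the slide's numbers.

```python
import numpy as np

x = np.array([[1, 0, 2, 2, 1],
              [2, 1, 1, 2, 1],
              [2, 1, 1, 1, 1],
              [0, 1, 1, 2, 2],
              [2, 2, 1, 1, 1]])
k = np.array([[-1, 0, 0],
              [-1, 1, 0],
              [ 0, 1, 0]])

def conv2d(image, kernel, padding=0):
    """Cross-correlation with zero padding on all sides and stride 1."""
    padded = np.pad(image, padding)            # constant (zero) padding
    kh, kw = kernel.shape
    out_h = padded.shape[0] - kh + 1           # n + 2p - f + 1
    out_w = padded.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=padded.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

valid = conv2d(x, k)              # no padding: 5 - 3 + 1 = 3
same = conv2d(x, k, padding=1)    # p = 1: 5 + 2 - 3 + 1 = 5

print(valid.shape, same.shape)    # (3, 3) (5, 5)
print(same[:2])                   # first two rows, as on the slide
```

With p = 1 and a 3 x 3 filter the output keeps the 5 x 5 input size, which is the same-padding case.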

Stride
• If the input is of size n x n, the filter size is f x f, the padding size is p, and the stride is s, then the output size will be (⌊(n + 2p – f)/s⌋ + 1) x (⌊(n + 2p – f)/s⌋ + 1).

Example (stride = 2, padding size = 1)

Input (5 x 5, shown with one ring of zero padding):
0 0 0 0 0 0 0
0 1 0 2 2 1 0
0 2 1 1 2 1 0
0 2 1 1 1 1 0
0 0 1 1 2 2 0
0 2 2 1 1 1 0
0 0 0 0 0 0 0

Kernel (3 x 3):
-1  0  0
-1  1  0
 0  1  0

Output (3 x 3):
 3  3  0
 2  0  0
 2 -2 -2
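The stride formula and the stride = 2 example can be checked with the following sketch. The names `conv_output_size` and `conv2d` are illustrative; the floor in the formula is written as integer division, and the input and kernel are the values from the slide.

```python
import numpy as np

def conv_output_size(n, f, p=0, s=1):
    """floor((n + 2p - f) / s) + 1"""
    return (n + 2 * p - f) // s + 1

def conv2d(image, kernel, padding=0, stride=1):
    """Cross-correlation with zero padding and a configurable stride."""
    padded = np.pad(image, padding)
    fh, fw = kernel.shape
    out_h = conv_output_size(image.shape[0], fh, padding, stride)
    out_w = conv_output_size(image.shape[1], fw, padding, stride)
    out = np.zeros((out_h, out_w), dtype=padded.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride      # top-left corner of the window
            out[i, j] = np.sum(padded[r:r + fh, c:c + fw] * kernel)
    return out

x = np.array([[1, 0, 2, 2, 1],
              [2, 1, 1, 2, 1],
              [2, 1, 1, 1, 1],
              [0, 1, 1, 2, 2],
              [2, 2, 1, 1, 1]])
k = np.array([[-1, 0, 0],
              [-1, 1, 0],
              [ 0, 1, 0]])

strided = conv2d(x, k, padding=1, stride=2)
print(conv_output_size(5, 3, p=1, s=2))   # 3
print(strided)
```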

Inductive Biases
• Sparse connectivity: based on the assumption that neighboring pixels are related.
• Parameter sharing: based on the assumption that the same filters work in different parts of the image.
• The above two assumptions are called inductive biases.
• Inductive biases help CNNs learn more quickly and generalize better than fully connected NNs.

Pooling
• Used between the conv layers.
• Reduces the spatial size of the representation, which reduces the number of parameters and the amount of computation in the network.
• Helps control overfitting.
• Accepts a volume of size W1×H1×D1.
• Requires two hyperparameters:
  • Spatial extent F
  • Stride S
• Produces a volume of size W2×H2×D2 where:
  • W2 = ((W1−F)/S)+1
  • H2 = ((H1−F)/S)+1
  • D2 = D1
• No learnable parameters.
• Zero-padding the input is usually not done for pooling layers.
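A minimal max-pooling sketch under these definitions is shown below. The function name `max_pool2d` and the 4 x 4 x 1 test volume are illustrative, and the array is stored as (height, width, depth). Each depth slice is pooled independently, so D2 = D1 and no parameters are learned.

```python
import numpy as np

def max_pool2d(x, f, s):
    """x: (H1, W1, D1) -> (H2, W2, D1), with H2 = (H1 - f)//s + 1 and W2 likewise."""
    h1, w1, d1 = x.shape
    h2 = (h1 - f) // s + 1
    w2 = (w1 - f) // s + 1
    out = np.empty((h2, w2, d1), dtype=x.dtype)
    for i in range(h2):
        for j in range(w2):
            window = x[i * s:i * s + f, j * s:j * s + f, :]
            out[i, j, :] = window.max(axis=(0, 1))   # max per depth slice
    return out

x = np.arange(16).reshape(4, 4, 1)   # rows: 0..3, 4..7, 8..11, 12..15
pooled = max_pool2d(x, f=2, s=2)

print(pooled.shape)        # (2, 2, 1)
print(pooled[:, :, 0])     # [[ 5  7] [13 15]]
```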

CNN Architectures

Source: [Link]
LeNet
• LeNet, proposed by Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner in 1998, laid the groundwork for convolutional neural networks (CNNs) and their applications in handwritten digit recognition.
• LeNet was trained using stochastic gradient descent (SGD) with backpropagation.
• The network was trained on the MNIST dataset, comprising 60,000 training examples and 10,000 test examples.
• Data augmentation techniques such as translation, rotation, and scaling were employed to increase the diversity of training samples and improve generalization.
• LeNet achieved an accuracy of over 99% on the MNIST dataset.

LeNet-5 (cont…)
• Used sigmoid and tanh activations.
• Has approx. 60k learnable parameters.
• LeNet was used to read zip codes, digits, etc.
Layers shown in the figure:
• Conv: 6 kernels, 5 x 5, stride = 1
• Avg pool: 2 x 2, stride = 2
• Conv: 16 kernels, 5 x 5, stride = 1
• Avg pool: 2 x 2, stride = 2
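The "approx. 60k" figure can be sanity-checked with a rough count. The conv and pooling settings below are the ones listed on the slide. The 32 x 32 x 1 input and the fully connected sizes 120, 84, and 10 are not on the slide; they are assumed here from the commonly described LeNet-5 layout, and the second conv layer is treated as fully connected across its 6 input channels (the original paper used a sparser connection scheme, so its count is lower).

```python
def out_size(n, f, s):
    """Spatial output size without padding: (n - f)//s + 1."""
    return (n - f) // s + 1

def conv_params(k, c_in, c_out):
    """Each filter has k*k*c_in weights plus one bias."""
    return (k * k * c_in + 1) * c_out

def dense_params(n_in, n_out):
    return (n_in + 1) * n_out

n = 32                      # assumed input: 32 x 32 x 1
n = out_size(n, 5, 1)       # conv, 6 kernels 5 x 5, stride 1  -> 28
n = out_size(n, 2, 2)       # avg pool 2 x 2, stride 2         -> 14
n = out_size(n, 5, 1)       # conv, 16 kernels 5 x 5, stride 1 -> 10
n = out_size(n, 2, 2)       # avg pool 2 x 2, stride 2         -> 5
flat = n * n * 16           # 400 features into the FC layers

total = (conv_params(5, 1, 6)
         + conv_params(5, 6, 16)
         + dense_params(flat, 120)   # assumed FC sizes: 120, 84, 10
         + dense_params(120, 84)
         + dense_params(84, 10))

print(n, flat, total)       # 5 400 61706
```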

Source: [Link]
AlexNet
• This deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into 1000 different classes.
• Test data performance: achieved top-1 and top-5 error rates of 37.5% and 17.0%.
• In the ILSVRC-2012 competition, a variant of this model achieved a winning top-5 test error rate of 15.3%.
Source: [Link]

AlexNet
• AlexNet consists of eight learned layers:
  • five convolutional layers, some of them followed by max-pooling layers, and
  • three fully connected layers.
• Rectified Linear Units (ReLU) were used as activation functions, providing faster convergence and alleviating the vanishing gradient problem.
• The neural network has 60 million parameters and 650,000 neurons.
• Local Response Normalization (LRN) was introduced to normalize activations within local regions of the feature maps.
• LRN operates on local groups of neurons: each activation is normalized by the activity of neurons at the same spatial position in adjacent feature channels.

AlexNet - Training
• LRN is done using the formula
$b^{i}_{x,y} = a^{i}_{x,y} \Big/ \Big( k + \alpha \sum_{j=\max(0,\, i-n/2)}^{\min(N-1,\, i+n/2)} \big(a^{j}_{x,y}\big)^{2} \Big)^{\beta}$
• $a^{i}_{x,y}$ is the activity of a neuron computed by applying kernel i at position (x, y) and then applying ReLU; $b^{i}_{x,y}$ is the response-normalized activity.
• n: the number of "adjacent" kernel maps at the same spatial position over which the sum runs.
• N: the total number of kernels in the layer.
• The constants k, n, α, and β are hyper-parameters with values k = 2, n = 5, α = 10^-4, and β = 0.75.
• AlexNet was trained using stochastic gradient descent (SGD) with momentum.
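A direct, unoptimized NumPy sketch of LRN as described in the AlexNet paper is given below: each activation is divided by (k + α · sum of squared activations over up to n adjacent kernel maps at the same position) raised to β. The function name `local_response_norm` and the all-ones test input are illustrative; the default hyper-parameters are the values listed on the slide.

```python
import numpy as np

def local_response_norm(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """a: activations of shape (N, H, W), where N is the number of kernel maps.
    Normalizes each map by the squared activity of its adjacent maps."""
    num_maps = a.shape[0]
    b = np.empty_like(a, dtype=float)
    for i in range(num_maps):
        lo = max(0, i - n // 2)               # first adjacent map
        hi = min(num_maps - 1, i + n // 2)    # last adjacent map
        denom = (k + alpha * np.sum(a[lo:hi + 1] ** 2, axis=0)) ** beta
        b[i] = a[i] / denom
    return b

a = np.ones((8, 2, 2))        # 8 kernel maps, 2 x 2 spatial positions
b = local_response_norm(a)

# A middle map sums over 5 neighbors, an edge map over only 3.
print(b[3, 0, 0], 1 / (2 + 5e-4) ** 0.75)
print(b[0, 0, 0], 1 / (2 + 3e-4) ** 0.75)
```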

AlexNet - Training
• Overfitting reduction:
  • Data augmentation
  • Dropout
• Data augmentation techniques such as cropping, flipping, and color jittering were employed to increase the diversity of training samples.
• The network was trained on two NVIDIA GTX 580 GPUs, an early large-scale use of GPU acceleration for deep learning.

AlexNet - Architecture
Input: 227 x 227 x 3
• Conv 11 x 11, S = 4, ReLU → 55 x 55 x 96
• Maxpool 3 x 3, S = 2 → 27 x 27 x 96
• Conv 5 x 5, S = 1, same padding, ReLU → 27 x 27 x 256
• Maxpool 3 x 3, S = 2 → 13 x 13 x 256
• Conv 3 x 3, S = 1, same padding, ReLU → 13 x 13 x 384
• Conv 3 x 3, S = 1, same padding, ReLU → 13 x 13 x 384
• Conv 3 x 3, S = 1, same padding, ReLU → 13 x 13 x 256
• Maxpool 3 x 3, S = 2 → 6 x 6 x 256
• FC 4096 → FC 4096 → FC 1000 → SoftMax
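The feature-map sizes in the AlexNet diagram follow from the output-size formula floor((n + 2p – f)/s) + 1. The sketch below traces them layer by layer. The layer names and the helper `out_size` are illustrative, and "same" padding is taken to mean p = 2 for the 5 x 5 conv and p = 1 for the 3 x 3 convs, which is an assumption that keeps the spatial size unchanged at stride 1.

```python
def out_size(n, f, s=1, p=0):
    """floor((n + 2p - f) / s) + 1"""
    return (n + 2 * p - f) // s + 1

# (name, filter size, stride, padding, output channels or None for pooling)
layers = [
    ("conv1", 11, 4, 0, 96),
    ("pool1",  3, 2, 0, None),
    ("conv2",  5, 1, 2, 256),
    ("pool2",  3, 2, 0, None),
    ("conv3",  3, 1, 1, 384),
    ("conv4",  3, 1, 1, 384),
    ("conv5",  3, 1, 1, 256),
    ("pool3",  3, 2, 0, None),
]

size, channels = 227, 3
shapes = []
for name, f, s, p, c_out in layers:
    size = out_size(size, f, s, p)
    if c_out is not None:          # pooling keeps the channel count
        channels = c_out
    shapes.append((name, size, size, channels))
    print(f"{name}: {size} x {size} x {channels}")

flattened = size * size * channels   # input to the first FC layer
print(flattened)                     # 9216
```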

How to compute the number of parameters in a CNN
What will be the output size of the following network? How many learnable parameters exist? No padding is used and stride = 1.
Network: 10 x 10 x 1 grayscale image → Conv (3 x 3, 1 filter)

Output size = (10 – 3 + 1, 10 – 3 + 1, 1) = (8, 8, 1)
Parameters = (3 x 3 x 1) + 1 = 10

Number of parameters in CNN (cont…)
What will be the output size of the following network? How many learnable parameters exist? No padding is used and stride = 1.
Network: 10 x 10 x 1 grayscale image → Conv (3 x 3, 5 filters) → Conv (3 x 3, 2 filters)

After the first conv, output size = (10 – 3 + 1, 10 – 3 + 1, 5) = (8, 8, 5)
Parameters: each filter has (3 x 3 x 1) + 1 = 10; for 5 filters = 50
After the second conv, output size = (8 – 3 + 1, 8 – 3 + 1, 2) = (6, 6, 2)
Parameters: each filter has (3 x 3 x 5) + 1 = 46; for 2 filters = 92
Total parameters = 50 + 92 = 142

Number of parameters in CNN (cont…)
What will be the output size of the following network? How many learnable parameters exist? No padding is used and stride = 1.
Network: 100 x 100 x 3 color image → Conv (3 x 3, 8 filters) → Conv (3 x 3, 1 filter)

After the first conv, output size = (100 – 3 + 1, 100 – 3 + 1, 8) = (98, 98, 8)
Parameters: each filter has (3 x 3 x 3) + 1 = 28; for 8 filters = 224
After the second conv, output size = (98 – 3 + 1, 98 – 3 + 1, 1) = (96, 96, 1)
Parameters: each filter has (3 x 3 x 8) + 1 = 73; only one filter = 73
Total parameters = 224 + 73 = 297
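The 2D worked examples can be reproduced with one small helper. This is a sketch under the slides' stated assumptions (no padding, stride 1, square kernels, one bias per filter); the function name `trace_conv_stack` is illustrative.

```python
def trace_conv_stack(input_shape, layers):
    """input_shape: (H, W, C); layers: list of (kernel_size, num_filters).
    Returns the final output shape and the total learnable parameters."""
    h, w, c = input_shape
    total = 0
    for k, filters in layers:
        total += (k * k * c + 1) * filters          # weights + bias per filter
        h, w, c = h - k + 1, w - k + 1, filters     # valid conv, stride 1
    return (h, w, c), total

ex1 = trace_conv_stack((10, 10, 1), [(3, 1)])
ex2 = trace_conv_stack((10, 10, 1), [(3, 5), (3, 2)])
ex3 = trace_conv_stack((100, 100, 3), [(3, 8), (3, 1)])

print(ex1)   # ((8, 8, 1), 10)
print(ex2)   # ((6, 6, 2), 142)
print(ex3)   # ((96, 96, 1), 297)
```

Note that the parameter count depends only on the kernel size and the channel counts, not on the image height or width.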

Number of parameters in CNN (cont…)
What will be the output size of the following network? How many learnable parameters exist? No padding is used and stride = 1.
Network (1D convolution): input of length 100 with 5 channels → Conv (kernel size 3, 8 filters) → Conv (kernel size 3, 1 filter)

After the first conv, output size = (100 – 3 + 1, 8) = (98, 8)
Parameters: each filter has (3 x 5) + 1 = 16; for 8 filters = 128
After the second conv, output size = (98 – 3 + 1, 1) = (96, 1)
Parameters: each filter has (3 x 8) + 1 = 25; only one filter = 25
Total parameters = 128 + 25 = 153
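The 1D case works the same way, except that each filter covers kernel_size x input_channels weights. The sketch below (with the illustrative name `trace_conv1d_stack`) assumes no padding and stride 1, as stated on the slide.

```python
def trace_conv1d_stack(length, channels, layers):
    """layers: list of (kernel_size, num_filters) for 1D convolutions.
    Returns the final (length, channels) and the total learnable parameters."""
    total = 0
    for k, filters in layers:
        total += (k * channels + 1) * filters        # weights + bias per filter
        length, channels = length - k + 1, filters   # valid conv, stride 1
    return (length, channels), total

result = trace_conv1d_stack(100, 5, [(3, 8), (3, 1)])
print(result)   # ((96, 1), 153)
```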

INCEPTION Module
[Figure: Inception module]

GOOGLENET / INCEPTION NET
[Figure: GoogLeNet / Inception Net architecture, with the auxiliary loss branches marked]

INCEPTION NET (cont…)
[Figure: Inception Net, continued]
