RNS INSTITUTE OF TECHNOLOGY

Affiliated to VTU, Recognized by GOK, Approved by AICTE, New Delhi

(NAAC ‘A+ Grade’ Accredited, NBA Accredited (UG - CSE, ECE, ISE, EIE and EEE))
Channasandra, Dr. Vishnuvardhan Road, Bengaluru - 560 098
Ph:(080)28611880,28611881 URL: www.rnsit.ac.in

MODULE-2:
Basics of Supervised Deep Learning

Pooja R Rao
Assistant Professor
Department of CSE-DS
RNSIT
Introduction
•Supervised and unsupervised deep learning models are rapidly growing due to
their success in solving complex problems.
•Growth is supported by:
•High-performance computing resources.
•Availability of large amounts of labeled and unlabeled data.
•Access to advanced open-source libraries.
•This makes deep learning increasingly feasible for many applications.
Convolutional Neural Network (ConvNet/CNN)

•Deep learning models (supervised and unsupervised) are rapidly advancing due to their effectiveness in solving complex problems.
•The growth is driven by powerful computing resources, large datasets, and
advanced open-source libraries.
•These factors make deep learning feasible for a wide range of applications.
•Convolutional Neural Networks (ConvNets or CNNs) are deep learning
models with multiple layers.
•CNNs are inspired by the human visual cortex.
•They are highly effective in tasks like image classification, object detection, speech
recognition, natural language processing, and medical image analysis.
•CNNs play a crucial role in computer vision applications such as self-driving cars,
robotics, and helping the visually impaired.
•They work by extracting simple local features (such as edges) in the early layers and combining them into
more complex features at deeper layers.
•CNNs are computationally intensive due to their deep architecture.
•Training CNNs on large datasets often requires several days and the use of GPUs.
•CNNs outperform most traditional techniques in visual recognition tasks.
Evolution of Convolutional Neural Network Models
LeNet

 The first practical convolutional neural network (CNN), designed to classify handwritten digits (MNIST).
 Used backpropagation for training and was adopted for reading
handwritten checks.
 Did not scale well to larger problems due to:
o Small labeled datasets
o Slow computers
o Use of unsuitable activation functions (like sigmoid/tanh) leading to
vanishing gradients, which make training deep networks difficult.
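The vanishing-gradient effect mentioned above can be illustrated with a minimal sketch. It multiplies the activation derivative once per layer, which is roughly what backpropagation does along a single path; the weights are ignored here, which is a simplifying assumption, and all function names are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # at most 0.25, reached at x = 0

def relu_grad(x):
    return 1.0 if x > 0 else 0.0

def chained_grad(grad_fn, pre_activation, num_layers):
    """Product of per-layer activation derivatives along one path (weights ignored)."""
    g = 1.0
    for _ in range(num_layers):
        g *= grad_fn(pre_activation)
    return g

# Best case for sigmoid (x = 0) still shrinks by a factor of 4 per layer.
sig_10 = chained_grad(sigmoid_grad, 0.0, 10)
# An active ReLU unit passes the gradient through unchanged.
relu_10 = chained_grad(relu_grad, 1.0, 10)
print(sig_10, relu_10)
```

Even in the most favourable case the sigmoid path shrinks the gradient by 0.25 per layer, so after 10 layers it is below one millionth of its starting value.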
AlexNet

• Achieved the first major breakthrough in 2012 by winning the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC).

• Reduced the top-5 classification error rate from 26% to 15%.

• Improvements over LeNet included:

• Large labeled dataset (ImageNet: ~15 million images in 22,000+ categories).

• Training on high-speed GPUs (GTX 580) for several days.

• Use of ReLU activation function (f(x) = max(x, 0)), which is faster to compute and mitigates the
vanishing gradient problem.

• Architecture: 5 convolutional layers, 3 pooling layers, 3 fully connected layers, and a 1000-way softmax classifier.
ZFNet (2013):

o ZFNet improved on the AlexNet architecture by reducing the first-layer filter size from 11×11 to 7×7 and the stride from 4 to 2.

o This led to better feature extraction and fewer dead features.

o ZFNet won the ILSVRC 2013 competition.

VGGNet (2014):

o The depth of the network was increased to 19 layers by adding more convolutional layers with 3 × 3 filters
(stride 1 and padding 1 in all layers), along with 2 × 2 max-pooling layers.

o The deeper, simpler architecture improved accuracy significantly.

o VGGNet achieved a 7.32% error rate and was the runner-up in ILSVRC 2014.
GoogLeNet (2015):
• Google developed a ConvNet model called GoogLeNet in 2015. It
uses an inception module which helps in reducing the number of
parameters in the network.

• The inception module is a concatenated layer of convolutions (3 × 3 and 5 × 5 convolutions) and pooling sub-layers
at different scales, with their output filter banks concatenated into a
single output vector that forms the input for the succeeding stage.

• These sub-layers are not stacked sequentially but are connected in parallel, as shown in Fig. 2.1.
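A rough sketch of the parallel-and-concatenate idea: each branch must keep the same spatial size so that the outputs can be stacked along the channel axis. The branch settings and channel counts below are made-up numbers for illustration, not the real GoogLeNet configuration.

```python
def out_size(a, k, p, s):
    """Spatial output size for input size a, kernel k, padding p, stride s."""
    return (a - k + 2 * p) // s + 1

# Hypothetical branches: (name, kernel size, padding, output channels), all with stride 1.
branches = [
    ("conv3x3", 3, 1, 128),
    ("conv5x5", 5, 2, 32),
    ("maxpool3x3", 3, 1, 32),
]

def inception_output_shape(height, width, branches):
    sizes = {(out_size(height, k, p, 1), out_size(width, k, p, 1))
             for _, k, p, _ in branches}
    if len(sizes) != 1:
        raise ValueError("branches must produce the same spatial size to be concatenated")
    h, w = next(iter(sizes))
    # Concatenation stacks the branch outputs along the channel axis.
    channels = sum(c for _, _, _, c in branches)
    return h, w, channels

shape = inception_output_shape(28, 28, branches)
print(shape)  # (28, 28, 192)
```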
Increasing network layers can improve accuracy by learning more
features, but has limits:

1. Vanishing gradients: In very deep networks the gradient signal can shrink toward zero as it is propagated back through many layers, so the early layers learn very slowly.

2. Optimization difficulty: Too many parameters make training harder.

To address this, network depth should be increased carefully.

GoogLeNet won ILSVRC 2014 with a 6.7% error rate.

Later versions include Inception V3 (2016) and Inception-ResNet (2017).
ResNet:

• Microsoft Research Asia proposed a CNN architecture in 2015 that is 152 layers deep, called ResNet.
ResNet introduced residual connections, in which the output of a conv-relu-conv series is added to the
original input and then passed through a Rectified Linear Unit (ReLU), as shown in Fig. 2.2.
• In this way, information is carried from the previous layer to the next layer, and during
backpropagation the gradient flows easily because the addition operation distributes the gradient.
ResNet showed that a complex architecture like Inception is not required to achieve the best results.
• ResNet performed exceptionally well, winning ILSVRC 2015 with a
3.6% error rate.
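The residual connection can be sketched in a few lines. This is a toy version on plain lists, assuming a stand-in `toy_transform` in place of the real conv-relu-conv series; the names are hypothetical.

```python
def relu(v):
    return [max(x, 0.0) for x in v]

def residual_block(x, transform):
    """Return relu(transform(x) + x); transform stands in for conv-relu-conv."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("transform must preserve the shape so it can be added to x")
    return relu([a + b for a, b in zip(fx, x)])

# Stand-in transform (hypothetical): a real block would use convolutions.
def toy_transform(v):
    return [0.5 * x for x in relu(v)]

y = residual_block([1.0, -2.0, 3.0], toy_transform)
print(y)  # [1.5, 0.0, 4.5]
```

If the transform outputs zeros, the block reduces to relu(x), which is why extra layers can fall back to passing the input straight through.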
Inception-ResNet (2017):
 Combined the Inception module with residual connections to
form a hybrid model.
 This design significantly increased training speed.
 It slightly outperformed ResNet in terms of accuracy.
Xception:

Xception is a convolutional neural network architecture based on depthwise separable convolution
layers. The architecture is inspired by the Inception model, which is why it is called Xception
(Extreme Inception). It is a stack of depthwise separable convolution layers with residual
connections: 36 convolutional layers organized into 14 modules, all of which have linear residual
connections around them except the first and last modules. Xception is reported to perform
slightly better than Inception V3 on ImageNet. Table 2.1 and Fig. 2.3 show the classification
performance of VGG-16, ResNet-152, Inception V3 and Xception on ImageNet.
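To see why depthwise separable convolutions are cheaper, the weight counts can be compared directly. This is a back-of-the-envelope sketch that ignores biases; the layer sizes are arbitrary examples, not Xception's actual configuration.

```python
def standard_conv_params(k, c_in, c_out):
    # Every output channel has its own k x k filter over all input channels.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise

std = standard_conv_params(3, 128, 256)
sep = depthwise_separable_params(3, 128, 256)
print(std, sep)  # 294912 33920
```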
SqueezeNet:
Researchers developed SqueezeNet to reduce the size and complexity of convolutional
neural networks without sacrificing accuracy. The approach included pruning small-weight
parameters to create sparse models and retraining them. Additionally, SqueezeNet adopted
three main strategies to minimize parameters and computation:
 (a) Replacing 3 × 3 filters with 1 × 1 filters.
 (b) Reducing the number of input channels to 3 × 3 filters.
 (c) Delaying subsampling to later layers to preserve larger activation maps. Subsampling means
reducing the size of the feature maps: instead of shrinking the image too early, SqueezeNet keeps larger feature maps for more layers.

With these methods, SqueezeNet achieved AlexNet-level accuracy on ImageNet using 50 times fewer parameters.
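Strategies (a) and (b) can be checked with simple weight counts. The sketch below ignores biases, and the channel numbers (including the squeezed width of 32) are hypothetical values chosen only for illustration.

```python
def conv_params(k, c_in, c_out):
    return k * k * c_in * c_out

c_in, c_out = 256, 256

# Baseline: a plain 3 x 3 layer.
plain_3x3 = conv_params(3, c_in, c_out)

# Strategy (a): the same layer with 1 x 1 filters has 9 times fewer weights.
plain_1x1 = conv_params(1, c_in, c_out)

# Strategy (b): first squeeze to a few channels with 1 x 1 filters,
# so the following 3 x 3 filters see fewer input channels.
squeezed = 32  # hypothetical squeezed channel count
squeeze_then_3x3 = conv_params(1, c_in, squeezed) + conv_params(3, squeezed, c_out)

print(plain_3x3, plain_1x1, squeeze_then_3x3)  # 589824 65536 81920
```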
ShuffleNet:
ShuffleNet is a ConvNet architecture introduced in 2017
for devices with limited computational power, such as mobile devices,
without compromising on accuracy. ShuffleNet uses two ideas to considerably
decrease the computational cost while maintaining accuracy: pointwise group
convolution (the channels are split into groups and connections are made only
within a group, which greatly reduces computation) and channel shuffle.
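A minimal sketch of the two ideas, assuming channels are held in a flat list ordered group by group. The shuffle is the usual reshape-transpose-flatten trick; the function names and channel counts are illustrative.

```python
def pointwise_group_conv_params(c_in, c_out, groups):
    """Weights of a 1 x 1 convolution where each group only sees its own channels."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by the number of groups")
    return (c_in // groups) * (c_out // groups) * groups

def channel_shuffle(channels, groups):
    """View the list as (groups, per_group), transpose, and flatten again."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by the number of groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]

dense = pointwise_group_conv_params(240, 240, 1)
grouped = pointwise_group_conv_params(240, 240, 3)
shuffled = channel_shuffle(["a0", "a1", "a2", "b0", "b1", "b2"], 2)
print(dense, grouped, shuffled)
```

After the shuffle, each group in the next layer receives channels from every group of the previous layer, so information is not trapped inside one group.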
Convolution Operation
Architecture of CNN
Traditional Neural Network Limitations
 Fully connected layers connect every neuron in one layer to every neuron in the previous layer.
 This dense connectivity does not scale well to large images.
Need for CNN
 CNNs are better for large images and data with grid-like structure (e.g., 1D time-series, 2D images, 3D
volumes, 4D videos).
 Designed to process structured data efficiently.
Key Features of CNNs
 (i) Local Receptive Field:
o Each neuron connects only to a small region of the input.
o Helps extract local features like edges, corners.
 (ii) Weight Sharing:
o Same filter (set of weights) is applied across all positions in the input.
o Reduces number of parameters and enables feature detection anywhere in the input.
 (iii) Subsampling (Pooling):
o Reduces spatial size and network parameters.
o Most common method is max-pooling.
A typical convolutional neural network consists of the following layers:
• Convolutional layer
• Activation function layer (ReLU)
• Pooling layer
• Fully connected layer
• Dropout layer
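The saving from local receptive fields and weight sharing can be made concrete by counting parameters. This is a rough sketch; the input size, filter size and unit counts are arbitrary examples.

```python
def fc_params(in_h, in_w, in_c, num_neurons):
    # Every neuron connects to every input value, plus one bias per neuron.
    return in_h * in_w * in_c * num_neurons + num_neurons

def conv_layer_params(k, in_c, num_filters):
    # Each filter has k x k x in_c shared weights, plus one bias per filter.
    return k * k * in_c * num_filters + num_filters

fc = fc_params(128, 128, 3, 100)       # 100 fully connected neurons on a 128 x 128 RGB image
conv = conv_layer_params(5, 3, 100)    # 100 filters of size 5 x 5 on the same image
print(fc, conv)  # 4915300 7600
```

The convolutional count does not depend on the image size at all, which is why the approach scales to large inputs.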
Convolution Layer
 The convolution layer is the main building block of a convolutional neural network
(CNN).
 It uses the convolution operation (denoted by *) instead of general matrix multiplication.
 It has a set of learnable filters or kernels as its parameters.
 Its main task is to detect features in local regions of the input image that are common
across the dataset.
 A feature map is created for each filter by convolving it over subregions of the image.
 The process includes performing the convolution, adding a bias term, and applying an
activation function.
 The local receptive field is the region of the input the filter is applied to, and its size
matches the filter size.
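The three steps above (convolve, add bias, apply activation) can be sketched for a single filter on a single-channel image. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this is cross-correlation. The image and the edge-detecting kernel are toy values.

```python
def conv2d(image, kernel, bias=0.0, stride=1):
    """Slide a square kernel over a 2D image (no padding), add bias, apply ReLU."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    activation_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = bias
            # Weighted sum over the local receptive field.
            for u in range(k):
                for v in range(k):
                    total += image[i * stride + u][j * stride + v] * kernel[u][v]
            row.append(max(total, 0.0))  # ReLU
        activation_map.append(row)
    return activation_map

image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1, 1], [-1, 1]]  # responds to a dark-to-bright vertical edge

activation_map = conv2d(image, kernel)
print(activation_map)  # [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
```

The same four weights are reused at every position, and the filter fires only in the column where the edge sits.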
Filters/Kernels
 The weights in each convolutional layer define the convolution filters (kernels).
 There can be multiple filters in a single convolutional layer.
 Each filter is designed to capture specific features like edges or corners.

 During the forward pass, each filter slides over the input’s width and height to
produce its feature map.

Hyperparameters
 Convolutional neural networks have hyperparameters that control model
behavior, output size, runtime, and memory.
 Four important hyperparameters in the convolution layer are:

 Filter Size: Typically between 3×3 and 11×11. Size is independent of input size.
 Number of Filters: Can vary. For example, AlexNet used 96 filters of size 11×11 in its first
convolution layer; ZFNet used 7×7 filters in its first layer, and VGGNet used 3×3 filters.
 Stride: Number of pixels the filter moves at each step. Small stride = more overlap
and larger output size; large stride = less overlap and smaller output size.
 Zero Padding: Number of pixels added as zeros around the input to control the
output’s spatial size.
• Each filter in the convolution layer produces a feature map of size
((A − K + 2P)/S) + 1, where A is the input volume size, K is the size of
the filter, P is the amount of zero padding applied and S is the stride.
• Suppose the input image has size 128 × 128, and 5 filters of size 5 × 5
are applied with a stride of 1 and no padding, i.e., A = 128, K = 5, P = 0
and S = 1.
• The number of feature maps produced will be equal to the number of
filters applied, i.e., 5, and the size of each feature map will be
((128 − 5 + 0)/1) + 1 = 124. Therefore, the output volume will be 124 × 124 × 5.
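The worked example can be checked with a small helper. The function name is illustrative, and the divisibility check is an added assumption (some libraries round down instead of rejecting such settings).

```python
def feature_map_size(a, k, p=0, s=1):
    """Output size for input size a, filter size k, padding p and stride s."""
    if (a - k + 2 * p) % s != 0:
        raise ValueError("filter does not fit the input evenly with this stride and padding")
    return (a - k + 2 * p) // s + 1

num_filters = 5
size = feature_map_size(128, 5, p=0, s=1)
output_volume = (size, size, num_filters)
print(output_volume)  # (124, 124, 5)
```

With padding P = 2 the same filter would keep the size at 128, which is the usual way of preserving spatial size with 5 × 5 filters.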
Activation Function (ReLU)
 The output of each convolutional layer is passed through an activation function layer.
 The activation function transforms the feature map into an activation map.
 It determines the output signal of a neuron for a given input.
 Some activation functions squash inputs to a specific range (e.g., sigmoid to 0–1, tanh to −1 to 1); ReLU only clips negative values to 0.
 They perform a mathematical operation on the input to produce the neuron's activation
level.
 A good activation function is usually continuous and differentiable (almost) everywhere; ReLU, for example, is not differentiable at 0 but works well in practice.
 Differentiability is important for gradient-based training methods used in ConvNets.
 If non-gradient-based methods are used, differentiability is not required.
 Many activation functions are used in ANNs and some of the commonly used activation
functions are as follows:
[Table of activation functions not recovered; one entry is described as non-monotonic: it slopes up, dips slightly below zero, then rises.]
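As a minimal sketch, here are the three activation functions named elsewhere in these notes (sigmoid, tanh and ReLU). Which functions the original slide listed is not known, so this selection is an assumption.

```python
import math

def sigmoid(x):
    """Squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squashes any real input into the range (-1, 1)."""
    return math.tanh(x)

def relu(x):
    """Passes positive inputs unchanged and clips negative inputs to 0."""
    return max(x, 0.0)

for x in (-3.0, 0.0, 3.0):
    print(x, sigmoid(x), tanh(x), relu(x))
```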
Pooling Layer
• Pooling layers follow the convolution and activation layers in ConvNets to
reduce the spatial size of feature maps.
 This reduction lowers the number of parameters and computational cost in
the network.
 A pooling layer down-samples the input feature maps by summarizing
regions of neurons to select representative values.
 Max-pooling is the most common technique, dividing the input into small
regions (e.g., 2 × 2) and selecting the maximum value from each region.
 For a 2 × 2 region, max-pooling outputs the single highest value among the
four values.
• Other pooling types include average pooling (computes the mean of the region) and L2-norm
pooling (calculates the square root of the sum of squares of the values).
 Pooling layers discard less important details while preserving essential features in a smaller,
more manageable form.
 The idea behind pooling is that detecting a feature is more important than knowing its exact
location.
 This strategy works well for simple tasks but can have limitations for more complex problems.
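The three pooling variants described above can be sketched with one helper that works on non-overlapping square regions. The feature map values are a toy example and the function name is illustrative.

```python
import math

def pool2d(fmap, size=2, mode="max"):
    """Down-sample a 2D feature map over non-overlapping size x size regions."""
    reducers = {
        "max": max,
        "avg": lambda vals: sum(vals) / len(vals),
        "l2": lambda vals: math.sqrt(sum(v * v for v in vals)),
    }
    reduce_fn = reducers[mode]
    pooled = []
    for i in range(0, len(fmap) - size + 1, size):
        row = []
        for j in range(0, len(fmap[0]) - size + 1, size):
            region = [fmap[i + u][j + v] for u in range(size) for v in range(size)]
            row.append(reduce_fn(region))
        pooled.append(row)
    return pooled

fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 3, 4],
    [1, 2, 0, 0],
]

max_pooled = pool2d(fmap, 2, "max")
avg_pooled = pool2d(fmap, 2, "avg")
l2_pooled = pool2d(fmap, 2, "l2")
print(max_pooled)  # [[4, 2], [2, 4]]
print(avg_pooled)  # [[2.5, 1.0], [0.75, 1.75]]
```

A 4 × 4 map becomes 2 × 2, so three quarters of the values are discarded while the strongest response in each region survives.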
Fully Connected Layer
•Convolutional Neural Networks (CNNs) consist of two main stages: feature extraction and classification.
•The feature extraction stage includes convolution and pooling layers that detect features from input data.
•Once enough features are extracted, the classification stage begins.
•The classification stage consists of one or more fully connected layers followed by a classifier.
•Fully connected layers take input from all neurons of the previous layer, enabling every value to contribute
to the prediction.
•These layers transform the spatial feature data into class scores or probabilities.
•Multiple fully connected layers can be used to learn complex feature relationships.
•The output from the last fully connected layer is sent to a classifier.
•Common classifiers used are Softmax and Support Vector Machines (SVMs).
•The Softmax classifier outputs class probabilities that sum to 1.
•The SVM classifier outputs class scores, and the class with the highest score is selected.
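A short sketch of how the scores from the last fully connected layer turn into a prediction. The score values are made up; subtracting the maximum before exponentiating is a common numerical-stability step and does not change the result.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(scores):
    """Pick the index of the class with the highest score."""
    return max(range(len(scores)), key=lambda i: scores[i])

scores = [2.0, 1.0, 0.1]   # hypothetical output of the last fully connected layer
probs = softmax(scores)
label = predict(scores)
print(probs, label)
```

Because softmax preserves the ordering of the scores, taking the highest probability and taking the highest raw score select the same class.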
Dropout
 Deep neural networks have multiple hidden layers that help learn complex features.
 These are followed by fully connected layers used for decision-making.
 Fully connected layers are prone to overfitting due to their dense connections.
 Overfitting occurs when the model performs well on training data but poorly on new,
unseen data.
 To address overfitting, a dropout layer is used during training.
 Dropout randomly removes some neurons and their connections from the network
during each training iteration.
 The remaining reduced network is trained on the data at that stage.
 Dropped-out neurons are reinserted later with their original weights.
 This technique reduces overfitting and enhances the model's ability to generalize.
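A minimal sketch of dropout applied to one layer's activations. It uses the "inverted dropout" variant, which rescales the surviving activations during training so that nothing needs to change at test time; this scaling is a common implementation choice and is not stated in the notes above.

```python
import random

def dropout(activations, drop_prob, rng, training=True):
    """Zero each activation with probability drop_prob during training."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)  # at test time every neuron is kept
    keep_prob = 1.0 - drop_prob
    # Surviving activations are scaled up so the expected sum stays the same.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

rng = random.Random(0)          # seeded so the run is repeatable
acts = [1.0] * 10
train_out = dropout(acts, 0.5, rng, training=True)
test_out = dropout(acts, 0.5, rng, training=False)
print(train_out)
print(test_out)
```

A different random subset is dropped on every call, so each training iteration effectively trains a different thinned network.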
2.6 Challenges and Future Research Direction:
 Strong Performance: Convolutional Neural Networks (ConvNets) have shown excellent results in tasks like object
classification and detection, sometimes matching human-level accuracy.
 Vulnerabilities Exist: Despite their success, ConvNets are vulnerable to small, imperceptible changes in input images, which
can lead to incorrect classifications.
 Cause of Vulnerability: One key reason for this vulnerability is the pooling operation, which reduces the feature space but
also discards important spatial information.
 Loss of Spatial Relationships: ConvNets detect if a feature is present in a region but fail to capture the exact spatial
relationships between features, making it harder to recognize complex objects.
 Reliability Concern: These limitations raise concerns about the generalization and reliability of ConvNets in real-world
applications.
 Capsule Networks as a Solution: Capsule Networks have been proposed to overcome some of these issues. They use
capsules (groups of neurons) to represent objects and their parts more precisely.
 Dynamic Routing: Instead of max pooling, Capsule Networks use dynamic routing to preserve spatial relationships between
features across layers.
 Ongoing Research: Capsule Networks are still in the early stages of research, and their effectiveness across various visual
tasks remains under investigation.
