Gradient Descent: Vanishing & Exploding Gradients
By
LOGESHWARI P
(CB.EN.P2BME23009)
GRADIENT DESCENT
Gradient descent is an optimization algorithm that is commonly used to train machine learning models and neural networks. It iteratively updates the model parameters in the direction of the negative gradient of the loss function.
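As a rough illustration (not from the slides), here is a minimal sketch of a single-parameter gradient descent loop; the quadratic loss, learning rate, and step count are illustrative assumptions:

```python
# Minimal gradient descent sketch on an assumed 1-D quadratic loss L(w) = (w - 3)^2.
def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    return 2.0 * (w - 3.0)  # dL/dw

w = 0.0    # initial parameter (illustrative)
lr = 0.1   # learning rate / step size (illustrative)
for step in range(50):
    w -= lr * grad(w)  # update rule: w <- w - lr * dL/dw

print(w, loss(w))  # w approaches the minimum at w = 3, loss approaches 0
```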
Chain rule in backpropagation
Vanishing gradient
Exploding gradient
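To see how the chain rule leads to vanishing and exploding gradients, a small sketch with assumed per-layer derivative values (illustrative, not from the slides): the gradient reaching the early layers is a product of one factor per layer, so factors below 1 shrink it toward zero while factors above 1 blow it up.

```python
# The chain rule multiplies one derivative factor per layer.
depth = 50

# The sigmoid derivative is at most 0.25, so a deep product of such factors vanishes.
vanishing = 0.25 ** depth
# A per-layer factor slightly above 1 (1.5 is an illustrative value) explodes instead.
exploding = 1.5 ** depth

print(f"product of 0.25 over {depth} layers: {vanishing:.3e}")  # ~8e-31 (vanishing)
print(f"product of 1.5  over {depth} layers: {exploding:.3e}")  # ~6e+08 (exploding)
```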
Softmax and sigmoid are both activation functions commonly used in machine learning for different purposes. Let's compare them in terms of their characteristics and use cases:

1. **Function Form:**
   - **Softmax:** It is used for multi-class classification problems. The softmax function takes a vector of arbitrary real-valued scores and squashes them to a probability distribution over multiple classes. The output is a vector of probabilities that sum to 1.
   - **Sigmoid:** It is used for binary classification problems. The sigmoid function takes a real-valued input and squashes it to the range [0, 1]. It's commonly used to produce the probability of belonging to a particular class.

2. **Output Range:**
   - **Softmax:** Produces a probability distribution over multiple classes, with each element in the range (0, 1). The sum of all elements in the output vector is 1.
   - **Sigmoid:** Produces an output in the range (0, 1) and is suitable for binary classification problems. It can be interpreted as the probability of belonging to the positive class.

3. **Application:**
   - **Softmax:** Typically used in the output layer of a neural network for multi-class classification problems. It's especially useful when there are more than two classes.
   - **Sigmoid:** Commonly used in binary classification problems. It's also used in the hidden layers of neural networks to model non-linear relationships in the data.

4. **Independence:**
   - **Softmax:** The probabilities sum to 1, and the output for one class is dependent on the scores of other classes.
   - **Sigmoid:** Each sigmoid output is independent of the others. It's applied element-wise to each output node.

5. **Numerical Stability:**
   - **Softmax:** The softmax function involves exponentiation, and in practice, it can be sensitive to large input values, potentially leading to numerical instability issues (see the sketch after this comparison).
   - **Sigmoid:** Generally more numerically stable compared to softmax.

6. **Derivative:**
   - **Softmax:** The derivative of the softmax function involves multiple terms, and it's often used in conjunction with the cross-entropy loss during backpropagation in classification tasks.
   - **Sigmoid:** The derivative of the sigmoid function has a simple and interpretable form, making it computationally efficient during backpropagation.

In summary, softmax is suitable for multi-class classification tasks, while sigmoid is commonly used in binary classification problems. The choice between them depends on the nature of the task and the number of classes involved.
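A minimal sketch of both activations, assuming NumPy; the max-subtraction inside softmax is the standard way to handle the numerical-stability issue noted in point 5:

```python
import numpy as np

def sigmoid(x):
    # Applied element-wise; each output is independent of the others (point 4).
    return 1.0 / (1.0 + np.exp(-x))

def softmax(scores):
    # Subtracting the max leaves the result unchanged mathematically,
    # but avoids overflow when exponentiating large scores (point 5).
    shifted = scores - np.max(scores)
    exps = np.exp(shifted)
    return exps / np.sum(exps)  # probabilities sum to 1 (point 1)

scores = np.array([2.0, 1.0, 0.1])  # illustrative class scores
print(softmax(scores))   # e.g. [0.659 0.242 0.099], sums to 1
print(sigmoid(scores))   # each value in (0, 1), independent of the others
```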