
Deep Learning from Scratch
Theory + Practical
FAHAD HUSSAIN
MCS, MSCS, DAE(CIT)

Computer Science instructor at a well-known international center
Machine Learning and Deep Learning practitioner

For further assistance, code, and slides: [Link]
YouTube Channel: [Link]
Stochastic gradient descent
The word 'stochastic' describes a system or process governed by random
probability. In Stochastic Gradient Descent, a few samples are therefore
selected at random for each iteration instead of the whole dataset. In
Gradient Descent, the term "batch" denotes the number of samples from the
dataset used to compute the gradient in each iteration. In typical Gradient
Descent optimization, such as Batch Gradient Descent, the batch is the entire
dataset. Using the whole dataset is useful for reaching the minimum in a less
noisy, less random manner, but it becomes a problem when the dataset gets
really huge.
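
To make the batch-size distinction concrete, here is a minimal NumPy sketch comparing the two update rules on one-parameter least-squares regression. The data, learning rate, and step counts are illustrative assumptions, not taken from the slides:

import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 3x + noise (illustrative assumption)
X = rng.normal(size=1000)
y = 3.0 * X + 0.1 * rng.normal(size=1000)

def grad(w, xb, yb):
    # Gradient of (1/2) * mean((w*x - y)^2) with respect to w
    return np.mean((w * xb - yb) * xb)

lr = 0.1  # learning rate (assumed value)

# Batch Gradient Descent: every step computes the gradient over ALL samples
w = 0.0
for _ in range(100):
    w -= lr * grad(w, X, y)
print("batch GD estimate:", w)

# Stochastic Gradient Descent: every step uses ONE randomly chosen sample
w = 0.0
for _ in range(100):
    i = rng.integers(len(y))
    w -= lr * grad(w, X[i:i+1], y[i:i+1])
print("SGD estimate:", w)

Both runs should land near w = 3; the SGD path is noisier per step, but each step touches only a single sample, which is exactly the advantage described above for very large datasets.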

Stochastic gradient descent
Stochastic gradient descent (often abbreviated SGD) is an iterative method for
optimizing an objective function with suitable smoothness properties (e.g.,
differentiable or subdifferentiable), typically a convex loss function.
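
In standard textbook notation (this formulation is assumed, not shown on the slide), the objective is an average of per-sample losses, and each SGD step replaces the full gradient with a one-sample estimate:

f(\theta) = \frac{1}{n} \sum_{i=1}^{n} f_i(\theta)

\theta \leftarrow \theta - \eta \, \nabla f_i(\theta), \qquad i \sim \mathrm{Uniform}\{1, \dots, n\}

Here \eta is the learning rate; because the sampled gradient is an unbiased estimate of the full gradient, the iterates still make progress toward a minimum on average.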

Thank You!
For further assistance, code, and slides: [Link]
YouTube Channel: [Link]
