Assignment # 02 (Deep Learning Essentials)
Name: Syed Muhammad Taqi Rizvi Email: muhamadtaqi6@[Link]
Question 1: Learning Rate Scheduling in Gradient Descent
Answer 1: Learning Rate Scheduling in Gradient Descent
Learning Rate
In deep learning, the learning rate is used by gradient descent to optimize the weights: it controls the rate of descent, i.e., how large each update step is. A suitable learning rate helps the algorithm (gradient descent) reach the global optimum (global minimum), where the loss is approximately zero. It is represented by alpha (α).
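For intuition, below is a minimal sketch of the update rule in Python; the objective function f(w) = w² and the starting weight are hypothetical, chosen only for illustration:

    # Gradient descent on f(w) = w^2, whose gradient is 2w
    def grad(w):
        return 2 * w

    w = 10.0         # hypothetical starting weight
    alpha = 0.1      # learning rate (alpha)
    for step in range(50):
        w = w - alpha * grad(w)   # update rule: w <- w - alpha * grad(w)
    print(w)         # w approaches the minimum at 0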
Learning Rate Scheduling:
Learning rate scheduling in gradient descent use schedule learning rate to automate learning rate to fix
suboptimal result by using fixed learning rate. Learning rate scheduling vary learning rate
Here we describe two learning rate techniques
1. Learning Rate Decay:
With plain or momentum-based gradient descent, where the learning rate is fixed, training can end up at a suboptimal result or overshoot the global optimum. In learning rate decay we start with a high learning rate and gradually decrease it, so that learning slows down as it approaches convergence toward the global optimum (global minimum).
2. Cyclical Learning Rates:
A cyclical learning rate (CLR) is a technique in which changes to the learning rate are cyclic, always returning to the initial value. CLR alternates the learning rate between low and high values: it oscillates between a base learning rate and a maximum learning rate. The oscillation can follow different functions, such as the triangular (linear), Welch window (parabolic), or Hann window.
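A minimal Python sketch of both schedules follows; the decay rate, base/maximum learning rates, and step size are hypothetical values chosen for illustration:

    import math

    def decayed_lr(initial_lr, decay_rate, epoch):
        # Learning rate decay: start high, shrink exponentially each epoch
        return initial_lr * math.exp(-decay_rate * epoch)

    def triangular_clr(base_lr, max_lr, step_size, iteration):
        # Triangular cyclical learning rate: oscillates linearly between
        # base_lr and max_lr, returning to base_lr every 2*step_size iterations
        cycle = math.floor(1 + iteration / (2 * step_size))
        x = abs(iteration / step_size - 2 * cycle + 1)
        return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)

    for epoch in range(4):
        print(decayed_lr(0.1, 0.5, epoch))             # 0.1, 0.0607, 0.0368, ...
    for it in (0, 1000, 2000, 3000, 4000):
        print(triangular_clr(0.001, 0.006, 2000, it))  # 0.001, 0.0035, 0.006, 0.0035, 0.001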
Question 2: The Role of Computation Graphs in Backpropagation
Answer 2: The Role of Computation Graphs in Backpropagation
Computation Graph
A computation graph is a graphical representation of the operations and variables involved in computing a function, such as the output of a neural network.
How it Works?
A computation graph represents the function as a graph of simple operations.
For example:
If you have a function
f(a,b,c) = 3(a + b * c)
The graph breaks the function into elementary steps, i.e.
1. u = b * c,
2. v = a + u,
3. f = 3v
    a ───────────────────────┐
    b ──┐                    ├──► v = a + u ──► f = 3v
        ├──► u = b * c ──────┘
    c ──┘
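Evaluating the graph from left to right is the forward pass. A minimal Python sketch with hypothetical input values:

    # Forward pass through the graph of f(a, b, c) = 3(a + b*c)
    a, b, c = 5.0, 3.0, 2.0   # hypothetical inputs
    u = b * c                 # u = 6.0
    v = a + u                 # v = 11.0
    f = 3 * v                 # f = 33.0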
Role in Backpropagation
Backpropagation is used to compute the gradients (derivatives) of the loss function with respect to each parameter, i.e., the effect of a parameter “w” on the output. These gradients are then used to update the weights so as to minimize the loss function.
How the Computation Graph Helps:
1. Forward:
It computes the loss by passing the input through the graph, from the input to the output layer.
2. Backward:
It uses the chain rule to compute gradients by moving backward from the output to the inputs and parameters. It stores intermediate values (such as the outputs of operations and their derivatives) during the forward pass and reuses them in the backward pass; a sketch follows.
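Continuing the example above, here is a sketch of the backward pass that applies the chain rule from f back to the inputs, reusing the intermediate relationships of the forward pass (the input values are the same hypothetical ones as before):

    # Backward pass: chain rule from the output f back to the inputs a, b, c
    a, b, c = 5.0, 3.0, 2.0   # same hypothetical inputs as the forward pass
    df_dv = 3.0               # f = 3v     =>  df/dv = 3
    df_da = df_dv * 1.0       # v = a + u  =>  dv/da = 1, so df/da = 3
    df_du = df_dv * 1.0       #            =>  dv/du = 1, so df/du = 3
    df_db = df_du * c         # u = b * c  =>  du/db = c, so df/db = 3*c = 6.0
    df_dc = df_du * b         #            =>  du/dc = b, so df/dc = 3*b = 9.0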
Question 3: Applications of Fully Connected Neural Networks
Answer 3: Applications of Fully Connected Neural Networks (FCNN)
A Fully Connected Neural Network (FCNN), also called a Dense Neural Network, is a type of artificial neural network in which each neuron is connected to every neuron of the next layer. It consists of layers:
1. Input layer: takes the data
2. Hidden layers: perform transformations using weights, biases, and activation functions
3. Output layer: gives the final prediction
It is generally used in classification, regression, and prediction tasks; a minimal sketch of this layer structure follows.
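For illustration, here is a minimal FCNN with this three-part structure, sketched with PyTorch (the layer sizes and class count are hypothetical):

    import torch
    import torch.nn as nn

    # Input of 4 features -> hidden layer of 16 units -> output of 3 classes
    model = nn.Sequential(
        nn.Linear(4, 16),   # hidden layer: weights and biases
        nn.ReLU(),          # activation function
        nn.Linear(16, 3),   # output layer: final prediction (3 class scores)
    )

    x = torch.randn(1, 4)   # one hypothetical input sample
    print(model(x))         # raw class scores (logits)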
FCNN applications are:
1. Image Classification
FCNNs are used to assign labels to images (e.g., cat, dog, car). While CNNs handle spatial features, FCNNs are typically used in the final layers to make decisions based on the extracted features (see the sketch after this list).
Handles complex patterns once features are extracted.
Integrates with CNNs to finalize classification.
Easily extendable to multiclass classification problems.
2. Sentiment Analysis (Text Classification)
FCNNs are used to predict the sentiment (positive, negative, neutral) of text data. Inputs are word embeddings or text features.
Captures nonlinear relationships in language patterns.
Fast inference time, especially useful in real-time applications.
3. Fraud Detection in Finance
FCNNs are used to detect anomalies or fraudulent patterns in transaction data.
Learns subtle patterns in complex transactional data.
High accuracy when trained on historical transactional data.
Scales well to large datasets.
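As noted under image classification above, an FCNN typically sits at the end of a CNN and turns the extracted features into class labels. Below is a minimal sketch of such a classification head in PyTorch; the feature size and class count are hypothetical, and the CNN feature extractor is stubbed out with a random feature vector:

    import torch
    import torch.nn as nn

    # Hypothetical: a CNN has already reduced an image to a 512-dim feature vector
    features = torch.randn(1, 512)

    # FCNN head: features -> hidden layer -> class scores
    head = nn.Sequential(
        nn.Linear(512, 128),
        nn.ReLU(),
        nn.Linear(128, 3),   # e.g., 3 classes: cat, dog, car
    )
    print(head(features))    # logits for each class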