Unit 11 - Week 9
Assignment 9
The due date for submitting this assignment has passed. Due on 2019-10-02, 23:59 IST.
1) The gradient of a particular dimension keeps pointing in the same direction. What can you say about the momentum term (momentum parameter γ)? Choose the correct option.
a. The momentum term increases
b. The momentum term decreases
c. Cannot comment
d. It remains the same
Accepted Answers:
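As a quick numerical illustration for question 1 above: the sketch below (plain Python, with assumed values for the momentum coefficient γ, the learning rate η, and a constant gradient; not part of the assignment) shows that while the gradient keeps pointing in the same direction, the momentum term keeps growing, approaching η·g / (1 − γ).

# Minimal sketch (assumed hyperparameters): behaviour of the momentum term
# when the gradient of a dimension keeps pointing in the same direction.
gamma, eta, g = 0.9, 0.1, 1.0   # momentum coefficient, learning rate, constant gradient
v = 0.0
for t in range(1, 21):
    v = gamma * v + eta * g     # v_t = gamma * v_{t-1} + eta * grad
    print(f"t={t:2d}  momentum term v={v:.4f}")
# v increases monotonically toward eta * g / (1 - gamma) = 1.0,
# i.e. the momentum term grows while the gradients stay aligned.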
2) Comment on the learning rate of Adagrad. Choose the correct option.
a. Learning rate is adaptive
b. Learning rate increases for each time step
c. Learning rate remains the same for each update
d. None of the above
Accepted Answers:
3) Adagrad has its own limitations. Can you choose that limitation from the following options?
a. Accumulation of the positive squared gradients in the denominator
b. Overshooting minima
c. Learning rate increases, thus hindering convergence and causing the loss function to fluctuate around the minimum or even to diverge
d. Getting trapped in local minima
Accepted Answers:
4) What is the full form of RMSProp?
a. Retain Momentum Propagation
b. Round Mean Square Propagation
c. Root Mean Square Propagation
d. None of the above
Accepted Answers:
5) RMSProp resolves the limitation of which optimizer?
a. Adagrad
b. Momentum
c. Both a and b
d. Neither a nor b
Accepted Answers:
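To make questions 3-5 concrete, here is a minimal sketch (plain Python with NumPy; the function names and hyperparameter values are assumptions, not part of the assignment) contrasting Adagrad, which accumulates all past squared gradients so its effective step size can only shrink, with RMSProp, which replaces that sum with an exponentially decaying average so the step size does not decay towards zero.

import numpy as np

def adagrad_step(w, grad, cache, lr=0.01, eps=1e-8):
    # Adagrad: accumulate all past squared gradients; the denominator only
    # grows, so the effective learning rate keeps shrinking over time.
    cache = cache + grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache

def rmsprop_step(w, grad, cache, lr=0.01, decay=0.9, eps=1e-8):
    # RMSProp: exponentially decaying average of squared gradients; the
    # denominator stops growing, so the effective learning rate does not vanish.
    cache = decay * cache + (1 - decay) * grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache

# Toy comparison with a constant gradient (assumed value g = 0.5).
w_a, c_a = 1.0, 0.0
w_r, c_r = 1.0, 0.0
g = 0.5
for _ in range(1000):
    w_a, c_a = adagrad_step(w_a, g, c_a)
    w_r, c_r = rmsprop_step(w_r, g, c_r)
print("Adagrad effective step:", 0.01 * g / (np.sqrt(c_a) + 1e-8))   # keeps shrinking
print("RMSProp effective step:", 0.01 * g / (np.sqrt(c_r) + 1e-8))   # levels off near lr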
6) Which of the following statements is true?
a. The gradient update rules for the Momentum optimizer and the Nesterov Accelerated Gradient (NAG) optimizer are the same
b. The Momentum optimizer and the Nesterov Accelerated Gradient (NAG) optimizer perform differently irrespective of learning rates
c. The possibility of oscillations is lower in Nesterov Accelerated Gradient (NAG) than in the Momentum optimizer
d. None of the above
Accepted Answers:
7) The following is the equation of the update vector for the Momentum optimizer:
v_t = γ · v_{t−1} + η · ∇_θ J(θ)
What is the range of γ?
a. 0 and 1
b. > 0
Accepted Answers:
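For questions 6 and 7, the sketch below (plain Python on a hypothetical toy loss J(θ) = θ², with assumed values γ = 0.9 and η = 0.1) compares the two update rules: classical Momentum evaluates the gradient at the current parameters, while NAG evaluates it at the look-ahead point θ − γ·v_{t−1}, which tends to damp oscillations around the minimum.

# Minimal sketch: Momentum vs. NAG on the toy loss J(theta) = theta**2
# (assumed loss and hyperparameters, for illustration only).
def grad(theta):              # dJ/dtheta for J(theta) = theta**2
    return 2.0 * theta

gamma, eta = 0.9, 0.1
theta_m, v_m = 5.0, 0.0       # Momentum
theta_n, v_n = 5.0, 0.0       # NAG, started from the same point

for _ in range(50):
    # Classical Momentum: gradient taken at the current parameters.
    v_m = gamma * v_m + eta * grad(theta_m)
    theta_m = theta_m - v_m

    # NAG: gradient taken at the look-ahead point theta - gamma * v.
    v_n = gamma * v_n + eta * grad(theta_n - gamma * v_n)
    theta_n = theta_n - v_n

print("Momentum:", theta_m)   # overshoots and oscillates more around the minimum at 0
print("NAG     :", theta_n)   # look-ahead correction damps the oscillation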
8) Why is it at all required to choose different learning rates for different weights?
a. To avoid the problem of diminishing learning rate
b. To avoid overshooting the optimum point
c. To reduce vertical oscillations while navigating the optimum point
d. This would aid in reaching the optimum point faster
Accepted Answers:
9) What is the major drawback of setting a large learning rate for updating the weight parameters in Gradient Descent?
a. Slower convergence
b. Stuck in local minima
c. Overshoots the optimum point
d. None of the above
Accepted Answers:
10) For a smaller-magnitude gradient in gradient descent, what should be the suggested learning rate for updating the weights?
a. Small
b. Large
c. Cannot comment
d. The same learning rate for small and large gradient magnitudes
Accepted Answers:
(Each of the 10 questions above carries 2 points.)