Implement simplified Nesterov momentum #53

The main idea of Nesterov accelerated gradient (NAG, also called Nesterov momentum) is to update the parameters using the gradient evaluated at the predicted (peeked-ahead) parameters rather than at the current ones. To reduce sample variance, NAG smooths the update by exponentially averaging past updates.

Sutskever et al. [1] showed that NAG improves the stability and convergence rate of stochastic optimization for deep networks, and that the update can be written in two steps.

![image](https://f.cloud.github.com/assets/4530451/1991823/c6476b8a-84a6-11e3-89e9-c65fb8f7c69f.png)
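For readers who cannot load the image, here is a minimal Python sketch of the two-step update, assuming the form given in [1]: the velocity is computed from the gradient at the peeked-ahead point `theta + mu * v`, then the parameter moves by the new velocity. The helper name `nag_step`, the scalar parameter, the toy objective, and the values of `lr` and `mu` are illustrative assumptions, not part of this project's code.

```python
def nag_step(theta, v, grad, lr=0.1, mu=0.9):
    """One NAG update on a scalar parameter (hypothetical helper)."""
    # Step 1: new velocity from the gradient at the peeked-ahead point.
    v_new = mu * v - lr * grad(theta + mu * v)
    # Step 2: move the parameter by the new velocity.
    return theta + v_new, v_new


# Toy objective f(x) = 0.5 * x**2, whose gradient is simply x.
theta, v = 5.0, 0.0
for _ in range(200):
    theta, v = nag_step(theta, v, lambda x: x)
```

On this toy quadratic the iterate spirals in toward the minimum at 0; a real solver would apply the same two steps element-wise to every parameter blob.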

Simplified Nesterov momentum updates:
![image](https://f.cloud.github.com/assets/4530451/1991817/8efb94c6-84a6-11e3-8e19-ac5fb4b18898.png)

Bengio et al. [2] reformulated the update to show that it is equivalent to standard momentum except for different linear weighting coefficients on the velocity and gradient terms.
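The equivalence can be checked numerically. The sketch below assumes a constant momentum coefficient `mu` and learning rate `lr` (the formulation in [2] allows both to vary per step) and tracks the shifted parameter `big_theta = theta + mu * v`. In that variable the gradient is taken at the current point, as in standard momentum, but the previous velocity is weighted by `mu * mu` instead of `mu` and the gradient by `(1 + mu) * lr` instead of `lr`. All function and variable names here are hypothetical.

```python
def classic_nag(theta, v, grad, lr, mu):
    """Two-step NAG: gradient at the peeked-ahead point theta + mu * v."""
    v_new = mu * v - lr * grad(theta + mu * v)
    return theta + v_new, v_new


def simplified_nag(big_theta, v, grad, lr, mu):
    """Reformulated NAG on big_theta = theta + mu * v (constant mu assumed)."""
    g = grad(big_theta)  # gradient at the current (shifted) parameter
    v_new = mu * v - lr * g
    # Same shape as standard momentum, but with different coefficients.
    big_theta_new = big_theta + mu * mu * v - (1.0 + mu) * lr * g
    return big_theta_new, v_new


def grad(x):
    """Gradient of the toy objective f(x) = (x - 3)**2."""
    return 2.0 * (x - 3.0)


lr, mu = 0.05, 0.9
theta, v = 0.0, 0.0
big_theta, w = theta + mu * v, v

max_gap = 0.0
for _ in range(50):
    theta, v = classic_nag(theta, v, grad, lr, mu)
    big_theta, w = simplified_nag(big_theta, w, grad, lr, mu)
    # The two trajectories should agree up to floating-point rounding.
    max_gap = max(max_gap, abs(big_theta - (theta + mu * v)))
```

The practical appeal of the reformulated version is that the gradient is evaluated at the stored parameter itself, so a solver does not need a separate peek-ahead copy of the weights.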

[1] [Sutskever, I., Martens, J., Dahl, G. and Hinton, G. E. On the importance of initialization and momentum in deep learning. In 30th International Conference on Machine Learning, Atlanta, USA, 2013. JMLR: W&CP volume 28.](http://jmlr.org/proceedings/papers/v28/sutskever13.html)

[2] [Bengio, Y., Boulanger-Lewandowski, N. and Pascanu, R. Advances in Optimizing Recurrent Networks. arXiv 1212.0901.](http://arxiv.org/abs/1212.0901)

