Adam/AdamW implementation minor fix #22628
Conversation
Good catch! I've updated the name to emphasize that both Adam and AdamW are affected. For reference, see also the papers cited in the documentation.
This change looks good to me. I'll run a few tests before merging.
facebook-github-bot left a comment
@vincentqb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Great, thanks!
@vincentqb merged this pull request in fed5ca1.
Has the same modification been done to the cpp implementation of Adam? |
I had forgotten about that. I have just addressed it in a new PR: #23737 |
Summary: This PR is a follow-up to pytorch#22628. I had submitted the PR for `adam.py` and `adamw.py` but had forgotten about `adam.cpp`.
Pull Request resolved: pytorch#23737
Differential Revision: D16623828
Pulled By: vincentqb
fbshipit-source-id: 4390fd751d1c0cd12f32214b4234d42a06dcbb20
Would it make sense to have the same modification in the implementation of SparseAdam? |
Yes indeed! I looked into it but apparently forgot to update it; I can go ahead and fix it in a separate PR, roughly along the lines of the sketch below.
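As a rough illustration of what that follow-up could look like, here is a minimal sketch of the analogous per-value update for sparse gradients. The function and its gathered-values interface are hypothetical (the real `SparseAdam` operates on the indices and values of sparse tensors); only the denominator line reflects the correction discussed in this PR.

```python
import math
import torch

def sparse_adam_values(exp_avg, exp_avg_sq, grad_values, step,
                       lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
    """Hypothetical per-value Adam update for the nonzero entries of a
    sparse gradient; exp_avg/exp_avg_sq hold the first and second moment
    estimates gathered at those entries."""
    beta1, beta2 = betas
    exp_avg = exp_avg * beta1 + (1 - beta1) * grad_values
    exp_avg_sq = exp_avg_sq * beta2 + (1 - beta2) * grad_values.pow(2)
    bias_correction1 = 1 - beta1 ** step
    bias_correction2 = 1 - beta2 ** step
    # Same correction as the dense fix: bias-correct v before adding eps,
    # so eps itself is never rescaled.
    denom = exp_avg_sq.sqrt().div_(math.sqrt(bias_correction2)).add_(eps)
    update = -(lr / bias_correction1) * exp_avg / denom
    return update, exp_avg, exp_avg_sq
```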
I have noticed a small discrepancy between theory and the implementation of AdamW, and of Adam in general. The epsilon in the denominator of the Adam update should not be scaled by the bias correction (Algorithm 2, L9-12); only the running average of the gradients (m) and of the squared gradients (v) should be scaled by their corresponding bias corrections.
In the current implementation, the epsilon is effectively scaled by the square root of `bias_correction2`. I have plotted this ratio (effective epsilon over specified epsilon) as a function of step, given `beta2 = 0.999` and `eps = 1e-8`. In the early steps of optimization, the ratio deviates slightly from theory (denoted by the horizontal red line).
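To make the discrepancy concrete, here is a minimal sketch of a single Adam step in PyTorch style. The names (`exp_avg`, `exp_avg_sq`, `bias_correction1`, `bias_correction2`) mirror `torch.optim.Adam`, but the function itself is an illustration rather than the library source.

```python
import math
import torch

def adam_step(p, grad, exp_avg, exp_avg_sq, step,
              lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
    """One illustrative Adam step on parameter p (updated in place)."""
    beta1, beta2 = betas
    exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)               # m_t
    exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)  # v_t

    bias_correction1 = 1 - beta1 ** step
    bias_correction2 = 1 - beta2 ** step

    # Before the fix, the step was computed as
    #   denom = exp_avg_sq.sqrt().add_(eps)
    #   step_size = lr * math.sqrt(bias_correction2) / bias_correction1
    # which folds sqrt(bias_correction2) into the whole denominator, so
    # eps is implicitly divided by sqrt(bias_correction2) as well.

    # After the fix, only v_t is bias-corrected before eps is added:
    denom = (exp_avg_sq.sqrt() / math.sqrt(bias_correction2)).add_(eps)
    step_size = lr / bias_correction1
    p.add_(exp_avg / denom, alpha=-step_size)
```

At `step = 1` with `beta2 = 0.999`, the pre-fix formulation effectively uses `eps / math.sqrt(0.001)`, roughly `31.6 * eps`, which is the early-step deviation the plot shows; the corrected version uses exactly `eps` at every step.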