Add gelu gradient for pytorch #21237
soumith left a comment
Please check the inline comment and resolve what is going on with the tolerance adjustment. Once you have clarity on that and have verified it is not a bug, go ahead and land.
Things reviewed in the diff:
- MKL and non-MKL implementations match in formula
- CUDA and CPU implementation match in formula
Things not reviewed in the diff:
- gradient formula is correct (relying on gradcheck to say it's right)
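For readers who want to sanity-check the gradient formula by hand rather than rely on gradcheck alone, here is a minimal plain-Python sketch. It assumes the erf-based definition GELU(x) = x * Phi(x), whose derivative is Phi(x) + x * phi(x), and compares that closed form against central finite differences. The function names are illustrative and are not taken from the diff.

```python
import math


def gelu(x: float) -> float:
    """GELU(x) = x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_grad(x: float) -> float:
    """d/dx GELU(x) = Phi(x) + x * phi(x), where phi is the standard normal PDF."""
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return cdf + x * pdf


def central_difference(f, x: float, eps: float = 1e-6) -> float:
    """Numerical derivative estimate: (f(x + eps) - f(x - eps)) / (2 * eps)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def max_gradient_error(points, eps: float = 1e-6) -> float:
    """Largest absolute gap between the analytic and numerical gradient."""
    return max(
        abs(gelu_grad(x) - central_difference(gelu, x, eps)) for x in points
    )


if __name__ == "__main__":
    sample_points = [-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0]
    print("max |analytic - numeric| =", max_gradient_error(sample_points))
```

This is a scalar stand-in for what gradcheck does on tensors, so it only checks the formula itself, not the CPU, MKL, or CUDA kernels.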
test/test_nn.py
This looks suspicious. Gradchecks run in double precision, so a tolerance of 1e-3 looks really high, and a custom eps is usually not needed. Any idea what is going on? Can you check some sample inputs to inspect?
This part just came from testing the numerical stability of the gradchecks. The custom atol has been removed, and the test now uses the default value.
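To illustrate why a 1e-3 tolerance looks loose for a double-precision check, the following plain-Python sketch compares central-difference error when the function values keep full double precision against the error when they are rounded to single precision first. It assumes the erf-based GELU and a step of 1e-6, which I understand to be gradcheck's default eps. The helper names are hypothetical, and rounding only the outputs is a simplification of real float32 arithmetic.

```python
import math
import struct


def gelu(x: float) -> float:
    """Erf-based GELU: x * Phi(x)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_grad(x: float) -> float:
    """Analytic derivative: Phi(x) + x * phi(x)."""
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return cdf + x * pdf


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest IEEE-754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def numeric_grad(x: float, eps: float = 1e-6, single: bool = False) -> float:
    """Central difference; optionally round the function values to float32."""
    hi, lo = gelu(x + eps), gelu(x - eps)
    if single:
        hi, lo = to_float32(hi), to_float32(lo)
    return (hi - lo) / (2.0 * eps)


def worst_error(points, single: bool) -> float:
    """Largest absolute gap between analytic and numerical gradients."""
    return max(abs(gelu_grad(x) - numeric_grad(x, single=single)) for x in points)


POINTS = [-3.0, -1.0, 0.5, 1.0, 3.0]

if __name__ == "__main__":
    print("double precision worst error:", worst_error(POINTS, single=False))
    print("single precision worst error:", worst_error(POINTS, single=True))
```

With double-precision values the numerical and analytic gradients agree far more tightly than 1e-3, so a check that only passes at that tolerance would normally point to a real discrepancy rather than floating-point noise.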
d773d72 to b333c2c
b333c2c to 2ef4e48
2ef4e48 to c1b1d5d
c1b1d5d to d247ea1
Summary: Add gelu activation forward on CPU in pytorch Differential Revision: D15400974 fbshipit-source-id: 1c59104bea69cbe26ab96921e242131890db657e
Summary: Pull Request resolved: pytorch#21237 Add gelu gradient for pytorch Reviewed By: zheng-xq Differential Revision: D15589816 fbshipit-source-id: 2feb4ed779cda1dec3fe03fcfba29861b4a86d12
d247ea1 to aaf94f7
Summary: Pull Request resolved: pytorch/pytorch#21237 Add gelu gradient for pytorch Reviewed By: zheng-xq Differential Revision: D15589816 fbshipit-source-id: 76fda7c413afed5b6cc3abe3a26c258d393a53ce
This pull request has been merged in 31c79b7.
Summary: Add gelu gradient for pytorch
Differential Revision: D15589816