[jit] dropout symbolic_script should respect the training flag #20760
Conversation
…flag" [jit] dropout symbolic_script should respect the training flag as title gh-metadata: pytorch pytorch 20760 gh/suo/41/head
…flag" [jit] dropout symbolic_script should respect the training flag as title gh-metadata: pytorch pytorch 20760 gh/suo/41/head
is it possible to add a test to prevent regression in the future?
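A regression test along these lines might look roughly like the sketch below. This is not the test added by this PR; the use of torch.jit.script and F.dropout is my assumption about how one would exercise the scripted, differentiable path. It checks that in eval mode both the output and the gradient are pass-throughs.

```python
import torch
import torch.nn.functional as F

# Hypothetical regression test sketch, not the actual test in this PR.
@torch.jit.script
def scripted_dropout(x: torch.Tensor, p: float, train: bool) -> torch.Tensor:
    return F.dropout(x, p=p, training=train)

def test_dropout_respects_training_flag():
    x = torch.randn(4, 4, requires_grad=True)
    # Eval mode: dropout must behave as the identity.
    out = scripted_dropout(x, 0.5, False)
    assert torch.equal(out, x)
    # The gradient must also flow through unchanged (no mask, no 1/p scaling).
    out.sum().backward()
    assert torch.equal(x.grad, torch.ones_like(x))
```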
torch/csrc/jit/symbolic_script.cpp (outdated diff)
    res = mask * input / p1m
    p1m = 1.
    res = input
    mask = torch.zeros(0)
The definition of backward when is_training = False is not correct.
oops this got merged out somehow
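To spell out the point being made: when is_training is False the forward is the identity, so the backward must simply pass the gradient through rather than applying the mask / p1m scaling. Below is a minimal sketch of that branching in plain Python; it is not the actual TorchScript string in symbolic_script.cpp, and the variable names (p1m, mask, res) just follow the diff above.

```python
import torch

# Sketch only: a dropout forward/backward pair that respects the training flag.
def dropout_fwd_bwd(input: torch.Tensor, p: float, train: bool):
    if train:
        p1m = 1.0 - p                                   # keep probability (assumes p < 1)
        mask = torch.empty_like(input).bernoulli_(p1m)  # 0/1 keep mask
        res = mask * input / p1m                        # inverted-dropout scaling
    else:
        # Eval mode: dropout is the identity, so no mask or rescaling.
        p1m = 1.0
        mask = torch.zeros(0)
        res = input

    def backward(grad_output: torch.Tensor) -> torch.Tensor:
        if train:
            return grad_output * mask / p1m
        # Identity forward => identity backward; reusing the training formula
        # (mask * grad / p1m) in eval mode is exactly the bug being fixed.
        return grad_output

    return res, backward
```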
…flag" [jit] dropout symbolic_script should respect the training flag as title gh-metadata: pytorch pytorch 20760 gh/suo/41/head
…flag" [jit] dropout symbolic_script should respect the training flag as title. This unfortunately means that the forward for dropout doesn't fuse completely anymore, but the "important" parts are fused, and all we're adding is the is_training check overhead. The only time we're doing "extra" stuff is if we 1) require_grad and 2) are not training, which seems like uncommon things. gh-metadata: pytorch pytorch 20760 gh/suo/41/head
…flag" [jit] dropout symbolic_script should respect the training flag as title. This unfortunately means that the forward for dropout doesn't fuse completely anymore, but the "important" parts are fused, and all we're adding is the is_training check overhead. The only time we're doing "extra" stuff is if we 1) require_grad and 2) are not training, which seems like uncommon things. gh-metadata: pytorch pytorch 20760 gh/suo/41/head
Stack from ghstack:
as title
Differential Revision: D15486511