Emphasize all DDP forward() outputs must participate in computing loss #20586
Conversation
function outputs must participate in calculating loss.
facebook-github-bot
left a comment
@mrshenli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
torch/nn/parallel/distributed.py (Outdated)

    as being ready to be reduced. Note that all
    ``forward`` outputs must participate in
    calculating loss. Otherwise, those unused
    parameters will not be detected.
Note that all ``forward`` outputs that are derived from module parameters must participate in calculating loss and later the gradient computation. If they don't, this wrapper will hang waiting for autograd to produce gradients for those parameters. Any outputs derived from module parameters that are otherwise unused can be detached from the autograd graph using ``torch.Tensor.detach``.
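As a rough illustration of the pattern this wording describes, here is a minimal sketch (module structure, names, and shapes are made up for the example) in which one forward output feeds the loss and the other is detached so the DDP reducer does not wait for gradients for the parameters that produced it:

```python
import torch
import torch.nn as nn

# Hypothetical two-headed module: forward() returns two outputs, but only
# one of them will be used to compute the loss.
class TwoHeadModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.head_a = nn.Linear(10, 1)
        self.head_b = nn.Linear(10, 1)

    def forward(self, x):
        out_a = self.head_a(x)
        # out_b is derived from module parameters (head_b) but will not be
        # used in the loss. Detaching it removes head_b from the autograd
        # graph reachable from the outputs, so DDP does not expect (and
        # does not hang waiting for) gradients for head_b's parameters.
        out_b = self.head_b(x).detach()
        return out_a, out_b

# Typical usage inside a DDP training loop (process group and device setup
# omitted for brevity):
#
#   model = nn.parallel.DistributedDataParallel(TwoHeadModel().cuda(),
#                                               device_ids=[rank])
#   out_a, out_b = model(inputs)
#   loss = loss_fn(out_a, targets)   # only out_a participates in the loss
#   loss.backward()
```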
pietern
left a comment
LGTM -- I added a bit more information to the part that ends up in the documentation.
facebook-github-bot
left a comment
@mrshenli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
CC @borguz @chenyangyu1988