Emphasize all DDP forward() outputs must participate in computing loss #20586
Conversation
function outputs must participate in calculating loss.
facebook-github-bot
left a comment
@mrshenli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
torch/nn/parallel/distributed.py (Outdated)

    as being ready to be reduced. Note that all
    ``forward`` outputs must participate in
    calculating loss. Otherwise, those unused
    parameters will not be detected.
Note that all ``forward`` outputs that are derived from module parameters must participate in calculating loss and later the gradient computation. If they don't, this wrapper will hang waiting for autograd to produce gradients for those parameters. Any outputs derived from module parameters that are otherwise unused can be detached from the autograd graph using ``torch.Tensor.detach``.
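As a rough illustration of the pattern this wording describes, here is a minimal sketch (module structure, names, and shapes are made up for the example) in which one forward output feeds the loss and the other is detached so the DDP reducer does not wait for gradients for the parameters that produced it:

```python
import torch
import torch.nn as nn

# Hypothetical two-headed module: forward() returns two outputs, but only
# one of them will be used to compute the loss.
class TwoHeadModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.head_a = nn.Linear(10, 1)
        self.head_b = nn.Linear(10, 1)

    def forward(self, x):
        out_a = self.head_a(x)
        # out_b is derived from module parameters (head_b) but will not be
        # used in the loss. Detaching it removes head_b from the autograd
        # graph reachable from the outputs, so DDP does not expect (and
        # does not hang waiting for) gradients for head_b's parameters.
        out_b = self.head_b(x).detach()
        return out_a, out_b

# Typical usage inside a DDP training loop (process group and device setup
# omitted for brevity):
#
#   model = nn.parallel.DistributedDataParallel(TwoHeadModel().cuda(),
#                                               device_ids=[rank])
#   out_a, out_b = model(inputs)
#   loss = loss_fn(out_a, targets)   # only out_a participates in the loss
#   loss.backward()
```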
pietern
left a comment
LGTM -- I added a bit more information to the part that ends up in the documentation.
facebook-github-bot
left a comment
@mrshenli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
CC @borguz @chenyangyu1988