
Conversation

@pietern (Contributor) commented Jun 20, 2019

Stack:
    :black_circle:  #22037 Support sparse gradients in DistributedDataParallel  💛
    :white_circle:  #22036 Add sparse tensor allreduce  💛

This adds support for sparse gradients to the reducer as well as to
the DistributedDataParallel wrapper. Note that an out-of-band signal
is needed to indicate whether a dense parameter (e.g. an embedding)
is expected to receive a sparse gradient. This information is passed
to the bucket assignment computation routine and the reducer as a
vector of booleans. Every parameter for which we expect a sparse
gradient is assigned its own bucket, because we cannot easily group
multiple unrelated sparse tensors.
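
For context, a minimal single-process sketch of what this enables at the Python level (the model, sizes, and rendezvous address below are illustrative, not part of this PR): marking an embedding with `sparse=True` makes its gradient a sparse tensor, and DistributedDataParallel now flags that parameter as expecting a sparse gradient and gives it a bucket of its own instead of grouping it with the dense gradients.

```python
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel

# Single-process process group so the sketch runs standalone; real jobs use
# multiple ranks. The rendezvous address here is purely illustrative.
dist.init_process_group(
    backend="gloo", init_method="tcp://127.0.0.1:29500", rank=0, world_size=1
)

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        # sparse=True means this parameter receives a sparse gradient.
        self.emb = nn.Embedding(1000, 16, sparse=True)
        self.fc = nn.Linear(16, 4)

    def forward(self, x):
        return self.fc(self.emb(x).mean(dim=1))

# The wrapper marks emb.weight as expecting a sparse gradient and assigns it
# its own bucket; the dense parameters are bucketed as before.
model = DistributedDataParallel(Model())

x = torch.randint(0, 1000, (8, 5))
model(x).sum().backward()

print(model.module.emb.weight.grad.is_sparse)  # True
print(model.module.fc.weight.grad.is_sparse)   # False
```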

Differential Revision: D15926383
Differential Version: 85311083
@pietern requested review from apaszke and mrshenli as code owners June 20, 2019 19:37
@pytorchbot added the oncall: distributed and module: nn labels Jun 20, 2019
pietern commented Jun 21, 2019

@pytorchbot retest this please

The Windows failures are likely unrelated and should be fixed by #22029.

@pietern added the triaged label Jun 21, 2019
pietern commented Jun 24, 2019

@pytorchbot retest this please

@facebook-github-bot
This pull request has been merged in 77eda8d.
