Support sparse gradients in DistributedDataParallel #22037
Closed
Stack:
:black_circle: #22037 Support sparse gradients in DistributedDataParallel 💛
:white_circle: #22036 Add sparse tensor allreduce 💛
This adds support for sparse gradients to the reducer as well as to
the DistributedDataParallel wrapper. Note that an out-of-band signal
is needed to indicate whether a dense parameter (e.g. an embedding)
is expected to receive a sparse gradient. This information is
passed to the bucket assignment computation routine and the reducer as
a vector of booleans. Every parameter for which we expect a sparse
gradient is assigned its own bucket, as we cannot easily group
multiple unrelated sparse tensors.
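
A minimal usage sketch, assuming a process group has already been initialized with a backend that supports sparse allreduce (added in #22036). The model and shapes below are illustrative; the key point is constructing the embedding with `sparse=True`, which makes its gradient a sparse tensor that the reducer now handles in a bucket of its own:

```python
import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        # sparse=True: this parameter is expected to receive a sparse
        # gradient, so the reducer assigns it its own bucket.
        self.embedding = nn.Embedding(1000, 64, sparse=True)
        self.linear = nn.Linear(64, 10)  # dense gradient, bucketed as usual

    def forward(self, x):
        return self.linear(self.embedding(x).mean(dim=1))

# Assumes torch.distributed.init_process_group(...) has already run.
model = DDP(Model())
out = model(torch.randint(0, 1000, (8, 5)))
out.sum().backward()  # embedding grad is allreduced as a sparse tensor
```

Per the description above, the wrapper derives the vector of booleans from the module itself (i.e. which parameters are expected to produce sparse gradients) and forwards it to the bucket assignment routine and the reducer; no extra user-facing argument is involved.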
Differential Revision: D15926383