Only populate grad accumulator to var mapping for find_unused_parameters=True in DDP #45942

Closed
rohan-varma wants to merge 5 commits into gh/rohan-varma/181/base from gh/rohan-varma/181/head

@rohan-varma rohan-varma commented Oct 7, 2020

Stack from ghstack:

We only need this mapping to traverse the autograd graph when
find_unused_parameters=True. Currently we populate it and keep it in memory
even when that flag is not set, which costs sizeof(pointer) * number of grad
accumulators of extra memory.

Also renames the variable to something more meaningful.

Differential Revision: D24154407
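
For readers who want the idea in isolation, here is a minimal Python sketch of the change described above. It is a toy model under stated assumptions, not the reducer code touched by this PR: `ToyReducer`, `grad_acc_to_var_index`, and `pointer_bytes_saved` are hypothetical names invented for illustration, and the grad accumulators are modeled as opaque hashable handles.

```python
import struct
from dataclasses import dataclass, field


@dataclass
class ToyReducer:
    """Toy stand-in for a DDP-style reducer (hypothetical, not PyTorch's class)."""

    grad_accumulators: list  # one opaque, hashable handle per parameter
    find_unused_parameters: bool = False
    # Hypothetical name: maps a grad accumulator handle to its parameter index.
    grad_acc_to_var_index: dict = field(default_factory=dict)

    def __post_init__(self):
        # Build the mapping only when the graph traversal that reads it can run.
        if self.find_unused_parameters:
            for index, acc in enumerate(self.grad_accumulators):
                self.grad_acc_to_var_index[acc] = index


def pointer_bytes_saved(num_grad_accumulators: int) -> int:
    """Estimate from the description: sizeof(pointer) * number of grad accumulators."""
    return struct.calcsize("P") * num_grad_accumulators
```

With the flag left at its default, the mapping stays empty, so nothing is held per grad accumulator. By the description's formula, a build with 8-byte pointers would save about 8 MB for one million grad accumulators; the real saving depends on the container's own overhead, which this sketch does not model.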

facebook-github-bot added the oncall: distributed label Oct 7, 2020
rohan-varma added a commit that referenced this pull request Oct 7, 2020
ghstack-source-id: 113723598
Pull Request resolved: #45942

codecov bot commented Oct 9, 2020

Codecov Report

No coverage uploaded for pull request base (gh/rohan-varma/181/base@b1374ed).
The diff coverage is n/a.

@@                    Coverage Diff                     @@
##             gh/rohan-varma/181/base   #45942   +/-   ##
==========================================================
  Coverage                           ?   68.28%           
==========================================================
  Files                              ?      410           
  Lines                              ?    53609           
  Branches                           ?        0           
==========================================================
  Hits                               ?    36608           
  Misses                             ?    17001           
  Partials                           ?        0           

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b1374ed...1e29d71.

@facebook-github-bot
Contributor

This pull request has been merged in f739875.

facebook-github-bot deleted the gh/rohan-varma/181/head branch October 17, 2020 14:15
Labels

Merged, oncall: distributed

3 participants