Make grad point to bucket buffer in DDP to save memory usage by zhaojuanmao · Pull Request #41954 · pytorch/pytorch

zhaojuanmao · 2020-07-23T22:24:56Z

Stack from ghstack:

- #43282 add doc regarding that grads are pointing to bucket views in DDP
- #41954 Make grad point to bucket buffer in DDP to save memory usage

Make both variable.grad() and the grad in the distautograd context point to the bucket buffer in DDP to save memory.
With this change, each grad is a view of a bucket buffer tensor. To keep that compatible with optimizer.zero_grad(), we
made the changes in #41283.

Note that we cannot make variable.grad() point to the bucket buffer at construction time, because we want to
keep grad undefined for unused parameters.
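
The aliasing idea described above can be sketched without PyTorch. The following is a minimal stdlib-only analogy, not the reducer implementation: a flat `array` stands in for the bucket buffer, `memoryview` slices stand in for grad views, and the names `param_sizes`, `offsets`, and `accumulate` are hypothetical. Grads start as `None` and only become views on first use, which mirrors the constraint that grad stays undefined for unused parameters.

```python
from array import array

# Hypothetical parameter sizes (element counts) packed into one bucket.
param_sizes = {"weight": 4, "bias": 2, "unused": 3}

# One flat, contiguous buffer standing in for the DDP bucket buffer.
bucket = array("d", [0.0] * sum(param_sizes.values()))
bucket_mem = memoryview(bucket)

# Record each parameter's slice of the bucket, but do not hand out views yet:
# every grad starts undefined (None), as at construction time.
offsets = {}
start = 0
for name, size in param_sizes.items():
    offsets[name] = (start, start + size)
    start += size
grads = {name: None for name in param_sizes}


def accumulate(name, values):
    """Accumulate a gradient in place, aliasing the bucket on first use."""
    if grads[name] is None:
        lo, hi = offsets[name]
        grads[name] = bucket_mem[lo:hi]  # a view: no second copy of the data
    view = grads[name]
    for i, v in enumerate(values):
        view[i] += v


accumulate("weight", [1.0, 2.0, 3.0, 4.0])
accumulate("bias", [0.5, 0.5])
# "unused" never receives a gradient, so grads["unused"] stays None.
```

Because the views share storage with the bucket, writing a gradient also fills the buffer that would be communicated, so no separate per-parameter gradient storage is needed.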

Test Plans:
For a roberta_base model with ~1GB of parameters, peak memory dropped by ~1GB (8250MB -> 7183MB). Per-iteration latency went from 0.982s to 0.909s, an 8% speedup; will rerun a few times to confirm the speedup.
For a resnet model with ~97M parameters, peak memory dropped by ~100MB (3089MB -> 2988MB). Per-iteration latency did not change (0.122s -> 0.123s).
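
As a quick check of the arithmetic behind these numbers, the sketch below recomputes the savings from the figures quoted in the test plan. The helper `summarize` is hypothetical, and the speedup is taken here as old latency divided by new latency minus one, which is an assumption about how the 8% figure was derived.

```python
def summarize(mem_before_mb, mem_after_mb, lat_before_s, lat_after_s):
    """Return (memory saved in MB, relative speedup) for one benchmark."""
    saved_mb = mem_before_mb - mem_after_mb
    speedup = lat_before_s / lat_after_s - 1.0
    return saved_mb, speedup


# Figures quoted in the test plan above.
roberta_saved, roberta_speedup = summarize(8250, 7183, 0.982, 0.909)
resnet_saved, resnet_speedup = summarize(3089, 2988, 0.122, 0.123)

print(f"roberta_base: {roberta_saved} MB saved, {roberta_speedup:.1%} speedup")
print(f"resnet: {resnet_saved} MB saved, {resnet_speedup:.1%} speedup")
```

In both cases the memory saved is close to the size of the model's parameters, which is what one would expect if one full copy of the gradients is no longer allocated.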

Differential Revision: D22707857

dr-ci · 2020-07-23T23:06:28Z

💊 CI failures summary and remediations

As of commit 2082a5a (more details on the Dr. CI page):

1/2 failures possibly* introduced in this PR
- 1/1 non-CircleCI failure(s)
1/2 broken upstream at merge base fa6b34b from Aug 17 until Aug 19 (43 commits; 768c2a8 - 1e248ca)

🚧 1 fixed upstream failure:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch. Since your merge base is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

binary_windows_libtorch_3_7_cpu_release_build from Aug 17 until Aug 19 (43 commits; 768c2a8 - 1e248ca)

ci.pytorch.org: 1 failed

Failed: pr/caffe2-pytorch-linux-xenial-rocm3.5.1-py3.6-test

torch/csrc/distributed/c10d/reducer.cpp

pritamdamania87 · 2020-07-29T02:50:44Z

Would be nice if we could also share some memory saving results as part of the PR description.

albanD

Looks mostly good.
I would add a comment here as well:

pytorch/torch/csrc/autograd/functions/accumulate_grad.h, lines 176 to 177 in c660d2a:

    // However, that accumulation is sometimes in place and sometimes not,
    // which may break user code.

as DDP now relies on the fact that AccumulateGrad always changes the .grad in place when it exists (when there is no double backward), even if the variable and the .grad don't have the same layout!
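
To illustrate why the in-place versus out-of-place distinction matters for aliasing, here is a small stdlib-only sketch. It is an analogy using plain lists rather than tensors, and the two helper functions are hypothetical stand-ins for the two accumulation styles, not AccumulateGrad itself.

```python
def accumulate_in_place(grad, new_grad):
    """Add new_grad into grad's existing storage and return the same object."""
    for i, g in enumerate(new_grad):
        grad[i] += g
    return grad


def accumulate_out_of_place(grad, new_grad):
    """Return a freshly allocated sum; the old storage is left untouched."""
    return [a + b for a, b in zip(grad, new_grad)]


bucket = [0.0, 0.0]  # stand-in for a bucket buffer
grad = bucket        # grad starts out aliasing the bucket

grad = accumulate_in_place(grad, [1.0, 2.0])
aliased_after_in_place = grad is bucket        # alias preserved

grad = accumulate_out_of_place(grad, [1.0, 1.0])
aliased_after_out_of_place = grad is bucket    # alias lost
```

After the out-of-place step the bucket still holds the old values while the grad lives in new storage, so the two have diverged and the memory saving is gone.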

zhaojuanmao · 2020-08-12T00:08:35Z

@albanD thanks for your review. I will add comments in accumulate_grad.h. Note, though, that if grads are mutated in place, DDP saves memory; if grads are somehow mutated out of place, DDP cannot save memory but still works as it does today (it checks whether the grad is an alias of the bucket buffer and, if not, copies the grad into the bucket buffer).

albanD · 2020-08-12T01:03:59Z

Yes, I agree it will still give the right result, but it will (silently!) defeat the purpose of this optimization. So it might be hard to detect that changes there actually break this optimization. Hence my request to add a comment there, to make people aware that this can happen.
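
The fallback discussed in this exchange can be sketched as follows. This is a stdlib-only analogy under stated assumptions: `mark_grad_ready` and the `stats` counter are hypothetical names, and the counter exists only to make the otherwise silent copy path observable. It is not the check in reducer.cpp.

```python
from array import array

bucket = array("d", [0.0] * 4)          # stand-in for the bucket buffer
bucket_view = memoryview(bucket)[0:4]   # the view a grad is expected to alias
stats = {"aliased": 0, "copied": 0}


def mark_grad_ready(grad):
    """Make sure the bucket holds grad's values before it is reduced."""
    if grad is bucket_view:
        stats["aliased"] += 1                # fast path: nothing to copy
    else:
        bucket_view[:] = array("d", grad)    # fallback: correct, but an extra copy
        stats["copied"] += 1
    return bucket_view


# In-place case: the grad is the bucket view itself.
bucket_view[0] = 1.0
mark_grad_ready(bucket_view)

# Out-of-place case: the grad ended up in separate storage.
mark_grad_ready([2.0, 2.0, 2.0, 2.0])
```

Both calls leave the bucket with correct values, which is why a regression to the copy path would not show up as a wrong result, only as higher memory use.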

Follow-up commits that reference this pull request:

- #44344 (reland of #41954): Adds one argument to the DDP API to enable or disable letting grads point to bucket views. When it is disabled, behavior is the same as DDP today; when it is enabled, both variable.grad() and the grad in the distautograd context point to the bucket buffer in DDP to save memory. In that case each grad is a view of a bucket buffer tensor, and compatibility with optimizer.zero_grad() relies on the changes in #41283. variable.grad() cannot be pointed at the bucket buffer at construction time, because grad must stay undefined for unused parameters. Differential Revision: D23588186
- #44326 (part of relanding #41954): Moves the rebuild_buckets call from the end of the first iteration to the beginning of the second iteration. Test Plan: unit tests. Reviewed By: mrshenli. Differential Revision: D23583017
- #44330 (part of relanding #41954): Separates initialize_bucket_views and populate_bucket_views_out, as they do different things and are called from different call sites. Differential Revision: D23583347
- #44798 ([test all], update for relanding): Same refactoring as #44326; in addition, in ddp.join(), _rebuild_buckets moves from the end of backward to the beginning of forward. Differential Revision: D23735185

zhaojuanmao requested review from mrshenli, pietern and pritamdamania87 as code owners July 23, 2020 22:24

zhaojuanmao mentioned this pull request Jul 23, 2020

grad detach_ only when it has grad_fn in zero_grad call #41283

Closed

pritamdamania87 reviewed Jul 29, 2020

torch/csrc/distributed/c10d/reducer.cpp (4 resolved review threads, 2 of them outdated)

rohan-varma self-requested a review July 31, 2020 00:40

zhaojuanmao requested review from albanD and apaszke as code owners August 4, 2020 23:27

mrshenli reviewed Aug 6, 2020

mrshenli mentioned this pull request Aug 7, 2020

Cap DDP total number of buckets #39022

Open

pritamdamania87 reviewed Aug 11, 2020

torch/nn/parallel/distributed.py (1 resolved review thread, outdated)

pritamdamania87 mentioned this pull request Aug 11, 2020

Join-based API to support DDP uneven inputs #42577

Closed

12 tasks

albanD reviewed Aug 11, 2020

test/distributed/test_distributed.py (1 resolved review thread)

torch/csrc/distributed/c10d/reducer.cpp (1 resolved review thread, outdated)

zhaojuanmao mentioned this pull request Sep 16, 2020

[reland] move rebuild buckets from end of first iteration to beginning of second iteration #44798

Closed

This was referenced Sep 17, 2020

[For ci-all tests] refactor intialize bucket views #44865

Closed

[reland ci-all tests] move rebuild buckets from end of first iteration to beginning of second iteration #44893

Closed

zhaojuanmao wants to merge 8 commits into gh/zhaojuanmao/49/base from gh/zhaojuanmao/49/head

6 participants
