-
Notifications
You must be signed in to change notification settings - Fork 26.3k
[DDP] Refactor and remove sync_params #64514
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
sync_params is a misnomer since we don't actually synchroniz parameters. While removing this I realized `self._check_and_sync_module_buffers` does almost everything we need it to, so just refactored that and made DDP forward call into it. Differential Revision: [D30751231](https://our.internmc.facebook.com/intern/diff/D30751231/) [ghstack-poisoned]
🔗 Helpful links
💊 CI failures summary and remediationsAs of commit 0eace72 (more details on the Dr. CI page): 💚 💚 Looks good so far! There are no failures yet. 💚 💚 This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions to the (internal) Dr. CI Users group. |
sync_params is a misnomer since we don't actually synchroniz parameters. While removing this I realized `self._check_and_sync_module_buffers` does almost everything we need it to, so just refactored that and made DDP forward call into it. Differential Revision: [D30751231](https://our.internmc.facebook.com/intern/diff/D30751231/) cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski @SciPioneer @H-Huang mrzzd cbalioglu gcramer23 [ghstack-poisoned]
torch/nn/parallel/distributed.py
Outdated
| # sync params according to location (before/after forward) user | ||
| # specified as part of hook, if hook was specified. | ||
| buffer_hook_registered = hasattr(self, 'buffer_hook') | ||
| if self.require_forward_param_sync and (not buffer_hook_registered or self.buffer_hook.buffer_comm_hook_location == _BufferCommHookLocation.PRE_FORWARD): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: put this into a helper function, it could also include 'if self.will_sync_module_buffers()' check in this helper function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.g., will_sync_module_buffers() can include require_forward_param_sync check; should_sync_buffer_forward(), should_sync_buffer_backward()
sync_params is a misnomer since we don't actually synchroniz parameters. While removing this I realized `self._check_and_sync_module_buffers` does almost everything we need it to, so just refactored that and made DDP forward call into it. Differential Revision: [D30751231](https://our.internmc.facebook.com/intern/diff/D30751231/) cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski @SciPioneer @H-Huang mrzzd cbalioglu gcramer23 [ghstack-poisoned]
sync_params is a misnomer since we don't actually synchroniz parameters. While removing this I realized `self._check_and_sync_module_buffers` does almost everything we need it to, so just refactored that and made DDP forward call into it. Differential Revision: [D30751231](https://our.internmc.facebook.com/intern/diff/D30751231/) cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski @SciPioneer @H-Huang mrzzd cbalioglu gcramer23 [ghstack-poisoned]
sync_params is a misnomer since we don't actually synchroniz parameters. While removing this I realized `self._check_and_sync_module_buffers` does almost everything we need it to, so just refactored that and made DDP forward call into it. Differential Revision: [D30751231](https://our.internmc.facebook.com/intern/diff/D30751231/) cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski @SciPioneer @H-Huang mrzzd cbalioglu gcramer23 [ghstack-poisoned]
sync_params is a misnomer since we don't actually synchroniz parameters. While removing this I realized `self._check_and_sync_module_buffers` does almost everything we need it to, so just refactored that and made DDP forward call into it. Differential Revision: [D30751231](https://our.internmc.facebook.com/intern/diff/D30751231/) cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski @SciPioneer @H-Huang mrzzd cbalioglu gcramer23 [ghstack-poisoned]
sync_params is a misnomer since we don't actually synchroniz parameters. While removing this I realized `self._check_and_sync_module_buffers` does almost everything we need it to, so just refactored that and made DDP forward call into it. Differential Revision: [D30751231](https://our.internmc.facebook.com/intern/diff/D30751231/) cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski @SciPioneer @H-Huang mrzzd cbalioglu gcramer23 [ghstack-poisoned]
sync_params is a misnomer since we don't actually synchroniz parameters. While removing this I realized `self._check_and_sync_module_buffers` does almost everything we need it to, so just refactored that and made DDP forward call into it. Differential Revision: [D30751231](https://our.internmc.facebook.com/intern/diff/D30751231/) cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski @SciPioneer @H-Huang mrzzd cbalioglu gcramer23 [ghstack-poisoned]
|
This pull request has been merged in 5739f77. |
Stack from ghstack:
sync_params is a misnomer since we don't actually synchroniz
parameters. While removing this I realized
self._check_and_sync_module_buffersdoes almost everything we need it to, sojust refactored that and made DDP forward call into it.
Differential Revision: D30751231
cc @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @agolynski @SciPioneer @H-Huang @mrzzd @cbalioglu @gcramer23