-
Notifications
You must be signed in to change notification settings - Fork 26.3k
[c10d] Move DDP broadcast coalesced to C++ #9729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@pytorchbot retest this please |
16fff4f to
b23dd12
Compare
teng-li
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, please follow the c10d coding convention.
torch/csrc/distributed/c10d/ddp.h
Outdated
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
|
@pytorchbot retest this please |
facebook-github-bot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@goldsborough has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
torch/csrc/distributed/c10d/ddp.h
Outdated
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
torch/csrc/distributed/c10d/ddp.h
Outdated
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
torch/csrc/distributed/c10d/init.cpp
Outdated
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
b1bf41c to
23c6713
Compare
|
@apaszke can you take a look at the new version? |
|
@pytorchbot retest this please |
teng-li
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rest looks good to me.
Please run clang-format-7 -i on both files before landing
torch/csrc/distributed/c10d/ddp.h
Outdated
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
2cfa2a1 to
5108147
Compare
|
clang-formatted and re-ran tests on FAIR cluster. Landing |
facebook-github-bot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
goldsborough is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: This PR depends on the tests added in pytorch#9670. It moves the first, tiny function from the c10d DDP to C++: `dist_broadcast_coalesced`. Let me know if ` torch/csrc/distributed/c10d/ddp.h` will be a good place to put these rewritten functions. pietern The controller you requested could not be found. apaszke Pull Request resolved: pytorch#9729 Differential Revision: D8985308 Pulled By: goldsborough fbshipit-source-id: dc459fe9040273714044152063585e746974752f
Summary: The next function I'm moving to C++ is `sync_params`. It is stacked on top of #9729, so some changes will go away when it lands and I rebase. I also split code into a `.h` and `.cpp` file for better code organization. The controller you requested could not be found. pietern apaszke Pull Request resolved: #9805 Differential Revision: D9688604 Pulled By: goldsborough fbshipit-source-id: 4467104d3f9e2354425503b9e4edbd59603e20a8
Summary: The next function I'm moving to C++ is `sync_params`. It is stacked on top of pytorch#9729, so some changes will go away when it lands and I rebase. I also split code into a `.h` and `.cpp` file for better code organization. The controller you requested could not be found. pietern apaszke Pull Request resolved: pytorch#9805 Differential Revision: D9688604 Pulled By: goldsborough fbshipit-source-id: 4467104d3f9e2354425503b9e4edbd59603e20a8
This PR depends on the tests added in #9670. It moves the first, tiny function from the c10d DDP to C++:
dist_broadcast_coalesced. Let me know iftorch/csrc/distributed/c10d/ddp.hwill be a good place to put these rewritten functions.@pietern @teng-li @apaszke