[SymmMEM] Allow to import _SymmetricMemory when NVSHMEM is not available #162142
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/162142
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 591b87d with merge base ed77e23.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
try:
    from torch._C._distributed_c10d import _SymmetricMemory
except ImportError:
    from torch.distributed._C_stubs import _SymmetricMemory
Should we always do this, to decouple the import from NVSHMEM even when it is available, and import NVSHMEM explicitly only when needed?
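One way to read this suggestion is as a lazy, explicit import at the point of use. The sketch below is only an illustration under that assumption: `require_backend` is a hypothetical helper, and the module name passed to it is supplied by the caller rather than being a real PyTorch or NVSHMEM module path.

```python
import importlib
from functools import lru_cache


@lru_cache(maxsize=None)
def require_backend(module_name):
    """Import an optional backend module on first use and cache the result."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Surface a clear error only when the backend is actually requested.
        raise RuntimeError(
            f"backend module {module_name!r} is not available in this build"
        ) from exc


if __name__ == "__main__":
    # "json" stands in for an optional backend that happens to be installed.
    backend = require_backend("json")
    print(backend.__name__)
```

With this shape, code that never touches the optional backend never triggers its import, and a missing backend produces an error only at the call site that needs it.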
Let's not forget nvshmem4py that exists now... sigh...
Will land this PR first to fix the bug, since the current code breaks async TP for users who do not have NVSHMEM.
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
Merge failed
Reason: Command
Details for Dev Infra team
Raised by workflow job
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
This PR got indirectly reverted from #162568. I need to remember to amend it in
…ibuted modules importable even when backend not built (pytorch#159889) Summary: Original: D81957844 and D81957923 Also, pytorch#162142 is patched in as well #buildall Test Plan: sandcastle and oss ci Rollback Plan: Reviewed By: H-Huang Differential Revision: D82113620
…modules importable even when backend not built (#159889) (#162594) Summary: Original: D81957844 and D81957923 Also, #162142 is patched in as well #buildall Test Plan: sandcastle and oss ci Rollback Plan: Reviewed By: H-Huang Pull Request resolved: #162594 Approved by: https://github.com/H-Huang, https://github.com/dcci
…ble (pytorch#162142) Summary: As we have multiple backends, _SymmetricMemory should not be imported together with NVSHMEM related modules Pull Request resolved: pytorch#162142 Approved by: https://github.com/dcci, https://github.com/kwen2501
Stack from ghstack (oldest at bottom):
Summary:
Because there are multiple backends, _SymmetricMemory should not be imported together with NVSHMEM-related modules.
cc @H-Huang @awgu @wanchaol @fduwjj @wz337 @wconstab @d4l3k @pragupta @ezyang @msaroufim