[CUDA] [CI] Add cu124 docker images #125944
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125944
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 198cba4 with merge base aeb9934.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@nWEIdia The two errors do not seem to be related to this PR; however, please rebase and rerun CI to be sure.
Remove cu121 related changes as this is cu124
And sneak in some nice change :)
75b0926 to 0135402
@pytorchmergebot merge -f "One failure on distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_parity_powerSGD is not related"
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use
Learn more about merging in the wiki.
Questions? Feedback? Please reach out to the PyTorch DevX Team
❌ 🤖 pytorchbot command failed: Try
@clee2000 Good catch! I now realize there might be a UCC/UCX-related regression: the newer UCC/UCX may not work as well with CUDA 11.8.
@pytorchbot successfully started a revert job. Check the current status here.
@nWEIdia your PR has been successfully reverted.
This reverts commit 5fb4a76. Reverted #125944 on behalf of https://github.com/nWEIdia due to test failure seems related https://hud.pytorch.org/pytorch/pytorch/commit/5fb4a766b88bcf633a23610bd66de0f3020f7c66 https://github.com/pytorch/pytorch/actions/runs/9085206167/job/24972040039 ([comment](#125944 (comment)))
together with ucx
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours).
Learn more about merging in the wiki.
Questions? Feedback? Please reach out to the PyTorch DevX Team
Fixes issues encountered in pytorch#121956
Pull Request resolved: pytorch#125944
Approved by: https://github.com/atalman
Fixes issues encountered in #121956
cc @ptrblck @Aidyn-A @atalman @malfet