[triton] Update pin for PyTorch 2.6/Triton 3.2 #139206
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/139206
Note: Links to docs will display an error until the docs builds have been completed.
❌ 5 New Failures, 1 Unrelated Failure
As of commit 53f4666 with merge base 740d1eb:
NEW FAILURES - The following jobs have failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
I'm going to look at the cumsum failure. Edit: issue here: #139348
Force-pushed from c1ad437 to 5b030ef.
Binary build is failing in trunk after #139206 landed; for example, https://github.com/pytorch/pytorch/actions/runs/11981181986/job/33410250461#step:17:539. The issue is tricky to spot: the difference is between `3.2.0+35c6c7c628`, set by PyTorch, and `3.2.0+git35c6c7c6`, from Triton (look closely: one hash has 10 characters, the other 8). Triton now has its own nightly build logic in triton-lang/triton#4812 that takes only 8 characters by default, while the original logic from PT took 10. So the PT nightly couldn't find the dependency.

Pull Request resolved: #141410
Approved by: https://github.com/seemethere, https://github.com/malfet
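The mismatch described in the commit message above can be reproduced with a minimal sketch. The two helper functions are hypothetical stand-ins, not the actual PyTorch or Triton build scripts; they only mimic the two suffix formats quoted in the message. The commit value is the 10-character prefix that appears in those version strings.

```python
# 10-character commit prefix quoted in the commit message above.
PINNED_COMMIT_PREFIX = "35c6c7c628"
BASE_VERSION = "3.2.0"


def pytorch_style_version(base: str, commit: str, length: int = 10) -> str:
    """Hypothetical helper: bare hash suffix, 10 characters by default."""
    return f"{base}+{commit[:length]}"


def triton_style_version(base: str, commit: str, length: int = 8) -> str:
    """Hypothetical helper: 'git'-prefixed hash suffix, 8 characters by default."""
    return f"{base}+git{commit[:length]}"


expected = pytorch_style_version(BASE_VERSION, PINNED_COMMIT_PREFIX)
actual = triton_style_version(BASE_VERSION, PINNED_COMMIT_PREFIX)

print(expected)            # 3.2.0+35c6c7c628
print(actual)              # 3.2.0+git35c6c7c6
print(expected == actual)  # False: an exact-version dependency lookup fails
```

Because the dependency is matched on the exact version string, any difference in the local suffix is enough for the nightly build to miss the wheel.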
Thanks @bertmaher, we will rebase our PR #137886 to update XPU triton.
This reverts commit 74cde3c.

# Motivation
PyTorch community has uplifted triton to 3.2.0 in pytorch/pytorch#139206. We should follow it.

# Solution
Revert this commit introduced in #2716
Bump the Triton pin to the release candidate commit for Triton 3.2.

A few changes beyond the pin bump itself are needed:
* Remove the script that adds a git version hash suffix to the Triton wheel, since as of triton-lang/triton#4812 Triton adds that itself
* Add `pybind11` to the Triton build setup, since Triton now depends on it
* Use manylinux-2.28 for the Triton wheel builder, and use clang+lld for building to pick up the right glibc

Pull Request resolved: pytorch#139206
Approved by: https://github.com/malfet, https://github.com/atalman
Co-authored-by: Andrey Talman <[email protected]>
This reverts commit c93e57e. Reverted pytorch#139206 on behalf of https://github.com/atalman due to Will revert and reland skipping xpu builds ([comment](pytorch#139206 (comment)))
The Triton XPU build was temporarily stopped by pytorch#139206 until the Triton XPU upgrade PR pytorch#137886 landed. Works for pytorch#139722 and pytorch#114850

Pull Request resolved: pytorch#141775
Approved by: https://github.com/atalman
Summary: Following upstream Triton pin update (pytorch/pytorch#139206), we can now enable more kernels in the CI. Pull Request resolved: #96 Reviewed By: adamomainz Differential Revision: D66763281 Pulled By: xuzhao9 fbshipit-source-id: 943472995c2da0f4cd0d01665b53786614f122ae
Update triton version to align with PyTorch: pytorch/pytorch#139206
…4a (#3497) Summary: X-link: facebookresearch/FBGEMM#577 Update triton version to align with PyTorch: pytorch/pytorch#139206 Pull Request resolved: #3497 Reviewed By: q10 Differential Revision: D67075647 Pulled By: brad-mengchi fbshipit-source-id: 6adc76484d1f6b827d62615c4311cf4b9fffd6b6
This seems to break int_mm (#144705). I'll try to produce a standalone Triton kernel to reproduce this error.
…4a (pytorch#577) Summary: Pull Request resolved: facebookresearch/FBGEMM#577 Update triton version to align with PyTorch: pytorch/pytorch#139206 X-link: pytorch#3497 Reviewed By: q10 Differential Revision: D67075647 Pulled By: brad-mengchi fbshipit-source-id: 6adc76484d1f6b827d62615c4311cf4b9fffd6b6
Bump the Triton pin to the release candidate commit for Triton 3.2.
A few changes beyond the pin bump itself are needed:
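* Remove the script that adds a git version hash suffix to the Triton wheel, since as of triton-lang/triton#4812 Triton adds that itself
* Add `pybind11` to the Triton build setup, since Triton now depends on it
* Use manylinux-2.28 for the Triton wheel builder, and use clang+lld for building to pick up the right glibc

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov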