[CUDA] Only use vec128 if CUDA version is newer than 12.8 #150705
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/150705
Note: Links to docs will display an error until the docs builds have been completed.
❌ 7 New Failures, 2 Unrelated Failures as of commit 482e98a with merge base 5b368fa.
NEW FAILURES - The following jobs have failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
fcdd985 to f65fbb5 (force-pushed)
Should USE_ROCM here also be inverted if the CUDA_VERSION condition is >= 12080?
No, I don't think so. Before #145746, vec8_alignment was only available under USE_ROCM; after it, it was enabled unconditionally. I want it to be enabled for either ROCm or CUDA newer than 12.6.
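As a rough illustration of the guard being discussed (my own sketch, not the actual PyTorch source; the macro name VEC8_PATH_ENABLED is an assumption): the 8-wide path stays on for ROCm and is additionally allowed for a new-enough CUDA toolkit, rather than being enabled unconditionally as after #145746.

#include <cstdio>

// Sketch only, not the PyTorch source: keep the 8-wide vectorized path for
// ROCm builds, and additionally allow it when the CUDA toolkit is 12.8+,
// instead of enabling it unconditionally.
#if defined(USE_ROCM) || (defined(CUDA_VERSION) && CUDA_VERSION >= 12080)
#define VEC8_PATH_ENABLED 1
#else
#define VEC8_PATH_ENABLED 0
#endif

int main() {
  // Prints 1 on ROCm or CUDA >= 12.8 builds, 0 otherwise.
  std::printf("vec8 path enabled: %d\n", VEC8_PATH_ENABLED);
}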
f65fbb5 to 837e8b2 (force-pushed)
atalman left a comment:
lgtm
    return 8;
  } else
-#else
+#elif defined(CUDA_VERSION) && CUDA_VERSION >= 12080
Shouldn't there be some logic to handle the case when CUDA_VERSION < 12080?
Hi @ZainRizvi, this is basically redoing #145746, but only if CUDA >= 12.8.
Hence this code should not be applied by default, but only for CUDA 12.8+:
if (address % vec8_alignment == 0) {
  return 8;
} else
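For illustration, a minimal sketch of the fallback behavior (my own code with assumed names and alignment values, not the actual PyTorch implementation): when neither USE_ROCM nor CUDA_VERSION >= 12080 holds, the vec8 branch is compiled out entirely and the dangling else chains straight into the narrower checks, so older CUDA builds keep the previous 4/2/1 behavior.

#include <cstdint>

// Sketch only -- names and alignment values below are assumptions, not the
// real PyTorch kernel code. The guarded branch exists only on ROCm or
// CUDA 12.8+; otherwise the function falls back to widths 4, 2, or 1.
inline int max_vector_width(const void* ptr) {
  const auto address = reinterpret_cast<std::uintptr_t>(ptr);
  constexpr std::uintptr_t vec8_alignment = 32;  // assumed 8 x 4-byte elements
  constexpr std::uintptr_t vec4_alignment = 16;
  constexpr std::uintptr_t vec2_alignment = 8;
#if defined(USE_ROCM) || (defined(CUDA_VERSION) && CUDA_VERSION >= 12080)
  if (address % vec8_alignment == 0) {
    return 8;
  } else
#endif
  if (address % vec4_alignment == 0) {
    return 4;
  } else if (address % vec2_alignment == 0) {
    return 2;
  }
  return 1;
}

The PR itself uses an #elif on the existing preprocessor chain (as in the diff above) rather than a single combined condition, but the effect for the vec8 branch is the same.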
Do we need to tag ciflow/binaries to check the size reduction?

No, not really, one generated in

Attached ciflow/binaries just in case, to validate that binaries are built correctly and that we see a difference between cu126 and cu128.
Hi @malfet, looks like we are getting
@pytorchmergebot merge -f "trunk jobs look good, manywheel jobs as well"

Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchbot cherry-pick --onto release/2.7 -c critical
By addressing feedback requested at #145746
Pull Request resolved: #150705
Approved by: https://github.com/atalman
(cherry picked from commit 5228986)
Cherry picking #150705
The cherry pick PR is at #150818 and it is recommended to link a critical cherry pick PR with an issue. The following tracker issues are updated:
Details for Dev Infra team: Raised by workflow job
[CUDA] Only use vec128 if CUDA version is newer than 12.8 (#150705)
By addressing feedback requested at #145746
Pull Request resolved: #150705
Approved by: https://github.com/atalman
(cherry picked from commit 5228986)
Co-authored-by: Nikita Shulga <[email protected]>
@pytorchmergebot revert -c nosignal -m "break periodic tests"
@pytorchbot successfully started a revert job. Check the current status here.
…50705)"
This reverts commit 5228986.
Reverted #150705 on behalf of https://github.com/atalman due to break periodic tests ([comment](#150705 (comment)))
@malfet your PR has been successfully reverted.
Let's see what will happen if I just try to revert that PR completely: #150895
…0705)
By addressing feedback requested at pytorch#145746
Pull Request resolved: pytorch#150705
Approved by: https://github.com/atalman

…torch#150705)"
This reverts commit 5228986.
Reverted pytorch#150705 on behalf of https://github.com/atalman due to break periodic tests ([comment](pytorch#150705 (comment)))
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Successfully rebased |
5043e31 to 2ef5638 (force-pushed)
2ef5638 to 482e98a (force-pushed)
By addressing feedback requested at #145746