[cuDNN][SDPA][Convolution] Expose cuDNN runtime version in CUDA hooks #167111
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/167111
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure)
As of commit caa7a77 with merge base 5c63946. FLAKY - The following job failed but was likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
long versionCUDART() const override;
long versionCuDNN() const override;
long versionRuntimeCuDNN() const override;
long versionCuDNNFrontend() const override;
Why does the runtime cuDNN frontend version matter? It can't be changed, right? It's a compile-time header include?
I sidecar'd this change in since we'll need it in the near future for SDPA issues that require a cuDNN frontend version to be available for gating. In theory sdp_utils.cpp could access this directly, but I'm not sure I want to include that there.
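For context, a minimal sketch of the kind of gating described here, assuming the hooks land as shown in the diff above; the function name and version thresholds are placeholders, not taken from sdp_utils.cpp:

```cpp
// Minimal sketch (assumed, not actual sdp_utils.cpp code): gate the cuDNN SDPA
// backend on versions queried through the CUDA hooks, so SDPA code does not
// need to include cuDNN headers directly.
#include <ATen/detail/CUDAHooksInterface.h>

bool cudnn_sdpa_version_ok_example() {
  const auto& hooks = at::detail::getCUDAHooks();
  if (!hooks.hasCuDNN()) {
    return false;
  }
  // versionRuntimeCuDNN() is the runtime library version exposed by this PR;
  // versionCuDNNFrontend() reports the frontend version the build saw.
  return hooks.versionRuntimeCuDNN() >= 91000 &&   // placeholder minimum
         hooks.versionCuDNNFrontend() >= 11000;    // placeholder minimum
}
```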
Can the runtime version be different for the cuDNN frontend, or should it be constexpr?
static bool hasCuDNN() {
  return detail::getCUDAHooks().hasCuDNN();
}
static long versionCuDNN() {
If this is really compile time, why no constexpr? It would enable `if constexpr` logic that could simplify critical code paths in cuDNN dispatch.
yes, see:

return CUDNN_VERSION;

other uses of CUDNN_VERSION in the file are macros, etc.
Yeah, if they are macros they should be propagated with constexpr then. :)
Yeah, CUDNN_FRONTEND has its equivalent function as constexpr.
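To make the distinction concrete, here is a hedged sketch of the two flavors being discussed (the function names are illustrative, not the exact ATen API): the compile-time value comes from the CUDNN_VERSION macro and can be constexpr, while the runtime value must call into whatever library is actually loaded and cannot be.

```cpp
#include <cudnn.h>  // provides CUDNN_VERSION and cudnnGetVersion()

// Compile-time version: resolved when PyTorch is built, so it can be constexpr
// and used with `if constexpr` to prune dispatch branches at compile time.
constexpr long compileTimeCuDNNVersionExample() {
  return CUDNN_VERSION;
}

// Runtime version: queries the cuDNN library loaded at run time (which may be
// newer than the headers the build saw), so it cannot be constexpr.
long runtimeCuDNNVersionExample() {
  return static_cast<long>(cudnnGetVersion());
}
```

With the constexpr form, call sites could write `if constexpr (compileTimeCuDNNVersionExample() >= 90000) { ... }`, which is the simplification the comments above are pointing at; the runtime query is what the new versionRuntimeCuDNN() hook exposes.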
@Skylion007 are we building with C++20 only? not sure if
@pytorchmergebot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Ah, wasn't aware of that limitation. Not yet, no. :(
Merge failed. Reason: 1 job has failed; the first few are: trunk / linux-jammy-cuda12.8-py3.10-gcc11 / test (default, 5, 5, lf.linux.g6.4xlarge.experimental.nvidia.gpu). Details for Dev Infra team: raised by workflow job.
@pytorchmergebot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchbot cherry-pick --onto release/2.9 --fixes "cuDNN conv3d performance workaround" -c regression
…#167111) cuDNN dispatching heuristics rely on version checks, but currently only the compile-time version is exposed. If we want to allow users to resolve #166643 on their end by updating their cuDNN version locally, we need to check the runtime version rather than the compile-time version. Pull Request resolved: #167111. Approved by: https://github.com/Skylion007 (cherry picked from commit e678450)
Cherry picking #167111: The cherry-pick PR is at #167327 and it is linked with issue "cuDNN conv3d performance workaround". The following tracker issues are updated. Details for Dev Infra team: raised by workflow job.
…#167327) [cuDNN][SDPA][Convolution] Expose cuDNN runtime version in CUDA hooks (#167111). cuDNN dispatching heuristics rely on version checks, but currently only the compile-time version is exposed. If we want to allow users to resolve #166643 on their end by updating their cuDNN version locally, we need to check the runtime version rather than the compile-time version. Pull Request resolved: #167111. Approved by: https://github.com/Skylion007 (cherry picked from commit e678450) Co-authored-by: Eddie Yan <[email protected]>
…pytorch#167111) cuDNN dispatching heuristics rely on version checks, but currently only the compile-time version is exposed. If we want to allow users to resolve pytorch#166643 on their end by updating their cuDNN version locally, we need to check the runtime version rather than the compile-time version. Pull Request resolved: pytorch#167111. Approved by: https://github.com/Skylion007
cuDNN dispatching heuristics rely on version checks, but currently only the compile-time version is exposed. If we want to allow users to resolve #166643 on their end by updating their cuDNN version locally, we need to check the runtime version rather than the compile-time version.
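As a rough illustration of the difference (the function name and the version cutoff below are assumed for the example, not taken from the convolution code), a heuristic gated on the runtime version keeps working when a user drops in a newer cuDNN without rebuilding PyTorch:

```cpp
#include <ATen/detail/CUDAHooksInterface.h>

// Hypothetical conv3d workaround gate: with the compile-time CUDNN_VERSION
// macro the decision is frozen at build time, but the runtime version reflects
// the cuDNN library the user actually has installed.
bool needs_conv3d_workaround_example() {
  const long runtime_version = at::detail::getCUDAHooks().versionRuntimeCuDNN();
  return runtime_version < 91100;  // placeholder "fixed in this version" cutoff
}
```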
cc @csarofeen @ptrblck @xwang233 @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168 @aditew01