Conversation

@eqy
Collaborator

@eqy eqy commented Oct 29, 2025

@eqy eqy added labels Oct 29, 2025:
- module: cudnn (Related to torch.backends.cudnn, and CuDNN support)
- module: cuda (Related to torch.cuda, and CUDA support in general)
- module: convolution (Problems related to convolutions (THNN, THCUNN, CuDNN))
- open source
- topic: not user facing (topic category)
@pytorch-bot

pytorch-bot bot commented Oct 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166480

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit ab4e945 with merge base afaaaa3 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the module: cpu CPU specific problem (e.g., perf, algorithm) label Oct 29, 2025
Skylion007 previously approved these changes Oct 29, 2025
@Skylion007
Collaborator

Skylion007 commented Oct 29, 2025

Shouldn't there be a CUDNN_FRONTEND version guard though instead of deleting the code?

@malfet
Contributor

malfet commented Oct 29, 2025

@eqy, can you add a test to make sure we don't regress again?

@malfet malfet dismissed Skylion007’s stale review October 29, 2025 18:20

I think the feedback about a cuDNN frontend version guard is valid, as is the request for a unit test

@eqy
Collaborator Author

eqy commented Oct 29, 2025

Test for this case was already checked in here: e2817ac#diff-31c5c90e1292af7427be151ba6c4aca280793122d3a2010698aeb18a4f69a508
Will just update the runtime version guard for now

@malfet
Contributor

malfet commented Oct 29, 2025

@eqy IMO, instead of deleting the check completely, you need to change the cudnn_version here: with dynamic linking, a dev can choose to install PyTorch with an older (or newer) cuDNN.
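
The idea of a runtime version guard can be illustrated with a minimal Python sketch. This is not the PR's actual change, and the names `HYPOTHETICAL_MIN_FIXED`, `decode_cudnn_version`, and `fast_path_allowed` are assumptions made for illustration. The thread does not state which cuDNN release contains the fixed kernels, so the threshold below is only a placeholder. In PyTorch, the runtime version integer would come from `torch.backends.cudnn.version()`.

```python
from typing import Optional, Tuple

# HYPOTHETICAL placeholder: the thread does not say which cuDNN release
# fixes the affected kernels. Replace with the real threshold.
HYPOTHETICAL_MIN_FIXED = (9, 99, 0)


def decode_cudnn_version(version: int) -> Tuple[int, int, int]:
    """Split cuDNN's integer version into (major, minor, patch).

    cuDNN 9+ encodes major*10000 + minor*100 + patch (91002 is 9.10.2);
    earlier releases encode major*1000 + minor*100 + patch (8902 is 8.9.2).
    """
    if version >= 90000:
        return version // 10000, (version % 10000) // 100, version % 100
    return version // 1000, (version % 1000) // 100, version % 100


def fast_path_allowed(
    runtime_version: Optional[int],
    min_fixed: Tuple[int, int, int] = HYPOTHETICAL_MIN_FIXED,
) -> bool:
    """Return True only when the cuDNN loaded at runtime is new enough."""
    if runtime_version is None:  # cuDNN is not available at all
        return False
    return decode_cudnn_version(runtime_version) >= min_fixed
```

Because the check uses the version loaded at runtime rather than the one PyTorch was built against, a dynamically linked install picks up the fast path as soon as a sufficiently new cuDNN is present, and stays on the safe path otherwise.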

@eqy eqy added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 30, 2025
@eqy
Collaborator Author

eqy commented Oct 30, 2025

@pytorchmergebot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


@Lucaskabela
Contributor

@pytorchbot cherry-pick --onto release/2.9 --fixes "4x performance regressions for 3d convs with AMP" -c regression

@pytorchbot
Collaborator

Cherry picking #166480

The cherry pick PR is at #166908 and it is linked with the issue "4x performance regressions for 3d convs with AMP". The following tracker issues are updated:


@eqy
Collaborator Author

eqy commented Nov 4, 2025

@Lucaskabela we cannot cherry-pick this without a cuDNN version bump; otherwise we will dispatch to broken kernels.
This issue should also show up in the existing CI tests.

@Lucaskabela
Contributor

Okay, I will close this for now. Once we have that version bump, please submit the cherry-pick :)

@eqy
Collaborator Author

eqy commented Nov 4, 2025

@Lucaskabela we will discuss this in the core team sync meeting tomorrow. I don't think we can bump a cuDNN backend version in a patch release.

@eqy
Collaborator Author

eqy commented Nov 4, 2025

@Lucaskabela OK, after discussion in the meeting I think you can proceed with the cherry-pick, as the re-enablement is guarded by the cuDNN runtime version.
We won't upgrade the cuDNN runtime version in the release matrix, but we will recommend that users who hit the performance regression upgrade their local cuDNN package as a workaround.
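
For users following this workaround, a quick way to see which cuDNN version PyTorch actually loaded is sketched below. The helper name `cudnn_runtime_version` is hypothetical; `torch.backends.cudnn.version()` is the existing PyTorch call it wraps. The thread does not state which cuDNN release is required, so the sketch only reports the version and does not compare it against a threshold.

```python
def cudnn_runtime_version():
    """Return the cuDNN version PyTorch loaded at runtime, or None."""
    try:
        import torch
    except ImportError:
        return None  # PyTorch is not installed in this environment
    # An int such as 91002 (cuDNN 9.10.2), or None if cuDNN is unavailable.
    return torch.backends.cudnn.version()


if __name__ == "__main__":
    print("cuDNN runtime version:", cudnn_runtime_version())
```

Running this before and after upgrading the local cuDNN package should confirm whether the newer library is the one being picked up.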

Labels

- ciflow/trunk (Trigger trunk jobs on your pull request)
- Merged
- module: convolution (Problems related to convolutions (THNN, THCUNN, CuDNN))
- module: cpu (CPU specific problem (e.g., perf, algorithm))
- module: cuda (Related to torch.cuda, and CUDA support in general)
- module: cudnn (Related to torch.backends.cudnn, and CuDNN support)
- open source
- topic: not user facing (topic category)

Development

Successfully merging this pull request may close these issues.

4x performance regression for 3D convs with AMP on torch 2.9.0

7 participants