Conversation

@apakbin
Contributor

@apakbin apakbin commented May 28, 2025

Get the current device from Torch rather than from HIP when creating the MIOpen handle. The device may already have been set on the Torch side; otherwise, the device for the handle is set to 0. This PR also includes additional audits of the cuDNN vs. MIOpen Handle.cpp files.
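
To illustrate the idea, here is a minimal toy model of the device choice described above. It is only a sketch: every name in it (`ToyRuntime`, `ToyFramework`, `ToyHandle`, and both `create_handle_*` functions) is hypothetical, and it does not call PyTorch, HIP, or MIOpen. It assumes, for illustration only, that the framework can hold a device selection that the lower-level runtime does not yet reflect.

```cpp
#include <cassert>

// Hypothetical stand-ins: none of these are real PyTorch, HIP, or MIOpen APIs.

// Device the low-level runtime believes is current (defaults to 0).
struct ToyRuntime {
  int current_device = 0;
};

// Device the framework has selected; -1 means "nothing selected yet".
// Assumption: this can differ from the runtime's view.
struct ToyFramework {
  int selected_device = -1;
};

// A handle that remembers which device it was created for.
struct ToyHandle {
  int device;
};

// Previous approach in this toy model: ask the runtime for the device.
ToyHandle create_handle_from_runtime(const ToyRuntime& rt) {
  return ToyHandle{rt.current_device};
}

// Approach described in this PR, as modeled here: ask the framework,
// and fall back to device 0 when the framework has not selected one.
ToyHandle create_handle_from_framework(const ToyFramework& fw) {
  const int device = fw.selected_device >= 0 ? fw.selected_device : 0;
  assert(device >= 0);  // device indices are never negative in this model
  return ToyHandle{device};
}
```

In this model, if the framework has selected device 2 while the runtime still reports 0, `create_handle_from_runtime` yields a handle for device 0 and `create_handle_from_framework` yields one for device 2. With no selection at all, the framework path falls back to device 0.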

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd

@pytorch-bot

pytorch-bot bot commented May 28, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/154549

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 8411a70 with merge base d4ab8e7:

NEW FAILURE - The following job has failed:

  • linux-binary-libtorch-release / libtorch-cpu-shared-with-deps-release-build / build (gh)
    E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/z/zip/zip_3.0-11build1_amd64.deb
       Could not connect to archive.ubuntu.com:80 (185.125.190.81). - connect (111: Connection refused)
       Could not connect to archive.ubuntu.com:80 (185.125.190.82). - connect (111: Connection refused)
       Could not connect to archive.ubuntu.com:80 (185.125.190.39). - connect (111: Connection refused)
       Could not connect to archive.ubuntu.com:80 (185.125.190.83). - connect (111: Connection refused)
       Could not connect to archive.ubuntu.com:80 (185.125.190.36). - connect (111: Connection refused)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the module: rocm AMD GPU support for Pytorch label May 28, 2025
@apakbin apakbin marked this pull request as ready for review May 28, 2025 18:58
@jeffdaily jeffdaily force-pushed the miopen_handle_creation_fix branch from 8b2fcae to a0821fa Compare May 28, 2025 20:33
@jeffdaily jeffdaily added release notes: rocm mandatorylabel ciflow/rocm Trigger "default" config CI on ROCm labels May 28, 2025
@bdhirsh bdhirsh added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label May 28, 2025
@pytorch-bot pytorch-bot bot removed the ciflow/rocm Trigger "default" config CI on ROCm label May 28, 2025
@jeffdaily jeffdaily added the ciflow/rocm Trigger "default" config CI on ROCm label May 28, 2025
@jeffdaily
Collaborator

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label May 29, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: linux-binary-libtorch-release / libtorch-cpu-shared-with-deps-release-build / build

Details for Dev Infra team Raised by workflow job

@pruthvistony
Collaborator

The failure in the job linux-binary-libtorch-release / libtorch-cpu-shared-with-deps-release-build / build is NOT related to this change.

@pruthvistony
Collaborator

@pytorchbot merge -f "unrelated failures"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

apakbin added a commit to ROCm/pytorch that referenced this pull request Jun 2, 2025
…e creation (pytorch#154549)

Get current device from Torch rather than HIP in MIOpen handle creation. The device may have already been set from torch side, otherwise device is set to 0 for handle.  Additional audits of cudnn vs miopen Handle.cpp file.

Pull Request resolved: pytorch#154549
Approved by: https://github.com/jeffdaily, https://github.com/cyyever

Co-authored-by: Jeff Daily <[email protected]>
iupaikov-amd pushed a commit to ROCm/pytorch that referenced this pull request Jun 4, 2025
…e creation (pytorch#154549)

Get current device from Torch rather than HIP in MIOpen handle creation. The device may have already been set from torch side, otherwise device is set to 0 for handle.  Additional audits of cudnn vs miopen Handle.cpp file.

Pull Request resolved: pytorch#154549
Approved by: https://github.com/jeffdaily, https://github.com/cyyever

Co-authored-by: Jeff Daily <[email protected]>
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Jun 6, 2025
…n handle creation pytorch#154549 (#2216)

(This is a cherry-pick of
pytorch#154549)

Get current device from Torch rather than HIP in MIOpen handle creation.
The device may have already been set from torch side, otherwise device
is set to 0 for handle. Additional audits of cudnn vs miopen Handle.cpp
file.

Pull Request resolved: pytorch#154549
Approved by: https://github.com/jeffdaily, https://github.com/cyyever

Co-authored-by: Jeff Daily <[email protected]>
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Jun 6, 2025
…n handle creation (pytorch#154549) (#2221)

(This is a cherry-pick of
pytorch#154549)

Get current device from Torch rather than HIP in MIOpen handle creation.
The device may have already been set from torch side, otherwise device
is set to 0 for handle. Additional audits of cudnn vs miopen Handle.cpp
file.

Pull Request resolved: pytorch#154549
Approved by: https://github.com/jeffdaily, https://github.com/cyyever

Co-authored-by: Jeff Daily <[email protected]>
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Jun 6, 2025
…er than HIP in handle creation (pytorch#154549) (#2222)

(This is a cherry-pick of
pytorch#154549)

Get current device from Torch rather than HIP in MIOpen handle creation.
The device may have already been set from torch side, otherwise device
is set to 0 for handle. Additional audits of cudnn vs miopen Handle.cpp
file.

Pull Request resolved: pytorch#154549
Approved by: https://github.com/jeffdaily, https://github.com/cyyever

Co-authored-by: Jeff Daily <[email protected]>
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Jun 6, 2025
…er than HIP in handle creation (pytorch#154549) (#2223)

(This is a cherry-pick of
pytorch#154549)

Get current device from Torch rather than HIP in MIOpen handle creation.
The device may have already been set from torch side, otherwise device
is set to 0 for handle. Additional audits of cudnn vs miopen Handle.cpp
file.

Pull Request resolved: pytorch#154549
Approved by: https://github.com/jeffdaily, https://github.com/cyyever

Co-authored-by: Jeff Daily <[email protected]>
okakarpa pushed a commit to ROCm/pytorch that referenced this pull request Jun 7, 2025
…er than HIP in handle creation (pytorch#154549) (#2223)

(This is a cherry-pick of
pytorch#154549)

Get current device from Torch rather than HIP in MIOpen handle creation.
The device may have already been set from torch side, otherwise device
is set to 0 for handle. Additional audits of cudnn vs miopen Handle.cpp
file.

Pull Request resolved: pytorch#154549
Approved by: https://github.com/jeffdaily, https://github.com/cyyever

Co-authored-by: Jeff Daily <[email protected]>
jithunnair-amd pushed a commit to ROCm/pytorch that referenced this pull request Jun 7, 2025
…er than HIP in handle creation (pytorch#154549) (#2223)

(This is a cherry-pick of
pytorch#154549)

Get current device from Torch rather than HIP in MIOpen handle creation.
The device may have already been set from torch side, otherwise device
is set to 0 for handle. Additional audits of cudnn vs miopen Handle.cpp
file.

Pull Request resolved: pytorch#154549
Approved by: https://github.com/jeffdaily, https://github.com/cyyever

Co-authored-by: Jeff Daily <[email protected]>
Labels

ciflow/rocm (Trigger "default" config CI on ROCm)
ciflow/trunk (Trigger trunk jobs on your pull request)
Merged
module: rocm (AMD GPU support for Pytorch)
open source
release notes: rocm (mandatorylabel)
triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)

7 participants