[ROCm] Select gpu targets according to PYTORCH_ROCM_ARCH when building AOTriton from source #139432

xinyazhang · 2024-10-31T21:59:29Z

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd

linux-foundation-easycla · 2024-10-31T21:59:33Z

✅login: vickytsang / (dfc74cd)
✅login: vickytsang / (dfc74cd, 7e1f559)

The committers listed above are authorized under a signed CLA.

pytorch-bot · 2024-10-31T21:59:33Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/139432


❌ 1 New Failure

As of commit 7e1f559 with merge base 8c22e09:

NEW FAILURE - The following job has failed:

pull / cuda12.1-py3.10-gcc9-sm75 / test (pr_time_benchmarks, 1, 1, linux.g4dn.metal.nvidia.gpu) (gh)
REGRESSION: benchmark ('basic_modules_ListOfLinears_eager', 'compile_time_instruction_count') failed, actual result 1053740528 is 2.50% higher than expected 1028000000 ±+1.50% if this is an expected regression, please update the expected results.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

xinyazhang · 2024-10-31T22:01:21Z

@pytorchbot label "topic: not user facing"

xinyazhang · 2024-11-08T21:11:05Z

@pytorchbot label "rocm"

jithunnair-amd

LGTM

jithunnair-amd · 2024-11-25T17:32:05Z

@pytorchbot merge -f "Unrelated CI failures"

pytorchmergebot · 2024-11-25T17:33:39Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.


…g AOTriton from source (pytorch#139432)

Pull Request resolved: pytorch#139432
Approved by: https://github.com/jithunnair-amd, https://github.com/jeffdaily
Co-authored-by: Vicky Tsang <[email protected]>

This is backported from upstream PR pytorch#140172, pytorch#137443 and pytorch#139432.

Original commit message of pytorch#140172:

Notable new features for SDPA operators on AMD systems from AOTriton 0.8b:

1. Nestedtensor support;
2. MQA/GQA support;
3. Restore Efficient attention support for causal=True and seqlen_q != seqlen_k cases;
    + The kernel should use top-left alignment, bottom right alignment will be added later
4. Move gfx1100 (RX7900/W7800/W7900) out of experimental support status. However, users are strongly recommended to update to ROCM 6.2.4, notably for its firmware updates.

Related unit tests are enabled as well.

Notable related changes from AOTriton 0.8b:

1. AOTriton 0.8b moves the GPU kernel out of libaotriton.so to a separate directory `aotriton.images`;
2. LZMA replaces ZSTD as GPU kernel compression algorithm for better compression ratio: aotriton0.8b (.so + aotriton.images take 350MB) compared to aotriton0.7b .so: 800MB
3. The compression cannot be disabled now, and `liblzma` is hard run-time dependency.
    + Should not be a problem, since `lzma` is part of Python Standard Library

Pull Request resolved: pytorch#140172
Approved by: https://github.com/jithunnair-amd, https://github.com/jeffdaily
Co-authored-by: Jithun Nair <[email protected]>
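The commit message above distinguishes top-left from bottom-right alignment of the causal mask when seqlen_q != seqlen_k. The following is a minimal pure-Python sketch of that difference, not AOTriton's kernel code; the function name `causal_mask` and its `alignment` parameter are hypothetical and exist only for this illustration.

```python
def causal_mask(seqlen_q, seqlen_k, alignment="top-left"):
    """Build a seqlen_q x seqlen_k boolean mask (True = query may attend to key).

    Hypothetical helper for illustration; it is not part of PyTorch or AOTriton.
    """
    if alignment == "top-left":
        # The diagonal starts at the top-left corner: query i sees keys 0..i.
        offset = 0
    elif alignment == "bottom-right":
        # The diagonal ends at the bottom-right corner: the last query sees every key.
        offset = seqlen_k - seqlen_q
    else:
        raise ValueError(f"unknown alignment: {alignment!r}")
    return [[k <= q + offset for k in range(seqlen_k)] for q in range(seqlen_q)]


if __name__ == "__main__":
    for name in ("top-left", "bottom-right"):
        print(name)
        for row in causal_mask(2, 4, name):
            print("".join("x" if allowed else "." for allowed in row))
```

When seqlen_q equals seqlen_k the offset is zero and both alignments produce the same mask, which is why the distinction only matters for the unequal-length case named in item 3.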



vickytsang added 2 commits October 31, 2024 21:55

aotriton build should honor PYTORCH_ROCM_ARCH flag

dfc74cd

remove Navi support. Use PYTORCH_ROCM_ARCH def from LoadHIP.cmake

7e1f559
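The two commits above make the AOTriton source build follow PYTORCH_ROCM_ARCH rather than a fixed target list. The actual change lives in PyTorch's CMake files and is not reproduced on this page, so the following is only a rough Python sketch of the idea. It assumes PYTORCH_ROCM_ARCH is a semicolon-separated list of gfx architecture names; the `ARCH_TO_TARGET` table and the `select_targets` function are hypothetical names introduced here, and the mapping values are illustrative rather than taken from the PR.

```python
import os

# Hypothetical mapping from gfx architecture to an AOTriton target name.
# Illustrative only; the real table in the PyTorch build may differ.
ARCH_TO_TARGET = {
    "gfx90a": "MI200",
    "gfx942": "MI300X",
}


def select_targets(rocm_arch):
    """Return the supported targets named in a PYTORCH_ROCM_ARCH-style string.

    Architectures without an entry in ARCH_TO_TARGET are skipped, and the
    order of first appearance is preserved without duplicates.
    """
    targets = []
    for entry in rocm_arch.split(";"):
        # Drop optional feature suffixes such as ":xnack-" before the lookup.
        base = entry.strip().split(":")[0]
        target = ARCH_TO_TARGET.get(base)
        if target is not None and target not in targets:
            targets.append(target)
    return targets


if __name__ == "__main__":
    requested = os.environ.get("PYTORCH_ROCM_ARCH", "gfx90a;gfx942")
    print(";".join(select_targets(requested)))
```

Skipping unknown architectures instead of raising is a design choice of this sketch: it lets one environment variable list architectures that other parts of a build use even when this component has no kernels for them.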

pytorch-bot bot added the module: rocm AMD GPU support for Pytorch label Oct 31, 2024

xinyazhang mentioned this pull request Oct 31, 2024

AOTriton: select gpu targets according to PYTORCH_ROCM_ARCH ROCm/pytorch#1643

Merged

pytorch-bot bot added the topic: not user facing topic category label Oct 31, 2024

pytorchbot added the open source label Oct 31, 2024

pytorch-bot bot added the rocm This tag is for PRs from ROCm team label Nov 8, 2024

jithunnair-amd approved these changes Nov 22, 2024


jithunnair-amd marked this pull request as ready for review November 22, 2024 21:35

jithunnair-amd requested a review from jeffdaily November 22, 2024 21:36

jeffdaily approved these changes Nov 25, 2024


pytorchmergebot added the merging label Nov 25, 2024

pytorchmergebot added the Merged label Nov 25, 2024

pytorchmergebot closed this in 5ececd4 Nov 25, 2024

pytorchmergebot removed the merging label Nov 25, 2024

naromero77amd mentioned this pull request Dec 13, 2024

ROCm 6.3 fails to build #142248

Closed

xinyazhang mentioned this pull request Jan 13, 2025

[release/2.5] Support head dimension 512 with AOTriton 0.8.1b ROCm/pytorch#1832

Merged
