[AOTI][Intel GPU] Support multi_arch_kernel_binary option for XPU. #154514
Conversation
[ghstack-poisoned]
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/154514. Note: links to docs will display an error until the docs builds have completed.
❗ 1 active SEV. If your PR is affected, please view it below.
✅ As of commit ef39ce7 with merge base 6cb6da6, you can merge normally (2 unrelated failures):
FLAKY - the following job failed but was likely due to flakiness present on trunk.
UNSTABLE - the following job is marked as unstable, possibly due to flakiness on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi, @desertfire @eellison Would you mind taking a look at this PR when you have time? We need it to address the XPU CI failures. Thank you very much!
desertfire left a comment:
I also did some config renaming in #154608. You will need to rebase after that.
  binary = launcher.bin.asm[bin_type]
  # Also store asm code which can be used for debugging and generating cpp package
- asm_type = {"hip": "amdgcn", "cuda": "ptx"}.get(self.device_props.type, None)
+ asm_type = {"hip": "amdgcn", "cuda": "ptx", "xpu": "spv"}.get(
Since we already generate spv for xpu, shouldn't the added new option be a no-op for xpu?
Hi @desertfire, yes, in theory XPU doesn't need this option. I added it to keep the user interface consistent across devices, and also to fix errors that occur on XPU when this option is specified in unit tests. For example: https://hud.pytorch.org/pr/pytorch/pytorch/147693#43085076369
RuntimeError: Failed to run autotuning code block: Missing kernel assembly code
It may also help avoid adding multiple XPU-specific if-else checks in the code.
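To make the point concrete, here is a minimal sketch of the lookup pattern being discussed (the names below are simplified stand-ins for illustration, not the actual torch/_inductor caching internals):

```python
from typing import Optional

# Sketch only: maps a device type to the assembly format stored next to the
# compiled binary. Before this PR, "xpu" was missing from the mapping.
ASM_TYPE_BY_DEVICE = {
    "hip": "amdgcn",   # AMD GPU assembly
    "cuda": "ptx",     # NVIDIA PTX
    "xpu": "spv",      # SPIR-V for Intel GPU (the addition in this PR)
}

def stored_asm_type(device_type: str) -> Optional[str]:
    # With no entry for "xpu", no assembly was stored, and a later step that
    # needs it for the AOTI/cpp package raised
    # "Missing kernel assembly code".
    return ASM_TYPE_BY_DEVICE.get(device_type)

assert stored_asm_type("xpu") == "spv"
```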
- if hash_type in {"amdgcn", "code", "ptx"}:
+ if hash_type in {"amdgcn", "code", "ptx", "spv"}:
      return code_hash(content, extra)
  if hash_type in {"cubin", "hsaco", "spv"}:
It probably doesn't matter in terms of functionality, but "spv" will not hit this branch anymore.
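A hedged sketch of the dispatch in question (helper names are stand-ins, not the real inductor functions), showing why "spv" now always takes the text-hash path:

```python
import hashlib

def code_hash(content: str, extra: str = "") -> str:
    # Stand-in for the text-based hash used for source/assembly artifacts.
    return hashlib.sha256((content + extra).encode()).hexdigest()

def binary_hash(content: bytes) -> str:
    # Stand-in for the hash used for true binary artifacts (cubin/hsaco).
    return hashlib.sha256(content).hexdigest()

def artifact_hash(hash_type: str, content):
    # After this PR, "spv" matches the first (text) branch, so the "spv"
    # entry in the second branch is effectively dead, as noted above.
    if hash_type in {"amdgcn", "code", "ptx", "spv"}:
        return code_hash(content)
    if hash_type in {"cubin", "hsaco", "spv"}:
        return binary_hash(content)
    raise ValueError(f"unknown hash_type: {hash_type}")
```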
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).
Pull Request resolved: pytorch#154514. Approved by: https://github.com/desertfire, https://github.com/EikanWang
Stack from ghstack (oldest at bottom):
Following the design of #154413, this PR adds XPU support for generating kernel binary files that support multiple archs.
Fixes #154682, Fixes #154683, Fixes #154689, Fixes #154685, Fixes #154690, Fixes #154681
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov
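For context, a hedged sketch of how the option might be exercised when AOTI-compiling for XPU. The config key name below is an assumption (the option was being renamed in #154608); the module is a toy example, and only torch.export and aoti_compile_and_package are public APIs here.

```python
import torch
from torch._inductor import aoti_compile_and_package

class Add(torch.nn.Module):
    def forward(self, x, y):
        return x + y

device = "xpu" if torch.xpu.is_available() else "cpu"
example_inputs = (torch.randn(8, device=device), torch.randn(8, device=device))
ep = torch.export.export(Add().to(device), example_inputs)

package_path = aoti_compile_and_package(
    ep,
    package_path="add_xpu.pt2",
    inductor_configs={
        # Assumed config key, not confirmed by this PR (see #154608 for the
        # renaming). On XPU the kernels are already emitted as SPIR-V, so per
        # the review thread this is expected to be close to a no-op.
        "aot_inductor.emit_multi_arch_kernel": True,
    },
)
```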