[Inductor XPU GEMM] Step 7/N: Refactor CUDABenchmarkRequest by etaf · Pull Request #160729 · pytorch/pytorch

etaf · 2025-08-15T08:52:22Z

Stack from ghstack (oldest at bottom):

This PR is part of #160175. It refactors the CUDA-specific code in CUDABenchmarkRequest, renaming it to CUTLASSBenchmarkRequest so that it can be reused for XPU.
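As a rough illustration of the idea described above (not the code in this PR), a benchmark request can be made reusable across devices by moving device-specific calls behind a small lookup keyed by device type. Everything named below is hypothetical: `DeviceHooks`, `DEVICE_HOOKS`, and `DeviceAgnosticBenchmarkRequest` are stand-ins, and the synchronize hooks are no-ops rather than real CUDA or XPU runtime calls.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class DeviceHooks:
    """Hypothetical per-device hooks; a real backend would wrap its runtime."""
    synchronize: Callable[[], None]


# Illustrative registry. Both entries are no-op stand-ins (an assumption),
# not actual CUDA or XPU synchronization calls.
DEVICE_HOOKS: Dict[str, DeviceHooks] = {
    "cuda": DeviceHooks(synchronize=lambda: None),
    "xpu": DeviceHooks(synchronize=lambda: None),
}


class DeviceAgnosticBenchmarkRequest:
    """Toy stand-in for a request whose device specifics come from DEVICE_HOOKS."""

    def __init__(self, kernel_name: str, device_type: str) -> None:
        if device_type not in DEVICE_HOOKS:
            raise ValueError(f"unsupported device type: {device_type}")
        self.kernel_name = kernel_name
        self.device_type = device_type
        self.hooks = DEVICE_HOOKS[device_type]

    def benchmark(self, fn: Callable[[], None], repeats: int = 3) -> float:
        """Return the best wall-clock time in milliseconds over `repeats` runs."""
        best = float("inf")
        for _ in range(repeats):
            self.hooks.synchronize()  # drain pending device work before timing
            start = time.perf_counter()
            fn()
            self.hooks.synchronize()  # wait for the timed work to finish
            best = min(best, (time.perf_counter() - start) * 1e3)
        return best
```

With this shape, the same class serves both devices, for example `DeviceAgnosticBenchmarkRequest("gemm", "xpu").benchmark(run_kernel)`, where `run_kernel` is any zero-argument callable supplied by the caller.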

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo @chenyang78

[ghstack-poisoned]

pytorch-bot · 2025-08-15T08:52:26Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/160729

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (5 Unrelated Failures)

As of commit 67fc2df with merge base 98a4d7b:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

inductor / unit-test / inductor-test / test (inductor, 1, 2, linux.g5.4xlarge.nvidia.gpu) (gh) (similar failure)
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pca_lowrank_cuda_float32
trunk / linux-jammy-rocm-py3.10 / test (distributed, 3, 3, linux.rocm.gpu.gfx950.4) (gh) (disabled by #171119 but the issue was closed recently and a rebase is needed to make it pass)
test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_index

UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:

inductor / inductor-cpu-test / test (cpu_inductor_torchbench, 2, 2, linux.2xlarge.amx, unstable) (gh) (#174929)
pytorch_CycleGAN_and_pix2pix
inductor / inductor-test / test (inductor_torchbench, 2, 2, linux.g5.4xlarge.nvidia.gpu, unstable) (gh) (#174919)
sam
inductor / inductor-test-cuda13 / test (inductor_torchbench, 2, 2, linux.g5.4xlarge.nvidia.gpu) (gh) (#174930)
pytorch_CycleGAN_and_pix2pix

This comment was automatically generated by Dr. CI and updates every 15 minutes.

etaf · 2026-02-13T01:13:24Z

@pytorchbot drci

etaf · 2026-02-13T01:15:33Z

@pytorchbot merge

pytorchmergebot · 2026-02-13T01:17:58Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot · 2026-02-13T01:18:29Z

Merge failed

Reason: 19 jobs have failed, first few of them are: inductor / inductor-test-cuda13 / test (inductor_timm, 1, 2, linux.g5.4xlarge.nvidia.gpu), inductor / inductor-test / test (inductor_timm, 1, 2, linux.g5.4xlarge.nvidia.gpu), trunk / linux-jammy-cuda12.8-py3.10-gcc11 / test (default, 5, 5, linux.g6.4xlarge.experimental.nvidia.gpu), trunk / linux-jammy-cuda12.8-py3.10-gcc11 / test (default, 1, 5, linux.g6.4xlarge.experimental.nvidia.gpu), trunk / linux-jammy-cuda13.0-py3.10-gcc11 / test (default, 4, 5, linux.g6.4xlarge.experimental.nvidia.gpu)

Details for Dev Infra team

Raised by workflow job

etaf · 2026-02-13T01:20:24Z

@pytorchbot merge

pytorchmergebot · 2026-02-13T01:22:36Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot · 2026-02-13T01:23:14Z

Merge failed

Reason: 19 jobs have failed, first few of them are: inductor / inductor-test-cuda13 / test (inductor_timm, 1, 2, linux.g5.4xlarge.nvidia.gpu), inductor / inductor-test / test (inductor_timm, 1, 2, linux.g5.4xlarge.nvidia.gpu), trunk / linux-jammy-cuda12.8-py3.10-gcc11 / test (default, 5, 5, linux.g6.4xlarge.experimental.nvidia.gpu), trunk / linux-jammy-cuda12.8-py3.10-gcc11 / test (default, 1, 5, linux.g6.4xlarge.experimental.nvidia.gpu), trunk / linux-jammy-cuda13.0-py3.10-gcc11 / test (default, 4, 5, linux.g6.4xlarge.experimental.nvidia.gpu)

Details for Dev Infra team

Raised by workflow job

etaf · 2026-02-13T06:03:30Z

@pytorchbot drci

etaf · 2026-02-14T00:58:39Z

@pytorchbot drci

etaf · 2026-02-14T04:47:52Z

@pytorchbot merge

pytorchmergebot · 2026-02-14T04:50:06Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

[Inductor XPU GEMM] Step 8/N: Refactor CUDABenchmarkRequest

53e25e5

[ghstack-poisoned]

etaf mentioned this pull request Aug 15, 2025

[Inductor XPU GEMM] Step 4/N: Refactor CUDAKernel to CUTLASSKernel. #160687

Closed

pytorch-bot bot added ciflow/inductor module: inductor labels Aug 15, 2025

This was referenced Aug 15, 2025

[Inductor XPU GEMM] Step 5/N: Refactor CUDACombinedScheduling and CUDACppScheduling. #160688

Closed

[Inductor XPU GEMM] Step 6/N: Refactor CUDACodeCache. #160706

Closed

etaf added a commit that referenced this pull request Aug 15, 2025

[Inductor XPU GEMM] Step 8/N: Refactor CUDABenchmarkRequest

fa16270

ghstack-source-id: 3fb3be7 Pull Request resolved: #160729

pytorchbot added the open source label Aug 15, 2025

Update on "[Inductor XPU GEMM] Step 8/N: Refactor CUDABenchmarkRequest"

2cae11a

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 amjames chauhang aakhundov coconutruben [ghstack-poisoned]

etaf added a commit that referenced this pull request Aug 16, 2025

[Inductor XPU GEMM] Step 8/N: Refactor CUDABenchmarkRequest

f8e6b0f

ghstack-source-id: 3a9c271 Pull Request resolved: #160729

Update on "[Inductor XPU GEMM] Step 8/N: Refactor CUDABenchmarkRequest"

f0d3f7d

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 amjames chauhang aakhundov coconutruben [ghstack-poisoned]

Update on "[Inductor XPU GEMM] Step 8/N: Refactor CUDABenchmarkRequest"

3d86ea1

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 amjames chauhang aakhundov coconutruben [ghstack-poisoned]

etaf changed the title ~~[Inductor XPU GEMM] Step 8/N: Refactor CUDABenchmarkRequest~~ [Inductor XPU GEMM] Step 7/N: Refactor CUDABenchmarkRequest Sep 2, 2025

etaf added the topic: not user facing topic category label Sep 3, 2025

Update on "[Inductor XPU GEMM] Step 7/N: Refactor CUDABenchmarkRequest"

1f8bbd4

This PR is part of #160175. It refactors the CUDA-specific code in CUDABenchmarkRequest, renaming it to CUTLASSBenchmarkRequest so that it can be reused for XPU. [ghstack-poisoned]

etaf mentioned this pull request Sep 1, 2025

[RFC] Enable cutlass to support Intel GPU into PyTorch Inductor. #160175

Open

11 tasks

etaf added 3 commits September 4, 2025 08:53

Update on "[Inductor XPU GEMM] Step 7/N: Refactor CUDABenchmarkRequest"

9a819cc

This PR is part of #160175. It refactors the CUDA-specific code in CUDABenchmarkRequest, renaming it to CUTLASSBenchmarkRequest so that it can be reused for XPU. [ghstack-poisoned]

Update on "[Inductor XPU GEMM] Step 7/N: Refactor CUDABenchmarkRequest"

8878554

This PR is part of #160175. It refactors the CUDA-specific code in CUDABenchmarkRequest, renaming it to CUTLASSBenchmarkRequest so that it can be reused for XPU. [ghstack-poisoned]

Update on "[Inductor XPU GEMM] Step 7/N: Refactor CUDABenchmarkRequest"

6b7e6d7

This PR is part of #160175. It refactors the CUDA-specific code in CUDABenchmarkRequest, renaming it to CUTLASSBenchmarkRequest so that it can be reused for XPU. [ghstack-poisoned]

etaf requested a review from EikanWang September 9, 2025 02:29

etaf marked this pull request as draft September 10, 2025 07:24

etaf added 2 commits October 31, 2025 03:17

Update on "[Inductor XPU GEMM] Step 7/N: Refactor CUDABenchmarkRequest"

0eff316

This PR is part of #160175. It refactors the CUDA-specific code in CUDABenchmarkRequest, renaming it to CUTLASSBenchmarkRequest so that it can be reused for XPU. [ghstack-poisoned]

Update on "[Inductor XPU GEMM] Step 7/N: Refactor CUDABenchmarkRequest"

72f69db

This PR is part of #160175. It refactors the CUDA-specific code in CUDABenchmarkRequest, renaming it to CUTLASSBenchmarkRequest so that it can be reused for XPU. [ghstack-poisoned]

etaf added the ciflow/b200 label Feb 12, 2026

pytorchmergebot added the merging label Feb 13, 2026

pytorchmergebot removed the merging label Feb 13, 2026

pytorchmergebot added the merging label Feb 13, 2026

pytorchmergebot removed the merging label Feb 13, 2026

etaf removed ciflow/h100 ciflow/b200 labels Feb 13, 2026

etaf mentioned this pull request Feb 13, 2026

[xpu][feature] Enable Inductor sycl-tla standalone runner. #174958

Draft

etaf added 2 commits February 13, 2026 17:26

jansel approved these changes Feb 14, 2026

pytorchmergebot added the merging label Feb 14, 2026

pytorchmergebot added the Merged label Feb 14, 2026

pytorchmergebot closed this in 2817bc7 Feb 14, 2026

pytorchmergebot removed the merging label Feb 14, 2026

etaf added a commit to etaf/pytorch-inductor-xpu that referenced this pull request Feb 15, 2026

[Inductor XPU GEMM] Step 7/N: Refactor CUDABenchmarkRequest

4b8efce

ghstack-source-id: 7d6f9f1 Pull Request resolved: pytorch#160729

etaf wants to merge 41 commits into gh/etaf/161/base from gh/etaf/161/head


6 participants
