[CUDA] Add build flag onnxruntime_USE_FPA_INTB_GEMM #25802

tianleiwu · 2025-08-20T22:26:17Z

Description

Add a build flag, onnxruntime_USE_FPA_INTB_GEMM, to enable or disable the mixed GEMM CUTLASS kernels.

To disable the kernels, append the following to the end of the build command line:
--cmake_extra_defines onnxruntime_USE_FPA_INTB_GEMM=OFF
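
As a rough illustration of where the define goes, the sketch below assembles a full command line and prints it. Only the `--cmake_extra_defines onnxruntime_USE_FPA_INTB_GEMM=OFF` part comes from this PR; the `./build.sh` entry point and the other options (`--config Release`, `--use_cuda`, `--parallel`) are assumptions standing in for whatever your existing build command already uses.

```shell
# Sketch only: everything except EXTRA_DEFINES is an assumed, typical CUDA build invocation.
# The define from this PR; OFF skips compiling the FpA IntB GEMM kernels.
EXTRA_DEFINES="--cmake_extra_defines onnxruntime_USE_FPA_INTB_GEMM=OFF"

# Append the define at the end of the command line, as the description suggests.
BUILD_CMD="./build.sh --config Release --use_cuda --parallel $EXTRA_DEFINES"

# Print the command for inspection instead of running it.
echo "$BUILD_CMD"
```

To run the build rather than print it, execute the printed command from the repository root, replacing the assumed options with your own.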

Motivation and Context

The FpA IntB GEMM kernels take a long time to compile. With this option, developers can speed up the build, especially on build machines with limited memory.

hariharans29

Thanks!


snnn · 2025-09-19T19:23:43Z

This PR has been cherry-picked into the rel-1.23.0 branch in PR #26087. Removing the release:1.23.0 label.

Add build flag USE_FPA_INTB_GEMM

67c5989

hariharans29 approved these changes Aug 20, 2025

tianleiwu merged commit d346333 into main Aug 21, 2025
92 checks passed

tianleiwu deleted the tlwu/build_flag_fpa_intb_gemm branch August 21, 2025 19:50

tianleiwu added the release:1.23.0 label Sep 16, 2025

chilo-ms mentioned this pull request Sep 19, 2025

Cherry-pick: Reduce Python and Nuget GPU package size (#26002) #26087

Merged

snnn pushed a commit that referenced this pull request Sep 19, 2025

Cherry-pick: Reduce Python and Nuget GPU package size (#26002) (#26087)

2a034d5

Reduce Python and Nuget GPU package size (#26002) [CUDA] Add build flag onnxruntime_USE_FPA_INTB_GEMM (#25802)

snnn removed the release:1.23.0 label Sep 19, 2025

4 participants