@tianleiwu tianleiwu commented Aug 20, 2025

Description

Add a build flag to enable or disable the mixed GEMM CUTLASS kernel.

To disable the kernel, append the following to the end of the build command line:
--cmake_extra_defines onnxruntime_USE_FPA_INTB_GEMM=OFF
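
As a rough sketch of where the define goes, the snippet below assembles a full build command and prints it without running it. Only `--cmake_extra_defines onnxruntime_USE_FPA_INTB_GEMM=OFF` comes from this PR; the other `build.sh` options (`--config Release`, `--use_cuda`, `--parallel`) are an assumed typical CUDA build and should be replaced with whatever your existing command line already uses.

```shell
# Sketch only: everything except the define below is an assumed typical CUDA build.
# A CUDA build usually also needs the CUDA and cuDNN locations, passed as
# options or through the environment; they are left out here.

# The define introduced by this PR; OFF skips compiling the FpA IntB GEMM kernels.
EXTRA_DEFINES="onnxruntime_USE_FPA_INTB_GEMM=OFF"

# Keep the define at the end of the command line, as the description suggests.
BUILD_CMD="./build.sh --config Release --use_cuda --parallel --cmake_extra_defines $EXTRA_DEFINES"

# Print the command for review instead of starting a long build.
echo "$BUILD_CMD"
```

Running the printed command from the repository root starts the build. Setting the value to `ON`, or omitting the define, presumably leaves the kernel enabled, since the flag is described as an enable/disable switch.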

Motivation and Context

The FpA IntB GEMM kernels take a long time to compile. With this option, developers can speed up the build, especially on build machines with limited memory.

@hariharans29 hariharans29 left a comment

Thanks!

@tianleiwu tianleiwu merged commit d346333 into main Aug 21, 2025
92 checks passed
@tianleiwu tianleiwu deleted the tlwu/build_flag_fpa_intb_gemm branch August 21, 2025 19:50
gedoensmax pushed a commit to gedoensmax/onnxruntime that referenced this pull request Sep 2, 2025
snnn pushed a commit that referenced this pull request Sep 19, 2025
snnn pushed a commit that referenced this pull request Sep 19, 2025
Reduce Python and Nuget GPU package size (#26002)
[CUDA] Add build flag onnxruntime_USE_FPA_INTB_GEMM (#25802)
snnn commented Sep 19, 2025

This PR has been cherry-picked into the rel-1.23.0 branch in PR #26087. Removing the release:1.23.0 label.

4 participants