[CUDA] Fix build for sm<53 #24582

Merged
snnn merged 1 commit into main from tlwu/fix_matmul_8bits_old_gpu
Apr 29, 2025

@tianleiwu (Contributor) commented Apr 28, 2025

Description

The build fails with `--cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=52`.

Some half2 functions, such as `__hfma2`, used in the 8-bit MatMul kernel are not defined for sm < 53. Add an implementation that does not use half2 for those older GPUs.
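
The kind of fallback described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the macro `SKETCH_HAS_NATIVE_HALF2_MATH`, the helpers `FmaHalf2Compat`, `FmaPairFallback`, and `CheckFallbackArithmetic`, and the `FloatPair` type are all hypothetical names. The sketch assumes the usual approach of widening to fp32 when native half2 arithmetic is unavailable. It compiles with nvcc (where the device helper is enabled) or with a plain C++ compiler (where only the host-side mirror of the fallback arithmetic is built).

```cpp
#include <cassert>
#include <cmath>

#if defined(__CUDACC__)
#include <cuda_fp16.h>
#endif

// Hypothetical macro: 1 only while compiling device code for sm_53 or newer.
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 530)
#define SKETCH_HAS_NATIVE_HALF2_MATH 1
#else
#define SKETCH_HAS_NATIVE_HALF2_MATH 0
#endif

#if defined(__CUDACC__)
// Hypothetical helper (not from the PR): a * b + c on a pair of fp16 values.
__device__ __forceinline__ __half2 FmaHalf2Compat(__half2 a, __half2 b, __half2 c) {
#if SKETCH_HAS_NATIVE_HALF2_MATH
  return __hfma2(a, b, c);  // native packed fp16 fused multiply-add
#else
  // Older GPUs: widen each lane to fp32, do the FMA there, round back to fp16.
  const float2 fa = __half22float2(a);
  const float2 fb = __half22float2(b);
  const float2 fc = __half22float2(c);
  return __floats2half2_rn(fmaf(fa.x, fb.x, fc.x), fmaf(fa.y, fb.y, fc.y));
#endif
}
#endif  // __CUDACC__

// Host-side mirror of the fallback arithmetic, so the per-lane math can be
// checked without a GPU. It models only the fp32 step, not fp16 rounding.
struct FloatPair {
  float x;
  float y;
};

inline FloatPair FmaPairFallback(FloatPair a, FloatPair b, FloatPair c) {
  return FloatPair{std::fma(a.x, b.x, c.x), std::fma(a.y, b.y, c.y)};
}

// Small integers are exactly representable, so equality comparison is safe.
inline void CheckFallbackArithmetic() {
  const FloatPair r = FmaPairFallback({1.0f, 2.0f}, {3.0f, 4.0f}, {5.0f, 6.0f});
  assert(r.x == 8.0f && r.y == 14.0f);
}
```

Note that the fp32 path rounds twice (once to fp32, once to fp16), so its results may differ in the last bit from a native `__hfma2`; whether that matters depends on the kernel.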

Fix another build error with CUDA 12.5, caused by an extra `const` in the MoE code for sm < 53.

Motivation and Context

Fix the NuGet packaging pipeline, which builds with `CMAKE_CUDA_ARCHITECTURES=52-real;61-real;75-real;86-real;89-real;90-virtual`.

@snnn merged commit 76cee36 into main Apr 29, 2025
80 of 88 checks passed
@snnn deleted the tlwu/fix_matmul_8bits_old_gpu branch April 29, 2025 02:16
vraspar pushed a commit that referenced this pull request May 1, 2025
jywu-msft pushed a commit that referenced this pull request May 1, 2025
### Description

Cherry pick the following into
[rel-1.22.0](https://github.com/microsoft/onnxruntime/tree/rel-1.22.0)

- (#24491)
- (#24509)
- (#24564)
- (#24574)
- (#24582)
- (#24584)
- (#24568)
- (#24587)
- (#24563)
- (#24592)
- (#24526)
- (#24552)
- (#24588)
- (#24605)
- (#24606)

---------

Co-authored-by: Jing Fang
Co-authored-by: Tianlei Wu
Co-authored-by: Baiju Meswani
Co-authored-by: Scott McKay
Co-authored-by: Mark Schofield
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Edward Chen
Co-authored-by: Ashwath Shankarnarayan
Co-authored-by: saurabh
Co-authored-by: Adrian Lizarraga
Co-authored-by: Hector Li
ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request May 12, 2025
@snnn (Contributor) commented Sep 5, 2025

This PR has been included in the rel-1.22.0 branch. Removing the release:1.22.0 label.
