@fs-eire fs-eire commented Jul 14, 2025

Description

For f16 uniform variables, use u32 to bit-wise represent them.
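As a minimal sketch of what "bit-wise represent" can mean on the host side, the helper below packs two raw f16 bit patterns into one 32-bit word. `PackF16x2` is a hypothetical name, not a function from this PR. It assumes the first f16 goes in the low 16 bits, which matches how WGSL's `bitcast<vec2<f16>>(u32)` maps component 0.

```cpp
#include <cstdint>

// Hypothetical helper (not from the PR): pack two raw IEEE 754 binary16 bit
// patterns into one 32-bit word. `lo` lands in bits 0..15 and `hi` in bits
// 16..31, so a WGSL bitcast<vec2<f16>>(word) would read `lo` as component 0.
constexpr std::uint32_t PackF16x2(std::uint16_t lo, std::uint16_t hi) {
  return static_cast<std::uint32_t>(lo) | (static_cast<std::uint32_t>(hi) << 16);
}
```

For a single f16 uniform, the unused upper half can simply be zero, for example `PackF16x2(bits, 0)`.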

Motivation and Context

Some devices support f16 in shader/storage buffers, but not in uniform buffers. Dawn will set the f16_support to false for them. However, we don't necessarily have to use f16 in uniforms.

This change together with #25349 will enable using f16 models on some Android devices.

@fs-eire fs-eire requested review from Copilot and vraspar July 14, 2025 21:28

@fs-eire fs-eire requested a review from Copilot July 14, 2025 22:09
Copilot AI left a comment

Pull Request Overview

This PR updates how 16-bit floats are represented in uniform buffers by packing them into 32-bit unsigned integers and adjusts buffer layout computation and WGSL codegen accordingly.

  • Compute size and alignment for f16 uniforms as u32-backed containers in webgpu_context.cc.
  • Generate WGSL bitcast expressions in GetElementAt for f16 access in shader_variable.h.
  • Change uniform struct declarations to use u32 types and adjusted lengths in shader_helper.cc.
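The element access described in the list above might be generated along these lines. This is only a sketch: `F16UniformElement` and its `num_words` parameter are hypothetical, and it covers only a uniform stored as a plain `u32` (one word) or as a `vecN<u32>` (several words). The actual `GetElementAt` logic and the array case are not shown in this conversation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch (not the PR's code): build the WGSL expression that
// reads f16 element `index` from a u32-backed uniform named `name`.
// `num_words` is the number of u32 words backing the uniform:
//   1     -> declared as `u32`
//   2..4  -> declared as `vecN<u32>` (assumed layout)
// Each u32 word holds two f16 values, so element i lives in word i / 2,
// component i % 2 of the bitcast result.
std::string F16UniformElement(const std::string& name, std::size_t num_words,
                              std::size_t index) {
  const std::string lane = "[" + std::to_string(index % 2) + "]";
  if (num_words == 1) {
    // A scalar u32 cannot be indexed, so bitcast it directly.
    return "bitcast<vec2<f16>>(uniforms." + name + ")" + lane;
  }
  return "bitcast<vec2<f16>>(uniforms." + name + "[" +
         std::to_string(index / 2) + "])" + lane;
}
```

With these assumptions, `F16UniformElement("alpha", 1, 0)` yields `bitcast<vec2<f16>>(uniforms.alpha)[0]`, the same shape as the expression quoted later in this thread.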

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| onnxruntime/core/providers/webgpu/webgpu_context.cc | Revised uniform offset/size logic to handle f16 as u32 with correct alignment and padding. |
| onnxruntime/core/providers/webgpu/shader_variable.h | Expanded GetElementAt to emit bitcast<vec2<f16>> accesses for f16 uniforms. |
| onnxruntime/core/providers/webgpu/shader_helper.cc | Mutates data_type/length for f16 and emits appropriate WGSL types (u32, vecN, array). |
Comments suppressed due to low confidence (1)

onnxruntime/core/providers/webgpu/webgpu_context.cc:381

  • The comment for the f16 array threshold (>8) does not match the implementation (which uses length > 6). Please update the comment to reflect the actual branch condition or adjust the threshold to align with the intended behavior.
    // - length > 8      : array<vec4<u32>, N>   (align 16) (size 16 * N, N = ceil(length / 8))
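The arithmetic in the quoted code comment can be sketched as follows. `F16ArrayLayout` and `ComputeF16ArrayLayout` are hypothetical names. The sketch only mirrors the formula in that comment (each `vec4<u32>` holds 8 f16 values, 16 bytes, align 16) and takes no position on which length threshold selects the array form.

```cpp
#include <cstddef>

// Hypothetical sketch of the array case from the quoted comment:
//   array<vec4<u32>, N>, align 16, size 16 * N, N = ceil(length / 8)
struct F16ArrayLayout {
  std::size_t vec4_count;   // N
  std::size_t size_bytes;   // 16 * N
  std::size_t align_bytes;  // always 16 for vec4<u32> elements
};

constexpr F16ArrayLayout ComputeF16ArrayLayout(std::size_t length) {
  // Integer ceil(length / 8): one vec4<u32> stores 4 words * 2 f16 each.
  const std::size_t n = (length + 7) / 8;
  return F16ArrayLayout{n, 16 * n, 16};
}
```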

@vraspar vraspar left a comment

I wonder how this will impact perf on devices that support f16, where we now use u32 instead.

fs-eire commented Jul 30, 2025

> I wonder how this will impact perf on devices that support f16, where we now use u32 instead.

Technically, the perf impact is the difference between `let val = uniforms.f16_val;` and `let val = bitcast<vec2<f16>>(uniforms.u32_val)[0];`, which is cheap in my understanding.

Anyway, we haven't got any actual use case for f16 in uniform yet, so this is not going to impact anything.
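To make the `bitcast<vec2<f16>>(uniforms.u32_val)[0]` access concrete, here is a CPU-side emulation sketch. `HalfBitsToFloat` and `F16Lane` are hypothetical helpers, not part of the PR. On the GPU the bitcast is a pure reinterpretation of bits; the decoding to `float` below exists only so the result can be inspected on the host.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: decode one IEEE 754 binary16 bit pattern to float.
inline float HalfBitsToFloat(std::uint16_t h) {
  const int exponent = (h >> 10) & 0x1F;
  const int mantissa = h & 0x3FF;
  float magnitude;
  if (exponent == 0) {
    // Zero or subnormal: mantissa * 2^-24.
    magnitude = std::ldexp(static_cast<float>(mantissa), -24);
  } else if (exponent == 31) {
    // Infinity (mantissa == 0) or NaN.
    magnitude = mantissa == 0 ? std::numeric_limits<float>::infinity()
                              : std::numeric_limits<float>::quiet_NaN();
  } else {
    // Normal: (1024 + mantissa) * 2^(exponent - 15 - 10).
    magnitude = std::ldexp(static_cast<float>(mantissa | 0x400), exponent - 25);
  }
  return (h & 0x8000) ? -magnitude : magnitude;
}

// Emulates bitcast<vec2<f16>>(word)[lane] for lane 0 or 1:
// lane 0 is the low 16 bits, lane 1 is the high 16 bits.
inline float F16Lane(std::uint32_t word, int lane) {
  return HalfBitsToFloat(static_cast<std::uint16_t>(word >> (16 * lane)));
}
```

For example, the word `0xC0003C00` holds the f16 bit patterns for 1.0 in lane 0 and -2.0 in lane 1.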

@fs-eire fs-eire merged commit c29737d into main Jul 30, 2025
104 of 106 checks passed
@fs-eire fs-eire deleted the fs-eire/use-u32-for-f16-in-uniform-var branch July 30, 2025 20:58
adrianlizarraga pushed a commit that referenced this pull request Aug 8, 2025
adrianlizarraga added a commit that referenced this pull request Aug 8, 2025
…5, 25652 (#25701)

### Description
Cherry-pick the following PRs into the `rel-1.23.0` branch:

- #25391
- #25611
- #25656
- #25346
- #25374
- #25664
- #25675
- #25652


### Motivation and Context

---------

Co-authored-by: Yulong Wang <[email protected]>
Co-authored-by: Ishwar Raut <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Gaurav Garg <[email protected]>
Co-authored-by: Scott McKay <[email protected]>
Co-authored-by: Chi Lo <[email protected]>
Co-authored-by: Abhishek Jindal <[email protected]>
Co-authored-by: Dmitri Smirnov <[email protected]>
sanketkaleoss pushed a commit to sanketkaleoss/onnxruntime that referenced this pull request Aug 11, 2025