[webgpu] use u32 to represent f16 in uniform #25391
Pull Request Overview
This PR updates how 16-bit floats are represented in uniform buffers by packing them into 32-bit unsigned integers and adjusts buffer layout computation and WGSL codegen accordingly.
- Compute size and alignment for f16 uniforms as u32-backed containers in `webgpu_context.cc`.
- Generate WGSL bitcast expressions in `GetElementAt` for f16 access in `shader_variable.h`.
- Change uniform struct declarations to use u32 types and adjusted lengths in `shader_helper.cc`.
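As a rough illustration of the packing idea summarized above (not the code from this PR), the sketch below packs raw f16 bit patterns two per `u32` on the host and builds a WGSL expression that reads one element back through `bitcast<vec2<f16>>`. The helper names `PackF16BitsToU32` and `F16ArrayElementExpr` are hypothetical. The `uniforms.` prefix and the `array<vec4<u32>, N>` declaration are assumptions about the generated shader.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: pack raw IEEE 754 binary16 bit patterns two per 32-bit word.
// Element 2k lands in the low 16 bits and element 2k+1 in the high 16 bits,
// so a WGSL bitcast<vec2<f16>>(word) sees them as components x and y.
std::vector<uint32_t> PackF16BitsToU32(const std::vector<uint16_t>& halves) {
  std::vector<uint32_t> words((halves.size() + 1) / 2, 0u);
  for (size_t i = 0; i < halves.size(); ++i) {
    words[i / 2] |= static_cast<uint32_t>(halves[i]) << (16 * (i % 2));
  }
  return words;
}

// Hypothetical helper: WGSL expression for f16 element `index` of a uniform
// assumed to be declared as array<vec4<u32>, N>. Each vec4<u32> holds 8 halves,
// each u32 component holds 2.
std::string F16ArrayElementExpr(const std::string& name, size_t index) {
  return "bitcast<vec2<f16>>(uniforms." + name + "[" +
         std::to_string(index / 8) + "][" +
         std::to_string((index % 8) / 2) + "])[" +
         std::to_string(index % 2) + "]";
}
```

For example, element 11 of a uniform named `scale` sits in the second `vec4<u32>`, in its second component, in the high half of that word.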
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| onnxruntime/core/providers/webgpu/webgpu_context.cc | Revised uniform offset/size logic to handle f16 as u32 with correct alignment and padding. |
| onnxruntime/core/providers/webgpu/shader_variable.h | Expanded GetElementAt to emit bitcast<vec2<f16>> accesses for f16 uniforms. |
| onnxruntime/core/providers/webgpu/shader_helper.cc | Mutates data_type/length for f16 and emits appropriate WGSL types (u32, vecN, array). |
Comments suppressed due to low confidence (1)
onnxruntime/core/providers/webgpu/webgpu_context.cc:381
- The comment for the f16 array threshold (>8) does not match the implementation (which uses `length > 6`). Please update the comment to reflect the actual branch condition or adjust the threshold to align with the intended behavior.
// - length > 8 : array<vec4<u32>, N> (align 16) (size 16 * N, N = ceil(length / 8))
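The arithmetic in the quoted comment can be checked with a small sketch. It covers only the array case that the comment describes (`N = ceil(length / 8)`, 16 bytes per `vec4<u32>`, 16-byte alignment). The struct and function names are hypothetical. The sketch deliberately takes no position on which threshold, 6 or 8, selects the array branch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type for the array<vec4<u32>, N> case only.
struct F16ArrayUniformLayout {
  size_t vec4_count;   // N in array<vec4<u32>, N>
  size_t size_bytes;   // 16 * N
  size_t align_bytes;  // 16
};

// Layout of `length` f16 values stored as array<vec4<u32>, N>:
// each vec4<u32> is 16 bytes and carries 8 packed f16 values.
F16ArrayUniformLayout ComputeF16ArrayUniformLayout(size_t length) {
  assert(length > 0);
  const size_t n = (length + 7) / 8;  // ceil(length / 8)
  return {n, 16 * n, 16};
}
```

With these formulas, 9 to 16 elements occupy 32 bytes and 17 elements occupy 48 bytes.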
vraspar left a comment
I wonder how this will impact perf on devices that support f16 when we use u32 instead.
technically the perf impact is Anyway, we haven't got any actual use case for f16 in uniform yet, so this is not going to impact anything.
…5, 25652 (#25701): Cherry-pick the following PRs into the `rel-1.23.0` branch: #25391, #25611, #25656, #25346, #25374, #25664, #25675, #25652.
Co-authored-by: Yulong Wang, Ishwar Raut, Maximilian Müller, Gaurav Garg, Scott McKay, Chi Lo, Abhishek Jindal, Dmitri Smirnov
Description
For f16 uniform variables, use u32 to bit-wise represent them.
Motivation and Context
Some devices support f16 in shaders and storage buffers, but not in uniform buffers. Dawn sets f16_support to false for them. However, we don't necessarily have to use f16 in uniforms.
This change together with #25349 will enable using f16 models on some Android devices.