Math functions renaming table for GPU backends to support vectorized evaluation of math functions. #8595

mcourteaux · 2025-03-17T13:45:32Z

This clears up all the preprocessor prologues or wrapper function prologues.

Update: pow() for WGSL has been removed from the preamble, and its behavior is now emulated during codegen in the WGSL_CodeGen backend. That emulation could potentially be simplified again if meaningful bounds on the expressions can be inferred, but I'd consider that future work. Removing pow() from the preamble and emulating it with Halide IR instead enables support for vectorized calls. There is one caveat: WGSL says there are no NaNs, yet its own builtin pow() function returns NaN for negative inputs to x and y. Emulating that was tricky, because I think the WGSL simplifier removes those? I'm not sure why I couldn't simply do make_const(x.type(), std::nanf("")). I had to resort to make_const(x.type(), 1.0f) * Call::make(Float(32), "nan_f32", {}, Call::PureExtern) to get the NaN value in.
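As a rough illustration of the emulation idea (a sketch under assumptions, not the code in this PR): if the native pow() is only trusted for non-negative bases, the result for negative bases can be rebuilt from pow(|x|, y) with a chain of selects. Because every step is a per-element select, the same expression applies lane by lane to vectors. Plain C++ stands in for Halide IR here, and the names `pow_emulated` and `pow_emulated_x4` are hypothetical.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical scalar model of a select-based pow() emulation.
// Assumption: the "native" pow is only used with a non-negative base.
inline float pow_emulated(float x, float y) {
    const float mag = std::pow(std::fabs(x), y);  // base is >= 0 here
    const bool y_is_int = std::trunc(y) == y;     // false when y is NaN
    const bool y_is_odd =
        std::isfinite(y) && y_is_int && std::fmod(y, 2.0f) != 0.0f;
    const float nan = std::numeric_limits<float>::quiet_NaN();
    // Select chain: non-negative base -> mag; negative base with an
    // integer exponent -> signed mag; anything else -> NaN.
    return x >= 0.0f ? mag
                     : (y_is_int ? (y_is_odd ? -mag : mag) : nan);
}

// The same selects applied independently to each lane of a 4-wide vector.
inline std::array<float, 4> pow_emulated_x4(const std::array<float, 4> &x,
                                            const std::array<float, 4> &y) {
    std::array<float, 4> r{};
    for (std::size_t i = 0; i < r.size(); ++i) {
        r[i] = pow_emulated(x[i], y[i]);
    }
    return r;
}
```

With this model, a negative base with an odd integer exponent yields a negative result, and a negative base with a fractional exponent yields NaN. The NaN case is the one that had to be forced through the `nan_f32` extern call described above.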

mcourteaux · 2025-03-21T00:25:32Z

I seem to have created a few issues. https://buildbot.halide-lang.org/master/#/builders/98/builds/88 (halide-testbranch-main-llvm20-arm-64-osx-cmake) shows useful errors. Will address them.

mcourteaux · 2025-04-19T07:41:18Z

Is this Windows issue fixed by now?

JIT session error: could not register eh-frame: __register_frame function not found
Assertion failed: !G && "InFlight alloc neither abandoned nor finalized", file C:\build_bot\worker\llvm-20-x86-64-windows\llvm-project\llvm\lib\ExecutionEngine\JITLink\JITLinkMemoryManager.cpp, line 251

@steven-johnson Could you request a rebuild for those failed build bots?

abadams · 2025-04-23T17:47:13Z

Working on the windows error in #8615

mcourteaux · 2025-04-26T08:21:32Z

@alexreinking Can you fix mac-x86-worker-2: Python package numpy is not found. If that one is not found, I'm assuming it's a fresh install and more stuff might be missing...

mcourteaux · 2025-05-01T07:08:06Z

@abadams can you assess the test failure on arm? There is a hexagon-related test that failed. I doubt it is related to my work.
@derek-gerstmann can you assess the test failure on Windows? I don't see any error.

derek-gerstmann · 2025-05-01T17:58:06Z

@mcourteaux The test failure on Windows was the correctness_math test on the Vulkan backend segfaulting due to a failed compilation for mismatched types for built-in constants for inf_f32, neg_inf_f32, and nan_f32. These are handled as extern calls and they assumed they were always scalars, which caused the pow vector x2 tests to fail. I've pushed fixes to make sure proper vector constants are generated.
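To make the scalar-versus-vector mismatch concrete, here is a minimal sketch (not the Vulkan backend's actual code): a constant requested for an N-lane type has to be produced with N lanes, rather than as a single scalar. The function `special_constant_f32` and its signature are hypothetical; only the constant names `inf_f32`, `neg_inf_f32`, and `nan_f32` come from the discussion above.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: returns one value per lane for a special
// floating-point constant, so a 2-lane request gets a 2-lane constant.
inline std::vector<float> special_constant_f32(const std::string &name,
                                               std::size_t lanes) {
    float value = 0.0f;
    if (name == "inf_f32") {
        value = std::numeric_limits<float>::infinity();
    } else if (name == "neg_inf_f32") {
        value = -std::numeric_limits<float>::infinity();
    } else if (name == "nan_f32") {
        value = std::numeric_limits<float>::quiet_NaN();
    } else {
        throw std::invalid_argument("unknown constant: " + name);
    }
    // Broadcast the scalar value to every lane of the requested type.
    return std::vector<float>(lanes, value);
}
```

Calling `special_constant_f32("nan_f32", 2)` yields a two-element result, which is the shape a pow x2 test would need.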

abadams · 2025-05-06T19:38:55Z

I will defer to Derek for approval

derek-gerstmann

Do we need so many alias macros, or is there a cleaner way? Otherwise LGTM.

mcourteaux · 2025-05-22T13:41:15Z

Well, so my motivation was:

Halide uses suffixes for floating-point math functions, like _f32, _f64, _f16.
Every graphics API spec declares those math functions, but they tend to have different names depending on the API (e.g., fabs() vs overloaded abs()).
Not every API provides implementations for f16 or f64 variants of the math functions.

As such, general-purpose Halide IR must use sin_f32 as the function to compute sine of (a vector of) 32-bit floats. Halide's C backend runtime provides those functions explicitly and simply redirects them to libm (sin, sinf), and we pay for the extra function-call overhead. LLVM-based backends automatically scalarize those calls.

Given the above, a general-purpose mapping table in the C codegen base-class seemed like the right place to put all of this shared functionality to rewrite extern function calls to have different function names (e.g., from sin_f32 to sin).
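A minimal sketch of what such a mapping table could look like, assuming a plain string-to-string map (the names `example_rename_table` and `rewrite_extern_name` are hypothetical and not the identifiers used in the PR):

```cpp
#include <string>
#include <unordered_map>

// Hypothetical renaming table: Halide's suffixed name -> backend built-in.
inline const std::unordered_map<std::string, std::string> &
example_rename_table() {
    static const std::unordered_map<std::string, std::string> table = {
        {"sin_f16", "sin"},
        {"sin_f32", "sin"},
        {"sqrt_f16", "sqrt"},
        {"sqrt_f32", "sqrt"},
    };
    return table;
}

// Look up an extern call's name. Names without an entry pass through
// unchanged, so only the functions a backend lists get rewritten.
inline std::string rewrite_extern_name(const std::string &name) {
    const auto &table = example_rename_table();
    const auto it = table.find(name);
    return it == table.end() ? name : it->second;
}
```

Because the rewrite only changes the function name, a call with vector arguments can be emitted directly against the backend's overloaded built-in instead of going through a scalar wrapper.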

In general, there is a lot less code now, compared to all of the #defines in the shader code.
The alias macros each write 2 or 3 renaming entries into the table. But as I said, automating the generation of those entries is not elegant, as every backend uses different naming schemes, and some don't even provide functionality with the same semantics as we expect (i.e., WebGPU with pow()). In my opinion, I replaced several long tables with several short tables, with increased functionality (i.e., support for vector-arguments).
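For illustration, an alias macro of the kind discussed here might look roughly as follows. This is a sketch under assumptions: `EXAMPLE_ALIAS` and `build_example_table` are hypothetical names, and the real macros may differ in shape and in which suffixes they cover.

```cpp
#include <map>
#include <string>

// Hypothetical alias macro: one invocation writes two renaming entries
// (the _f16 and _f32 variants). halide_name must be a string literal so
// that adjacent-literal concatenation builds the suffixed key.
#define EXAMPLE_ALIAS(table, halide_name, native_name) \
    do {                                               \
        (table)[halide_name "_f16"] = native_name;     \
        (table)[halide_name "_f32"] = native_name;     \
    } while (0)

inline std::map<std::string, std::string> build_example_table() {
    std::map<std::string, std::string> table;
    // Macro form: compact, but the actual entries are hidden.
    EXAMPLE_ALIAS(table, "sqrt", "sqrt");
    EXAMPLE_ALIAS(table, "floor", "floor");
    // Explicit form: the same two entries written out by hand.
    table["sin_f16"] = "sin";
    table["sin_f32"] = "sin";
    return table;
}
```

The trade-off is visible in the two forms: the macro saves a line or two per function, while the explicit entries show the mapping directly.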

mcourteaux · 2025-05-26T15:39:02Z

Ping @derek-gerstmann.

derek-gerstmann

Thanks. I was merely commenting on the usage of #define alias which didn't seem necessary since it only replicated 2-3 lines of code and hid the actual mapping declaration.

mcourteaux requested review from abadams and halidebuildbots March 17, 2025 13:46

mcourteaux force-pushed the math_funcs_table_gpu branch from 475c010 to 7b91926 Compare March 17, 2025 20:45

mcourteaux force-pushed the math_funcs_table_gpu branch from e883584 to 1c730e5 Compare April 25, 2025 20:44

abadams requested a review from derek-gerstmann May 6, 2025 19:38

derek-gerstmann reviewed May 20, 2025

mcourteaux added the code_cleanup and gpu labels May 22, 2025

mcourteaux requested a review from derek-gerstmann May 22, 2025 13:42

mcourteaux and others added 13 commits May 25, 2025 11:33

Rewrite function calls to math functions to the native built-in API function for GPU backends.

c5a06e0

Test vectorized support for math functions in correctness/math.cpp

ca21dcb

Clang format.

ddedfc0

Add missing #include

c5c0fd5

Fix fast_inverse on Metal.

0029007

Fix two small mistakes.

f4d159d

Move WGSL emulation of pow to IR instead of a function in the preamble.

1a0f566

Make distinction between backends where abs() returns an unsigned type or a signed typ.

aa3ed82

Correct the type cast in WGSL pow().

265faa4

Attempt to fix pow() on WGSL.

a3e65e3

Attempt to make pow() return NaN on WebGPU.

c6b08e6

Trigger build.

a03170b

Make extern calls for nan_f32, inf_f32, neg_inf_f32 handle vector types.

9115a16

Clang format pass.

067cc26

mcourteaux force-pushed the math_funcs_table_gpu branch from d675a4b to 067cc26 Compare May 25, 2025 09:34

derek-gerstmann approved these changes May 27, 2025

mcourteaux merged commit 4a08ef9 into halide:main May 27, 2025
19 checks passed

alexreinking mentioned this pull request Aug 13, 2025

Correct soname for libOpenCL.so? #8569

Closed

BrewTestBot mentioned this pull request Sep 16, 2025

halide 21.0.0 Homebrew/homebrew-core#244220

Merged
