-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Math functions renaming table for GPU backends to support vectorized evaluation of math functions. #8595
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
475c010 to
7b91926
Compare
|
I seem to have created a few issues. https://buildbot.halide-lang.org/master/#/builders/98/builds/88 (halide-testbranch-main-llvm20-arm-64-osx-cmake) shows useful errors. Will address them. |
|
Is this Windows issue fixed by now?
@steven-johnson Could you request a rebuild for those failed build bots? |
|
Working on the windows error in #8615 |
e883584 to
1c730e5
Compare
|
@alexreinking Can you fix mac-x86-worker-2: Python package numpy is no found. If that one is not found, I'm assuming it's a fresh install and more stuff might be missing... |
|
@abadams can you assess the test failure on arm. There is a hexagon related test that failed. I doubt it is related to my work. |
|
@mcourteaux The test failure on Windows was the correctness_math test on the Vulkan backend segfaulting due to a failed compilation for mismatched types for built-in constants for |
|
I will defer to Derek for approval |
derek-gerstmann
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need so many alias macros or is there cleaner way? Otherwise LGTM.
|
Well, so my motivation was:
As such, general-purpose Halide IR must use Given the above, a general-purpose mapping table in the C codegen base-class seemed like the right place to put all of this shared functionality to rewrite extern function calls to have different function names (e.g., from In general, there is a lot less code now, compared to all of the |
…unction for GPU backends.
…e or a signed typ.
d675a4b to
067cc26
Compare
|
Ping @derek-gerstmann. |
derek-gerstmann
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I was merely commenting on the usage of #define alias which didn't seem necessary since it only replicated 2-3 lines of code and hid the actual mapping declaration.
This clears up all the preprocessor prologues or wrapper function prologues.
Fixes #8594
Fixes #8569
Update:
pow()for WGSL is removed from the preamble now, and its behavior is emulated during codegen in the WGSL_CodeGen backend. There is a potential for simplyfing that again if there are meaningful bounds on expressions inferred. I'd consider that future work. Removing this from the preamble and using Halide's IR to emulate this, enables support for vectorized calls. There is a caveat with this: WGSL says there are no nans. However, their own builtin pow() function returns nan for negative inputs to x and y. Emulating that was tricky, because I think the WGSL simplifier removes those? Not sure why I couldn't simply domake_const(x.type(), std::nanf("")). I had to resort tomake_const(x.type(), 1.0f) * Call::make(Float(32), "nan_f32", {}, Call::PureExtern)to get the NaN value in.