Description
Proposed new feature or change:
Continuing from and separately to #24018, let's discuss SIMD intrinsics here.
I understand there is an ongoing effort to replace(?) the NEP38 macros with a C++ wrapper (#21057).
part that is because Sayed has momentum for the "custom" universal intrinsics
Unfortunately it seems there is some duplication of effort :( Highway has been under development since 2017 with contributions from about a dozen engineers, and open sourced in 2019 with several dozen open-source collaborators since then (mostly bugfixes).
Wouldn't we get further by collaborating, perhaps by extending Highway with NumPy-specific operations? This would make it easier to share code and onboard developers (Highway has 2.6k GitHub stars) as opposed to a custom wrapper only used by NumPy.
Wouldn't it also make sense to benefit from all the ongoing maintenance efforts? This is quite costly (multiple patches per week) given all the platforms/compilers to support.
@jan-wassenberg, I think this would just be something like MaskedGatherLoad and MaskedBlendedStore.
Thanks, makes sense. FYI it is possible to emulate these by IfThenElse on the indices, replacing invalid ones with a safe/dummy address. (We found this to have no observable perf impact on x86.)
We can soon add native versions of these, though, because it would be more obvious/convenient for user code.
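For readers following along, the emulation described above can be modeled in plain scalar C++. This is only a sketch of the idea, not Highway code: the function name `MaskedGatherEmulated`, the `fallback` parameter, and the choice of index 0 as the "safe" address are assumptions made for illustration. In a real SIMD version each step would operate on whole vectors rather than one lane at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of a masked gather emulated with a select on the indices.
// Hypothetical helper for illustration; this is not part of Highway's API.
std::vector<float> MaskedGatherEmulated(const std::vector<float>& base,
                                        const std::vector<std::int32_t>& indices,
                                        const std::vector<bool>& mask,
                                        float fallback) {
  assert(!base.empty());                  // index 0 must be a valid address
  assert(indices.size() == mask.size());  // one mask bit per lane

  const std::int32_t kSafeIndex = 0;  // assumed safe/dummy location
  std::vector<float> out(indices.size());
  for (std::size_t i = 0; i < indices.size(); ++i) {
    // Step 1: select on the index, so masked-off lanes point somewhere valid.
    const std::int32_t safe = mask[i] ? indices[i] : kSafeIndex;
    // Step 2: unconditional gather; every lane now reads a valid element.
    const float loaded = base[static_cast<std::size_t>(safe)];
    // Step 3: select on the result, discarding what masked-off lanes loaded.
    out[i] = mask[i] ? loaded : fallback;
  }
  return out;
}
```

Masked-off lanes may carry arbitrary (even out-of-range) indices here, since they are replaced before any memory access happens. Active lanes are still expected to hold valid indices.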