ARROW-9042: [C++] Add Subtract and Multiply arithmetic kernels with wrap-around behavior#7341
ARROW-9042: [C++] Add Subtract and Multiply arithmetic kernels with wrap-around behavior#7341kszucs wants to merge 23 commits intoapache:masterfrom
Conversation
I'm not sure that's desirable. For one thing this leads to inconsistent handling of 64 bit integer types, which are currently allowed to overflow (NB: that means this kernel includes undefined behavior for int64). There are a few other approaches we could take (ordered by personal preference):
@wesm ? This is probably nuanced enough to merit a mailing list discussion. |
Certainly. What kind of overflow strategies would could we have for other functions with higher probability of overflowing like product? |
|
Per the mailing list discussion, I think we should apply the strategies used by open source analytic DBMS that we have access to and not think too hard about it:
|
|
I'll update the PR as discussed, and defer the implicit promotions to a follow-up PR. |
|
As I said on the ML, I'm -1 on implicit promotion. We may discuss whether overflow should be detected or not. But trying to make the output type "big enough" is a can of worms. |
|
Agreed. Implicit casts or type promotions is something that is typically negotiated by the orchestration/planning layer 1 to 2 levels above the kernels. |
|
The According to this SO post we might need a special case for uint16 (or perhaps it depends on the arch) multiplication. |
|
We could also simply not generate multiply kernels for 8- and 16-byte integer types. |
|
Just fixed it. |
There was a problem hiding this comment.
The array equality check fails despite that after printing out both sides the values are the same. I'm not sure why, perhaps a float precision problem?
|
@bkietz I think it is ready for a review now, the build failures are present on the master branch. |
Co-authored-by: Benjamin Kietzman <[email protected]>
|
Nice! Thanks for upgrading it! |
| *out_data++ = Op::template Call<OUT, ARG0, ARG1>(ctx, *arg0_data++, *arg1_data++); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
If you're going to remove this, you absolutely must write benchmarks to show that the more general version is not slower.
There was a problem hiding this comment.
Writing a benchmark, the jira to track it https://issues.apache.org/jira/browse/ARROW-9079
Quickly wanted to add a benchmark for the `Add` function to verify that no significant regressions were introduced by #7341 Before: ``` --------------------------------------------------------------------------------------- Benchmark Time CPU Iterations UserCounters... --------------------------------------------------------------------------------------- AddArrayArrayKernel/32768/10000 18 us 18 us 35892 null_percent=0.01 size=32.768k 1.67854GB/s AddArrayArrayKernel/32768/100 19 us 19 us 37540 null_percent=1 size=32.768k 1.61941GB/s AddArrayArrayKernel/32768/10 20 us 20 us 37049 null_percent=10 size=32.768k 1.55599GB/s AddArrayArrayKernel/32768/2 20 us 20 us 35394 null_percent=50 size=32.768k 1.54512GB/s AddArrayArrayKernel/32768/1 19 us 19 us 37901 null_percent=100 size=32.768k 1.63153GB/s ``` After: ``` --------------------------------------------------------------------------------------- Benchmark Time CPU Iterations UserCounters... --------------------------------------------------------------------------------------- AddArrayArrayKernel/32768/10000 19 us 19 us 36704 null_percent=0.01 size=32.768k 1.64619GB/s AddArrayArrayKernel/32768/100 18 us 18 us 37194 null_percent=1 size=32.768k 1.67588GB/s AddArrayArrayKernel/32768/10 18 us 18 us 36341 null_percent=10 size=32.768k 1.65205GB/s AddArrayArrayKernel/32768/2 18 us 18 us 37502 null_percent=50 size=32.768k 1.662GB/s AddArrayArrayKernel/32768/1 18 us 18 us 38622 null_percent=100 size=32.768k 1.66593GB/s ``` cc @wesm Closes #7417 from kszucs/ARROW-9079 Lead-authored-by: Krisztián Szűcs <[email protected]> Co-authored-by: Benjamin Kietzman <[email protected]> Co-authored-by: Wes McKinney <[email protected]> Signed-off-by: Wes McKinney <[email protected]>
Currently the output type of the Add function is identical with the argument types which makes it unsafe to add numeric limit values, so instead of using(int8, int8) -> int8signature we should use(int8, int8) -> int16.follow-ups: