ARROW-8772: [C++] Unrolled aggregate dense for better speculative execution#7267
frankdjx wants to merge 1 commit into apache:master from
1. Expand the SumKernel benchmark to more types (Float, Double, Int8, Int16, Int32, Int64).
2. Unroll the dense part of the aggregate kernel so that the partial results are added speculatively in parallel.
Signed-off-by: Frank Du <[email protected]>
|
We found that the SumKernel benchmark results for the dense case (null percent 0) are relatively low compared to the sparse case for the float and double types. Below is the result before unrolling the loop.
With the unroll change, the dense SumKernel benchmark gets a 3.7x improvement for float and a 2.6x improvement for double.
That said, it could reach even higher performance using intrinsics, which I'd like to work on at a later point.
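For readers following along, the idea of unrolling the dense path can be sketched roughly as below. This is a minimal illustration and not the code from this PR: the names `SumDenseNaive`, `SumDenseUnrolled`, and the unroll factor `kUnroll = 8` are assumptions chosen for the example. The point is that several independent accumulators break the single dependency chain on `sum`, so the CPU can keep multiple floating-point adds in flight.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Baseline: a single accumulator, so every add waits on the previous one.
template <typename T, typename SumT = T>
SumT SumDenseNaive(const T* values, std::size_t length) {
  SumT sum = 0;
  for (std::size_t i = 0; i < length; ++i) {
    sum += values[i];
  }
  return sum;
}

// Unrolled variant (hypothetical sketch): kUnroll independent partial sums.
template <typename T, typename SumT = T>
SumT SumDenseUnrolled(const T* values, std::size_t length) {
  constexpr std::size_t kUnroll = 8;  // assumed factor, not taken from the PR
  SumT partial[kUnroll] = {};
  const std::size_t rounds = length / kUnroll;
  for (std::size_t r = 0; r < rounds; ++r) {
    // The adds in this inner loop do not depend on each other.
    for (std::size_t k = 0; k < kUnroll; ++k) {
      partial[k] += values[r * kUnroll + k];
    }
  }
  // Combine the partial sums, then handle the leftover tail elements.
  SumT sum = 0;
  for (std::size_t k = 0; k < kUnroll; ++k) {
    sum += partial[k];
  }
  for (std::size_t i = rounds * kUnroll; i < length; ++i) {
    sum += values[i];
  }
  return sum;
}
```

Note that for float and double the unrolled version adds the values in a different order than the naive loop, so the two results can differ in the last bits.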
|
@jianxind thank you for looking into this! I had suspected that there was room for improvement in this algorithm. You're certainly free to implement kernels requiring intrinsics and put them in aggregate_basic_$SIMD_VERSION.cc, and then we can utilize the CpuInfo from |
|
This shows a small benefit on ARM64 architecture (ThunderX1): before: after (see the Int64 benchmarks): |
|
Indeed, the dense performance issue only happens for the float and double types. The compiler did a good job for all the int types, which is why I added all the data types to the SumKernel benchmark.
Sure, I will work on the SIMD version then, thanks.
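One likely reason for the difference between the int and float/double cases is that floating-point addition is not associative, so without flags such as `-ffast-math` a compiler is generally not allowed to reorder a floating-point reduction on its own, while it is free to do so for integers. The sketch below only illustrates that property; the function names are made up for this example and assume default (strict) floating-point settings.

```cpp
#include <cstdint>

// Same three operands, two different association orders.
inline float SumLeftFirst(float a, float b, float c) { return (a + b) + c; }
inline float SumRightFirst(float a, float b, float c) { return a + (b + c); }

// Unsigned integer addition wraps around and is associative,
// so both orders always agree.
inline std::uint32_t SumLeftFirstU32(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
  return (a + b) + c;
}
inline std::uint32_t SumRightFirstU32(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
  return a + (b + c);
}
```

With `a = 1e20f`, `b = -1e20f`, `c = 1.0f`, the first order gives `1.0f` while the second gives `0.0f`, because `1.0f` is lost when added to `-1e20f`.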
…cution
1. Expand the SumKernel benchmark to more types (Float, Double, Int8, Int16, Int32, Int64).
2. Unrolled the dense part of the aggregate kernel so that the partial results are added speculatively in parallel.
Signed-off-by: Frank Du <[email protected]>
Closes apache#7267 from jianxind/SumKernelBenchmark
Authored-by: Frank Du <[email protected]>
Signed-off-by: Wes McKinney <[email protected]>
|
I just wanted to point out that this varies greatly depending on the toolchain, e.g. LLVM is usually more aggressive at unrolling than GCC. Since most of our users (Python, R) will have libarrow built with GCC, I think it's fine.