ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch#7475
wesm wants to merge 2 commits into apache:master
|
AMD64 Ubuntu 18.04 C++ Benchmark (#113134) builder failed with an exception. Revision: 999865b
Archery:
Traceback (most recent call last):
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/twisted/internet/defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/twisted/internet/defer.py", line 1475, in gotResult
    _inlineCallbacks(r, g, status)
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
--- <exception caught here> ---
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/buildbot/process/buildstep.py", line 566, in startStep
    self.results = yield self.run()
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "/home/ursabot/ursabot/ursabot/steps.py", line 67, in run
    await log.addContent(content)
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/buildbot/process/log.py", line 130, in addContent
    return self.lbf.append(text)
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/buildbot/util/lineboundaries.py", line 62, in append
    text = self.newline_re.sub('\n', text)
builtins.TypeError: expected string or bytes-like object |
|
AMD64 Ubuntu 18.04 C++ Benchmark (#113245) builder failed with an exception. Revision: 999865b
Archery:
Traceback (most recent call last):
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/twisted/internet/defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/twisted/internet/defer.py", line 1475, in gotResult
    _inlineCallbacks(r, g, status)
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
--- <exception caught here> ---
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/buildbot/process/buildstep.py", line 566, in startStep
    self.results = yield self.run()
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "/home/ursabot/ursabot/ursabot/steps.py", line 67, in run
    await log.addContent(content)
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/buildbot/process/log.py", line 130, in addContent
    return self.lbf.append(text)
  File "/home/ursabot/.conda/envs/ursabot/lib/python3.7/site-packages/buildbot/util/lineboundaries.py", line 62, in append
    text = self.newline_re.sub('\n', text)
builtins.TypeError: expected string or bytes-like object |
|
@fsaintjacques @kszucs any idea what went wrong with buildbot? |
|
Buildbot parses a specific stdio format from the archery command, and the output was a bit different for this invocation; my guess is that passing a specific commit changes the output format. I'm triggering a benchmark without the contender commit to see whether it is a buildbot parser issue or an archery output formatting error. The logs contain only a single result set, whereas the passing benchmarks contain two, so archery doesn't produce the result diff as JSON. |
|
Another guess is |
|
AMD64 Ubuntu 18.04 C++ Benchmark (#113289) builder has been succeeded. Revision: 999865b
====================================== =============== =============== =========
benchmark baseline contender change
====================================== =============== =============== =========
- FilterStringFilterNoNulls/262144/0 3.205 GiB/sec 563.351 MiB/sec -82.832%
- FilterInt64FilterWithNulls/262144/4 1.448 GiB/sec 643.680 MiB/sec -56.595%
- FilterFSLInt64FilterWithNulls/262144/6 1.061 GiB/sec 334.867 MiB/sec -69.188%
FilterFSLInt64FilterNoNulls/262144/2 1.355 GiB/sec 6.347 GiB/sec 368.336%
- FilterFSLInt64FilterNoNulls/262144/0 1.404 GiB/sec 720.926 MiB/sec -49.872%
FilterFSLInt64FilterWithNulls/262144/1 186.786 MiB/sec 516.247 MiB/sec 176.385%
FilterFSLInt64FilterWithNulls/262144/7 171.996 MiB/sec 468.500 MiB/sec 172.390%
FilterStringFilterWithNulls/262144/2 2.408 GiB/sec 9.139 GiB/sec 279.573%
FilterInt64FilterWithNulls/262144/5 544.180 MiB/sec 5.138 GiB/sec 866.755%
FilterStringFilterNoNulls/262144/9 90.139 MiB/sec 392.643 MiB/sec 335.595%
FilterInt64FilterNoNulls/262144/9 570.820 MiB/sec 3.250 GiB/sec 482.971%
FilterStringFilterNoNulls/262144/8 416.738 MiB/sec 10.990 GiB/sec 2600.350%
- FilterInt64FilterWithNulls/262144/0 1.463 GiB/sec 622.819 MiB/sec -58.424%
FilterFSLInt64FilterWithNulls/262144/2 1.061 GiB/sec 4.517 GiB/sec 325.695%
- FilterStringFilterWithNulls/262144/3 524.535 MiB/sec 438.494 MiB/sec -16.403%
FilterInt64FilterNoNulls/262144/3 597.101 MiB/sec 4.326 GiB/sec 641.848%
FilterInt64FilterWithNulls/262144/7 518.449 MiB/sec 620.439 MiB/sec 19.672%
FilterStringFilterNoNulls/262144/1 553.473 MiB/sec 716.671 MiB/sec 29.486%
- FilterInt64FilterNoNulls/262144/4 2.166 GiB/sec 680.128 MiB/sec -69.332%
FilterFSLInt64FilterWithNulls/262144/5 179.177 MiB/sec 4.391 GiB/sec 2409.209%
FilterInt64FilterWithNulls/262144/9 496.572 MiB/sec 547.030 MiB/sec 10.161%
FilterStringFilterWithNulls/262144/8 284.351 MiB/sec 8.655 GiB/sec 3016.828%
FilterInt64FilterNoNulls/262144/1 647.779 MiB/sec 1.024 GiB/sec 61.870%
- FilterFSLInt64FilterWithNulls/262144/0 1.091 GiB/sec 398.361 MiB/sec -64.327%
FilterInt64FilterNoNulls/262144/7 565.141 MiB/sec 657.051 MiB/sec 16.263%
FilterFSLInt64FilterNoNulls/262144/9 169.973 MiB/sec 269.496 MiB/sec 58.552%
FilterStringFilterNoNulls/262144/2 3.155 GiB/sec 11.443 GiB/sec 262.664%
FilterStringFilterWithNulls/262144/5 518.426 MiB/sec 8.833 GiB/sec 1644.691%
FilterStringFilterNoNulls/262144/7 486.759 MiB/sec 681.910 MiB/sec 40.092%
FilterInt64FilterNoNulls/262144/2 2.160 GiB/sec 7.943 GiB/sec 267.766%
- FilterStringFilterWithNulls/262144/4 2.359 GiB/sec 649.099 MiB/sec -73.125%
- FilterStringFilterWithNulls/262144/6 2.135 GiB/sec 434.104 MiB/sec -80.147%
- FilterStringFilterWithNulls/262144/0 2.435 GiB/sec 444.067 MiB/sec -82.190%
FilterInt64FilterWithNulls/262144/1 594.768 MiB/sec 648.937 MiB/sec 9.108%
FilterInt64FilterNoNulls/262144/5 594.885 MiB/sec 7.189 GiB/sec 1137.460%
- FilterInt64FilterWithNulls/262144/6 1.438 GiB/sec 584.712 MiB/sec -60.292%
- FilterStringFilterNoNulls/262144/4 3.134 GiB/sec 711.198 MiB/sec -77.837%
FilterStringFilterWithNulls/262144/9 85.327 MiB/sec 398.211 MiB/sec 366.691%
FilterFSLInt64FilterNoNulls/262144/1 184.492 MiB/sec 565.075 MiB/sec 206.287%
- FilterStringFilterNoNulls/262144/6 2.876 GiB/sec 488.107 MiB/sec -83.424%
FilterFSLInt64FilterWithNulls/262144/8 1.087 GiB/sec 4.335 GiB/sec 298.769%
FilterInt64FilterNoNulls/262144/0 2.192 GiB/sec 7.987 GiB/sec 264.420%
FilterFSLInt64FilterNoNulls/262144/8 1.427 GiB/sec 5.784 GiB/sec 305.352%
FilterFSLInt64FilterNoNulls/262144/7 175.996 MiB/sec 467.499 MiB/sec 165.630%
- FilterFSLInt64FilterWithNulls/262144/4 1.061 GiB/sec 478.103 MiB/sec -55.992%
FilterStringFilterNoNulls/262144/5 526.042 MiB/sec 11.046 GiB/sec 2050.145%
FilterFSLInt64FilterNoNulls/262144/3 176.402 MiB/sec 560.631 MiB/sec 217.815%
FilterStringFilterWithNulls/262144/1 545.748 MiB/sec 648.966 MiB/sec 18.913%
- FilterFSLInt64FilterNoNulls/262144/4 1.359 GiB/sec 523.037 MiB/sec -62.410%
FilterInt64FilterWithNulls/262144/3 546.504 MiB/sec 612.891 MiB/sec 12.148%
FilterFSLInt64FilterNoNulls/262144/5 176.620 MiB/sec 5.881 GiB/sec 3309.737%
FilterFSLInt64FilterWithNulls/262144/9 178.978 MiB/sec 290.600 MiB/sec 62.367%
FilterStringFilterWithNulls/262144/7 482.739 MiB/sec 647.174 MiB/sec 34.063%
FilterInt64FilterWithNulls/262144/8 1.453 GiB/sec 5.236 GiB/sec 260.446%
- FilterFSLInt64FilterNoNulls/262144/6 1.355 GiB/sec 403.734 MiB/sec -70.899%
FilterInt64FilterWithNulls/262144/2 1.449 GiB/sec 5.155 GiB/sec 255.704%
FilterInt64FilterNoNulls/262144/8 2.214 GiB/sec 7.144 GiB/sec 222.645%
FilterInt64FilterNoNulls/262144/6 2.164 GiB/sec 3.805 GiB/sec 75.808%
FilterStringFilterNoNulls/262144/3 529.096 MiB/sec 550.969 MiB/sec 4.134%
FilterFSLInt64FilterWithNulls/262144/3 178.992 MiB/sec 351.963 MiB/sec 96.636%
====================================== =============== =============== ========= |
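For reading the table: the change column appears to be the relative difference of the contender against the baseline, and rows prefixed with `-` have a negative change. A minimal sketch of that arithmetic follows, assuming 1 GiB = 1024 MiB; the names `PercentChange` and `kMiBPerGiB` are hypothetical and not taken from archery. Because the printed throughputs are rounded, recomputed values match the table only approximately.

```cpp
#include <cassert>
#include <cmath>

// Assumption: throughput units in the table are binary (1 GiB = 1024 MiB).
constexpr double kMiBPerGiB = 1024.0;

// Relative change of the contender versus the baseline, in percent.
// Both arguments must be expressed in the same unit (here MiB/sec).
// A negative result means the contender is slower than the baseline.
double PercentChange(double baseline, double contender) {
  return (contender - baseline) / baseline * 100.0;
}
```

For example, FilterStringFilterNoNulls/262144/0 compares 3.205 GiB/sec against 563.351 MiB/sec, so the baseline has to be converted with `3.205 * kMiBPerGiB` before calling `PercentChange`.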
|
@kszucs oh right, that would do it |
|
Note: those Filter benchmarks are garbage because they don't include the RandomArrayGenerator::Boolean bugfix |
|
I created https://github.com/wesm/arrow/tree/ARROW-8500-comparison for running apples-to-apples benchmark comparisons. For filtering record batches, the new selection vector approach is 5-40x faster. The performance improvement goes up drastically for low-selectivity filters. |
|
+1, awaiting CI |
|
I confirmed locally with the taxi dataset: runtime for a low-selectivity filter (total_amount > 200$, 120k / 1.5b rows) goes from 9s to 3s. Nice improvement. |
|
Good stuff. |
Since I changed Filter on RecordBatch to transform the filter to indices and use Take, I wanted to have a benchmark to compare the before/after performance so this can also be monitored over time. These benchmarks could use some refactoring but this is at least a starting point.
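
To illustrate the idea of transforming the filter to indices and then using Take, here is a minimal self-contained sketch over plain `std::vector` columns. It is not the Arrow implementation: `Batch`, `FilterToIndices`, `Take`, and `FilterBatch` are hypothetical names for this example, and null handling in both the filter and the values is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a record batch: two equal-length columns.
struct Batch {
  std::vector<int64_t> ints;
  std::vector<std::string> strs;
};

// Step 1: convert the boolean filter into the positions of its true values.
std::vector<int64_t> FilterToIndices(const std::vector<bool>& filter) {
  std::vector<int64_t> indices;
  for (int64_t i = 0; i < static_cast<int64_t>(filter.size()); ++i) {
    if (filter[i]) indices.push_back(i);
  }
  return indices;
}

// Step 2: gather the selected positions from a single column.
template <typename T>
std::vector<T> Take(const std::vector<T>& values,
                    const std::vector<int64_t>& indices) {
  std::vector<T> out;
  out.reserve(indices.size());
  for (int64_t i : indices) out.push_back(values[i]);
  return out;
}

// The filter is scanned once; every column then reuses the same indices
// and only touches the selected rows.
Batch FilterBatch(const Batch& batch, const std::vector<bool>& filter) {
  const std::vector<int64_t> indices = FilterToIndices(filter);
  return Batch{Take(batch.ints, indices), Take(batch.strs, indices)};
}
```

In this sketch the cost of interpreting the filter is paid once per batch rather than once per column, and the per-column work scales with the number of selected rows, which is one plausible reason the gain grows when few rows are selected.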