ARROW-764: [C++] Improves performance of CopyBitmap and adds benchmarks#1422
ARROW-764: [C++] Improves performance of CopyBitmap and adds benchmarks#1422seibs wants to merge 1 commit intoapache:masterfrom
Conversation
xhochy
left a comment
There was a problem hiding this comment.
+1, thank you for taking a look at this. There are two code formatting issues reported by Travis. Can you fix these two?
cpp/src/arrow/util/CMakeLists.txt
Outdated
There was a problem hiding this comment.
Can you remove these reindentations?
e5dd129 to
74e4219
Compare
|
Thanks for taking a look @xhochy. My latest push fixes the formatting issues and also adds a test+fix for a few corner cases my first upload didn't handle correctly. Benchmark with recent changes w.r.t. the comments above in |
|
@seibs Can resolve the two comments in the other pull request? I would then merge both :) |
wesm
left a comment
There was a problem hiding this comment.
+1, rebased. Will merge once build passes
|
Rebased again to see if build passes... |
|
What are the general performance concerns we are addressing here? I am just trying to understand if there is something similar to be done on JAVA side for bitvector. |
cpp/src/arrow/util/bit-util.cc
Outdated
There was a problem hiding this comment.
We should write down explicit types for these autos
There was a problem hiding this comment.
Replaced the autos with types.
|
@siddharthteotia there's a couple places in the codebase where we copy bits out of a bitmap, e.g. after slicing vectors arrow/cpp/src/arrow/ipc/writer.cc Line 63 in 2e3832f |
wesm
left a comment
There was a problem hiding this comment.
+1. Build failing due to some standard flakiness
I took a swing at improving the CopyBitmap performance (benchmarks below). I'm a C/C++ novice, so I thought I'd get some feedback before I went too much further.
Starting Point
Using stanford bithacks for SetBitTo
memcpy + shifting
This solution provides varying performance depending on whether or not the bit offset is a multiple of 8