ARROW-9115: [C++] Implementation of ascii_lower/ascii_upper by processing input data buffers in batch#7418
Closed
wesm wants to merge 4 commits intoapache:masterfrom
Closed
ARROW-9115: [C++] Implementation of ascii_lower/ascii_upper by processing input data buffers in batch#7418wesm wants to merge 4 commits intoapache:masterfrom
wesm wants to merge 4 commits intoapache:masterfrom
Conversation
Member
Author
Member
|
I get similar numbers here. It seems to be a 15x speedup over git master. |
pitrou
reviewed
Jun 12, 2020
Member
|
Before: After: |
cyb70289
reviewed
Jun 12, 2020
| const ArrayData& input = *batch[0].array(); | ||
| ArrayData* out_arr = out->mutable_array(); | ||
| // Reuse offsets from input | ||
| out_arr->buffers[1] = input.buffers[1]; |
Contributor
There was a problem hiding this comment.
These buffers[1], buffers[2] are mysterious to me. Any hint to figure it out? Thanks.
Member
Author
Contributor
|
Excellent, I was planning to look at this next week. But this probably saved me quite some time, and gives me some more examples, thanks. |
Member
Author
|
Sounds good. I'll probably implement some more example kernels soon that have to process each value individually and be more efficient than the prior examples (by not copying anything if it isn't necessary, etc). Thinking strip/rstrip/lstrip |
maartenbreddels
added a commit
to maartenbreddels/arrow
that referenced
this pull request
Jun 15, 2020
Following up on apache#7418 I tried and benchmarked a different way for * ascii_lower * ascii_upper Before (lower is similar): ``` -------------------------------------------------- Benchmark Time CPU Iterations -------------------------------------------------- AsciiUpper_median 4922843 ns 4918961 ns 10 bytes_per_second=3.1457G/s items_per_second=213.17M/s ``` After: ``` -------------------------------------------------- Benchmark Time CPU Iterations -------------------------------------------------- AsciiUpper_median 1391272 ns 1390014 ns 10 bytes_per_second=11.132G/s items_per_second=754.363M/s ``` This is a 3.7x speedup (on a AMD machine). Using http://quick-bench.com/JaDErmVCY23Z1tu6YZns_KBt0qU I found 4.6x speedup for clang 9, 6.4x for GCC 9.2. Also, the test is expanded a bit to include a non-ascii codepoint, to make explicit it is fine to upper or lower case a utf8 string. The non-overlap encoding of utf8 make this ok (see section 2.5 of Unicode Standard Core Specification v13.0).
pitrou
pushed a commit
that referenced
this pull request
Jun 15, 2020
Following up on #7418 I tried and benchmarked a different way for * ascii_lower * ascii_upper Before (lower is similar): ``` -------------------------------------------------- Benchmark Time CPU Iterations -------------------------------------------------- AsciiUpper_median 4922843 ns 4918961 ns 10 bytes_per_second=3.1457G/s items_per_second=213.17M/s ``` After: ``` -------------------------------------------------- Benchmark Time CPU Iterations -------------------------------------------------- AsciiUpper_median 1391272 ns 1390014 ns 10 bytes_per_second=11.132G/s items_per_second=754.363M/s ``` This is a 3.7x speedup (on a AMD machine). Using http://quick-bench.com/JaDErmVCY23Z1tu6YZns_KBt0qU I found 4.6x speedup for clang 9, 6.4x for GCC 9.2. Also, the test is expanded a bit to include a non-ascii codepoint, to make explicit it is fine to upper or lower case a utf8 string. The non-overlap encoding of utf8 make this ok (see section 2.5 of Unicode Standard Core Specification v13.0). Closes #7434 from maartenbreddels/ARROW-9131 Authored-by: Maarten A. Breddels <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Following on discussion in #7357. I added a simple benchmark also.