ARROW-16590: [C++] Consolidate files dealing with row-major storage, add some helper methods#13172
Closed
westonpace wants to merge 9 commits intoapache:masterfrom
Closed
Conversation
|
|
…sts for encoding and decoding rows as well as a test for comparing column-major data with row-major data.
da41628 to
ba12892
Compare
Member
Author
|
I'm still chasing a few things around with lint/format and there is a, possibly legitimate and existing, bug with big endian architectures, but I think this is close enough to start review. |
Member
Author
|
CC @michalursa |
Member
Author
Member
Author
|
The second half of this PR has been moved to #13220 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR is the second PR in a series to merge in #12326. The first was #12872. Some of this is new functionality (from michalursa) and can be found in 7daafeb. The new behavior is mostly in
RowTableandRowTableMerge. In addition to these new types this PR moves around, and renames, a lot of existing work revolving aroundKeyRowArray.The primary goal of this refactor of old code was to improve the readability and clarity of the code base. I did not make any functional changes to the code and if any functional changes are suggested which modify existing code I will happily discuss them here but defer the changes themselves to follow-up PRs. I would very much appreciate any feedback on naming, making sure we have sufficient test coverage, and overall layout of the code.
It's a very large change but most of it is moving things around. What I'd like input on most:
arrow/compute/rowand put all row-based utilities in here. Most of the files are now marked as_internaland the content in these files is not used outside ofarrow/compute/row. The grouper had previously been alongside the kernel code and it didn't really belong there. It also relies very heavily on the internal structure of the row encoding.arrow/compute/row/row_internal.hand would appreciate review of the content here.arrow/compute/row/grouper.h, which is more or less unchanged, andarrow/compute/encode.hwhich is mostly new code from 7daafeb. We have quite a few tests exercisingGrouperand I was unable to easily extract them from the aggregate kernel tests so I left the tests alone. I added tests to the new utilities inencode.h. These utilities are not yet properly external because they depend onRowTableImpland I would need to migrate them to the PIMPL pattern to remove this dependency. Since that would be a functional change I think it would be best to wait.