add extend_dictionary in dictionary builder for improved performance #6875
Merged
alamb merged 7 commits into apache:main on Dec 19, 2024
rluvaton commented Dec 12, 2024
tustvold reviewed Dec 12, 2024
Member
Author
Hey @tustvold, can you please re-review?
Contributor
I added some small suggestions on how to improve the docstrings, but we could do that as a follow-on PR as well.
Member
Author
Applied.
Contributor
Thanks @rluvaton
CurtHagenlocher pushed a commit to CurtHagenlocher/arrow-rs that referenced this pull request Dec 28, 2024
apache#6875)
* add `extend_dictionary` in dictionary builder for improved performance
* fix extends all nulls
* support null in mapped value
* adding comment
* run `clippy` and `fmt`
* fix ci
* Apply suggestions from code review

Co-authored-by: Andrew Lamb <[email protected]>

---------

Co-authored-by: Andrew Lamb <[email protected]>
Which issue does this PR close?
No issue
Rationale for this change
This improves performance when adding an already-built dictionary to an existing builder, by taking advantage of the fact that we don't need to check the values for each key.
What changes are included in this PR?
Added `extend_dictionary` for `PrimitiveDictionaryBuilder` and for `GenericByteDictionaryBuilder`.
Are there any user-facing changes?
Yes, these are public methods.
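To illustrate the rationale above, here is a minimal std-only sketch of the idea. It is not the arrow-rs implementation: `Dict`, `DictBuilder`, and `key_for` are hypothetical, simplified stand-ins invented for this example, and the real builders operate on Arrow arrays with their own signatures. The point it tries to show is that the per-element path does one hash lookup per row, while the bulk path does one lookup per dictionary value and then only translates keys. Null keys and null dictionary values both become null keys in the builder.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a dictionary-encoded array:
/// each key indexes into `values`; `None` marks a null.
struct Dict {
    keys: Vec<Option<usize>>,
    values: Vec<Option<String>>,
}

/// Hypothetical, simplified builder that deduplicates values as they arrive.
#[derive(Default)]
struct DictBuilder {
    dedup: HashMap<String, usize>,
    values: Vec<String>,
    keys: Vec<Option<usize>>,
}

impl DictBuilder {
    /// Look up (or insert) one value and return its key in this builder.
    fn key_for(&mut self, value: &str) -> usize {
        if let Some(&k) = self.dedup.get(value) {
            return k;
        }
        let k = self.values.len();
        self.values.push(value.to_string());
        self.dedup.insert(value.to_string(), k);
        k
    }

    /// Per-element path: one hash lookup for every row.
    fn extend<'a>(&mut self, rows: impl IntoIterator<Item = Option<&'a str>>) {
        for row in rows {
            let key = row.map(|v| self.key_for(v));
            self.keys.push(key);
        }
    }

    /// Bulk path: one hash lookup per dictionary value, then a plain
    /// index translation for every key.
    fn extend_dictionary(&mut self, dict: &Dict) {
        // Old key -> new key; a null dictionary value maps to a null key.
        let mapping: Vec<Option<usize>> = dict
            .values
            .iter()
            .map(|v| v.as_deref().map(|s| self.key_for(s)))
            .collect();
        self.keys
            .extend(dict.keys.iter().map(|&k| k.and_then(|i| mapping[i])));
    }
}
```

In this sketch the saving grows with the ratio of rows to distinct values: a dictionary with a handful of values and many keys needs only a handful of hash lookups on the bulk path.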