You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Spark supports only i32 indixed arrays but Comet attempts to support both ListArray (i32 indexed) and LargeListArray (i64 indexed). As a result the code in list.rs contains additional complexity that is not reachable and is not tested anyhow.
What changes are included in this PR?
Refactoring of the list.rs, ListExtract, GetArrayStructField and ArrayInsert now accept only ListArray instead of GenericListArray
@andygrove As you mentioned in #1073, it would be interesting to remove all logic related to LargeList and check if there is any regression. I did just that and all tests passed.
The difference I think is that a LargeList can store more than Integer.MAX_VALUE entries in all rows in a single batch, so if you have multiple Spark rows all with the max num of rows supported, it wouldn't fit into an Arrow List array. That would probably need to be supported elsewhere, but it may be worth keeping the LargeList handling around in case that scenario is supported? And other DataFusion expressions might return a LargeList even if it doesn't come directly from Spark? Does the native Parquet reader ever use a LargeList?
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Closes #1114
Start of the discussion: comment in the PR1073
Rationale for this change
Spark supports only i32 indixed arrays but Comet attempts to support both ListArray (i32 indexed) and LargeListArray (i64 indexed). As a result the code in
list.rscontains additional complexity that is not reachable and is not tested anyhow.What changes are included in this PR?
Refactoring of the
list.rs,ListExtract,GetArrayStructFieldandArrayInsertnow accept onlyListArrayinstead ofGenericListArrayHow are these changes tested?
All existing tests.