Expose SortingColumn in parquet files#3103
tustvold merged 9 commits into apache:master from askoa:sorting-column
tustvold left a comment:
Could we get an end-to-end test of this, i.e. write a parquet file to a Vec<u8>, then read it back and verify that the sorting columns were round-tripped correctly?
parquet/src/file/metadata.rs
Outdated
Suggested change:
-        value: Option<Vec<SortingColumnMetaData>>,
-    ) -> Self {
-        self.sorting_columns = value;
+        value: Vec<SortingColumnMetaData>,
+    ) -> Self {
+        self.sorting_columns = Some(value);
This is consistent with set_page_offset
I received the opposite feedback in an earlier review. The reviewer's reasoning was that if we remove Option from the signature, the function can no longer be used to set this field to None. I am going to keep this as is.
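The trade-off discussed in this thread can be sketched in isolation. The block below is a minimal illustration, not the parquet crate's code: `SortingColumn`, `RowGroupBuilder`, and `set_sorting_columns_vec` are hypothetical stand-ins introduced only to contrast an Option-taking setter with a Vec-taking one.

```rust
// Hypothetical stand-in for a sort-order entry; not the parquet crate's type.
#[derive(Debug, Clone, PartialEq)]
struct SortingColumn {
    column_idx: i32,
    descending: bool,
    nulls_first: bool,
}

// Hypothetical builder holding an optional list of sorting columns.
#[derive(Debug, Default)]
struct RowGroupBuilder {
    sorting_columns: Option<Vec<SortingColumn>>,
}

impl RowGroupBuilder {
    // Option-taking setter: the caller can both set the field and reset it to None.
    fn set_sorting_columns(mut self, value: Option<Vec<SortingColumn>>) -> Self {
        self.sorting_columns = value;
        self
    }

    // Vec-taking setter: shorter at the call site, but it always stores Some(..),
    // so it cannot be used to clear a previously set value.
    fn set_sorting_columns_vec(mut self, value: Vec<SortingColumn>) -> Self {
        self.sorting_columns = Some(value);
        self
    }
}
```

With the Option-taking form, calling `set_sorting_columns(None)` on a builder that already holds a value clears it; the Vec-taking form has no equivalent call.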
Thank you 👍
Benchmark runs are scheduled for baseline = 8bb2917 and contender = 371ec57. 371ec57 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Which issue does this PR close?
Closes #3090
Reading from file:
The function footer.rs#decode_metadata reads sorting_columns from the file. However, the function RowGroupMetaData::from_thrift was not reading the field into RowGroupMetaData. The function was modified to read sorting_columns into RowGroupMetaData.
arrow-rs/parquet/src/file/footer.rs, lines 74 to 82 in 430eb84
Writing into file:
The function format.rs#RowGroup#write_to_out_protocol writes sorting_columns to the file. However, the function metadata.rs#RowGroupMetaData#to_thrift was not writing the field to RowGroup. The function was modified to write sorting_columns from RowGroupMetaData to RowGroup.
arrow-rs/parquet/src/format.rs, lines 4155 to 4162 in 430eb84
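The shape of the fix on both paths can be sketched with stand-in types. This is an illustration under stated assumptions rather than the actual arrow-rs code: `ThriftSortingColumn`, `ThriftRowGroup`, `RowGroupMeta`, and `round_trip` are hypothetical names, and the real structs carry many more fields. The point is only that the sorting columns must be copied in both conversion directions for a round trip to preserve them.

```rust
// Hypothetical stand-in for the Thrift-level sort-order entry.
#[derive(Debug, Clone, PartialEq)]
struct ThriftSortingColumn {
    column_idx: i32,
    descending: bool,
    nulls_first: bool,
}

// Hypothetical stand-in for the Thrift-level row group that is serialized to the file.
#[derive(Debug, Clone, PartialEq)]
struct ThriftRowGroup {
    num_rows: i64,
    sorting_columns: Option<Vec<ThriftSortingColumn>>,
}

// Hypothetical stand-in for the in-memory row group metadata.
#[derive(Debug, Clone, PartialEq)]
struct RowGroupMeta {
    num_rows: i64,
    sorting_columns: Option<Vec<ThriftSortingColumn>>,
}

impl RowGroupMeta {
    // Read path: carry the sorting columns over instead of dropping them.
    fn from_thrift(rg: ThriftRowGroup) -> Self {
        RowGroupMeta {
            num_rows: rg.num_rows,
            sorting_columns: rg.sorting_columns,
        }
    }

    // Write path: copy the sorting columns into the Thrift-level struct.
    fn to_thrift(&self) -> ThriftRowGroup {
        ThriftRowGroup {
            num_rows: self.num_rows,
            sorting_columns: self.sorting_columns.clone(),
        }
    }
}

// Converts to the Thrift-level form and back, as a write followed by a read would.
fn round_trip(meta: &RowGroupMeta) -> RowGroupMeta {
    RowGroupMeta::from_thrift(meta.to_thrift())
}
```

If either conversion omitted the `sorting_columns` field, `round_trip` would return metadata with the field set to None even when the input had sorting columns, which is the behaviour the description above reports for the code before this change.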