Fuzz test different parquet encodings#1156
Conversation
parquet/src/arrow/arrow_reader.rs
Outdated
There was a problem hiding this comment.
This seems to not be supported at the moment
There was a problem hiding this comment.
Looks like that came in with the original array reader in #384 from @yordan-pavlov
Do you think it would be a good exercise to support DELTA_LENGTH_BYTE_ARRAY in the reader? If so, I can file a ticket and see if anyone else is interested in picking it up
36a70db to
96a2d4a
Compare
| ConvertedType::NONE, | ||
| None, | ||
| &converter, | ||
| &[Encoding::PLAIN, Encoding::RLE_DICTIONARY], |
There was a problem hiding this comment.
Is the reason not to include DELTA_BINARY_PACKED here that that encoding is not supported for FixedLengthByteArrays?
There was a problem hiding this comment.
That was at least my interpretation of https://github.com/apache/parquet-format/blob/master/Encodings.md
FIXED_LEN_BYTE_ARRAY only seems support PLAIN and dictionary encoding
parquet/src/arrow/arrow_reader.rs
Outdated
There was a problem hiding this comment.
Looks like that came in with the original array reader in #384 from @yordan-pavlov
Do you think it would be a good exercise to support DELTA_LENGTH_BYTE_ARRAY in the reader? If so, I can file a ticket and see if anyone else is interested in picking it up
| max_data_page_size: 1024 * 1024, | ||
| max_dict_page_size: 1024 * 1024, | ||
| writer_version: WriterVersion::PARQUET_1_0, | ||
| ..Default::default() |
There was a problem hiding this comment.
I double checked that these are the same values as in impl Default for TestOptions 👍
Which issue does this PR close?
Relates to #1053
Rationale for this change
Improved test coverage
What changes are included in this PR?
#1110 extended the fuzz tests to support different array types, nulls and multiple page row groups. This builds on this to also test different encodings
Are there any user-facing changes?
No, this PR only adds more tests