ARROW-9790: [Rust][Parquet]: Increase test coverage in arrow_reader.rs#8009
ARROW-9790: [Rust][Parquet]: Increase test coverage in arrow_reader.rs#8009alamb wants to merge 2 commits intoapache:masterfrom
Conversation
There was a problem hiding this comment.
This is the basic idea -- drive the test with a table that is shared between bool, utf, and fixed_len_binary_array types
There was a problem hiding this comment.
Since you felt the need to add a comment explaining the values, a tiny suggestion is to declare all members of TestOptions as pub and use
TestOptions {
num_row_groups: 3,
num_rows: 25,
record_batch_size: 5,
num_iterations: 50,
}
instead, to increase readability of the values' meaning.
There was a problem hiding this comment.
This is a good idea -- I will do so before turning this into a full on PR for review
3967510 to
f149500
Compare
f149500 to
bc384b3
Compare
| } | ||
|
|
||
| #[test] | ||
| fn test_bool_single_column_reader_test_batch_size_divides_into_row_group_size() { |
There was a problem hiding this comment.
The coverage of these tests was moved into two table entries here: https://github.com/apache/arrow/pull/8009/files#diff-4c103051156d7b901fad8d9e26104932R393
I added two test cases in #8007, which increased coverage. However, upon further review, I noticed the choice of parameters to hit edge conditions didn't cover the string data types.
Rather than adding a bunch more copies of basically the same test to add new parameters for different tests, I instead propose using the same set of parameters for all data types and drive the tests using a table in this PR.
It makes the test logic slightly harder to follow, in my opinion, but it does increase coverage