feat: Add support for Timestamp data types in data page statistics.#11123
Conversation
8c6feae to
c63875b
Compare
| Ok(match unit { | ||
| TimeUnit::Second => { | ||
| Arc::new(match timezone { | ||
| Some(tz) => TimestampSecondArray::from_iter(iter).with_timezone(tz.clone()), |
There was a problem hiding this comment.
Not tested, but perhaps with_timezone_opt could reduce some the repetitiveness of the code.
| Some(tz) => TimestampNanosecondArray::from_iter(iter).with_timezone(tz.clone()), | ||
| None => TimestampNanosecondArray::from_iter(iter), | ||
| }) | ||
| } |
There was a problem hiding this comment.
Applied simplification using with_timezone_opt to the get_statistics macro as well.
| @@ -713,6 +692,15 @@ macro_rules! get_data_page_statistics { | |||
| )), | |||
| Some(DataType::Float32) => Ok(Arc::new(Float32Array::from_iter([<$stat_type_prefix Float32DataPageStatsIterator>]::new($iterator).flatten()))), | |||
| Some(DataType::Float64) => Ok(Arc::new(Float64Array::from_iter([<$stat_type_prefix Float64DataPageStatsIterator>]::new($iterator).flatten()))), | |||
There was a problem hiding this comment.
Should this repeated code be extracted? I assume it would need to be put into a macro?
There was a problem hiding this comment.
Which repeated code do you mean?
If you wnated to more fully macroize this table and avoid repetition that sounds like a great idea to me (though perhaps as a follow on PR)
There was a problem hiding this comment.
The four lines in the match unit expression are exactly the same in both macros.
| @@ -713,6 +692,15 @@ macro_rules! get_data_page_statistics { | |||
| )), | |||
| Some(DataType::Float32) => Ok(Arc::new(Float32Array::from_iter([<$stat_type_prefix Float32DataPageStatsIterator>]::new($iterator).flatten()))), | |||
| Some(DataType::Float64) => Ok(Arc::new(Float64Array::from_iter([<$stat_type_prefix Float64DataPageStatsIterator>]::new($iterator).flatten()))), | |||
There was a problem hiding this comment.
Which repeated code do you mean?
If you wnated to more fully macroize this table and avoid repetition that sounds like a great idea to me (though perhaps as a follow on PR)
| TimeUnit::Second => Arc::new(TimestampSecondArray::from_iter(iter).with_timezone_opt(timezone.clone())), | ||
| TimeUnit::Millisecond => Arc::new(TimestampMillisecondArray::from_iter(iter).with_timezone_opt(timezone.clone())), | ||
| TimeUnit::Microsecond => Arc::new(TimestampMicrosecondArray::from_iter(iter).with_timezone_opt(timezone.clone())), | ||
| TimeUnit::Nanosecond => Arc::new(TimestampNanosecondArray::from_iter(iter).with_timezone_opt(timezone.clone())), |
…pache#11123) * feat: Add support for Timestamp data types in data page statistics. * Simplify array creation using with_timezone_opt. --------- Co-authored-by: Eric Fredine <[email protected]>
Which issue does this PR close?
Closes #11112.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Covered by existing tests.
Are there any user-facing changes?