Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently, the row group filter in DataFusion passes a closure to arrow-rs:
fn build_row_group_predicate(
    pruning_predicate: &PruningPredicate,
    metrics: ParquetFileMetrics,
) -> Box<dyn FnMut(&RowGroupMetaData, usize) -> bool> {
For the page filter, DataFusion would similarly define a filter_predicate closure of this type:
Box<dyn FnMut(&[pageIndex], &[pageLocation], usize) -> &[bool]>
DataFusion returns a mask (&[bool], one entry per page) to arrow-rs.
arrow-rs then passes the mask to compute_row_ranges to construct RowRanges: the row ranges selected within a row group for one column. If the column is sorted, the vec will have size 1.
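To make the mask-to-ranges step concrete, here is a minimal sketch of what compute_row_ranges could look like. The signature, the PageStart struct (a stand-in for the page location metadata), and the total_rows parameter are assumptions for illustration, not the actual arrow-rs API.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the Parquet page location metadata:
/// only the index of the first row in the page is needed here.
#[derive(Debug, Clone, Copy)]
pub struct PageStart {
    pub first_row_index: usize,
}

/// Turn a per-page keep/skip mask into merged, half-open row ranges.
/// `total_rows` is the number of rows in the row group (assumed input).
pub fn compute_row_ranges(
    mask: &[bool],
    pages: &[PageStart],
    total_rows: usize,
) -> Vec<Range<usize>> {
    assert_eq!(mask.len(), pages.len(), "one mask entry per page");
    let mut ranges: Vec<Range<usize>> = Vec::new();
    for (i, (&keep, page)) in mask.iter().zip(pages).enumerate() {
        if !keep {
            continue;
        }
        let start = page.first_row_index;
        // A page ends where the next one starts, or at the end of the row group.
        let end = pages
            .get(i + 1)
            .map_or(total_rows, |next| next.first_row_index);
        // Extend the previous range when the selected pages are adjacent.
        if let Some(last) = ranges.last_mut() {
            if last.end == start {
                last.end = end;
                continue;
            }
        }
        ranges.push(start..end);
    }
    ranges
}
```

With four pages of 100 rows each, a mask of [false, true, true, false] collapses to the single range 100..300, which is the size-1 case expected when the matching pages are contiguous.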
Combining multiple filters:
If two filters are connected with AND, use RowRanges::intersection to get the final RowRanges; if they are connected with OR, use RowRanges::union.
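The combination rules above can be sketched as follows. This RowRanges type is a hypothetical illustration that assumes each instance holds sorted, non-overlapping, half-open ranges; only the method names intersection and union come from this proposal.

```rust
use std::ops::Range;

/// Hypothetical container of sorted, non-overlapping row ranges.
#[derive(Debug, Clone, PartialEq)]
pub struct RowRanges(pub Vec<Range<usize>>);

impl RowRanges {
    /// Rows selected by both filters (AND).
    pub fn intersection(&self, other: &RowRanges) -> RowRanges {
        let (a, b) = (&self.0, &other.0);
        let (mut i, mut j) = (0, 0);
        let mut out = Vec::new();
        while i < a.len() && j < b.len() {
            let start = a[i].start.max(b[j].start);
            let end = a[i].end.min(b[j].end);
            if start < end {
                out.push(start..end);
            }
            // Advance whichever range finishes first.
            if a[i].end <= b[j].end {
                i += 1;
            } else {
                j += 1;
            }
        }
        RowRanges(out)
    }

    /// Rows selected by either filter (OR).
    pub fn union(&self, other: &RowRanges) -> RowRanges {
        let mut all: Vec<Range<usize>> =
            self.0.iter().chain(other.0.iter()).cloned().collect();
        all.sort_by_key(|r| r.start);
        let mut out: Vec<Range<usize>> = Vec::new();
        for r in all {
            // Merge with the previous range when they overlap or touch.
            if let Some(last) = out.last_mut() {
                if r.start <= last.end {
                    last.end = last.end.max(r.end);
                    continue;
                }
            }
            out.push(r);
        }
        RowRanges(out)
    }
}
```

Both operations are linear merges over sorted inputs (the union sorts first for simplicity), so combining per-column results should stay cheap relative to decoding the pages themselves.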