Open
Description
FileScanConfig::try_pushdown_sort could support re-sorting or re-arranging the FileGroups themselves using min/max statistics to satisfy the query's preferred sort order.
This is described in section 5.3 of Pruning in Snowflake: Working Smarter, Not Harder.
Some considerations are:
- If we start rebuilding groups, what should the parallelism be? On the one hand it would make sense to try to match the original parallelism; on the other hand that may not be possible (e.g. if we can only satisfy the sort ordering by making groups [[f1, f2, f3], [f4]], maybe it's worth it to have lopsided groups, or fewer or more groups) or even optimal (in a TopK query reduced parallelism can lead to faster queries if we end up only scanning 1 group or even 1 file, since all of the work opening the others is wasted effort; this is also known as ProgressiveEval and is discussed in "Analysis to support SortPreservingMerge --> ProgressiveEval" #15191).
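To make the regrouping idea concrete, here is a minimal sketch of one possible approach: sort files by the minimum value of the sort column, then place each file into the first group whose last file ends at or before it starts. `FileStats` and `regroup_by_stats` are hypothetical names invented for this illustration, not DataFusion types or APIs, and the sketch assumes a single ascending `i64` sort column with exact min/max statistics.

```rust
/// Hypothetical stand-in for a file plus the min/max statistics of the
/// sort column. This is not a DataFusion type.
#[derive(Debug, Clone, PartialEq)]
struct FileStats {
    name: &'static str,
    min: i64,
    max: i64,
}

/// Rebuild groups so that, within each group, files do not overlap on the
/// sort column and appear in ascending order. The number of groups is
/// whatever the overlaps require, so it may differ from the original
/// parallelism and the groups may be lopsided.
fn regroup_by_stats(mut files: Vec<FileStats>) -> Vec<Vec<FileStats>> {
    // Visit files in ascending order of their minimum value.
    files.sort_by_key(|f| (f.min, f.max));

    let mut groups: Vec<Vec<FileStats>> = Vec::new();
    for file in files {
        // First group whose last file ends at or before this file starts.
        let slot = groups
            .iter()
            .position(|g| g.last().map_or(true, |last| last.max <= file.min));
        match slot {
            Some(i) => groups[i].push(file),
            // The file overlaps the tail of every existing group.
            None => groups.push(vec![file]),
        }
    }
    groups
}
```

With f1 = [0, 10], f2 = [11, 20], f3 = [21, 30] and f4 = [5, 25], this produces the lopsided grouping [[f1, f2, f3], [f4]] from the example above. It does not try to cap or match a target partition count; that policy is exactly the open question raised in this issue.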