-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[WIP] Support preceding and following in WINDOW #14233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Support preceding and following in WINDOW #14233
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #14233 +/- ##
============================================
+ Coverage 61.75% 63.80% +2.04%
- Complexity 207 1535 +1328
============================================
Files 2436 2623 +187
Lines 133233 144467 +11234
Branches 20636 22116 +1480
============================================
+ Hits 82274 92172 +9898
- Misses 44911 45494 +583
- Partials 6048 6801 +753
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
| && !upperBound.isUnbounded()); | ||
|
|
||
| if (hasRankBasedAgg) { | ||
| Preconditions.checkState(!hasBound, "Window frame is not supported for ROW_NUMBER, RANK, DENSE_RANK functions"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Postgres seems to allow any window frame for these functions, but the result is the same regardless of the defined window frame (i.e., following the behavior of ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING). Although I think the behavior here of imposing the unbounded window frame condition is probably better to avoid user confusion.
| } | ||
| // For Phase 1 only the default frame is supported | ||
| Preconditions.checkState(!windowGroup.isRows || isRowsOnlyTypeAggregateCall, | ||
| Preconditions.checkState(!windowGroup.isRows || hasRankBasedAgg, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is a rank based aggregation, we already check above that the window frame is unbounded (whether range or rows). So this condition of !windowGroup.isRows || hasRankBasedAgg doesn't really seem to make sense here unless I'm missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line 272-283 was supposed to be removed. I accidentally reverted the removal to compare with the new logic
| private boolean hasRankBasedAgg(ImmutableList<Window.RexWinAggCall> aggCalls) { | ||
| for (Window.RexWinAggCall aggCall : aggCalls) { | ||
| if (RANK_BASED_FUNCTION_KIND.contains(aggCall.getKind())) { | ||
| return true; | ||
| } | ||
| } | ||
| return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just discovered that we probably don't need any of the hasRankBasedAgg checks at all because Calcite itself ensures that functions like RANK / DENSE_RANK / ROW_NUMBER aren't used with a custom ROWS / RANGE window frame - https://github.com/apache/calcite/blob/df2328b7c2834cf7ffbf7305fe2c7aec5d07fe64/core/src/main/java/org/apache/calcite/sql/SqlWindow.java#L669-L674.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. Let's add a test to ensure it doesn't get planned
| // Has rank based aggregation call kind (e.g. ROW_NUMBER, RANK, DENSE_RANK) | ||
| boolean hasRankBasedAgg = hasRankBasedAgg(windowGroup.aggCalls); | ||
|
|
||
| // FOLLOWING lower bound or PRECEDING upper bound is not allowed, and won't reach here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this enforced?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried it and iirc it is banned from parsing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UNBOUNDED PRECEDING is not allowed for upper frame boundary and UNBOUNDED FOLLOWING is not allowed for lower frame boundary. However, n PRECEDING is allowed for upper frame boundary and n FOLLOWING is allowed for lower frame boundary (not at the same time though). This behavior is similar in both Calcite and Postgres.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting! Does it mean it can do something like BETWEEN 5 PRECEDING AND 3 PRECEDING?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. Postgres allows that, Calcite allows that, and with #14249, Pinot will support that as well!
Support the following functionalities in WINDOW function: