Add support for defining custom window frame bounds for window functions #14273
…indow.aggregate to org.apache.pinot.query.runtime.operator.window
…fficient sliding window based aggregations for aggregate window functions
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #14273 +/- ##
============================================
+ Coverage 61.75% 63.76% +2.01%
- Complexity 207 1555 +1348
============================================
Files 2436 2659 +223
Lines 133233 145441 +12208
Branches 20636 22218 +1582
============================================
+ Hits 82274 92744 +10470
- Misses 44911 45843 +932
- Partials 6048 6854 +806
…aggregator is not needed
Jackie-Jiang
left a comment
Well done!
...nner/src/main/java/org/apache/pinot/calcite/rel/rules/PinotWindowExchangeNodeInsertRule.java
pinot-query-planner/src/main/java/org/apache/pinot/query/planner/plannode/WindowNode.java
...n/java/org/apache/pinot/query/runtime/operator/window/aggregate/AggregateWindowFunction.java
.../java/org/apache/pinot/query/runtime/operator/window/aggregate/MaxWindowValueAggregator.java
Outdated
.../java/org/apache/pinot/query/runtime/operator/window/aggregate/SumWindowValueAggregator.java
Outdated
…indow value aggregators; use primitive double in sum window value aggregator; update condition for removal support in aggregate window function; throw exception if unable to read literal offset value for window bounds
yashmayya
left a comment
Thanks for the review Jackie! I've made the requested changes.
- `FIRST_VALUE` / `LAST_VALUE` assume that the window frame is always `ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING` even though the default window frame as per standard SQL is `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`. Furthermore, support for defining the lower bound explicitly as `UNBOUNDED PRECEDING` / `CURRENT ROW` / `n FOLLOWING` / `n PRECEDING` and the upper bound as `UNBOUNDED FOLLOWING` / `CURRENT ROW` / `n FOLLOWING` / `n PRECEDING` does not exist.
- … `ROWS` window frames, and also adds support for `UNBOUNDED PRECEDING` / `CURRENT ROW` / `UNBOUNDED FOLLOWING` bounds for `RANGE` window frames. There are a ton of edge cases to be handled here but this patch attempts to add test cases to cover most of these scenarios.
- … `ROWS` and `RANGE` based window frame bounds, whereas Postgres also supports `GROUPS`.
- … `WindowValueAggregator` interface along with implementations for `SUM`, `COUNT`, `MIN`, `MAX`, `BOOLAND`, `BOOLOR`. These use sliding window based aggregation algorithms to efficiently compute aggregations for the aggregate type window functions. This replaces the older `Merger` interface and ensures that the time complexity for computing aggregate window functions for `ROWS` based windows is `O(N)` and not `O(N * K)` (where `N` is the total number of rows and `K` is the window size), which would've been the case if we continued using the `Merger` based approach. We still need to add support for type specific window value aggregators (which was also the case with the mergers), so this isn't a regression.
- … `SUM`, `COUNT`, `MIN`, `MAX` etc.) and `FIRST_VALUE` / `LAST_VALUE`. The other window functions currently supported by Pinot (`LAG`, `LEAD`, `RANK`, `DENSE_RANK`, `ROW_NUMBER`) don't support custom window frame bounds and Calcite ensures that during query planning.
- … `UNBOUNDED FOLLOWING` / upper bound isn't `UNBOUNDED PRECEDING`, lower bound isn't `n FOLLOWING` if upper bound is `n PRECEDING` and vice versa etc.
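
To illustrate the `O(N)` versus `O(N * K)` point for `ROWS` based windows, here is a minimal standalone sketch of a sliding window `MAX` using a monotonic deque. It is not the code from this PR and does not use the actual `WindowValueAggregator` interface; the class name `SlidingMaxSketch`, the method `slidingMax`, and its parameters are hypothetical, and it assumes a single partition with a frame of the form `ROWS BETWEEN p PRECEDING AND f FOLLOWING` where both offsets are non-negative.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical illustration only; not Pinot's implementation.
class SlidingMaxSketch {

  // Computes, for every row i, the max over rows [i - preceding, i + following],
  // clamped to the partition. Assumes preceding >= 0 and following >= 0.
  static double[] slidingMax(double[] values, int preceding, int following) {
    int n = values.length;
    double[] result = new double[n];
    // Holds row indices whose values are strictly decreasing from front to back,
    // so the front is always the index of the current window's maximum.
    Deque<Integer> candidates = new ArrayDeque<>();
    int next = 0; // next row index that has not yet entered the window

    for (int i = 0; i < n; i++) {
      int upper = Math.min(n - 1, i + following);
      int lower = Math.max(0, i - preceding);

      // Add rows that enter the frame; drop smaller-or-equal candidates
      // from the back since they can never be the maximum again.
      while (next <= upper) {
        while (!candidates.isEmpty() && values[candidates.peekLast()] <= values[next]) {
          candidates.pollLast();
        }
        candidates.addLast(next);
        next++;
      }

      // Remove rows that have left the frame from the front.
      while (candidates.peekFirst() < lower) {
        candidates.pollFirst();
      }

      result[i] = values[candidates.peekFirst()];
    }
    return result;
  }

  public static void main(String[] args) {
    double[] out = slidingMax(new double[] {1, 3, 2, 5, 4}, 1, 1);
    System.out.println(Arrays.toString(out)); // prints [3.0, 3.0, 5.0, 5.0, 5.0]
  }
}
```

Each row index is added to and removed from the deque at most once, so the total work is linear in the number of rows regardless of the frame size. Recomputing the aggregate from scratch for every row would instead touch up to `K` rows per output row.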