[multistage] Add support for the ranking ROW_NUMBER() window function #10527
This PR adds support for the ranking `ROW_NUMBER()` window function in Apache Pinot. `ROW_NUMBER()` requires `ROW` type window function support rather than the `RANGE` type for which we added support in Phase 1. This PR sets up a potential framework to use `ROW` type window functions but only implements this for `ROW_NUMBER()`.

`ROW_NUMBER()` can be used in the following types of queries:
- `OVER()`(s) with some select column [see limitations below]
- `OVER(ORDER BY)`(s)
- `OVER(PARTITION BY)`(s)
- `OVER(PARTITION BY ORDER BY)`(s)

There are some limitations with `ROW_NUMBER()` which are:
- `ROW`. Due to the lack of support for multiple window groups today, `ROW_NUMBER()` cannot be combined with other window aggregation functions in the same query.
- `OVER()` without any other column results in Apache Calcite not projecting any columns. E.g. query: `SELECT ROW_NUMBER() OVER() from table;`. I've added a TODO to look into how to get Apache Calcite to project at least one column in this scenario.
- `ProjectWindowTransposeRule` to better understand what's happening here. Basically it tries to push a `Project` below the `Window`, but finds no input fields referenced. Due to this it creates an empty `Project` below the `Window`. The `Project` above the `Window` gets marked as trivial and is removed, resulting in the following type of plan:

The design document and issue for window functions support can be found below:
Prior Phase 1 PRs related to window functions:
- empty `OVER()` and `OVER(PARTITION BY)`: [multistage] Initial (phase 1) Query runtime for window functions - empty OVER() and OVER(PARTITION BY) #10286
- `SortExchange`: [multistage] Implement ordering for SortExchange #10408
- `OVER(ORDER BY)` and `OVER(PARTITION BY ORDER BY)`: [multistage] Initial (phase 1) Query runtime for window functions with ORDER BY within the OVER() clause #10449

cc @siddharthteotia @walterddr @vvivekiyer @ankitsultana
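
For reviewers less familiar with the semantics, here is a minimal, standalone sketch of what `ROW_NUMBER()` computes for the supported `OVER(...)` shapes. It is not the operator code from this PR: the class `RowNumberSketch`, the method `rowNumbers`, and the `int[]` row representation are all hypothetical names chosen for illustration. It shows why a `ROW` type frame is needed: rows that tie on the `ORDER BY` key still receive distinct numbers. Ties here keep input order because the sort is stable, which is an assumption of the sketch; SQL leaves the order among ties unspecified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; not part of the Pinot code base.
class RowNumberSketch {

  /**
   * Returns an array where element i is the ROW_NUMBER() of input row i.
   * partitionBy: maps a row to its PARTITION BY key (use a constant for an empty OVER()).
   * orderBy: the ORDER BY comparator, or null when the OVER() clause has no ORDER BY.
   */
  static <T, K> long[] rowNumbers(List<T> rows, Function<T, K> partitionBy, Comparator<T> orderBy) {
    // Group input row indexes by partition key, keeping first-seen partition order.
    Map<K, List<Integer>> partitions = new LinkedHashMap<>();
    for (int i = 0; i < rows.size(); i++) {
      partitions.computeIfAbsent(partitionBy.apply(rows.get(i)), k -> new ArrayList<>()).add(i);
    }

    long[] result = new long[rows.size()];
    for (List<Integer> indexes : partitions.values()) {
      if (orderBy != null) {
        // List.sort is stable, so tied rows keep their input order (sketch assumption).
        indexes.sort((a, b) -> orderBy.compare(rows.get(a), rows.get(b)));
      }
      // Numbering restarts at 1 for every partition and increases by one per row,
      // even when neighbouring rows tie on the ORDER BY key.
      long next = 1;
      for (int index : indexes) {
        result[index] = next++;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Each row is {partitionKey, orderKey}.
    List<int[]> rows = List.of(
        new int[] {1, 30}, new int[] {2, 10}, new int[] {1, 10},
        new int[] {1, 10}, new int[] {2, 5});

    // Equivalent in spirit to ROW_NUMBER() OVER(PARTITION BY col0 ORDER BY col1).
    long[] numbers = rowNumbers(rows, row -> row[0], Comparator.comparingInt(row -> row[1]));
    System.out.println(Arrays.toString(numbers)); // prints [3, 2, 1, 2, 1]
  }
}
```

Passing a constant partition key and a `null` comparator models the empty `OVER()` case, where rows are simply numbered in arrival order.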