Skip to content

Conversation

@Jackie-Jiang
Copy link
Contributor

Support the following functionalities in WINDOW function:

  • ROWS with bounded preceding/following
  • IGNORE NULLS in aggregates

@Jackie-Jiang Jackie-Jiang added feature multi-stage Related to the multi-stage query engine labels Oct 15, 2024
@codecov-commenter
Copy link

codecov-commenter commented Oct 15, 2024

Codecov Report

Attention: Patch coverage is 67.64706% with 22 lines in your changes missing coverage. Please review.

Project coverage is 63.80%. Comparing base (59551e4) to head (bac36dc).
Report is 1186 commits behind head on master.

Files with missing lines Patch % Lines
...e/rel/rules/PinotWindowExchangeNodeInsertRule.java 67.64% 10 Missing and 12 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14233      +/-   ##
============================================
+ Coverage     61.75%   63.80%   +2.04%     
- Complexity      207     1535    +1328     
============================================
  Files          2436     2623     +187     
  Lines        133233   144467   +11234     
  Branches      20636    22116    +1480     
============================================
+ Hits          82274    92172    +9898     
- Misses        44911    45494     +583     
- Partials       6048     6801     +753     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.78% <67.64%> (+2.07%) ⬆️
java-21 63.69% <67.64%> (+2.06%) ⬆️
skip-bytebuffers-false 63.79% <67.64%> (+2.04%) ⬆️
skip-bytebuffers-true 63.67% <67.64%> (+35.94%) ⬆️
temurin 63.80% <67.64%> (+2.04%) ⬆️
unittests 63.79% <67.64%> (+2.04%) ⬆️
unittests1 55.53% <67.64%> (+8.64%) ⬆️
unittests2 34.30% <5.88%> (+6.57%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

&& !upperBound.isUnbounded());

if (hasRankBasedAgg) {
Preconditions.checkState(!hasBound, "Window frame is not supported for ROW_NUMBER, RANK, DENSE_RANK functions");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Postgres seems to allow any window frame for these functions, but the result is the same regardless of the defined window frame (i.e., following the behavior of ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING). Although I think the behavior here of imposing the unbounded window frame condition is probably better to avoid user confusion.

}
// For Phase 1 only the default frame is supported
Preconditions.checkState(!windowGroup.isRows || isRowsOnlyTypeAggregateCall,
Preconditions.checkState(!windowGroup.isRows || hasRankBasedAgg,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is a rank based aggregation, we already check above that the window frame is unbounded (whether range or rows). So this condition of !windowGroup.isRows || hasRankBasedAgg doesn't really seem to make sense here unless I'm missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 272-283 was supposed to be removed. I accidentally reverted the removal to compare with the new logic

Comment on lines +286 to +292
private boolean hasRankBasedAgg(ImmutableList<Window.RexWinAggCall> aggCalls) {
for (Window.RexWinAggCall aggCall : aggCalls) {
if (RANK_BASED_FUNCTION_KIND.contains(aggCall.getKind())) {
return true;
}
}
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just discovered that we probably don't need any of the hasRankBasedAgg checks at all because Calcite itself ensures that functions like RANK / DENSE_RANK / ROW_NUMBER aren't used with a custom ROWS / RANGE window frame - https://github.com/apache/calcite/blob/df2328b7c2834cf7ffbf7305fe2c7aec5d07fe64/core/src/main/java/org/apache/calcite/sql/SqlWindow.java#L669-L674.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Let's add a test to ensure it doesn't get planned

// Has rank based aggregation call kind (e.g. ROW_NUMBER, RANK, DENSE_RANK)
boolean hasRankBasedAgg = hasRankBasedAgg(windowGroup.aggCalls);

// FOLLOWING lower bound or PRECEDING upper bound is not allowed, and won't reach here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this enforced?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it and iirc it is banned from parsing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UNBOUNDED PRECEDING is not allowed for upper frame boundary and UNBOUNDED FOLLOWING is not allowed for lower frame boundary. However, n PRECEDING is allowed for upper frame boundary and n FOLLOWING is allowed for lower frame boundary (not at the same time though). This behavior is similar in both Calcite and Postgres.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! Does it mean it can do something like BETWEEN 5 PRECEDING AND 3 PRECEDING?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Postgres allows that, Calcite allows that, and with #14249, Pinot will support that as well!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants