Skip to content

Conversation

@wirybeaver
Copy link
Contributor

@wirybeaver wirybeaver commented Jun 16, 2025

Resolve the issue #15985

Background
The Core Problem: Shifting Input Field Counts. A LogicalWindow's expressions (RexNodes) can refer to two things via RexInputRef: (1) Columns from its input relation (index < inputFieldCount); (2) Literals from its constants list (index >= inputFieldCount).

When we chain the windows, the inputFieldCount for each subsequent window increases.

  • Original Window: Input has N fields. A reference to the k-th constant is RexInputRef(N + k).
  • First New Window (LW1): Input is the original input, so inputFieldCount is N. It adds agg1_count fields. Its output has N + agg1_count fields.
  • Second New Window (LW2): Input is the output of LW1, so its inputFieldCount is now N' = N + agg1_count.

If a group in LW2 keep using the original RexInputRef(N + k), LW2 will interpret it as a reference to the (N + k)-th column of its input. This is now incorrect. It's pointing to one of LW1's aggregate results instead of the intended constant. The reference needs to be shifted to RexInputRef(N' + k).

Solution
Correct Row Type construction

  • For each new LogicalWindow, we calculate its output RelDataType.
  • The new type is built by taking all the fields from its input (currentInput.getRowType()) and appending the new aggregate fields that this specific group will produce.
  • To ensure the final output schema is identical to the original, we get the field definitions (including names and types) for the new aggregate columns directly from the originalWindow's output row type. The cumulativeAggFieldCount variable correctly tracks our position in the original aggregate field list.

Correct RexInputRef pointing to the constant variable
Use a RexShuttle to traverse the expressions in each Window.Group and adjust any RexInputRef that points to a constant. Specially, visitInputRef is overridden. It checks if an input reference's index points to a constant (index >= originalInputFieldCount). If it does, it creates a new RexInputRef with the index increased by the shift amount. Otherwise, it returns the original reference. The shift amount denotes the number of new fields added by previous windows in the chain.

Why groupSet and orderKeys Don't Need Shifting
groupSet and orderKeys Refer To: The integer indices in groupSet (PARTITION BY) and orderKeys (ORDER BY) refer to the columns of the window's direct input relation. Calcite provides mechanisms to fail early or eliminate the constant variables appearing in the window functions's Partition By or Order By. Therefore, these indices will always be < inputFieldCount and will never point to the constants list.

Outcome
Dataset: the Quick Start of COLOCATED_JOIN
Test SQL:

SET useMultistageEngine=true;
SET usePhysicalOptimizer=true;
SELECT
  userUUID,
  cityName,
  productCode,
  ts,
  tripAmount,
  sum(tripAmount) over(partition by cityName order by ts asc ROWS BETWEEN CURRENT ROW and 2 FOLLOWING) as amount_post_2_city,
  lag(tripAmount, 1, '0') OVER (partition by cityName order by ts asc) AS prev_amount_city,
  tripAmount-prev_amount_city as diff,
  row_number() over(partition by cityName order by ts asc) as row_city,
  sum(tripAmount) over(partition by productCode order by ts asc ROWS BETWEEN 1 PRECEDING and CURRENT ROW) as amount_pre_1_productCode,
  row_number() over(partition by productCode order by ts asc) as row_productCode
FROM
    userFactEvents
WHERE
    ts <= 1735696010 
    AND ts > 1735689810 
    AND cityName IN ('Chicago', 'New York')

Removing SET usePhysicalOptimizer=true; get the same correct result.

Result

userUUID cityName productCode ts tripAmount amount_post_2_city prev_amount_city diff row_city amount_pre_1_productCode row_productCode
user-18432 Chicago uber-share 1735689833 10.876945 44.42476 0 10.876945 1 10.876945 1
user-32420 New York uber-share 1735694437 36.740086 100.96569200000002 38.234801 -1.4947149999999993 5 47.617031 2
user-2140 New York uber-share 1735694857 26.239464 26.239464000000027 37.986142 -11.746678 7 62.97955 3
user-20120 New York uber-route-share 1735690352 45.467037 106.615582 0 45.467037 1 45.467037 1
user-17332 New York uber-route-share 1735694658 37.986142 64.22560600000003 36.740086 1.246056000000003 6 83.453179 2
user-2892 New York uber-black 1735692196 39.332224 99.383346 45.467037 -6.134813000000001 2 39.332224 1
user-21748 New York uber-xl 1735692433 21.816321 96.79120800000001 39.332224 -17.515902999999998 3 21.816321 1
user-1748 New York uber-xl 1735694177 38.234801 112.96102900000001 21.816321 16.41848 4 60.05112199999999 2
user-7392 Chicago uber-x 1735691292 33.547815 33.547815 10.876945 22.67087 2 33.547815 1

@wirybeaver wirybeaver marked this pull request as draft June 16, 2025 06:27
@wirybeaver wirybeaver force-pushed the xuanyili/windowMultiGroup branch 4 times, most recently from 53efb75 to f9bb276 Compare June 25, 2025 07:10
@codecov-commenter
Copy link

codecov-commenter commented Jun 25, 2025

Codecov Report

❌ Patch coverage is 94.82759% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.29%. Comparing base (1a476de) to head (aa95a60).
⚠️ Report is 972 commits behind head on master.

Files with missing lines Patch % Lines
.../pinot/calcite/rel/rules/PinotWindowSplitRule.java 94.82% 1 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #16109      +/-   ##
============================================
+ Coverage     62.90%   63.29%   +0.39%     
+ Complexity     1386     1365      -21     
============================================
  Files          2867     2960      +93     
  Lines        163354   170635    +7281     
  Branches      24952    26103    +1151     
============================================
+ Hits         102755   108008    +5253     
- Misses        52847    54431    +1584     
- Partials       7752     8196     +444     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.26% <94.82%> (+0.39%) ⬆️
java-21 63.18% <94.82%> (+0.36%) ⬆️
skip-bytebuffers-false ?
skip-bytebuffers-true ?
temurin 63.29% <94.82%> (+0.39%) ⬆️
unittests 63.29% <94.82%> (+0.39%) ⬆️
unittests1 64.74% <94.82%> (+8.92%) ⬆️
unittests2 33.47% <5.17%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@wirybeaver wirybeaver force-pushed the xuanyili/windowMultiGroup branch from f9bb276 to f882733 Compare June 25, 2025 22:21
@wirybeaver wirybeaver force-pushed the xuanyili/windowMultiGroup branch from f882733 to 0301b38 Compare June 26, 2025 17:20
.map(operand -> operand.accept(this))
.collect(Collectors.toList());
return new Window.RexWinAggCall(
(SqlAggFunction) winCall.getOperator(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can we have a Preconditions.checkState for clear error handling if the cast is not correct?

Copy link
Contributor Author

@wirybeaver wirybeaver Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RexWinAggCall currently can only be initialized with SqlAggFunction and the condition to step into this block is that the if (call instanceof Window.RexWinAggCall)

public static class RexWinAggCall extends RexCall {
    public final int ordinal;
    public final boolean distinct;
    public final boolean ignoreNulls;

    /** @deprecated */
    @Deprecated
    public RexWinAggCall(SqlAggFunction aggFun, RelDataType type, List<RexNode> operands, int ordinal, boolean distinct) {
        this(aggFun, type, operands, ordinal, distinct, false);
    }

    public RexWinAggCall(SqlAggFunction aggFun, RelDataType type, List<RexNode> operands, int ordinal, boolean distinct, boolean ignoreNulls) {
        super(type, aggFun, operands);
        this.ordinal = ordinal;
        this.distinct = distinct;
        this.ignoreNulls = ignoreNulls;
    }
}

@wirybeaver wirybeaver force-pushed the xuanyili/windowMultiGroup branch from 23ec998 to f44a2e9 Compare June 27, 2025 07:11
@wirybeaver wirybeaver changed the title draft window split rule Multiple Window Group Support Jun 27, 2025
@wirybeaver wirybeaver requested a review from shauryachats June 27, 2025 16:33
@wirybeaver wirybeaver marked this pull request as ready for review June 27, 2025 16:35
@ankitsultana ankitsultana changed the title Multiple Window Group Support [multistage] Multiple Window Group Support Jun 27, 2025
@wirybeaver wirybeaver requested a review from ankitsultana June 27, 2025 22:18
@ankitsultana ankitsultana merged commit 782b697 into apache:master Jul 8, 2025
18 checks passed
@deemoliu deemoliu added release-notes Referenced by PRs that need attention when compiling the next release notes multi-stage Related to the multi-stage query engine window-functions Related to SQL window functions on the multi-stage query engine labels Jul 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

multi-stage Related to the multi-stage query engine release-notes Referenced by PRs that need attention when compiling the next release notes window-functions Related to SQL window functions on the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants