Skip to content

Conversation

@ankitsultana
Copy link
Contributor

@ankitsultana ankitsultana commented Apr 28, 2025

Note: Does NOT touch any existing multistage code path and only contains changes to the new optimizer.

Summary

Adds Sort / Aggregate Pushdown and Worker Assignment rules. Also adds the PRelToPlanNodeConverter which is taken as is from the RelToPlanNodeConverter. I have commented out the Spooling and metrics related code for now and will be fixing them in a future PR as part of productionization of the Physical Optimizer.

Sort Pushdown

Sort pushdown pushes a Sort past an Exchange when there's a non-null "fetch" set in the Sort. The value of the fetch is computed using "fetch + offset" value of the top-level sort.

Note: There are possibilities to do more kinds of pushdown here. For instance, if there's a fetch only Sort right above a Join, then we could annotate the Join with the fetch limit and remove the Sort.

Aggregate Pushdown

This is similar to the existing PinotAggregateNodeExchangeInsertRule.

Worker Assignment

Worker assignment is where Exchanges are simplified, allowing for full plan colocation. I have added some details in the Javadoc of the corresponding class.

Test Plan

We have the full Physical Optimizer changes running in a test cluster, and the next PR will also add Integration Tests. You can see a preview of them here: https://github.com/ankitsultana/pinot/pull/46/files#diff-25751d60157478f9d864a75c86bf6902f7870c879640887741e59d17e0162643

@ankitsultana ankitsultana added multi-stage Related to the multi-stage query engine mse-physical-optimizer labels Apr 28, 2025
@codecov-commenter
Copy link

codecov-commenter commented Apr 28, 2025

Codecov Report

Attention: Patch coverage is 1.87266% with 524 lines in your changes missing coverage. Please review.

Project coverage is 62.66%. Comparing base (c894265) to head (1cf9edd).
Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
...cal/v2/opt/rules/WorkerExchangeAssignmentRule.java 0.00% 211 Missing ⚠️
...y/planner/physical/v2/PRelToPlanNodeConverter.java 0.00% 156 Missing ⚠️
...r/physical/v2/opt/rules/AggregatePushdownRule.java 0.00% 133 Missing ⚠️
...lanner/physical/v2/opt/rules/SortPushdownRule.java 32.25% 21 Missing ⚠️
...ry/planner/physical/v2/opt/PhysicalOptRuleSet.java 0.00% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #15658      +/-   ##
============================================
- Coverage     62.79%   62.66%   -0.14%     
  Complexity     1384     1384              
============================================
  Files          2865     2869       +4     
  Lines        162973   163514     +541     
  Branches      24931    25060     +129     
============================================
+ Hits         102339   102460     +121     
- Misses        52910    53316     +406     
- Partials       7724     7738      +14     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 62.63% <1.87%> (-0.13%) ⬇️
java-21 62.57% <1.87%> (-0.21%) ⬇️
skip-bytebuffers-false 62.65% <1.87%> (-0.14%) ⬇️
skip-bytebuffers-true 62.55% <1.87%> (-0.20%) ⬇️
temurin 62.66% <1.87%> (-0.14%) ⬇️
unittests 62.65% <1.87%> (-0.14%) ⬇️
unittests1 55.50% <1.87%> (-0.23%) ⬇️
unittests2 33.53% <0.00%> (-0.06%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

throw new IllegalArgumentException("Unsupported collection type: " + relDataType);
}
LOGGER.warn("Unexpected SQL type: {}, use OBJECT instead", sqlTypeName);
return ColumnDataType.OBJECT;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What falls under this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is taken as is from RelToPlanNodeConverter.

I think this is just to be future compatible. We may add more types to Pinot but would likely miss out on covering them here. This way we'll log a warning and can retroactively fix the case handling here.

aggPRelNode.getLimit(), idGenerator);
}

// TODO: Currently it only handles one WITHIN GROUP collation across all AggregateCalls.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the failure mode when there are multiple collations? should we have some precondition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Similar to the other comment, this is taken as is from existing code:

private static RelCollation extractWithinGroupCollation(Aggregate aggRel) {

Ideally we should throw an error but not sure if we do that right now. cc: @Jackie-Jiang

@ankitsultana ankitsultana merged commit d7443e6 into apache:master May 1, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

mse-physical-optimizer multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants