Skip to content

Conversation

@ankitsultana
Copy link
Contributor

@ankitsultana ankitsultana commented Jun 1, 2025

Summary

Adds support for enabling broker pruning (for servers and segments). This can especially be beneficial for MSE Lite Mode that aims to mimic the scatter gather pattern of the V1 Engine to enable high-qps use-cases.

Test Plan

Added some unit-tests and tested on Colocated Join Quickstart. I had to add the routing config to enable partition pruning in the broker to enable this.

SET useMultistageEngine=true;
SET usePhysicalOptimizer=true;
-- SET useBrokerPruning=true;

explain implementation plan for select count(*)
from userAttributes
where userUUID = 'user-1'

[0]@10.24.6.236:33955|[0] MAIL_RECEIVE(SINGLETON)
└── [1]@10.24.6.236:33321|[0] MAIL_SEND(SINGLETON)->{[0]@10.24.6.236:33955|[0]}
    └── [1]@10.24.6.236:33321|[0] AGGREGATE_FINAL
        └── [1]@10.24.6.236:33321|[0] MAIL_RECEIVE(SINGLETON)
            ├── [2]@10.24.6.236:38493|[1] MAIL_SEND(SINGLETON)->{[1]@10.24.6.236:35465|[0]} (Subtree Omitted)
            └── [2]@10.24.6.236:33321|[0] MAIL_SEND(SINGLETON)->{[1]@10.24.6.236:35465|[0]}
                └── [2]@10.24.6.236:33321|[0] AGGREGATE_LEAF
                    └── [2]@10.24.6.236:33321|[0] FILTER
                        └── [2]@10.24.6.236:33321|[0] TABLE SCAN (userAttributes) null
SET useMultistageEngine=true;
SET usePhysicalOptimizer=true;
SET useBrokerPruning=true;

explain implementation plan for select count(*)
from userAttributes
where userUUID = 'user-1'

[0]@10.24.6.236:33955|[0] MAIL_RECEIVE(SINGLETON)
└── [1]@10.24.6.236:44355|[0] MAIL_SEND(SINGLETON)->{[0]@10.24.6.236:33955|[0]}
    └── [1]@10.24.6.236:44355|[0] AGGREGATE_DIRECT
        └── [1]@10.24.6.236:44355|[0] FILTER
            └── [1]@10.24.6.236:44355|[0] TABLE SCAN (userAttributes) null

@ankitsultana ankitsultana added multi-stage Related to the multi-stage query engine mse-physical-optimizer labels Jun 1, 2025
@codecov-commenter
Copy link

codecov-commenter commented Jun 1, 2025

Codecov Report

Attention: Patch coverage is 89.28571% with 9 lines in your changes missing coverage. Please review.

Project coverage is 63.41%. Comparing base (1a476de) to head (0ad1251).
Report is 416 commits behind head on master.

Files with missing lines Patch % Lines
...t/query/planner/logical/LeafStageToPinotQuery.java 89.18% 0 Missing and 4 partials ⚠️
...he/pinot/query/context/PhysicalPlannerContext.java 50.00% 1 Missing and 1 partial ⚠️
...pinot/query/parser/CalciteRexExpressionParser.java 84.61% 2 Missing ⚠️
...al/v2/opt/rules/LeafStageWorkerAssignmentRule.java 96.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #15959      +/-   ##
============================================
+ Coverage     62.90%   63.41%   +0.50%     
+ Complexity     1386     1355      -31     
============================================
  Files          2867     2911      +44     
  Lines        163354   166938    +3584     
  Branches      24952    25533     +581     
============================================
+ Hits         102755   105859    +3104     
- Misses        52847    53057     +210     
- Partials       7752     8022     +270     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.37% <89.28%> (+0.50%) ⬆️
java-21 63.38% <89.28%> (+0.56%) ⬆️
skip-bytebuffers-false ?
skip-bytebuffers-true ?
temurin 63.41% <89.28%> (+0.50%) ⬆️
unittests 63.40% <89.28%> (+0.50%) ⬆️
unittests1 56.47% <89.28%> (+0.65%) ⬆️
unittests2 33.45% <0.00%> (-0.12%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

// --------------------------------------------------------------------------

public static List<Expression> convertRexNodes(List<RexExpression> rexNodes, PinotQuery pinotQuery) {
public static List<Expression> convertRexNodes(List<RexExpression> rexNodes, List<Expression> selectList) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simplifying this code. It was taking in a wider object than required. Only access to the selectList was required

@ankitsultana ankitsultana marked this pull request as ready for review June 3, 2025 04:09
}

@VisibleForTesting
static PinotQuery createPinotQuery(String tableName, TableScan tableScan, Deque<PRelNode> parents,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note for reviewers: This logic to create a PinotQuery out of a leaf stage is similar to ServerPlanRequestVisitor which is run in the servers. However, that visitor deals with PlanNode types, whereas here we deal with PRelNode.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very useful for MSE. Is it possible to extract the code into a common util class? Maybe move the code from SSE to a util?

Comment on lines +639 to +640
// TODO(mse-physical): Consider removing this query option and making this the default, since there's already
// a table config to enable broker pruning (it is disabled by default).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which table config are you referring to?

Copy link
Contributor Author

@ankitsultana ankitsultana Jun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the routing. segmentPrunerTypes key in the table config. Based on my tests we don't prune segments unless this is set (maybe I am missing something)

@ankitsultana ankitsultana requested a review from Jackie-Jiang June 4, 2025 04:29
@ankitsultana
Copy link
Contributor Author

This is very useful for MSE. Is it possible to extract the code into a common util class? Maybe move the code from SSE to a util?

Moved the logic to a util class.

Copy link
Contributor

@wirybeaver wirybeaver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. IIUC, after applying the BrokerPruner, the Rel Trait of the Data distribution of the TableScan will be singleton so that the following Aggregation Pushdown Rule would apply the AggregateDirect.

@ankitsultana ankitsultana merged commit fb2e4df into apache:master Jun 4, 2025
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

mse-physical-optimizer multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants