Skip to content

Conversation

@ankitsultana
Copy link
Contributor

@ankitsultana ankitsultana commented Apr 2, 2025

Builds on top of #15371.

This adds the following:

  • All the required Physical Plan Nodes
  • Trait constraint assignment, as we defined it in the design doc.
  • Aggregate rule that converts to PinotLogicalAggregate and pushes down group-trim. It is nearly the same as the existing PinotAggregateExchangeNodeInsertRule. The differences are mentioned in the javadocs of the new rule.

Trait Constraint Assignment

The key change in this PR is to add the Trait Constraint Assignment logic. We are using traits as constraints, and the Physical Planner will add Exchange to meet these constraints as required.

Test Plan

I have created a separate tracker for tests related to the Physical Planner. Once the E2E logic is wired in, I'll copy the JSON files with the explain plan output, which will do complete E2E testing for all features.

@codecov-commenter
Copy link

codecov-commenter commented Apr 2, 2025

Codecov Report

❌ Patch coverage is 3.12500% with 310 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.83%. Comparing base (59551e4) to head (0c6fbc6).
⚠️ Report is 3391 commits behind head on master.

Files with missing lines Patch % Lines
...ache/pinot/calcite/rel/traits/TraitAssignment.java 0.00% 109 Missing ⚠️
...t/calcite/rel/rules/PinotLogicalAggregateRule.java 11.84% 67 Missing ⚠️
...y/planner/physical/v2/nodes/PhysicalAggregate.java 0.00% 18 Missing ⚠️
...uery/planner/physical/v2/nodes/PhysicalWindow.java 0.00% 17 Missing ⚠️
...ery/planner/physical/v2/nodes/PhysicalProject.java 0.00% 15 Missing ⚠️
.../query/planner/physical/v2/nodes/PhysicalJoin.java 0.00% 13 Missing ⚠️
...query/planner/physical/v2/nodes/PhysicalMinus.java 0.00% 13 Missing ⚠️
.../query/planner/physical/v2/nodes/PhysicalSort.java 0.00% 13 Missing ⚠️
...query/planner/physical/v2/nodes/PhysicalUnion.java 0.00% 13 Missing ⚠️
...uery/planner/physical/v2/nodes/PhysicalValues.java 0.00% 12 Missing ⚠️
... and 2 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #15439      +/-   ##
============================================
+ Coverage     61.75%   62.83%   +1.08%     
- Complexity      207     1375    +1168     
============================================
  Files          2436     2847     +411     
  Lines        133233   160712   +27479     
  Branches      20636    24609    +3973     
============================================
+ Hits          82274   100983   +18709     
- Misses        44911    52115    +7204     
- Partials       6048     7614    +1566     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 62.81% <3.12%> (+1.10%) ⬆️
java-21 62.81% <3.12%> (+1.19%) ⬆️
skip-bytebuffers-false 62.83% <3.12%> (+1.08%) ⬆️
skip-bytebuffers-true 62.79% <3.12%> (+35.06%) ⬆️
temurin 62.83% <3.12%> (+1.08%) ⬆️
unittests 62.83% <3.12%> (+1.08%) ⬆️
unittests1 55.80% <3.12%> (+8.91%) ⬆️
unittests2 33.54% <3.12%> (+5.81%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ankitsultana ankitsultana changed the title [WIP] [multistage] Add all Physical Plan Nodes and Complete Logical Planner [multistage] Add Physical Plan Nodes / Trait Assignment / Logical Agg Rule Apr 3, 2025
@ankitsultana ankitsultana marked this pull request as ready for review April 3, 2025 20:49
Copy link
Contributor

@itschrispeck itschrispeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

* When group-by keys are empty, we can use SINGLETON distribution. Otherwise, we use hash distribution on the
* group-by keys.
*/
@VisibleForTesting
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UTs will be in a follow up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I had started writing them but have stashed them for now. I want to get closer to the E2E working state before getting into unit-testing since I might have to move things around quite a bit. I am tracking UT for Trait Assigment as a todo here: #15456.

* </li>
* </ol>
*/
public class PinotExecStrategyTrait implements RelTrait {
Copy link
Contributor Author

@ankitsultana ankitsultana Apr 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I will likely store Exec Strategy via the PRelNode interface in the future and get rid of this trait. It's not exactly a trait in my opinion but rather a marker for certain plan nodes.

@ankitsultana ankitsultana added the multi-stage Related to the multi-stage query engine label Apr 4, 2025
* All of these will be done in the Physical Planning phase instead, since that is when we will know whether the
* aggregate has been split or not. (e.g. project under aggregate is required when you skip partial aggregate).
*/
public class PinotLogicalAggregateRule {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we are only taking hints out and put them in the PinotLogicalAggregate. IMO these hints can directly be processed during the physical planning phase, and we should just skip this rule. Do you see a special need why we need PinotLogicalAggregate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that's a very good point. I added a tracker for this here: #15467

I think I'll then be able to do all the aggregate cleanup in a single rule which would be significantly more intuitive.

import org.apache.pinot.query.planner.plannode.AggregateNode;


public class PhysicalAggregate extends Aggregate implements PRelNode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not yet the physical plan yet, so suggest using a different name. E.g. AggregatePRelNode

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah fair point. Noted in #15467.

@ankitsultana ankitsultana merged commit 37ec14d into apache:master Apr 4, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

mse-physical-optimizer multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants