-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[multistage] Adding Basic Constructs for Physical Optimization #15371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #15371 +/- ##
============================================
+ Coverage 61.75% 62.89% +1.13%
- Complexity 207 1375 +1168
============================================
Files 2436 2834 +398
Lines 133233 160078 +26845
Branches 20636 24542 +3906
============================================
+ Hits 82274 100675 +18401
- Misses 44911 51810 +6899
- Partials 6048 7593 +1545
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
...query-planner/src/main/java/org/apache/pinot/query/planner/physical/v2/ExchangeStrategy.java
Outdated
Show resolved
Hide resolved
...-planner/src/main/java/org/apache/pinot/query/planner/physical/v2/PinotDataDistribution.java
Outdated
Show resolved
Hide resolved
| */ | ||
| public class TableScanMetadata { | ||
| private final Set<String> _scannedTables; | ||
| private final Map<Integer, Map<String, List<String>>> _workedIdToSegmentsMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the value here? Map<String, List<String>? workerId => table type => segments?
Could we use a class or add a comment so we don't have to track the initialization back to understand the structure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Added a javadoc comment.
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM in high level idea. We can revisit the details in the following PRs
Summary
Adds basic constructs required for Physical Optimizers as specified in this doc: https://docs.google.com/document/d/17ApZbvNphKgEdSAOlZwTwAnnL_dAt9QbjcpzjHb4M0w/edit?tab=t.0
This is largely based on the PoC I had raised a few weeks back, with some changes: #15282
Changes wrt PoC
PRelNode Interface
PRelNodeis no longer a single concrete type (class) and a wrapper on top ofRelNode. Instead I have modeled it as a separate interface. This not only keeps us somewhat compatible with Calcite, it also will also cleanup the code quite a bit.The main con of this approach is that we'll have to define our own Sort, Window, and other nodes. But that is what Calcite recommends, as @gortiz has also shared before.
To show what this looks like, this PR has added a couple of custom Physical plan nodes (exchange, filter and table scan).
Implementation Related Changes in RuleExecutor
RuleExecutorimplements thePRelNodeTransformer. The transformer is a general concept, and ideally Physical Planner should run a series of transformers, some of which may be executed viaRuleExecutor.Why? This is because for some of the optimizations we may want complete control on the flow and constructs like tracking parent plan nodes,
PRelOptRuleCalletc. may not be necessary. Moreover, theonMatch/matchescontract might not fit the requirements well.This is nothing new. Calcite has the concept of
Program, Presto hasPlanOptimizer, etc.Test Plan / Rollout Safety
This is completely isolated code right now, and we'll keep it behind a flag for a few weeks until we test it in our clusters.