-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Enable dynamically turning on/off optProgram rules #15999
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I suppose these knobs are not just for debugging. It is expected that users may use this in their production workloads. Correct?
I think this makes sense to some extent for rules in Calcite. But for rules implemented by us, we should instead use shorter names. We could create a new abstract class that overrides |
Yes.
Certainly, there are rules whose name is too long / doesn't make sense. I found out our EXPLAIN PLAN FOR query timing simply uses rule's description (just that for Calcite CORE_RULES the description is the same as class name), so we could "rename" rules just using
Will update this now if it makes sense |
That may not be guaranteed to be the same across Calcite version upgrades. I suppose Calcite won't change class names since that would be a breaking change, but descriptions could be changed. |
I see. Our current explain plan uses RelOptRule.toString() as the display name, which returns the rule description. This is also subject to change if the default description is changed in Calcite. So I'm proposing to explicitly assign a description with a defined constant when instantiating each rule. In this way the identifier of the rule would always be controlled by us (which will always be its description), renaming rules would be easy by just changing the constant, and we can upgrade / replace certain rules with Pinot version while keeping the same name. I think this is similar to the idea of defining a wrapper around rules, except we utilize the description field of RelOptRules, which is intended for this. Assigning descriptions to rules should be a good practice as it enables us to distinguish instances of same rule class under different context, plus we would have a nice list of all rule name constants used in CommonConstants, which is easier to maintain. For example, in Now we can rename it easily or upgrade it to something like Please let me know if this makes sense? |
65ce02b to
beefa20
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #15999 +/- ##
============================================
+ Coverage 62.90% 63.22% +0.31%
+ Complexity 1386 1352 -34
============================================
Files 2867 2951 +84
Lines 163354 169639 +6285
Branches 24952 25943 +991
============================================
+ Hits 102755 107249 +4494
- Misses 52847 54279 +1432
- Partials 7752 8111 +359
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly good
pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java
Outdated
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironment.java
Show resolved
Hide resolved
...on-tests/src/test/java/org/apache/pinot/integration/tests/OfflineClusterIntegrationTest.java
Outdated
Show resolved
Hide resolved
| .defaultSchema(rootSchema.plus()).sqlToRelConverterConfig(PinotRuleUtils.PINOT_SQL_TO_REL_CONFIG).build(); | ||
| _catalogReader = new PinotCatalogReader( | ||
| rootSchema, List.of(database), _typeFactory, CONNECTION_CONFIG, config.isCaseSensitive()); | ||
| _optProgram = getOptProgram(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still want to keep this default _optProgram, so that for majority of the cases (no rule override) we don't need to construct a new one
pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironment.java
Outdated
Show resolved
Hide resolved
pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java
Outdated
Show resolved
Hide resolved
|
IMHO, it would be a better idea to introduce a single |
|
changed option format to |
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well done!
pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java
Outdated
Show resolved
Hide resolved
| assertEquals(response1Json.get("rows").get(0).get(2).asText(), "Rule Execution Times\n" | ||
| + "Rule: AggregateProjectMergeRule -> Time:*\n" | ||
| + "Rule: Project -> Time:*\n" | ||
| + "Rule: EvaluateProjectLiteralRule -> Time:*\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(minor) Consider changing the HashMap into a LinkedHashMap to preserve the order of rule application, and also make this deterministic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do this in a seperate PR since this might change a dozen tests
pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironment.java
Outdated
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironment.java
Outdated
Show resolved
Hide resolved
pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironment.java
Outdated
Show resolved
Hide resolved
Implemented knobs as queryOption that controls turning on / off rules per query. Currently supported all rules in optProgram. All rules are enbaled by default. One can specify disabling rule like: ``` SET plannerRule_skipPinotAggregateReduceFunctionsRule=true; SET plannerRule_skipAggregateCaseToFilterRule=true; ``` The naming convention here is consistent with the rule name shown on query console from `EXPLAIN PLAN FOR`, it's just `plannerRule_skip<name as shown on console>`. The name shown on console is the description of the rule. This commit introduced a convention that all rules when instantiated should be explicitly provided with a description. This is added to all rules used in optProgram now. The rule description will serve as its identifier in both `EXPLAIN PLAN FOR` and for turning on and off using knobs. Relevant constructors for Pinot rules that passes the description as arg were added. All constants introduced are declared in pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java. A major change to QueryEnvironment is, the optProgram in the QueryEnvironment is now constructed per-query (to contain only enabled rules), this is acceptable as creating HepProgram is not expensive. TraitProgram is already created per-query before this PR. Several tests are added to QueryCompilationTest and QueryPlannerRuleOptionsTest.
…cuase of the description change This is due to that we use a HashMap with rule name as the key to record the rule timings, and the display order is the traversal order of the hash map.
…Environment.java Co-authored-by: Xiaotian (Jackie) Jiang <[email protected]>
Now disable rules by setting skipPlannerRules to a comma separated list of rule names. For example ``` SET skipPlannerRules='FilterProjectTransposeRule,PruneEmptySort,EvaluateProjectLiteralRule'; ```
…Environment.java Co-authored-by: Xiaotian (Jackie) Jiang <[email protected]>
…Environment.java Co-authored-by: Xiaotian (Jackie) Jiang <[email protected]>
Implemented knobs as queryOption that controls turning on / off rules per query.
Currently supported all rules in optProgram. One can specify disabling rule by setting a comma separated list of rule names to be skipped (no spaces), like:
The purpose of these knobs are for users to turn off rules that are taking too long / produces suboptimal plan. All rules are enabled by default.
The rule name here is consistent with the rule name shown on query console from
EXPLAIN PLAN FOR(which is the description of the rule)A major change to QueryEnvironment is, the optProgram in PlannerContext is now constructed per-query (to contain only enabled rules), this is acceptable as creating HepProgram is not expensive. TraitProgram is already created per-query before this PR.
Another major change is, a convention is added that every rule used in optProgram in PinotQueryRuleSets is now instantiated with a defined constant description. These descriptions would be used to show rule in EXPLAIN_PLAN_FOR and also for the knob. This allows us to manage and identify rule instances more easily.
Several tests are added to QueryCompilationTest and QueryPlannerRuleOptionsTest to test the new rule and knobs.
A logger is added to QueryEnvironment, which will be used to debug parsed / decorrelated / trimmed plans in the future.