-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Hack to fix some problematic queries #13174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| /** | ||
| * Rule to expand search condition in filter. | ||
| * <p> | ||
| * For example, the filter condition: | ||
| * <p> | ||
| * <code>SEARCH(col1, Sarg[[1..2)])</code> | ||
| * </p> | ||
| * Is expanded to: | ||
| * <p> | ||
| * <code>AND(>=(col1, 1), <(col1, 2))</code> | ||
| * | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Boy Scout Rule
| * <p>It provide the higher level entry interface to convert a SQL string into a {@link DispatchableSubPlan}. | ||
| */ | ||
| public class QueryEnvironment { | ||
| private static final Logger LOGGER = LoggerFactory.getLogger(QueryEnvironment.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a bunch of logs to better understand whether the tautologies were removed by the validator or the rules. In the cases I mention it is clear it is always the rules, but it was useful to detect other cases where I actually include the tautology in the SQL request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All logs are in debug mode, so they shouldn't be enabled by default.
| <!-- Change FULL_PLAN marker from onMatch="DENY" to onMatch='ACCEPT" to see the full plan before and after each | ||
| rule that modifies the plan is applied --> | ||
| <!-- <MarkerFilter marker="FULL_PLAN" onMatch="ACCEPT" onMismatch="NEUTRAL"/>--> | ||
| <MarkerFilter marker="FULL_PLAN" onMatch="DENY" onMismatch="NEUTRAL"/> | ||
| <AppenderRef ref="console"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this very useful to actually see what was going on in the rules. ACCEPT is better, but very verbose. I think it could be useful to have in our CI tests, but I will use DENY for a while to be sure we don't spam too much.
BTW, with this log I discovered that PinotRelDistributionTraitRule is being applied multiple times, which may introduce some extra latency in very large plans.
| "sql": "EXPLAIN PLAN FOR WITH CTE_B AS (SELECT 1692057600000 AS __ts FROM a GROUP BY __ts) SELECT 1692057600000 AS __ts FROM CTE_B WHERE __ts >= 1692057600000 GROUP BY __ts", | ||
| "output": [ | ||
| "Execution Plan", | ||
| "\nLogicalProject(__ts=[1692057600000:BIGINT])", | ||
| "\n LogicalAggregate(group=[{0}])", | ||
| "\n PinotLogicalExchange(distribution=[hash[0]])", | ||
| "\n LogicalAggregate(group=[{0}])", | ||
| "\n LogicalProject(__ts=[1692057600000:BIGINT])", | ||
| "\n LogicalTableScan(table=[[default, a]])", | ||
| "\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This query works in master. Our PinotEvaluateLiteralRule.Filter.INSTANCE is able to get rid of it.
But for example a query like:
WITH CTE_B AS (
SELECT 'a' AS __ts FROM a GROUP BY __ts
) SELECT 1
FROM CTE_B WHERE __ts >= `b`Fails because PinotEvaluateLiteralRule.Filter.INSTANCE tries to apply >= in V1, which only works with numbers!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should no longer be an issue (#13711).
| "sql": "EXPLAIN PLAN FOR WITH CTE_B AS (SELECT 1692057600000 AS __ts FROM a GROUP BY __ts) SELECT 1692057600000 AS __ts FROM CTE_B WHERE __ts >= 1692057600000 AND __ts < 1693267200000 GROUP BY __ts", | ||
| "output": [ | ||
| "Execution Plan", | ||
| "\nLogicalProject(__ts=[1692057600000:BIGINT])", | ||
| "\n LogicalAggregate(group=[{0}])", | ||
| "\n PinotLogicalExchange(distribution=[hash[0]])", | ||
| "\n LogicalAggregate(group=[{0}])", | ||
| "\n LogicalProject(__ts=[1692057600000:BIGINT])", | ||
| "\n LogicalTableScan(table=[[default, a]])", | ||
| "\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is similar to the query in the PR description. It can be fixed by applying PinotEvaluateLiteralRule after PinotFilterExpandSearchRule or adding CompileTimeFunctionsInvoker into ServerPlanRequestUtils.QUERY_REWRITERS.
| "description": "Search + OR filter on constants is simplified", | ||
| "sql": "EXPLAIN PLAN FOR WITH tmp2 AS (SELECT CASE WHEN col2 = 'VAL1' THEN 'A' ELSE col2 END AS cased FROM a) SELECT 1 FROM tmp2 WHERE ((cased = 'B') OR (cased = 'A'))", | ||
| "output": [ | ||
| "Execution Plan", | ||
| "\nLogicalProject(EXPR$0=[1])", | ||
| "\n LogicalFilter(condition=[SEARCH($1, Sarg[_UTF-8'A':VARCHAR CHARACTER SET \"UTF-8\", _UTF-8'B':VARCHAR CHARACTER SET \"UTF-8\", _UTF-8'VAL1':VARCHAR CHARACTER SET \"UTF-8\"]:VARCHAR CHARACTER SET \"UTF-8\")])", | ||
| "\n LogicalTableScan(table=[[default, a]])", | ||
| "\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This query fails in master and cannot be solved by applying PinotEvaluateLiteralRule after PinotFilterExpandSearchRule or adding CompileTimeFunctionsInvoker into ServerPlanRequestUtils.QUERY_REWRITERS.
But it works if we apply CoreRules.FILTER_REDUCE_EXPRESSIONS followed by PinotFilterExpandSearchRule twice in a row.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #13174 +/- ##
=============================================
- Coverage 61.75% 35.09% -26.67%
+ Complexity 207 6 -201
=============================================
Files 2436 2445 +9
Lines 133233 134648 +1415
Branches 20636 20843 +207
=============================================
- Hits 82274 47249 -35025
- Misses 44911 83919 +39008
+ Partials 6048 3480 -2568
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Queries like:
Fail when they are executed. The reason is that very tricky, but mostly is due to:
1 >= 1 and 1 < 2is always true), so it is pushed into the leaf stageCompileTimeFunctionsInvoker, so these tautologies are not removed from the expression treecol1 > doubleLiteral.These tautologies are generated in
PinotFilterExpandSearchRule. We have two rules that should ideally remove them:PinotEvaluateLiteralRule(which evaluates the expressions using V1 semantics)CoreRules.FILTER_REDUCE_EXPRESSIONS(which evalautes the expressions using Calcite semantics)Alternatively, we should be able to call
CompileTimeFunctionsInvokerrewriter in the leaf stage. This invoker is also using V1 semantics.But any of these changes works.
PinotEvaluateLiteralRuleandCompileTimeFunctionsInvokerpartially work because V1 can execute=assuming each operand is a number. Things like1=1or'123'='123'but cannot execute'a' = 'a'.CoreRules.FILTER_REDUCE_EXPRESSIONScan get rid of the tautologies, but in some scenarios it creates just anotherSEARCHoperation. V1 cannot execute them.I'm not proud of the solution found in this PR. It basically consist of applying the pair
PinotFilterExpandSearchRule,CoreRules.FILTER_REDUCE_EXPRESSIONStwice. Although it seems to be working in all cases I tried, I'm sure there will be cases were this solution is not good enough.IMO it would be better to:
searchin V1.Do you have other ideas on how to fix this problem?