-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[multistage] Derive SUM return type to be PostgreSQL compatible #11151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #11151 +/- ##
==========================================
+ Coverage 0.00% 0.11% +0.11%
==========================================
Files 2214 2231 +17
Lines 119705 120181 +476
Branches 18163 18227 +64
==========================================
+ Hits 0 137 +137
- Misses 119705 120024 +319
- Partials 0 20 +20
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 31 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
dc7b159 to
8d4c33e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the problem with this is b/c the 0.5 has DECIMAL value and it is a Literal
this will cause Calcite to explicitly hoist the SUM results by a specific precision and scale.
no one will actually use this in practices, and i suggest we disable this test until further fix can be put in place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue here is for join schema mismatch from both side, one is ps_availqty the type is decimal(19, 1), the other side is 0.5 * SUM(l_quantity), which is decimal(21,1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cast as double here and put a TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This query failed due to a de-correlation issue for type mismatch of DECIMAL(19, 1) and ``DECIMAL(21, 1)`
java.lang.AssertionError: Cannot add expression of different type to set:
set type is RecordType(VARCHAR NOT NULL col1, VARCHAR NOT NULL col2, INTEGER NOT NULL col3, DECIMAL(1000, 0) NOT NULL col4, BOOLEAN NOT NULL col5, INTEGER NOT NULL col6, BIGINT NOT NULL ts, DECIMAL(21, 1) EXPR$0) NOT NULL
expression type is RecordType(VARCHAR NOT NULL col1, VARCHAR NOT NULL col2, INTEGER NOT NULL col3, DECIMAL(1000, 0) NOT NULL col4, BOOLEAN NOT NULL col5, INTEGER NOT NULL col6, BIGINT NOT NULL ts, DECIMAL(19, 1) EXPR$0) NOT NULL
set is rel#6543:LogicalCorrelate.(left=HepRelVertex#6533,right=HepRelVertex#6542,correlation=$cor0,joinType=left,requiredColumns={0, 1})
expression is LogicalProject(col1=[$0], col2=[$1], col3=[$2], col4=[$3], col5=[$4], col6=[$5], ts=[$6], EXPR$0=[*(0.5:DECIMAL(2, 1), $7)])
LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{0, 1}])
LogicalTableScan(table=[[a]])
LogicalAggregate(group=[{}], agg#0=[SUM($0)])
LogicalProject(col3=[$2])
LogicalFilter(condition=[AND(=($1, $cor0.col2), =($0, $cor0.col1))])
LogicalTableScan(table=[[b]])
at org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:391)
at org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:60)
at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:269)
at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:284)
at org.apache.calcite.sql2rel.RelDecorrelator$AdjustProjectForCountAggregateRule.onMatch2(RelDecorrelator.java:2703)
at org.apache.calcite.sql2rel.RelDecorrelator$AdjustProjectForCountAggregateRule.onMatch(RelDecorrelator.java:2610)
at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:337)
at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:420)
at org.apache.calcite.plan.hep.HepPlanner.executeRuleInstance(HepPlanner.java:243)
at org.apache.calcite.plan.hep.HepInstruction$RuleInstance$State.execute(HepInstruction.java:178)
at org.apache.calcite.plan.hep.HepPlanner.lambda$executeProgram$0(HepPlanner.java:211)
at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422)
at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:210)
at org.apache.calcite.plan.hep.HepProgram$State.execute(HepProgram.java:118)
at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:205)
at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:191)
at org.apache.calcite.sql2rel.RelDecorrelator.decorrelate(RelDecorrelator.java:291)
at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateQuery(RelDecorrelator.java:230)
at org.apache.pinot.query.QueryEnvironment.decorrelateIfNeeded(QueryEnvironment.java:282)
at org.apache.pinot.query.QueryEnvironment.compileQuery(QueryEnvironment.java:275)
at org.apache.pinot.query.QueryEnvironment.explainQuery(QueryEnvironment.java:202)
at org.apache.pinot.query.QueryEnvironment.explainQuery(QueryEnvironment.java:220)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Internally the max precision and scale is 1000, per: https://github.com/apache/pinot/blob/master/pinot-query-planner/src/main/java/org/apache/pinot/query/type/TypeSystem.java#L32
So the intermediate decimal results type may go pretty wild.
...on-tests/src/test/java/org/apache/pinot/integration/tests/BaseClusterIntegrationTestSet.java
Outdated
Show resolved
Hide resolved
8d4c33e to
79a07f9
Compare
|
We still need to find a way to solve this decimal type hoist problem. |
79a07f9 to
7b3ee2b
Compare
|
based on https://www.postgresql.org/docs/8.2/functions-aggregate.html and https://learn.microsoft.com/en-us/sql/t-sql/functions/sum-transact-sql?view=sql-server-ver16 it doesn't seem to have a consistent requirement for type hoisting in SQL standards. |
vvivekiyer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a runtime test that verifies SUM (and other aggregations) on BIG_DECIMAL column? I remember testing this long back and there were some issues. Not sure if they've been fixed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not really. given the fact that we are currently working around the type-hoisting behavior to not do any automatic conversion. it should be same type in same type out
|
For reference, here is the latest PostgreSQL document for result types: https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE |
|
i think what's missing is that
one solution I can think of is to register our own behavior by adding our own OperatorTable and our own return type inference this way we don't need to worry about the TypeFactory/TypeSystem overrides which intricately went through Calcite internal where we don't have direct control. |
7b3ee2b to
cc4e26d
Compare
621b47b to
b6c9f2f
Compare
bede511 to
292099d
Compare
walterddr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. thank you for fixing this issue
pinot-query-planner/src/main/java/org/apache/pinot/query/type/TypeSystem.java
Outdated
Show resolved
Hide resolved
pinot-query-planner/src/test/resources/queries/AggregatePlans.json
Outdated
Show resolved
Hide resolved
292099d to
cd72e87
Compare
Derive sum return type for v2.
Based on https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE
for query:
Before:

After:
