Skip to content

Conversation

@xiangfu0
Copy link
Contributor

@xiangfu0 xiangfu0 commented May 23, 2024

This PR re-implement #11105, which use hints to carry literals due to the calcite limitation.
Since the recent upgrade of Calcite library, we can use rexList((https://issues.apache.org/jira/browse/CALCITE-4334 and apache/calcite@ddb4200)) to carry original aggregation functions arguments for literals and intermediate stage with placeholder.

  • Use RexList to carry all aggregation arguments for literals and setting intermediate stage placeholders
  • Removed existing hack of setting literals in v2 functions

@xiangfu0 xiangfu0 added feature multi-stage Related to the multi-stage query engine labels May 23, 2024
@xiangfu0 xiangfu0 force-pushed the FunnelCompleteCount branch from 7c011f6 to 5287a86 Compare May 24, 2024 00:00
@codecov-commenter
Copy link

codecov-commenter commented May 24, 2024

Codecov Report

Attention: Patch coverage is 90.78947% with 7 lines in your changes are missing coverage. Please review.

Project coverage is 62.15%. Comparing base (59551e4) to head (9f35387).
Report is 499 commits behind head on master.

Files Patch % Lines
...inot/query/runtime/operator/AggregateOperator.java 78.78% 4 Missing and 3 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #13217      +/-   ##
============================================
+ Coverage     61.75%   62.15%   +0.40%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2532      +96     
  Lines        133233   138976    +5743     
  Branches      20636    21531     +895     
============================================
+ Hits          82274    86378    +4104     
- Misses        44911    46141    +1230     
- Partials       6048     6457     +409     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration <0.01% <0.00%> (-0.01%) ⬇️
integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration2 0.00% <0.00%> (ø)
java-11 62.10% <90.78%> (+0.39%) ⬆️
java-21 62.03% <90.78%> (+0.41%) ⬆️
skip-bytebuffers-false 62.13% <90.78%> (+0.38%) ⬆️
skip-bytebuffers-true 62.02% <90.78%> (+34.29%) ⬆️
temurin 62.15% <90.78%> (+0.40%) ⬆️
unittests 62.14% <90.78%> (+0.40%) ⬆️
unittests1 46.67% <90.78%> (-0.23%) ⬇️
unittests2 27.82% <0.00%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@xiangfu0 xiangfu0 force-pushed the FunnelCompleteCount branch 7 times, most recently from 89fc9c0 to d48b931 Compare May 24, 2024 16:53
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this? Are you trying to get boolean out of NULL? This won't be accurate because NULL is not equivalent to boolean false

Copy link
Contributor Author

@xiangfu0 xiangfu0 May 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for null, it anyway won't work with type boolean.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to only handle the Integer conversion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug is on the RexExpression conversion side, where boolean value is not properly preserved. We should change RexExpressionUtils to keep the value type aligned with the data type. Also, if I read it correctly, TIMESTAMP is also not properly handled as it is not supported within LiteralContext yet

@xiangfu0 xiangfu0 force-pushed the FunnelCompleteCount branch 3 times, most recently from 2a0fcff to c836128 Compare May 25, 2024 21:09
@xiangfu0 xiangfu0 requested a review from Jackie-Jiang May 25, 2024 21:10
@Jackie-Jiang Jackie-Jiang added the backward-incompat Referenced by PRs that introduce or fix backward compat issues label May 25, 2024
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand it better, why can't we reuse _functionOperands here for literals?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct, it will break the query plan, e.g. intermediate stage result reference.
So need to apply _functionPreArgs to _functionOperands during the query execution phase.

@xiangfu0 xiangfu0 force-pushed the FunnelCompleteCount branch from c836128 to 50639e2 Compare May 26, 2024 03:00
@xiangfu0 xiangfu0 requested a review from Jackie-Jiang May 26, 2024 04:05
@xiangfu0 xiangfu0 changed the title Use RexList of carray pre agg arguments for intermediate stage Use RexList to carry aggregation pre arguments for intermediate stage May 26, 2024
Comment on lines +251 to +252
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it won't be backward compatible anyway, can we directly use this as operands? I think that is the general way to handle it, where we already support reading literal from operands

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work for intermediate stage, which expects only one argument in the operand list with InputRef to the type of an intermediate object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use same way to process raw input and intermediate input since we already have all operands?
Right you we assume we don't need function for raw input which is not always true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only difference is that first argument type in intermediate stage function operands might be different than raw input stage.

For count(*), no argument in rawInput stage, but you need one InputRef in the intermediate stage.

@xiangfu0
Copy link
Contributor Author

xiangfu0 commented Jun 2, 2024

Close this since it's already implemented in #13282

@xiangfu0 xiangfu0 closed this Jun 2, 2024
@xiangfu0 xiangfu0 deleted the FunnelCompleteCount branch June 2, 2024 08:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backward-incompat Referenced by PRs that introduce or fix backward compat issues feature multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants