Skip to content

Conversation

@walterddr
Copy link
Contributor

@walterddr walterddr commented Oct 26, 2023

this fixes #11881

this rule will be merged into a more sophisticated rule that will be created in #11831

@walterddr walterddr force-pushed the hotfix_multi_exchange branch from 7f2962b to ce4a8eb Compare October 26, 2023 17:16
@Jackie-Jiang Jackie-Jiang added bugfix multi-stage Related to the multi-stage query engine labels Oct 26, 2023
@codecov-commenter
Copy link

codecov-commenter commented Oct 26, 2023

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 61.41%. Comparing base (6853991) to head (e224211).
⚠️ Report is 3237 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #11882      +/-   ##
============================================
- Coverage     61.42%   61.41%   -0.02%     
+ Complexity     1147     1146       -1     
============================================
  Files          2375     2375              
  Lines        128501   128501              
  Branches      19846    19846              
============================================
- Hits          78936    78919      -17     
- Misses        43859    43875      +16     
- Partials       5706     5707       +1     
Flag Coverage Δ
custom-integration1 <0.01% <ø> (ø)
integration <0.01% <ø> (ø)
integration1 <0.01% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 61.37% <ø> (-0.03%) ⬇️
java-21 61.29% <ø> (+0.01%) ⬆️
skip-bytebuffers-false 61.40% <ø> (-0.02%) ⬇️
skip-bytebuffers-true 61.26% <ø> (+<0.01%) ⬆️
temurin 61.41% <ø> (-0.02%) ⬇️
unittests 61.41% <ø> (-0.02%) ⬇️
unittests1 46.63% <ø> (ø)
unittests2 27.57% <ø> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@xiangfu0 xiangfu0 Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting for two filters, I think visitFilter in ServerPlanRequestVisitor will override each other, so one predicate will be skipped.

Copy link
Contributor Author

@walterddr walterddr Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not unexpected. (we can potentially make it more smart). the problem is that

WITH tmp1
     AS (SELECT *
         FROM   a
         WHERE  col2 NOT IN ( 'foo', 'bar' )),
     tmp2
     AS (SELECT *
         FROM   b
         WHERE  col1 IN (SELECT col1
                         FROM   tmp1)
                AND col3 < 100)
SELECT *
FROM   tmp2
WHERE  col1 IN (SELECT col1
                FROM   tmp1  -- this is the tmp1 modification into tmp1'
                WHERE  col3 > 10) 

at this stage

  • Calcite's doesn't know if you want to do a table spool of tmp1 and make tmp1' derived from tmp1 --> thus col3 > 10 is not pushed down.
  • only when we decided that table spool is not possible and then we decided to copy tmp1 and tmp1' into 2 separate sub queries, at that time i think the hep optimizer already get passed the filter merging phase.

rewriting the query in this way

WITH tmp1
     AS (SELECT *
         FROM   a
         WHERE  col2 NOT IN ( 'foo', 'bar' )),
     tmp2
     AS (SELECT *
         FROM   b
         WHERE  col1 IN (SELECT col1
                         FROM   tmp1)
                AND col3 < 100),
     tmp3 -- here we explicitly tell that tmp3 and tmp1 are not related
     AS (SELECT *
         FROM   a
         WHERE  col2 NOT IN ( 'foo', 'bar' ) AND col3 > 10),
SELECT *
FROM   tmp2
WHERE  col1 IN (SELECT col1
                FROM   tmp3) 

produces no multi-filter plan

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we cannot avoid this, then it means the ServerPlanRequestVisitor should handle the leaf query generation properly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a un-plannable sql before or after this bug fix PR. I would suggest differ the fix into a different PR (i will change the test to make sure it doesn't happen)

@walterddr walterddr force-pushed the hotfix_multi_exchange branch from ce4a8eb to e224211 Compare October 27, 2023 16:19
@walterddr walterddr merged commit 86735ce into apache:master Oct 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[multistage] multiple nested SEMI-join create wrong plan

4 participants