Skip to content

Conversation

@xiangfu0
Copy link
Contributor

@xiangfu0 xiangfu0 commented Apr 4, 2023

Query compilation support for SetOperations: UNION/INTERSECT/MINUS(EXCEPT) in multi-stage engine.

Will have a few follow up PRs for the implementations of:
UNION ALL, UNION, INTERSECT, MINUS(EXCEPT)

@xiangfu0 xiangfu0 changed the title [multi-stage] [phase 1] Support SetOperations(UNION/INTERSECT/MINUS) compilation in query planner [multi-stage] [phase 1] Query compilation support for SetOperations: UNION/INTERSECT/MINUS(EXCEPT) in multi-stage engine Apr 4, 2023
@xiangfu0 xiangfu0 changed the title [multi-stage] [phase 1] Query compilation support for SetOperations: UNION/INTERSECT/MINUS(EXCEPT) in multi-stage engine [multi-stage] Query compilation support for SetOperations: UNION/INTERSECT/MINUS(EXCEPT) in multi-stage engine Apr 4, 2023
@xiangfu0 xiangfu0 force-pushed the v2-set-op-support branch 8 times, most recently from b78280d to c2487f0 Compare April 4, 2023 10:40
@codecov-commenter
Copy link

codecov-commenter commented Apr 4, 2023

Codecov Report

Merging #10535 (2f56bf2) into master (7adac6f) will decrease coverage by 1.84%.
The diff coverage is 49.25%.

@@             Coverage Diff              @@
##             master   #10535      +/-   ##
============================================
- Coverage     70.35%   68.52%   -1.84%     
+ Complexity     6499     6486      -13     
============================================
  Files          2107     2108       +1     
  Lines        113460   113535      +75     
  Branches      17101    17117      +16     
============================================
- Hits          79822    77795    -2027     
- Misses        28038    30185    +2147     
+ Partials       5600     5555      -45     
Flag Coverage Δ
integration1 24.44% <0.00%> (+0.04%) ⬆️
integration2 ?
unittests1 67.87% <50.00%> (-0.04%) ⬇️
unittests2 13.88% <0.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...roller/helix/core/relocation/SegmentRelocator.java 67.83% <0.00%> (+0.98%) ⬆️
...e/pinot/query/planner/ExplainPlanStageVisitor.java 78.57% <0.00%> (-4.98%) ⬇️
...t/query/planner/logical/ShuffleRewriteVisitor.java 0.00% <0.00%> (ø)
...hysical/colocated/GreedyShuffleRewriteContext.java 0.00% <0.00%> (ø)
...located/GreedyShuffleRewritePreComputeVisitor.java 0.00% <0.00%> (ø)
...hysical/colocated/GreedyShuffleRewriteVisitor.java 0.00% <0.00%> (ø)
...lanner/stage/DefaultPostOrderTraversalVisitor.java 0.00% <0.00%> (ø)
...t/query/runtime/plan/ServerRequestPlanVisitor.java 86.32% <0.00%> (-0.75%) ⬇️
...org/apache/pinot/sql/parsers/CalciteSqlParser.java 84.84% <37.50%> (-0.84%) ⬇️
.../pinot/query/runtime/plan/PhysicalPlanVisitor.java 93.61% <50.00%> (-4.17%) ⬇️
... and 9 more

... and 167 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@xiangfu0 xiangfu0 force-pushed the v2-set-op-support branch from c2487f0 to f13a273 Compare April 4, 2023 18:45
@xiangfu0 xiangfu0 requested review from KKcorps and walterddr April 4, 2023 23:42
@xiangfu0 xiangfu0 force-pushed the v2-set-op-support branch 2 times, most recently from 4f9e11b to 5e95130 Compare April 9, 2023 08:08
@xiangfu0 xiangfu0 force-pushed the v2-set-op-support branch 3 times, most recently from 37fe4a4 to 00f2011 Compare April 12, 2023 06:46
@Jackie-Jiang Jackie-Jiang added feature multi-stage Related to the multi-stage query engine labels Apr 18, 2023
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitsultana can you take a look and see if this works with your expectation? even if we do not allow visitSetOp is a valid current limitation IMO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new field for SetOp. Partitioning is a must to ensure the correctness.

Copy link
Contributor

@walterddr walterddr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm otherwise

@ankitsultana
Copy link
Contributor

@xiangfu0 : Can you add tests which demonstrate what the plan is going to look like for queries with set-ops?

@xiangfu0 xiangfu0 force-pushed the v2-set-op-support branch 3 times, most recently from 8d49e66 to 0d612d1 Compare April 19, 2023 22:29
@xiangfu0
Copy link
Contributor Author

@xiangfu0 : Can you add tests which demonstrate what the plan is going to look like for queries with set-ops?

Added a SetOpPlans.json file

@xiangfu0 xiangfu0 force-pushed the v2-set-op-support branch 8 times, most recently from 1a2580f to 8d844d5 Compare April 20, 2023 00:18
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiangfu0 : Shouldn't intersect require an exchange?

If the row col1=10, col2=11 in tables a and b are on different servers, if we don't shuffle using the intersection key (or broadcast one side) we may end up dropping them from the result-set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiangfu0 xiangfu0 force-pushed the v2-set-op-support branch from 8d844d5 to 2f56bf2 Compare April 20, 2023 04:47
@xiangfu0 xiangfu0 merged commit 2d6a38c into apache:master Apr 20, 2023
@xiangfu0 xiangfu0 deleted the v2-set-op-support branch April 20, 2023 06:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants