-
Notifications
You must be signed in to change notification settings - Fork 715
fix: INTERSECT/EXCEPT query #8862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Explore these optional code suggestions:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Summary
Fixed incorrect join reordering for INTERSECT and EXCEPT SQL operations by preventing join input swapping when PartitionMode::CollectLeft is used.
- Added condition
hash_join.mode != PartitionMode::CollectLeftto the swap guard inswap_join_orderfunction - INTERSECT/EXCEPT operations use
LeftSemi/LeftAntijoins which are not commutative, so swapping their inputs would break query semantics - Added 4 new comprehensive test cases covering INTERSECT/EXCEPT queries both with and without deduplication optimization
- Refactored test setup to use
NewEmptyTableinstead ofMemTablefor cleaner test code
Confidence Score: 5/5
- This PR is safe to merge - it fixes a correctness bug with comprehensive test coverage
- The fix adds a single, well-targeted guard condition that prevents incorrect join reordering for INTERSECT/EXCEPT queries. The change is minimal and surgical, only affecting the specific problematic case. Four new test cases thoroughly validate the fix for both INTERSECT and EXCEPT operations with and without deduplication.
- No files require special attention
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| src/service/search/datafusion/optimizer/physical_optimizer/join_reorder.rs | 5/5 | Adds guard condition to prevent incorrect join swapping for INTERSECT/EXCEPT queries with CollectLeft mode, includes comprehensive test coverage |
Sequence Diagram
sequenceDiagram
participant QP as Query Planner
participant JRR as JoinReorderRule
participant HJE as HashJoinExec
participant AG as AggregateExec
QP->>JRR: optimize(plan)
JRR->>JRR: transform_down(swap_join_order)
alt HashJoin found
JRR->>HJE: check join_type.supports_swap()
JRE->>HJE: check mode != CollectLeft
alt Not aggregate on left AND aggregate on right AND swappable AND NOT CollectLeft
HJE->>HJE: swap_inputs(hash_join, mode)
HJE-->>JRR: Transformed::yes(swapped_join)
else Cannot swap (CollectLeft or other reasons)
HJE-->>JRR: Transformed::no(plan)
end
else Not a HashJoin
JRR-->>QP: Transformed::no(plan)
end
JRR-->>QP: optimized plan
1 file reviewed, no comments
Testdino Test Results
|
|
| Status | Total | Passed | Failed | Skipped | Flaky | Pass Rate | Duration |
|---|---|---|---|---|---|---|---|
| All tests passed | 365 | 344 | 0 | 19 | 2 | 94% | 5m 14s |
related to #8863