Coerce types for all union children plans when eliminating nesting#11386
Merged
alamb merged 1 commit intoapache:mainfrom Jul 11, 2024
Merged
Coerce types for all union children plans when eliminating nesting#11386alamb merged 1 commit intoapache:mainfrom
alamb merged 1 commit intoapache:mainfrom
Conversation
9 tasks
alamb
reviewed
Jul 10, 2024
Contributor
alamb
left a comment
There was a problem hiding this comment.
Thanks @gruuya
This fix makes sense to me. Thank you
cc @erratic-pattern as I think you observed something similar at some point
| .into_iter() | ||
| .flat_map(extract_plans_from_union) | ||
| .collect::<Vec<_>>(); | ||
| .map(|plan| coerce_plan_expr_for_schema(&plan, &schema)) |
Contributor
There was a problem hiding this comment.
This makes sense -- that the children were not all coerced with respect to the outermost union schema but rather with respect to the inner one.
| ---- | ||
| 5 | ||
|
|
||
| # three way union with aggregate and type coercion |
Contributor
There was a problem hiding this comment.
I verified that this test covers the fix by running this test without the code changes and it fails as expected
Running "pg_compat/pg_compat_union.slt"
Running "union.slt"
External error: query failed: DataFusion error: External error: External error: External error: Arrow error: Invalid argument error: RowConverter column schema mismatch, expected Int32 got Int64
[SQL] SELECT c1, SUM(c2) FROM (
SELECT 1 as c1, 1::int as c2
UNION
SELECT 2 as c1, 2::int as c2
UNION
SELECT 3 as c1, COALESCE(3::int, 0) as c2
) as a
GROUP BY c1
at test_files/union.slt:139
Error: Execution("1 failures")
error: test failed, to rerun pass `-p datafusion-sqllogictest --test sqllogictests`
alamb
approved these changes
Jul 10, 2024
Contributor
#11258 is (possibly) a similar type coercion bug when |
Contributor
|
Thanks again! |
Lordworms
pushed a commit
to Lordworms/arrow-datafusion
that referenced
this pull request
Jul 12, 2024
findepi
pushed a commit
to findepi/datafusion
that referenced
this pull request
Jul 16, 2024
xinlifoobar
pushed a commit
to xinlifoobar/datafusion
that referenced
this pull request
Jul 17, 2024
xinlifoobar
pushed a commit
to xinlifoobar/datafusion
that referenced
this pull request
Jul 18, 2024
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Closes #11385.
Rationale for this change
Investigating the above issue led me to identify a couple of aspects that need to align in order for the bug to manifest:
In particular here's a minimal repro of the above issue
What happens is that the nested union elimination unwraps the first two child plans and coerces their schema, however the remaining plan isn't being coerced. Upon physical planning Union inherits the schema of the first child plan.
Consequently during execution, the RowConverter gets instantiated with
Int32type, whereas the last child will produceInt64elements, sincecoalesceenforces type coercion to align the left element with the right one (represented asInt64(0))What changes are included in this PR?
Coerce all child plans of the outer union as per it's schema, not only plans in the inner union.
Are these changes tested?
There's a new SLT.
Are there any user-facing changes?
No error in TPC-DS Q75