fix: common_subexpr_eliminate rule should not apply to short-circuit expression#8928
fix: common_subexpr_eliminate rule should not apply to short-circuit expression#8928alamb merged 7 commits intoapache:mainfrom
Conversation
| 0 0 | ||
|
|
||
| query TT | ||
| explain select coalesce(1, y/x), coalesce(2, y/x) from t; |
There was a problem hiding this comment.
because of the reason describe in #8927, use plan to test.
There was a problem hiding this comment.
Can we please add some comments here explaining the rationale and what the expected outputs are (so future readers know if changes are expected)
Something like this perhaps:
# Expressions that short circuit should not be refactored out as that may cause side effects (divide by zero)
# at plan time that would not actually happen during execution
Also, can you please add the tests that now pass (e.g. select coalesce(1, y/x), coalesce(2, y/x) from t;) so if someone breaks this code by accident, those queries would start failing, which might be easier to quickly tell is incorrect
There was a problem hiding this comment.
Thank you for this PR @haohuaijin -- the code looks good to me 🙏
I also took a look at https://docs.rs/datafusion/latest/datafusion/logical_expr/enum.BuiltinScalarFunction.html# and https://docs.rs/datafusion/latest/datafusion/logical_expr/expr/enum.Expr.html and didn't see any additional functions that shoudl be marked as short circuiting.
I had a code structure suggestion that I don't think is needed (though I think it would improve this PR).
I think a few more tests that actually run (and don't error) should be added prior to merge.
Thanks again!
| 0 0 | ||
|
|
||
| query TT | ||
| explain select coalesce(1, y/x), coalesce(2, y/x) from t; |
There was a problem hiding this comment.
Can we please add some comments here explaining the rationale and what the expected outputs are (so future readers know if changes are expected)
Something like this perhaps:
# Expressions that short circuit should not be refactored out as that may cause side effects (divide by zero)
# at plan time that would not actually happen during execution
Also, can you please add the tests that now pass (e.g. select coalesce(1, y/x), coalesce(2, y/x) from t;) so if someone breaks this code by accident, those queries would start failing, which might be easier to quickly tell is incorrect
jackwener
left a comment
There was a problem hiding this comment.
Thanks @haohuaijin @alamb .
@alamb 's review suggestions are already very detailed, and I agree with his opinion. After those impromentent, we can merge this PR
|
Thanks @alamb and @jackwener's review and great suggestions. I apply those suggestions, but I'm unsure if the test I added is appropriate(560cb6c). I also add describe in #8928 (comment), to show what the change in this pr. |
| ProjectionExec: expr=[y@1 = 0 OR 1 / CAST(y@1 AS Int64) < 1 as t.y = Int64(0) OR Int64(1) / t.y < Int64(1), x@0 = 0 OR y@1 = 0 OR 1 / CAST(y@1 AS Int64) < 1 / CAST(x@0 AS Int64) as t.x = Int64(0) OR t.y = Int64(0) OR Int64(1) / t.y < Int64(1) / t.x] | ||
| --MemoryExec: partitions=1, partition_sizes=[1] | ||
|
|
||
| # due to the reason describe in https://github.com/apache/arrow-datafusion/issues/8927, |
There was a problem hiding this comment.
I'm unsure if those tests are appropriate.
alamb
left a comment
There was a problem hiding this comment.
Looks good to me -- thank you @haohuaijin
| matches!(op, Operator::And | Operator::Or) | ||
| } | ||
| Expr::Case { .. } => true, | ||
| // Use explicit pattern match instead of a default |
|
Thank you @haohuaijin and @jackwener -- let's merge this PR and we can work on #8910 as a follow on |
…expression (apache#8928) * fix: common_subexpr_eliminate rule should not apply to short-circuit expression * add more tests * format * minor * apply reviews * add some commont * fmt
… to v34 (#227) * fix: don't extract common sub expr in `CASE WHEN` clause (apache#8833) * fix: don't extract common sub expr in CASE WHEN clause * fix ci * fix * fix: common_subexpr_eliminate rule should not apply to short-circuit expression (apache#8928) * fix: common_subexpr_eliminate rule should not apply to short-circuit expression * add more tests * format * minor * apply reviews * add some commont * fmt --------- Co-authored-by: Huaijin <[email protected]>
Which issue does this PR close?
Closes #8874
Rationale for this change
see #8874
What changes are included in this PR?
before this pr
after this pr
But due to the reason describe in #8927, the above two query will get same result(get drive by zero error).
Are these changes tested?
yes, add some negative test
Are there any user-facing changes?