Add proper support for null literal by introducing ScalarValue::Null#2364
Add proper support for null literal by introducing ScalarValue::Null#2364alamb merged 2 commits intoapache:masterfrom
null literal by introducing ScalarValue::Null#2364Conversation
|
For someone reviews this pr, I convert this pr to draft for code review. It's ok to go if new |
|
cc @alamb @andygrove @yjshen @xudong963 Please have a review, thank you |
Cargo.toml
Outdated
| codegen-units = 1 | ||
| lto = true | ||
|
|
||
| [patch.crates-io] |
There was a problem hiding this comment.
BTW arrow-rs 13.0.0 with these changes should be released sometime early next wee
| #[derive(Clone)] | ||
| pub enum ScalarValue { | ||
| /// represents null | ||
| Null, |
datafusion/common/src/scalar.rs
Outdated
| eq_array_primitive!(array, index, IntervalMonthDayNanoArray, val) | ||
| } | ||
| ScalarValue::Struct(_, _) => unimplemented!(), | ||
| ScalarValue::Null => false, |
| SQLExpr::Value(Value::SingleQuotedString(s)) => Ok(lit(s)), | ||
| SQLExpr::Value(Value::Null) => { | ||
| Ok(Expr::Literal(ScalarValue::Utf8(None))) | ||
| Ok(Expr::Literal(ScalarValue::Null)) |
There was a problem hiding this comment.
Well this explains a lot of odd type coercion errors I have been seeing
|
|
||
| /// coercion rules from NULL type. Since NULL can be casted to most of types in arrow, | ||
| /// either lhs or rhs is NULL, if NULL can be casted to type of the other side, the coecion is valid. | ||
| fn null_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> { |
ScalarValue::Null to dfnull literal by introducing ScalarValue::Null
|
cc @alamb PTAL, thank you ❤️ |
501add5 to
0100277
Compare
|
Since #2382 is merged, I switch this pr to |
|
cc @alamb @andygrove @yjshen @xudong963 |
| "| 999 |", | ||
| "+----------------------------------------------------------------------------------------------+", | ||
| "+----------------------------------------------------------------------------------------+", | ||
| "| CASE WHEN #t1.c1 = Utf8(\"a\") THEN Int64(1) WHEN NULL THEN Int64(2) ELSE Int64(999) END |", |
There was a problem hiding this comment.
This is the key difference here. Very nice 👍
| INNER JOIN (SELECT null AS id2) t2 ON id1 = id2"; | ||
|
|
||
| let expected = vec!["++", "++"]; | ||
| let expected = vec![ |
There was a problem hiding this comment.
This answer is not correct -- there should be no rows that match.
This is because the join should produce rows where id1 = id2 evaluates to true
However, null = null evaluates to null 🤯
Here is the query in postgres:
alamb=# SELECT * FROM (SELECT null AS id1) t1
INNER JOIN (SELECT null AS id2) t2 ON id1 = id2
alamb-# ;
id1 | id2
-----+-----
(0 rows)
| pub fn eq_null(left: &NullArray, _right: &NullArray) -> Result<BooleanArray> { | ||
| let length = left.len(); | ||
| make_boolean_array(length, false) | ||
| } |
There was a problem hiding this comment.
I don't think this is correct -- specifically I think the resulting boolean array should be full of nulls not false
Perhaps something like:
std::iter::repeat(left.len(), None).collect()0100277 to
13a1601
Compare
|
cc @alamb PTAL, thank you |
| INNER JOIN (SELECT null AS id2) t2 ON id1 = id2"; | ||
|
|
||
| let expected = vec!["++", "++"]; | ||
| #[rustfmt::skip] |
|
🎉 we are going to have real null support 😍 |
|
Thank you all @alamb @andygrove |
…ll` (apache#2364) * introduce null * fix fmt
…ll` (apache#2364) * introduce null * fix fmt
…ll` (apache#2364) * introduce null * fix fmt
…ll` (apache#2364) * introduce null * fix fmt
…ll` (apache#2364) * introduce null * fix fmt
…ll` (apache#2364) * introduce null * fix fmt
…ll` (apache#2364) * introduce null * fix fmt
…ll` (apache#2364) * introduce null * fix fmt
…ll` (apache#2364) * introduce null * fix fmt
Which issue does this PR close?
Closes #2363 .
Rationale for this change
To solve Null constants issues listed in #1184 , and since /apache/arrow-rs#1572 Null casted from and to most of types in arrow-rs kernel, it's reasonable that introduce Null type to df for type coercion.
What changes are included in this PR?
Introduce
ScalarValue::Nulltype to dfAre there any user-facing changes?
No.