Commit 91148f4
[SPARK-28481][SQL] More expressions should extend NullIntolerant
### What changes were proposed in this pull request?
1. Make more expressions extend `NullIntolerant`.
2. Add a checker (in `ExpressionInfoSuite`) to identify whether an expression should be `NullIntolerant`.
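To make the contract concrete, here is a minimal, self-contained sketch of what a null-intolerant marker trait expresses: if any input is null, the output is null. All types below (`Expr`, `Col`, `Upper`, `Coalesce`, `markerHolds`) are hypothetical stand-ins written for illustration. They are not Spark's Catalyst classes, and the real check in `ExpressionInfoSuite` is not shown in this commit message and may work differently.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
object NullIntolerantSketch {
  type Row = Map[String, Any]

  trait Expr { def eval(row: Row): Any }

  // Marker trait: "any null input produces a null output".
  trait NullIntolerant

  // Column reference; a missing or null column evaluates to null.
  case class Col(name: String) extends Expr {
    def eval(row: Row): Any = row.getOrElse(name, null)
  }

  // upper(null) is null, so this toy Upper can carry the marker.
  case class Upper(child: Expr) extends Expr with NullIntolerant {
    def eval(row: Row): Any = child.eval(row) match {
      case null  => null
      case other => other.toString.toUpperCase
    }
  }

  // coalesce tolerates null inputs, so it must NOT carry the marker.
  case class Coalesce(children: Seq[Expr]) extends Expr {
    def eval(row: Row): Any =
      children.iterator.map(_.eval(row)).find(_ != null).getOrElse(null)
  }

  // Assumed, simplified check: a marked expression must yield null
  // when evaluated on a row whose inputs are null.
  def markerHolds(expr: Expr, rowWithNullInput: Row): Boolean =
    !expr.isInstanceOf[NullIntolerant] || expr.eval(rowWithNullInput) == null

  def main(args: Array[String]): Unit = {
    val row: Row = Map("c1" -> null)
    println(markerHolds(Upper(Col("c1")), row))
    println(markerHolds(Coalesce(Seq(Col("c1"))), row))
  }
}
```

The marker carries no methods of its own. It is only a promise that an optimizer can rely on, which is why a test that checks the promise is a useful companion to the trait.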
### Why are the changes needed?
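When a join key is wrapped in a `NullIntolerant` expression, the optimizer can infer an `isnotnull` filter on the underlying column. This avoids a skewed join when the join column has many null values and can improve query performance. For example: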
```sql
CREATE TABLE t1(c1 string, c2 string) USING parquet;
CREATE TABLE t2(c1 string, c2 string) USING parquet;
EXPLAIN SELECT t1.* FROM t1 JOIN t2 ON upper(t1.c1) = upper(t2.c1);
```
The physical plan before this PR (first) and after this PR (second):
```sql
== Physical Plan ==
*(2) Project [c1#5, c2#6]
+- *(2) BroadcastHashJoin [upper(c1#5)], [upper(c1#7)], Inner, BuildLeft
   :- BroadcastExchange HashedRelationBroadcastMode(List(upper(input[0, string, true]))), [id=#41]
   :  +- *(1) ColumnarToRow
   :     +- FileScan parquet default.t1[c1#5,c2#6]
   +- *(2) ColumnarToRow
      +- FileScan parquet default.t2[c1#7]

== Physical Plan ==
*(2) Project [c1#5, c2#6]
+- *(2) BroadcastHashJoin [upper(c1#5)], [upper(c1#7)], Inner, BuildRight
   :- *(2) Project [c1#5, c2#6]
   :  +- *(2) Filter isnotnull(c1#5)
   :     +- *(2) ColumnarToRow
   :        +- FileScan parquet default.t1[c1#5,c2#6]
   +- BroadcastExchange HashedRelationBroadcastMode(List(upper(input[0, string, true]))), [id=#59]
      +- *(1) Project [c1#7]
         +- *(1) Filter isnotnull(c1#7)
            +- *(1) ColumnarToRow
               +- FileScan parquet default.t2[c1#7]
```
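The `Filter isnotnull(...)` nodes in the second plan follow from a simple argument: an inner equi-join only matches rows whose keys are non-null, and a null-intolerant key such as `upper(c1)` is non-null only if `c1` is non-null. The sketch below illustrates that reasoning on a toy expression tree. The types and the `requiredNonNull` helper are hypothetical and written for this example only. Spark's actual optimizer rule is not reproduced here.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
object InferNotNullSketch {
  trait Expr { def children: Seq[Expr] }

  // Marker trait: "any null input produces a null output".
  trait NullIntolerant

  case class Col(name: String) extends Expr {
    def children: Seq[Expr] = Nil
  }

  case class Upper(child: Expr) extends Expr with NullIntolerant {
    def children: Seq[Expr] = Seq(child)
  }

  // Not marked: coalesce can return non-null even if some inputs are null.
  case class Coalesce(children: Seq[Expr]) extends Expr

  // Columns that must be non-null for `e` itself to be non-null.
  // We only look through nodes that carry the marker.
  def requiredNonNull(e: Expr): Set[String] = e match {
    case Col(name) => Set(name)
    case _ if e.isInstanceOf[NullIntolerant] =>
      e.children.flatMap(requiredNonNull).toSet
    case _ => Set.empty
  }

  def main(args: Array[String]): Unit = {
    // Join key upper(c1): an isnotnull(c1) filter can be inferred.
    println(requiredNonNull(Upper(Col("c1"))))
    // Join key coalesce(c1, c2): nothing can be inferred.
    println(requiredNonNull(Coalesce(Seq(Col("c1"), Col("c2")))))
  }
}
```

In this model, removing `with NullIntolerant` from `Upper` makes `requiredNonNull` return an empty set for `upper(c1)`, which mirrors the first plan, where no filter appears below the join.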
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Unit test.
Closes #28626 from wangyum/SPARK-28481.
Authored-by: Yuming Wang <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
1 parent 37a1fb8 commit 91148f4
File tree
15 files changed, +180 -103 lines changed
- sql
  - catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
    - xml
  - core/src/test/scala/org/apache/spark/sql/expressions
Lines changed: 1 addition & 1 deletion
Lines changed: 4 additions & 2 deletions
Lines changed: 22 additions & 17 deletions
Lines changed: 2 additions & 2 deletions
Lines changed: 2 additions & 1 deletion