Commit 99f8489
[SPARK-34003][SQL][FOLLOWUP] Avoid pushing modified Char/Varchar sort attributes into aggregate for existing ones
### What changes were proposed in this pull request?
In 0f8e5dd, we partially fix the rule conflicts between `PaddingAndLengthCheckForCharVarchar` and `ResolveAggregateFunctions`, as error still exists in
sql like ```SELECT substr(v, 1, 2), sum(i) FROM t GROUP BY v ORDER BY substr(v, 1, 2)```
```sql
[info] Failed to analyze query: org.apache.spark.sql.AnalysisException: expression 'spark_catalog.default.t.`v`' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;
[info] Project [substr(v, 1, 2)#100, sum(i)#101L]
[info] +- Sort [aggOrder#102 ASC NULLS FIRST], true
[info] +- !Aggregate [v#106], [substr(v#106, 1, 2) AS substr(v, 1, 2)#100, sum(cast(i#98 as bigint)) AS sum(i)#101L, substr(v#103, 1, 2) AS aggOrder#102
[info] +- SubqueryAlias spark_catalog.default.t
[info] +- Project [if ((length(v#97) <= 3)) v#97 else if ((length(rtrim(v#97, None)) > 3)) cast(raise_error(concat(input string of length , cast(length(v#97) as string), exceeds varchar type length limitation: 3)) as string) else rpad(rtrim(v#97, None), 3, ) AS v#106, i#98]
[info] +- Relation[v#97,i#98] parquet
[info]
[info] Project [substr(v, 1, 2)#100, sum(i)#101L]
[info] +- Sort [aggOrder#102 ASC NULLS FIRST], true
[info] +- !Aggregate [v#106], [substr(v#106, 1, 2) AS substr(v, 1, 2)#100, sum(cast(i#98 as bigint)) AS sum(i)#101L, substr(v#103, 1, 2) AS aggOrder#102
[info] +- SubqueryAlias spark_catalog.default.t
[info] +- Project [if ((length(v#97) <= 3)) v#97 else if ((length(rtrim(v#97, None)) > 3)) cast(raise_error(concat(input string of length , cast(length(v#97) as string), exceeds varchar type length limitation: 3)) as string) else rpad(rtrim(v#97, None), 3, ) AS v#106, i#98]
[info] +- Relation[v#97,i#98] parquet
```
We need to look recursively into children to find char/varchars.
In this PR, we try to resolve the full attributes including the original `Aggregate` expressions and the candidates in `SortOrder` together, then use the new re-resolved `Aggregate` expressions to determine which candidate in the `SortOrder` shall be pushed. This can avoid mismatch for the same attributes w/o this change, as the expressions returned by `executeSameContext` will change when `PaddingAndLengthCheckForCharVarchar` takes effects. W/ this change, the expressions can be matched correctly.
For those unmatched, w need to look recursively into children to find char/varchars instead of the expression itself only.
### Why are the changes needed?
bugfix
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
add new tests
Closes #31129 from yaooqinn/SPARK-34003-F.
Authored-by: Kent Yao <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>1 parent 02a17e9 commit 99f8489
File tree
6 files changed
+59
-37
lines changed- sql
- catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
- core/src/test
- resources/tpcds-plan-stability/approved-plans-v1_4
- q85.sf100
- q85
- scala/org/apache/spark/sql
6 files changed
+59
-37
lines changedLines changed: 32 additions & 21 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2381 | 2381 | | |
2382 | 2382 | | |
2383 | 2383 | | |
2384 | | - | |
2385 | | - | |
2386 | | - | |
| 2384 | + | |
| 2385 | + | |
| 2386 | + | |
| 2387 | + | |
| 2388 | + | |
2387 | 2389 | | |
2388 | | - | |
2389 | | - | |
2390 | | - | |
| 2390 | + | |
| 2391 | + | |
| 2392 | + | |
| 2393 | + | |
2391 | 2394 | | |
2392 | 2395 | | |
2393 | 2396 | | |
| |||
2401 | 2404 | | |
2402 | 2405 | | |
2403 | 2406 | | |
2404 | | - | |
2405 | | - | |
2406 | | - | |
2407 | | - | |
2408 | | - | |
2409 | | - | |
| 2407 | + | |
| 2408 | + | |
| 2409 | + | |
| 2410 | + | |
| 2411 | + | |
| 2412 | + | |
| 2413 | + | |
2410 | 2414 | | |
2411 | | - | |
2412 | | - | |
2413 | | - | |
2414 | | - | |
| 2415 | + | |
| 2416 | + | |
| 2417 | + | |
| 2418 | + | |
| 2419 | + | |
| 2420 | + | |
| 2421 | + | |
| 2422 | + | |
2415 | 2423 | | |
2416 | | - | |
2417 | | - | |
| 2424 | + | |
2418 | 2425 | | |
2419 | | - | |
2420 | | - | |
2421 | | - | |
2422 | 2426 | | |
2423 | 2427 | | |
2424 | 2428 | | |
| |||
2443 | 2447 | | |
2444 | 2448 | | |
2445 | 2449 | | |
| 2450 | + | |
| 2451 | + | |
| 2452 | + | |
| 2453 | + | |
| 2454 | + | |
| 2455 | + | |
| 2456 | + | |
2446 | 2457 | | |
2447 | 2458 | | |
2448 | 2459 | | |
| |||
Lines changed: 8 additions & 8 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
101 | 101 | | |
102 | 102 | | |
103 | 103 | | |
104 | | - | |
| 104 | + | |
105 | 105 | | |
106 | 106 | | |
107 | 107 | | |
| |||
123 | 123 | | |
124 | 124 | | |
125 | 125 | | |
126 | | - | |
| 126 | + | |
127 | 127 | | |
128 | 128 | | |
129 | 129 | | |
| |||
229 | 229 | | |
230 | 230 | | |
231 | 231 | | |
232 | | - | |
| 232 | + | |
233 | 233 | | |
234 | 234 | | |
235 | 235 | | |
| |||
278 | 278 | | |
279 | 279 | | |
280 | 280 | | |
281 | | - | |
| 281 | + | |
282 | 282 | | |
283 | 283 | | |
284 | 284 | | |
| |||
302 | 302 | | |
303 | 303 | | |
304 | 304 | | |
305 | | - | |
| 305 | + | |
306 | 306 | | |
307 | 307 | | |
308 | 308 | | |
309 | 309 | | |
310 | 310 | | |
311 | 311 | | |
312 | | - | |
| 312 | + | |
313 | 313 | | |
314 | 314 | | |
315 | | - | |
316 | | - | |
| 315 | + | |
| 316 | + | |
317 | 317 | | |
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | | - | |
| 1 | + | |
2 | 2 | | |
3 | | - | |
| 3 | + | |
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
| |||
Lines changed: 4 additions & 4 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
272 | 272 | | |
273 | 273 | | |
274 | 274 | | |
275 | | - | |
| 275 | + | |
276 | 276 | | |
277 | 277 | | |
278 | 278 | | |
279 | 279 | | |
280 | 280 | | |
281 | 281 | | |
282 | | - | |
| 282 | + | |
283 | 283 | | |
284 | 284 | | |
285 | | - | |
286 | | - | |
| 285 | + | |
| 286 | + | |
287 | 287 | | |
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | | - | |
| 1 | + | |
2 | 2 | | |
3 | | - | |
| 3 | + | |
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
| |||
Lines changed: 11 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
474 | 474 | | |
475 | 475 | | |
476 | 476 | | |
| 477 | + | |
| 478 | + | |
| 479 | + | |
| 480 | + | |
| 481 | + | |
| 482 | + | |
| 483 | + | |
| 484 | + | |
| 485 | + | |
| 486 | + | |
| 487 | + | |
477 | 488 | | |
478 | 489 | | |
479 | 490 | | |
| |||
0 commit comments