[SPARK-54439][SQL] KeyGroupedPartitioning and join key size mismatch #53142

peter-toth · 2025-11-20T19:37:23Z

What changes were proposed in this pull request?

Fix KeyGroupedShuffleSpec.createPartitioning(). The clustering required on the other side of the join can contain more expressions than the shuffle spec's KeyGroupedPartitioning has, so simply zipping the two sequences is not correct.
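To illustrate the mismatch, here is a minimal sketch in plain Scala. It is not Spark's code: the object name `ZipMismatchSketch`, the helper `zipByPosition`, and the string column names are hypothetical stand-ins for Catalyst expressions. It assumes a side that joins on three keys (`id`, `name`, `region`) but is key-grouped only by `region`.

```scala
// Simplified, hypothetical model; strings stand in for Catalyst expressions.
object ZipMismatchSketch {
  // Pairs the other side's join keys with this side's partition expressions
  // purely by index, then keeps the join key from each pair.
  def zipByPosition(otherSideKeys: Seq[String], partitionExprs: Seq[String]): Seq[String] =
    otherSideKeys.zip(partitionExprs).map { case (key, _) => key }

  def main(args: Array[String]): Unit = {
    // This side joins on (id, name, region) but is partitioned only by region.
    val partitionExprs = Seq("region")
    // Matching join keys on the other side, in the same order as this side's keys.
    val otherSideKeys = Seq("o_id", "o_name", "o_region")

    // zip truncates to the shorter sequence, so only index 0 is considered.
    val paired = zipByPosition(otherSideKeys, partitionExprs)
    println(s"zipped by index: $paired")
  }
}
```

Because `zip` pairs by index and stops at the shorter sequence, the partition expression on `region` ends up paired with `o_id` rather than `o_region` in this model.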

Why are the changes needed?

Fix a correctness issue due to wrong partitioning on the shuffle side.

Does this PR introduce any user-facing change?

Yes, affected queries now return correct results.

How was this patch tested?

Added new UT.

Was this patch authored or co-authored using generative AI tooling?

No.

dongjoon-hyun

Thank you, @peter-toth .

cc @szehon-ho and @sunchao from the following PR

#46255

dongjoon-hyun · 2025-11-20T19:43:17Z

Also, cc @cloud-fan , @viirya , too.

viirya · 2025-11-20T20:28:55Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala

-      case (c, e: TransformExpression) => TransformExpression(
-        e.function, Seq(c), e.numBucketsOpt)
-      case (c, _) => c
+    val clusteringMap = distribution.clustering.map(_.canonicalized).zip(clustering).toMap


Should we assert that the size of distribution.clustering matches clustering? Or are they guaranteed to match already?

I think by the time it gets here it should be, but it's a good idea to assert.

Ok, let me add the assert.

viirya · 2025-11-20T20:29:18Z

sql/core/src/test/scala/org/apache/spark/sql/connector/KeyGroupedPartitioningSuite.scala

+            " is not enabled")
+        }
+
+        checkAnswer(df, Seq(Row(1, "aa", 40.0, 42.0)))


Hmm, this looks like a correctness bug.

Yes, without the fix this check fails.

viirya

Thanks for this fix. The fix looks correct to me. Wait for @sunchao or @szehon-ho to confirm.

szehon-ho · 2025-11-20T22:50:12Z

FYI @chirag-s-db who has gained good knowledge of this area as well

szehon-ho · 2025-11-20T23:03:02Z

Let me look closer in an hour or two

szehon-ho

Looks like it should fix this case, but nice if @sunchao can also take a look

szehon-ho · 2025-11-21T08:51:39Z

I think by the time it gets here it should be, but it's a good idea to assert.

peter-toth · 2025-11-21T09:43:07Z

I just realized that we have KeyGroupedShuffleSpec.keyPositions available, and it can probably be used to build the partitioning similarly to how HashShuffleSpec does it. Let me validate the idea before merging.

szehon-ho · 2025-11-21T09:52:01Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala

-      case (c, _) => c
+    val clusteringMap = distribution.clustering.map(_.canonicalized).zip(clustering).toMap
+    val newExpressions: Seq[Expression] = partitioning.expressions.map {
+      case te: TransformExpression =>


Can we add a test for the transform case as well?

Added in 903a064.

peter-toth · 2025-11-21T14:36:27Z

I just realized that we have KeyGroupedShuffleSpec.keyPositions available, and it can probably be used to build the partitioning similarly to how HashShuffleSpec does it. Let me validate the idea before merging.

903a064 changes the implementation to use keyPositions instead of clusteringMap.
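One way to picture a position-based lookup, as opposed to zipping, is the sketch below. It is a simplified model in plain Scala and not the code from 903a064: `KeyPositionsSketch`, its two helpers, and the string column names are hypothetical, and it assumes every partition expression also appears among this side's join keys.

```scala
// Simplified, hypothetical model; strings stand in for Catalyst expressions.
object KeyPositionsSketch {
  // For each partition expression, collect the indices at which it appears
  // among this side's join keys.
  def findKeyPositions(thisSideKeys: Seq[String], partitionExprs: Seq[String]): Seq[Seq[Int]] =
    partitionExprs.map { expr =>
      thisSideKeys.zipWithIndex.collect { case (key, i) if key == expr => i }
    }

  // For each partition expression, pick the other side's join key at the
  // first matching index. Assumes every inner sequence is non-empty.
  def alignByPosition(otherSideKeys: Seq[String], positions: Seq[Seq[Int]]): Seq[String] =
    positions.map(p => otherSideKeys(p.head))

  def main(args: Array[String]): Unit = {
    val thisSideKeys = Seq("id", "name", "region")
    val partitionExprs = Seq("region")
    val otherSideKeys = Seq("o_id", "o_name", "o_region")

    val positions = findKeyPositions(thisSideKeys, partitionExprs)
    val aligned = alignByPosition(otherSideKeys, positions)
    println(s"positions: $positions, aligned keys: $aligned")
  }
}
```

In this model the lookup selects `o_region` for the `region` partition expression, regardless of how many extra join keys come before it.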

dongjoon-hyun · 2025-11-21T15:46:31Z

Thank you all. Ya, it would be nice if @sunchao could also take a look at this, because a correctness issue is a blocker for Apache Spark 4.1.0.

Looks like it should fix this case, but nice if @sunchao can also take a look

dongjoon-hyun

From my side, +1, LGTM.

sunchao · 2025-11-21T16:59:26Z

I'll take a look today.

dongjoon-hyun · 2025-11-21T17:01:13Z

Thank you so much!

sunchao

LGTM, thanks @peter-toth !

dongjoon-hyun · 2025-11-21T22:30:33Z

Thank you, @peter-toth and all!

Merged to master/4.1/4.0.

### What changes were proposed in this pull request?

Fix `KeyGroupedShuffleSpec.createPartitioning()` as clustering required at the other side of the join might contain more clustering expressions than the number of expressions in the shuffle spec's `KeyGroupedPartitioning`, so simply zipping them is not correct.

### Why are the changes needed?

Fix a correctness issue due to wrong partitioning on the shuffle side.

### Does this PR introduce _any_ user-facing change?

Yes, it fixes the query.

### How was this patch tested?

Added new UT.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #53142 from peter-toth/SPARK-54439-keygroupedpartitioning-and-join-key-size-mismatch.

Authored-by: Peter Toth <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 05602d5)
Signed-off-by: Dongjoon Hyun <[email protected]>

peter-toth · 2025-12-01T18:33:55Z

Thank you all for the review!

see 1. apache/spark#53132 2. apache/spark#53142

## Changes

| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| N/A | Feat | Build | Update build configuration to support Spark 4.1 UT | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/pom.xml`, `gluten-ut/spark41/pom.xml`, `tools/gluten-it/pom.xml` |
| [#52165](apache/spark#52165) | Fix | Dependency | Update Parquet dependency version to 1.16.0 to avoid NoSuchMethodError issue | `gluten-ut/spark41/pom.xml` |
| [#51477](apache/spark#51477) | Fix | Compatibility | Update imports to reflect streaming runtime package refactoring in Apache Spark | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala`, `gluten-ut/spark41/.../GlutenStreamingQuerySuite.scala` |
| [#50674](apache/spark#50674) | Fix | Compatibility | Fix compatibility issue introduced by `TypedConfigBuilder` | `gluten-substrait/.../ExpressionConverter.scala`, `gluten-ut/spark41/.../GlutenCSVSuite.scala`, `gluten-ut/spark41/.../GlutenJsonSuite.scala` |
| [#49766](apache/spark#49766) | Fix | Compatibility | Disable V2 bucketing in GlutenDynamicPartitionPruningSuite since spark.sql.sources.v2.bucketing.enabled is now enabled by default | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala` |
| [#42414](apache/spark#42414), [#53038](apache/spark#53038) | Fix | Bug Fix | Resolve an issue introduced by SPARK-42414, as identified in SPARK-53038 | `backends-velox/.../VeloxBloomFilterAggregate.scala` |
| N/A | Fix | Bug Fix | Enforce row fallback for unsupported cached batches - keep columnar execution only when schema validation succeeds | `backends-velox/.../ColumnarCachedBatchSerializer.scala` |
| [SPARK-53132](apache/spark#53132), [SPARK-53142](apache/spark#53142) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests. Excluded tests: `SPARK-53322*`, `SPARK-54439*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [SPARK-53535](https://issues.apache.org/jira/browse/SPARK-53535), [SPARK-54220](https://issues.apache.org/jira/browse/SPARK-54220) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenParquetIOSuite tests. Excluded tests: `SPARK-53535*`, `vectorized reader: missing all struct fields*`, `SPARK-54220*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#52645](apache/spark#52645) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenStreamingQuerySuite tests. Excluded tests: `SPARK-53942: changing the number of stateless shuffle partitions via config`, `SPARK-53942: stateful shuffle partitions are retained from old checkpoint` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#47856](apache/spark#47856) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenDataFrameWindowFunctionsSuite and GlutenJoinSuite tests. Excluded tests: `SPARK-49386: Window spill with more than the inMemoryThreshold and spillSizeThreshold`, `SPARK-49386: test SortMergeJoin (with spill by size threshold)` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#52157](apache/spark#52157) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenQueryExecutionSuite tests. Excluded test: `#53413: Cleanup shuffle dependencies for commands` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#48470](apache/spark#48470) | 4.1.0 | Test Exclusion | Exclude split test in GlutenRegexpExpressionsSuite. Excluded test: `GlutenRegexpExpressionsSuite.SPLIT` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#51623](apache/spark#51623) | 4.1.0 | Test Exclusion | Add `spark.sql.unionOutputPartitioning=false` to Maven test args. Excluded tests: `GlutenBroadcastExchangeSuite.SPARK-52962`, `GlutenDataFrameSetOperationsSuite.SPARK-52921*` | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/spark41/.../VeloxTestSettings.scala`, `tools/gluten-it/common/.../Suite.scala` |
| N/A | 4.1.0 | Test Exclusion | Excludes failed SQL tests that need to be fixed for Spark 4.1 compatibility. Excluded tests: `decimalArithmeticOperations.sql`, `identifier-clause.sql`, `keywords.sql`, `literals.sql`, `operators.sql`, `exists-orderby-limit.sql`, `postgreSQL/date.sql`, `nonansi/keywords.sql`, `nonansi/literals.sql`, `datetime-legacy.sql`, `datetime-parsing-invalid.sql`, `misc-functions.sql` | `gluten-ut/spark41/.../VeloxSQLQueryTestSettings.scala` |

[SPARK-54439][SQL] KeyGroupedPartitioning and join key size mismatch

a8584bb

github-actions bot added the SQL label Nov 20, 2025

dongjoon-hyun reviewed Nov 20, 2025

viirya reviewed Nov 20, 2025

simplify test, add comment

dad0740

szehon-ho approved these changes Nov 21, 2025

viirya approved these changes Nov 21, 2025

szehon-ho reviewed Nov 21, 2025

use keyPositions in createPartitioning(), add transform test

903a064

dongjoon-hyun approved these changes Nov 21, 2025

sunchao approved these changes Nov 21, 2025

dongjoon-hyun closed this in 05602d5 Nov 21, 2025

baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 22, 2025

[Fix] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

8eba2d3

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 29, 2025

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

6c4e2da

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 30, 2025

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

e6a65a3

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 31, 2025

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

c160349

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 31, 2025

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

26aab3d

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 31, 2025

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

359210a

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 31, 2025

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

e6125db

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 31, 2025

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

bb17787

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 31, 2025

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

e8ccf1d

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 31, 2025

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

6210aa7

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 31, 2025

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

83b0d2c

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 31, 2025

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

1e5814c

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 31, 2025

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

b5ea9fc

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 4, 2026

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

da3ae24

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 4, 2026

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

936085e

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 4, 2026

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

de3161d

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 4, 2026

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

10de474

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 4, 2026

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

1a14e1a

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 4, 2026

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

53435db

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 4, 2026

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

36678b3

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 4, 2026

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

8721127

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 5, 2026

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

2820d4c

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 5, 2026

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

1b385a8

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 5, 2026

[4.1.0] Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests

e1bba8c

see 1. apache/spark#53132 2. apache/spark#53142

baibaichen mentioned this pull request Jan 7, 2026

[GLUTEN-11343][CORE][VL] Support Spark 4.1 UT apache/incubator-gluten#11353

Merged

baibaichen mentioned this pull request Jan 13, 2026

[VL] Track on Spark-4.1.x failed unit tests apache/incubator-gluten#11400

Open
