Skip to content

Conversation

@anishshri-db
Copy link
Contributor

What changes were proposed in this pull request?

Reorganize streaming execution dir around runtime and checkpoint areas

Why are the changes needed?

Refactoring for ease of future changes in this area

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing unit tests

Was this patch authored or co-authored using generative AI tooling?

No

@anishshri-db
Copy link
Contributor Author

@HeartSaVioR - PTAL, thx !

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@HeartSaVioR
Copy link
Contributor

@anishshri-db Please resolve conflict, thanks!

@LuciferYang
Copy link
Contributor

Although these files have been moved to a new folder, their compilation results will still be in the original package because the package name hasn't been changed. Is this the expected behavior?

@anishshri-db
Copy link
Contributor Author

@LuciferYang - package naming is not consistent across bunch of these files. Refactoring that would require a big change I think. Not sure whether we should do it or not (or whether it would break any external dependencies ?) cc - @HeartSaVioR - thoughts ?

@LuciferYang
Copy link
Contributor

@LuciferYang - package naming is not consistent across bunch of these files. Refactoring that would require a big change I think. Not sure whether we should do it or not (or whether it would break any external dependencies ?) cc - @HeartSaVioR - thoughts ?

I'm currently trying to refactor the code. If it doesn't cause any negative impacts, I'll submit a pr.

@anishshri-db
Copy link
Contributor Author

I'm currently trying to refactor the code. If it doesn't cause any negative impacts, I'll submit a pr.

ok sounds good. Thanks !

baibaichen added a commit to apache/incubator-gluten that referenced this pull request Dec 31, 2025
baibaichen added a commit to baibaichen/gluten that referenced this pull request Dec 31, 2025
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 4, 2026
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 4, 2026
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 4, 2026
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 4, 2026
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 5, 2026
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 5, 2026
| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| - | Feat | Feature | Introduce Spark41Shims and update build configuration to support Spark 4.1. | pom.xml<br>shims/pom.xml<br>shims/spark41/pom.xml<br>shims/spark41/.../META-INF/services/org.apache.gluten.sql.shims.SparkShimProvider<br>shims/spark41/.../spark41/Spark41Shims.scala<br>shims/spark41/.../spark41/SparkShimProvider.scala |
| [#51477](apache/spark#51477) | Fix | Compatibility | Use class name instead of class object for streaming call detection to ensure Spark 4.1 compatibility. | gluten-core/.../caller/CallerInfo.scala |
| [#50852](apache/spark#50852) | Fix | Compatibility | Add printOutputColumns parameter to generateTreeString methods | shims/spark41/.../GenerateTreeStringShim.scala |
| [#51775](apache/spark#51775) | Fix | Compatibility | Remove unused MDC import in FileSourceScanExecShim.scala | shims/spark41/.../FileSourceScanExecShim.scala |
| [#51979](apache/spark#51979) | Fix | Compatibility | Add missing StoragePartitionJoinParams import in BatchScanExecShim and AbstractBatchScanExec | shims/spark41/.../v2/AbstractBatchScanExec.scala<br>shims/spark41/.../v2/BatchScanExecShim.scala |
| [#51302](apache/spark#51302) | Fix | Compatibility | Remove TimeAdd from ExpressionConverter and ExpressionMappings for test | gluten-substrait/.../ExpressionConverter.scala<br>gluten-substrait/.../ExpressionMappings.scala |
| [#50598](apache/spark#50598) | Fix | Compatibility | Adapt to QueryExecution.createSparkPlan interface change | gluten-substrait/.../GlutenImplicits.scala<br>shims/spark*/.../QueryExecutionShim.scala |
| [#52599](apache/spark#52599) | Fix | Compatibility | Adapt to DataSourceV2Relation interface change | backends-velox/.../ArrowConvertorRule.scala<br>shims/spark*/.../v2/DataSourceV2RelationShim.scala |
| [#52384](apache/spark#52384) | Fix | Compatibility | Using new interface of ParquetFooterReader | backends-velox/.../ParquetMetadataUtils.scala<br>gluten-ut/spark40/.../parquet/GlutenParquetRowIndexSuite.scala<br>shims/spark*/.../parquet/ParquetFooterReaderShim.scala |
| [#52509](apache/spark#52509) | Fix | Build | Update Scala version to 2.13.17 in pom.xml to fix `java.lang.NoSuchMethodError: 'java.lang.String scala.util.hashing.MurmurHash3$.caseClassHash$default$2()'` | pom.xml |
| - | Fix | Test | Refactor Spark version checks in VeloxHashJoinSuite to improve readability and maintainability | backends-velox/.../VeloxHashJoinSuite.scala |
| [#50849](apache/spark#50849) | Fix | Test | Fix MiscOperatorSuite to support OneRowRelationExec plan Spark 4.1 | backends-velox/.../MiscOperatorSuite.scala |
| [#52723](apache/spark#52723) | Fix | Compatibility | Add GeographyVal and GeometryVal support in ColumnarArrayShim | shims/spark41/.../vectorized/ColumnarArrayShim.java |
| [#48470](apache/spark#48470) | 4.1.0 | Exclude | Exclude split test in VeloxStringFunctionsSuite | backends-velox/.../VeloxStringFunctionsSuite.scala |
| [#51259](apache/spark#51259) | 4.1.0 | Exclude | Only Run ArrowEvalPythonExecSuite tests up to Spark 4.0, we need update ci python to 3.10 | backends-velox/.../python/ArrowEvalPythonExecSuite.scala |
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 5, 2026
| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| - | Feat | Feature | Introduce Spark41Shims and update build configuration to support Spark 4.1. | pom.xml<br>shims/pom.xml<br>shims/spark41/pom.xml<br>shims/spark41/.../META-INF/services/org.apache.gluten.sql.shims.SparkShimProvider<br>shims/spark41/.../spark41/Spark41Shims.scala<br>shims/spark41/.../spark41/SparkShimProvider.scala |
| [#51477](apache/spark#51477) | Fix | Compatibility | Use class name instead of class object for streaming call detection to ensure Spark 4.1 compatibility. | gluten-core/.../caller/CallerInfo.scala |
| [#50852](apache/spark#50852) | Fix | Compatibility | Add printOutputColumns parameter to generateTreeString methods | shims/spark41/.../GenerateTreeStringShim.scala |
| [#51775](apache/spark#51775) | Fix | Compatibility | Remove unused MDC import in FileSourceScanExecShim.scala | shims/spark41/.../FileSourceScanExecShim.scala |
| [#51979](apache/spark#51979) | Fix | Compatibility | Add missing StoragePartitionJoinParams import in BatchScanExecShim and AbstractBatchScanExec | shims/spark41/.../v2/AbstractBatchScanExec.scala<br>shims/spark41/.../v2/BatchScanExecShim.scala |
| [#51302](apache/spark#51302) | Fix | Compatibility | Remove TimeAdd from ExpressionConverter and ExpressionMappings for test | gluten-substrait/.../ExpressionConverter.scala<br>gluten-substrait/.../ExpressionMappings.scala |
| [#50598](apache/spark#50598) | Fix | Compatibility | Adapt to QueryExecution.createSparkPlan interface change | gluten-substrait/.../GlutenImplicits.scala<br>shims/spark*/.../QueryExecutionShim.scala |
| [#52599](apache/spark#52599) | Fix | Compatibility | Adapt to DataSourceV2Relation interface change | backends-velox/.../ArrowConvertorRule.scala<br>shims/spark*/.../v2/DataSourceV2RelationShim.scala |
| [#52384](apache/spark#52384) | Fix | Compatibility | Using new interface of ParquetFooterReader | backends-velox/.../ParquetMetadataUtils.scala<br>gluten-ut/spark40/.../parquet/GlutenParquetRowIndexSuite.scala<br>shims/spark*/.../parquet/ParquetFooterReaderShim.scala |
| [#52509](apache/spark#52509) | Fix | Build | Update Scala version to 2.13.17 in pom.xml to fix `java.lang.NoSuchMethodError: 'java.lang.String scala.util.hashing.MurmurHash3$.caseClassHash$default$2()'` | pom.xml |
| - | Fix | Test | Refactor Spark version checks in VeloxHashJoinSuite to improve readability and maintainability | backends-velox/.../VeloxHashJoinSuite.scala |
| [#50849](apache/spark#50849) | Fix | Test | Fix MiscOperatorSuite to support OneRowRelationExec plan Spark 4.1 | backends-velox/.../MiscOperatorSuite.scala |
| [#52723](apache/spark#52723) | Fix | Compatibility | Add GeographyVal and GeometryVal support in ColumnarArrayShim | shims/spark41/.../vectorized/ColumnarArrayShim.java |
| [#48470](apache/spark#48470) | 4.1.0 | Exclude | Exclude split test in VeloxStringFunctionsSuite | backends-velox/.../VeloxStringFunctionsSuite.scala |
| [#51259](apache/spark#51259) | 4.1.0 | Exclude | Only Run ArrowEvalPythonExecSuite tests up to Spark 4.0, we need update ci python to 3.10 | backends-velox/.../python/ArrowEvalPythonExecSuite.scala |
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 5, 2026
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 6, 2026
| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| - | Feat | Feature | Introduce Spark41Shims and update build configuration to support Spark 4.1. | pom.xml<br>shims/pom.xml<br>shims/spark41/pom.xml<br>shims/spark41/.../META-INF/services/org.apache.gluten.sql.shims.SparkShimProvider<br>shims/spark41/.../spark41/Spark41Shims.scala<br>shims/spark41/.../spark41/SparkShimProvider.scala |
| [#51477](apache/spark#51477) | Fix | Compatibility | Use class name instead of class object for streaming call detection to ensure Spark 4.1 compatibility. | gluten-core/.../caller/CallerInfo.scala |
| [#50852](apache/spark#50852) | Fix | Compatibility | Add printOutputColumns parameter to generateTreeString methods | shims/spark41/.../GenerateTreeStringShim.scala |
| [#51775](apache/spark#51775) | Fix | Compatibility | Remove unused MDC import in FileSourceScanExecShim.scala | shims/spark41/.../FileSourceScanExecShim.scala |
| [#51979](apache/spark#51979) | Fix | Compatibility | Add missing StoragePartitionJoinParams import in BatchScanExecShim and AbstractBatchScanExec | shims/spark41/.../v2/AbstractBatchScanExec.scala<br>shims/spark41/.../v2/BatchScanExecShim.scala |
| [#51302](apache/spark#51302) | Fix | Compatibility | Remove TimeAdd from ExpressionConverter and ExpressionMappings for test | gluten-substrait/.../ExpressionConverter.scala<br>gluten-substrait/.../ExpressionMappings.scala |
| [#50598](apache/spark#50598) | Fix | Compatibility | Adapt to QueryExecution.createSparkPlan interface change | gluten-substrait/.../GlutenImplicits.scala<br>shims/spark\*/.../shims/spark\*/Spark*Shims.scala |
| [#52599](apache/spark#52599) | Fix | Compatibility | Adapt to DataSourceV2Relation interface change | backends-velox/.../ArrowConvertorRule.scala |
| [#52384](apache/spark#52384) | Fix | Compatibility | Using new interface of ParquetFooterReader | backends-velox/.../ParquetMetadataUtils.scala<br>gluten-ut/spark40/.../parquet/GlutenParquetRowIndexSuite.scala<br>shims/spark*/.../parquet/ParquetFooterReaderShim.scala |
| [#52509](apache/spark#52509) | Fix | Build | Update Scala version to 2.13.17 in pom.xml to fix `java.lang.NoSuchMethodError: 'java.lang.String scala.util.hashing.MurmurHash3$.caseClassHash$default$2()'` | pom.xml |
| - | Fix | Test | Refactor Spark version checks in VeloxHashJoinSuite to improve readability and maintainability | backends-velox/.../VeloxHashJoinSuite.scala |
| [#50849](apache/spark#50849) | Fix | Test | Fix MiscOperatorSuite to support OneRowRelationExec plan Spark 4.1 | backends-velox/.../MiscOperatorSuite.scala |
| [#52723](apache/spark#52723) | Fix | Compatibility | Add GeographyVal and GeometryVal support in ColumnarArrayShim | shims/spark41/.../vectorized/ColumnarArrayShim.java |
| [#48470](apache/spark#48470) | 4.1.0 | Exclude | Exclude split test in VeloxStringFunctionsSuite | backends-velox/.../VeloxStringFunctionsSuite.scala |
| [#51259](apache/spark#51259) | 4.1.0 | Exclude | Only Run ArrowEvalPythonExecSuite tests up to Spark 4.0, we need update ci python to 3.10 | backends-velox/.../python/ArrowEvalPythonExecSuite.scala |
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 6, 2026
| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| - | Feat | Feature | Introduce Spark41Shims and update build configuration to support Spark 4.1. | pom.xml<br>shims/pom.xml<br>shims/spark41/pom.xml<br>shims/spark41/.../META-INF/services/org.apache.gluten.sql.shims.SparkShimProvider<br>shims/spark41/.../spark41/Spark41Shims.scala<br>shims/spark41/.../spark41/SparkShimProvider.scala |
| [#51477](apache/spark#51477) | Fix | Compatibility | Use class name instead of class object for streaming call detection to ensure Spark 4.1 compatibility. | gluten-core/.../caller/CallerInfo.scala |
| [#50852](apache/spark#50852) | Fix | Compatibility | Add printOutputColumns parameter to generateTreeString methods | shims/spark41/.../GenerateTreeStringShim.scala |
| [#51775](apache/spark#51775) | Fix | Compatibility | Remove unused MDC import in FileSourceScanExecShim.scala | shims/spark41/.../FileSourceScanExecShim.scala |
| [#51979](apache/spark#51979) | Fix | Compatibility | Add missing StoragePartitionJoinParams import in BatchScanExecShim and AbstractBatchScanExec | shims/spark41/.../v2/AbstractBatchScanExec.scala<br>shims/spark41/.../v2/BatchScanExecShim.scala |
| [#51302](apache/spark#51302) | Fix | Compatibility | Remove TimeAdd from ExpressionConverter and ExpressionMappings for test | gluten-substrait/.../ExpressionConverter.scala<br>gluten-substrait/.../ExpressionMappings.scala |
| [#50598](apache/spark#50598) | Fix | Compatibility | Adapt to QueryExecution.createSparkPlan interface change | gluten-substrait/.../GlutenImplicits.scala<br>shims/spark\*/.../shims/spark\*/Spark*Shims.scala |
| [#52599](apache/spark#52599) | Fix | Compatibility | Adapt to DataSourceV2Relation interface change | backends-velox/.../ArrowConvertorRule.scala |
| [#52384](apache/spark#52384) | Fix | Compatibility | Using new interface of ParquetFooterReader | backends-velox/.../ParquetMetadataUtils.scala<br>gluten-ut/spark40/.../parquet/GlutenParquetRowIndexSuite.scala<br>shims/spark*/.../parquet/ParquetFooterReaderShim.scala |
| [#52509](apache/spark#52509) | Fix | Build | Update Scala version to 2.13.17 in pom.xml to fix `java.lang.NoSuchMethodError: 'java.lang.String scala.util.hashing.MurmurHash3$.caseClassHash$default$2()'` | pom.xml |
| - | Fix | Test | Refactor Spark version checks in VeloxHashJoinSuite to improve readability and maintainability | backends-velox/.../VeloxHashJoinSuite.scala |
| [#50849](apache/spark#50849) | Fix | Test | Fix MiscOperatorSuite to support OneRowRelationExec plan Spark 4.1 | backends-velox/.../MiscOperatorSuite.scala |
| [#52723](apache/spark#52723) | Fix | Compatibility | Add GeographyVal and GeometryVal support in ColumnarArrayShim | shims/spark41/.../vectorized/ColumnarArrayShim.java |
| [#48470](apache/spark#48470) | 4.1.0 | Exclude | Exclude split test in VeloxStringFunctionsSuite | backends-velox/.../VeloxStringFunctionsSuite.scala |
| [#51259](apache/spark#51259) | 4.1.0 | Exclude | Only Run ArrowEvalPythonExecSuite tests up to Spark 4.0, we need update ci python to 3.10 | backends-velox/.../python/ArrowEvalPythonExecSuite.scala |
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 6, 2026
| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| - | Feat | Feature | Introduce Spark41Shims and update build configuration to support Spark 4.1. | pom.xml<br>shims/pom.xml<br>shims/spark41/pom.xml<br>shims/spark41/.../META-INF/services/org.apache.gluten.sql.shims.SparkShimProvider<br>shims/spark41/.../spark41/Spark41Shims.scala<br>shims/spark41/.../spark41/SparkShimProvider.scala |
| [#51477](apache/spark#51477) | Fix | Compatibility | Use class name instead of class object for streaming call detection to ensure Spark 4.1 compatibility. | gluten-core/.../caller/CallerInfo.scala |
| [#50852](apache/spark#50852) | Fix | Compatibility | Add printOutputColumns parameter to generateTreeString methods | shims/spark41/.../GenerateTreeStringShim.scala |
| [#51775](apache/spark#51775) | Fix | Compatibility | Remove unused MDC import in FileSourceScanExecShim.scala | shims/spark41/.../FileSourceScanExecShim.scala |
| [#51979](apache/spark#51979) | Fix | Compatibility | Add missing StoragePartitionJoinParams import in BatchScanExecShim and AbstractBatchScanExec | shims/spark41/.../v2/AbstractBatchScanExec.scala<br>shims/spark41/.../v2/BatchScanExecShim.scala |
| [#51302](apache/spark#51302) | Fix | Compatibility | Remove TimeAdd from ExpressionConverter and ExpressionMappings for test | gluten-substrait/.../ExpressionConverter.scala<br>gluten-substrait/.../ExpressionMappings.scala |
| [#50598](apache/spark#50598) | Fix | Compatibility | Adapt to QueryExecution.createSparkPlan interface change | gluten-substrait/.../GlutenImplicits.scala<br>shims/spark\*/.../shims/spark\*/Spark*Shims.scala |
| [#52599](apache/spark#52599) | Fix | Compatibility | Adapt to DataSourceV2Relation interface change | backends-velox/.../ArrowConvertorRule.scala |
| [#52384](apache/spark#52384) | Fix | Compatibility | Using new interface of ParquetFooterReader | backends-velox/.../ParquetMetadataUtils.scala<br>gluten-ut/spark40/.../parquet/GlutenParquetRowIndexSuite.scala<br>shims/spark*/.../parquet/ParquetFooterReaderShim.scala |
| [#52509](apache/spark#52509) | Fix | Build | Update Scala version to 2.13.17 in pom.xml to fix `java.lang.NoSuchMethodError: 'java.lang.String scala.util.hashing.MurmurHash3$.caseClassHash$default$2()'` | pom.xml |
| - | Fix | Test | Refactor Spark version checks in VeloxHashJoinSuite to improve readability and maintainability | backends-velox/.../VeloxHashJoinSuite.scala |
| [#50849](apache/spark#50849) | Fix | Test | Fix MiscOperatorSuite to support OneRowRelationExec plan Spark 4.1 | backends-velox/.../MiscOperatorSuite.scala |
| [#52723](apache/spark#52723) | Fix | Compatibility | Add GeographyVal and GeometryVal support in ColumnarArrayShim | shims/spark41/.../vectorized/ColumnarArrayShim.java |
| [#48470](apache/spark#48470) | 4.1.0 | Exclude | Exclude split test in VeloxStringFunctionsSuite | backends-velox/.../VeloxStringFunctionsSuite.scala |
| [#51259](apache/spark#51259) | 4.1.0 | Exclude | Only Run ArrowEvalPythonExecSuite tests up to Spark 4.0, we need update ci python to 3.10 | backends-velox/.../python/ArrowEvalPythonExecSuite.scala |
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 7, 2026
| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| - | Feat | Feature | Introduce Spark41Shims and update build configuration to support Spark 4.1. | pom.xml<br>shims/pom.xml<br>shims/spark41/pom.xml<br>shims/spark41/.../META-INF/services/org.apache.gluten.sql.shims.SparkShimProvider<br>shims/spark41/.../spark41/Spark41Shims.scala<br>shims/spark41/.../spark41/SparkShimProvider.scala |
| [#51477](apache/spark#51477) | Fix | Compatibility | Use class name instead of class object for streaming call detection to ensure Spark 4.1 compatibility. | gluten-core/.../caller/CallerInfo.scala |
| [#50852](apache/spark#50852) | Fix | Compatibility | Add printOutputColumns parameter to generateTreeString methods | shims/spark41/.../GenerateTreeStringShim.scala |
| [#51775](apache/spark#51775) | Fix | Compatibility | Remove unused MDC import in FileSourceScanExecShim.scala | shims/spark41/.../FileSourceScanExecShim.scala |
| [#51979](apache/spark#51979) | Fix | Compatibility | Add missing StoragePartitionJoinParams import in BatchScanExecShim and AbstractBatchScanExec | shims/spark41/.../v2/AbstractBatchScanExec.scala<br>shims/spark41/.../v2/BatchScanExecShim.scala |
| [#51302](apache/spark#51302) | Fix | Compatibility | Remove TimeAdd from ExpressionConverter and ExpressionMappings for test | gluten-substrait/.../ExpressionConverter.scala<br>gluten-substrait/.../ExpressionMappings.scala |
| [#50598](apache/spark#50598) | Fix | Compatibility | Adapt to QueryExecution.createSparkPlan interface change | gluten-substrait/.../GlutenImplicits.scala<br>shims/spark\*/.../shims/spark\*/Spark*Shims.scala |
| [#52599](apache/spark#52599) | Fix | Compatibility | Adapt to DataSourceV2Relation interface change | backends-velox/.../ArrowConvertorRule.scala |
| [#52384](apache/spark#52384) | Fix | Compatibility | Using new interface of ParquetFooterReader | backends-velox/.../ParquetMetadataUtils.scala<br>gluten-ut/spark40/.../parquet/GlutenParquetRowIndexSuite.scala<br>shims/spark*/.../parquet/ParquetFooterReaderShim.scala |
| [#52509](apache/spark#52509) | Fix | Build | Update Scala version to 2.13.17 in pom.xml to fix `java.lang.NoSuchMethodError: 'java.lang.String scala.util.hashing.MurmurHash3$.caseClassHash$default$2()'` | pom.xml |
| - | Fix | Test | Refactor Spark version checks in VeloxHashJoinSuite to improve readability and maintainability | backends-velox/.../VeloxHashJoinSuite.scala |
| [#50849](apache/spark#50849) | Fix | Test | Fix MiscOperatorSuite to support OneRowRelationExec plan Spark 4.1 | backends-velox/.../MiscOperatorSuite.scala |
| [#52723](apache/spark#52723) | Fix | Compatibility | Add GeographyVal and GeometryVal support in ColumnarArrayShim | shims/spark41/.../vectorized/ColumnarArrayShim.java |
| [#48470](apache/spark#48470) | 4.1.0 | Exclude | Exclude split test in VeloxStringFunctionsSuite | backends-velox/.../VeloxStringFunctionsSuite.scala |
| [#51259](apache/spark#51259) | 4.1.0 | Exclude | Only Run ArrowEvalPythonExecSuite tests up to Spark 4.0, we need update ci python to 3.10 | backends-velox/.../python/ArrowEvalPythonExecSuite.scala |
baibaichen added a commit to apache/incubator-gluten that referenced this pull request Jan 7, 2026
| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| - | Feat | Feature | Introduce Spark41Shims and update build configuration to support Spark 4.1. | pom.xml<br>shims/pom.xml<br>shims/spark41/pom.xml<br>shims/spark41/.../META-INF/services/org.apache.gluten.sql.shims.SparkShimProvider<br>shims/spark41/.../spark41/Spark41Shims.scala<br>shims/spark41/.../spark41/SparkShimProvider.scala |
| [#51477](apache/spark#51477) | Fix | Compatibility | Use class name instead of class object for streaming call detection to ensure Spark 4.1 compatibility. | gluten-core/.../caller/CallerInfo.scala |
| [#50852](apache/spark#50852) | Fix | Compatibility | Add printOutputColumns parameter to generateTreeString methods | shims/spark41/.../GenerateTreeStringShim.scala |
| [#51775](apache/spark#51775) | Fix | Compatibility | Remove unused MDC import in FileSourceScanExecShim.scala | shims/spark41/.../FileSourceScanExecShim.scala |
| [#51979](apache/spark#51979) | Fix | Compatibility | Add missing StoragePartitionJoinParams import in BatchScanExecShim and AbstractBatchScanExec | shims/spark41/.../v2/AbstractBatchScanExec.scala<br>shims/spark41/.../v2/BatchScanExecShim.scala |
| [#51302](apache/spark#51302) | Fix | Compatibility | Remove TimeAdd from ExpressionConverter and ExpressionMappings for test | gluten-substrait/.../ExpressionConverter.scala<br>gluten-substrait/.../ExpressionMappings.scala |
| [#50598](apache/spark#50598) | Fix | Compatibility | Adapt to QueryExecution.createSparkPlan interface change | gluten-substrait/.../GlutenImplicits.scala<br>shims/spark\*/.../shims/spark\*/Spark*Shims.scala |
| [#52599](apache/spark#52599) | Fix | Compatibility | Adapt to DataSourceV2Relation interface change | backends-velox/.../ArrowConvertorRule.scala |
| [#52384](apache/spark#52384) | Fix | Compatibility | Using new interface of ParquetFooterReader | backends-velox/.../ParquetMetadataUtils.scala<br>gluten-ut/spark40/.../parquet/GlutenParquetRowIndexSuite.scala<br>shims/spark*/.../parquet/ParquetFooterReaderShim.scala |
| [#52509](apache/spark#52509) | Fix | Build | Update Scala version to 2.13.17 in pom.xml to fix `java.lang.NoSuchMethodError: 'java.lang.String scala.util.hashing.MurmurHash3$.caseClassHash$default$2()'` | pom.xml |
| - | Fix | Test | Refactor Spark version checks in VeloxHashJoinSuite to improve readability and maintainability | backends-velox/.../VeloxHashJoinSuite.scala |
| [#50849](apache/spark#50849) | Fix | Test | Fix MiscOperatorSuite to support OneRowRelationExec plan Spark 4.1 | backends-velox/.../MiscOperatorSuite.scala |
| [#52723](apache/spark#52723) | Fix | Compatibility | Add GeographyVal and GeometryVal support in ColumnarArrayShim | shims/spark41/.../vectorized/ColumnarArrayShim.java |
| [#48470](apache/spark#48470) | 4.1.0 | Exclude | Exclude split test in VeloxStringFunctionsSuite | backends-velox/.../VeloxStringFunctionsSuite.scala |
| [#51259](apache/spark#51259) | 4.1.0 | Exclude | Only Run ArrowEvalPythonExecSuite tests up to Spark 4.0, we need update ci python to 3.10 | backends-velox/.../python/ArrowEvalPythonExecSuite.scala |
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 7, 2026
## Changes

| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| N/A | Feat | Build | Update build configuration to support Spark 4.1 UT | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/pom.xml`, `gluten-ut/spark41/pom.xml`, `tools/gluten-it/pom.xml` |
| [#52165](apache/spark#52165) | Fix | Dependency | Update Parquet dependency version to 1.16.0 to avoid NoSuchMethodError issue | `gluten-ut/spark41/pom.xml` |
| [#51477](apache/spark#51477) | Fix | Compatibility | Update imports to reflect streaming runtime package refactoring in Apache Spark | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala`, `gluten-ut/spark41/.../GlutenStreamingQuerySuite.scala` |
| [#50674](apache/spark#50674) | Fix | Compatibility | Fix compatibility issue introduced by `TypedConfigBuilder` | `gluten-substrait/.../ExpressionConverter.scala`, `gluten-ut/spark41/.../GlutenCSVSuite.scala`, `gluten-ut/spark41/.../GlutenJsonSuite.scala` |
| [#49766](apache/spark#49766) | Fix | Compatibility | Disable V2 bucketing in GlutenDynamicPartitionPruningSuite since spark.sql.sources.v2.bucketing.enabled is now enabled by default | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala` |
| [#42414](apache/spark#42414), [#53038](apache/spark#53038) | Fix | Bug Fix | Resolve an issue introduced by SPARK-42414, as identified in SPARK-53038 | `backends-velox/.../VeloxBloomFilterAggregate.scala` |
| N/A | Fix | Bug Fix | Enforce row fallback for unsupported cached batches - keep columnar execution only when schema validation succeeds | `backends-velox/.../ColumnarCachedBatchSerializer.scala` |
| [SPARK-53132](apache/spark#53132), [SPARK-53142](apache/spark#53142) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests. Excluded tests: `SPARK-53322*`, `SPARK-54439*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [SPARK-53535](https://issues.apache.org/jira/browse/SPARK-53535), [SPARK-54220](https://issues.apache.org/jira/browse/SPARK-54220) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenParquetIOSuite tests. Excluded tests: `SPARK-53535*`, `vectorized reader: missing all struct fields*`, `SPARK-54220*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#52645](apache/spark#52645) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenStreamingQuerySuite tests. Excluded tests: `SPARK-53942: changing the number of stateless shuffle partitions via config`, `SPARK-53942: stateful shuffle partitions are retained from old checkpoint` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#47856](apache/spark#47856) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenDataFrameWindowFunctionsSuite and GlutenJoinSuite tests. Excluded tests: `SPARK-49386: Window spill with more than the inMemoryThreshold and spillSizeThreshold`, `SPARK-49386: test SortMergeJoin (with spill by size threshold)` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#52157](apache/spark#52157) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenQueryExecutionSuite tests. Excluded test: `#53413: Cleanup shuffle dependencies for commands` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#48470](apache/spark#48470) | 4.1.0 | Test Exclusion | Exclude split test in GlutenRegexpExpressionsSuite. Excluded test: `GlutenRegexpExpressionsSuite.SPLIT` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#51623](apache/spark#51623) | 4.1.0 | Test Exclusion | Add `spark.sql.unionOutputPartitioning=false` to Maven test args. Excluded tests: `GlutenBroadcastExchangeSuite.SPARK-52962`, `GlutenDataFrameSetOperationsSuite.SPARK-52921*` | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/spark41/.../VeloxTestSettings.scala`, `tools/gluten-it/common/.../Suite.scala` |
| N/A | 4.1.0 | Test Exclusion | Excludes failed SQL tests that need to be fixed for Spark 4.1 compatibility. Excluded tests: `decimalArithmeticOperations.sql`, `identifier-clause.sql`, `keywords.sql`, `literals.sql`, `operators.sql`, `exists-orderby-limit.sql`, `postgreSQL/date.sql`, `nonansi/keywords.sql`, `nonansi/literals.sql`, `datetime-legacy.sql`, `datetime-parsing-invalid.sql`, `misc-functions.sql` | `gluten-ut/spark41/.../VeloxSQLQueryTestSettings.scala` |
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 7, 2026
## Changes

| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| N/A | Feat | Build | Update build configuration to support Spark 4.1 UT | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/pom.xml`, `gluten-ut/spark41/pom.xml`, `tools/gluten-it/pom.xml` |
| [#52165](apache/spark#52165) | Fix | Dependency | Update Parquet dependency version to 1.16.0 to avoid NoSuchMethodError issue | `gluten-ut/spark41/pom.xml` |
| [#51477](apache/spark#51477) | Fix | Compatibility | Update imports to reflect streaming runtime package refactoring in Apache Spark | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala`, `gluten-ut/spark41/.../GlutenStreamingQuerySuite.scala` |
| [#50674](apache/spark#50674) | Fix | Compatibility | Fix compatibility issue introduced by `TypedConfigBuilder` | `gluten-substrait/.../ExpressionConverter.scala`, `gluten-ut/spark41/.../GlutenCSVSuite.scala`, `gluten-ut/spark41/.../GlutenJsonSuite.scala` |
| [#49766](apache/spark#49766) | Fix | Compatibility | Disable V2 bucketing in GlutenDynamicPartitionPruningSuite since spark.sql.sources.v2.bucketing.enabled is now enabled by default | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala` |
| [#42414](apache/spark#42414), [#53038](apache/spark#53038) | Fix | Bug Fix | Resolve an issue introduced by SPARK-42414, as identified in SPARK-53038 | `backends-velox/.../VeloxBloomFilterAggregate.scala` |
| N/A | Fix | Bug Fix | Enforce row fallback for unsupported cached batches - keep columnar execution only when schema validation succeeds | `backends-velox/.../ColumnarCachedBatchSerializer.scala` |
| [SPARK-53132](apache/spark#53132), [SPARK-53142](apache/spark#53142) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests. Excluded tests: `SPARK-53322*`, `SPARK-54439*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [SPARK-53535](https://issues.apache.org/jira/browse/SPARK-53535), [SPARK-54220](https://issues.apache.org/jira/browse/SPARK-54220) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenParquetIOSuite tests. Excluded tests: `SPARK-53535*`, `vectorized reader: missing all struct fields*`, `SPARK-54220*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#52645](apache/spark#52645) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenStreamingQuerySuite tests. Excluded tests: `SPARK-53942: changing the number of stateless shuffle partitions via config`, `SPARK-53942: stateful shuffle partitions are retained from old checkpoint` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#47856](apache/spark#47856) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenDataFrameWindowFunctionsSuite and GlutenJoinSuite tests. Excluded tests: `SPARK-49386: Window spill with more than the inMemoryThreshold and spillSizeThreshold`, `SPARK-49386: test SortMergeJoin (with spill by size threshold)` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#52157](apache/spark#52157) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenQueryExecutionSuite tests. Excluded test: `#53413: Cleanup shuffle dependencies for commands` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#48470](apache/spark#48470) | 4.1.0 | Test Exclusion | Exclude split test in GlutenRegexpExpressionsSuite. Excluded test: `GlutenRegexpExpressionsSuite.SPLIT` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#51623](apache/spark#51623) | 4.1.0 | Test Exclusion | Add `spark.sql.unionOutputPartitioning=false` to Maven test args. Excluded tests: `GlutenBroadcastExchangeSuite.SPARK-52962`, `GlutenDataFrameSetOperationsSuite.SPARK-52921*` | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/spark41/.../VeloxTestSettings.scala`, `tools/gluten-it/common/.../Suite.scala` |
| N/A | 4.1.0 | Test Exclusion | Excludes failed SQL tests that need to be fixed for Spark 4.1 compatibility. Excluded tests: `decimalArithmeticOperations.sql`, `identifier-clause.sql`, `keywords.sql`, `literals.sql`, `operators.sql`, `exists-orderby-limit.sql`, `postgreSQL/date.sql`, `nonansi/keywords.sql`, `nonansi/literals.sql`, `datetime-legacy.sql`, `datetime-parsing-invalid.sql`, `misc-functions.sql` | `gluten-ut/spark41/.../VeloxSQLQueryTestSettings.scala` |
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 7, 2026
## Changes

| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| N/A | Feat | Build | Update build configuration to support Spark 4.1 UT | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/pom.xml`, `gluten-ut/spark41/pom.xml`, `tools/gluten-it/pom.xml` |
| [#52165](apache/spark#52165) | Fix | Dependency | Update Parquet dependency version to 1.16.0 to avoid NoSuchMethodError issue | `gluten-ut/spark41/pom.xml` |
| [#51477](apache/spark#51477) | Fix | Compatibility | Update imports to reflect streaming runtime package refactoring in Apache Spark | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala`, `gluten-ut/spark41/.../GlutenStreamingQuerySuite.scala` |
| [#50674](apache/spark#50674) | Fix | Compatibility | Fix compatibility issue introduced by `TypedConfigBuilder` | `gluten-substrait/.../ExpressionConverter.scala`, `gluten-ut/spark41/.../GlutenCSVSuite.scala`, `gluten-ut/spark41/.../GlutenJsonSuite.scala` |
| [#49766](apache/spark#49766) | Fix | Compatibility | Disable V2 bucketing in GlutenDynamicPartitionPruningSuite since spark.sql.sources.v2.bucketing.enabled is now enabled by default | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala` |
| [#42414](apache/spark#42414), [#53038](apache/spark#53038) | Fix | Bug Fix | Resolve an issue introduced by SPARK-42414, as identified in SPARK-53038 | `backends-velox/.../VeloxBloomFilterAggregate.scala` |
| N/A | Fix | Bug Fix | Enforce row fallback for unsupported cached batches - keep columnar execution only when schema validation succeeds | `backends-velox/.../ColumnarCachedBatchSerializer.scala` |
| [SPARK-53132](apache/spark#53132), [SPARK-53142](apache/spark#53142) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests. Excluded tests: `SPARK-53322*`, `SPARK-54439*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [SPARK-53535](https://issues.apache.org/jira/browse/SPARK-53535), [SPARK-54220](https://issues.apache.org/jira/browse/SPARK-54220) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenParquetIOSuite tests. Excluded tests: `SPARK-53535*`, `vectorized reader: missing all struct fields*`, `SPARK-54220*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#52645](apache/spark#52645) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenStreamingQuerySuite tests. Excluded tests: `SPARK-53942: changing the number of stateless shuffle partitions via config`, `SPARK-53942: stateful shuffle partitions are retained from old checkpoint` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#47856](apache/spark#47856) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenDataFrameWindowFunctionsSuite and GlutenJoinSuite tests. Excluded tests: `SPARK-49386: Window spill with more than the inMemoryThreshold and spillSizeThreshold`, `SPARK-49386: test SortMergeJoin (with spill by size threshold)` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#52157](apache/spark#52157) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenQueryExecutionSuite tests. Excluded test: `#53413: Cleanup shuffle dependencies for commands` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#48470](apache/spark#48470) | 4.1.0 | Test Exclusion | Exclude split test in GlutenRegexpExpressionsSuite. Excluded test: `GlutenRegexpExpressionsSuite.SPLIT` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#51623](apache/spark#51623) | 4.1.0 | Test Exclusion | Add `spark.sql.unionOutputPartitioning=false` to Maven test args. Excluded tests: `GlutenBroadcastExchangeSuite.SPARK-52962`, `GlutenDataFrameSetOperationsSuite.SPARK-52921*` | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/spark41/.../VeloxTestSettings.scala`, `tools/gluten-it/common/.../Suite.scala` |
| N/A | 4.1.0 | Test Exclusion | Excludes failed SQL tests that need to be fixed for Spark 4.1 compatibility. Excluded tests: `decimalArithmeticOperations.sql`, `identifier-clause.sql`, `keywords.sql`, `literals.sql`, `operators.sql`, `exists-orderby-limit.sql`, `postgreSQL/date.sql`, `nonansi/keywords.sql`, `nonansi/literals.sql`, `datetime-legacy.sql`, `datetime-parsing-invalid.sql`, `misc-functions.sql` | `gluten-ut/spark41/.../VeloxSQLQueryTestSettings.scala` |
baibaichen added a commit to baibaichen/gluten that referenced this pull request Jan 8, 2026
## Changes

| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| N/A | Feat | Build | Update build configuration to support Spark 4.1 UT | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/pom.xml`, `gluten-ut/spark41/pom.xml`, `tools/gluten-it/pom.xml` |
| [#52165](apache/spark#52165) | Fix | Dependency | Update Parquet dependency version to 1.16.0 to avoid NoSuchMethodError issue | `gluten-ut/spark41/pom.xml` |
| [#51477](apache/spark#51477) | Fix | Compatibility | Update imports to reflect streaming runtime package refactoring in Apache Spark | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala`, `gluten-ut/spark41/.../GlutenStreamingQuerySuite.scala` |
| [#50674](apache/spark#50674) | Fix | Compatibility | Fix compatibility issue introduced by `TypedConfigBuilder` | `gluten-substrait/.../ExpressionConverter.scala`, `gluten-ut/spark41/.../GlutenCSVSuite.scala`, `gluten-ut/spark41/.../GlutenJsonSuite.scala` |
| [#49766](apache/spark#49766) | Fix | Compatibility | Disable V2 bucketing in GlutenDynamicPartitionPruningSuite since spark.sql.sources.v2.bucketing.enabled is now enabled by default | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala` |
| [#42414](apache/spark#42414), [#53038](apache/spark#53038) | Fix | Bug Fix | Resolve an issue introduced by SPARK-42414, as identified in SPARK-53038 | `backends-velox/.../VeloxBloomFilterAggregate.scala` |
| N/A | Fix | Bug Fix | Enforce row fallback for unsupported cached batches - keep columnar execution only when schema validation succeeds | `backends-velox/.../ColumnarCachedBatchSerializer.scala` |
| [SPARK-53132](apache/spark#53132), [SPARK-53142](apache/spark#53142) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests. Excluded tests: `SPARK-53322*`, `SPARK-54439*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [SPARK-53535](https://issues.apache.org/jira/browse/SPARK-53535), [SPARK-54220](https://issues.apache.org/jira/browse/SPARK-54220) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenParquetIOSuite tests. Excluded tests: `SPARK-53535*`, `vectorized reader: missing all struct fields*`, `SPARK-54220*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#52645](apache/spark#52645) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenStreamingQuerySuite tests. Excluded tests: `SPARK-53942: changing the number of stateless shuffle partitions via config`, `SPARK-53942: stateful shuffle partitions are retained from old checkpoint` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#47856](apache/spark#47856) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenDataFrameWindowFunctionsSuite and GlutenJoinSuite tests. Excluded tests: `SPARK-49386: Window spill with more than the inMemoryThreshold and spillSizeThreshold`, `SPARK-49386: test SortMergeJoin (with spill by size threshold)` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#52157](apache/spark#52157) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenQueryExecutionSuite tests. Excluded test: `#53413: Cleanup shuffle dependencies for commands` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#48470](apache/spark#48470) | 4.1.0 | Test Exclusion | Exclude split test in GlutenRegexpExpressionsSuite. Excluded test: `GlutenRegexpExpressionsSuite.SPLIT` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#51623](apache/spark#51623) | 4.1.0 | Test Exclusion | Add `spark.sql.unionOutputPartitioning=false` to Maven test args. Excluded tests: `GlutenBroadcastExchangeSuite.SPARK-52962`, `GlutenDataFrameSetOperationsSuite.SPARK-52921*` | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/spark41/.../VeloxTestSettings.scala`, `tools/gluten-it/common/.../Suite.scala` |
| N/A | 4.1.0 | Test Exclusion | Excludes failed SQL tests that need to be fixed for Spark 4.1 compatibility. Excluded tests: `decimalArithmeticOperations.sql`, `identifier-clause.sql`, `keywords.sql`, `literals.sql`, `operators.sql`, `exists-orderby-limit.sql`, `postgreSQL/date.sql`, `nonansi/keywords.sql`, `nonansi/literals.sql`, `datetime-legacy.sql`, `datetime-parsing-invalid.sql`, `misc-functions.sql` | `gluten-ut/spark41/.../VeloxSQLQueryTestSettings.scala` |
| apache#11252 | 4.1.0 | Test Exclusion | Exclude Gluten test for SPARK-47939: Explain should work with parameterized queries |  `gluten-ut/spark41/.../VeloxTestSettings.scala` |
QCLyu pushed a commit to QCLyu/incubator-gluten that referenced this pull request Jan 8, 2026
| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| - | Feat | Feature | Introduce Spark41Shims and update build configuration to support Spark 4.1. | pom.xml<br>shims/pom.xml<br>shims/spark41/pom.xml<br>shims/spark41/.../META-INF/services/org.apache.gluten.sql.shims.SparkShimProvider<br>shims/spark41/.../spark41/Spark41Shims.scala<br>shims/spark41/.../spark41/SparkShimProvider.scala |
| [#51477](apache/spark#51477) | Fix | Compatibility | Use class name instead of class object for streaming call detection to ensure Spark 4.1 compatibility. | gluten-core/.../caller/CallerInfo.scala |
| [#50852](apache/spark#50852) | Fix | Compatibility | Add printOutputColumns parameter to generateTreeString methods | shims/spark41/.../GenerateTreeStringShim.scala |
| [#51775](apache/spark#51775) | Fix | Compatibility | Remove unused MDC import in FileSourceScanExecShim.scala | shims/spark41/.../FileSourceScanExecShim.scala |
| [#51979](apache/spark#51979) | Fix | Compatibility | Add missing StoragePartitionJoinParams import in BatchScanExecShim and AbstractBatchScanExec | shims/spark41/.../v2/AbstractBatchScanExec.scala<br>shims/spark41/.../v2/BatchScanExecShim.scala |
| [#51302](apache/spark#51302) | Fix | Compatibility | Remove TimeAdd from ExpressionConverter and ExpressionMappings for test | gluten-substrait/.../ExpressionConverter.scala<br>gluten-substrait/.../ExpressionMappings.scala |
| [#50598](apache/spark#50598) | Fix | Compatibility | Adapt to QueryExecution.createSparkPlan interface change | gluten-substrait/.../GlutenImplicits.scala<br>shims/spark\*/.../shims/spark\*/Spark*Shims.scala |
| [#52599](apache/spark#52599) | Fix | Compatibility | Adapt to DataSourceV2Relation interface change | backends-velox/.../ArrowConvertorRule.scala |
| [#52384](apache/spark#52384) | Fix | Compatibility | Using new interface of ParquetFooterReader | backends-velox/.../ParquetMetadataUtils.scala<br>gluten-ut/spark40/.../parquet/GlutenParquetRowIndexSuite.scala<br>shims/spark*/.../parquet/ParquetFooterReaderShim.scala |
| [#52509](apache/spark#52509) | Fix | Build | Update Scala version to 2.13.17 in pom.xml to fix `java.lang.NoSuchMethodError: 'java.lang.String scala.util.hashing.MurmurHash3$.caseClassHash$default$2()'` | pom.xml |
| - | Fix | Test | Refactor Spark version checks in VeloxHashJoinSuite to improve readability and maintainability | backends-velox/.../VeloxHashJoinSuite.scala |
| [#50849](apache/spark#50849) | Fix | Test | Fix MiscOperatorSuite to support OneRowRelationExec plan Spark 4.1 | backends-velox/.../MiscOperatorSuite.scala |
| [#52723](apache/spark#52723) | Fix | Compatibility | Add GeographyVal and GeometryVal support in ColumnarArrayShim | shims/spark41/.../vectorized/ColumnarArrayShim.java |
| [#48470](apache/spark#48470) | 4.1.0 | Exclude | Exclude split test in VeloxStringFunctionsSuite | backends-velox/.../VeloxStringFunctionsSuite.scala |
| [#51259](apache/spark#51259) | 4.1.0 | Exclude | Only Run ArrowEvalPythonExecSuite tests up to Spark 4.0, we need update ci python to 3.10 | backends-velox/.../python/ArrowEvalPythonExecSuite.scala |
QCLyu pushed a commit to QCLyu/incubator-gluten that referenced this pull request Jan 8, 2026
| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| - | Feat | Feature | Introduce Spark41Shims and update build configuration to support Spark 4.1. | pom.xml<br>shims/pom.xml<br>shims/spark41/pom.xml<br>shims/spark41/.../META-INF/services/org.apache.gluten.sql.shims.SparkShimProvider<br>shims/spark41/.../spark41/Spark41Shims.scala<br>shims/spark41/.../spark41/SparkShimProvider.scala |
| [#51477](apache/spark#51477) | Fix | Compatibility | Use class name instead of class object for streaming call detection to ensure Spark 4.1 compatibility. | gluten-core/.../caller/CallerInfo.scala |
| [#50852](apache/spark#50852) | Fix | Compatibility | Add printOutputColumns parameter to generateTreeString methods | shims/spark41/.../GenerateTreeStringShim.scala |
| [#51775](apache/spark#51775) | Fix | Compatibility | Remove unused MDC import in FileSourceScanExecShim.scala | shims/spark41/.../FileSourceScanExecShim.scala |
| [#51979](apache/spark#51979) | Fix | Compatibility | Add missing StoragePartitionJoinParams import in BatchScanExecShim and AbstractBatchScanExec | shims/spark41/.../v2/AbstractBatchScanExec.scala<br>shims/spark41/.../v2/BatchScanExecShim.scala |
| [#51302](apache/spark#51302) | Fix | Compatibility | Remove TimeAdd from ExpressionConverter and ExpressionMappings for test | gluten-substrait/.../ExpressionConverter.scala<br>gluten-substrait/.../ExpressionMappings.scala |
| [#50598](apache/spark#50598) | Fix | Compatibility | Adapt to QueryExecution.createSparkPlan interface change | gluten-substrait/.../GlutenImplicits.scala<br>shims/spark\*/.../shims/spark\*/Spark*Shims.scala |
| [#52599](apache/spark#52599) | Fix | Compatibility | Adapt to DataSourceV2Relation interface change | backends-velox/.../ArrowConvertorRule.scala |
| [#52384](apache/spark#52384) | Fix | Compatibility | Using new interface of ParquetFooterReader | backends-velox/.../ParquetMetadataUtils.scala<br>gluten-ut/spark40/.../parquet/GlutenParquetRowIndexSuite.scala<br>shims/spark*/.../parquet/ParquetFooterReaderShim.scala |
| [#52509](apache/spark#52509) | Fix | Build | Update Scala version to 2.13.17 in pom.xml to fix `java.lang.NoSuchMethodError: 'java.lang.String scala.util.hashing.MurmurHash3$.caseClassHash$default$2()'` | pom.xml |
| - | Fix | Test | Refactor Spark version checks in VeloxHashJoinSuite to improve readability and maintainability | backends-velox/.../VeloxHashJoinSuite.scala |
| [#50849](apache/spark#50849) | Fix | Test | Fix MiscOperatorSuite to support OneRowRelationExec plan Spark 4.1 | backends-velox/.../MiscOperatorSuite.scala |
| [#52723](apache/spark#52723) | Fix | Compatibility | Add GeographyVal and GeometryVal support in ColumnarArrayShim | shims/spark41/.../vectorized/ColumnarArrayShim.java |
| [#48470](apache/spark#48470) | 4.1.0 | Exclude | Exclude split test in VeloxStringFunctionsSuite | backends-velox/.../VeloxStringFunctionsSuite.scala |
| [#51259](apache/spark#51259) | 4.1.0 | Exclude | Only Run ArrowEvalPythonExecSuite tests up to Spark 4.0, we need update ci python to 3.10 | backends-velox/.../python/ArrowEvalPythonExecSuite.scala |
baibaichen added a commit to apache/incubator-gluten that referenced this pull request Jan 8, 2026
## Changes

| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| N/A | Feat | Build | Update build configuration to support Spark 4.1 UT | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/pom.xml`, `gluten-ut/spark41/pom.xml`, `tools/gluten-it/pom.xml` |
| [#52165](apache/spark#52165) | Fix | Dependency | Update Parquet dependency version to 1.16.0 to avoid NoSuchMethodError issue | `gluten-ut/spark41/pom.xml` |
| [#51477](apache/spark#51477) | Fix | Compatibility | Update imports to reflect streaming runtime package refactoring in Apache Spark | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala`, `gluten-ut/spark41/.../GlutenStreamingQuerySuite.scala` |
| [#50674](apache/spark#50674) | Fix | Compatibility | Fix compatibility issue introduced by `TypedConfigBuilder` | `gluten-substrait/.../ExpressionConverter.scala`, `gluten-ut/spark41/.../GlutenCSVSuite.scala`, `gluten-ut/spark41/.../GlutenJsonSuite.scala` |
| [#49766](apache/spark#49766) | Fix | Compatibility | Disable V2 bucketing in GlutenDynamicPartitionPruningSuite since spark.sql.sources.v2.bucketing.enabled is now enabled by default | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala` |
| [#42414](apache/spark#42414), [#53038](apache/spark#53038) | Fix | Bug Fix | Resolve an issue introduced by SPARK-42414, as identified in SPARK-53038 | `backends-velox/.../VeloxBloomFilterAggregate.scala` |
| N/A | Fix | Bug Fix | Enforce row fallback for unsupported cached batches - keep columnar execution only when schema validation succeeds | `backends-velox/.../ColumnarCachedBatchSerializer.scala` |
| [SPARK-53132](apache/spark#53132), [SPARK-53142](apache/spark#53142) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests. Excluded tests: `SPARK-53322*`, `SPARK-54439*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [SPARK-53535](https://issues.apache.org/jira/browse/SPARK-53535), [SPARK-54220](https://issues.apache.org/jira/browse/SPARK-54220) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenParquetIOSuite tests. Excluded tests: `SPARK-53535*`, `vectorized reader: missing all struct fields*`, `SPARK-54220*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#52645](apache/spark#52645) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenStreamingQuerySuite tests. Excluded tests: `SPARK-53942: changing the number of stateless shuffle partitions via config`, `SPARK-53942: stateful shuffle partitions are retained from old checkpoint` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#47856](apache/spark#47856) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenDataFrameWindowFunctionsSuite and GlutenJoinSuite tests. Excluded tests: `SPARK-49386: Window spill with more than the inMemoryThreshold and spillSizeThreshold`, `SPARK-49386: test SortMergeJoin (with spill by size threshold)` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#52157](apache/spark#52157) | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenQueryExecutionSuite tests. Excluded test: `#53413: Cleanup shuffle dependencies for commands` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#48470](apache/spark#48470) | 4.1.0 | Test Exclusion | Exclude split test in GlutenRegexpExpressionsSuite. Excluded test: `GlutenRegexpExpressionsSuite.SPLIT` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| [#51623](apache/spark#51623) | 4.1.0 | Test Exclusion | Add `spark.sql.unionOutputPartitioning=false` to Maven test args. Excluded tests: `GlutenBroadcastExchangeSuite.SPARK-52962`, `GlutenDataFrameSetOperationsSuite.SPARK-52921*` | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/spark41/.../VeloxTestSettings.scala`, `tools/gluten-it/common/.../Suite.scala` |
| N/A | 4.1.0 | Test Exclusion | Excludes failed SQL tests that need to be fixed for Spark 4.1 compatibility. Excluded tests: `decimalArithmeticOperations.sql`, `identifier-clause.sql`, `keywords.sql`, `literals.sql`, `operators.sql`, `exists-orderby-limit.sql`, `postgreSQL/date.sql`, `nonansi/keywords.sql`, `nonansi/literals.sql`, `datetime-legacy.sql`, `datetime-parsing-invalid.sql`, `misc-functions.sql` | `gluten-ut/spark41/.../VeloxSQLQueryTestSettings.scala` |
| #11252 | 4.1.0 | Test Exclusion | Exclude Gluten test for SPARK-47939: Explain should work with parameterized queries |  `gluten-ut/spark41/.../VeloxTestSettings.scala` |
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants