[GLUTEN-8497][VL] Fix columnar batch type mismatch in table cache by zhztheplayer · Pull Request #9230 · apache/gluten

zhztheplayer · 2025-04-04T15:00:16Z

This fixes the test case added in #8498.

The patch conditionally adds a no-op ColumnarToRowRemovalGuard node on top of a

+- ColumnarToRow
   +- FileScan parquet

which is about to be cached, to prevent this Spark code from removing the C2R. The plan then becomes:

ColumnarToRowRemovalGuard
+- ColumnarToRow
   +- FileScan parquet [l_orderkey_read#128L] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/tmp/spark-e732391d-d3f4-45e7-ae2e-d521d7658b01], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<l_orderkey_read:bigint>

which ColumnarCachedBatchSerializer treats as a regular row-based plan and therefore handles with the vanilla Spark batch serializer.

Previously, there was a columnar batch type mismatch error because Spark's cache planner removes the top ColumnarToRow but treats the rest of the plan as a vanilla Spark columnar plan.
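
For illustration, the effect of the guard can be sketched with a toy plan model. This is a minimal sketch, not the code in this patch: `Plan`, `FileScan`, `RemovalGuard`, `stripTopC2R` and `guard` are hypothetical stand-ins for the Spark and Gluten classes, and the wrapping condition (a ColumnarToRow sitting directly on a file scan) is an assumed simplification of whatever condition the patch actually checks.

```scala
// Toy model only: none of these types are Spark or Gluten classes.
sealed trait Plan { def supportsColumnar: Boolean }

// Stand-in for a vanilla Spark columnar scan (Batched: true).
case class FileScan(format: String) extends Plan { val supportsColumnar = true }

// Stand-in for the columnar-to-row transition; its output is row-based.
case class ColumnarToRow(child: Plan) extends Plan { val supportsColumnar = false }

// Stand-in for the no-op guard: it does nothing except sit above the C2R.
case class RemovalGuard(child: Plan) extends Plan { val supportsColumnar = false }

object CacheSketch {
  // Models the cache planner behaviour described above: a ColumnarToRow at the
  // very top of the plan is stripped so the cache can consume columnar output.
  def stripTopC2R(plan: Plan): Plan = plan match {
    case ColumnarToRow(child) => child
    case other                => other
  }

  // Models the conditional wrapping (assumed condition: C2R directly over a scan).
  def guard(plan: Plan): Plan = plan match {
    case c2r @ ColumnarToRow(_: FileScan) => RemovalGuard(c2r)
    case other                            => other
  }
}
```

In this model, stripping an unguarded `ColumnarToRow(FileScan("parquet"))` exposes the columnar scan to the cache, while the guarded plan passes through the stripping step unchanged and stays row-based, which matches the behaviour the description relies on.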

github-actions · 2025-04-04T15:00:33Z

#8497

github-actions · 2025-04-04T15:00:49Z

Run Gluten ClickHouse CI on ARM

github-actions · 2025-04-04T15:16:03Z

Run Gluten ClickHouse CI on ARM

zhouyuan

👍 LGTM

fixup

0a7b7f5

github-actions bot added the CORE (works for Gluten Core) and VELOX labels Apr 4, 2025

fixup

526c012

zhztheplayer marked this pull request as ready for review April 7, 2025 08:17

zhouyuan approved these changes Apr 9, 2025

View reviewed changes

zhztheplayer merged commit 9b9f63f into apache:main Apr 10, 2025
54 checks passed

zhztheplayer mentioned this pull request Jan 16, 2025

[VL] Use of columnar table cache causes error when data emitted from the cached plan is in vanilla columnar format #8497

Open

zhztheplayer mentioned this pull request Apr 24, 2025

[VL] Remove unused logic in ColumnarCachedBatchSerializer#supportsColumnarInput #9413

Merged

zhztheplayer merged 2 commits into apache:main from zhztheplayer:wip-fix-cache-1
