feat: Implement CAST between struct types#1074

Merged
andygrove merged 11 commits into apache:main from
andygrove:cast-struct-struct
Nov 11, 2024

Conversation

@andygrove
Member

Which issue does this PR close?

Closes #815

Rationale for this change

We need support for casting between struct types to support reading structs from Parquet using DataFusion's ParquetExec.

What changes are included in this PR?

How are these changes tested?

@andygrove andygrove changed the title [WIP] feat: Implement CAST between struct types feat: [WIP] Implement CAST between struct types Nov 11, 2024
@andygrove andygrove changed the title feat: [WIP] Implement CAST between struct types feat: Implement CAST between struct types Nov 11, 2024
@andygrove andygrove marked this pull request as ready for review November 11, 2024 17:03
@andygrove
Member Author

@parthchandra @mbutrovich could you review?

We need more extensive tests for sure, but it will be easier to add those as part of the comet-parquet-exec feature branch.

@andygrove andygrove requested a review from viirya November 11, 2024 18:34
) -> DataFusionResult<ArrayRef> {
    match (from_type, to_type) {
        (DataType::Struct(from_fields), DataType::Struct(to_fields)) => {
            assert!(to_fields.len() <= from_fields.len());
Member

Hmm, why do we have this assert? In Spark, the Cast expression requires the from_fields length to equal the to_fields length, so we shouldn't encounter a length mismatch in an analyzed query plan.

Member Author

Thanks, I have removed that. I was confused by an error about an unsupported cast that was dropping a struct field, but this came from DataFusion and I understand why now.
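
To make the point about field counts concrete, here is a minimal sketch of a positional struct-to-struct cast that reports a length mismatch as an error instead of asserting. It is a toy model, not the code in this PR: `DataType`, `Column`, `cast_column`, and this `cast_struct_to_struct` signature are hypothetical stand-ins built on the standard library only, and they do not reflect the Arrow or Comet APIs. Field names are omitted for brevity.

```rust
// Toy model only: these types are hypothetical stand-ins, not Arrow or Comet APIs.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Int64,
    Utf8,
    Struct(Vec<DataType>), // one target type per field, matched by position
}

#[derive(Debug, Clone, PartialEq)]
enum Column {
    Int32(Vec<Option<i32>>),
    Int64(Vec<Option<i64>>),
    Utf8(Vec<Option<String>>),
    Struct(Vec<Column>), // one child column per field
}

// Cast a single column to the requested type; unsupported pairs return Err.
fn cast_column(col: &Column, to: &DataType) -> Result<Column, String> {
    match (col, to) {
        (Column::Int32(v), DataType::Int32) => Ok(Column::Int32(v.clone())),
        (Column::Int32(v), DataType::Int64) => {
            Ok(Column::Int64(v.iter().map(|x| x.map(i64::from)).collect()))
        }
        (Column::Int32(v), DataType::Utf8) => {
            Ok(Column::Utf8(v.iter().map(|x| x.map(|n| n.to_string())).collect()))
        }
        (Column::Int64(v), DataType::Int64) => Ok(Column::Int64(v.clone())),
        (Column::Utf8(v), DataType::Utf8) => Ok(Column::Utf8(v.clone())),
        // Nested structs recurse through the struct-level cast.
        (Column::Struct(children), DataType::Struct(to_fields)) => {
            cast_struct_to_struct(children, to_fields)
        }
        (_, to) => Err(format!("unsupported cast to {:?}", to)),
    }
}

// Cast each child column to the target field type at the same position.
fn cast_struct_to_struct(children: &[Column], to_fields: &[DataType]) -> Result<Column, String> {
    // Assumption: source and target must have the same number of fields,
    // as the review comment says Spark's Cast requires.
    if children.len() != to_fields.len() {
        return Err(format!(
            "struct cast requires equal field counts, got {} and {}",
            children.len(),
            to_fields.len()
        ));
    }
    let cast_children = children
        .iter()
        .zip(to_fields)
        .map(|(child, ty)| cast_column(child, ty))
        .collect::<Result<Vec<_>, _>>()?;
    Ok(Column::Struct(cast_children))
}
```

In this sketch, casting a two-field struct to a one-field target yields an `Err` rather than a panic, which is roughly the behavior one would want if a mismatched plan ever did reach native code.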

@viirya
Member

viirya commented Nov 11, 2024

> We need more extensive tests for sure, but it will be easier to add those as part of the comet-parquet-exec feature branch.

Spark should have many Cast expression tests. As long as we pass the Spark tests, it should be fine for the general cases.

Member

@viirya viirya left a comment

Looks good to me. Just one question about the assert added.

@andygrove andygrove merged commit 712658e into apache:main Nov 11, 2024
@andygrove andygrove deleted the cast-struct-struct branch November 11, 2024 21:25
coderfender pushed a commit to coderfender/datafusion-comet that referenced this pull request Dec 13, 2025
* implement basic native code for casting struct to struct

* add another test

* rustdoc

* add scala side

* code cleanup

* clippy

* clippy

* add scala test

* improve test

* remove assert

* clippy
Development

Successfully merging this pull request may close these issues.

Implement cast between struct types