[WIP] Implement project for Transform. #264#269
[WIP] Implement project for Transform. #264#269marvinlanhenke wants to merge 6 commits intoapache:mainfrom
Conversation
|
@liurenjie1024 @Xuanwo
I would really appreciate your thoughts on this |
crates/iceberg/src/spec/transform.rs
Outdated
| PrimitiveLiteral::Long(v) => func | ||
| .transform(Arc::new(Int64Array::from_value(v, 1)))? | ||
| .as_any() | ||
| .downcast_ref::<Int32Array>() |
There was a problem hiding this comment.
@sdd
if i understood correctly the fn transform implementation of Bucket always returns arrow_array::Int32Array.
https://github.com/apache/iceberg-rust/blob/main/crates/iceberg/src/transform/bucket.rs#L89
|
@liurenjie1024 @sdd @Xuanwo @ZENOTME |
|
Thanks for this job!
I think the transform can provide an interface like |
| self.op | ||
| } | ||
|
|
||
| pub(crate) fn literals(&self) -> &FnvHashSet<Datum> { |
There was a problem hiding this comment.
| pub(crate) fn literals(&self) -> &FnvHashSet<Datum> { | |
| pub(crate) fn literals(&self) -> impl Iterator<Item=Datum> { |
How about returning iterator only? It's better not expose the inner data structure.
|
Hi, @marvinlanhenke This pr looks great to me. But I think @ZENOTME is right, we should implement #283 first. The reason we implement transform on arrow array could be found here |
I will init the interface as soon as possible to avoid blocking this. |
Thank you for providing the reason why we use arrow array in the first place. |
Sounds good to me |
#264
I just wanted to share this as a starting point - and to get some feedback, thoughts about the approach here.
@liurenjie1024 @Xuanwo
Please take a look if this is a viable solution at all.
I'm quite unsure about the
fn transformand the handling of the arrow_array - seems kinda clunky?Anyway, let me know what you think. best regards.