[Parquet] Reuse buffer in ByteViewArrayDecoderPlain #6930
Conversation
ByteViewArrayDecoderPlain ByteViewArrayDecoderPlain
alamb
left a comment
There was a problem hiding this comment.
Thanks @XiangpengHao -- this looks great to me
arrow-buffer/src/buffer/immutable.rs
Outdated
There was a problem hiding this comment.
Can this instead be a From implementation?
There was a problem hiding this comment.
I would like to recommend a From implementation in addition to the explicit function
My rationale is that Rust experts are likely to find the From implementation, but for beginners/intermediate, having an explicit method is easier to find
There was a problem hiding this comment.
I think the name is very confusing, as it isn't clear what external means. This confusion is not helped by the fact arrow has a Bytes type that isn't currently public but the from_bytes method is.
Just adding a From implementation is a way to decouple this PR from cleaning this mess up #6754
There was a problem hiding this comment.
I have some ideas of how to document / make this easier to find -- be back shortly
There was a problem hiding this comment.
Here is a proposal to add From impls and documentation:
There was a problem hiding this comment.
Sounds great! Let's wait #6939 to be merged and I'll rebase to that commit
There was a problem hiding this comment.
There was a problem hiding this comment.
I'll rebase to that commit
Done!
Co-authored-by: Raphael Taylor-Davies <[email protected]>
bc838a4 to
0111ae6
Compare
etseidl
left a comment
There was a problem hiding this comment.
LGTM. Thanks @XiangpengHao
|
Thanks again everyone! |
* reuse buffer in view array * Update parquet/src/arrow/array_reader/byte_view_array.rs Co-authored-by: Raphael Taylor-Davies <[email protected]> * use From<Bytes> instead --------- Co-authored-by: Raphael Taylor-Davies <[email protected]>
* reuse buffer in view array * Update parquet/src/arrow/array_reader/byte_view_array.rs Co-authored-by: Raphael Taylor-Davies <[email protected]> * use From<Bytes> instead --------- Co-authored-by: Raphael Taylor-Davies <[email protected]>
* reuse buffer in view array * Update parquet/src/arrow/array_reader/byte_view_array.rs Co-authored-by: Raphael Taylor-Davies <[email protected]> * use From<Bytes> instead --------- Co-authored-by: Raphael Taylor-Davies <[email protected]>
Which issue does this PR close?
Part of #6921
Rationale for this change
Avoid adding a new buffer to the stringview buffer if the buffer-to-be-add is the same as the last buffer in the buffer list.
This is typically not a problem as we usually have large continuous reads. But it becomes a problem with row-level filtering where we can have hundreds/thousands of small reads over the same buffer.
What changes are included in this PR?
Are there any user-facing changes?