Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
BufferBuilder is effectively a Vec<T: ArrowNativeType> that can be converted to Buffer without needing to copy. This is an incredibly useful abstraction for incrementally building and mutating data in place, prior to freezing it for hand-off to other systems.
Unfortunately, it currently lacks some relatively minor functionality needed for this use case, in particular access to the data that has already been written. The parquet crate currently rolls its own ScalarBuffer to work around this, but as described in #1849, it would be nice to avoid this duplication.
Describe the solution you'd like
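Add the following APIs to BufferBuilder:
fn as_slice(&self) -> &[T] {
    // MutableBuffer exposes a byte pointer, so cast it to the element type
    unsafe { std::slice::from_raw_parts(self.buffer.as_ptr() as *const T, self.len) }
}
fn as_slice_mut(&mut self) -> &mut [T] {
    // Same cast as above, through the mutable pointer accessor
    unsafe { std::slice::from_raw_parts_mut(self.buffer.as_mut_ptr() as *mut T, self.len) }
}
// Shorten the builder to `len` elements (body omitted)
fn truncate(&mut self, len: usize) {}
// Grow the builder with zero-initialized elements (body omitted)
fn extend_zeroed(&mut self, len: usize) {}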
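To make the intended semantics concrete, here is a minimal sketch of the four methods on a hypothetical `SketchBuilder<T>` backed by a plain `Vec<T>` rather than arrow's `MutableBuffer`. The type name, `append`, `finish`, and `demo` are illustrative only and not part of the proposal. It also assumes that `truncate` behaves like `Vec::truncate`, that the `len` argument of `extend_zeroed` is the number of elements to add, and that `T::default()` stands in for a zeroed value (true for the primitive numeric types).

```rust
/// Hypothetical stand-in for `BufferBuilder<T>`; a `Vec<T>` replaces the
/// real `MutableBuffer` so that no unsafe code is needed in this sketch.
pub struct SketchBuilder<T: Copy + Default> {
    values: Vec<T>,
}

impl<T: Copy + Default> SketchBuilder<T> {
    pub fn new() -> Self {
        Self { values: Vec::new() }
    }

    /// Append a single value (illustrative helper).
    pub fn append(&mut self, v: T) {
        self.values.push(v);
    }

    /// Number of elements written so far.
    pub fn len(&self) -> usize {
        self.values.len()
    }

    /// Read access to everything written so far.
    pub fn as_slice(&self) -> &[T] {
        &self.values
    }

    /// In-place mutation of everything written so far.
    pub fn as_slice_mut(&mut self) -> &mut [T] {
        &mut self.values
    }

    /// Assumption: keep the first `len` elements; no-op if `len >= self.len()`.
    pub fn truncate(&mut self, len: usize) {
        self.values.truncate(len);
    }

    /// Assumption: append `additional` elements, each equal to `T::default()`.
    pub fn extend_zeroed(&mut self, additional: usize) {
        let new_len = self.values.len() + additional;
        self.values.resize(new_len, T::default());
    }

    /// Freeze the builder and hand off the data (illustrative helper).
    pub fn finish(self) -> Vec<T> {
        self.values
    }
}

/// Build, mutate in place, shrink, then pad with zeros.
pub fn demo() -> Vec<i32> {
    let mut b = SketchBuilder::new();
    b.append(1);
    b.append(2);
    b.append(3);
    b.as_slice_mut()[1] = 20; // rewrite an already-written value
    b.truncate(2);            // drop the trailing 3
    b.extend_zeroed(2);       // pad with two zeros
    b.finish()                // [1, 20, 0, 0]
}
```

In the real `BufferBuilder` the same operations would act on the underlying byte buffer, which is why the proposed `as_slice` and `as_slice_mut` need unsafe pointer casts.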
Describe alternatives you've considered
We could choose not to do this and leave BufferBuilder as it is.
Additional context
- Implement UnionArray FieldData using Type Erasure #1842 is necessary to provide these APIs safely
- BufferBuilder should be moved to its own module to avoid others mutating its inner MutableArray (Split up arrow::array::builder module #1843)