[Variant] Support VariantBuilder to write to buffers owned by the caller #7805

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Right now VariantBuilder allocates and manages its own internal buffers (Vec).

As @harshmotw-db points out on #7783 (comment), for some use cases, such as writing multiple Variant objects into a contiguous array, it would avoid a copy (and thus be more performant) if the VariantBuilder could write into an array owned by the caller.

Describe the solution you'd like
Some way for VariantBuilder to write into an array owned by the caller.

Something like

// write multiple variant values back to back into the buffer
let mut output_metadata = Vec::with_capacity(8192);
let mut output_values = Vec::with_capacity(8192);

// write the first variant value to the buffers
let mut builder1 = VariantBuilder::with_existing_buffers(&mut output_metadata, &mut output_values);
 // .. write fields to builder 1.

builder1.finish();
// we know that output_metadata and output_values contain the first value, so we know 
// where the second value starts
let value2_metadata_offset = output_metadata.len();
let value2_value_offset = output_values.len();

// write the second variant value to the *SAME* buffers
let mut builder2 = VariantBuilder::with_existing_buffers(&mut output_metadata, &mut output_values);
 // .. write fields to builder 2.
builder2.finish();
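
To make the borrowing pattern concrete, here is a minimal, self-contained sketch of the same flow. `BorrowedBuilder`, `append_field`, and the length-prefixed byte layout are all hypothetical stand-ins invented for illustration; they are not the real VariantBuilder API or the Variant encoding. The sketch only shows that a builder holding `&mut Vec<u8>` can append in place, and that the caller can read the buffer lengths between builders to learn where the next value starts.

```rust
/// Hypothetical stand-in for a builder that appends to caller-owned buffers.
/// NOT the real VariantBuilder; the encoding is a toy length-prefixed format.
struct BorrowedBuilder<'a> {
    metadata: &'a mut Vec<u8>,
    values: &'a mut Vec<u8>,
}

impl<'a> BorrowedBuilder<'a> {
    fn with_existing_buffers(metadata: &'a mut Vec<u8>, values: &'a mut Vec<u8>) -> Self {
        Self { metadata, values }
    }

    /// Append one field: the name goes to `metadata`, the payload to `values`.
    /// Lengths are stored in a single byte, so this toy only handles short inputs.
    fn append_field(&mut self, name: &str, value: &[u8]) {
        self.metadata.push(name.len() as u8);
        self.metadata.extend_from_slice(name.as_bytes());
        self.values.push(value.len() as u8);
        self.values.extend_from_slice(value);
    }

    /// Consume the builder, which ends the mutable borrows of both buffers.
    fn finish(self) {}
}

/// Write two values back to back; return the buffers and the second value's offsets.
fn write_two_values() -> (Vec<u8>, Vec<u8>, usize, usize) {
    let mut output_metadata = Vec::with_capacity(8192);
    let mut output_values = Vec::with_capacity(8192);

    let mut builder1 =
        BorrowedBuilder::with_existing_buffers(&mut output_metadata, &mut output_values);
    builder1.append_field("a", &[1, 2]);
    builder1.finish();

    // The borrows ended with `finish`, so the caller can inspect the buffers again.
    let value2_metadata_offset = output_metadata.len();
    let value2_value_offset = output_values.len();

    let mut builder2 =
        BorrowedBuilder::with_existing_buffers(&mut output_metadata, &mut output_values);
    builder2.append_field("bc", &[3]);
    builder2.finish();

    (output_metadata, output_values, value2_metadata_offset, value2_value_offset)
}
```

Because `finish` takes `self` by value, the borrow checker allows the caller to read `len()` and create the next builder without any copying of the already written bytes.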

Open questions:

  1. How can we optimize the common case where the metadata is the same across values?
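
One possible direction for this question, offered purely as an assumption and not as a proposed design: keep a single dictionary of field names that is shared by all values, so a value whose fields are already known adds nothing to the metadata. The names `SharedMetadata`, `intern`, and `write_value`, and the byte layout, are hypothetical and exist only for this sketch.

```rust
use std::collections::HashMap;

/// Hypothetical shared dictionary of field names; not part of any real crate.
#[derive(Default)]
struct SharedMetadata {
    ids: HashMap<String, u32>,
    names: Vec<String>,
}

impl SharedMetadata {
    /// Return the id for `name`, adding it to the dictionary only if it is new.
    fn intern(&mut self, name: &str) -> u32 {
        if let Some(&id) = self.ids.get(name) {
            return id;
        }
        let id = self.names.len() as u32;
        self.ids.insert(name.to_string(), id);
        self.names.push(name.to_string());
        id
    }
}

/// Append one toy value to a caller-owned buffer as (field id, payload byte) pairs.
fn write_value(meta: &mut SharedMetadata, values: &mut Vec<u8>, fields: &[(&str, u8)]) {
    for (name, payload) in fields {
        let id = meta.intern(name);
        values.extend_from_slice(&id.to_le_bytes());
        values.push(*payload);
    }
}
```

In this sketch, writing a second value with the same field names grows only the values buffer; the dictionary stays the same size.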

Describe alternatives you've considered

Additional context

Labels

enhancement (any new improvement worthy of an entry in the changelog), parquet (changes to the parquet crate)