Closed
Labels
enhancement: Any new improvement worthy of an entry in the changelog
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Parquet row groups are meant to contain large numbers of rows. This helps amortize statistics, metadata, and IO overheads, and make the best use of dictionary encoding.
Currently, every call to ArrowWriter::write creates a new row group. Writing many small batches therefore produces many small row groups, which loses the benefits described above.
Describe the solution you'd like
ArrowWriter should buffer rows across calls to write and only close a row group once it exceeds the configured WriterProperties::max_row_group_size.
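The requested behaviour could look roughly like the sketch below. It is a standalone illustration using only the standard library, not the parquet crate's implementation: `Batch`, `BufferingWriter`, and all of their methods are hypothetical names, and only row counts are tracked. It also assumes that an incoming batch is split so that each closed row group holds exactly `max_row_group_size` rows; an implementation could instead close the group after the write that crosses the limit.

```rust
/// Hypothetical stand-in for a record batch; only the row count matters here.
struct Batch {
    num_rows: usize,
}

/// Hypothetical writer that buffers rows across `write` calls instead of
/// closing a row group on every call.
struct BufferingWriter {
    max_row_group_size: usize,
    /// Rows buffered for the row group that is currently open.
    buffered_rows: usize,
    /// Row counts of the row groups closed so far.
    closed_row_groups: Vec<usize>,
}

impl BufferingWriter {
    fn new(max_row_group_size: usize) -> Self {
        assert!(max_row_group_size > 0, "row group size must be non-zero");
        Self {
            max_row_group_size,
            buffered_rows: 0,
            closed_row_groups: Vec::new(),
        }
    }

    /// Buffers the batch, closing a row group each time the buffer fills up.
    /// A batch larger than the remaining space is split across row groups.
    fn write(&mut self, batch: &Batch) {
        let mut remaining = batch.num_rows;
        while remaining > 0 {
            let space = self.max_row_group_size - self.buffered_rows;
            let take = remaining.min(space);
            self.buffered_rows += take;
            remaining -= take;
            if self.buffered_rows == self.max_row_group_size {
                self.flush();
            }
        }
    }

    /// Closes the open row group if it contains any rows.
    fn flush(&mut self) {
        if self.buffered_rows > 0 {
            self.closed_row_groups.push(self.buffered_rows);
            self.buffered_rows = 0;
        }
    }

    /// Flushes any remaining rows and returns the size of every row group.
    fn close(mut self) -> Vec<usize> {
        self.flush();
        self.closed_row_groups
    }
}
```

With a limit of 100 rows, ten writes of 35 rows each would produce three row groups of 100 rows and a final one of 50 rows in this sketch, rather than ten row groups of 35 rows.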