Implement parsing for ParquetWriterOptions #4693

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We are implementing configurable Parquet writing in DataFusion.

We want to allow users to specify the Parquet writing options (like compression) via strings like:

set parquet.writer_version = 2.0
set parquet.compression = zstd(5)

Describe the solution you'd like
Implement FromStr for the following structures, with some tests.

Each of these can be done via a separate PR

The basic code can probably be ported from DataFusion, with some unit tests added: https://github.com/apache/arrow-datafusion/blob/ed85abbb878ef3d60e43797376cb9a40955cd89a/datafusion/core/src/datasource/file_format/parquet.rs#L13

Bonus points for good error messages that give example values (like "Invalid encoding. Valid values: plain_dictionary, rle, etc.").
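To make the request concrete, here is a minimal sketch of what such a `FromStr` implementation with a helpful error message might look like. The names `SketchCompression`, `ParseOptionError`, `split_level`, and `VALID` are hypothetical and are not the parquet crate's actual types; the set of accepted values is also just an assumption for illustration.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for a compression setting (not the parquet crate's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchCompression {
    Uncompressed,
    Snappy,
    /// Zstd with an explicit level, written as `zstd(5)`.
    Zstd(i32),
}

/// Hypothetical error type carrying a human-readable message.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ParseOptionError(String);

impl fmt::Display for ParseOptionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

impl std::error::Error for ParseOptionError {}

/// Values listed in error messages (an assumed set, for illustration only).
const VALID: &str = "uncompressed, snappy, zstd(level)";

/// Splits `name(arg)` into `("name", Some("arg"))`; a bare `name` yields `("name", None)`.
fn split_level(s: &str) -> Result<(&str, Option<&str>), ParseOptionError> {
    match s.find('(') {
        None => Ok((s, None)),
        Some(open) => {
            let arg = s[open + 1..].strip_suffix(')').ok_or_else(|| {
                ParseOptionError(format!(
                    "Invalid compression '{s}': missing closing ')'. Valid values: {VALID}"
                ))
            })?;
            Ok((&s[..open], Some(arg)))
        }
    }
}

impl FromStr for SketchCompression {
    type Err = ParseOptionError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Accept any casing and surrounding whitespace.
        let lowered = s.trim().to_ascii_lowercase();
        let (name, level) = split_level(&lowered)?;
        match (name.trim(), level) {
            ("uncompressed", None) => Ok(SketchCompression::Uncompressed),
            ("snappy", None) => Ok(SketchCompression::Snappy),
            // The level is only checked to be an integer; no range validation here.
            ("zstd", Some(level)) => level
                .trim()
                .parse::<i32>()
                .map(SketchCompression::Zstd)
                .map_err(|_| {
                    ParseOptionError(format!(
                        "Invalid zstd level '{level}': expected an integer, e.g. zstd(5)"
                    ))
                }),
            _ => Err(ParseOptionError(format!(
                "Invalid compression '{s}'. Valid values: {VALID}"
            ))),
        }
    }
}
```

With something like this in place, `"zstd(5)".parse::<SketchCompression>()` would yield the `Zstd(5)` variant, while an unknown name would produce an error that lists the accepted values.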

Describe alternatives you've considered

Additional context

@devinjdangelo implemented parsing for these in apache/datafusion#7244; however, I think these features could be more generally useful to others.

Labels: enhancement, good first issue, parquet
