Bulk and ad-hoc parquet export APIs #97

@nwoolmer

Currently, Parquet files can be exported from QuestDB by converting a table's partitions to Parquet format and copying them out. This conversion goes via the WAL, so it can impact ingestion performance.

Instead, we will add two more ways to export data in Parquet format:

  • bulk export: COPY ... TO ... WITH FORMAT PARQUET;
    • This supports exporting a table or an arbitrary query to Parquet
    • The Parquet output can be configured (row group sizes, compression codecs)
    • The exported data can be given a new partitioning
  • ad-hoc export: /exp?query=...&format=parquet
    • This is similar to the above option, but works via the REST API
    • This allows you to download smaller datasets in Parquet format, instead of CSV.
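
As a rough illustration of the ad-hoc option, here is a minimal Python sketch of a client calling the proposed endpoint. Only the `/exp` path and the `query` and `format=parquet` parameters come from the description above; the base URL (`http://localhost:9000`), the helper names, and the `trades` table in the usage note are assumptions for illustration, and the final API may differ.

```python
import shutil
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_export_url(query, base_url="http://localhost:9000"):
    """Build an /exp URL requesting Parquet output for a SQL query.

    base_url is an assumed local instance; adjust it for your deployment.
    """
    params = {"query": query, "format": "parquet"}
    # quote_via=quote percent-encodes spaces as %20 rather than '+'.
    return f"{base_url}/exp?" + urlencode(params, quote_via=quote)


def download_parquet(query, dest_path, base_url="http://localhost:9000"):
    """Stream the response body to dest_path without loading it into memory."""
    url = build_export_url(query, base_url)
    with urlopen(url) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path
```

With a running instance, something like `download_parquet("SELECT * FROM trades", "trades.parquet")` would write the result to disk (`trades` is a hypothetical table name), and the resulting file could then be opened by any Parquet reader.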

Chronologically ordered Parquet files can be ingested very quickly into QuestDB. This feature will make it much simpler to get data in and out of the database, and to hook it up to other tools such as Pandas/Polars/DuckDB.

Metadata

    Labels

    QoL, SQL (SQL engine features), compatibility (Compatibility with 3rd party tools), open format, open source (Open source features), storage (Storage engine core features)

    Projects

    Status

    Shipped
