fix(sql): export parquet support empty table/partition#6420

Merged: bluestreak01 merged 5 commits into master from export_empty_table_parquet on Nov 24, 2025


Conversation

kafka1991 (Collaborator) commented Nov 20, 2025

Fixes Parquet export so that it handles empty tables and partitions properly. Previously, exporting an empty table could fail. This change ensures that a valid Parquet file containing only the schema is generated.
Closes #6318

coderabbitai bot commented Nov 20, 2025

Walkthrough

This PR modifies parquet export functionality to gracefully handle empty source tables. Changes include removing row count validation in the Rust schema layer, implementing per-partition empty detection and population in Java serialization, and adding tests verifying successful export of empty tables with proper schema.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Rust validation relaxation: `core/rust/qdbr/src/parquet_write/schema.rs` | Removed the runtime assertion that required `row_count > 0` in `Column::from_raw_data`, allowing zero-row data to be processed. |
| Java parquet export logic: `core/src/main/java/io/questdb/cutlass/parquet/SerialParquetExporter.java` | Replaced table-level empty tracking with per-partition empty detection. Partitions with `openPartition <= 0` now use the new `PartitionEncoder.populateEmptyPartition()` instead of failing. Refactored the status update sequence to always mark the `CONVERTING_PARTITIONS` phase as finished before proceeding to file moves. Added logging after parquet conversion. |
| Empty partition population: `core/src/main/java/io/questdb/griffin/engine/table/parquet/PartitionEncoder.java` | Added public static method `populateEmptyPartition(TableReader, PartitionDescriptor, int)` to initialize a `PartitionDescriptor` for empty partitions, setting table name, timestamp index, and populating metadata columns with default zeroed parameters. |
| Export endpoint tests: `core/src/test/java/io/questdb/test/cutlass/http/ExpParquetExportTest.java` | Updated `testEmptyTable` to expect a successful Parquet binary response (starting with `PAR1`) instead of a JSON error payload. Fixed import location for `ActiveConnectionTracker`. |
| Copy export tests: `core/src/test/java/io/questdb/test/griffin/CopyExportTest.java` | Enhanced `testCopyParquetEmptyTable` to create an `all_types_empty` table with multiple data types (boolean, byte, short, int, long, float, double, string, symbol, timestamp_ns, array, timestamp) and verify that the exported Parquet content includes all defined columns. |
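
The `PAR1` expectation in the endpoint test relies on the Parquet file layout: a file begins and ends with the 4-byte ASCII magic `PAR1`. A minimal sketch of such a check follows. The class and method names are hypothetical and are not taken from the QuestDB test suite; it only inspects the magic bytes and does not validate the footer metadata.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of QuestDB: distinguishes a Parquet payload
// from, for example, a JSON error body by looking at the magic bytes only.
final class ParquetMagicCheck {
    // Parquet files start and end with the 4-byte ASCII magic "PAR1".
    static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);

    static boolean startsWithMagic(byte[] body) {
        return body.length >= MAGIC.length
                && Arrays.equals(body, 0, MAGIC.length, MAGIC, 0, MAGIC.length);
    }

    static boolean endsWithMagic(byte[] body) {
        return body.length >= MAGIC.length
                && Arrays.equals(body, body.length - MAGIC.length, body.length, MAGIC, 0, MAGIC.length);
    }

    public static void main(String[] args) {
        // Illustrative payloads only; a real response would come from the export endpoint.
        byte[] jsonError = "{\"error\":\"empty table\"}".getBytes(StandardCharsets.US_ASCII);
        byte[] parquetLike = "PAR1<footer bytes>PAR1".getBytes(StandardCharsets.US_ASCII);
        System.out.println("json error looks like parquet: " + startsWithMagic(jsonError));
        System.out.println("parquet-like starts with magic: " + startsWithMagic(parquetLike));
        System.out.println("parquet-like ends with magic: " + endsWithMagic(parquetLike));
    }
}
```

A schema-only file still carries both magic markers, so this check passes for an empty-table export as long as the writer emits a well-formed footer.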

Sequence Diagram(s)

sequenceDiagram
    participant Exporter as SerialParquetExporter
    participant PE as PartitionEncoder
    participant TR as TableReader
    participant Desc as PartitionDescriptor
    
    rect rgb(230, 245, 230)
    Note over Exporter: New Flow: Per-Partition Empty Handling
    
    Exporter->>Exporter: For each partition
    Exporter->>Exporter: Check if empty (openPartition <= 0)
    
    alt Partition Empty
        Exporter->>PE: populateEmptyPartition()
        PE->>TR: Get table metadata
        PE->>Desc: Set table name, timestamp index
        PE->>Desc: Populate metadata columns (non-zero types)
        PE-->>Exporter: Partition descriptor populated
    else Partition Not Empty
        Exporter->>PE: populateFromTableReader()
        PE-->>Exporter: Partition descriptor populated
    end
    
    Exporter->>Exporter: Mark CONVERTING_PARTITIONS → FINISHED
    Exporter->>Exporter: If no export result, MOVE_FILES → FINISHED
    end
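
The flow in the diagram can be sketched in Java roughly as follows. This is an illustration under stated assumptions, not the code in `SerialParquetExporter`: the `Encoder` interface, the `Phase` enum, and the `export` signature are hypothetical stand-ins, and the per-partition row counts stand in for the value the exporter compares against zero (`openPartition <= 0`).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the per-partition dispatch shown in the diagram.
final class EmptyPartitionFlowSketch {
    enum Phase { CONVERTING_PARTITIONS, MOVE_FILES }

    // Stand-in for the two PartitionEncoder entry points named in the walkthrough.
    interface Encoder {
        void populateEmptyPartition(int partitionIndex);
        void populateFromTableReader(int partitionIndex);
    }

    // Returns the phases marked as finished, in order.
    static List<Phase> export(long[] partitionRowCounts, Encoder encoder, boolean hasExportResult) {
        for (int i = 0; i < partitionRowCounts.length; i++) {
            if (partitionRowCounts[i] <= 0) {
                // Empty partition: describe the schema only instead of failing.
                encoder.populateEmptyPartition(i);
            } else {
                encoder.populateFromTableReader(i);
            }
        }
        List<Phase> finished = new ArrayList<>();
        // The conversion phase is always marked finished before any file move.
        finished.add(Phase.CONVERTING_PARTITIONS);
        if (!hasExportResult) {
            // Assumption: mirrors the diagram's "If no export result, MOVE_FILES -> FINISHED".
            finished.add(Phase.MOVE_FILES);
        }
        return finished;
    }
}
```

The point of the sketch is that the emptiness decision is taken per partition inside the loop, so a table with a mix of empty and populated partitions goes through both branches in a single export.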

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • SerialParquetExporter.java: Logic flow restructuring from table-level to per-partition empty detection requires careful verification that the new status update sequence maintains correctness across all code paths (normal export, empty partitions, error handling).
  • PartitionEncoder.java: New populateEmptyPartition() method needs review to confirm it correctly initializes schema for empty partitions with all supported column types and proper default values.
  • Test updates: Verify that test expectations align with expected Parquet binary format and that the new test cases adequately cover empty table scenarios.
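
To make the `populateEmptyPartition()` review point concrete, here is a minimal sketch of what initializing a descriptor for an empty partition could look like. All types here (`ColumnMeta`, `ColumnEntry`, `Descriptor`) are hypothetical simplifications rather than QuestDB's `TableReader` or `PartitionDescriptor`, and any filtering of column slots is omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of describing an empty partition:
// the schema is kept, the row count is zero, and data parameters are zeroed.
final class EmptyDescriptorSketch {
    record ColumnMeta(String name, int type) {}

    // dataAddr and dataSize are illustrative stand-ins for per-column buffer parameters.
    record ColumnEntry(String name, int type, long dataAddr, long dataSize) {}

    static final class Descriptor {
        String tableName;
        long rowCount;
        int timestampIndex;
        final List<ColumnEntry> columns = new ArrayList<>();
    }

    static Descriptor populateEmpty(String tableName, int timestampIndex, List<ColumnMeta> metadata) {
        Descriptor d = new Descriptor();
        d.tableName = tableName;
        d.timestampIndex = timestampIndex;
        d.rowCount = 0; // no rows, schema only
        for (ColumnMeta c : metadata) {
            // Every column keeps its name and type; buffer parameters default to zero.
            d.columns.add(new ColumnEntry(c.name(), c.type(), 0L, 0L));
        }
        return d;
    }
}
```

A reviewer checking the real method would want the equivalent of one entry per exported column type, which is what the `all_types_empty` test table exercises.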

Suggested reviewers

  • RaphDal

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is below the required threshold of 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'fix(sql): export parquet support empty table/partition' accurately describes the main objective of the PR: enabling Parquet export for empty tables and partitions. |
| Linked Issues check | ✅ Passed | The PR implements the requirements from issue #6318 by adding support for empty Parquet export: it relaxes row count validation in the Rust schema code, introduces per-partition empty handling in the Java exporter with the new `PartitionEncoder.populateEmptyPartition` method, and updates tests to verify that empty table/partition export works correctly. |
| Out of Scope Changes check | ✅ Passed | All code changes are directly related to supporting empty Parquet export. The import relocation in the test and the test enhancements are both in scope for validating the empty table export functionality. |
| Description check | ✅ Passed | The pull request description is directly related to the changeset, explaining that it fixes Parquet export to handle empty tables and partitions, which aligns with the file changes that remove validation restrictions and add empty partition handling. |


bluestreak01 (Member) commented

@CodeRabbit review

coderabbitai bot commented Nov 21, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

glasstiger (Contributor) commented

[PR Coverage check]

😍 pass : 23 / 23 (100.00%)

file detail

| path | covered line | new line | coverage |
|---|---|---|---|
| 🔵 io/questdb/griffin/engine/table/parquet/PartitionEncoder.java | 10 | 10 | 100.00% |
| 🔵 io/questdb/cutlass/parquet/SerialParquetExporter.java | 13 | 13 | 100.00% |


Development

Successfully merging this pull request may close these issues.

Parquet export fails ungracefully on empty source table
