fix(parquet): fix crash when parquet partition statistics are missing #6827
Conversation
Walkthrough

Replaced row-group statistics buffer usage with on-demand, file-backed timestamp decoding. Added new decoder helpers and JNI/native methods for row-group min/max timestamps, updated `findRowGroupByTimestamp` to accept a file pointer/size, and removed `RowGroupStatBuffers` from Java paths and tests.
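As a conceptual illustration of the lookup the walkthrough describes, here is a minimal sketch with assumed names and simplified semantics; the real logic lives in `decode.rs` and works against file-backed Parquet metadata, not an in-memory slice.

```rust
// Sketch only: find the first row group whose [min, max] timestamp
// bounds contain the probe timestamp. In the real code the bounds come
// either from Parquet column statistics or, when those are absent, from
// decoding the boundary rows of the chunk.
fn find_row_group_by_timestamp(bounds: &[(i64, i64)], ts: i64) -> Option<usize> {
    bounds.iter().position(|&(min, max)| ts >= min && ts <= max)
}

fn main() {
    let bounds = [(0, 9), (10, 19), (20, 29)];
    assert_eq!(find_row_group_by_timestamp(&bounds, 12), Some(1));
    assert_eq!(find_row_group_by_timestamp(&bounds, 99), None);
    println!("ok");
}
```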
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 4
🧹 Nitpick comments (2)
core/src/test/java/io/questdb/test/cairo/fuzz/O3ParquetPartitionFuzzTest.java (1)
261-264: Stabilize the parquet-statistics toggle per test instance.

At Line 263, using `rnd.nextBoolean()` directly can flip configuration between calls in the same run. Sample once and return that fixed value so a single run tests one coherent mode.

💡 Suggested patch
```diff
 public class O3ParquetPartitionFuzzTest extends AbstractO3Test {
+    private final boolean parquetStatisticsEnabled = rnd.nextBoolean();
+
@@
     @Override
     protected boolean isParquetStatisticsEnabled() {
-        return rnd.nextBoolean();
+        return parquetStatisticsEnabled;
     }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/cairo/fuzz/O3ParquetPartitionFuzzTest.java` around lines 261 - 264: the isParquetStatisticsEnabled() method currently returns rnd.nextBoolean() each call, causing the parquet-statistics mode to flip during a test run; instead, sample rnd.nextBoolean() once per test instance into a private final boolean field (e.g., parquetStatisticsEnabled) initialized in the test constructor or a `@Before` method and have isParquetStatisticsEnabled() return that field so the mode remains stable for the entire test instance; reference the isParquetStatisticsEnabled() method and the rnd RNG when making this change.

core/rust/qdbr/src/parquet_read/decode.rs (1)
802-811: Missing-stats path decodes bounds twice per row group

Line 809 and Line 835 can each trigger a separate `decode_single_timestamp` call when both stats are absent. On stats-disabled files, this doubles setup/page traversal work per row group.

Consider a single helper that returns `(min, max)` in one path (or reuses one `DecodeContext`/buffers across both probes) to reduce fallback overhead.
Based on learnings: provide line-cited, push-back style reviews with concrete follow-ups (tests, diffs, and performance considerations).

Also applies to: 828-844
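A minimal sketch of the single-pass idea, under a hypothetical helper name and operating on an already-decoded chunk rather than the real `DecodeContext` machinery: since the designated timestamp column is sorted within a row group (the code decodes the first row for min and the last for max), one decode can yield both bounds.

```rust
// Sketch only: with the timestamp column sorted inside a row group,
// the first and last values of one decoded chunk give (min, max).
fn row_group_timestamp_bounds(decoded: &[i64]) -> Result<(i64, i64), String> {
    match (decoded.first(), decoded.last()) {
        (Some(&min), Some(&max)) => Ok((min, max)),
        _ => Err("row group has zero rows for timestamp column".to_string()),
    }
}

fn main() {
    assert_eq!(row_group_timestamp_bounds(&[10, 20, 30]), Ok((10, 30)));
    assert!(row_group_timestamp_bounds(&[]).is_err()); // empty group is a controlled error
    println!("ok");
}
```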
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/rust/qdbr/src/parquet_read/decode.rs` around lines 802 - 811, The current fallback path calls decode_single_timestamp twice per row group when stats are missing (see row_group_timestamp_stat::<false>, validate_timestamp_sorting_key, and decode_single_timestamp), doubling decoding/setup cost; fix by creating a single helper (e.g., decode_row_group_timestamp_bounds or reuse a single DecodeContext/buffers) that returns both min and max in one traversal and call validate_timestamp_sorting_key before that helper as needed; update the branches that currently call decode_single_timestamp at lines around the min and max probes to use this helper/reused context, and add a unit/perf test that exercises stats-disabled files to ensure only one decode occurs per row group.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/rust/qdbr/src/parquet_read/decode.rs`:
- Around line 835-842: The timestamp decode paths perform size - 1 without
guarding against zero, causing usize underflow; before calling
decode_single_timestamp ensure column_chunk_size != 0 and handle the zero case
(e.g., return an Err with context) instead of passing column_chunk_size - 1;
similarly, in the code paths around row_group_size use an explicit check if
row_group_size == 0 then return Err(InvalidLayout, "row group {} has zero rows
for timestamp column {}", ...) in the function that handles row group layout,
and at the loop near the nested loops add if row_group_size == 0 { continue; }
to skip processing empty groups; update the functions referenced
(decode_single_timestamp call site, the function validating row_group_size, and
the loop around the nested timestamp decode logic) and add a regression test
that feeds an empty row group with stats disabled and asserts a controlled error
(no panic/underflow).
- Around line 924-951: decode_single_timestamp currently casts
ColumnChunkBuffers.data_vec (u8) to *const i64 and unsafely dereferences it,
risking empty-buffer deref and alignment UB; also code that computes
column_chunk_size - 1 can underflow when num_values == 0. Fix by making
decode_single_timestamp perform safe bounds and alignment-checked extraction:
verify bufs.data_vec.len() >= size_of::<i64>(), read the first 8 bytes with a
safe conversion (e.g., copy into an [u8;8] and use i64::from_le_bytes or
equivalent) rather than casting to *const i64, and return a descriptive
ParquetResult error if buffer is empty or too short. Also audit the code that
computes column_chunk_size - 1 (the place using column_chunk_size) to guard
against num_values == 0 and handle that branch explicitly (return empty/zero
result or an error) to avoid underflow. Add regression tests exercising
decode_single_timestamp / decode_column_chunk for (1) missing stats with empty
fallback decode and (2) column_chunk_size == 0 to ensure the new checks and
error paths behave as expected.
In `@core/src/main/java/io/questdb/cairo/ParquetTimestampFinder.java`:
- Around line 144-146: minTimestampExact() currently calls
partitionDecoder.rowGroupMinTimestamp(0, timestampIdAndType.get(0)) without
checking for empty metadata; add a guard like the existing max-path precheck to
detect zero row groups (or empty metadata) and throw a clear CairoException if
none exist, otherwise proceed to call partitionDecoder.rowGroupMinTimestamp(0,
timestampIdAndType.get(0)); ensure you reference the same metadata/row-group
count check used elsewhere so the behavior is symmetric with the max-timestamp
path.
In `@core/src/main/java/io/questdb/cairo/TableWriter.java`:
- Line 8879: The code calls parquetDecoder.rowGroupMinTimestamp(0,
timestampIndex) (and similarly rowGroupMaxTimestamp) without checking
rowGroupCount, risking invalid native reads on empty/corrupt metadata; update
the logic in TableWriter to first check rowGroupCount == 0 and throw a
CairoException with a clear message when zero, and only then call
rowGroupMinTimestamp/rowGroupMaxTimestamp with validated indices, and add a
regression unit test that constructs/loads a parquet file with zero row groups
and asserts the CairoException is thrown.
---
Nitpick comments:
In `@core/rust/qdbr/src/parquet_read/decode.rs`:
- Around line 802-811: The current fallback path calls decode_single_timestamp
twice per row group when stats are missing (see
row_group_timestamp_stat::<false>, validate_timestamp_sorting_key, and
decode_single_timestamp), doubling decoding/setup cost; fix by creating a single
helper (e.g., decode_row_group_timestamp_bounds or reuse a single
DecodeContext/buffers) that returns both min and max in one traversal and call
validate_timestamp_sorting_key before that helper as needed; update the branches
that currently call decode_single_timestamp at lines around the min and max
probes to use this helper/reused context, and add a unit/perf test that
exercises stats-disabled files to ensure only one decode occurs per row group.
In
`@core/src/test/java/io/questdb/test/cairo/fuzz/O3ParquetPartitionFuzzTest.java`:
- Around line 261-264: The isParquetStatisticsEnabled() method currently returns
rnd.nextBoolean() each call causing the parquet-statistics mode to flip during a
test run; instead, sample rnd.nextBoolean() once per test instance into a
private final boolean field (e.g., parquetStatisticsEnabled) initialized in the
test constructor or a `@Before` method and have isParquetStatisticsEnabled()
return that field so the mode remains stable for the entire test instance;
reference the isParquetStatisticsEnabled() method and the rnd RNG when making
this change.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (5)
- `core/src/main/resources/io/questdb/bin/darwin-aarch64/libquestdbr.dylib` is excluded by `!**/*.dylib`
- `core/src/main/resources/io/questdb/bin/darwin-x86-64/libquestdbr.dylib` is excluded by `!**/*.dylib`
- `core/src/main/resources/io/questdb/bin/linux-aarch64/libquestdbr.so` is excluded by `!**/*.so`
- `core/src/main/resources/io/questdb/bin/linux-x86-64/libquestdbr.so` is excluded by `!**/*.so`
- `core/src/main/resources/io/questdb/bin/windows-x86-64/questdbr.dll` is excluded by `!**/*.dll`
📒 Files selected for processing (11)
- core/rust/qdbr/src/parquet_read/decode.rs
- core/rust/qdbr/src/parquet_read/jni.rs
- core/rust/qdbr/src/parquet_read/slicer/mod.rs
- core/rust/qdbr/src/parquet_write/decimal.rs
- core/src/main/java/io/questdb/cairo/O3PartitionJob.java
- core/src/main/java/io/questdb/cairo/ParquetTimestampFinder.java
- core/src/main/java/io/questdb/cairo/TableWriter.java
- core/src/main/java/io/questdb/griffin/engine/table/parquet/PartitionDecoder.java
- core/src/test/java/io/questdb/test/cairo/TimestampFinderTest.java
- core/src/test/java/io/questdb/test/cairo/fuzz/O3ParquetPartitionFuzzTest.java
- core/src/test/java/io/questdb/test/cairo/o3/AbstractO3Test.java
Claude review:

- Bug 1 (Medium): Unsafe read from potentially empty buffer in `decode_single_timestamp`. decode.rs:949-950: after `decode_column_chunk` returns, the code reads from `bufs.data_vec` without verifying it contains at least 8 bytes. If the row group is corrupt or zero-size, `data_vec` could be empty.
- Bug 2 (Low): usize underflow on zero-size row group. decode.rs:1022-1030 in `row_group_max_timestamp`, and similarly in `find_row_group_by_timestamp` at line 840: if a row group has 0 rows, `row_group_size - 1` wraps to `usize::MAX`. An empty row group is probably invalid per the Parquet spec, but this would cause a spectacular failure rather than a controlled error.
- Issue 3 (Medium): Unnecessary allocations in fallback path. decode.rs:924-951: `decode_single_timestamp` creates a fresh `DecodeContext` and `ColumnChunkBuffers` on every call. In `find_row_group_by_timestamp`, this can be called up to 2x per row group (min + max), for every row group in the file, and each call allocates a decompression buffer internally. Consider hoisting the `DecodeContext` and `ColumnChunkBuffers` out of the loop in `find_row_group_by_timestamp` and passing them into `decode_single_timestamp`.
- Issue 4 (Nit): Redundant metadata access in `find_row_group_by_timestamp`. Lines 786-795 already fetch `columns_meta` and `column_chunk_meta` for the current row group; then `row_group_timestamp_stat` (line 802) re-fetches the same metadata.
- Issue 5 (Nit): Redundant condition check. decode.rs:846: the `timestamp >= min_value` part is always true at this point, because the `timestamp < min_value` case returns on line 823. The check is logically correct but redundant; minor readability issue.
- Test coverage: adequate. Gap: the `TableWriter.readMinTimestampParquet` and `readParquetMinMaxTimestamps` paths are only exercised indirectly. A direct unit test writing a Parquet partition with statistics disabled would close this.
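The first two bugs above can be addressed with standard safe-Rust patterns. A sketch under assumed names, not the actual decode.rs signatures:

```rust
// Sketch only: bounds-checked little-endian i64 read, replacing an
// unchecked `*(data_vec.as_ptr() as *const i64)` dereference (bug 1).
fn read_first_i64_le(data: &[u8]) -> Result<i64, String> {
    let bytes: [u8; 8] = data
        .get(..8)
        .ok_or_else(|| format!("timestamp buffer too short: {} bytes", data.len()))?
        .try_into()
        .unwrap(); // infallible: the slice is exactly 8 bytes long
    Ok(i64::from_le_bytes(bytes))
}

// Sketch only: checked_sub turns the `size - 1` underflow (bug 2) into
// a controlled error for zero-row groups instead of wrapping to usize::MAX.
fn last_row_index(row_group_size: usize) -> Result<usize, String> {
    row_group_size
        .checked_sub(1)
        .ok_or_else(|| "row group has zero rows".to_string())
}

fn main() {
    assert_eq!(read_first_i64_le(&42i64.to_le_bytes()), Ok(42));
    assert!(read_first_i64_le(&[1, 2, 3]).is_err());
    assert_eq!(last_row_index(5), Ok(4));
    assert!(last_row_index(0).is_err());
    println!("ok");
}
```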
@coderabbitai review

✅ Actions performed: Review triggered.
♻️ Duplicate comments (1)
core/src/main/java/io/questdb/cairo/ParquetTimestampFinder.java (1)
133-136: ⚠️ Potential issue | 🟡 Minor

Replace `assert` with an explicit runtime guard for empty row-group metadata.

`assert rowGroupCount > 0` is not a production-safe precondition. With assertions off, `maxTimestampExact()` can call `rowGroupMaxTimestamp(rowGroupCount - 1, ...)` with `-1`. Please keep the assert if you want, but add a real guard and throw `CairoException` in both methods.

🛠️ Proposed fix
```diff
 @Override
 public long maxTimestampExact() {
     final int rowGroupCount = partitionDecoder.metadata().getRowGroupCount();
-    assert rowGroupCount > 0;
+    if (rowGroupCount < 1) {
+        throw CairoException.critical(0)
+                .put("parquet partition has no row groups [table=").put(tableToken)
+                .put(", partitionIndex=").put(partitionIndex)
+                .put(']');
+    }
+    assert rowGroupCount > 0;
     return partitionDecoder.rowGroupMaxTimestamp(rowGroupCount - 1, timestampIdAndType.get(0));
 }
@@
 @Override
 public long minTimestampExact() {
     final int rowGroupCount = partitionDecoder.metadata().getRowGroupCount();
-    assert rowGroupCount > 0;
+    if (rowGroupCount < 1) {
+        throw CairoException.critical(0)
+                .put("parquet partition has no row groups [table=").put(tableToken)
+                .put(", partitionIndex=").put(partitionIndex)
+                .put(']');
+    }
+    assert rowGroupCount > 0;
     return partitionDecoder.rowGroupMinTimestamp(0, timestampIdAndType.get(0));
 }
```

Run this to verify where assert-only guards exist and whether `-ea` is configured in repo-managed runtime configs:

```bash
#!/bin/bash
rg -nP 'maxTimestampExact|minTimestampExact|assert\s+rowGroupCount\s*>\s*0|rowGroup(Max|Min)Timestamp' core/src/main/java/io/questdb/cairo/ParquetTimestampFinder.java -C2
rg -nP '(^|\s)-ea(\s|$)|enableassertions' -g '*.sh' -g '*.service' -g '*.md' -g '*.xml' -g '*.yml' -g '*.yaml' .
```

Also applies to: 145-147
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/cairo/ParquetTimestampFinder.java` around lines 133 - 136, Replace the assert with an explicit runtime check in both maxTimestampExact() and minTimestampExact(): read rowGroupCount = partitionDecoder.metadata().getRowGroupCount(), and if rowGroupCount <= 0 throw a new CairoException with a clear message (e.g. "no row groups in partition metadata") instead of relying on assert, then call partitionDecoder.rowGroupMaxTimestamp(rowGroupCount - 1, timestampIdAndType.get(0)) / partitionDecoder.rowGroupMinTimestamp(rowGroupCount - 1, timestampIdAndType.get(0)) as before; keep or remove the existing assert as you prefer but ensure the CairoException guard exists so production runs cannot pass a -1 index to rowGroupMaxTimestamp/rowGroupMinTimestamp.
🧹 Nitpick comments (2)
core/rust/qdbr/src/parquet_read/decode.rs (2)
979-1014: Add focused regression tests for direct `row_group_min_timestamp`/`row_group_max_timestamp` fallback behavior.

Please add a unit test that disables stats and validates: (1) the first/last-row decode path returns expected values, and (2) zero-row-group/zero-row error paths return controlled `InvalidLayout` errors. This will lock the new public API semantics independent of fuzz coverage.

Based on learnings: in questdb/questdb PR reviews, include concrete test follow-ups.

Also applies to: 1047-1065
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/rust/qdbr/src/parquet_read/decode.rs` around lines 979 - 1014, Add focused unit tests for the public fallback behavior of row_group_min_timestamp and row_group_max_timestamp: disable statistics (so row_group_timestamp_stat returns None) and assert that the first/last-row decode path via validate_timestamp_sorting_key + decode_single_timestamp returns expected timestamps, and separately assert that requesting min/max on an empty row group or a file with zero rows produces a controlled InvalidLayout error; target tests to cover the functions row_group_min_timestamp and row_group_max_timestamp and the fallback code paths (row_group_timestamp_stat::<false>, validate_timestamp_sorting_key, decode_single_timestamp) so the API semantics are locked in independent of fuzzing.
805-818: Reuse decode scratch objects in the fallback scan path to avoid repeated allocations.

When statistics are disabled, `find_row_group_by_timestamp` may probe min/max per row group, and each probe currently allocates a fresh `DecodeContext` + `ColumnChunkBuffers` via `decode_single_timestamp`. Hoisting these out of the loop will cut allocator churn on hot read paths.

♻️ Proposed refactor

```diff
-    fn decode_single_timestamp(
-        &self,
-        file_ptr: *const u8,
-        file_size: u64,
+    fn decode_single_timestamp(
+        &self,
+        ctx: &mut DecodeContext,
+        bufs: &mut ColumnChunkBuffers,
         row_group_index: usize,
         timestamp_column_index: usize,
         row_lo: usize,
         row_hi: usize,
     ) -> ParquetResult<i64> {
-        let mut ctx = DecodeContext::new(file_ptr, file_size);
-        let mut bufs = ColumnChunkBuffers::new(self.allocator.clone());
         let col_info = QdbMetaCol {
             column_type: ColumnType::new(ColumnTypeTag::Timestamp, 0),
             column_top: 0,
             format: None,
         };
         self.decode_column_chunk(
-            &mut ctx,
-            &mut bufs,
+            ctx,
+            bufs,
             row_group_index,
             row_lo,
             row_hi,
             timestamp_column_index,
             col_info,
         )?;
```

```diff
 pub fn find_row_group_by_timestamp(...) -> ParquetResult<u64> {
+    let mut decode_ctx = DecodeContext::new(file_ptr, file_size);
+    let mut decode_bufs = ColumnChunkBuffers::new(self.allocator.clone());
     ...
-    self.decode_single_timestamp(file_ptr, file_size, row_group_idx, ts, 0, 1)?
+    self.decode_single_timestamp(&mut decode_ctx, &mut decode_bufs, row_group_idx, ts, 0, 1)?
     ...
-    self.decode_single_timestamp(file_ptr, file_size, row_group_idx, ts, column_chunk_size - 1, column_chunk_size)?
+    self.decode_single_timestamp(&mut decode_ctx, &mut decode_bufs, row_group_idx, ts, column_chunk_size - 1, column_chunk_size)?
```

Based on learnings: in questdb/questdb PR reviews, include concrete performance follow-ups.
Also applies to: 835-855, 952-953
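The hoisting pattern proposed above, reduced to a minimal self-contained sketch; the types and names here are stand-ins, not the real `DecodeContext`/`ColumnChunkBuffers`:

```rust
// Stand-in for DecodeContext + ColumnChunkBuffers: a reusable scratch buffer.
struct Scratch {
    buf: Vec<i64>,
}

// Sketch only: decode into caller-provided scratch instead of allocating.
fn decode_first(scratch: &mut Scratch, row_group: &[i64]) -> Option<i64> {
    scratch.buf.clear(); // keeps capacity; no reallocation after warm-up
    scratch.buf.extend_from_slice(row_group);
    scratch.buf.first().copied()
}

// One allocation before the probe loop, reused for every row group.
fn probe_all(row_groups: &[Vec<i64>]) -> Vec<Option<i64>> {
    let mut scratch = Scratch { buf: Vec::new() };
    row_groups
        .iter()
        .map(|rg| decode_first(&mut scratch, rg))
        .collect()
}

fn main() {
    let groups = vec![vec![1, 2], vec![], vec![3]];
    assert_eq!(probe_all(&groups), vec![Some(1), None, Some(3)]);
    println!("ok");
}
```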
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/rust/qdbr/src/parquet_read/decode.rs` around lines 805 - 818, The fallback path in find_row_group_by_timestamp repeatedly allocates DecodeContext and ColumnChunkBuffers by calling decode_single_timestamp for each probe; hoist reusable scratch objects out of the loop and reuse them to avoid allocator churn. Concretely, allocate a single DecodeContext and ColumnChunkBuffers before the row-group probe loop (or add an overload like decode_single_timestamp_with_scratch that accepts &mut DecodeContext and &mut ColumnChunkBuffers), then call that variant from find_row_group_by_timestamp’s fallback branch (the code that currently calls decode_single_timestamp and validate_timestamp_sorting_key) so each probe reuses the same buffers; apply the same hoist/reuse change for the other probe locations referenced around the blocks at the ranges corresponding to 835-855 and 952-953.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@core/src/main/java/io/questdb/cairo/ParquetTimestampFinder.java`:
- Around line 133-136: Replace the assert with an explicit runtime check in both
maxTimestampExact() and minTimestampExact(): read rowGroupCount =
partitionDecoder.metadata().getRowGroupCount(), and if rowGroupCount <= 0 throw
a new CairoException with a clear message (e.g. "no row groups in partition
metadata") instead of relying on assert, then call
partitionDecoder.rowGroupMaxTimestamp(rowGroupCount - 1,
timestampIdAndType.get(0)) / partitionDecoder.rowGroupMinTimestamp(rowGroupCount
- 1, timestampIdAndType.get(0)) as before; keep or remove the existing assert as
you prefer but ensure the CairoException guard exists so production runs cannot
pass a -1 index to rowGroupMaxTimestamp/rowGroupMinTimestamp.
---
Nitpick comments:
In `@core/rust/qdbr/src/parquet_read/decode.rs`:
- Around line 979-1014: Add focused unit tests for the public fallback behavior
of row_group_min_timestamp and row_group_max_timestamp: disable statistics (so
row_group_timestamp_stat returns None) and assert that the first/last-row decode
path via validate_timestamp_sorting_key + decode_single_timestamp returns
expected timestamps, and separately assert that requesting min/max on an empty
row group or a file with zero rows produces a controlled InvalidLayout error;
target tests to cover the functions row_group_min_timestamp and
row_group_max_timestamp and the fallback code paths
(row_group_timestamp_stat::<false>, validate_timestamp_sorting_key,
decode_single_timestamp) so the API semantics are locked in independent of
fuzzing.
- Around line 805-818: The fallback path in find_row_group_by_timestamp
repeatedly allocates DecodeContext and ColumnChunkBuffers by calling
decode_single_timestamp for each probe; hoist reusable scratch objects out of
the loop and reuse them to avoid allocator churn. Concretely, allocate a single
DecodeContext and ColumnChunkBuffers before the row-group probe loop (or add an
overload like decode_single_timestamp_with_scratch that accepts &mut
DecodeContext and &mut ColumnChunkBuffers), then call that variant from
find_row_group_by_timestamp’s fallback branch (the code that currently calls
decode_single_timestamp and validate_timestamp_sorting_key) so each probe reuses
the same buffers; apply the same hoist/reuse change for the other probe
locations referenced around the blocks at the ranges corresponding to 835-855
and 952-953.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (5)
- `core/src/main/resources/io/questdb/bin/darwin-aarch64/libquestdbr.dylib` is excluded by `!**/*.dylib`
- `core/src/main/resources/io/questdb/bin/darwin-x86-64/libquestdbr.dylib` is excluded by `!**/*.dylib`
- `core/src/main/resources/io/questdb/bin/linux-aarch64/libquestdbr.so` is excluded by `!**/*.so`
- `core/src/main/resources/io/questdb/bin/linux-x86-64/libquestdbr.so` is excluded by `!**/*.so`
- `core/src/main/resources/io/questdb/bin/windows-x86-64/questdbr.dll` is excluded by `!**/*.dll`
📒 Files selected for processing (3)
- core/rust/qdbr/src/parquet_read/decode.rs
- core/src/main/java/io/questdb/cairo/ParquetTimestampFinder.java
- core/src/main/java/io/questdb/cairo/TableWriter.java
🚧 Files skipped from review as they are similar to previous changes (1)
- core/src/main/java/io/questdb/cairo/TableWriter.java
[PR Coverage check] 😍 pass: 217 / 293 (74.06%)
The `cairo.partition.encoder.parquet.statistics.enabled` configuration allows users to disable Parquet statistics, but the read path (`ParquetTimestampFinder`, `TableWriter`) and the O3 merge path (`O3PartitionJob.processParquetPartition`) hard-depend on timestamp column statistics. When statistics are absent, `getMinValueLong` hits `assert size == Long.BYTES` (crash with `-ea`) or reads garbage memory (silent data corruption without `-ea`). This PR removes that hard dependency.

- `PartitionDecoder` adds `rowGroupMinTimestamp`/`rowGroupMaxTimestamp` methods that try Parquet column statistics first (zero cost), then fall back to decoding the first/last row from actual data pages when statistics are absent.
- `findRowGroupByTimestamp` also falls back to decoding when statistics are missing, instead of reading garbage memory.
- `O3PartitionJob`, `ParquetTimestampFinder`, and `TableWriter` migrate from `readRowGroupStats` + `getMinValueLong`/`getMaxValueLong` to the new methods.
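The statistics-first fallback contract described above can be sketched as follows. This is a hypothetical shape, not the actual `PartitionDecoder`/JNI signatures: try the statistics, and only pay the page-decode cost when they are absent.

```rust
// Sketch only: statistics-first min-timestamp lookup with a decode fallback.
fn row_group_min_timestamp<F>(stat_min: Option<i64>, decode_first_row: F) -> Result<i64, String>
where
    F: FnOnce() -> Result<i64, String>,
{
    match stat_min {
        Some(min) => Ok(min),       // zero cost: statistics were written
        None => decode_first_row(), // stats disabled: decode row 0 of the chunk
    }
}

fn main() {
    // Statistics present: the fallback closure is never invoked.
    assert_eq!(row_group_min_timestamp(Some(5), || panic!("not called")), Ok(5));
    // Statistics absent: the decode fallback supplies the value.
    assert_eq!(row_group_min_timestamp(None, || Ok(7)), Ok(7));
    println!("ok");
}
```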