feat(sql): add column projection pushdown for read_parquet() #6551

bluestreak01 merged 29 commits into master from …

Conversation
Important: Review skipped. Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI. You can disable this status message by setting the …

Walkthrough: This PR introduces column projection support for Parquet reads by creating a new …

Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
read_parquet support projection push down
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/src/main/java/io/questdb/cairo/sql/PageFrameMemoryPool.java (1)
261-278: Add validation to ensure projected column indexes are within parquet column count.

At line 276 in PageFrameMemoryPool.java, the code uses `parquetColumnIndex` to index into `fromParquetColumnIndexes` (sized to `parquetMetadata.getColumnCount()`). The current validation at line 261 only checks column count, not individual index bounds. With projection pushdown, selected column indexes can be non-consecutive. Add bounds checking:

```diff
 fromParquetColumnIndexes.setAll(parquetMetadata.getColumnCount(), -1);
 for (int i = 0, n = addressCache.getColumnCount(); i < n; i++) {
     final int parquetColumnIndex = addressCache.getColumnIndexes().getQuick(i);
+    if (parquetColumnIndex < 0 || parquetColumnIndex >= parquetMetadata.getColumnCount()) {
+        throw CairoException.nonCritical()
+                .put("parquet column index out of range [index=")
+                .put(parquetColumnIndex)
+                .put(", parquetColumnCount=")
+                .put(parquetMetadata.getColumnCount())
+                .put(']');
+    }
     final int columnType = addressCache.getColumnTypes().getQuick(i);
     parquetColumns.add(parquetColumnIndex);
     fromParquetColumnIndexes.setQuick(parquetColumnIndex, i);
     parquetColumns.add(columnType);
 }
```
🧹 Nitpick comments (6)
core/src/main/java/io/questdb/cairo/ProjectableRecordCursorFactory.java (3)
34-36: Consider adding null validation for the metadata parameter.

The constructor accepts `metadata` without validation. Adding a null check would prevent potential NPEs and make the contract clearer. 🔎 Apply this diff to add null validation:

```diff
 public ProjectableRecordCursorFactory(RecordMetadata metadata) {
+    if (metadata == null) {
+        throw new IllegalArgumentException("metadata cannot be null");
+    }
     this.metadata = metadata;
 }
```
30-58: Add javadoc for this public API class.

This abstract class serves as a public API but lacks documentation explaining:
- Its purpose (enabling column projection for cursor factories)
- When to extend it vs. implementing `RecordCursorFactory` directly
- The lifecycle and usage contract of `setQueryProjectedMetadata()`
- Threading and concurrency guarantees

Adding comprehensive javadoc would significantly improve maintainability and help future contributors understand the projection mechanism.
43-53: Document the lifecycle and contract of `setQueryProjectedMetadata()`.

The method has no documentation and accepts null without validation. It can also be called multiple times, silently overwriting previous values. While the field is only written once per factory instance (in `SqlCodeGenerator` during planning), the lack of documentation makes the intended usage unclear:
- Should this be called exactly once, or multiple times?
- When should it be called relative to `getCursor()`?
- Is null a valid value, or should it be rejected?

Consider adding javadoc explaining the expected lifecycle and, if appropriate, adding validation (e.g., rejecting null or preventing multiple calls).
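The validation the comment suggests can be sketched as a guarded, one-shot setter. This is a hypothetical illustration with made-up names, not the QuestDB API: `RecordMetadata` is stubbed as `Object`, and the guard rejects null and refuses silent overwrites.

```java
// Hypothetical sketch of a guarded, one-shot projection setter: rejects
// null and refuses to be called twice. Names are illustrative only.
public class ProjectedMetadataHolder {
    private Object queryProjectedMetadata; // stands in for RecordMetadata

    public void setQueryProjectedMetadata(Object metadata) {
        if (metadata == null) {
            throw new IllegalArgumentException("projected metadata cannot be null");
        }
        if (this.queryProjectedMetadata != null) {
            throw new IllegalStateException("projected metadata already set");
        }
        this.queryProjectedMetadata = metadata;
    }

    public Object getMetadata() {
        return queryProjectedMetadata;
    }

    public static void main(String[] args) {
        ProjectedMetadataHolder holder = new ProjectedMetadataHolder();
        holder.setQueryProjectedMetadata("projected-columns");
        System.out.println(holder.getMetadata());
        try {
            holder.setQueryProjectedMetadata("again"); // second call must fail
        } catch (IllegalStateException e) {
            System.out.println("second call rejected");
        }
    }
}
```

Whether a one-shot guard is appropriate depends on the answer to the lifecycle questions above; if re-planning legitimately re-projects the same factory, the overwrite check would be wrong.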
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (2)
1196-1221: Verify that projection covers all columns used in filters/ordering for function tables.

In `generateFunctionQuery`, when the table function factory is a `ProjectableRecordCursorFactory`, you now:
- Build `queryMeta` via `buildQueryMetadata(...)` using `model.getTopDownColumns()`.
- Call `factory.setQueryProjectedMetadata(queryMeta)` so subsequent `getMetadata()` calls see only projected columns.

This assumes that `model.getTopDownColumns()` always contains every column referenced anywhere in the query against this function (SELECT list, WHERE, GROUP BY, ORDER BY, JOIN conditions, etc.). If any column appears only in predicates or ordering and is not present in `topDownColumns`, then after setting the projected metadata:
- Later compilation using `factory.getMetadata()` (e.g., in `generateFilter`, group-by, order-by) could fail with "invalid column" or mis-plan the query, since those columns would no longer exist in the metadata view.
- The underlying function implementation might also not project all columns it needs to evaluate such expressions.

Please double-check that `QueryModel.getTopDownColumns()` is guaranteed to include all such referenced columns for cursor functions like `read_parquet()`. If not, you likely need to extend the projection input set here (or reuse an existing "all referenced columns" collection) before pushing it into `ProjectableRecordCursorFactory`.
6274-6291: Refactoring table-query metadata construction into `buildQueryMetadata` looks correct.

The new use of `buildQueryMetadata(...)` in `generateTableQuery0`:
- Replaces the previous inline construction of `queryMeta`, `columnIndexes`, and `columnSizeShifts` with a shared helper.
- Preserves timestamp handling via `readerTimestampIndex = getTimestampIndex(model, metadata)` and `requiresTimestamp = joinsRequiringTimestamp[model.getJoinType()]`.
- Continues to feed the resulting `columnIndexes`/`columnSizeShifts` into all downstream factories (latest-by, filter/on-values, page-frame, etc.) in the same way.

Once the `buildQueryMetadata` clearing issue is fixed as suggested above, this refactor improves cohesion without changing behavior.

core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetPageFrameRecordCursorFactory.java (1)
86-91: Duplicated lazy initialization for `pageFrameCursor`.

The lazy initialization logic for `pageFrameCursor` is duplicated between `getCursor` (lines 70-72) and `getPageFrameCursor` (lines 86-88). This is acceptable since both entry points need the cursor, but consider extracting a helper method if this pattern expands. 🔎 Optional: extract a helper method:

```java
private ReadParquetPageFrameCursor getOrCreatePageFrameCursor(SqlExecutionContext executionContext) {
    if (this.pageFrameCursor == null) {
        this.pageFrameCursor = new ReadParquetPageFrameCursor(
                executionContext.getCairoEngine().getConfiguration().getFilesFacade(),
                getMetadata()
        );
    }
    return this.pageFrameCursor;
}
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- core/src/main/java/io/questdb/cairo/ProjectableRecordCursorFactory.java (1 hunks)
- core/src/main/java/io/questdb/cairo/sql/PageFrameMemoryPool.java (1 hunks)
- core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (5 hunks)
- core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetFunctionFactory.java (1 hunks)
- core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetPageFrameCursor.java (2 hunks)
- core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetPageFrameRecordCursorFactory.java (3 hunks)
- core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetRecordCursor.java (3 hunks)
- core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetRecordCursorFactory.java (2 hunks)
- core/src/main/java/io/questdb/griffin/engine/table/parquet/PartitionDecoder.java (1 hunks)
- core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java (3 hunks)
- core/src/test/java/io/questdb/test/griffin/engine/table/parquet/ReadParquetFunctionTest.java (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
core/src/main/java/io/questdb/griffin/engine/table/parquet/PartitionDecoder.java (1)
- core/src/main/java/io/questdb/std/Chars.java (1): Chars (43-1646)

core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetPageFrameCursor.java (1)
- core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetRecordCursor.java (1): ReadParquetRecordCursor (68-528)

core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetPageFrameRecordCursorFactory.java (3)
- core/src/main/java/io/questdb/cairo/ProjectableRecordCursorFactory.java (1): ProjectableRecordCursorFactory (30-58)
- core/src/main/java/io/questdb/griffin/engine/table/PageFrameRecordCursorImpl.java (1): PageFrameRecordCursorImpl (43-218)
- core/src/main/java/io/questdb/griffin/engine/table/PageFrameRowCursorFactory.java (1): PageFrameRowCursorFactory (34-75)

core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetRecordCursor.java (2)
- core/src/main/java/io/questdb/std/IntList.java (1): IntList (34-410)
- core/src/main/java/io/questdb/cairo/CairoException.java (1): CairoException (39-429)

core/src/test/java/io/questdb/test/griffin/engine/table/parquet/ReadParquetFunctionTest.java (1)
- core/src/main/java/io/questdb/griffin/engine/table/parquet/PartitionEncoder.java (1): PartitionEncoder (39-213)

core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetRecordCursorFactory.java (2)
- core/src/main/java/io/questdb/cairo/ProjectableRecordCursorFactory.java (1): ProjectableRecordCursorFactory (30-58)
- core/src/main/java/io/questdb/std/str/Path.java (1): Path (51-533)

core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (1)
- core/src/main/java/io/questdb/cairo/ProjectableRecordCursorFactory.java (1): ProjectableRecordCursorFactory (30-58)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (34)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-other)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-sub)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-root)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz2)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz1)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-sub)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-root)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (Trigger Enterprise CI Trigger Enterprise Pipeline)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (Hosted Running tests on windows-other-2)
- GitHub Check: New pull request (Hosted Running tests on windows-other-1)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-2)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-1)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-sub)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-base)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x86-graal)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x86-graal)
- GitHub Check: New pull request (Check Changes Check changes)
🔇 Additional comments (17)
core/src/main/java/io/questdb/cairo/sql/PageFrameMemoryPool.java (1)

392-418: LGTM — decode correctly handles [parquet_index, column_type] pairs.

The iteration at line 406 correctly divides by 2 to account for the paired structure, and the remapping logic using `fromParquetColumnIndexes` properly translates parquet column order back to query column order. The conditional aux pointer handling for variable-size types is also correct.

core/src/main/java/io/questdb/cairo/ProjectableRecordCursorFactory.java (1)
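The paired [parquet_index, column_type] layout can be illustrated with a self-contained sketch. This uses plain `java.util` collections and illustrative names, not the QuestDB `IntList` API: two consecutive slots per column, iterated with a stride of 2, as in the decode loop the comment describes.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: (parquetIndex, columnType) pairs stored in one flat
// int list, so the logical column count is size() / 2.
public class PairedColumnList {
    public static int[] parquetIndexes(List<Integer> parquetColumns) {
        int columnCount = parquetColumns.size() / 2; // two slots per column
        int[] indexes = new int[columnCount];
        for (int i = 0; i < columnCount; i++) {
            indexes[i] = parquetColumns.get(2 * i); // slot 0: parquet index
            // slot 2 * i + 1 holds the column type, not needed here
        }
        return indexes;
    }

    public static void main(String[] args) {
        List<Integer> cols = new ArrayList<>();
        // column 0 -> parquet index 3, type 5; column 1 -> parquet index 0, type 11
        cols.add(3); cols.add(5);
        cols.add(0); cols.add(11);
        for (int idx : parquetIndexes(cols)) {
            System.out.println(idx);
        }
    }
}
```

With projection, the parquet indexes can be non-consecutive (3 then 0 above), which is why the reverse mapping back to query column order matters.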
55-57: No resource cleanup needed for `queryProjectMetadata`.

The `_close()` hook doesn't clean up `queryProjectMetadata` or the base `metadata` field because `RecordMetadata` is a lightweight metadata interface without resources requiring explicit cleanup. It holds only metadata information (column names, types, indexing) and is safe to leave unreleased.

core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (1)
42-42: Import of `ProjectableRecordCursorFactory` is appropriate.

The new import is consistent with the use of `ProjectableRecordCursorFactory` in `generateFunctionQuery` and looks correct.

core/src/main/java/io/questdb/griffin/engine/table/parquet/PartitionDecoder.java (1)
289-297: LGTM! Clean reverse-lookup implementation.

The linear search through column names is appropriate for this use case. Column projection typically happens once during query planning, and Parquet column counts are manageable, making O(n) lookup acceptable here.
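The linear name-to-index reverse lookup described above amounts to the following minimal sketch (illustrative names, not the actual `PartitionDecoder` code):

```java
// Illustrative O(n) reverse lookup from a column name to its positional
// index. Acceptable because it runs once per column at planning time,
// not per row at read time.
public class ColumnNameLookup {
    public static int columnIndexOf(String[] columnNames, String name) {
        for (int i = 0; i < columnNames.length; i++) {
            if (columnNames[i].equals(name)) {
                return i;
            }
        }
        return -1; // column not present in the parquet schema
    }

    public static void main(String[] args) {
        String[] schema = {"a_long", "a_str", "ts"};
        System.out.println(columnIndexOf(schema, "a_str"));  // found at index 1
        System.out.println(columnIndexOf(schema, "absent")); // -1: not found
    }
}
```

A hash map would make each lookup O(1), but building it costs O(n) anyway, so for a handful of planning-time lookups the linear scan is the simpler choice.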
core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java (1)
7043-7071: Parquet explain plan expectations correctly reflect projected columns.

The new `columns:` lines under `parquet page frame scan` in the three `read_parquet` cases match the query's actual column usage (all columns for `SELECT *`, only `a_long` for `avg(a_long)`, and `a_str`, `a_long` for `a_str, max(a_long)`). Formatting is consistent with other plan expectations, so these updates look good.

core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetPageFrameCursor.java (1)
135-139: LGTM — clean integration of `canProjectMetadata`.

The switch from `metadataHasChanged` to `canProjectMetadata` centralizes the projection validation logic. Clearing `columnIndexes` before the call and letting `canProjectMetadata` populate it as an output parameter is a sound approach.

core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetRecordCursorFactory.java (2)
51-53: LGTM — lazy initialization pattern is well-implemented.

The lazy cursor creation defers `FilesFacade` acquisition to execution time, correctly retrieving it from the execution context. This aligns with the removal of `FilesFacade` from the constructor signature and is consistent with the parallel factory's approach.
70-73: LGTM — resource cleanup is correct.

`Misc.free` returns `null` on success, properly nulling the fields to prevent double-free scenarios.

core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetFunctionFactory.java (1)
94-97: LGTM — simplified factory construction.

The removal of `CairoConfiguration` and `FilesFacade` parameters from the call sites is consistent with the lazy initialization pattern introduced in the cursor factories. These dependencies are now obtained from `SqlExecutionContext` at cursor creation time.

core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetPageFrameRecordCursorFactory.java (2)
61-72: LGTM — lazy initialization correctly implemented.

The cursor and `pageFrameCursor` are created on-demand using the execution context. The `PageFrameRecordCursorImpl` is correctly configured with `entityCursor=true` (appropriate for full parquet reads) and `filter=null`.
110-114: LGTM — resource cleanup is complete.

All three resources (`cursor`, `pageFrameCursor`, `path`) are properly freed. Note that `cursor` and `pageFrameCursor` don't need the assignment pattern since they're not accessed after close, but consistency with the `path` pattern is fine.

core/src/test/java/io/questdb/test/griffin/engine/table/parquet/ReadParquetFunctionTest.java (4)
100-114: Excellent test documentation and validation.

The comment clearly explains that without projection pushdown, a `SelectedRecord` operator would appear in the plan. The plan assertions correctly verify the projection is pushed down for both parallel and non-parallel modes.
119-154: Good coverage for projection with column reordering.

This test validates that columns can be projected in a different order than declared in the Parquet file, which is important for the pushdown optimization.
193-229: Key test for expression handling.

This test correctly verifies that when expressions (like `a_long + 1`) are used, a `VirtualRecord` layer is still present in the plan while the underlying parquet scan only reads the required column (`a_long`). This confirms the projection pushdown works correctly with expressions.
413-413: Minor schema change for test clarity.

Renaming `ts` to `ts1` in table `y` makes the schema difference between tables `x` and `y` more explicit, which better tests the `TableReferenceOutOfDateException` scenario.

core/src/main/java/io/questdb/griffin/engine/functions/table/ReadParquetRecordCursor.java (2)
47-47: LGTM!

The `IntList` import is correctly added to support the new `canProjectMetadata` method signature.
194-202: LGTM!

The column projection initialization logic is correct:
- Clears and repopulates the columns mapping based on current metadata
- Properly sets capacity for pairs (parquetIndex, actualType)
- Throws `TableReferenceOutOfDateException` when projection fails, triggering query recompilation
…rements

- Add assertions to verify parquet decoder is initialized in all partition frame cursors (Fwd/Bwd, Full/Interval)
- Document PartitionDecoder.of(other) lifetime requirements: the source decoder must remain valid while the copy is in use

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
[PR Coverage check] 😍 pass: 277 / 289 (95.85%)
Part of #6369.
Push column projection down to parquet decoder, reading only required columns instead of all columns.
Also optimized multi-threaded reads by sharing Parquet metadata across threads (parsed once per file).
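One common way to get "parsed once per file, shared across threads" is a concurrent memoizing map. The sketch below is a hedged illustration of that idea under made-up names, not the actual QuestDB mechanism: `computeIfAbsent` guarantees the expensive parse (stubbed here as string construction) runs at most once per key even under concurrent callers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hedged sketch (not the actual QuestDB implementation): memoize per-file
// metadata in a concurrent map so parsing happens once per file even when
// many reader threads request it.
public class SharedParquetMetadata {
    static final AtomicInteger parseCount = new AtomicInteger();
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    static String metadataFor(String file) {
        return cache.computeIfAbsent(file, f -> {
            parseCount.incrementAndGet(); // the "parse", done once per file
            return "metadata-of-" + f;    // stands in for parsed metadata
        });
    }

    public static void main(String[] args) {
        String first = metadataFor("hits.parquet");
        String second = metadataFor("hits.parquet"); // served from the cache
        System.out.println(first.equals(second));
    }
}
```

Whatever the real mechanism, the payoff is the same: per-thread decoders no longer each pay the metadata-parsing cost for the same file.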
Q1 Clickbench and hits.parquet on my dev environment (Mac M4)
Other `CursorFunction` cursors don't need this; they read in-memory data, where the existing optimizer's `SelectedRecord` approach is already efficient.