Skip to content

feat(sql): support include prevailing in window join#6476

Merged
puzpuzpuz merged 46 commits intopuzpuzpuz_parallel_window_join_pocfrom
window_join_include_prevailing
Dec 16, 2025
Merged

feat(sql): support include prevailing in window join#6476
puzpuzpuz merged 46 commits intopuzpuzpuz_parallel_window_join_pocfrom
window_join_include_prevailing

Conversation

@kafka1991
Copy link
Copy Markdown
Collaborator

@kafka1991 kafka1991 commented Dec 1, 2025

Follow-up for #6292

Depends on #6292

  • AsyncWindowJoinFastRecordCursorFactory support include prevailing @kafka1991
  • AsyncWindowJoinRecordCursorFactory support include prevailing @kafka1991
  • (single-thread window join)WindowJoinFastRecordCursorFactory and WindowJoinRecordCursorFactory support include prevailing @kafka1991
  • parallel fast factory: lazy prevailing rows search with per-worker cache for vectorized reducers @puzpuzpuz
  • parallel fast factory: lazy prevailing rows search with per-worker cache for non-vectorized reducers
  • add fuzz tests @puzpuzpuz
  • add tests @kafka1991
  • make include prevailing as default strategy instead of exclude prevailing @kafka1991
  • [NICE TO HAVE] try to extract slaveFilter from joinFilter

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Dec 1, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This PR introduces support for "prevailing" context in window join operations. It adds a boolean includePrevailing flag throughout the join engine, updates SQL parsing for "include"/"exclude" keywords, extends TimeFrameHelper with prevailing row tracking and navigation methods, and updates test infrastructure accordingly.

Changes

Cohort / File(s) Change Summary
Window Join Core Factories
core/src/main/java/io/questdb/griffin/engine/join/WindowJoinFastRecordCursorFactory.java, core/src/main/java/io/questdb/griffin/engine/join/WindowJoinRecordCursorFactory.java
Added includePrevailing flag to constructor signatures. Introduced new cursor variants (WindowJoinWithPrevailingFastRecordCursor, WindowJoinWithPrevailingAndJoinFilterRecordCursor) and dynamic selection logic. Extended plan output and internal state handling for prevailing-aware join processing.
Async Window Join Atoms
core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinAtom.java, core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinFastAtom.java
Added includePrevailing boolean parameter to constructors. Updated plan output to display "(include prevailing)" or "(exclude prevailing)".
TimeFrame Helper
core/src/main/java/io/questdb/griffin/engine/join/TimeFrameHelper.java
Updated binarySearch and linearScan signatures to accept recordPrevailing boolean. Added new public methods: findRowLoWithPrevailing(), previousFrame(), setBookmark(), getBookmarkedFrameIndex(), getBookmarkedRowId(), getPrevailingFrameIndex(), getPrevailingRowId(). Introduced private helpers binarySearchWithPrevailing(), linearScanWithPrevailing().
Async Window Join Cursor Factory
core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinRecordCursorFactory.java
Added includePrevailing parameter to constructor. Introduced multiple new PageFrameReducer variants for prevailing scenarios. Implemented dynamic reducer selection based on includePrevailing, filters, and vectorization flags. Added findPrevailingForMasterRow helper function.
SQL Code Generation & Parsing
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java, core/src/main/java/io/questdb/griffin/SqlKeywords.java, core/src/main/java/io/questdb/griffin/SqlParser.java, core/src/main/java/io/questdb/griffin/SqlOptimiser.java
Updated keyword methods: renamed isIncluding()isIncludeKeyword(), removed isExcluding(). Removed assertion restricting prevailing inclusion in window joins. Renamed private helper methods (isExcludingPrevailing()isExcludePrevailing(), isIncludingPrevailing()isIncludePrevailing()). Propagated context.isIncludePrevailing() into join factory invocations. Adjusted lo calculation to use Numbers.LONG_NULL sentinel when prevailing is included.
Query Model & Context
core/src/main/java/io/questdb/griffin/model/QueryModel.java, core/src/main/java/io/questdb/griffin/model/WindowJoinContext.java, core/src/main/java/io/questdb/griffin/model/RuntimeIntervalModelBuilder.java
Added isCteModel() public accessor and internal field to QueryModel. Changed WindowJoinContext.includePrevailing default from false to true. Updated RuntimeIntervalModelBuilder.merge() to handle LONG_NULL/Long.MAX_VALUE offsets and conditionally call intersect() based on interval boundaries. Updated window-join textual output from "including/excluding prevailing" to "include/exclude prevailing".
Symbol & Aggregate Functions
core/src/main/java/io/questdb/cairo/TableWriter.java, core/src/main/java/io/questdb/cairo/map/MapValue.java, core/src/main/java/io/questdb/cairo/sql/PageFrameMemoryRecord.java
Added symbolNullValue parameter to createSymbolMapWriter() and propagated at call sites with logic-dependent values. Added no-op setNew(boolean isNew) default method to MapValue interface. Refactored variable scope in PageFrameMemoryRecord getSymA/getSymB by narrowing key declaration.
Test Infrastructure
core/src/test/java/io/questdb/test/AbstractCairoTest.java
Updated method signatures to accept CharSequence instead of String for expected/query/plan parameters across multiple assertion method overloads.
Test Updates (Expectations & Logic)
core/src/test/java/io/questdb/test/TelemetryTest.java, core/src/test/java/io/questdb/test/cairo/fuzz/WindowJoinFuzzTest.java, core/src/test/java/io/questdb/test/cairo/fuzz/ParallelWindowJoinFuzzTest.java, core/src/test/java/io/questdb/test/griffin/AlterTableAddColumnTest.java, core/src/test/java/io/questdb/test/griffin/SqlParserTest.java, core/src/test/java/io/questdb/test/griffin/engine/join/TimeFrameHelperTest.java
Updated test expected values to reflect new upsertKey values (TelemetryTest). Extended WindowJoinFuzzTest with includePrevailing parameter and conditional Oracle/left-join query generation using UNION and LATEST ON constructs. Updated ParallelWindowJoinFuzzTest with EXCLUDE PREVAILING clauses. Modified TimeFrameHelperTest to pass new recordPrevailing flag to binarySearch and added extensive prevailing row candidate tests. Updated string literals reflecting shortened keyword phrasing.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas requiring extra attention:

  • WindowJoinFastRecordCursorFactory & WindowJoinRecordCursorFactory: Substantial refactors with new cursor variants, field visibility changes, and complex prevailing row aggregation logic requiring careful validation of state transitions and index adjustments.
  • TimeFrameHelper public API expansion: Multiple new public methods for prevailing row tracking; verify consistency of frameIndex/rowId state management and backward compatibility of updated signatures.
  • AsyncWindowJoinRecordCursorFactory: Multiple new PageFrameReducer variants and dynamic selection logic; validate reducer selection conditions and prevailing data flow correctness.
  • WindowJoinFuzzTest oracle query logic: UNION-based subquery with LATEST ON construct is non-trivial; verify correctness of expected result generation and equivalence to prevailing join semantics.
  • TimeFrameHelperTest: Large new test suite for prevailing logic; verify edge cases and frame boundary conditions are adequately covered.

Possibly related PRs

Suggested labels

Enhancement

Suggested reviewers

  • puzpuzpuz

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 4.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat(sql): support include prevailing in window join' clearly and concisely summarizes the main feature being implemented—adding support for the include prevailing clause in window join operations.
Description check ✅ Passed The PR description clearly relates to the changeset, describing the implementation of 'include prevailing' support in various window join factories with completed checklist items and test additions.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@kafka1991 kafka1991 marked this pull request as draft December 1, 2025 06:30
@kafka1991 kafka1991 changed the title feat(core): Window join support include prevailing feat(sql): Window join support include prevailing Dec 1, 2025
@puzpuzpuz puzpuzpuz changed the title feat(sql): Window join support include prevailing feat(sql): support include prevailing in window join Dec 1, 2025
@puzpuzpuz puzpuzpuz added New feature Feature requests SQL Issues or changes relating to SQL execution labels Dec 1, 2025
@bluestreak01
Copy link
Copy Markdown
Member

@CodeRabbit review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Dec 12, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/src/main/java/io/questdb/cairo/TableWriter.java (1)

1041-1046: Fix wrong column index passed to SymbolMapWriter during type conversion.
At Line 1042, metadata.getColumnCount() - 1 refers to the last existing column at this point, not the new columnIndex (Line 1034). This can desync SymbolMapWriter.getColumnIndex() from its actual position in symbolMapWriters, which can break downstream logic that relies on getColumnIndex().

-                createSymbolMapWriter(columnName, columnNameTxn, symbolCapacity, symbolCacheFlag, false, metadata.getColumnCount() - 1);
+                createSymbolMapWriter(columnName, columnNameTxn, symbolCapacity, symbolCacheFlag, false, columnIndex);
🧹 Nitpick comments (8)
core/src/test/java/io/questdb/test/AbstractCairoTest.java (4)

1842-1860: Make expectedTimestamp @Nullable CharSequence for consistency (and avoid needless String-only API).
assertFactoryCursor(...) already forwards expectedTimestamp into assertTimestamp(CharSequence, ...), so keeping it as String is an unnecessary restriction/inconsistency with other helpers in this class.

 protected void assertFactoryCursor(
         CharSequence expected,
-        String expectedTimestamp,
+        @Nullable CharSequence expectedTimestamp,
         RecordCursorFactory factory,
         boolean supportsRandomAccess,
         SqlExecutionContext executionContext,
         boolean expectSize,
         boolean sizeCanBeVariable
 ) throws SqlException {

1891-1897: Widening expected/query to CharSequence looks good; consider widening expectedTimestamp too.
This overload now accepts CharSequence for expected/query, but still forces expectedTimestamp to be a String. Aligning it to @Nullable CharSequence would keep the API uniform.


1934-1939: assertQueryAndPlan(...) signature widening looks fine; align expectedTimestamp type with other helpers.
Same note: expectedTimestamp being String is the odd one out once expected/query/plan are CharSequence.


1966-1987: Compiler overload widening is OK; annotate/nullability for expectedTimestamp to match actual usage.
This overload is a good place to standardize expectedTimestamp as @Nullable CharSequence, since it’s passed through and other assertTimestamp(...) overloads are already CharSequence.

 protected void assertQueryNoLeakCheck(
         SqlCompiler compiler,
         CharSequence expected,
         CharSequence query,
-        String expectedTimestamp,
+        @Nullable CharSequence expectedTimestamp,
         SqlExecutionContext sqlExecutionContext,
         boolean supportsRandomAccess,
         boolean expectSize
 ) throws SqlException {
core/src/main/java/io/questdb/griffin/SqlParser.java (1)

704-713: Deduplicate prevailing-modifier parsing helpers. These two methods are copy/paste symmetric; consider one helper that takes the keyword predicate + sets the boolean, to reduce future drift.

-    private boolean isExcludePrevailing(GenericLexer lexer, CharSequence tok) throws SqlException {
-        if (isExcludeKeyword(tok)) {
-            tok = tok(lexer, "'prevailing'");
-            if (isPrevailingKeyword(tok)) {
-                return true;
-            }
-            throw SqlException.$(lexer.lastTokenPosition(), "'prevailing' expected");
-        }
-        return false;
-    }
+    private boolean parsePrevailingModifier(GenericLexer lexer, CharSequence tok, boolean include) throws SqlException {
+        if ((include && isIncludeKeyword(tok)) || (!include && isExcludeKeyword(tok))) {
+            tok = tok(lexer, "'prevailing'");
+            if (isPrevailingKeyword(tok)) {
+                return true;
+            }
+            throw SqlException.$(lexer.lastTokenPosition(), "'prevailing' expected");
+        }
+        return false;
+    }

Also applies to: 719-728

core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (3)

338-339: Avoid redundant static import of QueryModel.QUERY (may trip checkstyle).
You already have import static io.questdb.griffin.model.QueryModel.*;, so the explicit QUERY import is likely redundant and could be flagged by style tooling.

 import static io.questdb.griffin.model.QueryModel.*;
-import static io.questdb.griffin.model.QueryModel.QUERY;

3778-3824: Good: propagate includePrevailing into AsyncWindowJoinFastRecordCursorFactory creation.
Plumbing the flag here is the right hook point for prevailing-aware cursor behavior.

Consider assigning final boolean includePrevailing = context.isIncludePrevailing(); once and reusing it across the four constructor calls (less noise + less risk of argument misplacement in these long parameter lists).


3826-3870: Good: propagate includePrevailing into AsyncWindowJoinRecordCursorFactory creation.
This keeps the non-fast async path aligned with the fast path feature-wise.

Same note: prefer a local includePrevailing variable to reduce churn and argument-list fragility across overloads.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9da6a9 and 94b2cbf.

📒 Files selected for processing (24)
  • core/src/main/java/io/questdb/cairo/TableWriter.java (4 hunks)
  • core/src/main/java/io/questdb/cairo/map/MapValue.java (1 hunks)
  • core/src/main/java/io/questdb/cairo/sql/PageFrameMemoryRecord.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (6 hunks)
  • core/src/main/java/io/questdb/griffin/SqlKeywords.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/SqlOptimiser.java (0 hunks)
  • core/src/main/java/io/questdb/griffin/SqlParser.java (4 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/groupby/AvgDoubleGroupByFunction.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinAtom.java (4 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinFastAtom.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinRecordCursorFactory.java (12 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/TimeFrameHelper.java (13 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/WindowJoinFastRecordCursorFactory.java (12 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/WindowJoinRecordCursorFactory.java (8 hunks)
  • core/src/main/java/io/questdb/griffin/model/QueryModel.java (4 hunks)
  • core/src/main/java/io/questdb/griffin/model/RuntimeIntervalModelBuilder.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/model/WindowJoinContext.java (2 hunks)
  • core/src/test/java/io/questdb/test/AbstractCairoTest.java (5 hunks)
  • core/src/test/java/io/questdb/test/TelemetryTest.java (1 hunks)
  • core/src/test/java/io/questdb/test/cairo/fuzz/ParallelWindowJoinFuzzTest.java (12 hunks)
  • core/src/test/java/io/questdb/test/cairo/fuzz/WindowJoinFuzzTest.java (5 hunks)
  • core/src/test/java/io/questdb/test/griffin/AlterTableAddColumnTest.java (1 hunks)
  • core/src/test/java/io/questdb/test/griffin/SqlParserTest.java (2 hunks)
  • core/src/test/java/io/questdb/test/griffin/engine/join/TimeFrameHelperTest.java (4 hunks)
💤 Files with no reviewable changes (1)
  • core/src/main/java/io/questdb/griffin/SqlOptimiser.java
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-10T14:28:48.329Z
Learnt from: mtopolnik
Repo: questdb/questdb PR: 0
File: :0-0
Timestamp: 2025-11-10T14:28:48.329Z
Learning: In AsOfJoinDenseRecordCursorFactoryBase.java, the `backwardScanExhausted` flag is intentionally NOT reset in `toTop()` because backward scan results are reusable across cursor rewinds. The backward scan caches historical matches that remain valid when the cursor is rewound.

Applied to files:

  • core/src/main/java/io/questdb/griffin/engine/join/TimeFrameHelper.java
  • core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinRecordCursorFactory.java
  • core/src/main/java/io/questdb/griffin/model/QueryModel.java
  • core/src/main/java/io/questdb/griffin/engine/join/WindowJoinRecordCursorFactory.java
  • core/src/main/java/io/questdb/griffin/engine/join/WindowJoinFastRecordCursorFactory.java
📚 Learning: 2025-11-19T12:21:00.062Z
Learnt from: jerrinot
Repo: questdb/questdb PR: 6413
File: core/src/test/java/io/questdb/test/cutlass/pgwire/PGJobContextTest.java:11982-12002
Timestamp: 2025-11-19T12:21:00.062Z
Learning: QuestDB Java tests use a deterministic random seed. The test utilities (e.g., io.questdb.test.tools.TestUtils and io.questdb.std.Rnd) produce reproducible sequences, so rnd_* functions (including rnd_uuid4) yield deterministic outputs across runs. Do not flag tests in core/src/test/** that assert against values produced by rnd_* as flaky due to randomness.

Applied to files:

  • core/src/test/java/io/questdb/test/TelemetryTest.java
🧬 Code graph analysis (6)
core/src/main/java/io/questdb/griffin/engine/functions/groupby/AvgDoubleGroupByFunction.java (1)
core/src/main/java/io/questdb/std/Vect.java (1)
  • Vect (27-501)
core/src/main/java/io/questdb/griffin/engine/join/TimeFrameHelper.java (1)
core/src/main/java/io/questdb/std/Rows.java (1)
  • Rows (27-44)
core/src/main/java/io/questdb/griffin/model/RuntimeIntervalModelBuilder.java (1)
core/src/main/java/io/questdb/std/Numbers.java (1)
  • Numbers (44-3523)
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (2)
core/src/main/java/io/questdb/griffin/model/QueryModel.java (1)
  • QueryModel (62-2210)
core/src/main/java/io/questdb/std/Numbers.java (1)
  • Numbers (44-3523)
core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinRecordCursorFactory.java (2)
core/src/main/java/io/questdb/std/Vect.java (1)
  • Vect (27-501)
core/src/main/java/io/questdb/std/Rows.java (1)
  • Rows (27-44)
core/src/main/java/io/questdb/griffin/engine/join/WindowJoinRecordCursorFactory.java (2)
core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinRecordCursorFactory.java (1)
  • AsyncWindowJoinRecordCursorFactory (76-1741)
core/src/main/java/io/questdb/std/Rows.java (1)
  • Rows (27-44)
🔇 Additional comments (45)
core/src/test/java/io/questdb/test/AbstractCairoTest.java (1)

1959-1964: assertQueryFullFatNoLeakCheck(...) CharSequence change is safe.
No functional concerns here; it cleanly delegates to the compiler overload.

core/src/main/java/io/questdb/griffin/engine/functions/groupby/AvgDoubleGroupByFunction.java (1)

48-56: The logic is sound and appropriate for window join operations.

This change correctly improves null/NaN handling in computeBatch() by explicitly counting non-null values with Vect.countDouble() and only writing aggregates when nonNullCount > 0. This ensures consistency with computeFirst() (which checks Numbers.isFinite()) and aligns with the merge() method's logic, which already guards against zero counts.

For double values, this distinction is important since NaN values must be excluded from both the sum and count. The change is directly relevant to window join operations with "include prevailing," where batch aggregation must handle null/NaN values consistently with row-by-row aggregation paths.

core/src/test/java/io/questdb/test/griffin/AlterTableAddColumnTest.java (1)

874-874: No issues found. The symbolTableSize = 1 for the newly added SYMBOL column "second" is the correct expected behavior. Newly created SYMBOL columns are initialized with a symbol table containing one entry (reserved for null/empty values), which is consistent with QuestDB's symbol handling design. The test expectation is accurate.

core/src/main/java/io/questdb/griffin/model/WindowJoinContext.java (2)

41-41: LGTM: Default strategy changed to include prevailing.

The default value change from false to true aligns with the PR objective to make "include prevailing" the default strategy for window joins.


65-65: LGTM: Reset logic updated consistently.

The reset value in clear() correctly matches the new default initialization value, ensuring consistent behavior after context reuse.

core/src/main/java/io/questdb/cairo/sql/PageFrameMemoryRecord.java (1)

485-485: LGTM: Variable scope appropriately narrowed.

Declaring key directly within the conditional block improves code clarity by limiting its scope to where it's actually used.

Also applies to: 495-495

core/src/main/java/io/questdb/cairo/map/MapValue.java (1)

128-129: Optional hook method added for implementations.

The empty default implementation follows the standard pattern for adding optional interface methods without breaking existing implementations. Implementations that need to track "new" state can override this method.

core/src/test/java/io/questdb/test/cairo/fuzz/ParallelWindowJoinFuzzTest.java (1)

66-432: LGTM: Comprehensive test coverage for prevailing semantics.

The tests now validate both default (include prevailing) and explicit EXCLUDE PREVAILING behaviors across multiple scenarios including filtered queries, symbol joins, bind variables, and vectorized execution. The consistent pattern of paired query-result blocks ensures thorough coverage.

core/src/test/java/io/questdb/test/griffin/SqlParserTest.java (2)

1655-1666: The test expectations in SqlParserTest.java correctly reflect the current default behavior where window joins default to exclude prevailing. The stringification code always outputs the prevailing mode explicitly, so there is no brittleness risk from silent defaults. Comprehensive parameterized test coverage for both include and exclude modes exists in WindowJoinTest.java. No changes needed to these tests.


12124-12178: All assertions in this section include exclude prevailing in the expected output, making the originally identified inconsistency not present in this code. If there is an actual inconsistency with exclude prevailing handling, it may involve test cases outside this range or a different aspect of window join formatting that warrants investigation.

core/src/main/java/io/questdb/griffin/model/QueryModel.java (2)

1112-1115: Accessor/setter additions look consistent
isCteModel() and setIsCteModel(boolean) are straightforward, and the setIsUpdate(boolean) move doesn’t appear to change behavior.

Also applies to: 1378-1385


1947-1951: Confirm plan/sink wording change is intentional and stable for tests
Switching to " include prevailing" / " exclude prevailing" will change string outputs used by plan/sink-based assertions; worth double-checking any snapshot/plan tests rely on the old phrasing.

core/src/main/java/io/questdb/cairo/TableWriter.java (3)

3294-3312: New SYMBOL column on non-empty table: symbolNullValue = getRowCount() > 0 seems reasonable; confirm semantics.
Line 3298: for an added SYMBOL column, existing rows are effectively NULL (due to column tops), so persisting a “has nulls” flag up-front is consistent with the on-disk reality.

Two quick things to double-check:

  • That MapWriter.put(null) (or equivalent) still flips the null-flag for the “empty table at creation” case.
  • That readers/recovery paths interpret this flag as “may contain NULL” (safe if true) vs “must contain NULL” (could be problematic).

1-40: This file change looks unrelated to the PR objective (window join include prevailing) — please confirm it’s intended.
The modifications here are about SYMBOL null handling in TableWriter, not window joins.


4307-4341: createSymbolMapWriter: Null handling during symbol conversion is correctly managed.

The hard-coded symbolNullValue=false at line 1042 is safe and intentional. During conversion, when convertFixedToSymbol0 encounters NULL values, it calls symbolMapWriter.resolveSymbol(null), which delegates to TableWriter.getSymbolIndexNoTransientCountUpdate()SymbolMapWriter.put(null, ...). The put() method automatically calls updateNullFlag(true) whenever a NULL symbol is added (line 242), dynamically correcting the flag if needed. The pattern of initializing with false and allowing dynamic correction during data processing is sound.

core/src/main/java/io/questdb/griffin/SqlParser.java (2)

3245-3249: Comment formatting change looks fine.


2949-2961: Verify the necessity of lexer.unparseLast() when optTok() returns null.

The code calls unparseLast() both when a token is read but rejected (lines 2955-2956) and when no token is returned (lines 2957-2959). However, unparseLast() only acts when last != null, which means when optTok() returns null at end-of-input, it will unparse the previously processed token rather than a token that was actually read and rejected. This differs from the intent on lines 2950-2956, where a token was read and determined not to be INCLUDE/EXCLUDE.

Consider whether the else block (lines 2957-2959) is necessary. If the lexer position doesn't require resetting when optTok() returns null, this call can be removed.

core/src/test/java/io/questdb/test/cairo/fuzz/WindowJoinFuzzTest.java (2)

74-74: LGTM! Good fuzz test coverage for the new flag.

The random selection of includePrevailing ensures both code paths are exercised during fuzz testing.


186-228: Complex oracle query for includePrevailing - verify correctness.

The UNION-based approach to emulate ASOF JOIN with join filters using LEFT JOIN + LATEST ON is complex. A few observations:

  1. Line 187: The first_not_null replacement is intentional since prevailing rows may contribute non-null values where the standard window would have nulls.

  2. Line 227: The LATEST ON pts PARTITION BY ts, sym, price, id clause partitions by master row identifiers to find one prevailing row per master row, which is the correct approach for emulating ASOF semantics.

  3. The query structure unions:

    • Standard LEFT JOIN results (rows within the window)
    • LATEST ON results (the single prevailing row before the window)

This is a reasonable oracle for testing purposes.

Consider adding a comment explaining the oracle query construction strategy, as the UNION + LATEST ON approach is non-obvious. This would help future maintainers understand why this complex structure is needed.

core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinAtom.java (1)

593-613: LGTM! Plan output correctly reflects the prevailing flag.

The toPlan method now outputs whether prevailing rows are included or excluded, which aids in query plan debugging and analysis.

core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinFastAtom.java (1)

65-114: LGTM! Correctly forwards the new parameter to superclass.

The includePrevailing parameter is properly added to the constructor signature and forwarded to the parent AsyncWindowJoinAtom constructor.

core/src/test/java/io/questdb/test/griffin/engine/join/TimeFrameHelperTest.java (3)

141-348: Excellent test coverage for prevailing candidate functionality.

The new tests comprehensively cover:

  • Basic prevailing recording
  • Cross-frame scenarios
  • Edge cases (no prevailing, all rows below range, exact matches)
  • Binary search vs linear scan code paths
  • Frame boundary conditions

The test method naming clearly describes each scenario being tested.


532-574: Well-designed fuzz test for prevailing functionality.

The fuzz test correctly implements the expected semantics:

  1. First searches for the last row with timestamp <= timestampLo (prevailing candidate)
  2. Falls back to the first row in [timestampLo, timestampHi] if no prevailing exists

This matches the expected behavior of findRowLoWithPrevailing and provides good randomized coverage.


617-675: Thorough sequential bookmark test covering all edge cases.

The testFindRowLoWithPrevailingMultiFrameSequentialBookmark test methodically covers:

  • No prevailing scenarios
  • Exact matches at frame boundaries
  • Mid-frame positions
  • Gaps between frames
  • Past all frames

This is excellent regression test coverage.

core/src/main/java/io/questdb/griffin/model/RuntimeIntervalModelBuilder.java (1)

226-243: Unable to locate file for verification. The file RuntimeIntervalModelBuilder.java and its code snippet cannot be found in the repository despite extensive searching. The review comment cannot be verified without access to the actual code being reviewed.

core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (3)

3874-3891: Good: propagate includePrevailing into WindowJoinFastRecordCursorFactory creation.
Ensures single-thread fast window join behavior matches the async fast path.


3893-3907: Good: propagate includePrevailing into WindowJoinRecordCursorFactory creation.
Completes the flag propagation across single-threaded non-fast window join.


6835-6845: Correctness: lo = Numbers.LONG_NULL for include prevailing avoids interval pruning that could drop needed prevailing rows.
This looks like the right behavior for “prevailing” semantics (you may need rows earlier than the nominal lo window to find the prevailing row), and the time-unit conversion is correctly skipped in that branch.

Please ensure there’s coverage (unit/fuzz/regression) that exercises partition/interval pruning with WINDOW ... include prevailing across partition boundaries (i.e., prevailing row in an earlier partition than the nominal lo would allow), since this change is specifically guarding that behavior.

core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinRecordCursorFactory.java (3)

173-183: LGTM! Clear reducer selection logic.

The branching structure for selecting the appropriate reducer based on includePrevailing, masterFilter, joinFilter, and vectorization is well-organized and covers all necessary combinations.


1687-1731: Well-structured bookmark save/restore pattern in findPrevailingForMasterRow.

The helper correctly saves and restores the TimeFrameHelper bookmark state in a try-finally block, ensuring cursor position is preserved even if an exception occurs. The backward scan logic through frames is sound.

One observation: the method is package-private (static) but appears to be used from WindowJoinRecordCursorFactory via static import. This is appropriate for cross-class sharing within the package.


593-597: Verify whether rows 825–828 and 1588–1591 require the prevailing formula.

The formula Math.max(-rowLo - 2, 0) is applied at lines 595, 708, 1317, and 1450 for prevailing-row logic with group-by computation. However, lines 825–828 and 1588–1591 use the same binary search pattern with masterSlaveTimestampLo/Hi but apply -rowLo - 1 instead. These latter sections perform an exact-match check (timestamps.get(rowLo) == masterSlaveTimestampLo) after the search. Confirm whether the prevailing formula should be applied here as well, or if the different formula is intentional for this exact-match code path.

core/src/main/java/io/questdb/griffin/engine/join/WindowJoinRecordCursorFactory.java (4)

324-329: LGTM! Clean conditional for prevailing path selection.

The branching between findRowLoWithPrevailing and findRowLo based on includePrevailing is straightforward and maintains backward compatibility.


458-471: Potential issue: slaveRowId may be Long.MIN_VALUE when only prevailing exists.

When slaveRowId == Long.MIN_VALUE && prevailingFrameIndex != -1, the code correctly calls findPrevailingForMasterRow and returns. However, this early return means no in-window rows are processed, which is the intended behavior when there are no rows in the window but a prevailing row exists.


477-488: Verify slaveRowId increment after exact match at window boundary.

After checking if slaveTimestamp == slaveTimestampLo and the join filter passes (lines 480-487), slaveRowId is incremented. This is correct to avoid double-processing the boundary row. However, if slaveRowId now equals getTimeFrameRowHi(), the subsequent loop (lines 503-528) handles frame advancement properly.


503-528: Frame boundary check moved inside the loop.

Unlike the parent class where ++slaveRowId and frame boundary check happen at loop end, here the check is at loop start (line 505). This correctly handles the case where slaveRowId was incremented after processing the boundary row before the loop began.

core/src/main/java/io/questdb/griffin/engine/join/TimeFrameHelper.java (5)

68-73: Correct prevailing candidate recording during binary search.

When an exact match is found at mid, the code correctly records index - 1 as the prevailing candidate (the row just before the match). This aligns with the "prevailing" semantics.


207-333: Complex but well-structured findRowLoWithPrevailing method.

The method correctly:

  1. Tracks prevailing candidates across frames during navigation
  2. Returns the prevailing row when it's the best match (no in-window rows)
  3. Updates bookmarks appropriately for subsequent calls

The use of local prevailingRowId and prevailingFrameIndex variables (lines 210-211) that shadow instance fields is intentional—these track candidates during search before committing to instance state.


266-277: Verify the return value when frame starts after timestampLo.

When frameTsLo > timestampLo (line 266), if a prevailing candidate exists, the code returns prevailingRowId (line 272). However, the record is positioned at the prevailing row, but the return value is prevailingRowId which is a local row index within a frame, not a global row ID.

This appears intentional since the caller expects a local row index and uses slaveTimeFrameHelper.getTimeFrameIndex() separately to construct the full row ID.


397-409: LGTM! previousFrame navigation helper.

The method correctly decrements the bookmarked frame index, opens the previous frame, and positions at the last row. The return value correctly indicates whether navigation succeeded.


537-570: Encoding convention for linearScanWithPrevailing return values.

The method uses:

  • >= 0: found row index
  • Long.MIN_VALUE: no prevailing found
  • Negative other: -(last scanned row) - 2 to signal "need binary search"

The decoding at line 321 (-scanResult - 2) correctly recovers scanHi - 1, which is the last scanned position.

core/src/main/java/io/questdb/griffin/engine/join/WindowJoinFastRecordCursorFactory.java (5)

762-785: Good! Prevailing initialization for vectorized path.

When includePrevailing is true, the code calls findInitialPrevailingRowsVect to pre-populate prevailing rows into the slave data structures before processing in-window rows. This ensures prevailing data is available for subsequent aggregation.


841-841: Prevailing row boundary in vectorized cursor.

The conditional includePrevailing ? Math.max(-rowLo - 2, 0) : -rowLo - 1 correctly applies the prevailing adjustment only when enabled, maintaining backward compatibility for the non-prevailing case.


1039-1057: Prevailing search with join filter - correct fallback to frame scan.

When in-window rows don't match the join filter and cached rows are exhausted, the code correctly falls back to findPrevailingForMasterRow which scans earlier frames. The bookmark save/restore in that helper ensures cursor state is preserved.


1174-1186: WindowJoinWithPrevailingFastRecordCursor uses findInitialPrevailingRows.

Unlike the join-filtered variant, this cursor pre-populates prevailing rows via findInitialPrevailingRows regardless of whether slaveRowId != Long.MIN_VALUE. This ensures prevailing candidates are available even when there are no in-window rows.


510-515: Index layout differs intentionally between cursor variants—no inconsistency found.

Non-vectorized cursors (WindowJoinFastRecordCursor and its prevailing variants) consistently use: idx=0 for rowIDs, idx=1 for timestamps, idx=2 for rowLo. The vectorized cursor (WindowJoinFastVectRecordCursor) uses a different layout: idx=0 for timestamps, idx=1 for rowLo, idx=2+ for aggregation columns—intentionally omitting rowIDs since it aggregates directly without row tracking. Both layouts are internally consistent within their respective implementations.

Comment on lines 175 to 180
private boolean forceBackwardScan;
private boolean isCteModel;
//simple flag to mark when limit x,y in current model (part of query) is already taken care of by existing factories e.g. LimitedSizeSortedLightRecordCursorFactory
//and doesn't need to be enforced by LimitRecordCursor. We need it to detect whether current factory implements limit from this or inner query .
private boolean isLimitImplemented;
// A flag to mark intermediate SELECT translation models. Such models do not contain the full list of selected
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

isCteModel must be included in equals() and hashCode()
You added isCteModel and correctly reset it in clear(), but it’s not accounted for in equals() (test helper) or hashCode(), which contradicts the class contract comment and can mask future reset/regression issues.

--- a/core/src/main/java/io/questdb/griffin/model/QueryModel.java
+++ b/core/src/main/java/io/questdb/griffin/model/QueryModel.java
@@ -659,6 +659,7 @@
                 && orderByAdviceMnemonic == that.orderByAdviceMnemonic
                 && tableId == that.tableId
                 && isUpdateModel == that.isUpdateModel
+                && isCteModel == that.isCteModel
                 && modelType == that.modelType
                 && artificialStar == that.artificialStar
                 && skipped == that.skipped
@@ -1100,7 +1101,7 @@
                 modelPosition, orderByAdviceMnemonic, tableId,
-                isUpdateModel, modelType, updateTableModel,
+                isUpdateModel, isCteModel, modelType, updateTableModel,
                 updateTableToken, artificialStar, fillFrom, fillStride, fillTo, fillValues, decls, windowJoinContext
         );
     }

Also applies to: 480-506, 618-727, 1069-1106

🤖 Prompt for AI Agents
In core/src/main/java/io/questdb/griffin/model/QueryModel.java around lines
175-180 (and also review sections 480-506, 618-727, 1069-1106), the newly added
boolean field isCteModel is reset in clear() but not included in equals() or
hashCode(); update all equals() implementations and the hashCode() computation
to include isCteModel so object equality and hashing reflect this state, and
ensure any test helper equals/hashCode variants in the ranges noted are updated
similarly to compare the isCteModel flag and incorporate it into the hash
calculation.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll handle this one.

@puzpuzpuz puzpuzpuz merged commit cd56fe2 into puzpuzpuz_parallel_window_join_poc Dec 16, 2025
5 checks passed
@puzpuzpuz puzpuzpuz deleted the window_join_include_prevailing branch December 16, 2025 14:01
@puzpuzpuz
Copy link
Copy Markdown
Contributor

Closing in favor of #6292

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

New feature Feature requests SQL Issues or changes relating to SQL execution

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants