Skip to content

feat(sql): join enhancements#6076

Merged
bluestreak01 merged 38 commits intomasterfrom
victor_hash_join
Sep 30, 2025
Merged

feat(sql): join enhancements#6076
bluestreak01 merged 38 commits intomasterfrom
victor_hash_join

Conversation

@kafka1991
Copy link
Copy Markdown
Collaborator

@kafka1991 kafka1991 commented Aug 25, 2025

Problem

In our current HashJoin implementation, we always use the right table as the buildTable and the left table as the probe. This leads to suboptimal performance in scenarios where the right table is significantly larger than the left table. However, due to the lack of statistical information, we are currently unable to perform join reordering at the optimizer level.

Enhancements

This PR introduces the following enhancements:

  • For HashLightInnerJoin and HashLightFullOuterJoin, we now support limited runtime reordering of left and right cursor. This allows the join to dynamically choose the more efficient build/probe table combination in certain scenarios.
  • Added support for Right Outer Join, providing users with an alternative when Left Outer Join performance is unsatisfactory.
  • Implemented Full Outer Join functionality, expanding the range of join operations supported.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Aug 25, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

✨ Finishing touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch victor_hash_join

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🧪 Early access (Sonnet 4.5): enabled

We are currently testing the Sonnet 4.5 model, which is expected to improve code review quality. However, this model may lead to increased noise levels in the review comments. Please disable the early access features if the noise level causes any inconvenience.

Note:

  • Public repositories are always opted into early access features.
  • You can enable or disable early access features from the CodeRabbit UI or by updating the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

@kafka1991 kafka1991 self-assigned this Aug 25, 2025
@kafka1991 kafka1991 marked this pull request as draft August 25, 2025 13:06
@kafka1991 kafka1991 changed the title feat(query): join enhancements feat(sql): join enhancements Aug 25, 2025
@kafka1991 kafka1991 added Enhancement Enhance existing functionality Performance Performance improvements labels Aug 25, 2025
@kafka1991 kafka1991 marked this pull request as ready for review August 29, 2025 12:27
@puzpuzpuz
Copy link
Copy Markdown
Contributor

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Sep 5, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/src/main/java/io/questdb/griffin/engine/join/HashJoinRecordCursorFactory.java (1)

199-213: Add circuit-breaker checks to long-running loops.

Both the probe loop and size computation can scan large masters without touching the circuit breaker, risking uninterruptible work.

Apply:

@@ public class HashJoinRecordCursorFactory extends AbstractJoinRecordCursorFactory {
-            while (masterCursor.hasNext()) {
+            while (masterCursor.hasNext()) {
+                circuitBreaker.statefulThrowExceptionIfTripped();
                 MapKey key = joinKeyMap.withKey();
                 key.put(masterRecord, masterKeySink);
                 MapValue value = key.findValue();
                 if (value != null) {
                     slaveChain.of(value.getInt(0));
@@
-    private static long computeCursorSizeFromMap(RecordCursor masterCursor, Map map, RecordSink keySink) {
+    private static long computeCursorSizeFromMap(RecordCursor masterCursor, Map map, RecordSink keySink, SqlExecutionCircuitBreaker circuitBreaker) {
         final Record masterRecord = masterCursor.getRecord();
         long size = 0;
         try {
             masterCursor.toTop();
-            while (masterCursor.hasNext()) {
+            while (masterCursor.hasNext()) {
+                circuitBreaker.statefulThrowExceptionIfTripped();
                 MapKey key = map.withKey();
                 key.put(masterRecord, keySink);
                 MapValue value = key.findValue();
                 if (value != null) {
                     size += value.getLong(2);
                 }
             }
             return size;
         } finally {
             masterCursor.toTop();
         }
     }
@@         public long size() {
-            return size = computeCursorSizeFromMap(masterCursor, joinKeyMap, masterKeySink);
+            return size = computeCursorSizeFromMap(masterCursor, joinKeyMap, masterKeySink, circuitBreaker);

Also applies to: 130-147, 226-227

core/src/main/java/io/questdb/griffin/SqlOptimiser.java (1)

1454-1464: Fix joinsRequiringTimestamp mapping length in SqlOptimiser
The static boolean array at SqlOptimiser.java:121 only has 11 entries (indexes 0–10), but QueryModel.JOIN_MAX is 12, so you must add two trailing false values for JOIN_CROSS_RIGHT (11) and JOIN_CROSS_FULL (12).

🧹 Nitpick comments (55)
core/src/test/java/io/questdb/test/griffin/SqlParserTest.java (4)

62-62: Join-type parameterization field: solid, with a small enhancement suggestion

The list is fine for now. Consider also covering the “outer” keyword forms in a follow-up (e.g., "left outer", "right outer", "full outer") to ensure the parser accepts both synonyms.


6247-6249: Add negative cases including the ‘outer’ keyword

You validate missing ON for “left/right/full join”. Add the equivalent “left/right/full outer join” variants to ensure identical diagnostics.

Also applies to: 6253-6256


7348-7355: Rename tests for clarity

Tests now exercise left/right/full joins but still read “LeftOuterJoin”. Rename to reflect multi-variant coverage (e.g., testOuterJoinVariants) to reduce confusion.

Also applies to: 8589-8595


12143-12152: Use String.replace instead of replaceAll

You’re doing a literal token swap; replaceAll incurs regex parsing and has edge cases with replacement metacharacters. replace is cheaper and aligns with assertWindowQuery’s approach.

Apply this diff:

-    for (String outerJoinType : outerJoinTypes) {
-            assertQuery(
-                    expected.replaceAll("#OUTER_JOIN_TYPE", outerJoinType),
-                    query.replaceAll("#OUTER_JOIN_TYPE", outerJoinType),
-                    tableModels
-            );
-        }
+    for (String outerJoinType : outerJoinTypes) {
+        assertQuery(
+            expected.replace("#OUTER_JOIN_TYPE", outerJoinType),
+            query.replace("#OUTER_JOIN_TYPE", outerJoinType),
+            tableModels
+        );
+    }
core/src/main/java/io/questdb/griffin/model/QueryModel.java (2)

1776-1801: Add explicit rendering for CROSS_ join types in toSink0*

JOIN_CROSS_LEFT/RIGHT/FULL currently fall through to default " join ", which misrepresents the join in textual output. Emit " cross join " for these as well.

                         switch (model.getJoinType()) {
+                            case JOIN_CROSS_LEFT:
+                            case JOIN_CROSS_RIGHT:
+                            case JOIN_CROSS_FULL:
+                                sink.putAscii(" cross join ");
+                                break;
                             case JOIN_LEFT_OUTER:
                                 sink.putAscii(" left join ");
                                 break;
                             case JOIN_RIGHT_OUTER:
                                 sink.putAscii(" right join ");
                                 break;
                             case JOIN_FULL_OUTER:
                                 sink.putAscii(" full join ");
                                 break;
                             case JOIN_ASOF:
                                 sink.putAscii(" asof join ");
                                 break;
                             case JOIN_SPLICE:
                                 sink.putAscii(" splice join ");
                                 break;
                             case JOIN_CROSS:
                                 sink.putAscii(" cross join ");
                                 break;

804-807: Annotate getJoinContext() as @nullable

Matches usage check at Line 1817 and existing style (e.g., getAsOfJoinTolerance()).

-    public JoinContext getJoinContext() {
+    @Nullable
+    public JoinContext getJoinContext() {
         return context;
     }
core/src/test/java/io/questdb/test/griffin/GroupByTest.java (1)

1470-1619: Nice parametrization; consider adding a negative-match case for RIGHT/FULL

Data uses 1..10 on both sides, so LEFT/RIGHT/FULL yield identical results. Add a case with extra rows on one side to assert null-extended semantics for RIGHT/FULL too.

I can add an extra id in dim_apTemperature or fact_table and extend expectedResult for each join type.

core/src/main/java/io/questdb/griffin/engine/join/AbstractHashOuterJoinRecordCursor.java (2)

90-114: Replace magic indices in MapValue with named constants

Indexes 0/1/2/3 are non-obvious. Define constants to prevent future mistakes.

@@
 public abstract class AbstractHashOuterJoinRecordCursor extends AbstractJoinCursor {
     protected final Map joinKeyMap;
     protected final RecordChain slaveChain;
+    // MapValue layout:
+    // 0: head offset in chain, 1: tail offset, 2: count, 3: match flag (optional)
+    protected static final int VAL_HEAD_OFFSET = 0;
+    protected static final int VAL_TAIL_OFFSET = 1;
+    protected static final int VAL_COUNT       = 2;
+    protected static final int VAL_MATCH_FLAG  = 3;
@@
-                value.putLong(0, offset);
-                value.putLong(1, offset);
-                value.putLong(2, 1);
+                value.putLong(VAL_HEAD_OFFSET, offset);
+                value.putLong(VAL_TAIL_OFFSET, offset);
+                value.putLong(VAL_COUNT, 1);
             } else {
-                value.putLong(1, chain.put(record, value.getLong(1)));
-                value.addLong(2, 1);
+                value.putLong(VAL_TAIL_OFFSET, chain.put(record, value.getLong(VAL_TAIL_OFFSET)));
+                value.addLong(VAL_COUNT, 1);
             }
@@
-                value.putLong(0, offset);
-                value.putLong(1, offset);
-                value.putLong(2, 1);
-                value.putBool(3, false);
+                value.putLong(VAL_HEAD_OFFSET, offset);
+                value.putLong(VAL_TAIL_OFFSET, offset);
+                value.putLong(VAL_COUNT, 1);
+                value.putBool(VAL_MATCH_FLAG, false);
             } else {
-                value.putLong(1, chain.put(record, value.getLong(1)));
-                value.addLong(2, 1);
+                value.putLong(VAL_TAIL_OFFSET, chain.put(record, value.getLong(VAL_TAIL_OFFSET)));
+                value.addLong(VAL_COUNT, 1);
             }

Also applies to: 116-141


143-157: Minor: clarify slaveRecord source

Using slaveChain.getRecord() is intentional (readback from chain). Add a brief comment for future readers.

-        slaveRecord = slaveChain.getRecord();
+        // read joined rows back from the chain, not directly from slaveCursor
+        slaveRecord = slaveChain.getRecord();
core/src/main/java/io/questdb/cairo/ArrayColumnTypes.java (1)

31-31: Make ArrayColumnTypes.EMPTY unmodifiable
Prevent any accidental add/addAll/clear calls on the shared EMPTY instance by replacing it with a read-only subclass:

-    public static final ArrayColumnTypes EMPTY = new ArrayColumnTypes();
+    public static final ArrayColumnTypes EMPTY = new UnmodifiableArrayColumnTypes();
// below existing methods in ArrayColumnTypes.java
private static final class UnmodifiableArrayColumnTypes extends ArrayColumnTypes {
    @Override public ArrayColumnTypes add(int type)              { throw new UnsupportedOperationException("EMPTY is read-only"); }
    @Override public ArrayColumnTypes add(int index, int type)   { throw new UnsupportedOperationException("EMPTY is read-only"); }
    @Override public ArrayColumnTypes addAll(ArrayColumnTypes c) { throw new UnsupportedOperationException("EMPTY is read-only"); }
    @Override public void clear()                                { throw new UnsupportedOperationException("EMPTY is read-only"); }
}
core/src/test/java/io/questdb/test/griffin/SqlCodeGeneratorTest.java (2)

2406-2511: RIGHT JOIN ON-clause test looks good; consider minor resiliency/readability tweaks.

  • To avoid future ambiguity, qualify projected/filtered columns (l.x, rr.y) in both SELECT and ON expressions.
  • Optional: the expected output depends on the engine’s internal iteration order (no ORDER BY). If runtime join reordering changes row emission order, this may become brittle. Consider stabilizing with an explicit ORDER BY rr.y and updating the expected, or adding a companion assertion that checks set-equality to reduce brittleness.

Proposed minimal qualification-only edit:

- "select x, y from l right join rr on l.x = rr.y and (y > 0 or y > 10)",
+ "select l.x, rr.y from l right join rr on l.x = rr.y and (rr.y > 0 or rr.y > 10)",

2582-2690: RIGHT JOIN with WHERE-clause filtering: semantics covered; qualify columns for clarity and consider order stability.

  • Qualify x/y to l.x/rr.y in SELECT and WHERE to prevent future name clashes and improve readability.
  • As above, lack of ORDER BY makes the long expected block sensitive to internal ordering. Consider adding ORDER BY rr.y (and updating the expected) or supplementing with an order-agnostic check.

Proposed minimal qualification-only edit:

- "select x, y\n" +
-         "from l right join rr on l.x = rr.y\n" +
-         "where y > 0 or y > 10",
+ "select l.x, rr.y\n" +
+         "from l right join rr on l.x = rr.y\n" +
+         "where rr.y > 0 or rr.y > 10",
core/src/main/java/io/questdb/cairo/RecordIdSink.java (1)

31-36: Avoid exposing a mutable singleton; return an immutable view instead.

RECORD_ID_COLUMN_TYPE is publicly mutable. Any accidental add/remove elsewhere will desync the sink from its declared key schema.

Apply:

@@
-package io.questdb.cairo;
+package io.questdb.cairo;
 
 import io.questdb.cairo.sql.Function;
 import io.questdb.cairo.sql.Record;
+import io.questdb.cairo.ColumnTypes;
 import io.questdb.std.ObjList;
 
 public class RecordIdSink implements RecordSink {
-    public static final ArrayColumnTypes RECORD_ID_COLUMN_TYPE = new ArrayColumnTypes();
+    private static final ArrayColumnTypes RECORD_ID_COLUMN_TYPES = new ArrayColumnTypes(1);
     public static final RecordSink RECORD_ID_SINK = new RecordIdSink();
@@
     static {
-        RECORD_ID_COLUMN_TYPE.add(ColumnType.LONG);
+        RECORD_ID_COLUMN_TYPES.add(ColumnType.LONG);
     }
+
+    public static ColumnTypes recordIdColumnTypes() {
+        return RECORD_ID_COLUMN_TYPES;
+    }
 }

Also applies to: 47-49

core/src/test/java/io/questdb/test/griffin/engine/join/HashJoinTest.java (1)

121-137: Stabilize result ordering to avoid flakes.

The assertions depend on row order without ORDER BY. Runtime join reordering may change enumeration order. Add ORDER BY to make expectations deterministic.

Apply:

-            assertQueryNoLeakCheck("i\tlocale_name\ti1\tstate\tcity\n" +
-                    "1\tpl\t1\ta\tpl\n", "select * from taba left join tabb on taba.i = tabb.i and (locale_name = state OR locale_name=city)", null);
+            assertQueryNoLeakCheck("i\tlocale_name\ti1\tstate\tcity\n" +
+                    "1\tpl\t1\ta\tpl\n", "select * from taba left join tabb on taba.i = tabb.i and (locale_name = state OR locale_name=city) order by 3,4,5", null);
-            assertQueryNoLeakCheck("i\tlocale_name\ti1\tstate\tcity\n" +
+            assertQueryNoLeakCheck("i\tlocale_name\ti1\tstate\tcity\n" +
                     "1\tpl\t1\ta\tpl\n" +
-                    "null\t\t1\tb\tb\n", "select * from taba right join tabb on taba.i = tabb.i and (locale_name = state OR locale_name=city)", null);
+                    "null\t\t1\tb\tb\n", "select * from taba right join tabb on taba.i = tabb.i and (locale_name = state OR locale_name=city) order by 3,4,5", null);
-            assertQueryNoLeakCheck("i\tlocale_name\ti1\tstate\tcity\n" +
+            assertQueryNoLeakCheck("i\tlocale_name\ti1\tstate\tcity\n" +
                     "1\tpl\t1\ta\tpl\n" +
-                    "null\t\t1\tb\tb\n", "select * from taba full join tabb on taba.i = tabb.i and (locale_name = state OR locale_name=city)", null);
+                    "null\t\t1\tb\tb\n", "select * from taba full join tabb on taba.i = tabb.i and (locale_name = state OR locale_name=city) order by 3,4,5", null);
core/src/main/java/io/questdb/griffin/engine/join/RightOuterJoinRecord.java (1)

29-36: Lock down inheritance and annotate override.

This record is a utility wrapper; making it final and adding @OverRide improves clarity and inlining chances.

Apply:

-public class RightOuterJoinRecord extends JoinRecord {
+public final class RightOuterJoinRecord extends JoinRecord {
@@
-    public void of(Record master, Record slave) {
+    @Override
+    public void of(Record master, Record slave) {
         super.of(master, slave);
         this.flappingMaster = master;
     }

Also applies to: 38-41

core/src/main/java/io/questdb/griffin/SqlParser.java (2)

2584-2603: RIGHT/FULL handling looks correct; consider de-duplicating token consumption

Logic for LEFT/RIGHT/FULL with optional OUTER is fine. To cut repetition and keep error messaging consistent, consider a small helper to “consume optional OUTER then require JOIN,” used by all three branches. This will also centralize the "'join' expected" check.


4259-4266: Duplicate joinStartSet mappings for LEFT/RIGHT/FULL

You put LEFT/RIGHT/FULL into joinStartSet twice: first mapped to JOIN_INNER, then to their OUTER variants. The latter overwrites the former, so the earlier puts are dead and can confuse future maintainers. Remove the inner mappings to keep intent clear.

Apply:

-        joinStartSet.put("left", QueryModel.JOIN_INNER);
-        joinStartSet.put("right", QueryModel.JOIN_INNER);
-        joinStartSet.put("full", QueryModel.JOIN_INNER);
         joinStartSet.put("join", QueryModel.JOIN_INNER);
         joinStartSet.put("inner", QueryModel.JOIN_INNER);
         joinStartSet.put("left", QueryModel.JOIN_LEFT_OUTER);
         joinStartSet.put("right", QueryModel.JOIN_RIGHT_OUTER);
         joinStartSet.put("full", QueryModel.JOIN_FULL_OUTER);
core/src/test/java/io/questdb/test/griffin/SqlOptimiserTest.java (4)

38-40: Imports: prefer immutable factory; drop Arrays

If you switch to List.of(...) below, you can remove import java.util.Arrays;.

-import java.util.Arrays;
 import java.util.List;

60-60: Use List.of for immutability and clarity

No mutation here; List.of("left","right","full") is simpler and avoids regex pitfalls with replaceAll later if ever expanded.

-private static final List<String> outerJoinTypes = Arrays.asList("left", "right", "full");
+private static final List<String> outerJoinTypes = List.of("left", "right", "full");

434-480: Good parametrization; use replace instead of replaceAll (no regex needed)

Literal marker replacement is clearer and avoids regex overhead.

-).replaceAll("#JOIN_TYPE", joinType);
+).replace("#JOIN_TYPE", joinType);
-"        Hash #JOIN_TYPE Outer Join Light\n".replaceAll("#JOIN_TYPE", Character.toUpperCase(joinType.charAt(0)) + joinType.substring(1)) +
+"        Hash #JOIN_TYPE Outer Join Light\n".replace("#JOIN_TYPE", Character.toUpperCase(joinType.charAt(0)) + joinType.substring(1)) +

1143-1159: RIGHT JOIN result test may not validate outer semantics

The WHERE predicates on el.* null-reject unmatched right rows, effectively making this behave like INNER JOIN for results. If the goal is to validate RIGHT OUTER result emission, consider moving el.* predicates into the ON clause (or duplicating a variant without those predicates) and asserting presence of unmatched right-side rows.

core/src/test/java/io/questdb/test/griffin/SqlCompilerImplTest.java (2)

3769-3813: Full outer join post-metadata plan looks correct; consider avoiding NOW() for stability

Plan text matches the new "Hash Full Outer Join Light" node and join nesting. To keep this test deterministic across runs and CI clock skew, consider replacing NOW() with a fixed timestamp or binding a constant. For example, compare against a literal known to miss to exercise the same plan without time dependency.

- "WHERE T2.created IN (NOW(),NOW()) ";
+ "WHERE T2.created IN (to_timestamp('2100-01-01','yyyy-MM-dd'),to_timestamp('2100-01-01','yyyy-MM-dd')) ";

5945-6200: Right outer join coverage and reorder tests are solid; one minor stability suggestion

Great to see RIGHT OUTER JOIN post-metadata and reorder paths covered, including null-propagation cases. Similar to the FULL test, replacing NOW() with a literal can make results less time-dependent.

- "WHERE T2.created IN (NOW(),NOW()) ";
+ "WHERE T2.created IN (to_timestamp('2100-01-01','yyyy-MM-dd'),to_timestamp('2100-01-01','yyyy-MM-dd')) ";
core/src/main/java/io/questdb/griffin/engine/join/AbstractHashOuterJoinLightRecordCursor.java (4)

68-71: Use of preComputedStateSize for map-built hint is fine; add a brief comment

Returning RecordCursor.fromBool(isMapBuilt) is terse but non-obvious. A short comment about the contract (e.g., “1 when map built else 0”) would aid maintainability.


78-87: Reset logic on toTop() could be stricter to avoid stale state

You clear joinKeyMap and slaveChain only when !isMapBuilt. If a caller does toTop() after a run that built the map (e.g., before a new scan that should rebuild), stale entries may persist until the subclass forces a rebuild. Consider always clearing map/chain on toTop() unless there is a proven reuse scenario that requires retention.

     public void toTop() {
         masterCursor.toTop();
         slaveChainCursor = null;
-        if (!isMapBuilt) {
-            slaveCursor.toTop();
-            joinKeyMap.clear();
-            slaveChain.clear();
-        }
+        slaveCursor.toTop();
+        joinKeyMap.clear();
+        slaveChain.clear();
+        isMapBuilt = false;
     }

89-109: Tighten and document row-id chaining; minor duplication cleanup

Logic is correct: chaining row IDs via the head index in value slot 0. You can reduce duplication and make intent clearer with a single put:

-            MapValue value = key.createValue();
-            if (value.isNew()) {
-                value.putInt(0, rowIDChain.put(record.getRowId(), -1));
-            } else {
-                value.putInt(0, rowIDChain.put(record.getRowId(), value.getInt(0)));
-            }
+            MapValue value = key.createValue();
+            final int next = value.isNew() ? -1 : value.getInt(0);
+            value.putInt(0, rowIDChain.put(record.getRowId(), next));

Also consider a brief comment indicating that slot 0 stores the head of the per-key row-id chain.


111-132: Same duplication cleanup with matched-flag initialization

Keep the matched flag set only when new, and collapse the chain handling as above.

-            MapValue value = key.createValue();
-            if (value.isNew()) {
-                value.putInt(0, rowIDChain.put(record.getRowId(), -1));
-                value.putBool(1, false);
-            } else {
-                value.putInt(0, rowIDChain.put(record.getRowId(), value.getInt(0)));
-            }
+            MapValue value = key.createValue();
+            final int next = value.isNew() ? -1 : value.getInt(0);
+            value.putInt(0, rowIDChain.put(record.getRowId(), next));
+            if (value.isNew()) {
+                value.putBool(1, false);
+            }
core/src/main/java/io/questdb/griffin/engine/join/FullOuterJoinRecord.java (1)

47-57: Consider extracting duplicated logic into a helper method

The hasMaster and hasSlave methods contain nearly identical logic. This duplication could be reduced for better maintainability.

Consider refactoring to reduce code duplication:

+    private void setRecordState(boolean value, Record flapping, Record nullRecord, boolean isMaster) {
+        if (value) {
+            Record current = isMaster ? master : slave;
+            if (flapping != current) {
+                if (isMaster) {
+                    master = flapping;
+                } else {
+                    slave = flapping;
+                }
+            }
+        } else {
+            Record current = isMaster ? master : slave;
+            if (current != nullRecord) {
+                if (isMaster) {
+                    master = nullRecord;
+                } else {
+                    slave = nullRecord;
+                }
+            }
+        }
+    }
+
     void hasMaster(boolean value) {
-        if (value) {
-            if (flappingMaster != master) {
-                master = flappingMaster;
-            }
-        } else {
-            if (master != masterNullRecord) {
-                master = masterNullRecord;
-            }
-        }
+        setRecordState(value, flappingMaster, masterNullRecord, true);
     }

     void hasSlave(boolean value) {
-        if (value) {
-            if (flappingSlave != slave) {
-                slave = flappingSlave;
-            }
-        } else {
-            if (slave != slaveNullRecord) {
-                slave = slaveNullRecord;
-            }
-        }
+        setRecordState(value, flappingSlave, slaveNullRecord, false);
     }

Also applies to: 63-73

core/src/test/java/io/questdb/test/griffin/JoinTest.java (3)

259-264: Comment/time mismatch in testAsOfCorrectness

The comment says “quote msft @ 10:00:03.000001” but the inserted timestamp is 2018-11-02T10:00:02.000002Z. Please align the comment or the value.

-                // quote msft @ 10.00.03.000001
-                rQuotes = quotes.newRow(TimestampFormatUtils.parseUTCTimestamp("2018-11-02T10:00:02.000002Z"));
+                // quote msft @ 10.00.02.000002
+                rQuotes = quotes.newRow(TimestampFormatUtils.parseUTCTimestamp("2018-11-02T10:00:02.000002Z"));

1515-1518: Duplicate leak test: testAsOfJoinRecordNoLeaks[2]

Both methods run the same query with the same fullFat flag (false). Either remove the second or switch it to exercise the alternate path if relevant.

-    public void testAsOfJoinRecordNoLeaks2() throws Exception {
-        testJoinForCursorLeaks("with crj as (select x, ts from xx latest by x) select xx.x from xx asof join crj on xx.x = crj.x ", false);
-    }
+    public void testAsOfJoinRecordNoLeaks2() throws Exception {
+        // If full-fat mode is supported here, flip the flag to broaden coverage; otherwise remove this duplicate test.
+        testJoinForCursorLeaks("with crj as (select x, ts from xx latest by x) select xx.x from xx asof join crj on xx.x = crj.x ", true);
+    }

Also applies to: 1520-1523


2931-2973: Factor repeated outer-join assertions into a small helper

The LEFT, RIGHT, and FULL variants here share the same projection and ordering. Consider a tiny helper that runs all three and compares against a single expected builder to reduce duplication and maintenance overhead.

Also applies to: 2970-2973

core/src/main/java/io/questdb/griffin/engine/join/HashOuterJoinFilteredRecordCursorFactory.java (2)

112-145: Fail fast on unknown joinType in switch.

Add a default branch to avoid a potential NPE if an unexpected joinType slips through.

         if (cursor == null) {
             switch (joinType) {
                 case QueryModel.JOIN_LEFT_OUTER:
                     cursor = new HashLeftOuterJoinFilteredRecordCursor(
                         columnSplit,
                         NullRecordFactory.getInstance(slaveFactory.getMetadata()),
                         joinKeyMap,
                         slaveChain
                     );
                     break;
                 case QueryModel.JOIN_RIGHT_OUTER:
                     cursor = new HashRightOuterJoinFilteredRecordCursor(
                         columnSplit,
                         NullRecordFactory.getInstance(masterFactory.getMetadata()),
                         joinKeyMap,
                         matchIdsMap,
                         slaveChain
                     );
                     break;
                 case QueryModel.JOIN_FULL_OUTER:
                     cursor = new HashFullOuterJoinFilteredRecordCursor(
                         columnSplit,
                         NullRecordFactory.getInstance(masterFactory.getMetadata()),
                         NullRecordFactory.getInstance(slaveFactory.getMetadata()),
                         joinKeyMap,
                         matchIdsMap,
                         slaveChain
                     );
                     break;
+                default:
+                    throw new IllegalArgumentException("Unsupported outer join type: " + joinType);
             }
             this.joinKeyMap = null;
             this.slaveChain = null;
             this.matchIdsMap = null;
         }

431-448: RIGHT OUTER: explicitly reset useSlaveCursor after draining slaveChain.

Not strictly necessary (guarded by hasNext), but makes state transitions deterministic and mirrors FULL OUTER.

             if (useSlaveCursor && slaveChain.hasNext()) {
                 do {
                     if (record.hasMaster()) {
                         if (filter.getBool(record)) {
                             MapKey keys = matchIdsMap.withKey();
                             keys.put(slaveRecord, RecordIdSink.RECORD_ID_SINK);
                             keys.createValue();
                             return true;
                         }
                     } else {
                         MapKey keys = matchIdsMap.withKey();
                         keys.put(slaveRecord, RecordIdSink.RECORD_ID_SINK);
                         if (keys.findValue() == null) {
                             return true;
                         }
                     }
                 } while (slaveChain.hasNext());
+                useSlaveCursor = false;
             }
core/src/main/java/io/questdb/griffin/SqlOptimiser.java (8)

121-121: Make timestamp-requirement mapping robust (avoid brittle boolean array).

The fixed-position array is easy to desync from QueryModel join-type constants and can yield wrong behavior or OOB if new join types are added. Prefer a switch-based helper.

Apply:

-    private static final boolean[] joinsRequiringTimestamp = {false, false, false, false, true, true, true, false, false, false, false};
+    private static boolean joinRequiresTimestamp(int joinType) {
+        switch (joinType) {
+            case QueryModel.JOIN_ASOF:
+            case QueryModel.JOIN_SPLICE:
+            case QueryModel.JOIN_LT:
+                return true;
+            default:
+                return false;
+        }
+    }

Then update call sites (see Lines 1454-1464).


568-576: Context accessor naming is asymmetric.

getJoinContext() pairs better with setJoinContext() than setContext(); consider renaming for API clarity.


667-673: Preserve outer-join barrier in the topo-sort by recording a parent edge.

linkDependencies() prevents reordering, but when the join has no JoinContext (e.g., CROSS_LEFT/RIGHT/FULL after homogenization) the inCount path won’t reflect the dependency and the node is later appended as “cross”. Explicitly attach the previous model as a parent in the join’s JoinContext so the topo-sort respects the barrier.

Apply:

     model.setOuterJoinExpressionClause(concatFilters(model.getOuterJoinExpressionClause(), node));
-    // add dependency to prevent previous model reordering (left/right outer joins are not symmetric)
+    // add dependency to prevent previous model reordering (left/right outer joins are not symmetric)
     if (joinIndex > 0) {
       linkDependencies(parent, joinIndex - 1, joinIndex);
+      // ensure topo-sort sees the barrier even when no context yet
+      JoinContext ctx = model.getJoinContext();
+      if (ctx == null) {
+          ctx = contextPool.next();
+          ctx.slaveIndex = joinIndex;
+          model.setContext(ctx);
+      }
+      if (!ctx.parents.contains(joinIndex - 1)) {
+          ctx.parents.add(joinIndex - 1);
+      }
     }

Please run queries with 3+ tables mixing INNER with LEFT/RIGHT and CROSS_* (e.g., A INNER B LEFT C RIGHT D) and confirm the produced orderedJoinModels never place C/D before their respective parents.


1454-1464: Avoid array indexing for join timestamp requirement.

If adopting the helper, update the condition accordingly.

Apply:

-            if (joinsRequiringTimestamp[m.getJoinType()]) {
+            if (joinRequiresTimestamp(m.getJoinType())) {
                 linkDependencies(parent, 0, i);
                 if (m.getJoinContext() == null) {
                     m.setContext(jc = contextPool.next());
                     jc.parents.add(0);
                     jc.slaveIndex = i;
                 }
             }

Add a sanity test to ensure only ASOF/LT/SPLICE require timestamp; if more join kinds are added, the switch must be updated.


1719-1731: Classifying “cross-like” joins by null context may miss barriers.

The condition treats any join with null context as “cross”. For CROSS_LEFT/RIGHT/FULL this is intentional, but they may still carry barriers via dependencies. Your fix in addOuterJoinExpression mitigates this; consider explicitly checking joinBarriers to make the intent obvious.

Apply:

-            if (q.getJoinType() == QueryModel.JOIN_CROSS || q.getJoinContext() == null || q.getJoinContext().parents.size() == 0) {
+            if (q.getJoinType() == QueryModel.JOIN_CROSS ||
+                (q.getJoinContext() == null && !joinBarriers.contains(q.getJoinType())) ||
+                (q.getJoinContext() != null && q.getJoinContext().parents.size() == 0)) {

2498-2524: Homogenization to CROSS_ variants is sensible; ensure semantics don’t change.*

LEFT/RIGHT/FULL with joinCriteria but no equi-ctx are relabeled to CROSS_*; this affects reorder/pushdown later. Add tests that WHERE and ON predicates yield identical results pre/post homogenization, especially for NULL-producing sides.

I can draft parametrized SQL tests covering LEFT/RIGHT/FULL with/without ON, and with constants/functions in ON/WHERE. Want me to open a test PR?


3588-3618: Propagating literals through updated join context accessors — LGTM.

One nit: leftJoinWhere variable name now refers to generic outer-join expression. Consider renaming to outerJoinWhere for readability.


7264-7275: Barrier set updated (RIGHT_OUTER + CROSS_*): good coverage.

Minor nit on the TODO phrasing.

Apply:

-        // todo full outer join do not support reorder for now
+        // TODO: FULL OUTER JOIN does not support reordering yet
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (3)

312-313: Redundant static import.

QUERY is already covered by import static io.questdb.griffin.model.QueryModel.*;. Remove the dedicated QUERY import to avoid duplication.

-import static io.questdb.griffin.model.QueryModel.QUERY;

2621-2660: CROSS_LEFT/RIGHT/FULL timestamp policy — confirm intent.

CROSS_LEFT keeps master timestamp index; CROSS_RIGHT/CROSS_FULL set -1. If downstream optimizations (ORDER BY ts, LATEST BY, SAMPLE BY) would benefit from a designated timestamp when the preserved side has one, consider mirroring LEFT behavior for RIGHT (using slave ts) and decide for FULL. If -1 is deliberate, add a short comment.


3052-3064: Constant-false ON predicate fast-path — consider FULL OUTER case.

You optimized LEFT/INNER/RIGHT. For FULL OUTER with ON FALSE, a specialized plan (unmatched on both sides) could avoid hash-build/probe, but it’s optional.

core/src/main/java/io/questdb/griffin/engine/join/HashOuterJoinRecordCursorFactory.java (6)

48-49: Reduce inter-factory coupling from static import.

outerJoinTypeToString() coming from HashOuterJoinFilteredLightRecordCursorFactory creates an odd dependency between factories. Consider moving the helper to a shared utility (e.g., JoinPlanPrinter) or keep a qualified call. Low impact, improves cohesion.


147-148: Scan direction policy OK; confirm RIGHT-join expectations.

Returning SCAN_DIRECTION_OTHER for RIGHT/FULL is conservative. If RIGHT OUTER joins are expected to preserve right-side scan order, consider delegating to slaveFactory.getScanDirection(). If not, current behavior is fine.


168-171: Free-and-null to avoid stale references on reuse.

Use the Misc.free(...) return to null fields and prevent accidental reuse after close.

Apply:

-        Misc.free(cursor);
-        Misc.free(joinKeyMap);
-        Misc.free(slaveChain);
+        cursor = Misc.free(cursor);
+        joinKeyMap = Misc.free(joinKeyMap);
+        slaveChain = Misc.free(slaveChain);

173-261: FULL OUTER cursor logic: correct phases; just replace magic indices.

Build-with-flag → matched via master scan (mark flag) → unmatched via map scan is correct. Resource handling (mapCursor) and toTop/reset look good. Replace raw 0/3 indices per earlier suggestion for safety.


329-405: RIGHT OUTER cursor logic: correct match-first then unmatched; replace magic indices.

Flow matches semantics and uses the match flag properly. Swap raw indices for the named constants as noted.


65-65: Update valueTypes comment and replace magic indices with named constants

Verified: SqlCodeGenerator appends a BOOLEAN for RIGHT/FULL joins; the constructor comment is stale — update it and introduce named indices to avoid magic numbers.

-            @Transient ColumnTypes valueTypes, // this expected to be just 3 INTs, we store chain references in map
+            @Transient ColumnTypes valueTypes, // expected: 3 chain columns (INT/LONG) + 1 BOOL (match flag for RIGHT/FULL)

Add constants near the top of the class:

// Map value layout
private static final int V_CHAIN_HEAD = 0;
private static final int V_CHAIN_TAIL = 1; // if applicable
private static final int V_CHAIN_COUNT = 2; // if applicable
private static final int V_MATCH_FLAG = 3;

Replace usages (examples):

-                    slaveChain.of(value.getInt(0));
+                    slaveChain.of(value.getInt(V_CHAIN_HEAD));
-                    value.putBool(3, true);
+                    value.putBool(V_MATCH_FLAG, true);
-                if (!value.getBool(3)) {
+                if (!value.getBool(V_MATCH_FLAG)) {
core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java (5)

2316-2339: Reduce duplication in OUTER JOIN plan assertions

The LEFT/RIGHT/FULL loop is repeated in several tests. Consider extracting join types and expected factory names into shared constants/helpers to improve maintainability and keep plan-label expectations in one place.

Example approach:

  • Define class-level constants for OUTER_JOIN_TYPES and OUTER_JOIN_FACTORIES_(LIGHT|FULL).
  • Provide a small helper that takes a SQL template and expected plan template and iterates variants.

2343-2343: Fix method name typos: “Ahd” → “And”

Public test names contain a typo (“Ahd”). Renaming avoids confusion and improves discoverability.

-    public void testFullJoinWithEqualityAndExpressionsAhdWhere1() throws Exception {
+    public void testFullJoinWithEqualityAndExpressionsAndWhere1() throws Exception {
-    public void testFullJoinWithEqualityAndExpressionsAhdWhere3() throws Exception {
+    public void testFullJoinWithEqualityAndExpressionsAndWhere3() throws Exception {
-    public void testOuterJoinWithEqualityAndExpressionsAhdWhere2() throws Exception {
+    public void testOuterJoinWithEqualityAndExpressionsAndWhere2() throws Exception {

Also applies to: 2376-2376, 6073-6073


4617-4636: Add RIGHT/FULL analogs for function/OR join predicates

These two LEFT JOIN tests intentionally fall back to Nested Loop due to non-equi (function/OR) predicates. Consider mirroring them for RIGHT and FULL to lock in fallback behavior across all outer-join directions.

Also applies to: 4638-4657


4659-4853: Clarify conditional plan formatting in the loop

The (i < 3) branches switch between OUTER-HASH vs LT/ASOF shapes and adjust indentation accordingly. A short in-code comment explaining the alignment trick and why equality condition appears only for LEFT/RIGHT/FULL would make future edits safer.

If you prefer, I can factor this into a helper that renders the correct expected plan for each join type to avoid hand-crafted indentation.


6041-6069: Regex plan text stability

Plan expectations use unquoted patterns (a.* and .*z). If the planner later adds quoting, these would become brittle. Consider asserting via contains() on the relevant fragments or splitting into smaller lines to reduce coupling.

Would you like me to refactor this assertion to a more robust contains-based check while preserving strictness elsewhere?

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e89d2fc and 0f25afa.

📒 Files selected for processing (29)
  • core/src/main/java/io/questdb/cairo/ArrayColumnTypes.java (1 hunks)
  • core/src/main/java/io/questdb/cairo/RecordIdSink.java (1 hunks)
  • core/src/main/java/io/questdb/cairo/TableUtils.java (0 hunks)
  • core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (20 hunks)
  • core/src/main/java/io/questdb/griffin/SqlOptimiser.java (18 hunks)
  • core/src/main/java/io/questdb/griffin/SqlParser.java (4 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/AbstractHashOuterJoinLightRecordCursor.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/AbstractHashOuterJoinRecordCursor.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/FullOuterJoinRecord.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/HashJoinLightRecordCursorFactory.java (12 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/HashJoinRecordCursorFactory.java (6 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/HashOuterJoinFilteredLightRecordCursorFactory.java (7 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/HashOuterJoinFilteredRecordCursorFactory.java (6 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/HashOuterJoinLightRecordCursorFactory.java (6 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/HashOuterJoinRecordCursorFactory.java (7 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/NestedLoopFullJoinRecordCursorFactory.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/NestedLoopRightJoinRecordCursorFactory.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/engine/join/RightOuterJoinRecord.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/model/QueryModel.java (4 hunks)
  • core/src/test/java/io/questdb/test/cutlass/suspend/TestCases.java (1 hunks)
  • core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java (7 hunks)
  • core/src/test/java/io/questdb/test/griffin/GroupByTest.java (1 hunks)
  • core/src/test/java/io/questdb/test/griffin/JoinTest.java (53 hunks)
  • core/src/test/java/io/questdb/test/griffin/SqlCodeGeneratorTest.java (2 hunks)
  • core/src/test/java/io/questdb/test/griffin/SqlCompilerImplTest.java (4 hunks)
  • core/src/test/java/io/questdb/test/griffin/SqlOptimiserTest.java (6 hunks)
  • core/src/test/java/io/questdb/test/griffin/SqlParserTest.java (38 hunks)
  • core/src/test/java/io/questdb/test/griffin/engine/groupby/SampleByTest.java (1 hunks)
  • core/src/test/java/io/questdb/test/griffin/engine/join/HashJoinTest.java (2 hunks)
💤 Files with no reviewable changes (1)
  • core/src/main/java/io/questdb/cairo/TableUtils.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
  • GitHub Check: New pull request (Coverage Report Coverage Report)
  • GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-x64-zfs)
  • GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
  • GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
  • GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-arm64)
  • GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
  • GitHub Check: New pull request (Hosted Running tests on windows-other)
  • GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
  • GitHub Check: New pull request (Hosted Running tests on windows-cairo)
  • GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
  • GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
  • GitHub Check: New pull request (Hosted Running tests on windows-griffin)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-other)
  • GitHub Check: New pull request (Hosted Running tests on mac-other)
  • GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-pgwire)
  • GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-cairo)
  • GitHub Check: New pull request (Hosted Running tests on mac-cairo)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-fuzz)
  • GitHub Check: New pull request (Hosted Running tests on mac-griffin)
  • GitHub Check: New pull request (SelfHosted Other tests Start X64Zfs Agent)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-griffin)
  • GitHub Check: New pull request (SelfHosted Other tests Start ARM Agent)
  • GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start X64Zfs Agent)
  • GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start ARM Agent)
  • GitHub Check: New pull request (Check Changes Check changes)
🔇 Additional comments (66)
core/src/test/java/io/questdb/test/griffin/SqlParserTest.java (1)

895-901: Broad adoption of parametric outer-join tests: LGTM

The conversion to a single helper with #OUTER_JOIN_TYPE expands coverage nicely across left/right/full variants without duplicating test bodies.

Also applies to: 905-911, 1730-1738, 4604-4612, 4616-4624, 4637-4661, 4651-4661, 4666-4683, 6820-6887, 6920-6927, 6932-6939, 7348-7355, 8589-8595, 8600-8607, 8629-8637, 8642-8661, 8653-8661, 10416-10439, 11743-11769, 11935-11945

core/src/main/java/io/questdb/griffin/model/QueryModel.java (1)

66-76: Verification complete: legacy APIs removed and constants usage updated
No remaining calls to getContext() or references to JOIN_OUTER; JOIN_MAX is now JOIN_CROSS_FULL (12), and the joinsRequiringTimestamp array size (JOIN_MAX + 1) correctly accommodates all join types.

core/src/test/java/io/questdb/test/griffin/engine/groupby/SampleByTest.java (2)

5955-5955: Plan expectation updated to specific join type — OK

The change to “Hash Left Outer Join Light” aligns with the new explicit LEFT/RIGHT/FULL outer-join labeling introduced in this PR. No further changes needed here.


5952-5958: No legacy ‘Hash Outer Join Light’ labels remain. All test expectations have been updated to explicit Left/Right/Full variants.

core/src/main/java/io/questdb/cairo/RecordIdSink.java (1)

38-41: Confirm all call sites guarantee row ID availability.

copy() uses Record.getRowId(); not all Record implementations expose a valid row id. Please ensure this sink is only wired where row IDs are guaranteed (table-backed records, not synthetic/projected-only records).

core/src/test/java/io/questdb/test/cutlass/suspend/TestCases.java (1)

270-273: Nice expansion to RIGHT/FULL variants.

Good symmetry with existing LEFT cases; this should exercise new paths under suspend/resume.

Also applies to: 275-278, 283-286, 288-291, 293-296

core/src/main/java/io/questdb/griffin/SqlParser.java (2)

2640-2643: ON-clause enabled for LEFT/RIGHT/FULL — LGTM

Including LEFT_OUTER, RIGHT_OUTER, and FULL_OUTER in the ON-clause switch is correct and aligns with expected syntax.


4242-4244: Disallowing RIGHT/FULL as table aliases — LGTM

Adding "right" and "full" to tableAliasStop prevents ambiguous aliases and matches existing treatment of "left"/"outer".

core/src/test/java/io/questdb/test/griffin/SqlOptimiserTest.java (5)

458-479: Plan string template looks correct for all outer-join flavors

The dynamic label substitution to "Hash Left/Right/Full Outer Join Light" aligns with the new toPlan output.


984-999: Updated left-outer plan expectations align with new naming

Switch to "Hash Left Outer Join Light" and structure looks consistent.


1001-1038: Right-outer plan coverage added — nice

Covers plan text and filter positioning; parser is case-insensitive so "Right JOIN" is fine.


1040-1077: Full-outer plan coverage added — looks good

Plan assertion matches the expected "Hash Full Outer Join Light" path.


2231-2231: Plan name update acknowledged

"Hash Left Outer Join Light" matches the new toPlan naming; assertion is correct.

core/src/test/java/io/questdb/test/griffin/SqlCompilerImplTest.java (2)

5288-5288: Plan label update to “Hash Left Outer Join Light” is consistent

The expectation change aligns with the new node naming introduced by the join enhancements. Looks good.


5488-5488: Left-join reorder plan assertion reads well

The “Hash Left Outer Join Light” label under the nested-loop left-join path matches the new taxonomy. No issues spotted.

core/src/main/java/io/questdb/griffin/engine/join/AbstractHashOuterJoinLightRecordCursor.java (2)

58-66: Idempotent close with resource release — good

close() guards against double-close and releases joinKeyMap and slaveChain before delegating to super.close(). This is correct and avoids leaks.


134-147: Ensure fresh state on of(); verify lifecycle expectations

Reopening joinKeyMap/slaveChain only when !isOpen assumes of() is always preceded by close(). If a caller reuses a cursor without closing, stale state could leak. Either clear state unconditionally here or verify the contract.

Would you like me to scan the codebase for all of( invocations of this cursor (and subclasses) to ensure close() precedes subsequent of() calls? I can provide a script to confirm the lifecycle expectations.

core/src/main/java/io/questdb/griffin/engine/join/HashJoinLightRecordCursorFactory.java (4)

225-234: Symbol table delegation logic appears correct

The symbol table methods correctly handle the swapped state by routing to the appropriate cursor based on column index and swap status. The split-point logic is properly maintained.

Also applies to: 261-269


299-320: Resource initialization and swap handling logic looks good

The of method properly initializes the cursor state, including:

  • Setting up the swapped record ordering
  • Assigning the correct sinks based on swap state
  • Properly resetting state flags

106-110: Exception handling properly cleans up resources

The catch block correctly frees both cursors in case of failure, preventing resource leaks.


91-104: Remove unsupported size‐value checks
The size() contract never returns –1—implementations either return a non-negative count or throw DataUnavailableException—and empty cursors correctly return 0, so the suggested sentinel checks aren’t applicable.

Likely an incorrect or invalid review comment.

core/src/main/java/io/questdb/griffin/engine/join/FullOuterJoinRecord.java (1)

41-45: LGTM! Proper initialization of flapping state

The of method correctly caches the non-null records for later restoration when toggling between null and actual records.

core/src/main/java/io/questdb/griffin/engine/join/NestedLoopRightJoinRecordCursorFactory.java (3)

133-163: Complex state machine for RIGHT JOIN iteration appears correct

The nested loop implementation properly handles:

  • Matched pairs from both cursors
  • Unmatched slave rows (when no master matches)
  • State transitions via isMatch and isSlaveHasNextPending flags

The logic correctly implements RIGHT OUTER JOIN semantics.


70-80: Resource cleanup in exception handler is properly implemented

The exception handling correctly frees both cursors to prevent resource leaks.


176-183: toTop() method correctly resets all state

The implementation properly:

  • Resets both cursors to top
  • Resets the filter
  • Reinitialized state flags
  • Restores the master record presence
core/src/test/java/io/questdb/test/griffin/JoinTest.java (6)

73-73: New unmatched-right row ensures FULL JOIN coverage — LGTM

Adding the 'tom' row on table_2 exercises unmatched-right behavior in FULL joins. Looks correct given the subsequent expectations.


1887-1890: Alias propagation checks extended to RIGHT/FULL — LGTM

Good coverage to ensure alias resolution symmetry across join types.


1922-1926: Join-type propagation loop now includes RIGHT/FULL — LGTM

Nice reuse via iteration; increases coverage with minimal duplication.


4513-4522: Plan assertions for RIGHT/FULL outer join light — LGTM

Great to validate the physical operator names for LEFT/RIGHT/FULL variants in the nested-alias case.

Also applies to: 4545-4553, 4575-4584


4648-4651: Confirm assert helper usage

This test uses assertException (not assertExceptionNoLeakCheck used elsewhere). Verify both helpers exist and the difference is intentional for this case.


5723-5726: Varchar/symbol join extensions to RIGHT/FULL — LGTM

Good to assert cross-type equality semantics across all outer join variants.

core/src/main/java/io/questdb/griffin/engine/join/HashOuterJoinFilteredRecordCursorFactory.java (1)

147-149: Confirm disabling pre-touch is intentionally process-wide.

executionContext.setColumnPreTouchEnabled(false) isn’t restored. Verify this side effect is intended for the entire query pipeline using this cursor.

core/src/main/java/io/questdb/griffin/SqlOptimiser.java (8)

868-873: Good clarification on LEFT/RIGHT predicate handling.

The comment succinctly states the rules and matches the implementation below.


920-936: Post-join placement for outer-join predicates looks correct.

When a single-table equality is detected under a join barrier, routing to post-join where is appropriate.


960-970: Guard against pushing into another outer join.

The joinBarriers check before pushing or adding post-join filters is correct and avoids semantic changes.


1752-1757: Topo-sort decrements based on dependencies only when jc != null — OK once parent edges are recorded.

With parent edges attached for outer/cross_* joins, inCount transitions will be honored and orderingStack evolves correctly.


1762-1766: Cycle detection against inCount — OK.

Returning MAX_VALUE when inCount remains > 0 is consistent with rejecting invalid orderings.


3923-3934: Reordering across “crosses” — looks consistent with the new model.

swapJoinOrder() respects barriers via joinBarriers and parents.contains(to). No action.


6554-6563: parents.contains(to) guard avoids stealing across barriers — LGTM.

This prevents moving reversible clauses when the target is not a parent.


6661-6685: Safely split/move clauses and upgrade target to INNER when needed — LGTM.

Also correctly creates a target JoinContext when missing.

core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (5)

209-212: New nested-loop RIGHT/FULL join factories wired in — looks good.

Imports for NestedLoopRightJoinRecordCursorFactory and NestedLoopFullJoinRecordCursorFactory align with new CROSS_RIGHT/CROSS_FULL handling.


2681-2681: JoinContext propagation cleanup — good change.

Switching to processJoinContext(..., slaveModel.getJoinContext(), ...) across ASOF/LT/SPLICE/HASH paths improves key resolution under aliasing/self-joins.

Also applies to: 2942-2942, 3031-3031, 3046-3046, 3076-3076, 2810-2810, 2964-2964


6885-6893: joinsRequiringTimestamp updated for split outer-join types — looks correct.

LEFT/RIGHT/FULL outer joins don’t require a designated timestamp; consistent with earlier semantics.


1304-1336: Gate BOOLEAN addition behind filter == null in light outer joins
Apply the same guard used in the full-fat path so that valueTypes.add(BOOLEAN) only occurs when there’s no filter:

-            if (joinType == JOIN_RIGHT_OUTER || joinType == JOIN_FULL_OUTER) {
-                valueTypes.add(BOOLEAN);
-            }
+            if (filter == null && (joinType == JOIN_RIGHT_OUTER || joinType == JOIN_FULL_OUTER)) {
+                valueTypes.add(BOOLEAN);
+            }

Also verify that HashOuterJoinFilteredLightRecordCursorFactory does not rely on the extra BOOLEAN slot; if it does, align its behavior with the unfiltered path.


1319-1321: All HashOuterJoinRecordCursorFactory instantiations include joinType — verified that each new HashOuterJoin…RecordCursorFactory(…) call in SqlCodeGenerator.java now passes joinType as the final argument at lines 1308–1310, 1324–1326, 1367–1371, and 1384–1388.

core/src/main/java/io/questdb/griffin/engine/join/HashOuterJoinLightRecordCursorFactory.java (3)

285-293: Confirm matched-flag reset on re-iteration (toTop) for FULL OUTER path

HashFullOuterJoinLightLightRecordCursor marks matches via value.putBool(1, true). toTop() rewinds cursors but does not explicitly clear these flags. If super.toTop() doesn’t rebuild/clear the map, the unmatched-slave phase could be wrong on subsequent scans.

Would you confirm AbstractHashOuterJoinLightRecordCursor.toTop() or the next build path resets map values? If not, consider resetting the flag for all entries or rebuilding the map on toTop(). I can provide a targeted patch after confirmation.


87-91: Order-by advice passthrough looks correct for LEFT OUTER joins

Delegating only for LEFT OUTER is consistent with scan-direction handling.


427-451: RIGHT OUTER join iteration/match-tracking looks correct

Map build with matched flag, per-master chaining, and unmatched-slave emission via map cursor are coherent.

core/src/main/java/io/questdb/griffin/engine/join/NestedLoopFullJoinRecordCursorFactory.java (1)

112-118: Plan printing is clear and consistent

The plan includes type and filter and shows both children. Good.

core/src/main/java/io/questdb/griffin/engine/join/HashOuterJoinFilteredLightRecordCursorFactory.java (2)

269-276: Verify sink initialization order before first map population

hasNext() builds the map via populateRowIDHashMap(..., slaveCursorSink, ...). Ensure of(...) is always called before the first hasNext() so slaveCursorSink is set (and swapped correctly). It appears correct via getCursor(), but please confirm no alternative paths call hasNext() earlier.


197-208: outerJoinTypeToString helper is concise and aids plan readability

Good utility; matches plan printing elsewhere.

core/src/main/java/io/questdb/griffin/engine/join/HashOuterJoinRecordCursorFactory.java (5)

51-58: Field additions look good.

columnSplit/joinType are immutable and clearly scoped; joinKeyMap/slaveChain are staged for transfer into the cursor. No concerns here.


93-94: Order-by advice logic is appropriate.

Limiting followedOrderByAdvice() to LEFT OUTER + masterFactory is consistent with scanDirection below.


130-143: Cursor acquisition path looks solid.

Column pre-touch disabled, slave first then master, and proper cleanup on exceptions. Good.


156-161: Plan rendering LGTM.

The plan string includes the specific outer join type and condition. Clear and consistent.


283-323: LEFT OUTER cursor logic looks correct.

Map populate (no match flag needed), per-master probe, slave chain iteration, and null-slave handling are all clean.

core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java (10)

3780-3803: Good coverage for new outer join variants (LIGHT)

The parameterized assertions over LEFT/RIGHT/FULL validate the correct factory labels and join conditions for hash outer joins. Looks solid.


3808-3834: Post-join filter behavior exercised across join types

Asserting “b.i is not null” as a post-join filter is a nice sanity check for OUTER → INNER semantics at the plan layer.


5619-5650: Nice validation of order preservation for Lt/AsOf subqueries

Good assertion that child sort orders are respected without introducing an unnecessary top-level Sort.


5652-5681: Plan-shape tests for Splice and timestamp-ordered subqueries look correct

These ensure upstream ordering advice is honored. No issues spotted.

Also applies to: 5684-5723


5750-5776: Comprehensive OUTER JOIN equality coverage

The new LEFT/RIGHT/FULL cases (single- and multi-key) look consistent with the intended factory names and condition placement.

Also applies to: 5780-5804, 5806-5834, 5839-5875


5878-5904: Filter vs. condition split for expression-based equi joins

Good to see expression predicates (e.g., a2+5=b2+10) stay as filters while pure equalities remain conditions. This accurately reflects current capabilities.

Also applies to: 5906-5933


5936-5980: Edge cases: provably false join filters

Nice set of cases (… and 1=0) locking in left/right/full behaviors (including Empty table on the appropriate side). This will prevent regressions when optimizing predicate pushdown.


5982-6010: Self-true/false predicates under OUTER join

The dedicated tests for a2!=a2 and a2=a2 are helpful to keep optimizer changes honest about filter placement under outer joins.

Also applies to: 6012-6039


6072-6099: Join condition + where-predicate split validated across OUTER variants

Good to see the “and abs(…) = abs(…) where a1=b1” separation explicitly tested for LEFT/RIGHT/FULL.


12035-12035: AsOf + OUTER join chaining improvements

Adding ts2 to tabb and covering LEFT→ASOF, ASOF→RIGHT, ASOF→FULL sequences is valuable. Expected labels (“Hash … Outer Join Light/AsOf Join Fast Scan”) align with the new runtime join features.

Also applies to: 12048-12048, 12062-12107

@puzpuzpuz
Copy link
Copy Markdown
Contributor

@mtopolnik FYI I've noticed that we're reserving a redudant LONG value slot for AsOfJoinLightRecordCursorFactory and LtJoinLightRecordCursorFactory. Fixed it in be68a43

@puzpuzpuz
Copy link
Copy Markdown
Contributor

@kafka1991 I've noticed that HashJoinRecordCursorFactory, HashOuterJoinFilteredRecordCursorFactory and HashOuterJoinRecordCursorFactory were using value.getInt(0) calls to access the RecordChain offsets while they should use value.getLong(0). Fixed that in 1a23877

@puzpuzpuz puzpuzpuz self-requested a review September 29, 2025 15:05
puzpuzpuz
puzpuzpuz previously approved these changes Sep 29, 2025
@glasstiger
Copy link
Copy Markdown
Contributor

[PR Coverage check]

😍 pass : 1171 / 1205 (97.18%)

file detail

path covered line new line coverage
🔵 io/questdb/griffin/engine/join/LtJoinLightRecordCursorFactory.java 2 3 66.67%
🔵 io/questdb/griffin/engine/join/AsOfJoinLightRecordCursorFactory.java 2 3 66.67%
🔵 io/questdb/cairo/RecordIdSink.java 6 7 85.71%
🔵 io/questdb/griffin/engine/join/NestedLoopFullJoinRecordCursorFactory.java 90 100 90.00%
🔵 io/questdb/griffin/engine/join/NestedLoopRightJoinRecordCursorFactory.java 64 70 91.43%
🔵 io/questdb/griffin/SqlCodeGenerator.java 42 45 93.33%
🔵 io/questdb/griffin/engine/join/HashOuterJoinFilteredLightRecordCursorFactory.java 220 229 96.07%
🔵 io/questdb/griffin/engine/join/HashOuterJoinFilteredRecordCursorFactory.java 190 191 99.48%
🔵 io/questdb/griffin/engine/join/HashOuterJoinLightRecordCursorFactory.java 166 167 99.40%
🔵 io/questdb/griffin/engine/join/HashOuterJoinRecordCursorFactory.java 133 134 99.25%
🔵 io/questdb/griffin/engine/join/NestedLoopLeftJoinRecordCursorFactory.java 4 4 100.00%
🔵 io/questdb/cairo/ArrayColumnTypes.java 1 1 100.00%
🔵 io/questdb/griffin/engine/join/FullOuterJoinRecord.java 17 17 100.00%
🔵 io/questdb/griffin/engine/join/RightOuterJoinRecord.java 11 11 100.00%
🔵 io/questdb/griffin/engine/join/OuterJoinRecord.java 2 2 100.00%
🔵 io/questdb/griffin/engine/join/AbstractHashOuterJoinRecordCursor.java 66 66 100.00%
🔵 io/questdb/griffin/SqlParser.java 18 18 100.00%
🔵 io/questdb/griffin/model/QueryModel.java 6 6 100.00%
🔵 io/questdb/griffin/engine/join/HashJoinRecordCursorFactory.java 10 10 100.00%
🔵 io/questdb/griffin/SqlOptimiser.java 26 26 100.00%
🔵 io/questdb/griffin/engine/join/AbstractHashOuterJoinLightRecordCursor.java 55 55 100.00%
🔵 io/questdb/griffin/engine/join/HashJoinLightRecordCursorFactory.java 40 40 100.00%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Enhancement Enhance existing functionality Performance Performance improvements

Projects

Development

Successfully merging this pull request may close these issues.

5 participants