Skip to content

fix(sql): compilation error for symbol column to symbol cast in group by query#6072

Merged
bluestreak01 merged 13 commits intomasterfrom
puzpuzpuz_fix_symbol_cast_in_group_by
Sep 24, 2025
Merged

fix(sql): compilation error for symbol column to symbol cast in group by query#6072
bluestreak01 merged 13 commits intomasterfrom
puzpuzpuz_fix_symbol_cast_in_group_by

Conversation

@puzpuzpuz
Copy link
Copy Markdown
Contributor

@puzpuzpuz puzpuzpuz commented Aug 22, 2025

Fixes #6071

Also removes string sinks from IPv4Function: they were only used in UPDATEs. UPDATEs will now generate a wrapper cast function, like it's done with most other fixed-size types.

@puzpuzpuz puzpuzpuz self-assigned this Aug 22, 2025
@puzpuzpuz puzpuzpuz added Bug Incorrect or unexpected behavior SQL Issues or changes relating to SQL execution labels Aug 22, 2025
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Aug 22, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Refactors column functions to implement a new ColumnFunction interface exposing getColumnIndex and centralizes plan emission via a default toPlan. Removes per-class toPlan implementations. Updates GroupByUtils to detect column keys, including self-casts, via a new helper. Adds tests covering self-cast and symbol grouping scenarios. Minor formatting/import tweaks elsewhere.

Changes

Cohort / File(s) Summary
ColumnFunction introduction and adoption
core/src/main/java/io/questdb/griffin/engine/functions/columns/ColumnFunction.java
New interface extends Function; adds int getColumnIndex(); default toPlan delegates to sink.putColumnName(getColumnIndex()).
Primitive/Scalar column refactor
.../columns/BooleanColumn.java, .../columns/ByteColumn.java, .../columns/CharColumn.java, .../columns/DateColumn.java, .../columns/DoubleColumn.java, .../columns/FloatColumn.java, .../columns/IntColumn.java, .../columns/LongColumn.java, .../columns/ShortColumn.java, .../columns/TimestampColumn.java, .../columns/UuidColumn.java, .../columns/VarcharColumn.java
Switch implements Function → ColumnFunction; add getColumnIndex(); remove toPlan(PlanSink); adjust imports; some introduce/extend static pooling with private constructors and caches.
Complex/Composite column refactor
.../columns/ArrayColumn.java, .../columns/BinColumn.java, .../columns/Long128Column.java, .../columns/Long256Column.java, .../columns/StrColumn.java, .../columns/IPv4Column.java
Implement ColumnFunction; add getColumnIndex(); remove toPlan(PlanSink); minor constructor/pooling adjustments; IPv4Column also removes isThreadSafe override (behavior otherwise unchanged).
Geo column refactor
.../columns/GeoByteColumn.java, .../columns/GeoIntColumn.java, .../columns/GeoLongColumn.java, .../columns/GeoShortColumn.java
Implement ColumnFunction; change getColumnIndex() annotation to @OverRide; remove toPlan(PlanSink) and related imports.
Interval and Record adjustments
.../columns/IntervalColumn.java, .../columns/RecordColumn.java
IntervalColumn: drop Function from implements and remove toPlan; RecordColumn: remove getInterval(Record) override and related imports.
Group-by logic update
core/src/main/java/io/questdb/griffin/engine/groupby/GroupByUtils.java
Add findColumnKeyIndex(expr, func, metadata) to detect column keys and self-casts; refactor assembleGroupByFunctions passes to use it; integrate ColumnFunction; update key/value mapping, validation, and messaging.
Tests for casts and symbols
core/src/test/java/io/questdb/test/griffin/engine/functions/cast/CastTest.java
Add tests: self-cast in group-by; symbol-to-symbol in group-by (1–4); symbol in SAMPLE BY; add imports for PartitionBy and CreateTableTestUtils.
Minor formatting/imports
core/src/main/java/io/questdb/griffin/FunctionParser.java, core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java, .../engine/functions/AbstractGeoHashFunction.java, .../engine/functions/GeoByteFunction.java
Import reordering and whitespace-only changes; no functional impact.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant SQL as SQL Parser
  participant GB as GroupByUtils
  participant FN as Function
  participant CF as ColumnFunction
  participant MD as RecordMetadata

  SQL->>GB: assembleGroupByFunctions(exprs, functions, metadata)
  loop For each projection
    GB->>FN: inspect function at position
    alt FN is ColumnFunction
      Note right of CF: New interface
      GB->>CF: getColumnIndex()
      GB->>GB: treat as map key (COLUMN)
    else FN is cast(column) of same type
      GB->>MD: findColumnKeyIndex(expr, FN, MD)
      GB->>GB: treat as map key (COLUMN)
    else FN is GroupByFunction
      GB->>GB: collect as aggregate (VALUE)
    else
      GB->>GB: mark as function key or virtual (VALUE)
    end
  end
  GB-->>SQL: keys, values, functions mapped
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Assessment against linked issues

Objective Addressed Explanation
Fix 500 error when self-casting a symbol in SELECT with grouping (#6071)

Assessment against linked issues: Out-of-scope changes

Code Change Explanation
None found

Suggested reviewers

  • bluestreak01
✨ Finishing touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch puzpuzpuz_fix_symbol_cast_in_group_by

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@puzpuzpuz
Copy link
Copy Markdown
Contributor Author

@nwoolmer could you review this one when you have the time?

@puzpuzpuz
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Aug 22, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown
Contributor

@nwoolmer nwoolmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor couple of comments, otherwise looking good!

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
core/src/main/java/io/questdb/griffin/engine/functions/columns/ArrayColumn.java (1)

34-41: Restrict ArrayColumn constructor and refactor all direct instantiations
To align with the emerging factory pattern and leave room for future optimizations (e.g., caching), please make the ArrayColumn constructor non-public and update every direct new ArrayColumn(…) call to use the newInstance(...) factory.

• In core/src/main/java/io/questdb/griffin/engine/functions/columns/ArrayColumn.java, change:

-    public ArrayColumn(int columnIndex, int columnType) {
+    private ArrayColumn(int columnIndex, int columnType) {
         this.columnIndex = columnIndex;
         this.type = columnType;
     }

• In core/src/main/java/io/questdb/griffin/FunctionParser.java (around line 232), replace:

-                return new ArrayColumn(index, columnType);
+                return ArrayColumn.newInstance(index, columnType);

• In core/src/main/java/io/questdb/griffin/engine/groupby/GroupByUtils.java (around line 386), replace:

-                func = new ArrayColumn(keyColumnIndex - 1, type);
+                func = ArrayColumn.newInstance(keyColumnIndex - 1, type);

• In core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java (around line 2407), replace:

-            colFuncs.put(ColumnType.ARRAY, new ArrayColumn(1, ColumnType.encodeArrayType(ColumnType.DOUBLE, 2)));
+            colFuncs.put(ColumnType.ARRAY, ArrayColumn.newInstance(1, ColumnType.encodeArrayType(ColumnType.DOUBLE, 2)));

After these changes, the public API remains consistent (all external code still calls the static newInstance), the constructor is encapsulated, and you’ll preserve symmetry with other Column* classes.

core/src/main/java/io/questdb/griffin/engine/functions/columns/SymbolColumn.java (1)

36-46: SymbolColumn.init(): reset ownership flag in shared‐table path

The init(...) method currently sets ownSymbolTable = true when cloning, but never clears it when falling back to the shared table. As a result, if getCloneSymbolTables() is true on one call and then false on the next, ownSymbolTable remains true and close() will erroneously free a shared/static table.

Locations to fix:

  • File core/src/main/java/io/questdb/griffin/engine/functions/columns/SymbolColumn.java, method init(...)

Suggested diff:

@@ public void init(SymbolTableSource symbolTableSource, SqlExecutionContext executionContext) {
-        if (executionContext.getCloneSymbolTables()) {
+        if (executionContext.getCloneSymbolTables()) {
             if (symbolTable != null) {
                 assert ownSymbolTable;
                 symbolTable = Misc.freeIfCloseable(symbolTable);
             }
             symbolTable = symbolTableSource.newSymbolTable(columnIndex);
             ownSymbolTable = true;
         } else {
+            // switching back to shared table: clear ownership
+            ownSymbolTable = false;
             symbolTable = symbolTableSource.getSymbolTable(columnIndex);
         }
         // static symbol table must be non-null
         assert !symbolTableStatic || getStaticSymbolTable() != null;

This ensures that after a clone‐to‐shared transition, ownSymbolTable is reset and close() won’t free a non-owned table.

core/src/main/java/io/questdb/griffin/engine/groupby/GroupByUtils.java (1)

262-297: Restrict inferredKeyColumnCount++ to actual key emissions

The counter inferredKeyColumnCount is currently incremented for every non--1 index, even when the timestamp column is deliberately skipped (index == timestampIndex && !timestampUnimportant). This miscounts map keys and can cause “not enough columns in group by” errors when users explicitly GROUP BY only timestamp literals or self-casts.

• File: core/src/main/java/io/questdb/griffin/engine/groupby/GroupByUtils.java
• Around line 283 in the loop over columns

Apply this diff to move the increment inside the timestamp gate:

--- a/core/src/main/java/io/questdb/griffin/engine/groupby/GroupByUtils.java
+++ b/core/src/main/java/io/questdb/griffin/engine/groupby/GroupByUtils.java
@@ for (int i = 0, n = columns.size(); i < n; i++) {
-                if (index != -1) {
+                if (index != -1) {
                     final int type = baseMetadata.getColumnType(index);
-                    if (index != timestampIndex || timestampUnimportant) {
+                    if (index != timestampIndex || timestampUnimportant) {
                         if (lastIndex != index) {
                             outColumnFilter.add(index + 1);
                             outKeyTypes.add(keyColumnIndex - valueCount, type);
                             keyColumnIndex++;
                             lastIndex = index;
                         }
                         outerProjectionFunctions.set(i, createColumnFunction(baseMetadata, keyColumnIndex, type, index));
-                    }
-
-                    // and finish with populating metadata for this factory
-                    inferredKeyColumnCount++;
+                        // count only keys that actually become map keys
+                        inferredKeyColumnCount++;
+                    }

After merging, please run the GROUP BY test suite—especially explicit GROUP BY on timestamp columns and SAMPLE BY combinations—to confirm no regressions.

🧹 Nitpick comments (21)
core/src/main/java/io/questdb/griffin/engine/functions/GeoByteFunction.java (1)

30-33: Remove the extra blank line after the class declaration.

Avoids unnecessary diff churn and stays consistent with surrounding classes’ style.

-public abstract class GeoByteFunction extends AbstractGeoHashFunction {
-
-    protected GeoByteFunction(int type) {
+public abstract class GeoByteFunction extends AbstractGeoHashFunction {
+    protected GeoByteFunction(int type) {
         super(type);
     }
core/src/main/java/io/questdb/griffin/engine/functions/columns/ColumnFunction.java (1)

31-35: Document ColumnFunction contract to prevent accidental misuse with nested/derived columns.

A brief Javadoc clarifies that implementers must expose a top-level column index; nested/derived implementations should not implement ColumnFunction.

 public interface ColumnFunction extends Function {
-    int getColumnIndex();
+    /**
+     * Returns the index of the column in the current top-level RecordMetadata.
+     * Implementations representing nested or derived columns should not implement
+     * ColumnFunction and must override toPlan(...) explicitly.
+     */
+    int getColumnIndex();
 
-    default void toPlan(PlanSink sink) {
+    /**
+     * Emits this column's name to the execution plan using its column index.
+     */
+    default void toPlan(PlanSink sink) {
         sink.putColumnName(getColumnIndex());
     }
 }
core/src/main/java/io/questdb/griffin/engine/functions/columns/IntervalColumn.java (1)

36-36: Implement ColumnFunction on IntervalColumn to restore default toPlan()

IntervalColumn currently neither implements ColumnFunction (which provides a default toPlan()) nor overrides toPlan() itself, which can lead to missing column names in query plans. A quick audit shows:

  • core/src/main/java/io/questdb/griffin/engine/functions/columns/IntervalColumn.java lacks both implements ColumnFunction and a toPlan() override.
  • RecordColumn.java also doesn’t implement ColumnFunction but already defines toPlan(), so no change is needed there.

To address this for IntervalColumn only, apply the following minimal patch:

• In IntervalColumn’s class declaration (around line 36):

-public class IntervalColumn extends IntervalFunction implements FunctionExtension {
+public class IntervalColumn extends IntervalFunction implements ColumnFunction, FunctionExtension {

• Add the ColumnFunction method (anywhere in IntervalColumn):

@Override
public int getColumnIndex() {
    return columnIndex;
}

This restores plan readability by supplying the column index to the default toPlan().

core/src/main/java/io/questdb/griffin/engine/functions/columns/GeoLongColumn.java (1)

42-53: Minor readability nit: avoid local “bits” name shadowing.

You compute a local bits count in both newInstance and the static block; consider a clearer name (e.g., bitsCount or rangeSize) to make the indexing math self-evident at a glance.

final int bitsCount = ColumnType.GEOLONG_MAX_BITS - ColumnType.GEOLONG_MIN_BITS + 1;
// ...
return COLUMNS[columnIndex * bitsCount + ColumnType.getGeoHashBits(columnType) - ColumnType.GEOLONG_MIN_BITS];

Also applies to: 70-79

core/src/main/java/io/questdb/griffin/engine/functions/columns/GeoIntColumn.java (1)

33-37: Minor consistency nit: make columnIndex private.

Here it’s protected final, while sibling classes declare it private final. Unless subclassing is intended, prefer private final for encapsulation consistency.

// Current
protected final int columnIndex;

// Suggested
private final int columnIndex;
core/src/main/java/io/questdb/griffin/engine/functions/columns/BinColumn.java (1)

60-62: Consider declaring isThreadSafe() = true for consistency.

Other column classes explicitly override isThreadSafe() to true. Doing the same here would keep behavior and planner assumptions consistent across column functions.

Apply this diff:

 public class BinColumn extends BinFunction implements ColumnFunction {
@@
     @Override
     public int getColumnIndex() {
         return columnIndex;
     }
+
+    @Override
+    public boolean isThreadSafe() {
+        return true;
+    }
core/src/main/java/io/questdb/griffin/engine/functions/columns/CharColumn.java (1)

30-31: Consider making CharColumn final for parity and minor devirtualization wins.

Other column classes like UuidColumn are final; marking this final can help JIT and clarifies intent that it’s not a subclassing point.

Apply if you agree:

-public class CharColumn extends CharFunction implements ColumnFunction {
+public final class CharColumn extends CharFunction implements ColumnFunction {
core/src/main/java/io/questdb/griffin/engine/functions/columns/VarcharColumn.java (1)

43-57: Minor: consider making class final for consistency.

-public class VarcharColumn extends VarcharFunction implements ColumnFunction {
+public final class VarcharColumn extends VarcharFunction implements ColumnFunction {
core/src/main/java/io/questdb/griffin/engine/functions/columns/DoubleColumn.java (1)

41-46: Defensive check for negative columnIndex in newInstance.

While callers should only pass non-negative indices, a guard avoids surprising NPE/AIOOBE if assumptions change.

 public static DoubleColumn newInstance(int columnIndex) {
-    if (columnIndex < STATIC_COLUMN_COUNT) {
+    if (columnIndex < 0) {
+        throw new IllegalArgumentException("columnIndex must be >= 0: " + columnIndex);
+    }
+    if (columnIndex < STATIC_COLUMN_COUNT) {
         return COLUMNS.getQuick(columnIndex);
     }
     return new DoubleColumn(columnIndex);
 }
core/src/main/java/io/questdb/griffin/engine/functions/columns/ShortColumn.java (1)

41-46: Add the same defensive negative-index check here as in DoubleColumn.

 public static ShortColumn newInstance(int columnIndex) {
-    if (columnIndex < STATIC_COLUMN_COUNT) {
+    if (columnIndex < 0) {
+        throw new IllegalArgumentException("columnIndex must be >= 0: " + columnIndex);
+    }
+    if (columnIndex < STATIC_COLUMN_COUNT) {
         return COLUMNS.getQuick(columnIndex);
     }
     return new ShortColumn(columnIndex);
 }
core/src/main/java/io/questdb/griffin/engine/functions/columns/UuidColumn.java (1)

63-70: Nit: tighten the comment for clarity.

Slight rewording makes the thread-safety rationale easier to scan.

-    @Override
-    public boolean isThreadSafe() {
-        // the UUID column is thread-safe
-
-        // it's only when casting to string (=common operation) then it's not thread-safe
-        // the CastUuidToStr function indicate it's not thread-safe
-        return true;
-    }
+    @Override
+    public boolean isThreadSafe() {
+        // UUID column access is thread-safe.
+        // Casting to string is not thread-safe; CastUuidToStr indicates this explicitly.
+        return true;
+    }
core/src/main/java/io/questdb/griffin/engine/functions/columns/DateColumn.java (1)

41-46: Defensive check for negative indices.

While column indices should be non-negative by construction, a fast-fail assertion helps catch accidental misuse during development.

     public static DateColumn newInstance(int columnIndex) {
+        assert columnIndex >= 0 : "columnIndex must be >= 0";
         if (columnIndex < STATIC_COLUMN_COUNT) {
             return COLUMNS.getQuick(columnIndex);
         }
         return new DateColumn(columnIndex);
     }
core/src/main/java/io/questdb/griffin/engine/functions/columns/FloatColumn.java (1)

41-46: Add a guard against negative indices.

Same rationale as other Column* wrappers: fail fast if an invalid index slips through.

     public static FloatColumn newInstance(int columnIndex) {
+        assert columnIndex >= 0 : "columnIndex must be >= 0";
         if (columnIndex < STATIC_COLUMN_COUNT) {
             return COLUMNS.getQuick(columnIndex);
         }
         return new FloatColumn(columnIndex);
     }
core/src/main/java/io/questdb/griffin/engine/functions/columns/LongColumn.java (1)

41-46: Optional: assertion for invalid indices.

For consistency with sibling classes and to catch misuse early.

     public static LongColumn newInstance(int columnIndex) {
+        assert columnIndex >= 0 : "columnIndex must be >= 0";
         if (columnIndex < STATIC_COLUMN_COUNT) {
             return COLUMNS.getQuick(columnIndex);
         }
         return new LongColumn(columnIndex);
     }
core/src/main/java/io/questdb/griffin/engine/functions/columns/TimestampColumn.java (1)

41-46: Consider adding a negative index assertion.

Minor guardrail; mirrors the suggestion across other Column* classes.

     public static TimestampColumn newInstance(int columnIndex) {
+        assert columnIndex >= 0 : "columnIndex must be >= 0";
         if (columnIndex < STATIC_COLUMN_COUNT) {
             return COLUMNS.getQuick(columnIndex);
         }
         return new TimestampColumn(columnIndex);
     }
core/src/main/java/io/questdb/griffin/engine/functions/columns/IPv4Column.java (1)

43-49: Avoid calling Record.getIPv4 twice per row

The null-sentinel check duplicates the getIPv4 call without changing behavior. Simplify and avoid an extra accessor call.

Apply:

-    if (rec.getIPv4(columnIndex) == Numbers.IPv4_NULL) {
-        return Numbers.IPv4_NULL;
-    }
-    return rec.getIPv4(columnIndex);
+    final int v = rec.getIPv4(columnIndex);
+    return v;
core/src/main/java/io/questdb/griffin/engine/functions/columns/SymbolColumn.java (2)

86-101: Lifecycle: when cloning, free owned tables and null/flag reset post-close

init() correctly frees owned symbol tables when cloning is enabled. Minor defensive improvement: close() could null out symbolTable and reset ownSymbolTable to prevent accidental reuse after close.

Apply this diff:

 public void close() {
-    if (ownSymbolTable) {
-        symbolTable = Misc.freeIfCloseable(symbolTable);
-    }
+    if (ownSymbolTable) {
+        symbolTable = Misc.freeIfCloseable(symbolTable);
+        ownSymbolTable = false;
+    } else {
+        // shared table is owned by the source, but we shouldn't retain stale refs after close
+        symbolTable = null;
+    }
 }

108-111: Guard newSymbolTable() against pre-init usage

newSymbolTable() assumes symbolTableSource is non-null (i.e., init() already called). If that assumption can be violated by callers, add a precondition or assertion.

 public @Nullable SymbolTable newSymbolTable() {
-    return symbolTableSource.newSymbolTable(columnIndex);
+    assert symbolTableSource != null : "init() must be called before newSymbolTable()";
+    return symbolTableSource.newSymbolTable(columnIndex);
 }
core/src/test/java/io/questdb/test/griffin/engine/functions/cast/CastTest.java (2)

5054-5116: Solid regression test for self-cast as key (timestamp::timestamp et al.)

Great coverage for “self-cast” across multiple types within a grouped flow. The table fabrication via CreateTableTestUtils and deterministic timestamp_sequence make this stable.

Consider adding one variant with NULLs in symbol/string to ensure self-cast key detection coexists with null semantics in grouping.

I can draft an additional test that inserts some NULL sym/str rows and validates group counts; want me to push that?


6183-6209: New group-by tests for cast(symbol as SYMBOL) cover the reported failure mode

  • Verifies implicit key inference with aggregates and cast(symbol as SYMBOL).
  • Exercises ORDER BY and SAMPLE BY paths.
    This directly targets the issue described in #6071 and provides good safety nets.

Add a variant where symbols repeat (e.g., insert x%3) to ensure aggregation collapses as expected when keys collide after cast, plus a NULL-bearing case to exercise coalesce branches more meaningfully.

I can extend testSymbolColumnToSymbolInGroupBy2 to insert repeating symbols and add expected aggregated outputs. Shall I propose a diff?

Also applies to: 6210-6235, 6238-6262, 6264-6289, 6291-6316

core/src/main/java/io/questdb/griffin/engine/groupby/GroupByUtils.java (1)

305-393: createColumnFunction(): symbol‐key handling is correct; MapSymbolColumn is in the same package

  • MapSymbolColumn is declared in core/src/main/java/io/questdb/griffin/engine/groupby/MapSymbolColumn.java, so no additional import is required.
  • The differentiation between function‐key SYMBOLs (stored as strings) and column‐key SYMBOLs (using symbol tables via MapSymbolColumn) makes sense to avoid unnecessary symbol‐table churn.

• Consider adding a brief Javadoc or inline comment on createColumnFunction() (or on MapSymbolColumn) to explain this design choice:
– Why function‐key SYMBOLs are treated as StrColumn
– Why column‐key SYMBOLs map directly into the symbol table via MapSymbolColumn

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5a29eb3 and 195546a.

📒 Files selected for processing (32)
  • core/src/main/java/io/questdb/griffin/FunctionParser.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (0 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/AbstractGeoHashFunction.java (0 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/GeoByteFunction.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/ArrayColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/BinColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/BooleanColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/ByteColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/CharColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/ColumnFunction.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/DateColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/DoubleColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/FloatColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/GeoByteColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/GeoIntColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/GeoLongColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/GeoShortColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/IPv4Column.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/IntColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/IntervalColumn.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/Long128Column.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/Long256Column.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/LongColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/RecordColumn.java (0 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/ShortColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/StrColumn.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/SymbolColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/TimestampColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/UuidColumn.java (2 hunks)
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/VarcharColumn.java (1 hunks)
  • core/src/main/java/io/questdb/griffin/engine/groupby/GroupByUtils.java (9 hunks)
  • core/src/test/java/io/questdb/test/griffin/engine/functions/cast/CastTest.java (3 hunks)
💤 Files with no reviewable changes (3)
  • core/src/main/java/io/questdb/griffin/engine/functions/AbstractGeoHashFunction.java
  • core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java
  • core/src/main/java/io/questdb/griffin/engine/functions/columns/RecordColumn.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-arm64)
  • GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
  • GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
  • GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-x64-zfs)
  • GitHub Check: New pull request (Hosted Running tests on windows-other)
  • GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
  • GitHub Check: New pull request (Hosted Running tests on windows-cairo)
  • GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
  • GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
  • GitHub Check: New pull request (Hosted Running tests on windows-griffin)
  • GitHub Check: New pull request (Hosted Running tests on mac-other)
  • GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
  • GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
  • GitHub Check: New pull request (Hosted Running tests on mac-cairo)
  • GitHub Check: New pull request (SelfHosted Other tests Start X64Zfs Agent)
  • GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start ARM Agent)
  • GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start X64Zfs Agent)
  • GitHub Check: New pull request (SelfHosted Other tests Start ARM Agent)
  • GitHub Check: New pull request (Hosted Running tests on mac-griffin)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-other)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-pgwire)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-cairo)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-fuzz)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-griffin)
  • GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
  • GitHub Check: New pull request (Check Changes Check changes)
  • GitHub Check: build
  • GitHub Check: build
🔇 Additional comments (48)
core/src/main/java/io/questdb/griffin/FunctionParser.java (1)

103-112: Import reshuffle for memoizers looks good.

All moved imports are used in the memoization switch; no duplicates or missing types detected.

core/src/main/java/io/questdb/griffin/engine/functions/columns/ColumnFunction.java (1)

30-36: Solid interface extraction; centralizing toPlan() is the right call.

This reduces duplication across column functions and standardizes plan emission.

core/src/main/java/io/questdb/griffin/engine/functions/columns/StrColumn.java (1)

30-30: Good migration to ColumnFunction with a clean getColumnIndex().

Dropping the local toPlan() in favor of the interface default removes redundant PlanSink wiring; string column behavior remains unchanged.

Also applies to: 37-41

core/src/main/java/io/questdb/griffin/engine/functions/columns/GeoLongColumn.java (3)

33-33: Adopting ColumnFunction here is the right move.

Unifying on ColumnFunction will let planning and group-by utilities reliably detect column keys (including self-casts), which aligns with the PR’s objective. No functional regressions apparent in this class change.


55-58: getColumnIndex() exposure looks correct and minimal.

Returning the stored columnIndex matches the new interface contract and enables the default toPlan to emit the column name. Good.


33-58: Verify EXPLAIN output for geohash columns

We’ve confirmed the code changes removed per-class toPlan for all geohash column implementations (only RecordColumn still overrides toPlan) and that GeoLongColumn correctly implements ColumnFunction and exposes getColumnIndex. However, we haven’t yet validated that the planner (EXPLAIN) output for geohash columns remains identical to the prior per-class implementation. Please perform the following sanity check:

  • Create a table with a geohash column at various bit‐widths, for example:
    CREATE TABLE t4 (ts TIMESTAMP, g GEOHASH4) TIMESTAMP(ts);
    CREATE TABLE t8 (ts TIMESTAMP, g GEOHASH8) TIMESTAMP(ts);
  • Run and capture the plan output both before and after this change:
    EXPLAIN SELECT g FROM t4;
    EXPLAIN SELECT g FROM t8;
  • Compare the PlanSink output (column name, bit‐width details, etc.) against the previous implementation to ensure there is no loss of detail.
  • If you spot any discrepancies, please adjust the default toPlan in ColumnFunction or reintroduce a targeted override in GeoLongFunction/GeoLongColumn. Otherwise, consider adding automated unit tests that assert the EXPLAIN output for each geohash bit‐width remains stable.
core/src/main/java/io/questdb/griffin/engine/functions/columns/GeoIntColumn.java (2)

33-33: Consistent ColumnFunction adoption.

This mirrors the broader refactor and should allow group-by key detection to “see” this as a column source. Looks good.


55-58: getColumnIndex() correctly wired.

Contract satisfied; default planning path will work.

core/src/main/java/io/questdb/griffin/engine/functions/columns/GeoShortColumn.java (2)

33-33: ColumnFunction integration LGTM.

Keeps this class in step with the refactor and enables uniform planning/group-by handling.


55-58: getColumnIndex() is correctly implemented.

No concerns.

core/src/main/java/io/questdb/griffin/engine/functions/columns/ByteColumn.java (2)

33-34: Good: pooled instances and ColumnFunction interface.

Static pooling with ObjList for indices < STATIC_COLUMN_COUNT avoids allocations on hot paths; ColumnFunction adoption removes bespoke toPlan. Solid.


55-62: Thread-safety explicitness is appreciated.

Explicit isThreadSafe() returning true matches other column implementations and helps planner/runtime. getColumnIndex() looks correct.

core/src/main/java/io/questdb/griffin/engine/functions/columns/BinColumn.java (1)

34-34: ColumnFunction adoption and pooling LGTM.

Uniform interface and static pooling mirror other types; no behavior changes in getBin/getBinLen. Looks good.

core/src/main/java/io/questdb/griffin/engine/functions/columns/CharColumn.java (2)

30-31: Migration to ColumnFunction is correct; column index exposure looks good.

Implements ColumnFunction and returns the right column index; no functional risk spotted.


30-31: Verification complete: only RecordColumn.java overrides ColumnFunction.toPlan()

  • Grepped core/src/main/java/io/questdb/griffin/engine/functions/columns: no void toPlan( implementations in any *Column classes aside from the override in RecordColumn.java. All other column functions use the default toPlan() defined in ColumnFunction.java.
  • Inspected GroupByUtils (core/src/main/java/io/questdb/griffin/engine/groupby/GroupByUtils.java): it imports and interacts with ColumnFunction, confirming plan generation uniformly relies on the interface’s toPlan() method.

Explain-plan output (including column-name rendering) should remain unchanged.

core/src/main/java/io/questdb/griffin/engine/functions/columns/VarcharColumn.java (2)

31-41: Consistent API change: implements ColumnFunction and exposes column index.

Looks good and aligns with the wider refactor.


31-41: Ensure isThreadSafe() is explicitly declared for VarcharColumn (and StrColumn)

Many of the *Column classes in core/src/main/java/io/questdb/griffin/engine/functions/columns override isThreadSafe() to return true, but both VarcharColumn and its sibling StrColumn currently do not. This inconsistency can lead to unnecessarily blocking concurrent usage if the default is false, or obscure thread-safety guarantees if the default is true.

• If VarcharFunction/StrFunction defaults to non-thread-safe (buffer reuse semantics), add a comment explaining why these column wrappers aren’t marked thread-safe and ensure they are never shared across threads.
• Otherwise, explicitly override in each column:

 public class VarcharColumn extends VarcharFunction implements ColumnFunction {
@@
     public int getColumnIndex() {
         return columnIndex;
     }

+    @Override
+    public boolean isThreadSafe() {
+        return true;
+    }
 }

(Apply the same change to StrColumn.)

To verify defaults and confirm which functions still lack overrides:

rg -n 'class VarcharFunction' -n 'isThreadSafe' core/src/main/java/io/questdb/griffin/engine/functions/columns/VarcharFunction.java
rg -n 'class StrColumn' -n 'isThreadSafe' core/src/main/java/io/questdb/griffin/engine/functions/columns/StrColumn.java
core/src/main/java/io/questdb/griffin/engine/functions/columns/DoubleColumn.java (1)

33-46: LGTM: ColumnFunction adoption and pooling logic preserved.

getColumnIndex(), thread-safety, and the STATIC_COLUMN_COUNT pool remain intact.

core/src/main/java/io/questdb/griffin/engine/functions/columns/ShortColumn.java (1)

33-46: LGTM: Mirrors DoubleColumn changes with ColumnFunction support.

Pooling behavior and getColumnIndex() look correct.

core/src/main/java/io/questdb/griffin/engine/functions/columns/UuidColumn.java (2)

33-46: LGTM: ColumnFunction integration with pooling retained.

No functional issues spotted; getColumnIndex() and factory logic are sound.


33-51: Verify ColumnFunction Support in GroupByUtils and Test Coverage

It looks like SymbolColumn correctly implements ColumnFunction and exposes getColumnIndex() as expected. However, I did not find any usage of ColumnFunction or calls to getColumnIndex() in the groupby package, nor explicit tests for GROUP BY cast(... AS symbol) to confirm the self-cast scenario passes end-to-end.

Please verify and, if missing, update accordingly:

  • In core/src/main/java/io/questdb/griffin/engine/groupby/GroupByUtils.java, ensure functions that represent pure column references (i.e. instances of ColumnFunction) are detected and their getColumnIndex() is used to map grouping keys rather than falling back to generic function handling.
  • Add an integration test (or extend an existing one) under core/src/test/java/io/questdb/test/griffin/ that performs:
    CREATE TABLE t AS (SELECT x, CAST(x AS SYMBOL) AS sx FROM long_sequence(10));
    SELECT CAST(x AS SYMBOL) sx FROM t GROUP BY sx;
    
    and asserts the correct grouping behavior.
  • If the detection logic is already present in another helper or factory class, point it out here so we can confirm its coverage.

Once the above is in place, rerun the reproduction for the linked issue to confirm that self-casts in GROUP BY now pass.

core/src/main/java/io/questdb/griffin/engine/functions/columns/ArrayColumn.java (2)

31-31: Good move to implement ColumnFunction and align with the new contract.

This keeps ArrayColumn consistent with the rest of the column wrappers and enables centralised planning via ColumnFunction’s default toPlan.


49-56: getColumnIndex + isThreadSafe look correct.

Stateless accessors; returning true for thread-safety is consistent with other column wrappers.

core/src/main/java/io/questdb/griffin/engine/functions/columns/DateColumn.java (2)

33-33: Interface switch to ColumnFunction is spot on.

This unifies column metadata exposure and defers planning to the default implementation.


48-51: getColumnIndex addition looks good.

Matches the ColumnFunction contract and is used by planning and group-by detection logic.

core/src/main/java/io/questdb/griffin/engine/functions/columns/FloatColumn.java (2)

33-33: Implements ColumnFunction — consistent with the refactor.

This enables GroupByUtils to recognise column keys (including self-casts) uniformly.


48-51: getColumnIndex implementation LGTM.

No concerns.

core/src/main/java/io/questdb/griffin/engine/functions/columns/LongColumn.java (2)

33-33: ColumnFunction adoption is correct.

Keeps LongColumn aligned with the new API and removes per-class planning boilerplate.


48-51: getColumnIndex is straightforward and correct.

No action needed.

core/src/main/java/io/questdb/griffin/engine/functions/columns/TimestampColumn.java (3)

33-33: Switch to ColumnFunction looks good.

This ensures timestamp columns participate in the new column-key detection path used by group-by logic.


48-51: getColumnIndex addition is correct.

Integrates cleanly with default toPlan in ColumnFunction.


33-33: Verification complete: SymbolColumn meets ColumnFunction requirements

SymbolColumn in core/src/main/java/io/questdb/griffin/engine/functions/columns/SymbolColumn.java extends SymbolFunction and implements ColumnFunction (line 36), and it defines getColumnIndex() (line 56). As a result, GroupByUtils.findColumnKeyIndex will correctly recognize symbol columns as keys. No further changes are needed.

core/src/main/java/io/questdb/griffin/engine/functions/columns/Long128Column.java (2)

33-33: Adopting ColumnFunction is correct and aligns with new column-key detection

Moving to ColumnFunction will let planning and GroupByUtils treat this as a column-projection uniformly. No functional risks spotted.


48-52: Expose column index via getColumnIndex(): LGTM

Accessor is simple, side-effect free, and consistent with other column classes.

core/src/main/java/io/questdb/griffin/engine/functions/columns/GeoByteColumn.java (2)

55-58: getColumnIndex(): LGTM

Consistent with the new interface; no hidden state, safe for reuse/pooling.


33-33: Verify GeoHash EXPLAIN plan readability

I ran searches across the codebase and confirmed:

  • ColumnFunction.default.toPlan(PlanSink) simply emits the column name via sink.putColumnName(getColumnIndex()).
  • There are no existing tests or assertions in core/src/test that inspect EXPLAIN output for Geo types (Geo, GEOBYTE, GeoHash).

Removing GeoByteFunction’s class-specific toPlan override won’t break any plan-sensitive tests, but it does omit the bit-width hints previously included in the plan. Please execute EXPLAIN on representative queries that involve GeoByteColumn (e.g., SELECT geohash(...)) and verify that the resulting plan remains clear and informative.

  • If the output is acceptable, you can consider this change safe to merge.
  • Otherwise, we may need to reintroduce a custom toPlan to preserve the missing hints.
core/src/main/java/io/questdb/griffin/engine/functions/columns/IPv4Column.java (1)

38-42: getColumnIndex(): LGTM

Matches the new contract; no issues.

core/src/main/java/io/questdb/griffin/engine/functions/columns/Long256Column.java (2)

35-35: Migration to ColumnFunction looks good

Pooling and construction patterns stay intact; central planning via the interface is appropriate.


50-54: getColumnIndex(): LGTM

Simple accessor; fits the shared pattern.

core/src/main/java/io/questdb/griffin/engine/functions/columns/IntColumn.java (2)

33-33: Switch to ColumnFunction: LGTM

Consistent with the broader refactor and necessary for unified group-by handling.


48-52: getColumnIndex(): LGTM

Straightforward and thread-safe; matches other primitive columns.

core/src/main/java/io/questdb/griffin/engine/functions/columns/BooleanColumn.java (1)

33-61: Adopt ColumnFunction: implementation looks correct and thread-safe

  • getColumnIndex() correctly exposes the backing column index.
  • isThreadSafe() returning true matches the read-only nature of column access.
core/src/main/java/io/questdb/griffin/engine/functions/columns/SymbolColumn.java (1)

65-74: Review fallback instanceof branch for reachability

I wasn’t able to find any classes in the codebase that both extend SymbolFunction and implement SymbolTable, so the fallback if (symbolTable instanceof SymbolFunction) path in SymbolColumn.getStaticSymbolTable() may never be taken. Please verify whether any SymbolFunction subclasses also implement SymbolTable; if not, consider removing this dead‐code branch.

• File: core/src/main/java/io/questdb/griffin/engine/functions/columns/SymbolColumn.java
Lines: 65–74

core/src/test/java/io/questdb/test/griffin/engine/functions/cast/CastTest.java (1)

27-30: Additional imports are appropriate for the new tests

PartitionBy and CreateTableTestUtils usage is valid here.

core/src/main/java/io/questdb/griffin/engine/groupby/GroupByUtils.java (4)

122-156: First pass: column-key detection via findColumnKeyIndex is a clear improvement

Parsing once and deferring the “column vs function vs aggregate” decision to findColumnKeyIndex reduces duplication and correctly handles no-op self-casts (e.g., symbol::symbol) as map keys.


193-255: Value-slot planning: correct ordering and fill validation using concrete GroupByFunction

  • Using groupByFunc for error messaging and initValueTypes(outValueTypes) is precise.
  • The “same column several times in a row” coalescing for keys is preserved.

404-406: Comment clarifies dependency on prior planning

Small but helpful documentation; no action needed.


616-629: findColumnKeyIndex(): concise and correct for literals and no-op self-casts

This reliably distinguishes:

  • Pure literals (base metadata column).
  • Self-casts compiled as ColumnFunction (e.g., symbol::symbol).

nwoolmer
nwoolmer previously approved these changes Aug 26, 2025
@puzpuzpuz
Copy link
Copy Markdown
Contributor Author

@nwoolmer thanks for the review!

@glasstiger
Copy link
Copy Markdown
Contributor

[PR Coverage check]

😍 pass : 80 / 87 (91.95%)

file detail

path covered line new line coverage
🔵 io/questdb/griffin/SqlCodeGenerator.java 2 3 66.67%
🔵 io/questdb/griffin/engine/functions/columns/IPv4Column.java 7 8 87.50%
🔵 io/questdb/griffin/engine/groupby/GroupByUtils.java 36 41 87.80%
🔵 io/questdb/griffin/engine/functions/columns/DateColumn.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/columns/CharColumn.java 2 2 100.00%
🔵 io/questdb/griffin/engine/functions/columns/StrColumn.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/columns/VarcharColumn.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/columns/SymbolColumn.java 2 2 100.00%
🔵 io/questdb/griffin/engine/functions/columns/BooleanColumn.java 2 2 100.00%
🔵 io/questdb/griffin/engine/functions/columns/ArrayColumn.java 2 2 100.00%
🔵 io/questdb/griffin/engine/functions/columns/FloatColumn.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/columns/Long128Column.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/columns/ColumnFunction.java 2 2 100.00%
🔵 io/questdb/griffin/engine/functions/columns/Long256Column.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/columns/ShortColumn.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/columns/BinColumn.java 1 1 100.00%
🔵 io/questdb/griffin/FunctionParser.java 6 6 100.00%
🔵 io/questdb/griffin/engine/functions/IPv4Function.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/columns/DoubleColumn.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/columns/UuidColumn.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/columns/LongColumn.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/columns/TimestampColumn.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/columns/ByteColumn.java 2 2 100.00%
🔵 io/questdb/griffin/engine/functions/columns/IntColumn.java 1 1 100.00%

@bluestreak01 bluestreak01 merged commit 589e109 into master Sep 24, 2025
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug Incorrect or unexpected behavior SQL Issues or changes relating to SQL execution

Projects

None yet

Development

Successfully merging this pull request may close these issues.

500 error when self-casting a symbol

5 participants