fix(sql): fix SELECT DISTINCT queries with table name prefixed columns#6061
fix(sql): fix SELECT DISTINCT queries with table name prefixed columns#6061bluestreak01 merged 11 commits intomasterfrom
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughUpdates adjust alias generation to handle dotted identifiers and disallowed names case-insensitively, add a ranged contains() to LowerCaseCharSequenceHashSet, and refine local immutability/comments in SqlOptimiser. Tests are expanded for DISTINCT with joins and alias behaviors, including toggling column-alias-expression settings. Changes
Sequence Diagram(s)sequenceDiagram
participant Parser
participant SqlUtil as SqlUtil.createExprColumnAlias
participant Set as LowerCaseCharSequenceHashSet
Parser->>SqlUtil: base, aliasMap, nonLiteral, maxLen
alt base contains '.'
SqlUtil->>Set: contains(base, dot+1, end)
Set-->>SqlUtil: true/false
SqlUtil->>SqlUtil: decide quoting, take substring after dot
else no dot
SqlUtil->>Set: contains(base)
Set-->>SqlUtil: true/false
SqlUtil->>SqlUtil: decide quoting, use full base
end
SqlUtil->>SqlUtil: trim/limit, dedupe with suffixes
SqlUtil-->>Parser: alias
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Assessment against linked issues
Assessment against linked issues: Out-of-scope changesNone found. ✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
Status, Documentation and Community
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
core/src/main/java/io/questdb/griffin/SqlUtil.java (3)
106-110: Optional: early-exit for prefixed literals (micro-optimization)You can extend the early-exit to return the unprefixed column substring when not quoting and constraints are met. This avoids allocating a
CharacterStoreEntryfor simple, non-conflicting cases.Non-blocking; current logic is correct.
133-139: Fix trimming logic for prefixed literals (trims the wrong slice today)When
prefixedLiteralis true, the trailing-space check usesbase.charAt(len - 1)andChars.lastIndexOfDifferent(base, 0, len, ' '), which index into the start of the fullbaserather than the substring after the dot. This can produce incorrect trimming if the column segment has trailing spaces.Adjust the indices to operate within the substring range:
- // We don't want the alias to finish with a space. - if (!quote && len > 0 && base.charAt(len - 1) == ' ') { - final int lastSpace = Chars.lastIndexOfDifferent(base, 0, len, ' '); - if (lastSpace > 0) { - len = lastSpace + 1; - } - } + // We don't want the alias to finish with a space. + if (!quote && len > 0) { + final int baseStart = prefixedLiteral ? indexOfDot + 1 : 0; + final int baseEnd = baseStart + len; // exclusive + if (base.charAt(baseEnd - 1) == ' ') { + final int lastNonSpace = Chars.lastIndexOfDifferent(base, baseStart, baseEnd, ' '); + if (lastNonSpace >= baseStart) { + len = lastNonSpace - baseStart + 1; + } else { + len = 0; + } + } + }Low impact in practice (parsing typically strips trailing spaces), but this makes the logic correct for all inputs, including quoted or unusual spacing.
99-104: Non-blocking: consider last dot for multi-part identifiersCurrently the substring is taken after the first dot. If multi-part identifiers (e.g., schema.table.column) ever appear in
base, using the last dot would yield the column segment. QuestDB doesn’t use multi-part names today, so this is a future-proofing note only.core/src/test/java/io/questdb/test/griffin/DistinctTest.java (1)
335-356: Good reproduction of the original failure mode; minor DRY opportunity
- This precisely matches the reported failure (table-prefixed columns with DISTINCT).
- Optional: you could factor table setup/insert into a small helper to reduce duplication across the three helper methods.
Non-blocking.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (6)
core/src/main/java/io/questdb/griffin/SqlOptimiser.java(5 hunks)core/src/main/java/io/questdb/griffin/SqlUtil.java(3 hunks)core/src/main/java/io/questdb/std/LowerCaseCharSequenceHashSet.java(1 hunks)core/src/test/java/io/questdb/test/griffin/DistinctTest.java(2 hunks)core/src/test/java/io/questdb/test/griffin/SqlUtilTest.java(2 hunks)core/src/test/java/io/questdb/test/std/LowerCaseCharSequenceHashSetTest.java(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: New pull request (Coverage Report Coverage Report)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-other)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start X64Zfs Agent)
- GitHub Check: New pull request (SelfHosted Other tests Start ARM Agent)
- GitHub Check: New pull request (SelfHosted Other tests Start X64Zfs Agent)
- GitHub Check: New pull request (Hosted Running tests on windows-other)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-other)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-cairo)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-fuzz)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-griffin)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start ARM Agent)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (Check Changes Check changes)
🔇 Additional comments (14)
core/src/main/java/io/questdb/griffin/SqlOptimiser.java (2)
4340-4346: Immutability in rewriteDistinct: good cleanup and zero semantic riskMaking wrapperNested/wrapperModel/bottomUpColumns and loop locals (qc, ast, alias) final, as well as countAst/countColumn, improves readability and guards against accidental reassignment without altering behavior. This aligns with the rest of the method’s intent and should be a no-op semantically.
Also applies to: 4351-4354, 4373-4386
6754-6762: validateColumnAndGetModelIndex: finalized locals and clearer comment — LGTM
- final joinModels improves clarity.
- The updated comment clarifies the special-casing for group-by and projection references; it matches the logic that returns -1 only when the projection match is allowed (and groupByCall is false).
No functional changes observed.core/src/test/java/io/questdb/test/std/LowerCaseCharSequenceHashSetTest.java (1)
88-88: Good addition: range-based contains() is coveredThe new assertion exercises the
contains(CharSequence, int, int)overload and strengthens the saturation verification.core/src/main/java/io/questdb/std/LowerCaseCharSequenceHashSet.java (1)
70-72: LGTM: range-based contains() overloadThe overload mirrors existing
contains(key)semantics by delegating tokeyIndex(key, lo, hi) < 0. This is consistent with the set’s negative-index “found” convention and supports efficient subrange checks upstream.core/src/main/java/io/questdb/griffin/SqlUtil.java (2)
45-46: Case-insensitive disallowed alias handling is a solid improvementSwitching
disallowedAliasestoLowerCaseCharSequenceHashSetaligns with how identifiers are compared elsewhere and fixes uppercase keyword cases without extra normalization.Also applies to: 68-69
99-104: Dot-aware aliasing and conditional quoting logic look correct
prefixedLiteralcorrectly distinguishes literal table-prefixed identifiers.- Quoting rules handle non-literals with dots and reserved words, and only quote the column part for prefixed literals that would otherwise conflict.
This addresses the DISTINCT rewrite aliasing issue described in #6058.
core/src/test/java/io/questdb/test/griffin/DistinctTest.java (3)
36-45: LGTM: coverage for alias-expression toggleThese entry-point tests exercise both configurations by delegating to shared helpers, keeping duplication low while validating behavior under both settings.
358-387: LGTM: inner-join DISTINCT with mixed table prefixesValidates alias resolution across join boundaries and ensures the unprefixed aliasing doesn’t break inter-model references.
389-422: LGTM: duplicate column name across tables with DISTINCTGood assertion on header naming differences when alias-expression is enabled vs. disabled (
apptype_2vsapptype1). This guards the exact behavior the optimizer and alias generator should produce.core/src/test/java/io/questdb/test/griffin/SqlUtilTest.java (5)
70-75: LGTM: reserved-word aliases are quoted case-insensitivelyThe added uppercase assertions confirm the new lower-casing set behavior.
79-93: LGTM: duplicate base alias suffixingVerifies sequential
_2,_3behavior and guards against regressions in suffixing logic.
105-113: LGTM: trimming trailing spaces for simple basesConfirms that the trailing space removal preserves leading spaces while trimming the tail only.
114-122: LGTM: max-length trimmingEnsures aliases respect max length limits without breaking early-exit cases.
125-136: LGTM: non-literal quoting coverageCovers both pre-quoted and dot-containing non-literals; aligns with the new
quotedecision logic.
[PR Coverage check]😍 pass : 35 / 35 (100.00%) file detail
|
|
really solid PR, thanks @puzpuzpuz 👍 |
|
@bluestreak01 thanks for the review! |
Fixes #6058
Fixes #6066
Queries like
and any queries that involve table name-prefixed columns in SELECT weren't compiling. That's due to select distinct to group by rewrite combined with expression alias generation. The alias generation kept
sensors.sensor_idas the column alias, without any quotes, and laterSqlOptimiserremoved thesensors.leading to broken column references between the outer and the nested query models.The fix is to use
sensor_idinstead ofsensors.sensor_idas the implicit alias.