fix(sql): handle quoted column names with dots in JOIN metadata#6590
fix(sql): handle quoted column names with dots in JOIN metadata#6590bluestreak01 merged 1 commit intomasterfrom
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
[PR Coverage check]😍 pass : 1 / 1 (100.00%) file detail
|
|
dear @coderabbitai please review this. one rule for this review: question everything! |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 16606 🏁 Script executed: Length of output: 2964 🏁 Script executed: Length of output: 4583 🏁 Script executed: Length of output: 14200 🏁 Script executed: Length of output: 15707 🏁 Script executed: Length of output: 7506 @jerrinot - Alright, I'll question everything as requested. 🔍 Critical Analysis1. Inconsistency Fix, But Why Was This Missed?Looking at the git history,
2. Why LAST Unquoted, Not FIRST Unquoted?The method uses
3. Test Doesn't Reproduce Original BugYour test uses: SELECT x as "foo.bar" FROM long_sequence(5)But the original issue #6407 was: SELECT json_extract(msg, '$.') FROM "a-b-c" LEFT JOIN ...Questions:
4. Quote Escaping - Is This a Bug Waiting to Happen?Looking at the if (seq.charAt(i) == '\"') {
inQuotes = !inQuotes;
}This simply toggles on
5. Performance Implications
6. Root Cause vs SymptomThe issue mentions "the optimizer generates a wild alias like
7. Edge Cases Not CoveredWhat about:
8. The Assertion That Was RemovedI notice in the git history that an assertion was removed earlier: - assert dot != -1 || tableAlias != null;Why was this removed? Was it triggering false positives? Does your fix now make this assertion valid again? 9. Coverage: Is 100% Enough?The coverage report shows 1/1 lines covered (100%). But:
10. Documentation GapNeither
Should this be documented to prevent future regressions? Recommended Actions
What do you think? Should I dig deeper into any of these concerns? 🐰 🏁 Script executed: Length of output: 293275 @jerrinot Thanks — I questioned everything. Here’s the focused review. What I verified
Concerns and questions
Concrete proposals A) Harden Chars.indexOfLastUnquoted to support "" inside quoted identifiers public static int indexOfLastUnquoted(@NotNull CharSequence seq, char ch, int seqLo, int seqHi) {
boolean inQuotes = false;
int last = -1;
for (int i = seqLo; i < seqHi; i++) {
char c = seq.charAt(i);
if (c == '\"') {
if (inQuotes && i + 1 < seqHi && seq.charAt(i + 1) == '\"') {
// Escaped quote inside a quoted identifier: consume both quotes, stay in quotes.
i++;
continue;
}
inQuotes = !inQuotes;
continue;
}
if (!inQuotes && c == ch) {
last = i;
}
}
return last;
}B) Make getColumnIndexQuiet() respect the hi bound @Override
public int getColumnIndexQuiet(CharSequence columnName, int lo, int hi) {
final MapKey key = map.withKey();
final int dot = Chars.indexOfLastUnquoted(columnName, '.', lo, hi);
if (dot == -1) {
key.putStr(null);
key.putStrLowerCase(columnName, lo, hi);
} else {
key.putStrLowerCase(columnName, 0, dot);
// use hi instead of columnName.length() to honor caller-provided bounds
key.putStrLowerCase(columnName, dot + 1, hi);
}
MapValue value = key.findValue();
return value != null ? value.getInt(0) : -1;
}C) Add tests to pin the original and adjacent scenarios @Test
public void testJoiningSubqueryWithJsonExtractDotInAlias() throws Exception {
assertMemoryLeak(() -> {
// The alias will be something like: _xQdbA1."json_extract(msg, '$.')"
execute("create table t(msg string)");
execute("insert into t values ('{\"k\":1}')");
assertQueryNoLeakCheck(
// We only need to check that it compiles and executes without internal errors;
// expected output: one row with the extracted object and "1" from the RHS.
"json_extract\t1\n" +
"{\"k\":1}\t1\n",
"SELECT * FROM ( SELECT json_extract(msg, '$.') FROM t ) LEFT JOIN (select 1) ON 1 = 1",
null,
false
);
});
}
@Test
public void testQualifiedAliasWithQuotedDots() throws Exception {
assertMemoryLeak(() -> {
// Ensure we can qualify a quoted identifier with internal dots and still resolve correctly.
assertQueryNoLeakCheck(
"\"a.b.c\"\t1\n" +
"42\t1\n",
"SELECT * FROM (SELECT 42 as \"a.b.c\") l LEFT JOIN (select 1) r ON true",
null,
false
);
});
}
@Test
public void testQuotedIdentifierWithEscapedQuoteAndDot() throws Exception {
assertMemoryLeak(() -> {
// If Chars.indexOfLastUnquoted handles "" correctly, this remains unqualified.
assertQueryNoLeakCheck(
"\"a\"\"b.c\"\t1\n" +
"7\t1\n",
"SELECT * FROM (SELECT 7 as \"a\"\"b.c\") l LEFT JOIN (select 1) r ON true",
null,
false
);
});
}Notes:
D) Tiny docs note (inline comments)
Summary stance
Would you like me to open a follow-up issue for A/B/C/D so we can land this PR cleanly and track the hardening separately? |
fixes #6407