feat(sql): speed-up queries with multiple IN predicates on indexed columns#6112
Conversation
…n indexes columns
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughUpdates WhereClauseParser to conditionally allow merging list-of-values with bind variables based on whether a new column is being created. Adds a PG wire test verifying that query plans with bind variables favor the high-cardinality index regardless of predicate order. Changes
Sequence Diagram(s)sequenceDiagram
participant Query as Query (IN-list with binds)
participant Parser as WhereClauseParser
participant Analyzer as analyzeListOfValues
participant Planner as Planner
Query->>Parser: Parse WHERE with IN-list
Parser->>Analyzer: analyzeListOfValues(nodes, newColumn)
alt all key values known OR newColumn
Analyzer->>Analyzer: Merge/validate values
alt existing column AND bind variable present
note over Analyzer: Skip merging when !newColumn and bind seen with prior values
else new column or all known
Analyzer->>Analyzer: Allow merge
end
else
Analyzer->>Analyzer: Do not merge lists
end
Parser->>Planner: Emit analyzed predicates
Planner->>Planner: Choose index based on cardinality
Planner-->>Query: Plan (index scan on high-cardinality col)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
core/src/main/java/io/questdb/griffin/WhereClauseParser.java (1)
885-887: Use short-circuit boolean operator (&&) instead of bitwise&.Replace
&with&&for clarity and to avoid unnecessary RHS evaluation.- if (!allKeyValuesAreKnown & !newColumn) { + if (!allKeyValuesAreKnown && !newColumn) { return false; }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
core/src/main/java/io/questdb/griffin/WhereClauseParser.java(2 hunks)core/src/test/java/io/questdb/test/cutlass/pgwire/PGJobContextTest.java(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
- GitHub Check: New pull request (Coverage Report Coverage Report)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (Hosted Running tests on windows-other)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (SelfHosted Other tests Start X64Zfs Agent)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start ARM Agent)
- GitHub Check: New pull request (SelfHosted Other tests Start ARM Agent)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start X64Zfs Agent)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-other)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-cairo)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-fuzz)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-griffin)
- GitHub Check: New pull request (Check Changes Check changes)
[PR Coverage check]😍 pass : 3 / 3 (100.00%) file detail
|
This speeds up PGWire queries with
IN()predicates on multiple indexed columns.Example:
Details:
When both
symAandsymbBare indexed then the optimizer should plan index-scan on the column with higher cardinality. Since this column will likely result in a greater selectivity. However, before this PR the optimize would refuse to consider column cardinality in the presence of binding variables. Thus yielding a suboptimal plan.This change it more relaxed. Reasoning: We cannot merge keys when binding variables are used. But if a column is 'new' (=different from the current key column) then we can still analyze it and possibly swap it for the existing key column. This does not involve any key merging => it's safe with binding variables.
TODO: