feat(sql): timestamp predicate pushdown through virtual models with dateadd offset#6702
feat(sql): timestamp predicate pushdown through virtual models with dateadd offset#6702bluestreak01 merged 26 commits intomasterfrom
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the
WalkthroughAdds dateadd-based timestamp offset detection and an Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 9
🤖 Fix all issues with AI agents
In `@core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java`:
- Around line 6021-6031: The model-provided timestamp index
(modelTimestampIndex) can be overridden by a later literal match; modify the
loop in SqlCodeGenerator so that when modelTimestampIndex == i you set
virtualMetadata.setTimestampIndex(i) and then continue to the next iteration
(skip the literal check), or alternatively only allow the literal-path
(node.type == LITERAL && Chars.equalsNc(...)) to set the timestamp if
modelTimestampIndex is not equal to i and virtualMetadata has no timestamp set;
adjust the logic around modelTimestampIndex, virtualMetadata.setTimestampIndex,
node.type, LITERAL, Chars.equalsNc, timestampColumn and the loop over
columns/QueryColumn/ExpressionNode to enforce model precedence.
In `@core/src/main/java/io/questdb/griffin/SqlOptimiser.java`:
- Around line 3689-3706: The unary-minus branch in isConstantIntegerExpression
currently returns true for any CONSTANT rhs without verifying it's an integer;
update isConstantIntegerExpression so that when node.type == OPERATION and token
== "-" and paramCount == 1 you additionally attempt to parse node.rhs.token as a
long (using the same Numbers.parseLong used earlier) and only return true if
parsing succeeds, catching NumericException and returning false otherwise; keep
the existing null/type checks for node.rhs and preserve behavior for the
non-unary cases.
- Around line 4049-4106: referencesColumn and rewriteColumnReferences only
compare node.token exactly, so qualified names like "v.ts" are not matched or
rewritten; update referencesColumn(ExpressionNode) to, when node.type ==
LITERAL, check token suffix after the last '.' (case-insensitive) against
columnName and return true if equal, and update
rewriteColumnReferences(ExpressionNode) to, when matching the suffix, replace
only the suffix portion (preserving any qualifying prefix and the dot) with the
new name (or if there is no prefix, replace the whole token); ensure
rewriteTimestampColumnForOffset still supplies the same from/to values to these
methods so virtual timestamp aliases like "v.ts" are recognized and rewritten to
the source column.
In `@core/src/main/java/io/questdb/griffin/WhereClauseParser.java`:
- Around line 458-474: The code extracts a time unit by naively taking
unitNode.token.charAt(0) which accepts multi-char tokens or bind variables
(e.g., "'ms'" or ":v") and causes incorrect shifts; update the parsing in
WhereClauseParser to validate unitNode is a constant single-character token
(unitNode.token length is 1 or length==3 with surrounding single quotes and a
single inner char), reject or return false for any other shape (multi-char
tokens, bind variables, empty tokens), then only set the char into variable
unit; keep offset parsing via Numbers.parseLong(offsetNode.token) and preserve
the NumericException handling.
- Around line 517-529: convertToMicroseconds currently approximates calendar
units ('M','y') and returns raw values for unknown units, risking incorrect
shifts and silent overflow; update convertToMicroseconds to reject
variable-length units ('M' and 'y') and any unknown unit by throwing or
returning a sentinel that signals "unsupported", and use Math.multiplyExact (or
a try/catch around it) for all multiplications (cases
'u','T','s','m','h','d','w') to detect overflow; when multiplyExact throws,
ensure the caller (e.g., the pushdown decision path) treats this as a conversion
failure and skips pushdown rather than applying a bad offset.
In `@core/src/test/java/io/questdb/test/griffin/engine/SqlCodeGeneratorTest.java`:
- Around line 9201-9205: The inline comment in SqlCodeGeneratorTest near the
cursor/record assertion contains an incorrect microsecond timestamp for
2022-01-15T11:00:00Z; update the comment (in the block that reads "First row:
2022-01-15T12:00:00 - 1h = 2022-01-15T11:00:00 = 1642247200000000 microseconds")
to the correct value 1642244400000000 so the comment matches the intended time
math around the cursor/record assertion (verify around the
Record/Assert.assertEquals lines).
In `@core/src/test/java/io/questdb/test/griffin/SqlOptimiserTest.java`:
- Around line 6020-6041: The test method testTimestampOffsetRewriteModelDaysUnit
currently checks for "'d'" anywhere in modelStr which can match the original
dateadd and not the optimizer output; update the assertions to locate the
and_offset(...) fragment and assert the unit and offset within that wrapper
(e.g., assert modelStr contains "and_offset(" and then contains the expected
"'d', -X" pattern for the negative-offset case), and likewise tighten the
positive-offset test (the other test referenced around the lines 6091-6108) to
assert the inverted offset value appears inside the and_offset(...) call rather
than anywhere in modelStr; use the QueryModel/modelStr output to perform these
targeted contains checks on the and_offset(...) substring.
In `@core/src/test/java/io/questdb/test/griffin/SqlParserTest.java`:
- Around line 12260-12266: The comment in
testTimestampPredicateNoPushdownWithoutDateadd contradicts the expected result:
it claims predicates are "pushed all the way through" but the assertQuery's
expected SQL keeps the predicate on the outer alias `ts`. Either update the
comment to state that predicates remain on the outer alias (i.e., no pushdown)
or change the expected string passed to assertQuery so the predicate is pushed
into the inner query; locate the assertion in method
testTimestampPredicateNoPushdownWithoutDateadd (the assertQuery call with the
expected SQL string and modelOf("trades").col("price",
ColumnType.DOUBLE).timestamp()) and edit the comment or the expected SQL
accordingly.
- Around line 12243-12247: Update the block header comment above the tests for
"and_offset rewrite when pushing timestamp predicates through virtual models
with dateadd offset" to reflect the current behavior: remove the sentence
claiming the and_offset rewrite "is NOT working yet" and change the EXPECTED
behavior description to state that predicates are wrapped in and_offset and
reference the real 'timestamp' column (instead of the virtual 'ts' column); this
comment surrounds tests asserting and_offset behavior and refers to the
and_offset rewrite, the virtual 'ts' column, and the real 'timestamp' column.
core/src/test/java/io/questdb/test/griffin/engine/SqlCodeGeneratorTest.java
Show resolved
Hide resolved
core/src/test/java/io/questdb/test/griffin/SqlOptimiserTest.java
Outdated
Show resolved
Hide resolved
|
@CodeRabbit, i addressed some review comments and added tests. Can you review? |
|
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@core/src/main/java/io/questdb/griffin/SqlOptimiser.java`:
- Around line 3892-3919: The isDateaddTimestampExpression method currently
compares timestampArg.token using exact equality, which misses qualified
references like v.ts; update the comparisons to use the existing
matchesColumnName(...) helper instead of Chars.equalsIgnoreCase for all three
checks: compare timestampArg.token against nestedModel.getTimestamp().token via
matchesColumnName, compare against QueryColumn alias via
matchesColumnName(tsCol.getAlias(), timestampArg.token) when checking
nestedModel.getTimestampColumnIndex(), and compare against
nestedModel.getTimestampOffsetAlias() using matchesColumnName as well so
qualified suffix matches are recognized.
- Around line 3744-3821: detectTimestampOffsetsRecursive can overwrite an
explicitly designated timestamp because it sets model.setTimestampColumnIndex
for virtual models; change the logic in detectTimestampOffsetsRecursive to skip
processing when model.getTimestamp() != null (i.e., if the model already has an
explicit timestamp, do not attempt to detect or set timestampColumnIndex or
offset fields). Locate the method detectTimestampOffsetsRecursive and add a
guard using model.getTimestamp() before the existing SELECT_MODEL_VIRTUAL and
timestampColumnIndex checks so that model.setTimestampColumnIndex,
setTimestampOffsetUnit/value, setTimestampSourceColumn, and
setTimestampOffsetAlias are never called for models with an explicit timestamp.
In `@core/src/test/java/io/questdb/test/griffin/GroupByTest.java`:
- Around line 3123-3169: The test testSelectDistinctOnTimestampWithOrderBy is a
duplicate of testSelectDistinctOnAliasedColumnWithOrderBy and has mismatched
expectations (plan shows functions: [dateadd('h',1,created)] and results show
+1h) vs the query which selects raw tab.created; either (A) change the query
string in testSelectDistinctOnTimestampWithOrderBy to select the expression
using dateadd('h',1,created) so the plan and expected rows match, or (B) keep
the existing query and update the assertPlanNoLeakCheck and
assertQueryNoLeakCheck expectations to reflect raw timestamps (00:00:00.00000xZ)
instead of +1h, or (C) remove this test entirely to avoid duplication with
testSelectDistinctOnAliasedColumnWithOrderBy; update the assertions using the
existing helper calls assertPlanNoLeakCheck and assertQueryNoLeakCheck to
reflect your chosen fix.
🧹 Nitpick comments (5)
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (1)
6021-6032: Guard against out-of-range model timestamp index.If
modelTimestampIndexis set but falls outside the projection list, the literal fallback is disabled and the timestamp may remain unset. Consider sanitizing the index so the previous heuristic still applies when the override is invalid.🛠️ Suggested defensive guard
- int modelTimestampIndex = model.getTimestampColumnIndex(); + int modelTimestampIndex = model.getTimestampColumnIndex(); + if (modelTimestampIndex < 0 || modelTimestampIndex >= columnCount) { + modelTimestampIndex = -1; + }core/src/test/java/io/questdb/test/griffin/TimestampOffsetPushdownTest.java (2)
326-371: Consider adding leap year edge case test.The month offset test correctly demonstrates calendar-aware interval adjustment (Feb 28 + 1 month = Mar 28). However, leap year edge cases (e.g., what happens with Feb 29 → Mar 29, or Jan 31 → Feb 28/29) are not explicitly tested.
💡 Optional: Add leap year test case
`@Test` public void testMonthOffsetLeapYearPushdown() throws Exception { // Test calendar-aware month handling across leap year boundary assertMemoryLeak(() -> { execute("CREATE TABLE trades (price DOUBLE, timestamp TIMESTAMP) TIMESTAMP(timestamp) PARTITION BY MONTH;"); // Row with timestamp Mar 31 -> ts (after -1M) = Feb 28/29 depending on year execute("INSERT INTO trades VALUES (100, '2024-03-31T12:00:00.000000Z');"); // 2024 is leap year String query = """ SELECT * FROM ( SELECT dateadd('M', -1, timestamp) as ts, price FROM trades ) WHERE ts IN '2024-02' """; // ts = Feb 29 (leap year clamps Mar 31 - 1M to Feb 29) assertQueryNoLeakCheck( """ ts\tprice 2024-02-29T12:00:00.000000Z\t100.0 """, query, "ts", true, false ); }); }
495-515: Consider adding zero offset and BETWEEN test cases.The test suite is comprehensive but could benefit from a few additional edge cases mentioned in the PR description:
- Zero offset: Test that
dateadd('h', 0, timestamp)results in an unchanged interval (no-op offset).- BETWEEN predicate: The PR mentions BETWEEN support, but no explicit test with
dateadd+BETWEENexists.💡 Optional: Additional edge case tests
`@Test` public void testZeroOffsetPushdown() throws Exception { // Zero offset should result in no interval adjustment assertMemoryLeak(() -> { execute("CREATE TABLE trades (price DOUBLE, timestamp TIMESTAMP) TIMESTAMP(timestamp) PARTITION BY DAY;"); execute("INSERT INTO trades VALUES (100, '2022-01-01T12:00:00.000000Z');"); String query = """ SELECT * FROM ( SELECT dateadd('h', 0, timestamp) as ts, price FROM trades ) WHERE ts IN '2022' """; // Interval should remain unchanged: [2022-01-01, 2022-12-31 23:59:59] assertPlanNoLeakCheck( query, """ VirtualRecord functions: [dateadd('h',0,timestamp),price] PageFrame Row forward scan Interval forward scan on: trades intervals: [("2022-01-01T00:00:00.000000Z","2022-12-31T23:59:59.999999Z")] """ ); }); } `@Test` public void testBetweenPredicatePushdown() throws Exception { assertMemoryLeak(() -> { execute("CREATE TABLE trades (price DOUBLE, timestamp TIMESTAMP) TIMESTAMP(timestamp) PARTITION BY DAY;"); execute("INSERT INTO trades VALUES (100, '2022-01-01T02:00:00.000000Z');"); execute("INSERT INTO trades VALUES (150, '2022-01-01T03:00:00.000000Z');"); String query = """ SELECT * FROM ( SELECT dateadd('h', -1, timestamp) as ts, price FROM trades ) WHERE ts BETWEEN '2022-01-01T01:00:00' AND '2022-01-01T02:00:00' """; // BETWEEN should also push down with offset adjustment assertPlanNoLeakCheck( query, """ VirtualRecord functions: [dateadd('h',-1,timestamp),price] PageFrame Row forward scan Interval forward scan on: trades intervals: [("2022-01-01T02:00:00.000000Z","2022-01-01T03:00:00.000000Z")] """ ); }); }core/src/test/java/io/questdb/test/griffin/SqlParserTest.java (2)
12258-12266: Rename test to reflect actual pushdown behavior.The method name says “NoPushdown” but the expected model pushes the predicate down to
timestamp. Consider renaming for clarity.♻️ Suggested rename
- public void testTimestampPredicateNoPushdownWithoutDateadd() throws Exception { + public void testTimestampPredicatePushdownWithoutDateadd() throws Exception {
12415-12423: Clarify that this test only covers seconds.This method currently validates only the seconds unit; minutes and months are handled in later tests. Consider renaming the test or tightening the comment to avoid implying broader coverage.
♻️ Example adjustment
- public void testTimestampPredicateWithDifferentTimeUnits() throws Exception { - // Test various time units: seconds, minutes, months - // Using 's' for seconds + public void testTimestampPredicateWithSecondUnit() throws Exception { + // Seconds unit
|
@CodeRabbit I addressed your comments once again and added more tests, pls review |
|
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@core/src/main/java/io/questdb/griffin/SqlOptimiser.java`:
- Around line 4020-4082: The NonLiteralException branch currently pushes
predicates to nested models if they are timestamp predicates even when the
predicate also references other non-literal aliases; change this to only push
when the predicate references exclusively the timestamp alias/token (or an
AND/OR subset composed only of timestamp-only parts). Implement a conservative
helper like referencesOnlyTimestampAlias(ExpressionNode node, QueryModel model)
and call it in the catch before rewriteTimestampColumnForOffset(node,
parent)/wrapInAndOffset(node, parent)/addWhereNode(nested, wrapped); if it
returns false, fall back to addWhereNode(parent, node) so no non-literal aliases
are pushed into the nested model. Ensure the check uses
model.getTimestamp().token and model.getTimestampOffsetAlias() to match allowed
literal tokens and treat multi-argument expressions by traversing all child
nodes.
- Around line 4104-4200: The rewriteColumnToken currently preserves the original
prefix and blindly appends the replacement 'to', which double‑qualifies when
'to' itself contains a dot (e.g., rewriting "v.ts" -> "v.t.ts"); update
rewriteColumnToken to detect if 'to' contains a qualifier
(Chars.lastIndexOf(to,0,to.length(),'.') >= 0) and in that case return 'to'
as‑is (dropping the original prefix) instead of prefix+to; otherwise preserve
the existing behavior of creating a CharacterStoreEntry, putting the prefix then
the simple 'to', and returning entry.toImmutable().
- Around line 3693-3815: The literal-check that skips ts_offset detection should
treat qualified column literals as matches; in detectTimestampOffsetsRecursive
replace the exact token equality check between ast.token and sourceTimestampName
with a call to matchesColumnName (or equivalent) so qualified names like "v.ts"
match "ts"; update the loop over model.getBottomUpColumns() where it checks
ast.type == LITERAL to use matchesColumnName(ast.token, sourceTimestampName) for
the early-return logic in QueryModel/QueryColumn handling.
♻️ Duplicate comments (2)
core/src/test/java/io/questdb/test/griffin/SqlOptimiserTest.java (2)
6035-6040: Tighten the day-unit assertion to targetand_offset(...).
modelStr.contains("'d'")can match thedateadd('d', ...)select list, so the test can pass even ifand_offsetuses a different unit.🔧 Suggested tightening
- assertTrue("Expected day unit 'd' in and_offset: " + modelStr, - modelStr.contains("'d'") || modelStr.contains("d,")); + assertTrue("Expected day unit 'd' in and_offset: " + modelStr, + modelStr.matches("(?s).*and_offset\\([^)]*'d'[^)]*\\).*"));
6106-6108: Assert the inverted offset value insideand_offset(...).
This test only checks the wrapper’s presence, so it can pass even if the optimizer fails to invert+1to-1.🔧 Suggested assertion
assertTrue("Expected and_offset in model: " + modelStr, modelStr.contains("and_offset")); + assertTrue("Expected inverted offset -1 in and_offset: " + modelStr, + modelStr.matches("(?s).*and_offset\\([^)]*,\\s*-1\\b[^)]*\\).*"));
🧹 Nitpick comments (2)
core/src/main/java/io/questdb/griffin/WhereClauseParser.java (1)
483-489: Add defensive null check for offset token.While
and_offsetis internally generated and should always have valid arguments, adding a null check provides defensive coding consistency with the unit validation above.♻️ Suggested improvement
// Parse offset value int offsetValue; try { + if (offsetNode.token == null) { + return false; + } offsetValue = Numbers.parseInt(offsetNode.token); } catch (NumericException e) { return false; }core/src/test/java/io/questdb/test/griffin/TimestampOffsetPushdownTest.java (1)
463-494: Add a result assertion to avoid plan-only coverage. This test currently validates only the plan, which could miss data-path regressions.✅ Suggested enhancement
assertMemoryLeak(() -> { execute("CREATE TABLE trades (price DOUBLE, timestamp TIMESTAMP) TIMESTAMP(timestamp) PARTITION BY DAY;"); + execute("INSERT INTO trades VALUES (100, '2022-01-02T02:00:00.000000Z');"); String query = """ SELECT * FROM ( SELECT dateadd('d', -1, ts1) as ts2, price FROM ( SELECT dateadd('h', -1, timestamp) as ts1, price FROM trades ) ) WHERE ts2 IN '2022' """; @@ assertPlanNoLeakCheck( query, """ VirtualRecord functions: [dateadd('d',-1,ts1),price] VirtualRecord functions: [dateadd('h',-1,timestamp),price] PageFrame Row forward scan Interval forward scan on: trades intervals: [("2022-01-02T01:00:00.000000Z","2023-01-02T00:59:59.999999Z")] """ ); + + assertQueryNoLeakCheck( + """ + ts2\tprice + 2022-01-01T01:00:00.000000Z\t100.0 + """, + query, + "ts2", + true, + false + ); }); }
|
@CodeRabbit I addressed all your comments, pls review. Find more bugs. |
|
I'll perform a thorough review of the latest changes, paying special attention to:
Let me conduct a detailed review to identify any potential bugs or issues. ✅ Actions performedReview triggered.
|
Resolved conflicts in IntrinsicModel.java: - Kept both imports (CairoConfiguration and TimestampDriver) - Updated of() method signature to include CairoConfiguration parameter - Kept mergeIntervalModelWithAddMethod() method Updated WhereClauseParser.java to use new of() signature with configuration. Co-Authored-By: Claude Opus 4.5 <[email protected]>
|
@CodeRabbit how is it going? |
|
The review will be posted shortly with detailed findings and suggestions. ✅ Actions performedReview triggered.
|
|
@CodeRabbit - have you finished? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain
🏁 Script executed: Length of output: 16079 🏁 Script executed: Length of output: 6268 🏁 Script executed: Length of output: 5016 🏁 Script executed: Length of output: 5670 🏁 Script executed: Length of output: 6928 🏁 Script executed: Length of output: 4455
🔴 Critical Issues1. Type Inconsistency:
|
|
@CodeRabbit i addressed your comments, pls review the changes |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 4269 🏁 Script executed: Length of output: 2974 🏁 Script executed: Length of output: 1063 🏁 Script executed: Length of output: 675
✅ Issues Fixed1. Type Consistency - RESOLVED ✅You've standardized the offset type to
This is a clean, consistent solution. While it limits offsets to the 2. Overflow Protection - RESOLVED ✅You've added comprehensive overflow protection with the new private static long addWithOverflowCheck(TimestampDriver.TimestampAddMethod addMethod,
long timestamp, int offset) throws SqlException {
if (offset == 0) {
return timestamp;
}
long result = addMethod.add(timestamp, offset);
// Detect overflow in both directions
if (offset > 0 && result < timestamp) {
throw SqlException...
} else if (offset < 0 && result > timestamp) {
throw SqlException...
}
...
}This properly catches timestamp overflow in both positive and negative offset directions. Great defensive programming! 3. NumericException Usage - NOT A BUG ✅Upon re-verification, 📊 SummaryAll critical issues have been properly addressed. The implementation now has:
The code is ready for merge from my perspective. Great job on the thorough implementation and testing! If you found this review helpful, would you consider giving us a shout-out on X? Thank you for using CodeRabbit! |
[PR Coverage check]😍 pass : 427 / 501 (85.23%) file detail
|
|
@CodeRabbit i know you're busy, i just sent few changes, can you check if i broke anything? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain
🏁 Script executed: Length of output: 3135 🏁 Script executed: Length of output: 8396
✅ Changes Verified1. SqlOptimiser.java - Memory Optimization ✅You refactored // Before:
"'" + model.getTimestampOffsetUnit() + "'"
Integer.toString(model.getTimestampOffsetValue())
// After:
CharacterStoreEntry unitEntry = characterStore.newEntry();
unitEntry.put('\'').put(model.getTimestampOffsetUnit()).put('\'');
// ...and similar for offsetEntryThis is a positive change — it uses QuestDB's object pooling mechanism for better memory efficiency instead of creating new String objects. Well done! 2. WhereClauseParser.java - Code Clarity Improvement ✅You renamed Old (confusing):
New (clear):
The logic inversion was done correctly at all 3 call sites, and the new naming is much more intuitive! 3. TimestampOffsetPushdownTest.java - Test Coverage ✅Added 🎯 SummaryAll your refactorings are improvements:
You're good to go! 🚀 |
Summary
This PR enables pushing timestamp predicates through virtual models where the timestamp column is derived from a
dateadd()function. This allows partition pruning to work even when queries transform the timestamp usingdateadd.Key changes:
dateadd(unit, constant, timestamp)patterns in SELECT clauses and annotate models with offset infoand_offset(predicate, unit, offset)during optimizationand_offsetin WhereClauseParser to extract intervals and apply offset adjustmentstimestamp(ts)clause required)Example
For nested dateadd levels:
Test plan
and_offsetparsing in WhereClauseParser🤖 Generated with Claude Code