perf(sql): reduce repeated expression execution when a column is referenced by other columns#6093
perf(sql): reduce repeated expression execution when a column is referenced by other columns#6093bluestreak01 merged 38 commits intomasterfrom
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis PR refactors the function memoization architecture in QuestDB's SQL engine. It removes per-record memoization hooks from core function interfaces, introduces a new Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60–90 minutes Areas requiring extra attention:
Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…dels and add basic column ref count memoization tests
…just related tests
# Conflicts: # core/src/main/java/io/questdb/griffin/FunctionParser.java # core/src/main/java/io/questdb/griffin/engine/functions/json/JsonExtractFunction.java # core/src/main/java/io/questdb/griffin/engine/functions/memoization/TimestampFunctionMemoizer.java
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (8)
core/src/main/java/io/questdb/griffin/engine/functions/rnd/RndVarcharFunction.java (1)
87-90: Confirm memoization scope for rnd_varchar and add regression test coverageEnabling
shouldMemoize()on a non-deterministic, random function makes sense to avoid re-evaluatingrnd_varcharwhen other projection columns depend on it (e.g.,SELECT rnd_varchar(5,10,2) v, length(v) FROM ...). This is only safe if the memoizer’s cache is strictly per record and reset on each row; otherwisernd_varcharcould effectively become constant over the cursor.Please:
- Double-check that the
MemoizerFunction/ memoizer wrappers invalidate cached values on everyRecordadvance for this function.- Add a focused test that:
- Verifies
SELECT rnd_varchar(...) v, length(v)yields consistentv/length(v)within a row.- Verifies consecutive rows still produce the same distribution/sequence characteristics as before (not accidentally frozen across rows).
Given the coverage report shows 0% on this new branch for
RndVarcharFunction, this test would also close that gap. Based on learnings, deterministicRndseeds in QuestDB tests will keep such tests stable across runs.If you’d like, I can sketch an example JUnit test case for this scenario.
core/src/main/java/io/questdb/griffin/model/QueryModel.java (1)
1095-1105: Clarify method semantics: dual use ofrefCountparameter.The parameter
refCountserves dual purposes: it's added to the existing count when the key exists (line 1100), but used as an initial value when the key doesn't exist (line 1102). This dual behavior isn't obvious from the method signature and could confuse maintainers.Consider renaming to make the semantics clearer:
-public void incrementColumnRefCount(CharSequence alias, int refCount) { +public void addColumnRefCount(CharSequence alias, int delta) { if (columnAliasIndexes.get(alias) > -1) { int keyIndex = columnAliasRefCounts.keyIndex(alias); if (keyIndex < 0) { int old = columnAliasRefCounts.valueAt(keyIndex); - columnAliasRefCounts.putAt(keyIndex, alias, old + refCount); + columnAliasRefCounts.putAt(keyIndex, alias, old + delta); } else { - columnAliasRefCounts.putAt(keyIndex, alias, refCount); + columnAliasRefCounts.putAt(keyIndex, alias, delta); } } }Or add a comment explaining that
refCountrepresents the delta to add, which initializes to that value if the alias hasn't been tracked yet.core/src/main/java/io/questdb/griffin/SqlOptimiser.java (3)
174-175: Consider clearingtempExprsinclear()for consistency
tempExprsis only used insidecollectColumnRefCountand is explicitly cleared there, so leaving it out ofclear()is safe. Still, for symmetry with other scratch collections (tempColumns,tempColumns2,tempIntList, etc.) and to reduce the chance of future misuse, it may be worth addingtempExprs.clear()toclear().public void clear() { clearForUnionModelInJoin(); contextPool.clear(); @@ tempColumns.clear(); tempColumns2.clear(); + tempExprs.clear(); }Also applies to: 238-263
7275-7427: Column ref-count accounting: tighten alias matching and dotted-name handlingThe overall structure of
collectColumnRefCount(...)makes sense (local traversal, then parent-driven propagation usinggetRefCount, and skipping non-CHOOSE/VIRTUAL models), but there are a couple of edge cases worth tightening:
Incrementing counts for literals that aren’t produced by this model
In the parent block, every literal encountered is passed to
queryModel.incrementColumnRefCount(...)regardless of whetherqueryModelactually defines a column with that alias:if (dot == -1) { queryModel.incrementColumnRefCount(columnRef, refCount); } else { ... if (queryModel.getAlias() != null && Chars.equalsIgnoreCase(tableName, queryModel.getAlias().token)) { queryModel.incrementColumnRefCount(columnName, refCount); } }If a parent expression references a base-table column or some other token that is not a column of this
queryModel, this will still increment a counter. UnlessincrementColumnRefCountinternally validates aliases, that can skew the heuristic.Suggest gating the increments on the child’s alias map, mirroring the checks you do on the local pass:
if (dot == -1) {
- queryModel.incrementColumnRefCount(columnRef, refCount);
- if (queryModel.getAliasToColumnMap().contains(columnRef)) {
queryModel.incrementColumnRefCount(columnRef, refCount);- }
} else {
final CharSequence tableName = columnRef.subSequence(0, dot);
final CharSequence columnName = columnRef.subSequence(dot + 1, columnRef.length());
if (queryModel.getAlias() != null && Chars.equalsIgnoreCase(tableName, queryModel.getAlias().token)) {
queryModel.incrementColumnRefCount(columnName, refCount);
if (queryModel.getAliasToColumnMap().contains(columnName)) {queryModel.incrementColumnRefCount(columnName, refCount); }} }
Key mismatch between local and parent dotted references
In the local traversal (same model), for
alias.colyou increment on the fullcolumnRef:queryModel.incrementColumnRefCount(columnRef, 1); // "alias.col"In the parent-driven traversal, you increment on
columnName(suffix) only:queryModel.incrementColumnRefCount(columnName, refCount); // "col"Depending on how
QueryModel.incrementColumnRefCountandgetRefCountkey their internal map, this asymmetry could mean that some references toalias.colare counted under"alias.col"and some under"col", fragmenting counts.It might be safer to normalize to the same key everywhere (probably the column alias as exposed by this
QueryModel, i.e. suffix without table prefix), e.g.:// local dotted case
- queryModel.incrementColumnRefCount(columnRef, 1);
- queryModel.incrementColumnRefCount(columnName, 1);
…or, if you need to keep both forms, ensure `incrementColumnRefCount` always canonicalizes to the per-model alias before updating the counter.
Small clarity/duplication point (optional)
The expression-tree walks for the local block and parent block are structurally identical. Extracting a tiny helper like
traverseExprAndAccumulateRefs(QueryModel target, ExpressionNode ast, int weight, @Nullable QueryModel nestedChild, boolean skipNestedColumns)would reduce duplication and make the logic easier to reason about, but this is more of a maintainability improvement than a bug.Overall behaviour looks correct in the common cases; the above changes would make the heuristic more robust and self-documenting.
7430-7448: Double-check parent propagation for join models incollectColumnRefCountRecursiveThe recursion helper currently does:
ObjList<QueryModel> joinModels = queryModel.getJoinModels(); for (int i = 1, n = joinModels.size(); i < n; i++) { collectColumnRefCount(parent, joinModels.getQuick(i)); } collectColumnRefCount(parent, queryModel.getUnionModel()); ... collectColumnRefCount(parentModel, queryModel.getNestedModel());For joins, this passes the same
parentthat was passed into the current call, rather than usingqueryModelitself as the parent of its join children. That means:
- If
queryModelis a CHOOSE/VIRTUAL layer that defines computed columns used in join criteria or upper projections, those references will not be propagated into join models via this parent link.- Top-level calls (
collectColumnRefCount(null, root)) will traverse join models withparent == null, so no parent expressions will ever contribute to ref-counts in join branches.If the intent is that join models should see usage coming from the “join root” model, you likely want:
for (int i = 1, n = joinModels.size(); i < n; i++) { - collectColumnRefCount(parent, joinModels.getQuick(i)); + collectColumnRefCount(queryModel, joinModels.getQuick(i)); }while preserving the existing
parentModellogic for nested models ofSELECT_MODEL_NONEwrappers.If, on the other hand, you intentionally want to ignore join-root expressions when computing ref counts for join branches (e.g., because memoization is only used on projection layers, not join-side functions), it would be helpful to add a brief comment here to capture that design choice so future readers don’t “fix” it back to
queryModellater.Please double-check this against how
QueryModel.getRefCountis consumed inSqlCodeGeneratorfor memoization decisions.core/src/main/java/io/questdb/griffin/engine/functions/memoization/UuidFunctionMemoizer.java (2)
49-67: Consider extracting duplicate computation logic.Both
getLong128Hi()andgetLong128Lo()contain identical cache population logic. Consider extracting to a private helper method for maintainability:+ private void ensureCached(Record rec) { + if (!validValue) { + lo = fn.getLong128Lo(rec); + hi = fn.getLong128Hi(rec); + validValue = true; + } + } + @Override public long getLong128Hi(Record rec) { - if (!validValue) { - lo = fn.getLong128Lo(rec); - hi = fn.getLong128Hi(rec); - validValue = true; - } + ensureCached(rec); return hi; } @Override public long getLong128Lo(Record rec) { - if (!validValue) { - lo = fn.getLong128Lo(rec); - hi = fn.getLong128Hi(rec); - validValue = true; - } + ensureCached(rec); return lo; }
84-88: Minor: Extra blank line.There's a trailing blank line inside the
memoize()method body.@Override public void memoize(Record record) { validValue = false; - }core/src/main/java/io/questdb/griffin/engine/functions/memoization/DoubleFunctionMemoizer.java (1)
63-65: Optionally reset cache explicitly oninitfor defensive reuseIf
DoubleFunctionMemoizerinstances are ever re-initialized and then reused, it may be safer to explicitly clear the memoized state here rather than relying solely onMemoizerFunction.super.init(...)to do so. For example:@Override public void init(SymbolTableSource symbolTableSource, SqlExecutionContext executionContext) throws SqlException { - MemoizerFunction.super.init(symbolTableSource, executionContext); + MemoizerFunction.super.init(symbolTableSource, executionContext); + validValue = false; }Not strictly necessary if the engine always constructs fresh memoizers or if
MemoizerFunction.initalready guarantees a reset, but this small change would make the class self-contained and robust to future lifecycle changes.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (40)
core/src/main/java/io/questdb/cairo/sql/Function.java(0 hunks)core/src/main/java/io/questdb/griffin/FunctionParser.java(0 hunks)core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java(2 hunks)core/src/main/java/io/questdb/griffin/SqlOptimiser.java(4 hunks)core/src/main/java/io/questdb/griffin/engine/functions/BinaryFunction.java(0 hunks)core/src/main/java/io/questdb/griffin/engine/functions/QuaternaryFunction.java(0 hunks)core/src/main/java/io/questdb/griffin/engine/functions/TernaryFunction.java(0 hunks)core/src/main/java/io/questdb/griffin/engine/functions/UnaryFunction.java(0 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/ArrayFunctionMemoizer.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/BooleanFunctionMemoizer.java(4 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/ByteFunctionMemoizer.java(4 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/CharFunctionMemoizer.java(4 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/DateFunctionMemoizer.java(4 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/DoubleFunctionMemoizer.java(4 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/FloatFunctionMemoizer.java(4 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/IPv4FunctionMemoizer.java(4 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/IntFunctionMemoizer.java(4 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/Long256FunctionMemoizer.java(4 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/LongFunctionMemoizer.java(4 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/MemoizerFunction.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/ShortFunctionMemoizer.java(3 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/StrFunctionMemoizer.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/SymbolFunctionMemoizer.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/TimestampFunctionMemoizer.java(3 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/UuidFunctionMemoizer.java(4 hunks)core/src/main/java/io/questdb/griffin/engine/functions/memoization/VarcharFunctionMemoizer.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/rnd/RndDoubleArrayFunctionFactory.java(2 hunks)core/src/main/java/io/questdb/griffin/engine/functions/rnd/RndStrFunction.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/rnd/RndSymbolFunctionFactory.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/rnd/RndVarcharFunction.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/groupby/GroupByUtils.java(2 hunks)core/src/main/java/io/questdb/griffin/engine/table/VirtualFunctionRecordCursor.java(3 hunks)core/src/main/java/io/questdb/griffin/engine/table/VirtualRecordCursorFactory.java(3 hunks)core/src/main/java/io/questdb/griffin/model/QueryModel.java(5 hunks)core/src/test/java/io/questdb/test/cairo/ArrayOrderBookTest.java(8 hunks)core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java(4 hunks)core/src/test/java/io/questdb/test/griffin/JsonExtractMemoizationTest.java(2 hunks)core/src/test/java/io/questdb/test/griffin/ProjectionReferenceTest.java(27 hunks)core/src/test/java/io/questdb/test/griffin/RndMemoizationTest.java(3 hunks)core/src/test/java/io/questdb/test/griffin/SqlOptimiserTest.java(5 hunks)
💤 Files with no reviewable changes (6)
- core/src/main/java/io/questdb/griffin/engine/functions/TernaryFunction.java
- core/src/main/java/io/questdb/griffin/engine/functions/QuaternaryFunction.java
- core/src/main/java/io/questdb/griffin/FunctionParser.java
- core/src/main/java/io/questdb/griffin/engine/functions/BinaryFunction.java
- core/src/main/java/io/questdb/cairo/sql/Function.java
- core/src/main/java/io/questdb/griffin/engine/functions/UnaryFunction.java
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-19T12:21:00.062Z
Learnt from: jerrinot
Repo: questdb/questdb PR: 6413
File: core/src/test/java/io/questdb/test/cutlass/pgwire/PGJobContextTest.java:11982-12002
Timestamp: 2025-11-19T12:21:00.062Z
Learning: QuestDB Java tests use a deterministic random seed. The test utilities (e.g., io.questdb.test.tools.TestUtils and io.questdb.std.Rnd) produce reproducible sequences, so rnd_* functions (including rnd_uuid4) yield deterministic outputs across runs. Do not flag tests in core/src/test/** that assert against values produced by rnd_* as flaky due to randomness.
Applied to files:
core/src/main/java/io/questdb/griffin/engine/functions/rnd/RndVarcharFunction.javacore/src/test/java/io/questdb/test/griffin/RndMemoizationTest.javacore/src/test/java/io/questdb/test/griffin/ExplainPlanTest.javacore/src/test/java/io/questdb/test/griffin/SqlOptimiserTest.javacore/src/main/java/io/questdb/griffin/engine/functions/rnd/RndDoubleArrayFunctionFactory.java
📚 Learning: 2025-11-10T14:28:48.329Z
Learnt from: mtopolnik
Repo: questdb/questdb PR: 0
File: :0-0
Timestamp: 2025-11-10T14:28:48.329Z
Learning: In AsOfJoinDenseRecordCursorFactoryBase.java, the `backwardScanExhausted` flag is intentionally NOT reset in `toTop()` because backward scan results are reusable across cursor rewinds. The backward scan caches historical matches that remain valid when the cursor is rewound.
Applied to files:
core/src/main/java/io/questdb/griffin/model/QueryModel.java
🧬 Code graph analysis (6)
core/src/main/java/io/questdb/griffin/engine/functions/memoization/SymbolFunctionMemoizer.java (2)
core/src/main/java/io/questdb/griffin/engine/functions/SymbolFunction.java (1)
SymbolFunction(52-276)core/src/main/java/io/questdb/std/str/StringSink.java (1)
StringSink(31-194)
core/src/main/java/io/questdb/griffin/model/QueryModel.java (1)
core/src/main/java/io/questdb/std/CharSequenceIntHashMap.java (1)
CharSequenceIntHashMap(32-170)
core/src/main/java/io/questdb/griffin/engine/functions/memoization/ArrayFunctionMemoizer.java (3)
core/src/main/java/io/questdb/cairo/arr/ArrayView.java (1)
ArrayView(135-600)core/src/main/java/io/questdb/cairo/arr/DerivedArrayView.java (1)
DerivedArrayView(36-309)core/src/main/java/io/questdb/std/Misc.java (1)
Misc(35-154)
core/src/main/java/io/questdb/griffin/engine/functions/memoization/VarcharFunctionMemoizer.java (1)
core/src/main/java/io/questdb/std/str/Utf8StringSink.java (1)
Utf8StringSink(36-207)
core/src/test/java/io/questdb/test/cairo/ArrayOrderBookTest.java (1)
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (1)
SqlCodeGenerator(350-7515)
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (18)
core/src/main/java/io/questdb/griffin/engine/functions/memoization/ArrayFunctionMemoizer.java (1)
ArrayFunctionMemoizer(37-91)core/src/main/java/io/questdb/griffin/engine/functions/memoization/BooleanFunctionMemoizer.java (1)
BooleanFunctionMemoizer(34-81)core/src/main/java/io/questdb/griffin/engine/functions/memoization/ByteFunctionMemoizer.java (1)
ByteFunctionMemoizer(34-81)core/src/main/java/io/questdb/griffin/engine/functions/memoization/CharFunctionMemoizer.java (1)
CharFunctionMemoizer(34-81)core/src/main/java/io/questdb/griffin/engine/functions/memoization/DateFunctionMemoizer.java (1)
DateFunctionMemoizer(34-81)core/src/main/java/io/questdb/griffin/engine/functions/memoization/DoubleFunctionMemoizer.java (1)
DoubleFunctionMemoizer(34-81)core/src/main/java/io/questdb/griffin/engine/functions/memoization/FloatFunctionMemoizer.java (1)
FloatFunctionMemoizer(34-81)core/src/main/java/io/questdb/griffin/engine/functions/memoization/IPv4FunctionMemoizer.java (1)
IPv4FunctionMemoizer(34-81)core/src/main/java/io/questdb/griffin/engine/functions/memoization/IntFunctionMemoizer.java (1)
IntFunctionMemoizer(34-81)core/src/main/java/io/questdb/griffin/engine/functions/memoization/Long256FunctionMemoizer.java (1)
Long256FunctionMemoizer(37-114)core/src/main/java/io/questdb/griffin/engine/functions/memoization/LongFunctionMemoizer.java (1)
LongFunctionMemoizer(34-81)core/src/main/java/io/questdb/griffin/engine/functions/memoization/ShortFunctionMemoizer.java (1)
ShortFunctionMemoizer(34-81)core/src/main/java/io/questdb/griffin/engine/functions/memoization/StrFunctionMemoizer.java (1)
StrFunctionMemoizer(35-121)core/src/main/java/io/questdb/griffin/engine/functions/memoization/SymbolFunctionMemoizer.java (1)
SymbolFunctionMemoizer(37-183)core/src/main/java/io/questdb/griffin/engine/functions/memoization/TimestampFunctionMemoizer.java (1)
TimestampFunctionMemoizer(34-82)core/src/main/java/io/questdb/griffin/engine/functions/memoization/UuidFunctionMemoizer.java (1)
UuidFunctionMemoizer(34-94)core/src/main/java/io/questdb/griffin/engine/functions/memoization/VarcharFunctionMemoizer.java (1)
VarcharFunctionMemoizer(36-122)core/src/main/java/io/questdb/cairo/ColumnType.java (1)
ColumnType(58-1164)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (35)
- GitHub Check: New pull request (Hosted Running tests on windows-other-2)
- GitHub Check: New pull request (Hosted Running tests on windows-other-1)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-2)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-1)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-other)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-sub)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-sub)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-base)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-root)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-sub)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (Trigger Enterprise CI Trigger Enterprise Pipeline)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-root)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-arm64)
- GitHub Check: New pull request (Check Changes Check changes)
- GitHub Check: build
🔇 Additional comments (50)
core/src/main/java/io/questdb/griffin/engine/functions/rnd/RndDoubleArrayFunctionFactory.java (2)
254-257: LGTM! Consistent memoization for random dimension lengths.The
shouldMemoize()override is correctly implemented and maintains consistency with theFixDimLenFuncvariant. Both fixed and random dimension length paths now support memoization, ensuring uniform behavior across thernd_double_arrayfunction family.
184-187: LGTM!The
shouldMemoize()override correctly enables caching for this random function, ensuring consistent values when referenced multiple times in the same row context (e.g.,SELECT rnd_double_array(...) as x, x[0], x[1]). Comprehensive functional tests already exist inRndMemoizationTest.testRndArray()that verify the memoization behavior with deterministic expected values.core/src/main/java/io/questdb/griffin/engine/functions/rnd/RndSymbolFunctionFactory.java (1)
129-132: No changes needed.The
shouldMemoize()returningtruefor random functions is intentional and correct design. Memoization in this context caches function results per record evaluation, not across records. Each row receives a fresh random value because the memoization cache is reset viamemoize(Record record)on every record.When a random function is referenced multiple times within the same row (e.g.,
SELECT rnd_int() x1, x1+1 x12), memoization ensures the same random value is used for all references, preventing inconsistency. This is the intended behavior introduced in PR #5778 ("enable SQL projection columns to reference preceding columns on the same projection").The comprehensive
RndMemoizationTestsuite confirms that each row produces different random values while memoization correctly handles intra-row reuse, and the deterministic random seed used in QuestDB tests ensures reproducibility.core/src/main/java/io/questdb/griffin/engine/functions/memoization/MemoizerFunction.java (1)
30-32: Clean interface design for memoization.The interface appropriately extends
UnaryFunctionto leveragegetArg()for accessing the wrapped function, and thememoize(Record)method provides a clear contract for cache invalidation. This establishes a consistent pattern for all memoizer implementations.core/src/main/java/io/questdb/griffin/engine/functions/memoization/SymbolFunctionMemoizer.java (3)
80-128: Well-structured cross-caching optimization.The logic correctly reuses cached values between
getSymbolandgetSymbolBvariants, and properly copies into separate sinks to maintain independent storage. The null handling and validity flag management is correct.
37-51: LGTM on the class structure and field initialization.The use of separate
StringSinkinstances for A and B variants prevents aliasing issues. The validity flags provide a clear cache invalidation mechanism.
148-153: Cache invalidation is correct.Resetting validity flags without clearing sinks is efficient—the sinks will be overwritten on next access anyway.
core/src/main/java/io/questdb/griffin/engine/functions/rnd/RndStrFunction.java (1)
75-78: Correct: memoization ensures consistent random values per row.For non-deterministic random functions, enabling memoization is critical to ensure that multiple references to the same column within a row return the same value. This aligns with the PR's goal of materializing functions that are referenced by other projection columns.
core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java (2)
796-806: Memoization flag usage and test isolationEnabling memoization explicitly in these three tests and asserting on
memoize(...)in the plan looks consistent with the new behavior and keeps the expectations clear.One thing to double‑check: if
allowFunctionMemoization()flips a global/static flag (e.g., on the engine or SQL code generator), make sure it’s reset between tests (viaAbstractCairoTestlifecycle or an internal try/finally) so that tests which expect non‑memoized plans don’t become order‑dependent on execution order.Also applies to: 1844-1862, 10729-10741
11535-11545: Non‑jitted async filter explicitly kept non‑memoizedThe added comment and updated expectation (
filter: l=rnd_long()) correctly document that the Async Filter path does not participate in function memoization, which aligns with the stated limitations of the current memoization implementation.core/src/test/java/io/questdb/test/griffin/SqlOptimiserTest.java (4)
81-85: Memoization toggle in test setup is appropriateCalling
super.setUp()and then forcingSqlCodeGenerator.ALLOW_FUNCTION_MEMOIZATION = truein@Before setUp()keeps all optimiser tests in this class consistently on the memoized path, matching the PR’s objectives and avoiding per-test boilerplate.
417-421: Window inner VirtualRecord now correctly memoizes reused expressionThe updated plan expectation for
testDuplicateColumnsInWindowModel(functions[hostname,ts,memoize(usage_system),usage_system1+10]) ensures the baseusage_systemexpression is computed once and reused via theusage_system1alias inusage_system1 + 10. This matches the new “materialize shared expressions” behaviour and keeps the test aligned with the optimised plan.
481-568: New basic ref-count memoization test covers key projection/order/group/filter cases
testFunctionMemoizationBasicColumnRefCountexercises:
- Projection + ORDER BY on
a + bwith memoization.- Nested subquery where an aliased
a + 1is reused multiple times.- Aggregation over repeated alias expressions.
- WHERE filter over an alias while memoizing the underlying
a + 1.The expected plans and result sets are consistent and give good coverage that repeated expressions are materialized once at the appropriate model level.
570-595: Union-model memoization test validates symmetric handling across branches
testFunctionMemoizationUnionModelverifies that each side of aUNIONindependently memoizesa + 1/c + 1in an innerVirtualRecord, while the outer projections reuse thea1alias fora1 + 1anda1 + 2. The plan shape matches the intended optimisation, and asserting only the plan here is sufficient given the simple schema.core/src/test/java/io/questdb/test/cairo/ArrayOrderBookTest.java (6)
27-39: Enabling memoization in ArrayOrderBook tests is consistent with optimiser changesImporting
SqlCodeGeneratorand settingSqlCodeGenerator.ALLOW_FUNCTION_MEMOIZATION = trueinsetUpThisTest()ensures all order-book tests run with the memoized-function code path enabled, which is required for the newmemoize(...)expectations in plan assertions.
48-52: Bid/ask spread assertion change is purely cosmeticSwitching to a text block for the expected result of
testBidAskSpreadkeeps the test clearer without changing behaviour. The SQL and expected values remain correct.
76-97: Cumulative imbalance test correctly validates memoized array_sum usage
testImbalanceCumulativenow:
- Builds a readable
sqlstring usingarray_sum(asks[2, 1:4])andarray_sum(bids[2, 1:4])with aliases.- Asserts a plan where both sums are wrapped in
memoize(...)and reused inbid_vol / ask_vol.- Verifies the resulting
ask_vol,bid_vol, and ratio against deterministic data.The slice bounds and expected numbers are consistent; this gives good coverage that array aggregation expressions are materialized once per row.
109-112: Price-level test refactor to text blocks improves readabilityConverting the expected output in
testPriceLevelForTargetVolumeto a text block while keeping the SQL unchanged makes the test easier to read and does not affect semantics; the cumulative volumes andinsertion_pointexpectations still check out.
161-165: Sudden volume drop test change is a formatting improvementFor
testSuddenVolumeDrop, moving the expected result into a text block simplifies maintenance; the query and the lag-based volume drop conditions remain unchanged and correct.
219-236: Volume dropoff test now also validates memoization of array_avg slices
testVolumeDropoffnow:
- Factors the SQL into a
sqlstring.- Asserts a plan where both
array_avg(asks[2, 1:3])andarray_avg(asks[2, 3:6])are memoized once in theVirtualRecord.- Checks that the filter
WHERE top > 3 * deepis represented equivalently as3*deep<top.The expected
top/deepvalues match the inserted data, and this test nicely covers memoization of windowed array averages plus a dependent filter.core/src/main/java/io/questdb/griffin/engine/groupby/GroupByUtils.java (2)
208-248: LGTM! Pattern matching improves readability.The use of pattern matching for
instanceof(func instanceof GroupByFunction groupByFunc) is a good modernization that eliminates the need for explicit casting on line 215 and subsequent usages.
440-448: LGTM! Consistent pattern matching usage.Same pattern matching improvement applied consistently here.
core/src/main/java/io/questdb/griffin/engine/functions/memoization/IntFunctionMemoizer.java (1)
34-80: LGTM! Clean single-value memoization implementation.The memoization pattern is straightforward and correct:
memoize(record)invalidates the cache by settingvalidValue = falsegetInt(rec)lazily computes and caches the value on first accessisThreadSafe()correctly returnsfalsesince the cache state is not synchronizedThis aligns well with the PR's goal of reducing repeated expression execution.
core/src/main/java/io/questdb/griffin/engine/table/VirtualFunctionRecordCursor.java (2)
38-74: LGTM! Type strengthening improves safety.Changing the
memoizerscollection fromObjList<Function>toObjList<MemoizerFunction>provides better type safety and makes the intent clearer. The constructor and field are consistently typed.
189-194: LGTM! Memoization trigger logic is correct.The
memoizeFunctionsmethod correctly:
- Extracts the internal join record from the virtual record
- Iterates over all memoizers and calls
memoize(joinRecord)to invalidate cached valuesThis is called from both
hasNext()andrecordAt(), ensuring memoization is properly triggered on record transitions.core/src/test/java/io/questdb/test/griffin/JsonExtractMemoizationTest.java (2)
47-64: LGTM! Comprehensive memoization test with plan verification.Good use of
assertQueryAndPlanto verify both the query results and execution plan. The plan correctly showsmemoize(json_extract()::byte)confirming the function is wrapped for memoization when referenced by other columns (res + 1).
227-273: LGTM! Additional byte-type memoization test coverage.Tests multiple JSON paths (
$.a,$.b) with the same memoization pattern, ensuring the feature works consistently across different JSON field extractions.core/src/test/java/io/questdb/test/griffin/RndMemoizationTest.java (3)
85-104: LGTM! Array memoization test with derived values.Good test coverage for
rnd_double_arraymemoization. The test verifies that:
- Array is generated once and memoized
arr[1]accesses the first element from the memoized arrayfirst_elem * 2correctly operates on the memoized valueThe null handling in expected results is also well-covered. Based on learnings, QuestDB tests use deterministic random seeds, so these assertions are reliable.
396-436: LGTM! String and symbol memoization tests.Both
testRndStrandtestRndSymbolverify that string/symbol values are properly memoized when referenced via column aliases. The pattern of applyingupper()and then concatenating with a suffix ensures the memoized value is accessed multiple times correctly.
450-469: LGTM! Varchar memoization test with Unicode support.Good test coverage for varchar memoization including Unicode characters. The expected values demonstrate that
upper()and concatenation work correctly with the memoized varchar values.core/src/main/java/io/questdb/griffin/engine/functions/memoization/CharFunctionMemoizer.java (1)
34-74: LGTM: Clean refactoring to single-value memoization.The transition from dual-slot memoization to a single-value cache with a validity flag is implemented correctly. The
getChar()method properly uses lazy evaluation, andmemoize()correctly invalidates the cache. The implementation aligns well with the newMemoizerFunctioncontract.core/src/main/java/io/questdb/griffin/SqlOptimiser.java (1)
88-88: Memoization toggle wiring into optimise() looks sensibleThe static import of
ALLOW_FUNCTION_MEMOIZATIONand the guarded call tocollectColumnRefCount(null, rewrittenModel)integrate the feature cleanly without affecting existing optimisation flows when the flag is disabled. No functional issues spotted here.Also applies to: 7491-7493
core/src/main/java/io/questdb/griffin/engine/functions/memoization/ByteFunctionMemoizer.java (1)
34-81: Clean memoization implementation.The single-value caching pattern with lazy computation and invalidation on
memoize()is correct. The explicitisThreadSafe() = falseappropriately reflects the mutable state.core/src/main/java/io/questdb/griffin/engine/table/VirtualRecordCursorFactory.java (1)
64-78: Good migration to type-based memoization detection.Using
instanceof MemoizerFunctionis cleaner than the previous flag-based approach. This ensures only explicitly wrapped memoizer functions are tracked, and the typedObjList<MemoizerFunction>improves type safety.core/src/main/java/io/questdb/griffin/engine/functions/memoization/ShortFunctionMemoizer.java (1)
34-81: Consistent implementation with other memoizers.Follows the same single-value caching pattern as other type-specific memoizers.
core/src/main/java/io/questdb/griffin/engine/functions/memoization/DateFunctionMemoizer.java (1)
34-81: Consistent implementation with other memoizers.Follows the established single-value caching pattern with appropriate
longstorage for date values.core/src/main/java/io/questdb/griffin/engine/functions/memoization/Long256FunctionMemoizer.java (1)
37-108: LGTM!The dual A/B caching pattern is correctly implemented for Long256. The cross-referencing logic between
getLong256AandgetLong256Bproperly avoids redundant computation when one value is already cached.core/src/test/java/io/questdb/test/griffin/ProjectionReferenceTest.java (2)
414-482: LGTM!Good addition of test coverage for String, Symbol, and Varchar types with projection references and ORDER BY. These tests properly exercise the memoization paths for string-like column types.
543-560: LGTM!Comprehensive test for array projection handling with nested access and arithmetic operations. The test validates that array indexing works correctly through projection references.
core/src/main/java/io/questdb/griffin/engine/functions/memoization/StrFunctionMemoizer.java (1)
35-121: LGTM!Well-structured implementation following the established memoization pattern. The dual A/B caching with cross-referencing and proper null handling is correctly implemented.
core/src/main/java/io/questdb/griffin/engine/functions/memoization/LongFunctionMemoizer.java (1)
34-81: LGTM!Clean and correct implementation of single-value caching for the long primitive type.
core/src/main/java/io/questdb/griffin/engine/functions/memoization/FloatFunctionMemoizer.java (1)
34-81: LGTM!Consistent implementation following the same single-value caching pattern as other primitive type memoizers.
core/src/main/java/io/questdb/griffin/engine/functions/memoization/ArrayFunctionMemoizer.java (1)
46-50: LGTM on resource cleanup.The
close()method correctly chains toMemoizerFunction.super.close()and frees thederivedArrayusingMisc.free(), ensuring proper resource management.core/src/main/java/io/questdb/griffin/engine/functions/memoization/VarcharFunctionMemoizer.java (1)
36-122: LGTM!Implementation is consistent with
StrFunctionMemoizer, properly adapted forUtf8Sequencetypes. The dual A/B caching with cross-referencing and null handling is correctly implemented.core/src/main/java/io/questdb/griffin/engine/functions/memoization/IPv4FunctionMemoizer.java (1)
34-81: LGTM!Consistent implementation following the single-value caching pattern for the IPv4 type (represented as int).
core/src/main/java/io/questdb/griffin/engine/functions/memoization/BooleanFunctionMemoizer.java (1)
34-81: LGTM! Clean single-value memoization implementation.The caching logic is correct:
validValuedefaults tofalse, ensuring first access computes the valuememoize()properly invalidates the cache for each new recordisThreadSafe()correctly returnsfalsefor the mutable state- Delegation patterns for
init()andsupportsRandomAccess()are appropriatecore/src/main/java/io/questdb/griffin/engine/functions/memoization/TimestampFunctionMemoizer.java (1)
34-82: LGTM! Consistent with the memoization pattern.Implementation correctly mirrors
BooleanFunctionMemoizerfor timestamp values:
- Type metadata preserved via
super(fn.getType())in constructorvalidValueguard ensures the default0Lvalue is never returned without computation- Same correct invalidation and caching flow
core/src/main/java/io/questdb/griffin/engine/functions/memoization/DoubleFunctionMemoizer.java (1)
34-37: Single-value memoization logic looks soundThe transition to a single cached
doubleguarded byvalidValue(getDoublecomputing once per memoization cycle andmemoizeinvalidating the cache) is consistent with the MemoizerFunction contract and should correctly avoid repeated evaluation when multiple projection columns depend on the same underlying function. The lazy evaluation ingetDouble(Record)and the simple invalidation inmemoize(Record)keep this fast and easy to reason about, andisThreadSafe() == falseaccurately reflects the stateful nature.Also applies to: 49-55, 73-75
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (2)
150-166: Memoizer imports align with usageThe added imports cover all memoizer types used later in
generateSelectVirtualWithSubQuery(numeric, temporal, string, symbol, varchar, array, UUID, IPv4, Long256). No issues here.
5580-5585: PassingvirtualColumnReservedSlotsintoVirtualRecordCursorFactorylooks consistentThe new
virtualColumnReservedSlots = columnCount + 1is used both to sizePriorityMetadataand as the final argument toVirtualRecordCursorFactory, ensuring the factory knows about the maximum number of virtual slots (projected columns plus optional hidden timestamp). This is coherent with the way the metadata is constructed.
core/src/main/java/io/questdb/griffin/engine/functions/memoization/ArrayFunctionMemoizer.java
Show resolved
Hide resolved
core/src/main/java/io/questdb/griffin/engine/functions/memoization/SymbolFunctionMemoizer.java
Show resolved
Hide resolved
[PR Coverage check]😍 pass : 383 / 435 (88.05%) file detail
|
This PR will materialize function in the same-level
queryModelthat are depended on by other projection columns. Note that it does not collect references from its parentQueryModeland dependencies from the currentqueryModelsorderByandwhereClause.