perf(sql): optimize ORDER BY on computed expressions with sort key materialization#6793
perf(sql): optimize ORDER BY on computed expressions with sort key materialization#6793bluestreak01 merged 13 commits intomasterfrom
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughThis PR introduces a function complexity framework that categorizes computational cost for SQL expressions, then uses it to selectively materialize sort key columns during ORDER BY operations when complexity exceeds a configurable threshold. A new configuration property Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
core/src/main/java/io/questdb/griffin/engine/functions/regex/RegexpReplaceStrFunctionFactory.java (1)
204-240:⚠️ Potential issue | 🟡 Minor
StringBuilderSinkshould not be public—no external code references this class.Verification confirms zero external usages of
StringBuilderSinkoutsideRegexpReplaceStrFunctionFactory.java. This class is used only as internal state withinRegexpReplaceStrFunction(lines 87–88). The public methodgetStr(Record, StringBuilderSink)is a helper called only internally. Promoting this implementation detail topublicunnecessarily enlarges the API surface. Revert topackage-private(remove thepublicmodifier).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/regex/RegexpReplaceStrFunctionFactory.java` around lines 204 - 240, The nested helper class StringBuilderSink in RegexpReplaceStrFunctionFactory is declared public but is only used internally (e.g., as state in RegexpReplaceStrFunction and by the internal getStr(Record, StringBuilderSink) helper); change its visibility from public to package-private by removing the public modifier on the StringBuilderSink class declaration so it is no longer part of the public API while leaving all methods and behavior intact.core/src/main/java/io/questdb/griffin/engine/functions/math/PowDoubleFunctionFactory.java (1)
48-78:⚠️ Potential issue | 🟠 MajorOverride
getComplexity()forMath.pow()— currently inherits COMPLEXITY_ARITHMETIC (2).
PowDoubleFunctionFactory.Funcdoes not overridegetComplexity()and inherits the default fromArithmeticBinaryFunction, which returnsFunction.addComplexity(COMPLEXITY_ARITHMETIC, ...)whereCOMPLEXITY_ARITHMETIC = 2.
Math.pow()is significantly more expensive than basic arithmetic operations (+, −, ×, ÷) — on HotSpot JVM it typically compiles to a ln+exp sequence rather than a single instruction. The codebase pattern for expensive operations is to overridegetComplexity()with a higher constant:GeoDistanceMetersFunctionFactoryusesCOMPLEXITY_GEO (30), andJsonExtractFunctionusesCOMPLEXITY_JSON (80). This ensures the sort key materialization optimization triggers appropriately based on the actual cost.Override
getComplexity()to return a value reflecting transcendental operation cost (e.g.,COMPLEXITY_STRING_OP (10)as a minimum, or consider a new constant closer toCOMPLEXITY_GEO (30)if one is added for transcendental functions).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/math/PowDoubleFunctionFactory.java` around lines 48 - 78, PowDoubleFunctionFactory.Func currently inherits the default arithmetic complexity; add an override of getComplexity() in the Func class to return a higher cost for Math.pow (e.g., return COMPLEXITY_STRING_OP or another appropriate constant like COMPLEXITY_GEO) so the planner treats this transcendental operation as expensive; update the Func class (inside PowDoubleFunctionFactory) to implement public int getComplexity() and return the chosen complexity constant.core/src/main/java/io/questdb/cairo/CairoConfigurationWrapper.java (1)
1085-1097:⚠️ Potential issue | 🟡 Minor
getSqlSortKeyMaterializationThresholdis out of alphabetical order — same issue as inCairoConfiguration.The method should appear before
getSqlSortKeyMaxPages()("Mat..." < "Max...").🔧 Proposed fix
- `@Override` - public int getSqlSortKeyMaxPages() { - return getDelegate().getSqlSortKeyMaxPages(); - } - `@Override` public int getSqlSortKeyMaterializationThreshold() { return getDelegate().getSqlSortKeyMaterializationThreshold(); } + `@Override` + public int getSqlSortKeyMaxPages() { + return getDelegate().getSqlSortKeyMaxPages(); + } + `@Override` public long getSqlSortKeyPageSize() {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/cairo/CairoConfigurationWrapper.java` around lines 1085 - 1097, Reorder the three accessor methods so they are alphabetical: move getSqlSortKeyMaterializationThreshold() so it appears before getSqlSortKeyMaxPages() (i.e., place getSqlSortKeyMaterializationThreshold() then getSqlSortKeyMaxPages() then getSqlSortKeyPageSize()); ensure you only change the ordering in CairoConfigurationWrapper (preserve method bodies calling getDelegate()) and run quick compile to verify no other references are affected.core/src/main/java/io/questdb/PropServerConfiguration.java (1)
4208-4217:⚠️ Potential issue | 🟡 MinorPlace
getSqlSortKeyMaterializationThreshold()in alphabetical order with other getSqlSortKey methods.*It should be ordered before
getSqlSortKeyMaxPages()to follow the member ordering rule.Suggested ordering
- `@Override` - public int getSqlSortKeyMaxPages() { - return sqlSortKeyMaxPages; - } - - `@Override` - public int getSqlSortKeyMaterializationThreshold() { - return sqlSortKeyMaterializationThreshold; - } + `@Override` + public int getSqlSortKeyMaterializationThreshold() { + return sqlSortKeyMaterializationThreshold; + } + + `@Override` + public int getSqlSortKeyMaxPages() { + return sqlSortKeyMaxPages; + }As per coding guidelines, “Java class members must be grouped by kind (static vs. instance) and visibility, and sorted alphabetically.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/PropServerConfiguration.java` around lines 4208 - 4217, The two accessor methods for SQL sort key properties are out of alphabetical order; move getSqlSortKeyMaterializationThreshold() so it appears before getSqlSortKeyMaxPages() among the other getSqlSortKey* methods to comply with member ordering rules (locate the methods getSqlSortKeyMaterializationThreshold() and getSqlSortKeyMaxPages() and reorder them accordingly).core/src/main/java/io/questdb/griffin/engine/functions/math/SubIntFunctionFactory.java (1)
45-45:⚠️ Potential issue | 🟡 MinorRename
configuration1→configuration.Every other
FunctionFactory.newInstance()implementation in the codebase usesconfiguration. The1suffix is a typo.🔧 Proposed fix
- public Function newInstance(int position, ObjList<Function> args, IntList argPositions, CairoConfiguration configuration1, SqlExecutionContext sqlExecutionContext) { + public Function newInstance(int position, ObjList<Function> args, IntList argPositions, CairoConfiguration configuration, SqlExecutionContext sqlExecutionContext) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/math/SubIntFunctionFactory.java` at line 45, Rename the method parameter named configuration1 to configuration in SubIntFunctionFactory.newInstance to match the standard FunctionFactory.newInstance signature used across the codebase; update the parameter name in the method declaration and any usages inside the method body (referencing SubIntFunctionFactory.newInstance) so compilation and naming consistency are preserved.core/src/main/java/io/questdb/griffin/engine/functions/math/SignDecimalFunctionFactory.java (1)
62-98:⚠️ Potential issue | 🟠 Major
Decimal128Funcis missingisThreadSafe() = false— data race on thedecimal128scratch buffer.
decimal128is a mutable instance field written byfunction.getDecimal128(rec, decimal128)and then read byisNull()/isZero()/isNegative()on everygetDecimal8call. WithoutisThreadSafe() = false, the engine may share this instance across threads, causing concurrent mutations to corrupt each other's results.Both
Decimal256Funcin this same file (line 177) andAbsDecimalFunctionFactory.Decimal128Funccorrectly overrideisThreadSafe() = falsefor exactly this reason.🐛 Proposed fix
`@Override` public String getName() { return "sign"; } + + `@Override` + public boolean isThreadSafe() { + return false; + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/math/SignDecimalFunctionFactory.java` around lines 62 - 98, The Decimal128Func class uses a mutable instance field decimal128 that is written by function.getDecimal128(rec, decimal128) and read in getDecimal8, so add an override of isThreadSafe() returning false on Decimal128Func (same pattern as Decimal256Func and AbsDecimalFunctionFactory.Decimal128Func) to prevent the engine from sharing the mutable scratch buffer across threads; update the Decimal128Func class to include the isThreadSafe() method that returns false.
🧹 Nitpick comments (15)
core/src/main/java/io/questdb/griffin/engine/functions/math/ExpDoubleFunctionFactory.java (1)
49-49:functionfield should beprivate.The field is package-private, but since
ExpFunctionis aprivate staticinner class there is no defensive need for package-level visibility. Making itprivateis more idiomatic.♻️ Proposed fix
- final Function function; + private final Function function;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/math/ExpDoubleFunctionFactory.java` at line 49, The field "function" in ExpDoubleFunctionFactory is package-private but should be private; update the declaration of the field named function in class ExpDoubleFunctionFactory to be private (i.e., change its visibility to private final) to match the private static inner class ExpFunction and follow idiomatic encapsulation.core/src/main/java/io/questdb/griffin/engine/functions/math/SqrtDoubleFunctionFactory.java (1)
48-70: All arithmetic unary functions uniformly use the default complexity fromArithmeticUnaryFunction.The concern about
sqrtbeing a transcendental/expensive operation is architecturally valid, but this applies to all transcendental math functions (log,exp,sin,cos,tan,asin,acos,atan), not justsqrt. None of them overridegetComplexity(). SinceFunction.COMPLEXITY_ARITHMETIC = 2is uniformly applied across all arithmetic unary operations via the interface default, this appears to be an intentional design choice. If complexity stratification for transcendental operations is desired, it would need to be addressed across all such functions consistently, not on a per-function basis.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/math/SqrtDoubleFunctionFactory.java` around lines 48 - 70, The review points out that overriding complexity only in SqrtFunction is inconsistent; either keep the current uniform default from ArithmeticUnaryFunction or implement a consistent strategy for all transcendental ops. Fix by removing any per-function getComplexity override in SqrtFunction (leave SqrtFunction as shown using the ArithmeticUnaryFunction default) or, if you want higher complexity for transcendental ops, introduce a new marker/interface (e.g., TranscendentalUnaryFunction or change ArithmeticUnaryFunction default) and update all transcendental implementations (SqrtFunction, log/exp/sin/cos/tan/asin/acos/atan classes) to implement that so complexity is applied consistently across those functions (refer to SqrtFunction, ArithmeticUnaryFunction, and getComplexity).core/src/main/java/io/questdb/cairo/sql/Function.java (1)
138-140:getComplexity()is placed out of alphabetical order.Per the coding guideline, instance methods should be sorted alphabetically.
getComplexity()(line 138) is placed beforegetArray()(line 148), but alphabeticallygetArray<getComplexity. It should be betweengetChar()andgetDate().Move getComplexity() to its correct alphabetical position
Move the
getComplexity()default method to betweengetChar(Record)(currently around line 158) andgetDate(Record)(currently around line 160).As per coding guidelines: "Java class members must be grouped by kind (static vs. instance) and visibility, and sorted alphabetically."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/cairo/sql/Function.java` around lines 138 - 140, Move the default instance method getComplexity() so instance methods remain alphabetically ordered: remove the existing getComplexity() definition and reinsert it between getChar(Record) and getDate(Record) in the Function interface; ensure the method signature and return value (return COMPLEXITY_COLUMN) are unchanged and that imports/formatting remain consistent.core/src/main/java/io/questdb/griffin/engine/functions/str/SubStringFunctionFactory.java (1)
94-97:getComplexity()should followgetCenter()alphabetically.
getCenter(line 100) sorts beforegetComplexitylexicographically ("getCe" < "getCo"), butgetComplexityis placed first. Swap the order to maintain the alphabetical convention.Reorder methods
- `@Override` - public int getComplexity() { - return Function.addComplexity(COMPLEXITY_STRING_OP, TernaryFunction.super.getComplexity()); - } - `@Override` public Function getCenter() { return startFunc; } + `@Override` + public int getComplexity() { + return Function.addComplexity(COMPLEXITY_STRING_OP, TernaryFunction.super.getComplexity()); + } + `@Override` public Function getLeft() {As per coding guidelines: "Java class members must be grouped by kind (static vs. instance) and visibility, and sorted alphabetically."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/str/SubStringFunctionFactory.java` around lines 94 - 97, In SubStringFunctionFactory, reorder the instance methods so that getCenter() appears before getComplexity() to match alphabetical ordering of method names; locate the two methods named getCenter and getComplexity in the SubStringFunctionFactory class and swap their positions (keeping their implementations unchanged) so getCenter() is declared prior to getComplexity().core/src/main/java/io/questdb/griffin/engine/functions/CursorFunction.java (1)
54-57:getComplexity()placed beforegetArray()— same alphabetical ordering issue.Same pattern as in
Function.java:getComplexity()should be positioned betweengetChar()andgetDate()to maintain alphabetical ordering.As per coding guidelines: "Java class members must be grouped by kind (static vs. instance) and visibility, and sorted alphabetically."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/CursorFunction.java` around lines 54 - 57, The getComplexity() method in CursorFunction is out of alphabetical order — move the public int getComplexity() { return COMPLEXITY_SUBQUERY; } method so it sits after getChar() and before getDate() (i.e., between getChar() and getDate()) to comply with the class members grouping/sorting rule; ensure visibility and static/instance grouping remain unchanged when relocating the method.core/src/main/java/io/questdb/griffin/engine/functions/str/ReplaceStrFunctionFactory.java (1)
96-99:getComplexity()should followgetCenter()alphabetically.Same issue as in
SubStringFunctionFactory:getCentersorts beforegetComplexitylexicographically. Swap the order.Reorder methods
- `@Override` - public int getComplexity() { - return Function.addComplexity(COMPLEXITY_STRING_OP, TernaryFunction.super.getComplexity()); - } - `@Override` public Function getCenter() { return oldSubStr; } + `@Override` + public int getComplexity() { + return Function.addComplexity(COMPLEXITY_STRING_OP, TernaryFunction.super.getComplexity()); + } + `@Override` public Function getLeft() {As per coding guidelines: "Java class members must be grouped by kind (static vs. instance) and visibility, and sorted alphabetically."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/str/ReplaceStrFunctionFactory.java` around lines 96 - 99, Move the getComplexity() method so it appears after getCenter() within the ReplaceStrFunctionFactory class to satisfy alphabetical ordering of instance methods; ensure you retain the existing implementation (returning Function.addComplexity(COMPLEXITY_STRING_OP, TernaryFunction.super.getComplexity())) and keep method visibility unchanged so grouping and sorting rules for class members are preserved.core/src/main/java/io/questdb/griffin/engine/functions/math/BitwiseAndLongFunctionFactory.java (1)
55-85: Semantic concern valid, but implementation is intentionally consistent across all bitwise operations.Bitwise AND is indeed a logical/bitwise operation, not an arithmetic one—this semantic distinction is correct. However, the codebase shows that this design is intentional and applied consistently: all bitwise operations (
&,|,^for Long/Int/IPv4 variants, and~for unary) implementArithmeticBinaryFunctionorArithmeticUnaryFunction.
ArithmeticBinaryFunctionaddsCOMPLEXITY_ARITHMETICto the cost model, grouping bitwise operations with arithmetic ones based on comparable computational cost rather than semantic category. The suggestedBitwiseBinaryFunctionsub-interface does not exist in the codebase, and creating one is not necessary for consistency—all bitwise operations already follow the same pattern.If the complexity-based grouping is the intended design, the naming remains semantically imprecise, but this is a codebase-wide pattern, not an issue specific to this change.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/math/BitwiseAndLongFunctionFactory.java` around lines 55 - 85, The implementation intentionally uses ArithmeticBinaryFunction for bitwise ops; update BitAndLongFunction (and similar bitwise classes) to add a brief clarifying comment explaining that bitwise operations (e.g., BitAndLongFunction) are grouped under ArithmeticBinaryFunction to reflect complexity-based costing (COMPLEXITY_ARITHMETIC) rather than semantic arithmetic, so future maintainers understand this naming/design choice and don't introduce a BitwiseBinaryFunction; place the comment near the class declaration and mention ArithmeticBinaryFunction and COMPLEXITY_ARITHMETIC.core/src/main/java/io/questdb/griffin/engine/functions/conditional/CaseFunction.java (1)
32-35: Consider adding a named constant for the CASE complexity value.CaseFunction uses literal
5here, but all other functions in the codebase (ReplaceStrFunction, SubStringFunction, MatchStrFunction, ArithmeticUnaryFunction, etc.) use named constants likeCOMPLEXITY_REGEX,COMPLEXITY_STRING_OP, andCOMPLEXITY_ARITHMETIC. Define a constant likeCOMPLEXITY_CASE = 5in the Function interface to maintain consistency.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/conditional/CaseFunction.java` around lines 32 - 35, The CaseFunction.getComplexity implementation uses a magic literal 5; add a named constant COMPLEXITY_CASE = 5 to the Function interface and replace the literal in CaseFunction with Function.COMPLEXITY_CASE when calling Function.addComplexity(Function.COMPLEXITY_CASE, MultiArgFunction.super.getComplexity()); update any imports or references if necessary so CaseFunction, Function.addComplexity, and MultiArgFunction.super.getComplexity() compile against the new constant.core/src/main/java/io/questdb/griffin/engine/functions/geo/GeoDistanceMetersFunctionFactory.java (1)
194-197: Child complexity is not propagated in specialized geo-distance functions.
ConstPoint1GeoDistanceFunctionandGeoDistanceFunctionextendDoubleFunctionand manually manage child functions (lat2Func,lon2Func, etc.) without implementing arity interfaces likeBinaryFunctionorUnaryFunction. TheirgetComplexity()overrides return a flatCOMPLEXITY_GEO(30) without aggregating child complexity. This design diverges from the aggregation pattern used by arity interfaces, where child complexity is summed viaFunction.addComplexity().The practical impact is minimal because
COMPLEXITY_GEOexceeds the materialization threshold (3), so the sort key will be materialized regardless. However, if the complexity model is extended for finer-grained ranking beyond threshold checks, this could undercount actual expression cost.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/geo/GeoDistanceMetersFunctionFactory.java` around lines 194 - 197, The getComplexity() overrides in ConstPoint1GeoDistanceFunction and GeoDistanceFunction return a flat COMPLEXITY_GEO and do not include their child functions' complexity; update each getComplexity() to aggregate child complexities by calling Function.addComplexity(...) (or summing child.getComplexity()) for their managed child functions (e.g., lat2Func, lon2Func, latFunc, lonFunc, pointFunc) and return COMPLEXITY_GEO plus that aggregated value so the overall expression cost includes children.core/src/main/java/io/questdb/griffin/engine/functions/math/RadiansDoubleFunctionFactory.java (1)
33-33: Stray blank line left by removedUnaryFunctionimport.The blank line at line 33 is a formatting artifact; removing it keeps the import block tidy.
🧹 Suggested cleanup
import io.questdb.griffin.engine.functions.DoubleFunction; - import io.questdb.griffin.engine.functions.constants.DoubleConstant;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/math/RadiansDoubleFunctionFactory.java` at line 33, Remove the stray blank line in the import block of RadiansDoubleFunctionFactory.java left after removing the UnaryFunction import; open the import section in RadiansDoubleFunctionFactory and delete the empty line so imports are contiguous and properly formatted (look for the class RadiansDoubleFunctionFactory and the surrounding import declarations to locate the spot).core/src/test/java/io/questdb/test/griffin/GroupByTest.java (1)
2738-2785: Same applies here — plan expectation correctly updated.
testLiftAliasesFromInnerSelect9andtestGroupByWithAliasClash6appear to test an identical query (same DDL, same SQL, same expected rows, same expected plan). If this duplication is pre-existing it's fine to leave it, but it may be an opportunity to consolidate.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/GroupByTest.java` around lines 2738 - 2785, Duplicate test detected: testLiftAliasesFromInnerSelect9 appears identical to testGroupByWithAliasClash6 (same DDL, SQL, expected rows and plan). Either remove one of the tests or consolidate them by extracting the shared setup and assertions into a single helper method (e.g., createTestForAliasClash(query, expectedPlan, expectedRows)) and call it from both test names if both must remain; update or remove the redundant test (testLiftAliasesFromInnerSelect9 or testGroupByWithAliasClash6) and ensure the remaining test keeps the same execute/create table calls, assertPlanNoLeakCheck, and assertQueryNoLeakCheck expectations.core/src/main/java/io/questdb/griffin/engine/functions/math/SubIntFunctionFactory.java (1)
49-97: AddisOperator()override toSubtractIntVVFunc.
SubtractIntVVFuncshould overrideisOperator()to returntrue, consistent with other arithmetic binary operators. For example,MulIntFunctionFactory'sFuncclass explicitly overridesisOperator()to returntrue. The default implementation inFunctionreturnsfalse, causing the subtraction operator to be misclassified as a non-operator, which is inconsistent with similar operations and may impacttoPlan()behavior and future logic that depends on this flag.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/math/SubIntFunctionFactory.java` around lines 49 - 97, SubtractIntVVFunc is missing an override of isOperator() so it inherits the default false; add a public boolean isOperator() override in SubtractIntVVFunc that returns true to classify this binary subtraction function as an operator (mirror how MulIntFunctionFactory's Func implements isOperator()), ensuring to place the method within the SubtractIntVVFunc class alongside other overrides.core/src/test/java/io/questdb/test/griffin/engine/orderby/OrderBySortKeyMaterializationTest.java (2)
1629-1658: Clarify the materialization exclusion criteria for IPv4.The comment says "IPv4 is excluded from materialization (no arithmetic on this type)" but IPv4 is a fixed-size 4-byte type. The actual exclusion reason appears to be a design decision in the materializer (perhaps only numeric/temporal/boolean/geo/decimal types are routed), not a size constraint. Meanwhile, UUID exclusion at Line 1772 is explained as "> Long.BYTES" which is a different criterion. Consider aligning the comments to reference the actual filtering logic in
SortKeyMaterializingRecordCursor/MaterializedRecordfor clarity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/orderby/OrderBySortKeyMaterializationTest.java` around lines 1629 - 1658, Update the inline comment in OrderBySortKeyMaterializationTest to accurately reflect the materializer's filtering logic: replace "IPv4 is excluded from materialization (no arithmetic on this type)" with a brief note that IPv4 is excluded because the materializer routes only certain categories (numeric/temporal/boolean/geo/decimal) for materialization per the logic in SortKeyMaterializingRecordCursor/MaterializedRecord, and similarly adjust the UUID comment to reference the size-based rule (> Long.BYTES) used by the same materializer code so both comments consistently point to the actual filtering criteria in SortKeyMaterializingRecordCursor/MaterializedRecord.
1748-1758: DifferentassertQueryNoLeakCheckoverload used only here.This test uses the 5-argument overload
assertQueryNoLeakCheck(expected, query, "x", true, true)while all other tests use the simpler 2-argument form. If the extra parameters (expected timestamp column, supportsRandomAccess flags) are needed specifically for TIMESTAMP sort-key verification, a brief inline comment explaining why would help future maintainers.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/orderby/OrderBySortKeyMaterializationTest.java` around lines 1748 - 1758, The test call to assertQueryNoLeakCheck is the only one using the 5-argument overload (assertQueryNoLeakCheck(expected, query, "x", true, true)); either revert it to the standard 2-argument form assertQueryNoLeakCheck(expected, query) if the extra parameters are not required, or keep the 5-arg call but add a one-line inline comment immediately above it explaining that the third and two boolean flags are intentionally supplied to verify TIMESTAMP sort-key behavior (i.e., specify expected timestamp column "x" and supportsRandomAccess=true flags) so future maintainers understand why this test differs from others.core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyMaterializingRecordCursor.java (1)
125-144: Missing guard for the-1Lno-entry sentinel returned byrowIdToOrdinal.
rowIdToOrdinalwas constructed withnoEntryValue = -1L. Whenmaterialized = true,rowIdToOrdinal.get(rowId)silently returns-1Lfor any rowId not in the map — including the edge case wherebaseRecord.getRowId()equalsLong.MIN_VALUE(thenoEntryKey, whichputsilently ignores). Passing ordinal-1LtorecordA.setOrdinal()causesMaterializedRecordto compute a negative buffer offset and read native memory before the buffer's base address.The same pattern repeats at line 178 inside
recordAt().While in the normal two-pass flow this should never fire, the invariant is invisible to the JVM and failures would manifest as silent data corruption or a native crash. An assertion or explicit check at both sites would make the invariant explicit and the failure mode detectable:
♻️ Proposed defensive guard
if (materialized) { ordinal = rowIdToOrdinal.get(rowId); + assert ordinal != -1L : "rowId not found in materialization map: " + rowId; } else {baseCursor.recordAt(((MaterializedRecord) record).getBaseRecord(), atRowId); final long ordinal = rowIdToOrdinal.get(atRowId); + assert ordinal != -1L : "rowId not found in materialization map: " + atRowId; ((MaterializedRecord) record).setOrdinal(ordinal);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyMaterializingRecordCursor.java` around lines 125 - 144, In SortKeyMaterializingRecordCursor, guard against the rowIdToOrdinal no-entry sentinel (-1L) returned by rowIdToOrdinal.get(rowId) before passing the value into recordA.setOrdinal: in hasNext() (where materialized is true) check if ordinal == -1L and throw an IllegalStateException (or assert) with a clear message referencing rowId and that materialization invariant was violated; apply the same defensive check in recordAt() where rowIdToOrdinal.get(...) is used, and ensure any code paths that compute/put ordinals (nextOrdinal/rowIdToOrdinal.put(...) and appendValue(...)) remain unchanged — this makes the invariant explicit and fails fast instead of computing negative offsets in MaterializedRecord.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/src/main/java/io/questdb/cairo/CairoConfiguration.java`:
- Around line 637-641: Members are out of alphabetical order: move the
declaration of getSqlSortKeyMaterializationThreshold() so it appears before
getSqlSortKeyMaxPages() in CairoConfiguration (keep getSqlSortKeyPageSize()
position unchanged); locate the three methods
getSqlSortKeyMaterializationThreshold, getSqlSortKeyMaxPages, and
getSqlSortKeyPageSize and reorder them so
"getSqlSortKeyMaterializationThreshold" comes before "getSqlSortKeyMaxPages" to
comply with the class member alphabetical ordering rule.
In `@core/src/main/java/io/questdb/cairo/DefaultCairoConfiguration.java`:
- Around line 1128-1131: The new method getSqlSortKeyMaterializationThreshold()
is out of alphabetical order; move the entire method (public int
getSqlSortKeyMaterializationThreshold() { return 3; }) so it appears before
getSqlSortKeyMaxPages() in DefaultCairoConfiguration.java, preserving its
signature and body and leaving surrounding formatting intact.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/cast/AbstractCastToDecimalFunction.java`:
- Line 37: The class AbstractCastToDecimalFunction no longer overrides
isThreadSafe() and thus may incorrectly inherit thread-safe status from
UnaryFunction despite sharing a mutable protected final Decimal256 decimal
buffer; restore an override public boolean isThreadSafe() that returns false in
AbstractCastToDecimalFunction so all cast-to-decimal implementations (e.g.,
ScaledDecimal8FuncAbstract, UnscaledNarrowDecimal128FuncAbstract) correctly
report non-thread-safe behavior and avoid race conditions when using the shared
Decimal256 decimal buffer; place the override in AbstractCastToDecimalFunction
(not in each concrete subclass) to centralize the fix.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/json/JsonExtractFunction.java`:
- Around line 225-228: Move the getComplexity() method so class members remain
alphabetically ordered: remove the getComplexity() definition from after
getFloat() and reinsert it between getChar() and getDate() so the sequence is
getChar(), getComplexity(), getDate(); ensure method visibility and
implementation remain unchanged.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/math/IPv4MinusIPv4FunctionFactory.java`:
- Line 56: The inner implementation class Func should be made private to match
other factories; change the declaration of the inner class Func in
IPv4MinusIPv4FunctionFactory from package-private "static class Func" to
"private static class Func" so the implementation detail is not exposed, and
verify no external code relies on accessing IPv4MinusIPv4FunctionFactory.Func
directly (update imports/usages if any).
In
`@core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyMaterializingRecordCursor.java`:
- Around line 44-45: Class member declarations in
SortKeyMaterializingRecordCursor are out of alphabetical order: swap the two
instance fields so "buffers" is declared before "bufferToColIndex". Locate the
fields named buffers (MemoryCARW[] buffers) and bufferToColIndex (int[]
bufferToColIndex) and reorder their declarations so members remain grouped by
kind/visibility and sorted alphabetically.
- Around line 76-99: The constructor can leak CARW buffers if Vm.getCARWInstance
throws mid-loop; move the rowIdToOrdinal initialization to before the buffers
allocation loop, wrap the loop that calls Vm.getCARWInstance(...) in a
try/catch, call freeBuffers() in the catch to release any partially allocated
buffers and then rethrow the exception, and ensure freeBuffers() can safely
handle a partially-initialized buffers array; also update close() to null-guard
each buffer entry (check buffers[i] != null before calling truncate()) so
truncate() won't NPE on partially-initialized buffers.
- Around line 200-232: The appendValue method in
SortKeyMaterializingRecordCursor fails to handle ColumnType.SYMBOL causing an
UnsupportedOperationException for materialized SYMBOL columns; update the switch
in appendValue (which uses bufferToColIndex, buffers, and colTypes) to add a
SYMBOL branch that writes the symbol index into the buffer by calling
buf.putInt(record.getSymbolIndex(colIndex)) so SYMBOL values are materialized
like other 4-byte types.
In
`@core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyMaterializingRecordCursorFactory.java`:
- Around line 82-84: The method recordCursorSupportsRandomAccess() in
SortKeyMaterializingRecordCursorFactory should delegate to the underlying base
cursor capability instead of returning true unconditionally; change the
implementation to call base.recordCursorSupportsRandomAccess() (or the
appropriate field/method on the wrapped cursor exposed by this factory) so that
SortKeyMaterializingRecordCursorFactory.recordCursorSupportsRandomAccess()
reflects the actual support used by recordAt() and non-materialized column
access, consistent with other wrappers like LimitRecordCursorFactory and
RecordAsAFieldRecordCursorFactory.
In `@core/src/main/java/io/questdb/PropServerConfiguration.java`:
- Around line 490-492: The three private final fields in PropServerConfiguration
are out of alphabetical order; move sqlSortKeyMaterializationThreshold so it
appears before sqlSortKeyMaxPages within the same private-final field block
(keeping sqlSortKeyPageSize position intact) to comply with the member ordering
rule; update only the field declaration order
(sqlSortKeyMaterializationThreshold, sqlSortKeyMaxPages, sqlSortKeyPageSize) in
the PropServerConfiguration class.
---
Outside diff comments:
In `@core/src/main/java/io/questdb/cairo/CairoConfigurationWrapper.java`:
- Around line 1085-1097: Reorder the three accessor methods so they are
alphabetical: move getSqlSortKeyMaterializationThreshold() so it appears before
getSqlSortKeyMaxPages() (i.e., place getSqlSortKeyMaterializationThreshold()
then getSqlSortKeyMaxPages() then getSqlSortKeyPageSize()); ensure you only
change the ordering in CairoConfigurationWrapper (preserve method bodies calling
getDelegate()) and run quick compile to verify no other references are affected.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/math/PowDoubleFunctionFactory.java`:
- Around line 48-78: PowDoubleFunctionFactory.Func currently inherits the
default arithmetic complexity; add an override of getComplexity() in the Func
class to return a higher cost for Math.pow (e.g., return COMPLEXITY_STRING_OP or
another appropriate constant like COMPLEXITY_GEO) so the planner treats this
transcendental operation as expensive; update the Func class (inside
PowDoubleFunctionFactory) to implement public int getComplexity() and return the
chosen complexity constant.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/math/SignDecimalFunctionFactory.java`:
- Around line 62-98: The Decimal128Func class uses a mutable instance field
decimal128 that is written by function.getDecimal128(rec, decimal128) and read
in getDecimal8, so add an override of isThreadSafe() returning false on
Decimal128Func (same pattern as Decimal256Func and
AbsDecimalFunctionFactory.Decimal128Func) to prevent the engine from sharing the
mutable scratch buffer across threads; update the Decimal128Func class to
include the isThreadSafe() method that returns false.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/math/SubIntFunctionFactory.java`:
- Line 45: Rename the method parameter named configuration1 to configuration in
SubIntFunctionFactory.newInstance to match the standard
FunctionFactory.newInstance signature used across the codebase; update the
parameter name in the method declaration and any usages inside the method body
(referencing SubIntFunctionFactory.newInstance) so compilation and naming
consistency are preserved.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/regex/RegexpReplaceStrFunctionFactory.java`:
- Around line 204-240: The nested helper class StringBuilderSink in
RegexpReplaceStrFunctionFactory is declared public but is only used internally
(e.g., as state in RegexpReplaceStrFunction and by the internal getStr(Record,
StringBuilderSink) helper); change its visibility from public to package-private
by removing the public modifier on the StringBuilderSink class declaration so it
is no longer part of the public API while leaving all methods and behavior
intact.
In `@core/src/main/java/io/questdb/PropServerConfiguration.java`:
- Around line 4208-4217: The two accessor methods for SQL sort key properties
are out of alphabetical order; move getSqlSortKeyMaterializationThreshold() so
it appears before getSqlSortKeyMaxPages() among the other getSqlSortKey* methods
to comply with member ordering rules (locate the methods
getSqlSortKeyMaterializationThreshold() and getSqlSortKeyMaxPages() and reorder
them accordingly).
---
Duplicate comments:
In
`@core/src/main/java/io/questdb/griffin/engine/functions/cast/AbstractCastToIntFunction.java`:
- Line 34: The review indicates no changes are required — the class
AbstractCastToIntFunction correctly extends IntFunction and implements
CastFunction and is consistent with other AbstractCastTo* classes; no code
modifications needed, simply keep AbstractCastToIntFunction as declared.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/cast/AbstractCastToShortFunction.java`:
- Line 34: Ensure AbstractCastToShortFunction follows the same CastFunction
interface switch pattern used in AbstractCastToTimestampFunction: update any
casting/dispatch logic in AbstractCastToShortFunction (extending ShortFunction
and implementing CastFunction) to mirror the timestamp variant, and add the same
backward-compatibility handling that checks for instanceof UnaryFunction where
cast operands are inspected so existing unary implementations continue to work.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/conditional/GeoIntCaseFunction.java`:
- Line 33: The GeoIntCaseFunction class should follow the same CaseFunction
migration pattern as UuidCaseFunction: ensure GeoIntCaseFunction implements
CaseFunction, mirror any helper methods and constructors migrated in
UuidCaseFunction, and keep the backward-compatibility check using instanceof
MultiArgFunction where applicable so existing callers still work; update
GeoIntCaseFunction to match the structure and compatibility guards used in
UuidCaseFunction.
---
Nitpick comments:
In `@core/src/main/java/io/questdb/cairo/sql/Function.java`:
- Around line 138-140: Move the default instance method getComplexity() so
instance methods remain alphabetically ordered: remove the existing
getComplexity() definition and reinsert it between getChar(Record) and
getDate(Record) in the Function interface; ensure the method signature and
return value (return COMPLEXITY_COLUMN) are unchanged and that
imports/formatting remain consistent.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/conditional/CaseFunction.java`:
- Around line 32-35: The CaseFunction.getComplexity implementation uses a magic
literal 5; add a named constant COMPLEXITY_CASE = 5 to the Function interface
and replace the literal in CaseFunction with Function.COMPLEXITY_CASE when
calling Function.addComplexity(Function.COMPLEXITY_CASE,
MultiArgFunction.super.getComplexity()); update any imports or references if
necessary so CaseFunction, Function.addComplexity, and
MultiArgFunction.super.getComplexity() compile against the new constant.
In `@core/src/main/java/io/questdb/griffin/engine/functions/CursorFunction.java`:
- Around line 54-57: The getComplexity() method in CursorFunction is out of
alphabetical order — move the public int getComplexity() { return
COMPLEXITY_SUBQUERY; } method so it sits after getChar() and before getDate()
(i.e., between getChar() and getDate()) to comply with the class members
grouping/sorting rule; ensure visibility and static/instance grouping remain
unchanged when relocating the method.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/geo/GeoDistanceMetersFunctionFactory.java`:
- Around line 194-197: The getComplexity() overrides in
ConstPoint1GeoDistanceFunction and GeoDistanceFunction return a flat
COMPLEXITY_GEO and do not include their child functions' complexity; update each
getComplexity() to aggregate child complexities by calling
Function.addComplexity(...) (or summing child.getComplexity()) for their managed
child functions (e.g., lat2Func, lon2Func, latFunc, lonFunc, pointFunc) and
return COMPLEXITY_GEO plus that aggregated value so the overall expression cost
includes children.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/math/BitwiseAndLongFunctionFactory.java`:
- Around line 55-85: The implementation intentionally uses
ArithmeticBinaryFunction for bitwise ops; update BitAndLongFunction (and similar
bitwise classes) to add a brief clarifying comment explaining that bitwise
operations (e.g., BitAndLongFunction) are grouped under ArithmeticBinaryFunction
to reflect complexity-based costing (COMPLEXITY_ARITHMETIC) rather than semantic
arithmetic, so future maintainers understand this naming/design choice and don't
introduce a BitwiseBinaryFunction; place the comment near the class declaration
and mention ArithmeticBinaryFunction and COMPLEXITY_ARITHMETIC.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/math/ExpDoubleFunctionFactory.java`:
- Line 49: The field "function" in ExpDoubleFunctionFactory is package-private
but should be private; update the declaration of the field named function in
class ExpDoubleFunctionFactory to be private (i.e., change its visibility to
private final) to match the private static inner class ExpFunction and follow
idiomatic encapsulation.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/math/RadiansDoubleFunctionFactory.java`:
- Line 33: Remove the stray blank line in the import block of
RadiansDoubleFunctionFactory.java left after removing the UnaryFunction import;
open the import section in RadiansDoubleFunctionFactory and delete the empty
line so imports are contiguous and properly formatted (look for the class
RadiansDoubleFunctionFactory and the surrounding import declarations to locate
the spot).
In
`@core/src/main/java/io/questdb/griffin/engine/functions/math/SqrtDoubleFunctionFactory.java`:
- Around line 48-70: The review points out that overriding complexity only in
SqrtFunction is inconsistent; either keep the current uniform default from
ArithmeticUnaryFunction or implement a consistent strategy for all
transcendental ops. Fix by removing any per-function getComplexity override in
SqrtFunction (leave SqrtFunction as shown using the ArithmeticUnaryFunction
default) or, if you want higher complexity for transcendental ops, introduce a
new marker/interface (e.g., TranscendentalUnaryFunction or change
ArithmeticUnaryFunction default) and update all transcendental implementations
(SqrtFunction, log/exp/sin/cos/tan/asin/acos/atan classes) to implement that so
complexity is applied consistently across those functions (refer to
SqrtFunction, ArithmeticUnaryFunction, and getComplexity).
In
`@core/src/main/java/io/questdb/griffin/engine/functions/math/SubIntFunctionFactory.java`:
- Around line 49-97: SubtractIntVVFunc is missing an override of isOperator() so
it inherits the default false; add a public boolean isOperator() override in
SubtractIntVVFunc that returns true to classify this binary subtraction function
as an operator (mirror how MulIntFunctionFactory's Func implements
isOperator()), ensuring to place the method within the SubtractIntVVFunc class
alongside other overrides.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/str/ReplaceStrFunctionFactory.java`:
- Around line 96-99: Move the getComplexity() method so it appears after
getCenter() within the ReplaceStrFunctionFactory class to satisfy alphabetical
ordering of instance methods; ensure you retain the existing implementation
(returning Function.addComplexity(COMPLEXITY_STRING_OP,
TernaryFunction.super.getComplexity())) and keep method visibility unchanged so
grouping and sorting rules for class members are preserved.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/str/SubStringFunctionFactory.java`:
- Around line 94-97: In SubStringFunctionFactory, reorder the instance methods
so that getCenter() appears before getComplexity() to match alphabetical
ordering of method names; locate the two methods named getCenter and
getComplexity in the SubStringFunctionFactory class and swap their positions
(keeping their implementations unchanged) so getCenter() is declared prior to
getComplexity().
In
`@core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyMaterializingRecordCursor.java`:
- Around line 125-144: In SortKeyMaterializingRecordCursor, guard against the
rowIdToOrdinal no-entry sentinel (-1L) returned by rowIdToOrdinal.get(rowId)
before passing the value into recordA.setOrdinal: in hasNext() (where
materialized is true) check if ordinal == -1L and throw an IllegalStateException
(or assert) with a clear message referencing rowId and that materialization
invariant was violated; apply the same defensive check in recordAt() where
rowIdToOrdinal.get(...) is used, and ensure any code paths that compute/put
ordinals (nextOrdinal/rowIdToOrdinal.put(...) and appendValue(...)) remain
unchanged — this makes the invariant explicit and fails fast instead of
computing negative offsets in MaterializedRecord.
In
`@core/src/test/java/io/questdb/test/griffin/engine/orderby/OrderBySortKeyMaterializationTest.java`:
- Around line 1629-1658: Update the inline comment in
OrderBySortKeyMaterializationTest to accurately reflect the materializer's
filtering logic: replace "IPv4 is excluded from materialization (no arithmetic
on this type)" with a brief note that IPv4 is excluded because the materializer
routes only certain categories (numeric/temporal/boolean/geo/decimal) for
materialization per the logic in
SortKeyMaterializingRecordCursor/MaterializedRecord, and similarly adjust the
UUID comment to reference the size-based rule (> Long.BYTES) used by the same
materializer code so both comments consistently point to the actual filtering
criteria in SortKeyMaterializingRecordCursor/MaterializedRecord.
- Around line 1748-1758: The test call to assertQueryNoLeakCheck is the only one
using the 5-argument overload (assertQueryNoLeakCheck(expected, query, "x",
true, true)); either revert it to the standard 2-argument form
assertQueryNoLeakCheck(expected, query) if the extra parameters are not
required, or keep the 5-arg call but add a one-line inline comment immediately
above it explaining that the third and two boolean flags are intentionally
supplied to verify TIMESTAMP sort-key behavior (i.e., specify expected timestamp
column "x" and supportsRandomAccess=true flags) so future maintainers understand
why this test differs from others.
In `@core/src/test/java/io/questdb/test/griffin/GroupByTest.java`:
- Around line 2738-2785: Duplicate test detected:
testLiftAliasesFromInnerSelect9 appears identical to testGroupByWithAliasClash6
(same DDL, SQL, expected rows and plan). Either remove one of the tests or
consolidate them by extracting the shared setup and assertions into a single
helper method (e.g., createTestForAliasClash(query, expectedPlan, expectedRows))
and call it from both test names if both must remain; update or remove the
redundant test (testLiftAliasesFromInnerSelect9 or testGroupByWithAliasClash6)
and ensure the remaining test keeps the same execute/create table calls,
assertPlanNoLeakCheck, and assertQueryNoLeakCheck expectations.
core/src/main/java/io/questdb/griffin/engine/functions/cast/AbstractCastToDecimalFunction.java
Show resolved
Hide resolved
core/src/main/java/io/questdb/griffin/engine/functions/json/JsonExtractFunction.java
Outdated
Show resolved
Hide resolved
core/src/main/java/io/questdb/griffin/engine/functions/math/IPv4MinusIPv4FunctionFactory.java
Show resolved
Hide resolved
core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyMaterializingRecordCursor.java
Show resolved
Hide resolved
core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyMaterializingRecordCursor.java
Outdated
Show resolved
Hide resolved
core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyMaterializingRecordCursor.java
Show resolved
Hide resolved
...src/main/java/io/questdb/griffin/engine/orderby/SortKeyMaterializingRecordCursorFactory.java
Show resolved
Hide resolved
|
@CodeRabbit i have addressed your comments, please resolve conversations that have been addressed. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain
🏁 Script executed: Length of output: 60027 🏁 Script executed: Length of output: 45145 🏁 Script executed: Length of output: 4493
🏁 Script executed: Length of output: 1777 🏁 Script executed: Length of output: 238 🏁 Script executed: Length of output: 190 🏁 Script executed: Length of output: 472 🏁 Script executed: Length of output: 162 🏁 Script executed: Length of output: 314 🏁 Script executed: Length of output: 1074 🏁 Script executed: Length of output: 239 🏁 Script executed: Length of output: 127 🏁 Script executed: Length of output: 38 🏁 Script executed: Length of output: 154 🏁 Script executed: Length of output: 539 🏁 Script executed: Length of output: 120 🏁 Script executed: Length of output: 115 🏁 Script executed: Length of output: 11991 🏁 Script executed: Length of output: 5697 🏁 Script executed: Length of output: 2685 🏁 Script executed: Length of output: 2727 🏁 Script executed: Length of output: 3119
Resolved
Still open (with suggested tiny diffs)
If you push these final tweaks, I’ll immediately re-check and mark the remaining threads as addressed. 🏁 Script executed: Length of output: 5480
Resolved
Remaining small follow-ups (still open)
If you push these tiny tweaks, I’ll re-check and mark the remaining threads as addressed right away. Want me to open a follow-up issue instead for any of them? 🧠 Learnings usedIf you found this review helpful, would you consider giving us a shout-out on X? Thank you for using CodeRabbit! |
[PR Coverage check]😍 pass : 303 / 352 (86.08%) file detail
|
Summary
When sorting by computed expressions (e.g.,
ORDER BY a + b * c), QuestDB'sSortedLightRecordCursorre-evaluates the expression on every comparison during tree insertion and output. This means sort key functions are called O(N log N) times during sorting plus O(N) during output — a significant cost for expensive expressions like regex, JSON extraction, or geo distance calculations.This PR adds a sort key materialization layer that pre-computes expensive sort key values into off-heap buffers before sorting, reducing function evaluations from O(N log N) to O(N).
How it works
Function complexity model: Each
Functiongets agetComplexity()method. Complexity propagates through the expression tree — leaf costs (column=1, constant=0, regex=50, JSON=80, etc.) are summed through arity interfaces. Marker interfaces (ArithmeticBinaryFunction,ArithmeticUnaryFunction,CastFunction) add self-cost for their operation type.Materialization decision: In
SqlCodeGenerator.generateOrderBy(), if any sort key column backed by aVirtualRecordCursorFactoryexceeds the configured complexity threshold (default 3), that column's values are pre-computed.Materialization layer: A
SortKeyMaterializingRecordCursorFactorywraps the base cursor. On first iteration, it evaluates sort key expressions and stores fixed-size values in per-columnMemoryCARWbuffers. AMaterializedRecordinterceptsgetXxx(col)for materialized columns (reads from buffer) and delegates to the base record for everything else.Two-pass support: The materializing cursor tracks a
materializedflag. First pass fills buffers; subsequent passes (aftertoTop()) look up existing ordinals via aDirectLongLongHashMapwithout re-computing values.Scope limitations
cairo.sql.sort.key.materialization.threshold(default: 3)Memory safety
DirectLongLongHashMapfor rowId→ordinal mapping (taggedNATIVE_TREE_CHAIN)cairo.sql.sort.key.max.pageslimitlongordinals to avoid overflow on large result setsOther changes
Test plan
FunctionComplexityTest: complexity constants, propagation through unary/binary/ternary/quaternary/multi-arg trees, overflow capping, leaf overridesOrderBySortKeyMaterializationTest: correctness for all fixed-size types (boolean, byte, short, char, int, long, float, double, date, timestamp, geohash, decimal8/16/32/64/128/256), threshold behavior, EXPLAIN plan output, NULL ordering, DESC/ASC, multiple sort keys, mixed materialized/non-materialized keys, empty/single-row tables, toTop() re-reading🤖 Generated with Claude Code