perf(sql): improve ORDER BY performance for fixed-width column types#6862
perf(sql): improve ORDER BY performance for fixed-width column types#6862bluestreak01 merged 42 commits intomasterfrom
Conversation
WalkthroughReplaces the ORDER BY radix-sort path with an encoded-sort implementation: adds native C/C++ encoded-key sorter and JNI bridge, introduces SortKeyEncoder/SORT key types and encoded-sort cursor factories, removes radix-sort configuration and legacy LongSortedLight cursor, and updates many tests and plan labels to "Encode sort" / "Encode sort light". Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 7
🧹 Nitpick comments (9)
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/GeomeanDoubleGroupByFunctionFactoryTest.java (1)
71-91: Normalize the new SQL to the test style guide.The added statements use lowercase SQL keywords, and the longer CTAS/query in
testGeomeanAgainstExpAvgLnFormula()would be clearer as text blocks. Please keep new test SQL in the repository’s standard format. As per coding guidelines, "In SQL statements within test code, use UPPERCASE for SQL keywords" and "Use multiline strings for longer statements (multiple INSERT rows, complex queries) and to assert multiline query results".Also applies to: 116-152
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/GeomeanDoubleGroupByFunctionFactoryTest.java` around lines 71 - 91, The test uses lowercase inline SQL and a single-line string for a longer CTAS/query; update the statements in testGeomeanAgainstExpAvgLnFormula() to follow the project style by converting SQL keywords to UPPERCASE and using multiline text blocks for the CTAS and the SELECT used in assertSql; specifically replace the execute("create table tab as (... )") call with a multiline string containing UPPERCASE keywords (e.g., CREATE TABLE AS SELECT FROM LONG_SEQUENCE) and similarly make the SELECT passed to assertSql a text block with all SQL keywords uppercased, leaving the expected result string unchanged, and ensure you apply the same changes to the other occurrence noted (lines 116-152) for consistency.core/src/test/java/io/questdb/test/griffin/WindowJoinTest.java (1)
120-132: Extract a helper for encoded-sort plan prefixes.This rename touched a lot of assertions for a pure plan-node label change. Pulling the
"Encode sort"/"Encode sort light"prefixes into a small helper would make future planner-label churn much less noisy in this file.♻️ Example shape
+ private static String encodeSortPlan(String keys, String body) { + return "Encode sort\n" + + " keys: [" + keys + "]\n" + + body; + } + + private static String encodeSortLightPlan(String keys, String body) { + return "Encode sort light\n" + + " keys: [" + keys + "]\n" + + body; + }Also applies to: 4374-4429
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/WindowJoinTest.java` around lines 120 - 132, Refactor repeated plan-prefix strings by extracting a helper method (e.g., encodeSortPrefix or getEncodeSortPrefix) that returns either "Encode sort\n" or "Encode sort light\n" based on a parameter, then replace all hard-coded occurrences in WindowJoinTest (including the multi-line expected strings around the Async Window Fast Join blocks at the locations shown and the similar block at lines ~4374-4429) to call this helper so assertions concatenate the helper result with the rest of the expected plan text; update test helper usages where the prefix is needed so future label renames are handled in one place.core/src/test/java/io/questdb/test/griffin/SecurityTest.java (2)
560-560: Avoid pinning these assertions to theEncodedSortimplementation name.These checks will start failing on harmless renames/refactors even if the memory-limit behavior is still correct. If the intent is to validate the failure mode, assert a stable fragment like
memory exceededor an error code, and keep plan/path assertions separate.Also applies to: 809-809, 1092-1092, 1156-1156
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/SecurityTest.java` at line 560, Replace assertions that match the full implementation name "memory exceeded in EncodedSort" with a stable fragment such as "memory exceeded" (e.g. change the Assert.assertTrue(...contains("memory exceeded in EncodedSort")) calls in SecurityTest.java to assert on "memory exceeded"). Keep any plan/path or implementation-name checks as separate assertions if you still need to validate the execution plan; update the occurrences around the Assert.assertTrue calls at the noted spots (lines with Assert.assertTrue(ex.toString().contains(...))) so they check the generic error text or an error code rather than the concrete class name.
1148-1156: This change no longer exercises the tree fallback path.
ORDER BY sym1, count()stays within the encoded-key path described in this PR, so this test now validatesEncodedSortrather than the tree-resize scenario implied bytestTreeResizesWithImplicitGroupBy. Please keep a separate case that forces the fallback sorter, or rename this test if that coverage moved elsewhere.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/SecurityTest.java` around lines 1148 - 1156, The current assertion in testTreeResizesWithImplicitGroupBy now hits EncodedSort (query "select sym1, count() from tb1 order by sym1, count()") and no longer triggers the tree-resize fallback; either add a separate assertion that will force the fallback sorter (for example run the same aggregation but order only by the aggregate like "select sym1, count() from tb1 order by count()" or otherwise change the ORDER BY to one that does not use the encoded-key path) or rename the test to reflect it now targets EncodedSort; update the test body around the existing execute/query call and the Assert.assertTrue checking for "memory exceeded in EncodedSort" (or add a new try/catch block) so there is a distinct case that exercises the tree-resize fallback path referenced by testTreeResizesWithImplicitGroupBy.core/src/main/java/io/questdb/std/Vect.java (1)
419-419: PlacesortEncodedEntries()in the alphabeticalsort*section.The new method was inserted before
sortArrayColumn(), which breaks the required ordering for public static members in Java files.As per coding guidelines,
**/*.java: "Group Java class members by kind (static vs. instance) and visibility, sorted alphabetically."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/std/Vect.java` at line 419, The new native method sortEncodedEntries(long addr, long count, int keyLongs) in class Vect should be relocated into the alphabetical block of public static sort* methods: move its declaration so it appears in the same section as and alphabetically among other sort* methods (for example placing it after sortArrayColumn() or wherever it fits alphabetically), preserving the native modifier and signature.core/src/main/java/io/questdb/cairo/CairoConfigurationWrapper.java (1)
1559-1562: MoveisSqlParquetRowGroupPruningEnabled()into the alphabeticalisSqlP...block.This delegation is fine, but its current placement comes after all
isSqlParallel*members, so the class no longer follows the required member ordering for Java files.As per coding guidelines,
**/*.java: "Group Java class members by kind (static vs. instance) and visibility, sorted alphabetically."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/cairo/CairoConfigurationWrapper.java` around lines 1559 - 1562, The method isSqlParquetRowGroupPruningEnabled() is misplaced after the isSqlParallel* members, breaking the alphabetical grouping; move the entire method (public boolean isSqlParquetRowGroupPruningEnabled()) into the existing alphabetical isSqlP... block among the other isSqlParquet*/isSqlP* methods so members remain grouped and alphabetized by name consistent with the class's Java member ordering rules.core/src/test/java/io/questdb/test/griffin/engine/groupby/SampleByTest.java (1)
6279-6326: Use UPPERCASE for SQL keywords in test SQL statements.The SQL statements use lowercase keywords (
select,from,sample by,fill), which does not follow the coding guidelines for test code.♻️ Proposed fix
assertSql( """ ts\tavg 2017-12-30T00:00:00.000000Z\t72.5 2018-01-04T00:00:00.000000Z\t264.5 2018-01-09T00:00:00.000000Z\t432.5 """, - "select ts, avg(x) from fromto\n" + - "sample by 5d" + "SELECT ts, avg(x) FROM fromto\n" + + "SAMPLE BY 5d" ); assertSql( """ ts\tavg 2017-12-20T00:00:00.000000Z\tnull 2017-12-25T00:00:00.000000Z\tnull 2017-12-30T00:00:00.000000Z\t72.5 2018-01-04T00:00:00.000000Z\t264.5 2018-01-09T00:00:00.000000Z\t432.5 """, - "select ts, avg(x) from fromto\n" + - "sample by 5d from '2017-12-20' fill(null)" + "SELECT ts, avg(x) FROM fromto\n" + + "SAMPLE BY 5d FROM '2017-12-20' FILL(null)" ); assertSql( """ ts\tavg 2017-12-20T00:00:00.000000Z\tnull 2017-12-25T00:00:00.000000Z\tnull 2017-12-30T00:00:00.000000Z\t72.5 2018-01-04T00:00:00.000000Z\t264.5 2018-01-09T00:00:00.000000Z\t432.5 2018-01-14T00:00:00.000000Z\tnull 2018-01-19T00:00:00.000000Z\tnull 2018-01-24T00:00:00.000000Z\tnull 2018-01-29T00:00:00.000000Z\tnull """, - "select ts, avg(x) from fromto\n" + - "sample by 5d from '2017-12-20' to '2018-01-31' fill(null)" + "SELECT ts, avg(x) FROM fromto\n" + + "SAMPLE BY 5d FROM '2017-12-20' TO '2018-01-31' FILL(null)" );As per coding guidelines: "In SQL statements within test code, use UPPERCASE for SQL keywords (CREATE TABLE, INSERT, SELECT ... AS ... FROM, etc.)"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/groupby/SampleByTest.java` around lines 6279 - 6326, In testSampleByFromToFillNull, update the SQL literals passed to assertSql so all SQL keywords are UPPERCASE: change "select" to "SELECT", "from" to "FROM", "sample by" to "SAMPLE BY", "from '...'"/ "to '...'" to "FROM '...'" / "TO '...'", and "fill(null)" to "FILL(NULL)" in the three assertSql calls (the queries that reference the fromto table); keep string literals like dates and identifiers unchanged.core/src/test/java/io/questdb/test/griffin/OrderByEncodeSortTest.java (1)
89-102: Use UPPERCASE for SQL keywords.Per coding guidelines, SQL keywords in test code should be UPPERCASE. The query and DDL statements use lowercase
select,from,create table,case,when,then,else,end,as.♻️ Example fix for one test
- "select * from x order by a asc;", - "create table x as (" + - "select" + - " cast (case" + - " when x < 10 then x" + - " when x >= 10 and x < 15 then null" + - " else x - 25" + - " end as date) as a" + - " from long_sequence(25)" + - ")", + "SELECT * FROM x ORDER BY a ASC", + "CREATE TABLE x AS (" + + "SELECT" + + " CAST(CASE" + + " WHEN x < 10 THEN x" + + " WHEN x >= 10 AND x < 15 THEN NULL" + + " ELSE x - 25" + + " END AS DATE) AS a" + + " FROM long_sequence(" + 25 + "))",As per coding guidelines: "In SQL statements within test code, use UPPERCASE for SQL keywords".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/OrderByEncodeSortTest.java` around lines 89 - 102, The SQL literals in OrderByEncodeSortTest.java use lowercase keywords; update the two string literals (the query string "select * from x order by a asc;" and the DDL passed as the second argument) to use UPPERCASE SQL keywords (e.g., SELECT, FROM, ORDER BY, ASC, CREATE TABLE AS, CAST, CASE, WHEN, THEN, ELSE, END, AS) while preserving identifiers, expressions and spacing so the test behavior is unchanged.core/src/test/java/io/questdb/test/griffin/OrderByEncodeSortFuzzTest.java (1)
43-69: Use UPPERCASE for SQL keywords in test code.Per coding guidelines, SQL keywords should be UPPERCASE in test code. Consider updating
select,case,when,then,else,end,fromto their uppercase equivalents.♻️ Suggested change
execute( - "CREATE TABLE fuzz_sort AS (SELECT" + - " rnd_boolean() col_bool," + + "CREATE TABLE fuzz_sort AS (SELECT " + + "rnd_boolean() col_bool," + ... - " FROM long_sequence(" + rowCount + ")) TIMESTAMP(ts)" + " FROM long_sequence(" + rowCount + ")) TIMESTAMP(ts)"The
CREATE TABLE,AS,SELECT,FROM, andTIMESTAMPkeywords should remain uppercase (they already are), but ensure consistency throughout.As per coding guidelines: "In SQL statements within test code, use UPPERCASE for SQL keywords".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/OrderByEncodeSortFuzzTest.java` around lines 43 - 69, The SQL in OrderByEncodeSortFuzzTest uses lowercase SQL keywords (e.g., "select", "from", "timestamp") inconsistent with project guidelines; update the CREATE TABLE statement inside the test so all SQL keywords are UPPERCASE (SELECT, FROM, TIMESTAMP and any other keywords like CASE/WHEN/THEN/ELSE/END if present) while leaving function names (rnd_*), column names, and symbols unchanged; locate the SQL string in OrderByEncodeSortFuzzTest (the execute(...) call that creates fuzz_sort) and replace the lowercase keywords with their uppercase equivalents for consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/src/main/c/share/ooo.cpp`:
- Around line 1468-1470: The code uses GCC/Clang-specific __builtin_clzll and
doesn't handle count==0; replace this with a cross-platform count-leading-zeros
helper and guard for zero. Implement a wrapper (e.g.,
count_leading_zeros64(uint64_t)) that uses std::countl_zero(count) from <bit>
when available, falls back to _BitScanReverse64 on MSVC via conditional
compilation, and provides a portable loop-based fallback; then compute lg = 63 -
count_leading_zeros64(count) but ensure you handle count==0 (return 0 or set
lg=1 as original intent) before using it to calculate unstable_limit = count /
lg. Ensure necessary headers and conditional `#includes` are added and no
undefined behavior occurs for count==0.
In `@core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyEncoder.java`:
- Around line 156-162: The code undercounts needed bytes because ranks are
stored as (rank + 1) (see the use of "rank + 1"), so a symbol table of size 255
produces a max stored value of 256 and won't fit in 1 byte; update the threshold
logic in SortKeyEncoder where columnByteWidths is set (refer to symbolCount and
columnByteWidths[i]) to account for the +1 by either comparing symbolCount <
0xFF and symbolCount < 0xFFFF or by testing (symbolCount + 1) <= 0xFF and
(symbolCount + 1) <= 0xFFFF so that tables of exactly 256 or 65536 symbols pick
2 or 4 bytes respectively.
In `@core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyType.java`:
- Around line 40-53: The method SortKeyType.fromKeyLength currently maps zero or
negative keyLength to FIXED_8; change it to reject non-positive lengths by
adding an early check in fromKeyLength (e.g., if keyLength <= 0 return
UNSUPPORTED or throw IllegalArgumentException) before the existing <=8/16/24/32
branches so invalid encoder states aren't treated as 8-byte keys.
In
`@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/GeomeanDoubleGroupByFunctionFactoryTest.java`:
- Around line 68-93: Wrap the test method testGeomeanAgainstExpAvgLnFormula (and
the similar test at lines 112-154) inside assertMemoryLeak(), and replace the
use of assertSql(...) with the leak-aware query helper
assertQueryNoLeakCheck(...) so the table creation and query assertions run under
the memory-leak guard; specifically, call assertMemoryLeak(() -> { create the
table via execute(...); then invoke assertQueryNoLeakCheck(expectedResultString,
queryString, "tab"); }) referencing the test method name
testGeomeanAgainstExpAvgLnFormula and the other affected test to locate where to
apply the change.
In
`@core/src/test/java/io/questdb/test/griffin/engine/orderby/FunctionComplexityTest.java`:
- Around line 58-68: The test FunctionComplexityTest currently expects a plan
without a materialization node even though its intent is to verify that a
non-SYMBOL computed sort key (a::DOUBLE) exceeding the complexity threshold
triggers sort-key materialization; update the test to either (A) rename the
test/comment to reflect the new behavior if materialization is intentionally
removed, or (B) restore/assert the materialization path by adjusting the
expected plan passed to assertPlanNoLeakCheck so it includes the materialization
node for the computed expression (a::double) — ensure the assertion specifically
targets non-SYMBOL computed keys and references the test harness method
assertPlanNoLeakCheck and the test class FunctionComplexityTest to prevent
regressions in complexity-based materialization.
In `@core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java`:
- Around line 10291-10294: The expected explain plan in ExplainPlanTest.java
incorrectly treats ORDER BY 3 as referring to the projected column "sum" instead
of the third projected column "a0"; update the expected plan string in the test
so the key list shows "a0" (or the correct alias used in the projection) instead
of "sum" for the ORDER BY 3 entry (look for the test method or expectation
string that contains the block starting "SelectedRecord\n Encode sort\n
keys: [column desc, sum, berlin_ts desc]" and replace "sum" with "a0").
In `@core/src/test/java/io/questdb/test/griffin/SecurityTest.java`:
- Line 1148: The SQL in the test contains lowercase keywords; update the query
string "select sym1, count() from tb1 order by sym1, count()" to use uppercase
SQL keywords (e.g., SELECT, FROM, ORDER BY) so it follows the test convention
and coding guidelines.
---
Nitpick comments:
In `@core/src/main/java/io/questdb/cairo/CairoConfigurationWrapper.java`:
- Around line 1559-1562: The method isSqlParquetRowGroupPruningEnabled() is
misplaced after the isSqlParallel* members, breaking the alphabetical grouping;
move the entire method (public boolean isSqlParquetRowGroupPruningEnabled())
into the existing alphabetical isSqlP... block among the other
isSqlParquet*/isSqlP* methods so members remain grouped and alphabetized by name
consistent with the class's Java member ordering rules.
In `@core/src/main/java/io/questdb/std/Vect.java`:
- Line 419: The new native method sortEncodedEntries(long addr, long count, int
keyLongs) in class Vect should be relocated into the alphabetical block of
public static sort* methods: move its declaration so it appears in the same
section as and alphabetically among other sort* methods (for example placing it
after sortArrayColumn() or wherever it fits alphabetically), preserving the
native modifier and signature.
In
`@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/GeomeanDoubleGroupByFunctionFactoryTest.java`:
- Around line 71-91: The test uses lowercase inline SQL and a single-line string
for a longer CTAS/query; update the statements in
testGeomeanAgainstExpAvgLnFormula() to follow the project style by converting
SQL keywords to UPPERCASE and using multiline text blocks for the CTAS and the
SELECT used in assertSql; specifically replace the execute("create table tab as
(... )") call with a multiline string containing UPPERCASE keywords (e.g.,
CREATE TABLE AS SELECT FROM LONG_SEQUENCE) and similarly make the SELECT passed
to assertSql a text block with all SQL keywords uppercased, leaving the expected
result string unchanged, and ensure you apply the same changes to the other
occurrence noted (lines 116-152) for consistency.
In `@core/src/test/java/io/questdb/test/griffin/engine/groupby/SampleByTest.java`:
- Around line 6279-6326: In testSampleByFromToFillNull, update the SQL literals
passed to assertSql so all SQL keywords are UPPERCASE: change "select" to
"SELECT", "from" to "FROM", "sample by" to "SAMPLE BY", "from '...'"/ "to '...'"
to "FROM '...'" / "TO '...'", and "fill(null)" to "FILL(NULL)" in the three
assertSql calls (the queries that reference the fromto table); keep string
literals like dates and identifiers unchanged.
In `@core/src/test/java/io/questdb/test/griffin/OrderByEncodeSortFuzzTest.java`:
- Around line 43-69: The SQL in OrderByEncodeSortFuzzTest uses lowercase SQL
keywords (e.g., "select", "from", "timestamp") inconsistent with project
guidelines; update the CREATE TABLE statement inside the test so all SQL
keywords are UPPERCASE (SELECT, FROM, TIMESTAMP and any other keywords like
CASE/WHEN/THEN/ELSE/END if present) while leaving function names (rnd_*), column
names, and symbols unchanged; locate the SQL string in OrderByEncodeSortFuzzTest
(the execute(...) call that creates fuzz_sort) and replace the lowercase
keywords with their uppercase equivalents for consistency.
In `@core/src/test/java/io/questdb/test/griffin/OrderByEncodeSortTest.java`:
- Around line 89-102: The SQL literals in OrderByEncodeSortTest.java use
lowercase keywords; update the two string literals (the query string "select *
from x order by a asc;" and the DDL passed as the second argument) to use
UPPERCASE SQL keywords (e.g., SELECT, FROM, ORDER BY, ASC, CREATE TABLE AS,
CAST, CASE, WHEN, THEN, ELSE, END, AS) while preserving identifiers, expressions
and spacing so the test behavior is unchanged.
In `@core/src/test/java/io/questdb/test/griffin/SecurityTest.java`:
- Line 560: Replace assertions that match the full implementation name "memory
exceeded in EncodedSort" with a stable fragment such as "memory exceeded" (e.g.
change the Assert.assertTrue(...contains("memory exceeded in EncodedSort"))
calls in SecurityTest.java to assert on "memory exceeded"). Keep any plan/path
or implementation-name checks as separate assertions if you still need to
validate the execution plan; update the occurrences around the Assert.assertTrue
calls at the noted spots (lines with
Assert.assertTrue(ex.toString().contains(...))) so they check the generic error
text or an error code rather than the concrete class name.
- Around line 1148-1156: The current assertion in
testTreeResizesWithImplicitGroupBy now hits EncodedSort (query "select sym1,
count() from tb1 order by sym1, count()") and no longer triggers the tree-resize
fallback; either add a separate assertion that will force the fallback sorter
(for example run the same aggregation but order only by the aggregate like
"select sym1, count() from tb1 order by count()" or otherwise change the ORDER
BY to one that does not use the encoded-key path) or rename the test to reflect
it now targets EncodedSort; update the test body around the existing
execute/query call and the Assert.assertTrue checking for "memory exceeded in
EncodedSort" (or add a new try/catch block) so there is a distinct case that
exercises the tree-resize fallback path referenced by
testTreeResizesWithImplicitGroupBy.
In `@core/src/test/java/io/questdb/test/griffin/WindowJoinTest.java`:
- Around line 120-132: Refactor repeated plan-prefix strings by extracting a
helper method (e.g., encodeSortPrefix or getEncodeSortPrefix) that returns
either "Encode sort\n" or "Encode sort light\n" based on a parameter, then
replace all hard-coded occurrences in WindowJoinTest (including the multi-line
expected strings around the Async Window Fast Join blocks at the locations shown
and the similar block at lines ~4374-4429) to call this helper so assertions
concatenate the helper result with the rest of the expected plan text; update
test helper usages where the prefix is needed so future label renames are
handled in one place.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6f3345ea-2a1b-4e6f-acd9-48378718fe93
⛔ Files ignored due to path filters (6)
core/src/main/bin/linux-aarch64/libjemalloc.sois excluded by!**/*.socore/src/main/resources/io/questdb/bin/darwin-aarch64/libquestdb.dylibis excluded by!**/*.dylibcore/src/main/resources/io/questdb/bin/darwin-x86-64/libquestdb.dylibis excluded by!**/*.dylibcore/src/main/resources/io/questdb/bin/linux-aarch64/libquestdb.sois excluded by!**/*.socore/src/main/resources/io/questdb/bin/linux-x86-64/libquestdb.sois excluded by!**/*.socore/src/main/resources/io/questdb/bin/windows-x86-64/libquestdb.dllis excluded by!**/*.dll
📒 Files selected for processing (53)
core/.clang-formatcore/src/main/c/share/ooo.cppcore/src/main/java/io/questdb/PropServerConfiguration.javacore/src/main/java/io/questdb/PropertyKey.javacore/src/main/java/io/questdb/cairo/CairoConfiguration.javacore/src/main/java/io/questdb/cairo/CairoConfigurationWrapper.javacore/src/main/java/io/questdb/cairo/DefaultCairoConfiguration.javacore/src/main/java/io/questdb/griffin/SqlCodeGenerator.javacore/src/main/java/io/questdb/griffin/engine/orderby/EncodedSortLightRecordCursor.javacore/src/main/java/io/questdb/griffin/engine/orderby/EncodedSortLightRecordCursorFactory.javacore/src/main/java/io/questdb/griffin/engine/orderby/EncodedSortRecordCursor.javacore/src/main/java/io/questdb/griffin/engine/orderby/EncodedSortRecordCursorFactory.javacore/src/main/java/io/questdb/griffin/engine/orderby/LongSortedLightRecordCursor.javacore/src/main/java/io/questdb/griffin/engine/orderby/SortKeyEncoder.javacore/src/main/java/io/questdb/griffin/engine/orderby/SortKeyType.javacore/src/main/java/io/questdb/std/Vect.javacore/src/test/java/io/questdb/test/PropServerConfigurationTest.javacore/src/test/java/io/questdb/test/ServerMainQuerySmokeTest.javacore/src/test/java/io/questdb/test/ServerMainTest.javacore/src/test/java/io/questdb/test/cairo/ArrayTest.javacore/src/test/java/io/questdb/test/cairo/fuzz/ParallelGroupByFuzzTest.javacore/src/test/java/io/questdb/test/cairo/view/ViewsFunctionTest.javacore/src/test/java/io/questdb/test/griffin/AggregateTest.javacore/src/test/java/io/questdb/test/griffin/ClickBenchTest.javacore/src/test/java/io/questdb/test/griffin/DistinctTest.javacore/src/test/java/io/questdb/test/griffin/ExplainPlanTest.javacore/src/test/java/io/questdb/test/griffin/GroupByTest.javacore/src/test/java/io/questdb/test/griffin/HorizonJoinTest.javacore/src/test/java/io/questdb/test/griffin/IPv4Test.javacore/src/test/java/io/questdb/test/griffin/LimitTest.javacore/src/test/java/io/questdb/test/griffin/OrderByEncodeSortFuzzTest.javacore/src/test/java/io/questdb/test/griffin/OrderByEncodeSortTest.javacore/src/test/java/io/questdb/test/griffin/OrderByRadixSortTest.javacore/src/test/java/io/questdb/test/griffin/OrderByWithFilterTest.javacore/src/test/java/io/questdb/test/griffin/PivotTest.javacore/src/test/java/io/questdb/test/griffin/ProjectionReferenceTest.javacore/src/test/java/io/questdb/test/griffin/SecurityTest.javacore/src/test/java/io/questdb/test/griffin/SqlOptimiserTest.javacore/src/test/java/io/questdb/test/griffin/WindowJoinTest.javacore/src/test/java/io/questdb/test/griffin/engine/SqlCodeGeneratorTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/array/DoubleArrayRoundFunctionFactoryTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/catalogue/PgAttributeFunctionFactoryTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ArgMaxDoubleDoubleGroupByFunctionFactoryTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ArgMaxDoubleTimestampGroupByFunctionFactoryTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ArgMaxTimestampDoubleGroupByFunctionFactoryTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/GeomeanDoubleGroupByFunctionFactoryTest.javacore/src/test/java/io/questdb/test/griffin/engine/groupby/GroupByFunctionCaseTest.javacore/src/test/java/io/questdb/test/griffin/engine/groupby/SampleByNanoTimestampTest.javacore/src/test/java/io/questdb/test/griffin/engine/groupby/SampleByTest.javacore/src/test/java/io/questdb/test/griffin/engine/orderby/FunctionComplexityTest.javacore/src/test/java/io/questdb/test/griffin/engine/orderby/OrderBySortKeyMaterializationTest.javacore/src/test/java/io/questdb/test/griffin/engine/table/AsyncFilteredRecordCursorFactoryTest.javacore/src/test/resources/server.conf
💤 Files with no reviewable changes (7)
- core/src/test/resources/server.conf
- core/src/main/java/io/questdb/cairo/CairoConfiguration.java
- core/src/test/java/io/questdb/test/griffin/OrderByRadixSortTest.java
- core/src/test/java/io/questdb/test/ServerMainTest.java
- core/src/main/java/io/questdb/cairo/DefaultCairoConfiguration.java
- core/src/test/java/io/questdb/test/PropServerConfigurationTest.java
- core/src/main/java/io/questdb/griffin/engine/orderby/LongSortedLightRecordCursor.java
core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyEncoder.java
Show resolved
Hide resolved
...o/questdb/test/griffin/engine/functions/groupby/GeomeanDoubleGroupByFunctionFactoryTest.java
Show resolved
Hide resolved
core/src/test/java/io/questdb/test/griffin/engine/orderby/FunctionComplexityTest.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
core/src/test/java/io/questdb/test/griffin/OrderByEncodeSortTest.java (1)
125-134: Please normalize the new SQL fixtures to the test conventions.These setup/query strings are still using lowercase keywords and manual string concatenation. Converting them to multiline strings with UPPERCASE SQL will make the new cases much easier to scan and keep consistent with the rest of the suite. As per coding guidelines "In SQL statements within test code, use UPPERCASE for SQL keywords (CREATE TABLE, INSERT, SELECT ... AS ... FROM, etc.)" and "Use multiline strings for longer statements (multiple INSERT rows, complex queries) and to assert multiline query results".
Also applies to: 172-181, 322-331, 369-378, 416-425, 463-472, 702-713, 738-749, 787-796, 834-843, 894-903
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/OrderByEncodeSortTest.java` around lines 125 - 134, Normalize the SQL fixtures in OrderByEncodeSortTest by replacing the concatenated lowercase strings with Java text-block multiline strings and using UPPERCASE for SQL keywords; for example replace the query string "select * from x order by a asc;" with a multiline text block "SELECT * FROM x ORDER BY a ASC;" and replace the long CREATE TABLE ... AS (...) concatenation with a single multiline text block where keywords like CREATE TABLE, AS, SELECT, CAST, CASE, WHEN, THEN, ELSE, END, FROM are uppercased and the expression is formatted across lines for readability. Apply this same conversion pattern to the other occurrences called out in the review (the blocks around lines 172-181, 322-331, 369-378, 416-425, 463-472, 702-713, 738-749, 787-796, 834-843, 894-903) so all SQL fixtures in OrderByEncodeSortTest use multiline text blocks and UPPERCASE keywords.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/src/main/c/share/ooo.cpp`:
- Around line 1563-1575: The JNI boundary function
Java_io_questdb_std_Vect_sortEncodedEntries currently uses assert in the default
switch branch which is disabled in release builds; replace that with explicit
validation: detect unsupported keyLongs in
Java_io_questdb_std_Vect_sortEncodedEntries (before or inside the switch
default), and call a JNI throwable (e.g., env->ThrowNew for
IllegalArgumentException or RuntimeException) with a clear message mentioning
keyLongs and/or call abort() if you want a hard fail, then return immediately;
ensure references to sort_encoded_impl<...> remain unchanged and that the
function does not fall through silently when keyLongs is invalid.
In `@core/src/main/c/share/ooo.h`:
- Around line 37-45: The header ooo.h must be made self-contained: add the
necessary includes before the inline clzll(uint64_t x) definition — specifically
include <stdint.h> (for uint64_t), <type_traits> (for std::integral_constant
usage elsewhere), and on MSVC include <intrin.h> (for _BitScanReverse64) guarded
by the MSVC detection macro; then keep the existing clzll, _BitScanReverse64
usage, and range_bits() changes intact so the header no longer relies on include
order.
In `@core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyEncoder.java`:
- Around line 121-124: The init() method is freeing the encoder-owned field
tempList (calling Misc.free(tempList)), which leads to double-free/idempotency
issues since close() also frees tempList; remove the free of the tempList field
from init() and either (a) allocate any scratch list as a local variable inside
init() if it should be ephemeral, or (b) keep tempList alive and only free it in
close() alongside rankMaps (see tempList, init(), close(), and rankMaps usage);
apply the same correction for the other init/cleanup occurrences noted (around
the 135-168 region).
---
Nitpick comments:
In `@core/src/test/java/io/questdb/test/griffin/OrderByEncodeSortTest.java`:
- Around line 125-134: Normalize the SQL fixtures in OrderByEncodeSortTest by
replacing the concatenated lowercase strings with Java text-block multiline
strings and using UPPERCASE for SQL keywords; for example replace the query
string "select * from x order by a asc;" with a multiline text block "SELECT *
FROM x ORDER BY a ASC;" and replace the long CREATE TABLE ... AS (...)
concatenation with a single multiline text block where keywords like CREATE
TABLE, AS, SELECT, CAST, CASE, WHEN, THEN, ELSE, END, FROM are uppercased and
the expression is formatted across lines for readability. Apply this same
conversion pattern to the other occurrences called out in the review (the blocks
around lines 172-181, 322-331, 369-378, 416-425, 463-472, 702-713, 738-749,
787-796, 834-843, 894-903) so all SQL fixtures in OrderByEncodeSortTest use
multiline text blocks and UPPERCASE keywords.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1554c663-855f-4b69-b238-7028bae1a2a2
⛔ Files ignored due to path filters (4)
core/src/main/resources/io/questdb/bin/darwin-aarch64/libquestdb.dylibis excluded by!**/*.dylibcore/src/main/resources/io/questdb/bin/linux-aarch64/libquestdb.sois excluded by!**/*.socore/src/main/resources/io/questdb/bin/linux-x86-64/libquestdb.sois excluded by!**/*.socore/src/main/resources/io/questdb/bin/windows-x86-64/libquestdb.dllis excluded by!**/*.dll
📒 Files selected for processing (7)
core/src/main/c/share/ooo.cppcore/src/main/c/share/ooo.hcore/src/main/java/io/questdb/cairo/CairoConfiguration.javacore/src/main/java/io/questdb/griffin/engine/orderby/SortKeyEncoder.javacore/src/main/java/io/questdb/griffin/engine/orderby/SortKeyType.javacore/src/main/java/io/questdb/std/Vect.javacore/src/test/java/io/questdb/test/griffin/OrderByEncodeSortTest.java
🚧 Files skipped from review as they are similar to previous changes (2)
- core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyType.java
- core/src/main/java/io/questdb/cairo/CairoConfiguration.java
core/src/main/java/io/questdb/griffin/engine/orderby/SortKeyEncoder.java
Show resolved
Hide resolved
# Conflicts: # core/src/main/resources/io/questdb/bin/linux-aarch64/libquestdb.so # core/src/main/resources/io/questdb/bin/windows-x86-64/libquestdb.dll
[PR Coverage check]😍 pass : 483 / 503 (96.02%) file detail
|
Summary
SortKeyEncoderencodes column values into fixed-width, order-preserving binary keys (8/16/24/32 bytes), enabling comparisons via native uint64 operations instead of per-column type dispatch.encodeFixed8) for single-column sorts on types ≤8 bytes, avoidingreverseBytes.sortEncodedEntries) is a three-layer hybrid:std::inplace_mergeto avoid heap allocation.std::inplace_mergebuffer. All data structures (count arrays, bucket lists, thread handles) are stack-allocated.std::sortfor the comparison-sort fallback, ensuring consistent performance across platforms regardless of the standard library implementation.EncodedSortRecordCursorFactoryfor the non-random-access path, replacing the R-B tree (RecordTreeChain) with flat array collect + radix sort +RecordChainoffset lookup.LongSortedLightRecordCursorFactorywithEncodedSortLightRecordCursorFactoryfor single-column long/timestamp sorts.tandem: https://github.com/questdb/questdb-enterprise/pull/942
Benchmark
5-column ORDER BY (
symbol, ecn, counterparty, passive, horizon_sec) over a HORIZON JOIN with 3,085,929 result rows:2-column ORDER BY (
symbol, best_ask) over 1,000,000,000 rows, 10 symbols:Query:
5-column benchmark query
Limitations
min(hardware_concurrency, bucket_count).