feat(sql): use vectorized computation in more non-keyed GROUP BY queries #6805
bluestreak01 merged 17 commits into master
Conversation
Walkthrough

Extended GroupByFunction.computeBatch to accept a long startRowId and updated many group-by implementations and call sites to support batch-aware, cumulative, and row-id-ordered aggregation semantics; added vectorized batch execution paths and updated tests and factories to exercise accumulation across sequential batches.
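The batch contract described in the walkthrough can be sketched as a toy non-keyed aggregate (stand-in names; `MapValue` and QuestDB's real `GroupByFunction` interface are simplified away): state accumulates across sequential `computeBatch` calls, and the i-th value in a batch is taken to belong to row `startRowId + i`.

```java
// Toy stand-in for a batch-aware non-keyed SUM aggregate; not QuestDB code.
final class ToySum {
    long sum;
    long rowCount;

    // values[i] corresponds to rowId == startRowId + i;
    // a sum ignores startRowId, but order-sensitive aggregates
    // like first()/last() must not (see the review discussion below).
    void computeBatch(long[] values, int count, long startRowId) {
        for (int i = 0; i < count; i++) {
            sum += values[i];
        }
        rowCount += count; // accumulates across batches rather than resetting
    }
}
```

Two sequential batches, {1, 2} at startRowId 0 and {3, 4} at startRowId 2, leave `sum == 10` and `rowCount == 4`.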
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
@puzpuzpuz, LGTM, but Claude is concerned about test coverage: The last one is probably ok, the fuzz test exercises that path.
@glasstiger all three concerns are already covered by existing tests:
@CodeRabbit can you critically review this PR and find bugs |
🔴 Bug 1:
What Claude thinks about CodeRabbit's comments:
@CodeRabbit re-review the PR |
Corrections to my previous review

Bugs 1, 2, 3 were wrong. After reading the actual implementations, all three are not real issues.

What is actually a real (minor) new issue

Claude caught something I missed:

```java
final double batchSum = Vect.sumDouble(ptr, count);
if (Numbers.isFinite(batchSum)) { // ← entire batch silently dropped if batchSum = ±Infinity
    ...
}
```

If a batch of values sums to ±Infinity, the whole batch is silently discarded instead of the overflow propagating.

Standing findings (unchanged)

Bug 4 (dormant API contract): All window-join call sites pass 0 for startRowId.

Bug 5 (MaxIPv4): Confirmed style inconsistency only, not a behavioral bug. Withdrawing the "bug" label.
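The guard above can be reproduced in plain Java (a loop stands in for `Vect.sumDouble`; the accumulator and batch values below are invented for illustration): a batch whose sum overflows to +Infinity simply vanishes.

```java
// Hypothetical mini-reproduction of the isFinite guard discussed above; not QuestDB code.
final class InfinityDrop {
    double acc;

    void computeBatch(double[] batch) {
        double batchSum = 0;
        for (double v : batch) {
            batchSum += v; // stand-in for Vect.sumDouble(ptr, count)
        }
        if (Double.isFinite(batchSum)) { // an overflowing batch is silently skipped here
            acc += batchSum;
        }
    }
}
```

After `computeBatch({1, 2})` followed by `computeBatch({Double.MAX_VALUE, Double.MAX_VALUE})`, the accumulator is still 3.0; the second batch sums to +Infinity and is dropped.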
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
core/src/main/java/io/questdb/griffin/engine/functions/groupby/LastNotNullDoubleGroupByFunction.java (1)
41-68: ⚠️ Potential issue | 🟡 Minor

Inconsistent null-check predicate: `computeBatch` uses `isNull`, `computeNext` uses `isFinite`.

Line 47 uses `!Numbers.isNull(value)` while line 63 uses `Numbers.isFinite(arg.getDouble(record))`. This creates a behavioral discrepancy:

| Path | NaN | ±Infinity |
| --- | --- | --- |
| `computeBatch` | skipped | accepted |
| `computeNext` | skipped | skipped |

For `last_not_null` semantics, ±Infinity is arguably a valid non-null value and should be treated consistently. The `computeNext` predicate is stricter and may incorrectly skip infinite values.

🔧 Suggested fix: align computeNext with computeBatch

```diff
 @Override
 public void computeNext(MapValue mapValue, Record record, long rowId) {
-    if (Numbers.isFinite(arg.getDouble(record))) {
+    double val = arg.getDouble(record);
+    if (!Numbers.isNull(val)) {
         if (Numbers.isNull(mapValue.getDouble(valueIndex + 1)) || rowId > mapValue.getLong(valueIndex)) {
             computeFirst(mapValue, record, rowId);
         }
     }
 }
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/groupby/LastNotNullDoubleGroupByFunction.java` around lines 41-68, computeNext currently uses Numbers.isFinite(arg.getDouble(record)), which rejects ±Infinity, while computeBatch only rejects Numbers.isNull; update computeNext to use the same null-check as computeBatch (i.e., treat any non-null double, including ±Infinity, as valid) by replacing the Numbers.isFinite predicate with a !Numbers.isNull check on arg.getDouble(record), keeping the existing logic that compares mapValue.getDouble(valueIndex + 1) and mapValue.getLong(valueIndex) and calls computeFirst(mapValue, record, rowId) when appropriate.

core/src/main/java/io/questdb/griffin/engine/functions/groupby/MinCharGroupByFunction.java (1)
122-128: ⚠️ Potential issue | 🟠 Major

Bug: `merge` uses `putInt` instead of `putChar`.

Line 126 calls `destValue.putInt(valueIndex, srcMin)` but the value type is `ColumnType.CHAR` (initialized in `initValueTypes`). This should be `putChar` to match the column type and be consistent with `computeFirst`, `computeNext`, and `computeBatch`, which all use `putChar`.

🐛 Proposed fix

```diff
 @Override
 public void merge(MapValue destValue, MapValue srcValue) {
     char srcMin = srcValue.getChar(valueIndex);
     char destMin = destValue.getChar(valueIndex);
     if (srcMin != 0 && (srcMin < destMin || destMin == 0)) {
-        destValue.putInt(valueIndex, srcMin);
+        destValue.putChar(valueIndex, srcMin);
     }
 }
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/groupby/MinCharGroupByFunction.java` around lines 122-128, the merge(MapValue destValue, MapValue srcValue) implementation stores the min value with destValue.putInt(...) while the column type is CHAR; change the write to destValue.putChar(valueIndex, srcMin) so it matches initValueTypes and the other methods (computeFirst, computeNext, computeBatch) that use putChar, using the same valueIndex and MapValue semantics.
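Whether the 4-byte write above actually clobbers anything depends on QuestDB's MapValue slot layout, but a generic `ByteBuffer` illustration (offsets and values invented for this sketch) shows why writing an `int` into a 2-byte CHAR slot is risky:

```java
import java.nio.ByteBuffer;

public class CharSlotClobber {
    public static void main(String[] args) {
        ByteBuffer slots = ByteBuffer.allocate(8); // big-endian by default
        slots.putChar(2, 'z'); // a neighboring 2-byte slot at offset 2
        slots.putInt(0, 'a');  // a 4-byte write into the 2-byte slot at offset 0
        // the int write covered bytes 0..3, overwriting the neighbor at offset 2
        System.out.println(slots.getChar(2));
    }
}
```

The neighboring slot now reads back as 'a' instead of 'z', which is why the typed `putChar` matters even when the stored value itself fits in an int.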
🧹 Nitpick comments (14)
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ShortGroupByFunctionBatchTest.java (1)
205-205: Use a non-zero `startRowId` on the second sum batch to harden API-contract coverage.

On Line 205, passing `0` again doesn't validate cross-segment offset handling. Using `2` here makes this test consistent with the new segmented-batch contract and better guards against future regressions.

Suggested change

```diff
-        function.computeBatch(value, ptr, 2, 0);
+        function.computeBatch(value, ptr, 2, 2);
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ShortGroupByFunctionBatchTest.java` at line 205, the second computeBatch call uses startRowId 0, which doesn't exercise cross-segment offsets; update the second invocation of function.computeBatch(value, ptr, 2, 0) to use a non-zero startRowId (e.g., 2) so it becomes function.computeBatch(value, ptr, 2, 2), ensuring computeBatch is tested with a non-zero startRowId to validate segmented-batch behavior.

core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java (1)
3132-3137: Update stale inline comment to match the new expectation.

Line 3132 says aggregate expressions disable vectorization, but Line 3136 now expects `vectorized: true`. Please align the comment with the actual behavior to avoid confusion.

✏️ Suggested comment tweak

```diff
-    @Test // expressions in aggregates disable vectorized impl
+    @Test // aggregate expressions may still vectorize depending on function support
     public void testGroupByNotKeyed2() throws Exception {
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java` around lines 3132-3137, the inline comment before testGroupByNotKeyed2 is now misleading: it states "expressions in aggregates disable vectorized impl" while the test expectation shows vectorized: true; update that comment to reflect the current behavior (e.g., "expressions in aggregates do not disable vectorized implementation" or similar) so the @Test comment matches the expected output in the test asserting vectorized: true.

core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountGeoHashGroupByFunctionInt.java (1)
51-51: Skip zero-delta map writes in the batch count path.

Line 51 always performs a map mutation, even when `nonNullCount` is `0`. In a tight vectorized path this adds avoidable write traffic.

♻️ Proposed tweak

```diff
-        mapValue.addLong(valueIndex, nonNullCount);
+        if (nonNullCount > 0) {
+            mapValue.addLong(valueIndex, nonNullCount);
+        }
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountGeoHashGroupByFunctionInt.java` at line 51, update the batch count path to avoid mutating the map when nonNullCount is zero: before calling mapValue.addLong(valueIndex, nonNullCount), check that nonNullCount != 0 and only perform the addLong when it is non-zero; this prevents unnecessary map writes in the tight vectorized path while preserving current semantics for positive counts.

core/src/main/java/io/questdb/griffin/engine/functions/groupby/MaxDateGroupByFunction.java (1)
48-56: Missing null guard for `batchMax`, inconsistent with `MinLongGroupByFunction`.

`Vect.maxLong` can return `Numbers.LONG_NULL` when all batch values are null. The current condition `batchMax > existing` will evaluate to false in that case (since `LONG_NULL` is `Long.MIN_VALUE`), so it won't corrupt data. However, this differs from the pattern in `MinLongGroupByFunction.computeBatch` (lines 51-56), which explicitly guards:

```java
if (batchMin != Numbers.LONG_NULL) {
    // ... update logic
}
```

Similarly, `merge()` at line 113 lacks a null guard, unlike `MinLongGroupByFunction.merge()` (line 120).

Consider adding an explicit guard for consistency and clarity:

♻️ Suggested refactor for consistency

```diff
 public void computeBatch(MapValue mapValue, long ptr, int count, long startRowId) {
     if (count > 0) {
         final long batchMax = Vect.maxLong(ptr, count);
-        final long existing = mapValue.getDate(valueIndex);
-        if (batchMax > existing) {
-            mapValue.putDate(valueIndex, batchMax);
+        if (batchMax != Numbers.LONG_NULL) {
+            final long existing = mapValue.getDate(valueIndex);
+            if (batchMax > existing || existing == Numbers.LONG_NULL) {
+                mapValue.putDate(valueIndex, batchMax);
+            }
         }
     }
 }
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/groupby/MaxDateGroupByFunction.java` around lines 48-56, add explicit null checks like MinLongGroupByFunction: in computeBatch, after computing batchMax = Vect.maxLong(ptr, count), guard with "if (batchMax != Numbers.LONG_NULL)" before comparing/updating mapValue (use mapValue.getDate(valueIndex) / mapValue.putDate(valueIndex, batchMax)); likewise update merge() to first check that the merged value != Numbers.LONG_NULL before merging into the existing value, mirroring MinLongGroupByFunction's null-guard pattern for clarity and consistency.

core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/DateGroupByFunctionBatchTest.java (1)
322-347: Strengthen min/max accumulation tests with a non-zero second-batch `startRowId`.

`min`/`max` should be row-id agnostic. Reusing `0` in both calls makes it easier for accidental row-id gating bugs to slip through unnoticed.

🔧 Suggested test hardening

```diff
-        function.computeBatch(value, ptr, 2, 0);
+        function.computeBatch(value, ptr, 2, 2);
 ...
-        function.computeBatch(value, ptr, 2, 0);
+        function.computeBatch(value, ptr, 2, 2);
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/DateGroupByFunctionBatchTest.java` around lines 322-347, update the two tests so the second computeBatch call uses a non-zero startRowId to ensure row-id agnosticism: in testMaxDateBatchAccumulates and testMinDateBatchAccumulates, change the second call to MaxDateGroupByFunction.computeBatch(...) and MinDateGroupByFunction.computeBatch(...) respectively to pass a non-zero startRowId (e.g., 2) instead of 0 while keeping the batch size (2) unchanged; this ensures the second batch is treated as a subsequent block of rows rather than restarting at row 0.

core/src/test/java/io/questdb/test/cairo/fuzz/ParallelGroupByFuzzTest.java (1)
2246-2255: Add plan/path assertions to the two new path-specific tests.

Both tests currently validate only final results. They can still pass after a fallback to non-vectorized/non-hybrid execution, which weakens regression detection for the intended execution paths.

Also applies to: 2323-2333

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/cairo/fuzz/ParallelGroupByFuzzTest.java` around lines 2246-2255, the two new tests (testParallelNonKeyedGroupByWithColumnTopsAndFirstLast and the similar test at lines 2323-2333 that call testParallelNonKeyedGroupBy(...)) only validate results and must also assert the execution path/plan to prevent silent fallbacks; update each test to capture and assert the query execution plan or runtime path (e.g., via existing helpers like assertPlanContains or assertPathUsed) to verify the expected vectorized/hybrid operator is present, and assert that no non-vectorized/non-hybrid fallback operator is used. Check for the unique operator or plan fragment name the engine uses for the intended path so the test fails if execution falls back.

core/src/main/java/io/questdb/griffin/engine/table/AsyncGroupByNotKeyedAtom.java (1)
97-104: Avoid leaking mutable batch metadata state.

`getBatchColumnIndexes()` (Line 193) returns the internal array directly, while `hasNonBatchFunctions` is derived once in the constructor (Lines 97-104). External mutation can silently desync these two and affect execution-path decisions.

♻️ Suggested defensive copy

```diff
@@
 import java.io.Closeable;
+import java.util.Arrays;
@@
-        this.batchColumnIndexes = batchColumnIndexes;
+        this.batchColumnIndexes = Arrays.copyOf(batchColumnIndexes, batchColumnIndexes.length);
@@
     public int[] getBatchColumnIndexes() {
-        return batchColumnIndexes;
+        return Arrays.copyOf(batchColumnIndexes, batchColumnIndexes.length);
     }
```

Also applies to: 193-195

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/table/AsyncGroupByNotKeyedAtom.java` around lines 97-104, batch metadata can be mutated externally because batchColumnIndexes is stored directly and getBatchColumnIndexes() exposes the internal array while hasNonBatchFunctions is computed once in the constructor; to fix, defensively copy the array both when assigning it in the constructor and when returning it from getBatchColumnIndexes(), so the internal boolean stays consistent with the stored array. Reference symbols: AsyncGroupByNotKeyedAtom, batchColumnIndexes, hasNonBatchFunctions, getBatchColumnIndexes().

core/src/main/java/io/questdb/griffin/engine/join/WindowJoinFastRecordCursorFactory.java (1)
889-892: Avoid hardcoding `startRowId` to `0` in vectorized batch calls.

Lines 889 and 891 currently force zero-based row ids. Passing a logical window-relative start (e.g., derived from `rowLo`) makes this path safer if aggregation is ever split into multiple `computeBatch()` calls per function.

💡 Future-proof wiring

```diff
+        final long batchStartRowId = rowLo;
         for (int i = 0, n = groupByFunctions.size(); i < n; i++) {
             int mapIndex = groupByFunctionToColumnIndex.getQuick(i);
             final long ptr = slaveData.get(idx, mapIndex + 2);
             if (ptr != 0) {
                 final long typeSize = ColumnType.sizeOfTag((short) groupByFuncTypes.getQuick(mapIndex));
                 // in case of include prevailing column values are shifted by one slot, hence `rowLo + 1`
                 final long efficientRowLo = includePrevailing ? rowLo + 1 : rowLo;
-                groupByFunctions.getQuick(i).computeBatch(value, columnSink.of(ptr).startAddress() + typeSize * efficientRowLo, (int) (rowHi - rowLo), 0);
+                groupByFunctions.getQuick(i).computeBatch(value, columnSink.of(ptr).startAddress() + typeSize * efficientRowLo, (int) (rowHi - rowLo), batchStartRowId);
             } else {
                 // no-arg function, e.g. count()
-                groupByFunctions.getQuick(i).computeBatch(value, 0, (int) (rowHi - rowLo), 0);
+                groupByFunctions.getQuick(i).computeBatch(value, 0, (int) (rowHi - rowLo), batchStartRowId);
             }
         }
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/join/WindowJoinFastRecordCursorFactory.java` around lines 889-892, the batch calls to groupByFunctions currently hardcode startRowId to 0 for no-arg functions, which breaks correctness if aggregation is split; update the no-arg computeBatch call so it uses the logical window-relative start (the same start used for the data-backed path, derived from rowLo / efficientRowLo) instead of 0 when calling groupByFunctions.getQuick(i).computeBatch(...).

core/src/main/java/io/questdb/griffin/engine/table/AsyncGroupByNotKeyedRecordCursorFactory.java (2)
282-297: Reduce duplicated row-by-row fallback logic.

The column-top fallback loop in `aggregateVect(...)` repeats the same row iteration/update pattern already present in `aggregate(...)`. A shared helper would reduce divergence risk between the two paths.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/table/AsyncGroupByNotKeyedRecordCursorFactory.java` around lines 282-297, the column-top fallback loop in aggregateVect(...) duplicates the row-by-row iteration/update logic from aggregate(...); extract that logic into a shared private helper (e.g., processFrameRowByRow) that accepts the frameMemory, frameRowCount, slotId, atom, value, and record (or the minimal pieces needed), moves the record.init(frameMemory)/record.setRowIndex(0) setup, obtains GroupByFunctionsUpdater via atom.getFunctionUpdater(slotId), and runs the for-loop using value.isNew(), functionUpdater.updateNew/updateExisting and correct rowId incrementing; then replace the duplicated loop in both aggregateVect(...) and aggregate(...) with a call to this helper so both paths share identical logic.

75-76: Rename boolean field to `isVectorized` for guideline compliance.

The new boolean field name `vectorized` should use an `is...` prefix for consistency with repository conventions.

♻️ Proposed rename

```diff
-    private final boolean vectorized;
+    private final boolean isVectorized;
@@
-        this.vectorized = (reducer == AGGREGATE_VECT);
+        this.isVectorized = (reducer == AGGREGATE_VECT);
@@
-        sink.attr("vectorized").val(vectorized);
+        sink.attr("vectorized").val(isVectorized);
```

As per coding guidelines: "When choosing a name for a boolean variable, field or method, always use the is... or has... prefix, as appropriate".

Also applies to: 136-137, 185-185

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/table/AsyncGroupByNotKeyedRecordCursorFactory.java` around lines 75-76, rename the boolean field vectorized to isVectorized and update all references (constructor parameter, assignments, any getters or uses, including the areas around lines 136-137 and 185) so the class, its methods and callers compile and follow the is... boolean naming guideline.

core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/FloatGroupByFunctionBatchTest.java (1)
71-83: Wrap newly added native-memory tests in `assertMemoryLeak()`.

These tests allocate off-heap buffers; adding `assertMemoryLeak()` would verify cleanup on all paths, including assertion-failure/error paths.

As per coding guidelines: "Write all tests using assertMemoryLeak(). This isn't needed for narrow unit tests that don't allocate native memory."

Also applies to: 126-138, 193-205, 222-237, 273-287, 303-315, 350-362, 395-418, 420-432

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/FloatGroupByFunctionBatchTest.java` around lines 71-83, the test allocates off-heap buffers (via allocateFloats and CountFloatGroupByFunction.computeBatch) but is not wrapped with assertMemoryLeak(), so native memory might not be verified/cleaned on failure; update testCountFloatBatchAccumulates (and the other similar tests in this file that allocate native buffers) to wrap the test body inside assertMemoryLeak(() -> { ... }), keeping the existing calls to prepare(function), allocateFloats(...), function.computeBatch(...), and function.getLong(value) unchanged, and do the same for the other test blocks around the noted ranges so all off-heap allocations are checked for leaks.

core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/GeoHashGroupByFunctionBatchTest.java (1)
82-97: Wrap newly added geohash batch tests with `assertMemoryLeak()`.

These tests rely on native buffers, so leak checks should be enforced at test execution scope, not only in teardown.

As per coding guidelines: "Write all tests using assertMemoryLeak(). This isn't needed for narrow unit tests that don't allocate native memory."

Also applies to: 114-130, 146-162, 178-194, 208-221, 246-259, 289-304, 366-381

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/GeoHashGroupByFunctionBatchTest.java` around lines 82-97, wrap the body of the new geohash batch test methods (e.g., testCountGeoHashByteBatchAccumulates) in assertMemoryLeak(() -> { ... }) to ensure native buffer leak checks run during test execution; keep the existing resource handling (try-with-resources on SimpleMapValue from prepare(...)) and calls to allocateBytes(...) and function.computeBatch(...) inside the lambda so they execute under the leak assertion. Apply the same wrapping to all other added geohash batch tests in this file so each test runs under assertMemoryLeak.

core/src/test/java/io/questdb/test/griffin/engine/groupby/GroupByFunctionCaseTest.java (1)
245-256: Rename boolean to `isAsyncVectorized` for guideline compliance.

Line 245 introduces a boolean local without the required `is...`/`has...` prefix.

✏️ Proposed refactor

```diff
-        boolean asyncVectorized = !rosti && keys == null
+        boolean isAsyncVectorized = !rosti && keys == null
                 && ((t == CHAR && f >= 4) // min/max on char
                 || (t == FLOAT && (f == 2 || f >= 4))); // sum/min/max on float
@@
-        planSink.put(" vectorized: ").put(asyncVectorized).put('\n');
+        planSink.put(" vectorized: ").put(isAsyncVectorized).put('\n');
```

As per coding guidelines: "When choosing a name for a boolean variable, field or method, always use the is... or has... prefix, as appropriate".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/groupby/GroupByFunctionCaseTest.java` around lines 245-256, rename the boolean local asyncVectorized to isAsyncVectorized and update all its usages in this method: replace its declaration and the reference in the planSink.put call so the variable name follows the is... prefix guideline; ensure the boolean expression remains unchanged (still using rosti, keys, t, f) and run tests to confirm no other references need updating.

core/src/main/java/io/questdb/griffin/engine/functions/groupby/AvgDoubleGroupByFunction.java (1)
55-59: Inconsistent null checks: `Numbers.isNull` vs `Double.isNaN`.

Line 53 uses `Numbers.isNull(batchSum)` while Line 55 uses `Double.isNaN(prevSum)`. For consistency and correctness, consider using the same null-checking approach for both. If `Numbers.isNull` is the correct check for batch results from `Vect.sumDoubleAcc`, it should also be used for `prevSum`.

♻️ Suggested alignment

```diff
         if (!Numbers.isNull(batchSum)) {
             final double prevSum = mapValue.getDouble(valueIndex);
-            if (!Double.isNaN(prevSum)) {
+            if (!Numbers.isNull(prevSum)) {
                 mapValue.putDouble(valueIndex, prevSum + batchSum);
             } else {
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/groupby/AvgDoubleGroupByFunction.java` around lines 55-59, the null-checks are inconsistent: replace the Double.isNaN(prevSum) check with Numbers.isNull(prevSum) so both batchSum and prevSum use the same null-detection (values produced by Vect.sumDoubleAcc should be tested with Numbers.isNull); update the branch that writes to mapValue (using valueIndex) to treat a null prevSum as missing and only store prevSum + batchSum when prevSum is not null, otherwise store batchSum.
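The isNull/isNaN/isFinite distinctions flagged in several comments above come down to three different double predicates. A plain-Java contrast (assuming, as QuestDB does for doubles, that NULL is represented as NaN, so `Numbers.isNull` is modeled here by `Double.isNaN`) shows where they disagree:

```java
public class DoublePredicates {
    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY;
        // Under a NaN-based null check, Infinity counts as a real value:
        System.out.println(Double.isNaN(inf));    // false
        // But an isFinite guard rejects Infinity just like NaN:
        System.out.println(Double.isFinite(inf)); // false
    }
}
```

This is exactly the gap between a `!Numbers.isNull(...)` predicate (accepts ±Infinity) and a `Numbers.isFinite(...)` predicate (rejects it), so mixing the two in one aggregate gives path-dependent results.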
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@core/src/main/java/io/questdb/griffin/engine/functions/groupby/FirstCharGroupByFunction.java`:
- Around line 48-54: computeBatch in FirstCharGroupByFunction records the
first-seen row by comparing startRowId against existingRowId, but several
window/async join callers (WindowJoinFastRecordCursorFactory,
AsyncWindowJoinRecordCursorFactory, AsyncWindowJoinFastRecordCursorFactory)
always pass 0 for startRowId across batches, breaking correctness when later
batches contain lower row IDs; fix by changing those callers to compute and pass
the actual starting row ID for each batch (the first rowId represented by ptr
for that batch) instead of 0 so FirstCharGroupByFunction.computeBatch and its
use of mapValue/valueIndex continue to pick the true lowest row id, and add
regression tests exercising multi-batch group cases (e.g., mimic
BooleanGroupByFunctionBatchTest pattern) to ensure updates occur when later
batches contain smaller row IDs.
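To see why the hardcoded startRowId breaks "first" semantics, here is a toy sketch (hypothetical names, not QuestDB's classes): when every batch claims to start at row 0, a batch that actually holds later rows can wrongly win, and the genuine earliest row, arriving in a later batch, can never displace it.

```java
// Toy order-sensitive "first" aggregate; not QuestDB code.
final class ToyFirst {
    long bestRowId = Long.MAX_VALUE;
    long bestValue;

    // values[i] is meant to be the value at rowId == startRowId + i
    void computeBatch(long[] values, long startRowId) {
        for (int i = 0; i < values.length; i++) {
            if (startRowId + i < bestRowId) { // keep the lowest row id seen so far
                bestRowId = startRowId + i;
                bestValue = values[i];
            }
        }
    }
}
```

With honest start ids, processing rows 5..6 first and then rows 0..1 still yields the row-0 value; with startRowId hardcoded to 0 for both batches, the stale row-5 value wins.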
In
`@core/src/main/java/io/questdb/griffin/engine/functions/groupby/FirstNotNullIPv4GroupByFunctionFactory.java`:
- Around line 68-83: The computeBatch loop in
FirstNotNullIPv4GroupByFunctionFactory relies on a correct startRowId to
preserve first_not_null ordering, but the window join factories call
computeBatch(...) with hardcoded startRowId=0; update
WindowJoinFastRecordCursorFactory, AsyncWindowJoinRecordCursorFactory, and
AsyncWindowJoinFastRecordCursorFactory so they pass the actual starting slave
row id for the batch (derive startRowId from the first row in the batch range,
e.g., slaveRowIds.get(rowLo) or the equivalent index used for the batch) instead
of 0 when invoking computeBatch, ensuring computeBatch's rowId arithmetic
(startRowId + offset) reflects real row ordering for first_not_null.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/groupby/LastNotNullTimestampGroupByFunction.java`:
- Around line 49-52: In LastNotNullTimestampGroupByFunction ensure
computeBatch() uses the same null-value fallback as computeNext(): when deciding
to replace the stored pair, check not only (rowId > existingRowId ||
existingRowId == Numbers.LONG_NULL) but also whether the stored timestamp is
NULL (e.g., mapValue.getLong(valueIndex + 1) == Numbers.LONG_NULL); update the
conditional in computeBatch() to include that OR check and write both
mapValue.putLong(valueIndex, rowId) and mapValue.putLong(valueIndex + 1, value)
accordingly. Also add a regression test that initializes a group with a null
stored value state, then calls computeBatch() with a non-null candidate and
asserts the stored values match the candidate (and matches computeNext()
behavior).
In
`@core/src/main/java/io/questdb/griffin/engine/functions/groupby/SumFloatGroupByFunction.java`:
- Around line 59-66: In SumFloatGroupByFunction, make computeBatch's
overflow/null check consistent with computeNext by replacing the existing check
that uses !Float.isNaN(existing) with Float.isFinite(existing); in the
computeBatch path that reads mapValue.getFloat(valueIndex) into existing, only
add acc to existing when Float.isFinite(existing), otherwise overwrite with acc
(same behavior as computeNext which uses Float.isFinite(sum)), so adjust the
conditional around mapValue.putFloat accordingly to reference
Float.isFinite(existing) and keep the same write behavior to
mapValue.putFloat(valueIndex, acc) when not finite.
In
`@core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinRecordCursorFactory.java`:
- Around line 500-503: The computeBatch() calls in
AsyncWindowJoinRecordCursorFactory are passing a hardcoded startRowId of 0 which
breaks the GroupByFunction.computeBatch() contract; update every
groupByFunctions.getQuick(i).computeBatch(...) invocation (the ones that
currently pass 0) to pass the actual batch start row id (use rowLo / cast to int
where necessary) instead of 0 so the i-th record is treated as startRowId + i;
ensure this change is applied to all occurrences around the
columnSink/of(ptr)/typeSize usages and in the no-arg branches so order-sensitive
aggregates like FirstDoubleGroupByFunction get the correct startRowId.
In
`@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/BooleanGroupByFunctionBatchTest.java`:
- Around line 66-78: The new native-memory test testFirstBooleanBatchAccumulates
(and the similar test around allocateBooleans at the other spot) must be wrapped
with the assertMemoryLeak() harness; update the test method(s) so the body that
calls allocateBooleans and function.computeBatch (for
FirstBooleanGroupByFunction and any other affected tests) is executed inside
assertMemoryLeak(() -> { ... }); thereby ensuring allocateBooleans usage is
covered by the leak checker while leaving the test assertions (e.g.,
Assert.assertTrue(function.getBool(value))) inside the lambda.
In
`@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/CharGroupByFunctionBatchTest.java`:
- Around line 243-245: The test calls computeBatch with length 4 after
allocateChars('d','b','c') only allocates 3 chars, risking OOB reads; update the
call in testMinCharBatchSimple to pass the correct batch length (use 3) or
allocate four characters to match the 4; locate the allocateChars(...) call and
the function.computeBatch(value, ptr, 4, 0) invocation and either change the
third argument from 4 to 3 or add a fourth char to allocateChars so the length
matches.
In
`@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ShortGroupByFunctionBatchTest.java`:
- Around line 88-100: Wrap each test that uses native allocation in an
assertMemoryLeak(...) block: for the testFirstShortBatchAccumulates method (and
the other new tests around lines with FirstShortGroupByFunction usage at 161-175
and 197-209), place the existing test body that calls allocateShorts(...) and
invokes FirstShortGroupByFunction.computeBatch(...) and function.getShort(...)
inside assertMemoryLeak(() -> { ... }); so native memory allocated by
allocateShorts is always checked and released across all paths; keep the
existing try-with-resources (SimpleMapValue value = prepare(function)) inside
that lambda.
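`assertMemoryLeak()` is QuestDB's own test harness, but as a rough model of what such a wrapper enforces (the counter and names below are invented for illustration, not QuestDB's implementation): snapshot native allocation before the test body, run it, and assert the count is back at the baseline afterwards, even on failure paths.

```java
import java.util.concurrent.atomic.AtomicLong;

final class LeakCheck {
    static final AtomicLong allocatedBytes = new AtomicLong();

    static long alloc(long size) { allocatedBytes.addAndGet(size); return size; }
    static void free(long size)  { allocatedBytes.addAndGet(-size); }

    // Hypothetical assertMemoryLeak-style wrapper
    static void assertNoLeak(Runnable testBody) {
        long before = allocatedBytes.get();
        try {
            testBody.run();
        } finally {
            long after = allocatedBytes.get();
            if (after != before) {
                throw new AssertionError("leaked " + (after - before) + " bytes");
            }
        }
    }
}
```

Because the check runs in `finally`, a test that allocates and then fails an assertion still gets flagged if it skipped its `free`, which is why the review asks for the wrapper on every test that touches native buffers.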
---
Outside diff comments:
In
`@core/src/main/java/io/questdb/griffin/engine/functions/groupby/LastNotNullDoubleGroupByFunction.java`:
- Around line 41-68: computeNext currently uses
Numbers.isFinite(arg.getDouble(record)) which rejects ±Infinity while
computeBatch only rejects Numbers.isNull; update computeNext to use the same
null-check as computeBatch (i.e., treat any non-null double, including
±Infinity, as valid) by replacing the Numbers.isFinite predicate with a
!Numbers.isNull check on arg.getDouble(record), keeping the existing logic that
compares mapValue.getDouble(valueIndex + 1) and mapValue.getLong(valueIndex) and
calls computeFirst(mapValue, record, rowId) when appropriate.
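To make the behavioral gap concrete, here is a small standalone sketch; the two predicates mimic what an isFinite-style check and a NaN-based null check would do, assuming the double-NULL-as-NaN convention. This is illustrative plain Java, not QuestDB code:

```java
public class DoubleNullPredicates {
    // analogous to a null check where NULL is encoded as NaN
    static boolean isNullLike(double v) {
        return Double.isNaN(v);
    }

    // analogous to an isFinite check: rejects NaN and both infinities
    static boolean isFiniteLike(double v) {
        return !Double.isNaN(v) && !Double.isInfinite(v);
    }

    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY;
        // a row path guarded by isFinite skips infinities...
        System.out.println(isFiniteLike(inf));  // false -> row skipped
        // ...while a batch path guarded by the null check accepts them
        System.out.println(!isNullLike(inf));   // true -> row aggregated
    }
}
```

The same ±Infinity input is therefore dropped by one guard and kept by the other, which is exactly the row-path/batch-path divergence the comment describes.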
In
`@core/src/main/java/io/questdb/griffin/engine/functions/groupby/MinCharGroupByFunction.java`:
- Around line 122-128: In MinCharGroupByFunction, the merge(MapValue destValue,
MapValue srcValue) implementation stores the min value with
destValue.putInt(...) while the column type is CHAR; change the write to
destValue.putChar(valueIndex, srcMin) so it matches initValueTypes and the other
methods (computeFirst, computeNext, computeBatch) that use putChar and uses the
same valueIndex and MapValue semantics.
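Why the putInt/putChar distinction matters can be shown with a standalone sketch: a 4-byte write into a 2-byte slot can spill into the neighbouring slot. The two-slot layout below is purely illustrative and does not reflect MapValue's actual memory layout:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class SlotWidthSketch {
    // Writes 'b' as a 4-byte int into the first 2-byte slot and returns the
    // neighbouring slot's content afterwards.
    static char neighborAfterIntWrite() {
        ByteBuffer slots = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
        slots.putChar(0, 'a'); // slot 0: 2 bytes
        slots.putChar(2, 'z'); // slot 1: 2 bytes
        slots.putInt(0, 'b');  // 4-byte write spills into the second slot
        return slots.getChar(2);
    }

    public static void main(String[] args) {
        // prints 0: the 'z' that lived in the second slot was clobbered
        System.out.println((int) neighborAfterIntWrite());
    }
}
```

Even when a wider write happens to be harmless in a given layout, matching the accessor to the declared column type keeps the code self-documenting and consistent with the other methods.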
---
Nitpick comments:
In
`@core/src/main/java/io/questdb/griffin/engine/functions/groupby/AvgDoubleGroupByFunction.java`:
- Around line 55-59: The null-checks in AvgDoubleGroupByFunction are
inconsistent: replace the Double.isNaN(prevSum) check with
Numbers.isNull(prevSum) so both batchSum and prevSum use the same null-detection
(the values produced by Vect.sumDoubleAcc should be tested with Numbers.isNull);
update the branch that writes to mapValue (using valueIndex) to treat a null
prevSum as missing and only set prevSum+batchSum when prevSum is not
Numbers.isNull, otherwise store batchSum, and ensure any existing logic that
relied on Double.isNaN is preserved but using Numbers.isNull instead.
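A minimal sketch of the intended accumulation semantics, assuming the NaN-encodes-null convention named above. This is standalone plain Java, not the actual AvgDoubleGroupByFunction code:

```java
public class AvgAccumulationSketch {
    // Folds one batch sum into the running sum; NaN means "no value yet".
    static double accumulate(double prevSum, double batchSum) {
        if (Double.isNaN(batchSum)) {
            return prevSum;            // empty/null batch: keep previous state
        }
        if (Double.isNaN(prevSum)) {
            return batchSum;           // first non-empty batch
        }
        return prevSum + batchSum;     // normal accumulation
    }

    public static void main(String[] args) {
        double sum = Double.NaN;               // "no rows seen yet"
        sum = accumulate(sum, 10.0);
        sum = accumulate(sum, Double.NaN);     // null batch leaves state intact
        sum = accumulate(sum, 5.0);
        System.out.println(sum);               // 15.0
    }
}
```

Using one null-detection predicate for both prevSum and batchSum is what keeps the state machine above unambiguous; mixing predicates is how the inconsistency crept in.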
In
`@core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountGeoHashGroupByFunctionInt.java`:
- Line 51: In CountGeoHashGroupByFunctionInt update the batch count path to
avoid mutating the map when nonNullCount is zero: before calling
mapValue.addLong(valueIndex, nonNullCount) check if nonNullCount != 0 and only
perform the addLong when it's non-zero (leave existing behavior for non-zero
counts); this prevents unnecessary map writes in the tight vectorized path while
preserving current semantics for positive counts.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/groupby/MaxDateGroupByFunction.java`:
- Around line 48-56: Add explicit null checks like MinLongGroupByFunction: in
MaxDateGroupByFunction.computeBatch, after computing batchMax =
Vect.maxLong(ptr, count) guard with "if (batchMax != Numbers.LONG_NULL)" before
comparing/updating mapValue (use mapValue.getDate(valueIndex) /
mapValue.putDate(valueIndex, batchMax)); likewise update
MaxDateGroupByFunction.merge() to first check merged value != Numbers.LONG_NULL
before merging into existing value to mirror MinLongGroupByFunction's null-guard
pattern for clarity and consistency.
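The explicit-guard pattern being asked for, sketched in standalone Java. LONG_NULL as Long.MIN_VALUE mirrors the sentinel named in the review; note that because the sentinel is also the smallest long, the guard in the max case is mainly for clarity and symmetry with the min implementation:

```java
public class MaxLongGuardSketch {
    static final long LONG_NULL = Long.MIN_VALUE; // assumed null sentinel

    static long mergeMax(long current, long batchMax) {
        if (batchMax == LONG_NULL) {
            return current;            // all-null batch: no update
        }
        if (current == LONG_NULL || batchMax > current) {
            return batchMax;
        }
        return current;
    }

    public static void main(String[] args) {
        long v = LONG_NULL;
        v = mergeMax(v, 100L);
        v = mergeMax(v, LONG_NULL);    // explicit guard keeps 100
        v = mergeMax(v, 50L);
        System.out.println(v);         // 100
    }
}
```

For min-style aggregation the guard is not just stylistic: there the sentinel would otherwise compare below every valid value and win incorrectly, which is why mirroring the same pattern in both directions pays off.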
In
`@core/src/main/java/io/questdb/griffin/engine/join/WindowJoinFastRecordCursorFactory.java`:
- Around line 889-892: The batch calls to groupByFunctions currently hardcode
startRowId to 0 for no-arg functions, which breaks correctness if aggregation is
split; update the no-arg computeBatch call in WindowJoinFastRecordCursorFactory
so it uses the logical window-relative start (the same start used for the
data-backed path) instead of 0 — e.g., pass the int-casted window start (derived
from rowLo / efficientRowLo) as the third computeBatch startRowId parameter when
calling groupByFunctions.getQuick(i).computeBatch(value, 0/...,
(int)(rowHi-rowLo), startRowId).
In
`@core/src/main/java/io/questdb/griffin/engine/table/AsyncGroupByNotKeyedAtom.java`:
- Around line 97-104: Batch metadata can be mutated externally because
batchColumnIndexes is stored directly and getBatchColumnIndexes() exposes the
internal array while hasNonBatchFunctions is computed once in the constructor;
to fix, defensively copy the array both when assigning it in the constructor and
when returning it from getBatchColumnIndexes(), and recompute
hasNonBatchFunctions based on the defensive copy (or compute it from the copied
array) so the internal boolean stays consistent with the stored array; reference
symbols: AsyncGroupByNotKeyedAtom, batchColumnIndexes, hasNonBatchFunctions,
getBatchColumnIndexes().
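The defensive-copy pattern being suggested, as a standalone sketch. Class and field names mirror the review text; the -1 marker for non-batch functions is an assumption made purely for illustration:

```java
import java.util.Arrays;

public class BatchMetadataSketch {
    private final int[] batchColumnIndexes;
    private final boolean hasNonBatchFunctions;

    BatchMetadataSketch(int[] batchColumnIndexes) {
        // copy on the way in, so later caller mutation cannot change state
        this.batchColumnIndexes = batchColumnIndexes.clone();
        boolean nonBatch = false;
        for (int idx : this.batchColumnIndexes) {
            if (idx < 0) { // assumption: negative index marks a non-batch function
                nonBatch = true;
                break;
            }
        }
        // computed from the copy, so the flag always matches the stored array
        this.hasNonBatchFunctions = nonBatch;
    }

    int[] getBatchColumnIndexes() {
        return batchColumnIndexes.clone(); // copy on the way out
    }

    boolean hasNonBatchFunctions() {
        return hasNonBatchFunctions;
    }

    public static void main(String[] args) {
        int[] src = {0, 1, -1};
        BatchMetadataSketch m = new BatchMetadataSketch(src);
        src[2] = 2; // external mutation has no effect on the stored metadata
        System.out.println(m.hasNonBatchFunctions());              // true
        System.out.println(Arrays.toString(m.getBatchColumnIndexes()));
    }
}
```

Whether the copies are worth it in a hot path is a judgment call; the cheaper alternative is to document the arrays as owned by the atom and never mutated by callers.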
In
`@core/src/main/java/io/questdb/griffin/engine/table/AsyncGroupByNotKeyedRecordCursorFactory.java`:
- Around line 282-297: The column-top fallback loop in aggregateVect(...)
duplicates the row-by-row iteration/update logic from aggregate(...); extract
that logic into a shared private helper on
AsyncGroupByNotKeyedRecordCursorFactory (e.g., applyRowByRowFallback or
processFrameRowByRow) that accepts the frameMemory, frameRowCount, slotId, atom,
value, and record (or the minimal pieces needed), moves the
record.init(frameMemory)/record.setRowIndex(0) setup, obtains
GroupByFunctionsUpdater via atom.getFunctionUpdater(slotId), and runs the
for-loop using value.isNew(), functionUpdater.updateNew/updateExisting and
correct rowId incrementing; then replace the duplicated loop in both
aggregateVect(...) and aggregate(...) with a call to this helper so both paths
share identical logic.
- Around line 75-76: Rename the boolean field vectorized in
AsyncGroupByNotKeyedRecordCursorFactory to isVectorized and update all
references (constructor parameter, assignments, any getters or uses) to the new
name; specifically change the private field declaration, the constructor
parameter and its assignment, and any reads of vectorized at locations
referenced (around the areas previously at lines ~136-137 and ~185) to use
isVectorized so the class, its methods and callers compile and follow the is...
boolean naming guideline.
In `@core/src/test/java/io/questdb/test/cairo/fuzz/ParallelGroupByFuzzTest.java`:
- Around line 2246-2255: The two new tests
(testParallelNonKeyedGroupByWithColumnTopsAndFirstLast and the similar test at
lines 2323-2333 that call testParallelNonKeyedGroupBy(...)) only validate
results and must also assert the execution path/plan to prevent silent
fallbacks; update each test to capture and assert the query execution plan or
runtime path (e.g., via existing helpers like assertPlanContains or
assertPathUsed) to verify the expected vectorized/hybrid operator is present
(reference the call to testParallelNonKeyedGroupBy and the SQL string passed)
and assert that no non-vectorized/non-hybrid fallback operator is used. Ensure
the assertion checks for the unique operator or plan fragment name your engine
uses for the intended path so the test fails if execution falls back.
In
`@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/DateGroupByFunctionBatchTest.java`:
- Around line 322-347: Update the two tests so the second computeBatch call uses
a non-zero startRowId to ensure row-id agnosticism: in
testMaxDateBatchAccumulates and testMinDateBatchAccumulates change the second
call to MaxDateGroupByFunction.computeBatch(...) and
MinDateGroupByFunction.computeBatch(...) respectively to pass a non-zero
startRowId (e.g., 2) instead of 0 while keeping the batch size (2) unchanged;
this ensures the second batch is treated as a subsequent block of rows rather
than restarting at row 0.
In
`@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/FloatGroupByFunctionBatchTest.java`:
- Around line 71-83: The test allocates off-heap buffers (via allocateFloats and
CountFloatGroupByFunction.computeBatch) but is not wrapped with
assertMemoryLeak(), so native memory might not be verified/cleaned on failure;
update the test method testCountFloatBatchAccumulates (and the other similar
tests in this file that allocate native buffers) to wrap the test body inside
assertMemoryLeak(() -> { ... }), keeping the existing calls to
prepare(function), allocateFloats(...), function.computeBatch(...), and
function.getLong(value) unchanged, and do the same for other
CountFloatGroupByFunction-related test methods referenced in the review (the
blocks around the noted ranges) so all off-heap allocations are checked for
leaks.
In
`@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/GeoHashGroupByFunctionBatchTest.java`:
- Around line 82-97: Wrap the body of the new geohash batch test methods (e.g.,
testCountGeoHashByteBatchAccumulates) in assertMemoryLeak(() -> { ... }) to
ensure native buffer leak checks run during test execution; keep the existing
resource handling (try-with-resources on SimpleMapValue from prepare(...)) and
calls to allocateBytes(...) and function.computeBatch(...) inside the lambda so
they execute under the leak assertion. Apply the same wrapping to all other
added geohash batch tests referenced (the additional test methods in this file)
so each test runs under assertMemoryLeak.
In
`@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ShortGroupByFunctionBatchTest.java`:
- Line 205: In ShortGroupByFunctionBatchTest, the second computeBatch call uses
startRowId 0 which doesn't exercise cross-segment offsets; update the second
invocation of function.computeBatch(value, ptr, 2, 0) to use a non-zero
startRowId (e.g., 2) so it becomes function.computeBatch(value, ptr, 2, 2),
ensuring the computeBatch method is tested with a non-zero startRowId to
validate segmented-batch behavior.
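Why a non-zero startRowId is worth exercising can be shown with a standalone sketch of first()-style aggregation: the stored row id decides which batch wins, so if every batch were replayed with startRowId = 0, a later batch would be indistinguishable from the true first one. Plain Java, not the actual group-by function:

```java
public class FirstWithRowIdSketch {
    long firstRowId = Long.MIN_VALUE; // stand-in for a LONG_NULL sentinel
    short firstValue;

    // records the first value seen in global row-id order
    void computeBatch(short[] values, long startRowId) {
        if (values.length == 0) {
            return;
        }
        long rowId = startRowId; // row id of values[0]
        if (firstRowId == Long.MIN_VALUE || rowId < firstRowId) {
            firstRowId = rowId;
            firstValue = values[0];
        }
    }

    public static void main(String[] args) {
        FirstWithRowIdSketch f = new FirstWithRowIdSketch();
        f.computeBatch(new short[]{10, 20}, 0); // rows 0..1
        f.computeBatch(new short[]{30, 40}, 2); // rows 2..3 -> cannot win
        System.out.println(f.firstValue);       // 10
    }
}
```

A test that passes 0 for both batches never exercises the row-id comparison at all, which is exactly the gap the comment is pointing at.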
In
`@core/src/test/java/io/questdb/test/griffin/engine/groupby/GroupByFunctionCaseTest.java`:
- Around line 245-256: Rename the boolean local asyncVectorized to
isAsyncVectorized and update all its usages in this method: replace its
declaration "boolean asyncVectorized = ..." and the reference in the
planSink.put call so the variable name follows the is... prefix guideline;
ensure the boolean expression remains unchanged (still using rosti, keys, t, f)
and run tests to confirm no other references need updating.
In `@core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java`:
- Around line 3132-3137: The inline comment before the test method
testGroupByNotKeyed2 is now misleading: it states "expressions in aggregates
disable vectorized impl" while the test expectation shows vectorized: true;
update that comment to reflect the current behavior (e.g., "expressions in
aggregates do not disable vectorized implementation" or similar) so the `@Test`
comment matches the expected output in the test asserting vectorized: true in
testGroupByNotKeyed2.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8f0b9c3e-abe8-49b3-bea8-aad44e7d0f2a
📒 Files selected for processing (103)
Resolved review threads:
- core/src/main/java/io/questdb/griffin/engine/functions/groupby/FirstCharGroupByFunction.java
- ...java/io/questdb/griffin/engine/functions/groupby/FirstNotNullIPv4GroupByFunctionFactory.java
- ...in/java/io/questdb/griffin/engine/functions/groupby/LastNotNullTimestampGroupByFunction.java
- core/src/main/java/io/questdb/griffin/engine/functions/groupby/SumFloatGroupByFunction.java (outdated)
- core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinRecordCursorFactory.java
- ...t/java/io/questdb/test/griffin/engine/functions/groupby/BooleanGroupByFunctionBatchTest.java
- ...test/java/io/questdb/test/griffin/engine/functions/groupby/CharGroupByFunctionBatchTest.java
- ...est/java/io/questdb/test/griffin/engine/functions/groupby/ShortGroupByFunctionBatchTest.java
@glasstiger have you seen #6805 (comment)? Coderabbit frequently hallucinates, so it's better to double-check whether its comments make sense.

I've reverted the fix for bugs 1-3 in fb08047

It did not hallucinate.
The fix (fa588c6) changed
The same query could produce different results depending on whether QuestDB takes the batch or row-by-row code path (e.g. column tops force fallback to row-by-row). That's worse than the original behavior where all three methods were at least consistent. Same applies to As for For other two functions, the revert restores
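The batch-vs-row consistency point is easy to reproduce in isolation: double addition is not associative, so the same rows accumulated in a different order can legally produce different answers. A standalone illustration in plain Java, unrelated to the QuestDB implementation:

```java
public class SumOrderSketch {
    static double sumLeftToRight(double[] xs) {
        double s = 0;
        for (double x : xs) {
            s += x; // plain IEEE-754 accumulation, order-sensitive
        }
        return s;
    }

    public static void main(String[] args) {
        // same three values, two accumulation orders
        System.out.println(sumLeftToRight(new double[]{1e100, 1.0, -1e100})); // 0.0
        System.out.println(sumLeftToRight(new double[]{1e100, -1e100, 1.0})); // 1.0
    }
}
```

This is why keeping the batch path and the row-by-row path arithmetically equivalent matters more than making either one individually "more correct": a query whose plan flips between the paths would otherwise flip its answer.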
@CodeRabbit re-review the PR

🧩 Analysis chain
🏁 Script executed: Length of output: 34846
✅ All previous concerns resolved
SumDoubleGroupByFunction
All three code paths now use
The inconsistency
SumFloatGroupByFunction
Same pattern:
AvgDoubleGroupByFunction
The tricky
🟠 Remaining dormant issue (unchanged from before)
Window-join
Overall
The PR is in a correct, internally consistent state. Coverage at 98.92% is excellent, with all modified production files at 100%. No blocking issues found. The
🧠 Learnings used
✅ Actions performed
Full review triggered.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/MinCharGroupByFunctionFactoryTest.java (1)
62-90: ⚠️ Potential issue | 🟠 Major
Wrap this test with assertMemoryLeak() to enforce resource-safety checks.
testFirstNull() allocates writer/cursor resources and should follow the test harness leak-check pattern.
Proposed fix
 @Test
 public void testFirstNull() throws SqlException {
-    execute("create table tab (f char)");
-
-    final Rnd rnd = new Rnd();
-    char min = 0;
-    try (TableWriter w = getWriter("tab")) {
-        TableWriter.Row r = w.newRow();
-        r.append();
-        for (int i = 100; i > 10; i--) {
-            r = w.newRow();
-            char ch = rnd.nextChar();
-            r.putChar(0, ch);
-            r.append();
-            if (ch > 0 && (ch < min || min == 0)) {
-                min = ch;
-            }
-        }
-        w.commit();
-    }
-
-    try (RecordCursorFactory factory = select("select min(f) from tab")) {
-        try (RecordCursor cursor = factory.getCursor(sqlExecutionContext)) {
-            Record record = cursor.getRecord();
-            Assert.assertEquals(1, cursor.size());
-            Assert.assertTrue(cursor.hasNext());
-            Assert.assertEquals(min, record.getChar(0));
-        }
-    }
+    assertMemoryLeak(() -> {
+        execute("create table tab (f char)");
+
+        final Rnd rnd = new Rnd();
+        char min = 0;
+        try (TableWriter w = getWriter("tab")) {
+            TableWriter.Row r = w.newRow();
+            r.append();
+            for (int i = 100; i > 10; i--) {
+                r = w.newRow();
+                char ch = rnd.nextChar();
+                r.putChar(0, ch);
+                r.append();
+                if (ch > 0 && (ch < min || min == 0)) {
+                    min = ch;
+                }
+            }
+            w.commit();
+        }
+
+        try (RecordCursorFactory factory = select("select min(f) from tab")) {
+            try (RecordCursor cursor = factory.getCursor(sqlExecutionContext)) {
+                Record record = cursor.getRecord();
+                Assert.assertEquals(1, cursor.size());
+                Assert.assertTrue(cursor.hasNext());
+                Assert.assertEquals(min, record.getChar(0));
+            }
+        }
+    });
 }
As per coding guidelines: "**/test/**/*.java: Write all tests using assertMemoryLeak(). This isn't needed for narrow unit tests that don't allocate native memory."
🤖 Prompt for AI Agents
**/test/**/*.java: Write all tests using assertMemoryLeak(). This isn't needed for narrow unit tests that don't allocate native memory."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/MinCharGroupByFunctionFactoryTest.java` around lines 62 - 90, The test method testFirstNull allocates native resources (TableWriter/RecordCursor) and must be wrapped with assertMemoryLeak to ensure no leaks; update testFirstNull to call assertMemoryLeak(() -> { ... }) around the existing test body (the parts that call getWriter("tab"), select("select min(f) from tab"), and related TableWriter/RecordCursor usage) so that resource allocation and cursor/writer lifecycles are executed inside the assertMemoryLeak closure and existing try-with-resources blocks remain unchanged inside it.core/src/main/java/io/questdb/griffin/engine/functions/groupby/MinCharGroupByFunction.java (1)
122-127:⚠️ Potential issue | 🟠 MajorType mismatch:
putIntused for char value.Line 126 uses
destValue.putInt(valueIndex, srcMin)to store acharvalue, but the column type isCHAR(seeinitValueTypesat line 108). This should useputCharfor consistency with how the value is stored elsewhere (e.g.,computeFirstat line 68,computeBatchat line 60).🐛 Proposed fix
public void merge(MapValue destValue, MapValue srcValue) { char srcMin = srcValue.getChar(valueIndex); char destMin = destValue.getChar(valueIndex); if (srcMin != 0 && (srcMin < destMin || destMin == 0)) { - destValue.putInt(valueIndex, srcMin); + destValue.putChar(valueIndex, srcMin); } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/groupby/MinCharGroupByFunction.java` around lines 122 - 127, The merge method in MinCharGroupByFunction writes a char using destValue.putInt(valueIndex, srcMin); change this to use the correct char setter destValue.putChar(valueIndex, srcMin) so the stored type matches initValueTypes and the other methods (computeFirst, computeBatch) that use char accessors; ensure valueIndex and srcMin remain the same and no other logic is changed.core/src/main/java/io/questdb/griffin/engine/functions/groupby/LastNotNullIPv4GroupByFunctionFactory.java (1)
82-86:⚠️ Potential issue | 🟡 MinorAdd row-id guard to
LastNotNullIPv4GroupByFunctionFactory.computeNext.The IPv4 implementation is missing the row-id check present in other
LastNotNull*functions (Double, Char, etc.). It unconditionally callscomputeFirstwhen the value is non-null, which can overwrite a newer value with an older one if records are processed out of order.Update line 84 to check both that the stored value is null (first assignment) and that the incoming rowId is newer:
if (Numbers.IPv4_NULL != arg.getIPv4(record)) { if (mapValue.getIPv4(valueIndex + 1) == Numbers.IPv4_NULL || rowId > mapValue.getLong(valueIndex)) { computeFirst(mapValue, record, rowId); } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/groupby/LastNotNullIPv4GroupByFunctionFactory.java` around lines 82 - 86, In LastNotNullIPv4GroupByFunctionFactory.computeNext the method currently calls computeFirst whenever the incoming IPv4 is non-null, which can overwrite newer values; change computeNext to first verify the incoming value is non-null and then only call computeFirst if the stored slot is empty or the incoming rowId is newer (check mapValue.getIPv4(valueIndex + 1) == Numbers.IPv4_NULL || rowId > mapValue.getLong(valueIndex)) so that computeFirst(mapValue, record, rowId) is invoked only when appropriate.core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/CharGroupByFunctionBatchTest.java (1)
61-249:⚠️ Potential issue | 🟠 MajorWrap these native-memory tests in
assertMemoryLeak().These tests allocate off-heap memory (
Unsafe.malloc) but are not wrapped withassertMemoryLeak(). Please apply the same leak-check wrapper pattern across the newly added/updated batch tests in this PR (including Date/Byte/IPv4/Float/Int/GeoHash batch test files), not just this class.As per coding guidelines,
Write all tests using assertMemoryLeak(). This isn't needed for narrow unit tests that don't allocate native memory.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/CharGroupByFunctionBatchTest.java` around lines 61 - 249, These batch tests allocate native memory (via allocateChars / Unsafe.malloc) and must be wrapped with assertMemoryLeak() to catch leaks; update each affected test (e.g., testFirstCharBatch, testFirstCharBatchAccumulates, testFirstNotNullCharBatch, testLastCharBatch, testMaxCharBatch, testMinCharBatchSimple, etc.) to run their body inside assertMemoryLeak(() -> { ... }), keeping the existing try-with-resources prepare(...) and calls to allocateChars and computeBatch unchanged; apply the same pattern to the other batch test classes mentioned (Date/Byte/IPv4/Float/Int/GeoHash) so every test that calls allocateChars or other native allocators is wrapped in assertMemoryLeak().
🧹 Nitpick comments (11)
core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java (1)
3132-3136: Update stale test intent comment in testGroupByNotKeyed2.
Line 3132 says expressions disable vectorization, but Line 3136 asserts vectorized: true. Please align the comment with the actual expected behavior to avoid misleading future debugging.
Suggested comment-only fix
-@Test // expressions in aggregates disable vectorized impl
+@Test // mixed aggregate args are expected to run with vectorized=true in this case
 public void testGroupByNotKeyed2() throws Exception {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java` around lines 3132 - 3136, The test comment for testGroupByNotKeyed2 is stale—update the inline comment that currently reads "expressions in aggregates disable vectorized impl" to reflect that vectorization is expected (the assertion shows "vectorized: true"); modify the comment near the test method testGroupByNotKeyed2 (method name: testGroupByNotKeyed2) so it accurately describes that expressions in aggregates do not disable vectorized execution in this case.
core/src/main/java/io/questdb/griffin/engine/functions/groupby/MaxTimestampGroupByFunction.java (1)
48-56: Minor style inconsistency with MinTimestampGroupByFunction.
The logic is correct: LONG_NULL (Long.MIN_VALUE) ensures any valid timestamp will compare as greater. However, MinTimestampGroupByFunction.computeBatch uses explicit null guards (batchMin != Numbers.LONG_NULL), whereas this relies on implicit sentinel comparison. Additionally, computeFirst and setNull use putTimestamp, but computeBatch uses putLong.
Consider aligning the pattern for consistency:
♻️ Suggested alignment (optional)
 public void computeBatch(MapValue mapValue, long ptr, int count, long startRowId) {
     if (count > 0) {
         final long batchMax = Vect.maxLong(ptr, count);
-        final long existing = mapValue.getLong(valueIndex);
-        if (batchMax > existing) {
-            mapValue.putLong(valueIndex, batchMax);
+        if (batchMax != Numbers.LONG_NULL) {
+            final long existing = mapValue.getTimestamp(valueIndex);
+            if (batchMax > existing || existing == Numbers.LONG_NULL) {
+                mapValue.putTimestamp(valueIndex, batchMax);
+            }
         }
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/groupby/MaxTimestampGroupByFunction.java` around lines 48 - 56, MaxTimestampGroupByFunction.computeBatch currently relies on implicit sentinel comparison and uses mapValue.putLong, which is inconsistent with MinTimestampGroupByFunction and with this class's computeFirst/setNull that use putTimestamp; update computeBatch in MaxTimestampGroupByFunction to mirror MinTimestampGroupByFunction by explicitly guarding against Numbers.LONG_NULL for batchMax (e.g., if (batchMax != Numbers.LONG_NULL) { ... }) and when updating the map use mapValue.putTimestamp(valueIndex, batchMax) instead of putLong so the null-sentinel handling and write method are consistent with computeFirst and setNull.
core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountGeoHashGroupByFunctionByte.java (1)
41-51: Skip zero-delta writes in hot batch path.

This is correct functionally, but writing `addLong(..., 0)` on null-only batches adds avoidable map churn. Consider guarding on `nonNullCount > 0` (as in sibling count implementations).

♻️ Suggested micro-optimization
```diff
-        mapValue.addLong(valueIndex, nonNullCount);
+        if (nonNullCount > 0) {
+            mapValue.addLong(valueIndex, nonNullCount);
+        }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountGeoHashGroupByFunctionByte.java` around lines 41 - 51, In computeBatch of CountGeoHashGroupByFunctionByte, avoid calling mapValue.addLong(valueIndex, 0) for batches with no non-null bytes to reduce map churn: after counting nonNullCount (the loop using Unsafe.getUnsafe().getByte and GeoHashes.BYTE_NULL), only call mapValue.addLong(valueIndex, nonNullCount) when nonNullCount > 0; leave the counting loop and variable names (ptr, count, hi, nonNullCount, valueIndex) unchanged.
core/src/main/java/io/questdb/griffin/engine/functions/groupby/FirstNotNullIPv4GroupByFunctionFactory.java (1)
70-73: Use `Integer.BYTES` instead of hardcoded `4L` in pointer math.

This reduces magic numbers and keeps width semantics tied to type size.
♻️ Suggested cleanup
```diff
-        final long hi = ptr + count * 4L;
+        final long hi = ptr + count * (long) Integer.BYTES;
         long offset = 0;
-        for (; ptr < hi; ptr += 4L) {
+        for (; ptr < hi; ptr += Integer.BYTES) {
             int value = Unsafe.getUnsafe().getInt(ptr);
             if (value != Numbers.IPv4_NULL) {
                 long rowId = startRowId + offset;
                 long existingRowId = mapValue.getLong(valueIndex);
                 if (rowId < existingRowId || existingRowId == Numbers.LONG_NULL || mapValue.getIPv4(valueIndex + 1) == Numbers.IPv4_NULL) {
                     mapValue.putLong(valueIndex, rowId);
                     mapValue.putInt(valueIndex + 1, value);
                 }
                 break;
             }
             offset++;
         }
```
Also applies to: 83-83
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/groupby/FirstNotNullIPv4GroupByFunctionFactory.java` around lines 70 - 73, Replace the hardcoded 4L used for IPv4 pointer arithmetic in FirstNotNullIPv4GroupByFunctionFactory with Integer.BYTES to avoid magic numbers: compute hi as ptr + count * (long)Integer.BYTES, increment ptr by (long)Integer.BYTES in the loop, and update the similar occurrence at the other location (around the second loop at the noted lines); keep the getInt(ptr) usage unchanged but ensure all pointer math consistently uses Integer.BYTES (cast to long where needed).
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/DoubleGroupByFunctionBatchTest.java (1)
577-621: Place new `testAvg...` methods with the other `testAvg...` members.

The methods added at Line 577+ are out of alphabetical member order relative to the existing `testAvg...` block. Please move them to preserve class member ordering consistency.

As per coding guidelines:

**/*.java: "Group Java class members by kind (static vs. instance) and visibility, sorted alphabetically."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/DoubleGroupByFunctionBatchTest.java` around lines 577 - 621, The four new test methods (testAvgDoubleComputeFirstInfinity, testAvgDoubleComputeFirstNaN, testAvgDoubleComputeNextInfinity, testAvgDoubleComputeNextNaNSkipped) are placed out of alphabetical order; move these methods inside the existing block of other testAvg... members in DoubleGroupByFunctionBatchTest so they are grouped with the other testAvg* methods and sorted alphabetically within that group, keeping each method signature and body unchanged and preserving the try-with-resources usage and assertions.
core/src/main/java/io/questdb/griffin/engine/functions/GroupByFunction.java (1)
58-75: Clarify `startRowId` monotonicity in the `computeBatch` contract.

The updated docs state multi-call accumulation, but they do not explicitly require increasing, non-overlapping `startRowId` ranges per group. Making that explicit will prevent incorrect first/last behavior if callers split work into multiple batch invocations.

✏️ Suggested Javadoc clarification
```diff
- * This method may be called multiple times for the same group {@link MapValue} (e.g. once
- * per page frame). Implementations must accumulate into the existing state set by
+ * This method may be called multiple times for the same group {@link MapValue} (e.g. once
+ * per page frame). Implementations must accumulate into the existing state set by
  * {@link #setEmpty(MapValue)}, not overwrite it.
+ * For a given group, repeated calls must use non-overlapping, increasing {@code startRowId}
+ * ranges so row-order-aware aggregates (e.g. first/last) remain correct.
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/GroupByFunction.java` around lines 58 - 75, Update the Javadoc for GroupByFunction.computeBatch to explicitly state that for a given MapValue successive invocations of computeBatch will receive non-overlapping, strictly increasing row ranges: each call’s startRowId must be greater than the previous call’s startRowId and the ranges must not overlap, so implementations of computeBatch (which must accumulate into state set by setEmpty and may be invoked without a preceding computeFirst or followed by merge when supportsBatchComputation() is true) may rely on monotonic, non-overlapping batches to implement first/last semantics correctly.
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ShortGroupByFunctionBatchTest.java (2)
230-239: Strengthen accumulation coverage with non-zero second `startRowId`.

For a true “accumulates across batches” shape under the new API, make the second call use an offset (e.g., `2`) rather than `0`.

♻️ Suggested test tweak
```diff
-        ptr = allocateShorts((short) 3, (short) 4);
-        function.computeBatch(value, ptr, 2, 0);
+        ptr = allocateShorts((short) 3, (short) 4);
+        function.computeBatch(value, ptr, 2, 2);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ShortGroupByFunctionBatchTest.java` around lines 230 - 239, The test currently calls function.computeBatch(value, ptr, 2, 0) twice, which doesn't verify accumulation when the second batch has a non-zero startRowId; update the second computeBatch invocation to use a non-zero offset (e.g., 2) so it reads like function.computeBatch(value, ptr, 2, 2) to exercise accumulation across batches for SumShortGroupByFunction; keep the rest of the test (prepare, allocateShorts, SimpleMapValue) unchanged.
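The accumulation semantics this test exercises can be illustrated outside QuestDB. The sketch below (all names hypothetical, not the engine's API) shows why a row-order-aware aggregate such as `first_not_null` depends on each `computeBatch` call reporting its true global start row id:

```java
// Minimal sketch (hypothetical names, not QuestDB classes) of a
// first-not-null accumulator that is called once per batch. It stays correct
// across batches only because each call passes the batch's global start row id.
public class FirstNotNullSketch {
    static final int INT_NULL = Integer.MIN_VALUE; // stand-in null sentinel

    long winnerRowId = Long.MAX_VALUE; // global row id of the kept value
    int value = INT_NULL;

    // startRowId is the global row id of batch[0]
    void computeBatch(int[] batch, long startRowId) {
        for (int i = 0; i < batch.length; i++) {
            if (batch[i] != INT_NULL) {
                long rowId = startRowId + i;
                if (value == INT_NULL || rowId < winnerRowId) {
                    winnerRowId = rowId;
                    value = batch[i];
                }
                break; // later rows in this batch cannot have a smaller row id
            }
        }
    }

    public static void main(String[] args) {
        FirstNotNullSketch f = new FirstNotNullSketch();
        f.computeBatch(new int[]{INT_NULL, 7}, 0); // rows 0-1: first non-null is row 1
        f.computeBatch(new int[]{9}, 2);           // row 2: must not win over row 1
        if (f.value != 7) throw new AssertionError("expected 7, got " + f.value);
    }
}
```

If the second call passed `0` again, the later batch's row ids would collide with the first batch's and could wrongly steal the win, which is exactly what a non-zero second `startRowId` in the test would catch.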
109-117: Rename `*AllNull` tests to match actual inputs.

Both methods currently include one non-null value, so names like `...LeadingNull`/`...TrailingNull` would be less misleading during failure triage.

Also applies to: 187-196
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ShortGroupByFunctionBatchTest.java` around lines 109 - 117, The test method names (e.g., testFirstShortBatchAllNull) are misleading because the test bodies include one non-null value; rename them to reflect the actual input pattern (e.g., testFirstShortBatchLeadingNull or testFirstShortBatchTrailingNull) and update the corresponding test at the other occurrence (lines ~187-196) accordingly; ensure you change the method declarations and any references to these methods (search for testFirstShortBatchAllNull and similar) and keep the test logic using FirstShortGroupByFunction, allocateShorts, and prepare unchanged.
core/src/main/java/io/questdb/griffin/engine/table/AsyncGroupByNotKeyedRecordCursorFactory.java (1)
75-75: Rename boolean field to use `is...` prefix.

The field `vectorized` at line 75 violates the project convention for boolean naming. Rename to `isVectorized` to match the guideline: "When choosing a name for a boolean variable, field or method, always use the is... or has... prefix."

♻️ Proposed rename

```diff
-    private final boolean vectorized;
+    private final boolean isVectorized;
...
-        this.vectorized = (reducer == AGGREGATE_VECT);
+        this.isVectorized = (reducer == AGGREGATE_VECT);
...
-        sink.attr("vectorized").val(vectorized);
+        sink.attr("vectorized").val(isVectorized);
```

Applies to lines: 75, 136, 185.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/table/AsyncGroupByNotKeyedRecordCursorFactory.java` at line 75, Rename the boolean field 'vectorized' in AsyncGroupByNotKeyedRecordCursorFactory to 'isVectorized' and update all references (constructor parameter, assignments, and any usages such as in methods or other classes) to the new name; specifically change the field declaration 'vectorized' to 'isVectorized', update the constructor parameter and assignment that set this vector flag, and replace all usages at the sites referenced (previously at lines around 136 and 185) to use 'isVectorized' so naming follows the is... boolean convention.
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/GeoHashGroupByFunctionBatchTest.java (1)
366-397: Add `last_not_null` accumulation tests for BYTE/SHORT/LONG geohash variants too.

This file now validates non-zero-offset accumulation for the INT variant, but the production change touched BYTE/SHORT/LONG as well. Add sibling two-batch tests (`startRowId > 0` on the second batch) for those variants to catch type-specific row-id ordering regressions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/GeoHashGroupByFunctionBatchTest.java` around lines 366 - 397, Add sibling two-batch unit tests for the BYTE and SHORT geohash variants (and ensure the LONG variant has a two-batch accumulation test with a non-zero second batch startRowId) similar to testLastNotNullGeoHashBatchIntAccumulates: create tests (e.g., testLastNotNullGeoHashBatchByteAccumulates and testLastNotNullGeoHashBatchShortAccumulates) that build a GroupByFunction via newLastNotNullGeoHashFunction(GeoByteColumn.newInstance(...)) / GeoShortColumn.newInstance(...), call function.setNull(value), then call function.computeBatch(value, ptr, len, 0) for the first batch and function.computeBatch(value, ptr2, len2, startRowId>0) for the second batch using the appropriate allocateBytes/allocateShorts (and for LONG ensure a second-batch startRowId>0 case if missing) with GeoHashes.*_NULL/values, and finally assert the expected value with function.getGeoByte/getGeoShort/getGeoLong and that function.supportsBatchComputation() where appropriate.
core/src/test/java/io/questdb/test/griffin/GroupByTest.java (1)
2984-3006: Add plan-shape assertions to lock vectorized execution in these scenarios.

Lines 2995, 3025, 3043, and 3154 currently assert only final results. If planner routing regresses to a non-vectorized path, these tests can still pass and miss the performance regression this change targets.
Suggested pattern (example for one test)
```diff
 @Test
 public void testNonKeyedVectorizedAllNulls() throws Exception {
     assertMemoryLeak(() -> {
         execute("CREATE TABLE t (v INT, ts TIMESTAMP) TIMESTAMP(ts) PARTITION BY DAY");
         execute(
                 """
                         INSERT INTO t(ts) VALUES
                         ('2024-01-01T00:00:00.000000Z'),
                         ('2024-01-02T00:00:00.000000Z')"""
         );
+        String query = "SELECT sum(v), min(v), max(v), count(*), avg(v), first(v), last(v), first_not_null(v), last_not_null(v) FROM t";
+        assertPlanNoLeakCheck(
+                query,
+                """
+                        Async Group By workers: 1
+                          vectorized: true
+                          values: [sum(v),min(v),max(v),count(*),avg(v),first(v),last(v),first_not_null(v),last_not_null(v)]
+                          filter: null
+                            PageFrame
+                                Row forward scan
+                                Frame forward scan on: t
+                        """
+        );
         assertQueryNoLeakCheck(
                 """
                         sum\tmin\tmax\tcount\tavg\tfirst\tlast\tfirst_not_null\tlast_not_null
                         null\tnull\tnull\t2\tnull\tnull\tnull\tnull\tnull
                         """,
-                "SELECT sum(v), min(v), max(v), count(*), avg(v), first(v), last(v), first_not_null(v), last_not_null(v) FROM t",
+                query,
                 null,
                 false,
                 true
         );
     });
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/io/questdb/test/griffin/GroupByTest.java` around lines 2984 - 3006, The tests (e.g., GroupByTest.testNonKeyedVectorizedAllNulls) currently assert only query results; add an assertion that the planner chooses the vectorized execution path by running an EXPLAIN (or planner plan) for the same query and asserting the plan text contains the vectorized operator identifier used by the engine (e.g., "VectorizedGroupBy" or "vectorized") before calling assertQueryNoLeakCheck; apply the same pattern to the other affected tests (the tests around the shown ranges that use the same GROUP BY aggregate SQL and assertQueryNoLeakCheck) so they fail if the planner regresses to a non-vectorized plan.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@core/src/main/java/io/questdb/griffin/engine/functions/groupby/AvgDoubleGroupByFunction.java`:
- Around line 52-64: AvgDoubleGroupByFunction's computeBatch currently treats
Double.isNaN(batchSum) as an "empty" sentinel and discards the batch count,
which corrupts state when a batch produces NaN from non-empty data (e.g., +Inf
and -Inf); change the logic to use the actual batch count returned/written by
sumDoubleAcc (the countPtr/batchCount) to decide emptiness and always add the
batchCount to mapValue at valueIndex+1 when batchCount>0, while still merging
sums by reading prevSum and adding batchSum when applicable; update references
in computeBatch/sumDoubleAcc handling (valueIndex, countPtr, mapValue)
accordingly and add a regression test that runs computeBatch across two frames
where the first frame yields NaN from a non-empty set (e.g., +Infinity and
-Infinity) and asserts the aggregate remains NaN with the correct accumulated
count after a subsequent finite batch.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/groupby/SumShortGroupByFunction.java`:
- Around line 48-58: computeBatch currently unconditionally accumulates batchSum
(from Vect.sumShort) which can be Numbers.LONG_NULL; update
SumShortGroupByFunction.computeBatch to treat Numbers.LONG_NULL like "no value"
similarly to merge(): obtain batchSum, if batchSum == Numbers.LONG_NULL then
return/do nothing; otherwise read existing = mapValue.getLong(valueIndex) and if
existing == Numbers.LONG_NULL put batchSum, else put existing + batchSum; this
preserves existing behavior and matches
SumIntGroupByFunction/SumLongGroupByFunction semantics.
In
`@core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinFastRecordCursorFactory.java`:
- Around line 569-572: The vectorized computeBatch invocations in
AsyncWindowJoinFastRecordCursorFactory use a hardcoded startRowId of 0 for
no-arg and shifted-column paths which breaks ordering guarantees for
first/last-style aggregates if batches are split; update all calls to
groupByFunctions.getQuick(i).computeBatch(...) to pass the logical batch start
instead of 0 — use rowLo for normal paths and rowLo+1 for shifted-column
(pointer-shifted) paths, and preserve the existing address computation
(columnSink.of(ptr).startAddress() + typeSize * rowLo) and batch length
((int)(rowHi - rowLo)); apply the same change at the other three sites where
computeBatch is called (the shifted and non-shifted variants).
In
`@core/src/main/java/io/questdb/griffin/engine/join/WindowJoinFastRecordCursorFactory.java`:
- Around line 889-892: The vectorized computeBatch invocations for
groupByFunctions.getQuick(i).computeBatch currently pass a hardcoded startRowId
of 0 which breaks row-order-aware aggregates; change both calls to pass the
correct logical startRowId instead of 0—use efficientRowLo when computing
against columnSink.of(ptr).startAddress() + typeSize * efficientRowLo (to match
the address offset) and use rowLo (or efficientRowLo if appropriate for your
window semantics) for the no-arg computeBatch path so startRowId tracks the
actual batch position rather than always 0.
In
`@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/DoubleGroupByFunctionBatchTest.java`:
- Around line 61-83: Wrap the native-buffer-allocating test body in an
assertMemoryLeak() guard: in testAvgDoubleBatchAccumulatedInfinityIsPreserved(),
surround the code that calls allocateDoubles(...) and function.computeBatch(...)
(the block using AvgDoubleGroupByFunction, prepare(...), SimpleMapValue, and
allocateDoubles) with an assertMemoryLeak(() -> { ... }) lambda so the native
allocations are leak-checked; keep the test logic intact and only move its
contents into the assertMemoryLeak wrapper.
In
`@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/LongGroupByFunctionBatchTest.java`:
- Around line 72-84: Wrap the native-memory-allocating test method
testCountLongBatchAccumulates with assertMemoryLeak(() -> { ... }) so the
allocateLongs (Unsafe.malloc) paths are leak-checked; specifically, keep the
existing instantiation of CountLongGroupByFunction, the prepare(...) usage, the
allocateLongs/computeBatch calls, and the Assert.assertEquals inside the lambda
passed to assertMemoryLeak to ensure error paths are covered by the leak
harness.
In
`@core/src/test/java/io/questdb/test/griffin/engine/groupby/GroupByFunctionCaseTest.java`:
- Around line 245-256: Rename the boolean local asyncVectorized to follow the
boolean naming convention (e.g., isAsyncVectorized): update its declaration
(currently "boolean asyncVectorized = ...") and all usages (the conditional
computing it and the planSink.put(asyncVectorized) call) to the new name so
compilation and behavior remain unchanged but the identifier uses the is...
prefix.
- Around line 195-206: In the GroupByFunctionCaseTest test SQL string (the
multiline SELECT that defines candle_st, venue, num_ticks, quote_volume, vwap
from 'spot_trades'), change the lowercase SQL keywords "like" and "in" to
uppercase "LIKE" and "IN" respectively so the query uses uppercase SQL keywords
per test style conventions; update the occurrences inside the WHERE clause of
that SQL literal accordingly.
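One finding above (AvgDouble's NaN handling) rests on a floating-point identity that is easy to pin down in isolation: summing a non-empty batch can itself produce NaN, so a NaN batch sum cannot double as an "empty batch" sentinel. A standalone sketch in plain Java (not the QuestDB vectorized kernels):

```java
// Standalone illustration of the NaN ambiguity: a NaN batch sum can come from
// real data (+Inf + -Inf), so emptiness must be decided by the element count.
public class NanSumSketch {
    public static void main(String[] args) {
        double batchSum = Double.POSITIVE_INFINITY + Double.NEGATIVE_INFINITY;
        if (!Double.isNaN(batchSum)) throw new AssertionError();

        // Count-based accumulation keeps NaN "sticky", as avg() expects.
        double sum = 10.0;   // prior state: one row worth 10.0
        long count = 1;
        long batchCount = 2; // the batch was NOT empty
        if (batchCount > 0) {
            sum += batchSum; // NaN propagates into the running sum
            count += batchCount;
        }
        if (!Double.isNaN(sum) || count != 3) throw new AssertionError();
        System.out.println(sum / count); // prints NaN, the expected avg() here
    }
}
```

Deciding emptiness by `Double.isNaN(batchSum)` instead of by the count would silently discard such a batch, which is exactly the corruption the review comment describes.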
---
Outside diff comments:
In
`@core/src/main/java/io/questdb/griffin/engine/functions/groupby/LastNotNullIPv4GroupByFunctionFactory.java`:
- Around line 82-86: In LastNotNullIPv4GroupByFunctionFactory.computeNext the
method currently calls computeFirst whenever the incoming IPv4 is non-null,
which can overwrite newer values; change computeNext to first verify the
incoming value is non-null and then only call computeFirst if the stored slot is
empty or the incoming rowId is newer (check mapValue.getIPv4(valueIndex + 1) ==
Numbers.IPv4_NULL || rowId > mapValue.getLong(valueIndex)) so that
computeFirst(mapValue, record, rowId) is invoked only when appropriate.
In
`@core/src/main/java/io/questdb/griffin/engine/functions/groupby/MinCharGroupByFunction.java`:
- Around line 122-127: The merge method in MinCharGroupByFunction writes a char
using destValue.putInt(valueIndex, srcMin); change this to use the correct char
setter destValue.putChar(valueIndex, srcMin) so the stored type matches
initValueTypes and the other methods (computeFirst, computeBatch) that use char
accessors; ensure valueIndex and srcMin remain the same and no other logic is
changed.
In
`@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/CharGroupByFunctionBatchTest.java`:
- Around line 61-249: These batch tests allocate native memory (via
allocateChars / Unsafe.malloc) and must be wrapped with assertMemoryLeak() to
catch leaks; update each affected test (e.g., testFirstCharBatch,
testFirstCharBatchAccumulates, testFirstNotNullCharBatch, testLastCharBatch,
testMaxCharBatch, testMinCharBatchSimple, etc.) to run their body inside
assertMemoryLeak(() -> { ... }), keeping the existing try-with-resources
prepare(...) and calls to allocateChars and computeBatch unchanged; apply the
same pattern to the other batch test classes mentioned
(Date/Byte/IPv4/Float/Int/GeoHash) so every test that calls allocateChars or
other native allocators is wrapped in assertMemoryLeak().
In
`@core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/MinCharGroupByFunctionFactoryTest.java`:
- Around line 62-90: The test method testFirstNull allocates native resources
(TableWriter/RecordCursor) and must be wrapped with assertMemoryLeak to ensure
no leaks; update testFirstNull to call assertMemoryLeak(() -> { ... }) around
the existing test body (the parts that call getWriter("tab"), select("select
min(f) from tab"), and related TableWriter/RecordCursor usage) so that resource
allocation and cursor/writer lifecycles are executed inside the assertMemoryLeak
closure and existing try-with-resources blocks remain unchanged inside it.
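The last-not-null guard described for the IPv4 factory generalizes to any row-id-ordered aggregate. A minimal sketch (hypothetical names, not QuestDB's MapValue API): overwrite the stored value only when the slot is still empty or the incoming row is newer.

```java
// Sketch of the guard described above: last-not-null overwrites its stored
// value only when the slot is empty or the incoming row id is newer.
public class LastNotNullGuard {
    static final int NULL = Integer.MIN_VALUE; // stand-in null sentinel

    long storedRowId = Long.MIN_VALUE;
    int storedValue = NULL;

    void computeNext(int value, long rowId) {
        if (value != NULL && (storedValue == NULL || rowId > storedRowId)) {
            storedRowId = rowId;
            storedValue = value;
        }
    }

    public static void main(String[] args) {
        LastNotNullGuard g = new LastNotNullGuard();
        g.computeNext(5, 10);
        g.computeNext(7, 3);     // older row: without the rowId check it would win
        g.computeNext(NULL, 20); // null never overwrites
        if (g.storedValue != 5) throw new AssertionError();
        g.computeNext(9, 12);    // newer non-null row wins
        if (g.storedValue != 9) throw new AssertionError();
    }
}
```

An unconditional `computeFirst` on every non-null value, the behavior the comment flags, would let the older row (id 3) clobber the newer one (id 10).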
---
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c2a9071a-a2fa-4afc-a617-327ce274b29a
📒 Files selected for processing (103)
roupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/MaxIntGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/MaxLongGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/MaxTimestampGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/MinCharGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/MinDateGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/MinDoubleGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/MinFloatGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/MinIPv4GroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/MinIntGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/MinLongGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/MinTimestampGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/SumDoubleGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/SumFloatGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/SumIntGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/SumLongGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/functions/groupby/SumShortGroupByFunction.javacore/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinFastRecordCursorFactory.javacore/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinRecordCursorFactory.javacore/src/main/java/io/questdb/griffin/engine/join/WindowJoinFastRecordCursorFactory.javacore/src/main/java/io/questdb/griffin/engine/table/AsyncGroupByNotKeyedAtom.javacore/src/main/java/io/questdb/griffin/engine/table/AsyncGroupByNotKeyedRecordCursorFactory.javacore/src/test/java/io/questdb/test/cairo/fuzz/ParallelGroupByFuzzTest.javacore/src
/test/java/io/questdb/test/griffin/AggregateTest.javacore/src/test/java/io/questdb/test/griffin/ExplainPlanTest.javacore/src/test/java/io/questdb/test/griffin/GroupByTest.javacore/src/test/java/io/questdb/test/griffin/SqlOptimiserTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/finance/LevelTwoPriceFunctionFactoryTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/BooleanGroupByFunctionBatchTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ByteGroupByFunctionBatchTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/CharGroupByFunctionBatchTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/DateGroupByFunctionBatchTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/DoubleGroupByFunctionBatchTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/FloatGroupByFunctionBatchTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/GeoHashGroupByFunctionBatchTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/IPv4GroupByFunctionBatchTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/IntGroupByFunctionBatchTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/LongGroupByFunctionBatchTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/MinCharGroupByFunctionFactoryTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ShortGroupByFunctionBatchTest.javacore/src/test/java/io/questdb/test/griffin/engine/groupby/GroupByFunctionCaseTest.java
Resolved review threads:

- core/src/main/java/io/questdb/griffin/engine/functions/groupby/AvgDoubleGroupByFunction.java (outdated)
- core/src/main/java/io/questdb/griffin/engine/functions/groupby/SumShortGroupByFunction.java
- core/src/main/java/io/questdb/griffin/engine/join/AsyncWindowJoinFastRecordCursorFactory.java
- core/src/main/java/io/questdb/griffin/engine/join/WindowJoinFastRecordCursorFactory.java
- ...st/java/io/questdb/test/griffin/engine/functions/groupby/DoubleGroupByFunctionBatchTest.java
- ...test/java/io/questdb/test/griffin/engine/functions/groupby/LongGroupByFunctionBatchTest.java
- core/src/test/java/io/questdb/test/griffin/engine/groupby/GroupByFunctionCaseTest.java (two threads)
[PR Coverage check] 😍 pass: 561 / 569 (98.59%)
Vectorized execution is now applied to more non-keyed GROUP BY queries.

Internally, the `GroupByFunction#computeBatch()` API is used for all such queries without a filter. This API was previously used only in WINDOW JOIN.
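To illustrate the batch-oriented aggregation contract, here is a minimal, self-contained sketch of a non-keyed SUM aggregate that accumulates results cumulatively across sequential batches identified by a start row id. The `BatchAggregate` interface and the array-backed column below are hypothetical stand-ins; QuestDB's actual `GroupByFunction.computeBatch()` operates on native memory and has a different signature.

```java
// Hypothetical sketch of a batch-aware, non-keyed SUM aggregate.
// This is NOT QuestDB's GroupByFunction API; it only mirrors the idea of
// cumulative computeBatch(startRowId, count) calls over sequential batches.
public class BatchSumSketch {
    interface BatchAggregate {
        // Accumulates `count` values starting at `startRowId`;
        // repeated calls must be cumulative across batches.
        void computeBatch(double[] values, long startRowId, int count);

        double value();
    }

    static final class SumDouble implements BatchAggregate {
        private double sum;
        private boolean hasValue;

        @Override
        public void computeBatch(double[] values, long startRowId, int count) {
            double batchSum = 0;
            boolean batchHasValue = false;
            for (int i = 0; i < count; i++) {
                double v = values[(int) startRowId + i];
                if (!Double.isNaN(v)) { // NaN stands in for SQL NULL here
                    batchSum += v;
                    batchHasValue = true;
                }
            }
            if (batchHasValue) {
                sum += batchSum; // cumulative across sequential batches
                hasValue = true;
            }
        }

        @Override
        public double value() {
            return hasValue ? sum : Double.NaN; // all-NULL input yields NULL
        }
    }

    public static double sumInTwoBatches(double[] data) {
        SumDouble agg = new SumDouble();
        int half = data.length / 2;
        agg.computeBatch(data, 0, half);                  // first batch
        agg.computeBatch(data, half, data.length - half); // second batch
        return agg.value();
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, Double.NaN, 4.0};
        System.out.println(sumInTwoBatches(data)); // prints 7.0
    }
}
```

Splitting the input into two `computeBatch` calls must give the same result as one call over the full range; that invariant is what allows a cursor to feed sequential page frames to the same function instance.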