perf(sql): lazy merge in parallel count_distinct functions#6268
perf(sql): lazy merge in parallel count_distinct functions#6268bluestreak01 merged 11 commits intomasterfrom
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (10)
core/src/test/java/io/questdb/test/griffin/engine/groupby/GroupByLongListTest.java (1)
37-57: Consider additional test cases for edge scenarios.The smoke test covers basic functionality well and includes proper memory leak testing. However, consider adding tests for:
- Empty list behavior (size 0, get on empty)
- Explicit growth/resizing verification (e.g., verify capacity increases after exceeding initial capacity)
- Boundary conditions (very large N, capacity limits)
These additions would strengthen confidence in the new data structure, especially given its role in the parallel count_distinct optimization.
core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctLongGroupByFunction.java (3)
75-100: Prefer using set size after insertion (minor).Instead of incrementing cnt manually, read back setA.size() to avoid any drift.
- setA.addAt(index, value); - mapValue.putLong(valueIndex, cnt + 1); - mapValue.putLong(valueIndex + 1, setA.ptr()); + setA.addAt(index, value); + mapValue.putLong(valueIndex, setA.size()); + mapValue.putLong(valueIndex + 1, setA.ptr());
240-246: Remove no-op call (optional).Calling this.supportsParallelism() in setNull has no effect; consider removing to avoid confusion.
252-282: Max-set selection loop: tiny clarity tweak (optional).Works as-is; consider storing size in a local var to avoid double calls for readability.
- for (int i = 0, n = list.size(); i < n; i++) { - if (setA.of(list.get(i)).size() > maxSize) { - maxSize = setA.size(); - maxIndex = i; - } - } + for (int i = 0, n = list.size(); i < n; i++) { + setA.of(list.get(i)); + final int size = setA.size(); + if (size > maxSize) { + maxSize = size; + maxIndex = i; + } + }core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctLong256GroupByFunction.java (2)
202-207: Remove no-op call (optional).this.supportsParallelism() in setNull is unnecessary.
220-250: Same clarity tweak as long variant (optional).Cache size in a local variable when choosing the largest set for readability.
- for (int i = 0, n = list.size(); i < n; i++) { - if (setA.of(list.get(i)).size() > maxSize) { - maxSize = setA.size(); - maxIndex = i; - } - } + for (int i = 0, n = list.size(); i < n; i++) { + setA.of(list.get(i)); + final int size = setA.size(); + if (size > maxSize) { + maxSize = size; + maxIndex = i; + } + }core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctUuidGroupByFunction.java (1)
199-204: Remove no-op call in setNull()Calling supportsParallelism() inside setNull() has no effect; please remove to avoid confusion.
- public void setNull(MapValue mapValue) { - this.supportsParallelism(); + public void setNull(MapValue mapValue) { mapValue.putLong(valueIndex, Numbers.LONG_NULL); mapValue.putLong(valueIndex + 1, 0); }core/src/main/java/io/questdb/PropServerConfiguration.java (1)
1415-1416: Config default updated to 0.5 — OKKeeps runtime/config defaults consistent. Consider noting memory impact in parameter docs.
Add a short note in config reference that 0.5 reduces clustering with FxHasher at the cost of ~2x slack vs 1.0, so peak memory may increase under high cardinality.
core/src/main/java/io/questdb/griffin/engine/groupby/GroupByLongList.java (2)
81-88: Optional debug bounds checks for get/setAdd assertions to catch misuse during development without affecting prod.
@@ public long get(long index) { - return Unsafe.getUnsafe().getLong(ptr + HEADER_SIZE + 8L * index); + assert ptr != 0 && index >= 0 && index < capacity() : "index out of bounds: " + index; + return Unsafe.getUnsafe().getLong(ptr + HEADER_SIZE + 8L * index); @@ public void set(long index, long value) { - Unsafe.getUnsafe().putLong(ptr + HEADER_SIZE + 8L * index, value); + assert ptr != 0 && index >= 0 && index < capacity() : "index out of bounds: " + index; + Unsafe.getUnsafe().putLong(ptr + HEADER_SIZE + 8L * index, value);Also applies to: 105-107
101-104: Clarify ownership/lifecycleresetPtr() drops the pointer without freeing. Add a short Javadoc note that GroupByAllocator owns lifetime and frees externally.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (29)
benchmarks/src/main/java/org/questdb/GroupByCharSequenceLongHashMapBenchmark.java(1 hunks)benchmarks/src/main/java/org/questdb/GroupByLong128HashSetBenchmark.java(2 hunks)benchmarks/src/main/java/org/questdb/GroupByLong256HashSetBenchmark.java(2 hunks)benchmarks/src/main/java/org/questdb/GroupByLongHashSetBenchmark.java(2 hunks)benchmarks/src/main/java/org/questdb/GroupByLongLongHashMapBenchmark.java(1 hunks)benchmarks/src/main/java/org/questdb/GroupByUtf8SequenceLongHashMapBenchmark.java(1 hunks)core/src/main/java/io/questdb/PropServerConfiguration.java(1 hunks)core/src/main/java/io/questdb/cairo/DefaultCairoConfiguration.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/AbstractCountDistinctIntGroupByFunction.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctIPv4GroupByFunction.java(2 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctIPv4GroupByFunctionFactory.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctIntGroupByFunction.java(2 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctIntGroupByFunctionFactory.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctLong256GroupByFunction.java(8 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctLong256GroupByFunctionFactory.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctLongGroupByFunction.java(8 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctLongGroupByFunctionFactory.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctSymbolGroupByFunction.java(4 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctSymbolGroupByFunctionFactory.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctUuidGroupByFunction.java(8 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctUuidGroupByFunctionFactory.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/groupby/GroupByIntHashSet.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/groupby/GroupByLongHashSet.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/groupby/GroupByLongList.java(1 hunks)core/src/main/java/io/questdb/std/Hash.java(1 hunks)core/src/test/java/io/questdb/test/PropServerConfigurationTest.java(1 hunks)core/src/test/java/io/questdb/test/ServerMainTest.java(1 hunks)core/src/test/java/io/questdb/test/cairo/fuzz/ParallelGroupByFuzzTest.java(1 hunks)core/src/test/java/io/questdb/test/griffin/engine/groupby/GroupByLongListTest.java(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
- GitHub Check: New pull request (Coverage Report Coverage Report)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Other tests Start X64Zfs Agent)
- GitHub Check: New pull request (Hosted Running tests on windows-other)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-other)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-cairo)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-fuzz)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-griffin)
- GitHub Check: New pull request (SelfHosted Other tests Start ARM Agent)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start X64Zfs Agent)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start ARM Agent)
- GitHub Check: New pull request (Check Changes Check changes)
🔇 Additional comments (27)
core/src/test/java/io/questdb/test/PropServerConfigurationTest.java (1)
230-231: Expectation matches new defaultThe assertion now aligns with the updated
countDistinctload-factor default of0.5. Looks good.core/src/main/java/io/questdb/griffin/engine/groupby/GroupByLongHashSet.java (1)
101-110: Fast hash swap makes senseUsing
Hash.fastHashLong64keeps the mask logic intact while reusing the new low-overhead hash path. Nice upgrade.benchmarks/src/main/java/org/questdb/GroupByUtf8SequenceLongHashMapBenchmark.java (1)
56-57: Benchmark load factor alignedUpdating the benchmark to the 0.5 load factor keeps it representative of the production path.
core/src/test/java/io/questdb/test/ServerMainTest.java (1)
326-327: Show-parameters expectation updatedThe listed default now matches the 0.5 load factor exposed by configuration—good catch.
benchmarks/src/main/java/org/questdb/GroupByCharSequenceLongHashMapBenchmark.java (1)
56-58: Benchmark constant in syncSetting the load factor to 0.5 keeps this benchmark aligned with the runtime defaults.
core/src/main/java/io/questdb/griffin/engine/groupby/GroupByIntHashSet.java (1)
101-110: Fast hash adoption LGTMSwitching to
Hash.fastHashInt64keeps the probing logic unchanged and benefits from the quicker hash path.core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctUuidGroupByFunctionFactory.java (1)
56-58: Factory wiring matches new signatureSupplying
sqlExecutionContext.getSharedQueryWorkerCount()ensures the group-by function can size its worker-local state correctly.benchmarks/src/main/java/org/questdb/GroupByLong256HashSetBenchmark.java (1)
36-44: LGTM: Explicit imports improve clarity.Replacing wildcard imports with explicit JMH annotation imports improves code clarity and makes dependencies explicit.
benchmarks/src/main/java/org/questdb/GroupByLongLongHashMapBenchmark.java (1)
56-56: LGTM: Consistent load factor adjustment.The load factor reduction to 0.5 is consistent with other benchmark files and aligns with the PR's hash function optimization strategy.
benchmarks/src/main/java/org/questdb/GroupByLongHashSetBenchmark.java (2)
36-44: LGTM: Explicit imports improve maintainability.The explicit JMH annotation imports enhance code clarity.
57-57: LGTM: Load factor adjustment is consistent.The load factor reduction to 0.5 matches the pattern across all benchmark files in this PR.
benchmarks/src/main/java/org/questdb/GroupByLong128HashSetBenchmark.java (2)
36-44: LGTM: Explicit imports improve clarity.The explicit JMH annotation imports are a good practice.
57-57: LGTM: Load factor reduction is consistent.The load factor adjustment to 0.5 aligns with the broader PR changes.
core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctSymbolGroupByFunctionFactory.java (1)
56-57: LGTM: Consistent worker count parameter addition.The addition of
sqlExecutionContext.getSharedQueryWorkerCount()is consistent with the parallel optimization pattern applied across count_distinct factories.core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctLong256GroupByFunctionFactory.java (1)
56-57: LGTM: Worker count parameter completes the parallel optimization pattern.The addition of
sqlExecutionContext.getSharedQueryWorkerCount()is consistent across all count_distinct factory implementations, enabling uniform parallel optimization support.core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctIPv4GroupByFunctionFactory.java (1)
56-57: LGTM: Worker count parameter fully supported
ConstructorCountDistinctIPv4GroupByFunction(Function, int, double, int)accepts and usesworkerCountfor per-worker state management.core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctLongGroupByFunctionFactory.java (1)
56-58: Factory wiring LGTM.Passing shared worker count aligns with the lazy-merge design.
core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctIntGroupByFunctionFactory.java (1)
56-58: Factory wiring LGTM.Consistent with other count_distinct factories.
core/src/test/java/io/questdb/test/cairo/fuzz/ParallelGroupByFuzzTest.java (1)
309-435: Solid fuzz coverage for lazy-merge count_distinct.Toggles parallel mode correctly and compares single-thread vs parallel outputs; exercises int/ipv4/symbol/long/uuid/long256. Nice.
core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctLongGroupByFunction.java (1)
108-121: Lazy merge on getLong is correct.The MapRecord/MapValue handling and -1 sentinel are consistent; merging once on access avoids wasted work.
Please confirm other count_distinct variants also gate merges solely on -1 and never in computeNext to keep semantics uniform.
core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctLong256GroupByFunction.java (2)
100-114: Lazy merge path LGTM.-1 sentinel + deferred list merge on access is consistent with long variant.
155-181: Defer set-set merges correctly.Accumulating pointers when both sides are sets reduces reduce-time overhead. Good.
core/src/main/java/io/questdb/std/Hash.java (1)
53-70: Fast hash methods usage and load factors verified
All GroupBy(Int|Long)HashSet instances now call fastHashInt64/fastHashLong64, and tests/benchmarks cover load factors up to 0.9. No further changes required.core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctIntGroupByFunction.java (1)
36-43: Inline optimization and state transitions look correctFirst non-null value is inlined; promotion to set at second distinct; addAt() guarded by keyIndex prevents double-counting. Good use of setA.ptr() updates after potential rehash.
Also applies to: 47-55, 59-81
core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctSymbolGroupByFunction.java (1)
43-50: Symbol path aligns with new base; early-exit is well integratedInline single-value optimization, promotion to set, and knownSymbolCount early-exit are correct. Clear and init handling look good.
Also applies to: 59-68, 72-96, 98-103, 105-114
core/src/main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctIPv4GroupByFunction.java (1)
36-44: IPv4 implementation mirrors int path correctlyInline single-value, promotion to set, and guarded addAt flow look good. Sentinel handling with Numbers.IPv4_NULL is correct.
Also applies to: 48-56, 60-81
core/src/main/java/io/questdb/cairo/DefaultCairoConfiguration.java (1)
255-257: Default load factor lowered to 0.5 — OKAligned with server config and PR goals. Expect lower collisions but higher memory per set.
Please ensure documentation/sample configs mention the new default and memory trade-offs.
benchmarks/src/main/java/org/questdb/GroupByLong256HashSetBenchmark.java
Show resolved
Hide resolved
...ava/io/questdb/griffin/engine/functions/groupby/AbstractCountDistinctIntGroupByFunction.java
Show resolved
Hide resolved
.../main/java/io/questdb/griffin/engine/functions/groupby/CountDistinctUuidGroupByFunction.java
Show resolved
Hide resolved
core/src/main/java/io/questdb/griffin/engine/groupby/GroupByLongList.java
Show resolved
Hide resolved
|
@bluestreak01 linter CI step was failing, so I fixed it in f987d61 |
[PR Coverage check]😍 pass : 403 / 445 (90.56%) file detail
|
Introduces deferred merge in parallel
count_distinct()functions (int, ipv4, long, uuid and long256 types). This means that the hash set merge is now done ingetLong()call instead of when merging group by's hash tables.This optimization is parallel
count_distinct()specific. For each group, we're accumulating hash sets in the worker threads. Later on, when merging GROUP BY hash tables, we need to merge those hash sets - that's an expensive operation. So, instead of doing that immediately, we gather hash set pointers in a list and merge the hash sets later on, whengetLong()is called for the given group.This way we avoid redundant hash set merges in queries like ClickHouse's Q9:
Here, the result set is ordered and limited by
COUNT(*), so there is no need to calculatecount_distinct(UserID)for each intermediate row. Hence, the exec time improvement.Also, includes the following changes:
count_distinct(int)andcount_distinct(long), it's inlined in theMapValueinstead of allocating additional memory for theGroupBy*HashSet.GroupByIntHashSetandGroupByLongHashSet. To compensate quality loss, load factor is changed from 0.75 to 0.5.Benchmarks
On my Linux box with Ryzen 7900x, ClickBench's Q9 is 0.783s vs. 1.382s on
master.