perf(sql): reduce garbage generated on parallel query hot path #6597
Merged
bluestreak01 merged 25 commits into master on Jan 11, 2026
bluestreak01 reviewed on Jan 8, 2026
[PR Coverage check] 😍 pass: 533 / 598 (89.13%)
ctapobep reviewed on Jan 10, 2026
bluestreak01 approved these changes on Jan 11, 2026
Reduces GC pressure and jitter in parallel queries on large multi-core machines (tested on c7a.metal-48xl with 192 vCPUs).
Key Changes
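As a rough illustration of the work-stealing item in this list, here is a minimal sketch of a deadline-based spin wait that combines `System.nanoTime()` with `Thread.onSpinWait()`. The class name, method name, and the `BooleanSupplier` condition are hypothetical and are not taken from `AdaptiveWorkStealingStrategy`; only the two JDK calls and the 50µs default come from the description.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not the actual QuestDB implementation.
public class SpinWaitSketch {
    // Assumed to mirror the 50µs default of
    // cairo.sql.parallel.work.stealing.spin.timeout.
    public static final long DEFAULT_SPIN_TIMEOUT_NANOS = TimeUnit.MICROSECONDS.toNanos(50);

    /**
     * Spins until the condition holds or the timeout elapses.
     * Returns true if the condition became true, false on timeout.
     */
    public static boolean spinUntil(BooleanSupplier condition, long timeoutNanos) {
        final long start = System.nanoTime();
        while (!condition.getAsBoolean()) {
            // Wall-clock deadline: the spin duration no longer depends on
            // how fast a fixed number of loop iterations runs on a given CPU/OS.
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;
            }
            // Hint to the CPU that this is a busy-wait loop (PAUSE on x86,
            // YIELD on ARM, where supported); no system call is made.
            Thread.onSpinWait();
        }
        return true;
    }

    public static void main(String[] args) {
        boolean ready = spinUntil(() -> false, DEFAULT_SPIN_TIMEOUT_NANOS);
        System.out.println("condition met: " + ready);
    }
}
```

A caller would typically fall back to a slower path (for example, parking or doing the work itself) when `spinUntil` returns false; that fallback is outside the scope of this sketch.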
- `PageFrameAddressCache` now uses flat `DirectLongList` lists instead of nested `LongList` objects
- `Decimal128`, `Decimal256`, `Long256` flyweights in map values are now lazily initialized
- … `int[]` array.
- `GroupByAllocator` improvements: `LongLongHashMap` is replaced with off-heap `DirectLongLongHashMap`. Previously, each query worker grew the hash map while allocating, which led to many on-heap allocations.
- `cairo.sql.parallel.work.stealing.spin.timeout` (default 50µs)
- `AdaptiveWorkStealingStrategy` for better parallel execution. Namely, `System.nanoTime()` with a configurable timeout is now used to avoid variable spinning time across OSes and CPU architectures. `Thread.onSpinWait()` is used in the spin-wait loop so that it issues a PAUSE/YIELD instruction instead of a "sleep(0)" system call.
- … `PageFrameAddressCache` now call `Misc.free(cursor)` in `close()` methods
- `cairo.sql.jit.page.address.cache.threshold` (no longer needed)

Benchmarks
I ran the ClickBench queries with 30 iterations per query on my Ryzen 7900x box (12c/24t, 64GB RAM) running Ubuntu 24.04 and GraalVM CE 17.0.8.
GC Analysis Summary
Master Branch GC Metrics
Patch Branch GC Metrics
The runtimes differ because I didn't stop the server immediately after the benchmark, so it sat idle for some time. This shouldn't affect the conclusion, since the patch run was the longer of the two.
GC Comparison
Key GC Pause Times (excluding startup Full GCs):
The patch significantly reduces allocations: approximately 2.5x fewer GC cycles during benchmark execution, which indicates substantially lower memory pressure.
Query Time Analysis
Hot runs comparison according to clickbench rules (min of iterations 2-3, +10ms offset)
Note: Q28 is also tough for the JVM JIT since it uses the `regexp_replace()` SQL function and thus the standard regexp library. So the difference in that query is likely due to JIT jitter.
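For readers unfamiliar with the hot-run rule mentioned above (min of iterations 2-3, +10ms offset), the comparison can be sketched roughly as follows. The class and method names are hypothetical, and applying the offset to both sides of a ratio is an assumption about how the offset is used; the timings in `main` are made-up placeholders, not results from this PR.

```java
// Hypothetical sketch of the hot-run comparison, not the benchmark script used here.
public class HotRunSketch {
    // The +10ms offset, in seconds; dampens noise on very fast queries.
    public static final double OFFSET_SECONDS = 0.010;

    /** Hot-run time: the minimum of iterations 2 and 3 (index 0 is the cold run). */
    public static double hotRun(double[] timingsSeconds) {
        if (timingsSeconds.length < 3) {
            throw new IllegalArgumentException("need at least 3 iterations");
        }
        return Math.min(timingsSeconds[1], timingsSeconds[2]);
    }

    /** Ratio of patch to master with the offset applied; below 1.0 means the patch is faster. */
    public static double ratio(double patchHotSeconds, double masterHotSeconds) {
        return (patchHotSeconds + OFFSET_SECONDS) / (masterHotSeconds + OFFSET_SECONDS);
    }

    public static void main(String[] args) {
        // Placeholder timings in seconds, for illustration only.
        double[] master = {1.20, 0.40, 0.42};
        double[] patch = {1.10, 0.36, 0.35};
        System.out.printf("ratio = %.3f%n", ratio(hotRun(patch), hotRun(master)));
    }
}
```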