perf(sql): remove litter generated by ordered map rehashing #6399
bluestreak01 merged 2 commits into master
Conversation
Walkthrough
This PR refactors OrderedMap's internal memory management from using DirectIntList for offset storage to raw off-heap memory buffers managed via Unsafe. The changes systematically replace pointer addressing schemes, introduce offset compression methods, and update memory allocation and access patterns throughout the class.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45–75 minutes
Suggested reviewers
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
@coderabbitai review
✅ Actions performed: review triggered.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/src/main/java/io/questdb/cairo/map/OrderedMap.java (1)
96-99: Offset compression overflow issue confirmed — entries silently lost at 16+ GiB heap sizes

The analysis in the review comment is correct. The code allows heaps up to ~32 GiB (via the `MAX_HEAP_SIZE` validation in `resize`, line 594), but `compressOffset` (lines 393–395) overflows into negative values for offsets exceeding ~16 GiB:

- `Integer.MAX_VALUE = 2,147,483,647`, so the max safe offset is `2,147,483,646 * 8` ≈ 17.2 GiB
- For larger offsets, `(offset >> 3) + 1` wraps to a negative `int` on cast
- During `rehash` (line 568), the condition `while (Unsafe.getUnsafe().getInt(newOffsetAddr) > 0)` skips negative `rawOffset` values
- Entries are never copied to the new hash table → silent data loss
- Rehash (line 572) propagates corrupted `rawOffset` values verbatim

Add an overflow guard in `compressOffset` and treat `rawOffset` as unsigned in `decompressOffset`:

```diff
 private static int compressOffset(long offset) {
+    long compressed = (offset >> 3) + 1L;
+    if (compressed > 0xFFFFFFFFL) {
+        throw LimitOverflowException.instance().put("heap offset overflow");
+    }
     return (int) ((offset >> 3) + 1);
 }

 private static long decompressOffset(int rawOffset) {
-    return ((long) rawOffset - 1) << 3;
+    return (Integer.toUnsignedLong(rawOffset) - 1L) << 3;
 }
```

Without this, maps growing beyond ~16 GiB corrupt their hash table state silently instead of failing predictably.
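The wrap-around the review describes can be reproduced in isolation. The following is a minimal standalone sketch (not QuestDB's actual class) of the compression scheme, showing the sign flip once an offset reaches 16 GiB:

```java
public class OffsetCompressionSketch {
    // Store (offset/8)+1 in an int slot so that 0 can mean "empty slot";
    // offsets are 8-byte aligned, so the low 3 bits carry no information.
    static int compressOffset(long offset) {
        return (int) ((offset >> 3) + 1);
    }

    static long decompressOffset(int rawOffset) {
        return ((long) rawOffset - 1) << 3;
    }

    public static void main(String[] args) {
        // round-trips fine for small offsets
        System.out.println(decompressOffset(compressOffset(1024L))); // 1024

        // at 16 GiB the compressed value no longer fits in a signed int
        long big = 1L << 34; // 16 GiB
        System.out.println(compressOffset(big)); // negative: wrapped around
    }
}
```

Any loop that treats a positive `int` as "slot occupied" then silently skips such entries, which is exactly the rehash failure mode described above.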
🧹 Nitpick comments (1)
core/src/main/java/io/questdb/cairo/map/OrderedMap.java (1)
802-879: Key / VarSizeKey: capacity checks and var‑size encoding are coherent

For `Key` and `VarSizeKey`:

- `Key.reset()` seeds `startAddr = kPos` and `appendAddr = kPos + keyOffset`, so var‑size keys reserve space for the 4‑byte length header (`keyOffset == VAR_KEY_HEADER_SIZE`) and fixed‑size keys start at `kPos`.
- `checkCapacity(requiredKeySize)` always considers the upcoming write plus the full `valueSize`. For var‑size keys (where it's called per `put*`), this ensures that by the time `commit()`/`createValue()` run, `appendAddr + valueSize <= heapLimit`. On overflow it calls `resize(requiredSize, appendAddr)` and adjusts `startAddr`/`appendAddr` by the returned delta.
- `VarSizeKey.commit()` writes the key length into the reserved header and uses it for hashing (`hash()` over `[startAddr + keyOffset, len]`) and equality checks.
- All var‑size encodings (`putStr*`, `putVarchar`, `putBin`, `putArray`, `putInterval`, etc.):
  - Check capacity based on their byte count.
  - Lay out data starting at `appendAddr` with length prefixes where appropriate.
  - Advance `appendAddr` by the exact number of bytes written.
- `VarSizeKey.eq(long offset)` correctly:
  - Treats `offset` as a relative heap offset.
  - Compares the stored length headers at `heapAddr + offset` vs. `startAddr`.
  - Uses `Vect.memeq` on the key body `[header + keyOffset, header + keyOffset + len)`.

Overall, the var‑size key path remains internally consistent with the refactored heap addressing and should behave the same as before, modulo the offset‑compression limit noted earlier.

You might consider adding a brief comment near `checkCapacity` explaining that it is intended to be called per incremental var‑size write with `requiredKeySize == bytesToAppend`, as this is non‑obvious from the signature and helps future maintainers avoid misusing it.

Also applies to: 886-907, 914-925, 941-955, 967-979, 981-993, 1000-1005, 1007-1013, 1015-1028, 1030-1048, 1050-1055, 1057-1072, 1074-1084, 1086-1113, 1120-1126, 1128-1138, 1140-1149
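The reserve-header-then-commit layout described for var‑size keys can be illustrated with a self-contained sketch. A heap `ByteBuffer` stands in for the Unsafe-managed memory; the constant name follows the review text:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class VarSizeKeySketch {
    static final int VAR_KEY_HEADER_SIZE = 4;

    // Reserve a 4-byte header, append the key body after it, then
    // "commit" by writing the body length into the reserved header.
    static byte[] writeKey(byte[] keyBytes) {
        ByteBuffer buf = ByteBuffer
                .allocate(VAR_KEY_HEADER_SIZE + keyBytes.length)
                .order(ByteOrder.LITTLE_ENDIAN);
        buf.position(VAR_KEY_HEADER_SIZE); // appendAddr starts past the header
        buf.put(keyBytes);                 // incremental put* writes land here
        buf.putInt(0, keyBytes.length);    // commit(): fill in the header
        return buf.array();
    }

    static int keyLength(byte[] entry) {
        return ByteBuffer.wrap(entry).order(ByteOrder.LITTLE_ENDIAN).getInt(0);
    }

    public static void main(String[] args) {
        byte[] entry = writeKey("abc".getBytes());
        System.out.println(entry.length);     // 7 = 4-byte header + 3 bytes
        System.out.println(keyLength(entry)); // 3
    }
}
```

Equality and hashing then operate on the header plus body exactly as laid out, which is why the length comparison can short-circuit before any byte-wise comparison.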
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- core/src/main/java/io/questdb/cairo/map/OrderedMap.java (29 hunks)
- core/src/test/java/io/questdb/test/cairo/map/OrderedMapTest.java (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-04T09:16:27.366Z
Learnt from: jerrinot
Repo: questdb/questdb PR: 5996
File: core/src/test/java/io/questdb/test/cutlass/http/line/LineHttpSenderTest.java:350-356
Timestamp: 2025-08-04T09:16:27.366Z
Learning: In QuestDB ILP tests, maximum dimension values like `(1 << 28) - 1` are intentionally hard-coded instead of using constants like `ArrayView.DIM_MAX_LEN` to serve as "canaries" that will cause test failures if the API limits change, ensuring developers are aware of breaking changes to the ILP API.
Applied to files:
core/src/main/java/io/questdb/cairo/map/OrderedMap.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (35)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x86-graal)
- GitHub Check: New pull request (Hosted Running tests on windows-other-2)
- GitHub Check: New pull request (Hosted Running tests on windows-other-1)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-2)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-1)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-other)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-sub)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-sub)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-base)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-root)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-sub)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-root)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-arm64)
- GitHub Check: New pull request (Check Changes Check changes)
- GitHub Check: build
- GitHub Check: build
🔇 Additional comments (4)
core/src/test/java/io/questdb/test/cairo/map/OrderedMapTest.java (1)
82-84: Formatting-only change in test class is fine

The added blank line after the class declaration is harmless and keeps the file readable; no test behavior is affected.
core/src/main/java/io/questdb/cairo/map/OrderedMap.java (3)
176-193: Heap/offset allocation, clear, close, and reopen logic look consistent

The refactor to track `heapAddr`/`heapLimit`/`kPos` and a raw `offsetsAddr` appears sound:

- The constructor allocates both the heap (`heapAddr = kPos = Unsafe.malloc(...)`) and the offsets (`offsetsAddr = Unsafe.malloc(...)`) and zero‑fills the offsets block.
- `clear()` correctly resets `kPos` to `heapAddr`, resets `free`, `size`, and `nResizes`, and memset's the offset table without reallocating.
- `close()` frees `offsetsAddr` and the heap using the recorded sizes, zeros out pointers and capacity fields, and leaves the object in a reusable but "closed" state (`heapAddr == 0`).
- `reopen(..)`/`reopen()` guard on `heapAddr == 0` and route through `restoreInitialCapacity()`, which reallocates both the heap and offsets blocks (handling the `0` case via realloc) and then calls `clear()`.

This keeps the invariants (`heapAddr != 0` iff open, offsets sized to `keyCapacity << 3`) coherent across lifecycle transitions.

Also applies to: 260-267, 269-281, 330-345, 347-363
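The lifecycle invariants summarized above can be mimicked on-heap. A rough sketch follows, with plain arrays standing in for the Unsafe-allocated blocks and field names borrowed from the review (it is illustrative only, not QuestDB code):

```java
import java.util.Arrays;

public class MapLifecycleSketch {
    private byte[] heap;   // stands in for the heapAddr..heapLimit block
    private int[] offsets; // stands in for the offsetsAddr block
    private int kPos;      // next write position within the heap

    MapLifecycleSketch(int heapSize, int keyCapacity) {
        heap = new byte[heapSize];
        offsets = new int[keyCapacity]; // zero-filled: every slot "empty"
        kPos = 0;
    }

    // reset write position and offset table, keep the allocations
    void clear() {
        kPos = 0;
        Arrays.fill(offsets, 0);
    }

    // release both blocks; mirrors the "heapAddr == 0" closed state
    void close() {
        heap = null;
        offsets = null;
    }

    boolean isOpen() {
        return heap != null;
    }

    // reopen only from the closed state, reallocating both blocks
    void reopen(int heapSize, int keyCapacity) {
        if (!isOpen()) {
            heap = new byte[heapSize];
            offsets = new int[keyCapacity];
            clear();
        }
    }
}
```

The key property is that `clear()` resets state without reallocating, while `close()`/`reopen()` are the only transitions that free or acquire memory.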
283-295: Probe/insert path and heap accounting maintain 8‑byte alignment

The hot path for lookups and inserts — `valueAt`, `probe0`, `probeReadOnly`, `asNew`, `getAppendOffset`, `getHeapSize`, `getUsedHeapSize`, and `resize` — is internally consistent after the refactor:

- All entry addresses stored in `offsetsAddr` are based on `heapAddr` and are 8‑byte aligned (`kPos = Bytes.align8b(keyWriter.appendAddr + valueSize)` and `alignedEntrySize = Bytes.align8b(entrySize)` in the merge code).
- `getHeapSize()` and `getUsedHeapSize()` compute sizes as `heapLimit - heapAddr` and `kPos - heapAddr`, which match the base/root pointer semantics.
- `probe0`/`probeReadOnly` interpret decompressed offsets as relative to `heapAddr` and reconstruct `startAddr` via `heapAddr + offset`, matching how `asNew` stores compressed offsets (`compressOffset(keyWriter.startAddr - heapAddr)`).
- `resize()` grows the heap in terms of bytes (`kCapacity`), enforces an upper bound, and returns a `delta` that callers apply to `kPos` (and `Key.startAddr`/`appendAddr` via `checkCapacity`), so existing in‑progress keys remain valid after reallocation.

Assuming the offset compression issue is addressed, this layout should avoid misalignment and keep heap usage reporting correct.
Also applies to: 307-310, 317-319, 379-386, 401-412, 514-539, 581-608
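The 8‑byte alignment relied on throughout (`Bytes.align8b` in the diff) is the standard round-up-to-a-power-of-two bit trick; a quick sketch, assuming that is how it is implemented:

```java
public class Align8Sketch {
    // Round n up to the next multiple of 8; works because 8 is a power of two,
    // so masking with ~7 clears the low 3 bits after adding 7.
    static long align8b(long n) {
        return (n + 7) & ~7L;
    }

    public static void main(String[] args) {
        System.out.println(align8b(13)); // 16
        System.out.println(align8b(16)); // 16 (already aligned, unchanged)
        // aligned entry offsets make (offset >> 3) lossless for compression
        System.out.println(align8b(13) % 8); // 0
    }
}
```

Alignment is what lets the compression scheme discard the low 3 bits of every offset without losing information.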
627-646: FixedSizeKey: direct Unsafe writes and hashing align with new heap layout

Within `FixedSizeKey`:

- `commit()` now asserts that `appendAddr` has not overshot `startAddr + keySize`, which guards misuse of the `put*` methods.
- `copyFrom`/`copyFromRawKey` copy raw key bytes from another key via `Vect.memcpy`, and the new signature `copyFromRawKey(srcFixedKey.startAddr, keySize)` is compatible with cross‑map copies (start addresses are absolute).
- All `put*` primitives (`putBool`, `putByte`, `putChar`, `putInt`, `putLong`, `putLong128`, `putLong256`, `putInterval`, etc.) write via `Unsafe` at `appendAddr` and then increment `appendAddr` by the exact size of the written type, matching the static `keySize` computed at construction.
- `hash()` uses `Hash.hashMem64(startAddr, keySize)`, and `eq(long offset)` compares against `heapAddr + offset`, which lines up with the new compressed offset representation.

Given that `FixedSizeKey.init()` calls `checkCapacity(keySize)` before any writes, this path looks correct and safe under the new addressing scheme.

Also applies to: 671-679, 693-707, 721-725, 734-745, 748-754, 757-769, 771-779, 786-794, 796-800
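The "each put* advances the append position by the written type's size" discipline can be sketched with a `ByteBuffer`, whose `position()` plays the role of `appendAddr` (a simplified stand-in for the Unsafe writes; the `commit()` check mirrors the assert described above):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class FixedSizeKeySketch {
    private final ByteBuffer buf; // position() plays the role of appendAddr
    private final int keySize;    // precomputed total size of the key columns

    FixedSizeKeySketch(int keySize) {
        this.keySize = keySize;
        this.buf = ByteBuffer.allocate(keySize).order(ByteOrder.LITTLE_ENDIAN);
    }

    void putInt(int v)   { buf.putInt(v); }   // advances position by 4
    void putLong(long v) { buf.putLong(v); }  // advances position by 8

    // writers must not overshoot the precomputed key size
    void commit() {
        if (buf.position() > keySize) {
            throw new IllegalStateException("key overflow");
        }
    }

    public static void main(String[] args) {
        FixedSizeKeySketch key = new FixedSizeKeySketch(12); // one int + one long
        key.putInt(42);
        key.putLong(7L);
        key.commit(); // 4 + 8 == 12: fits exactly
        System.out.println("committed");
    }
}
```

Because the total size is fixed at construction, hashing and equality can run over exactly `keySize` bytes without any per-entry length header.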
[PR Coverage check] 😍 pass: 207 / 219 (94.52%) — file detail
there is a valid nitpick, 16 GiB RAM is achievable. We should error out in case of overflow. What do you think?
@bluestreak01 that nitpick is not valid since we have this check in the rehash method:

```java
private void rehash(long newKeyCapacity) {
    if (newKeyCapacity > MAX_SAFE_INT_POW_2) {
        throw CairoException.nonCritical().put("map capacity overflow");
    }
```

There is no need to check the offsets each time we compress/decompress them - instead we do the check once when growing the hash table.
`OrderedMap` was creating a `DirectIntList` instance each time it was resized. This patch gets rid of the litter and also removes redundant address calculations for offset/hash code lookups.

Microbenchmarks
Before:
After: