perf(http): garbage-free HTTP header parser#6397
Conversation
This change reduces the allocation rate when parsing HTTP headers. Previously, header parser would carefully pool HTTP headers, only to store them in a map which implicitly created a defensive copy of all keys. This change extends the map with a method to store keys without creating a defensive copy.
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis PR introduces immutable key storage mechanisms across UTF-8 collection classes by adding a new Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai pls review |
|
@jerrinot, I'll review the changes in this PR focusing on the garbage-free HTTP header parser implementation and the optimization to avoid defensive copies when storing keys in the map. ✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
core/src/main/java/io/questdb/cutlass/http/HttpHeaderParser.java(1 hunks)core/src/main/java/io/questdb/cutlass/http/processors/LineHttpTudCache.java(4 hunks)core/src/main/java/io/questdb/std/AbstractLowerCaseUtf8SequenceHashSet.java(1 hunks)core/src/main/java/io/questdb/std/LowerCaseUtf8SequenceObjHashMap.java(4 hunks)core/src/test/java/io/questdb/test/std/LowerCaseUtf8SequenceObjHashMapTest.java(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (35)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-other)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-sub)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-root)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz2)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz1)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-sub)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-root)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (Hosted Running tests on windows-other-2)
- GitHub Check: New pull request (Hosted Running tests on windows-other-1)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-2)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-1)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-sub)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-base)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x86-graal)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-arm64)
- GitHub Check: New pull request (Check Changes Check changes)
- GitHub Check: build
- GitHub Check: build
🔇 Additional comments (6)
core/src/main/java/io/questdb/cutlass/http/HttpHeaderParser.java (1)
332-332: LGTM! Optimization correctly reduces allocations.The switch to
putImmutableeliminates defensive key copies during header storage. The lifecycle is properly managed:headers.clear()(line 116) removes all references before pooledDirectUtf8Stringinstances can be reused, satisfying the immutability guarantee documented inputImmutable.core/src/main/java/io/questdb/cutlass/http/processors/LineHttpTudCache.java (1)
86-86: LGTM! Type changes align with the immutable key storage refactor.The widening of the key type from
Utf8StringtoUtf8Sequenceis safe and consistent with the broader changes across the map infrastructure. The usage at line 182 properly creates new instances viaUtf8String.newInstance(line 169) before storage, satisfying lifecycle requirements.Also applies to: 104-104, 119-119, 187-187
core/src/main/java/io/questdb/std/AbstractLowerCaseUtf8SequenceHashSet.java (1)
40-40: LGTM! Key storage type widened to support immutable insertions.Changing the backing storage from
Utf8String[]toUtf8Sequence[]enables the map to store anyUtf8Sequenceimplementation directly, which is essential for the garbage-free optimization introduced byputImmutable.Also applies to: 51-51
core/src/test/java/io/questdb/test/std/LowerCaseUtf8SequenceObjHashMapTest.java (1)
206-231: LGTM! Comprehensive test coverage for identity-preserving behavior.The test properly verifies that
putImmutableretains the exact key instance (viaassertSameat lines 219-220), exercises both insertion paths (DirectUtf8Sink and Utf8String), and confirms correct behavior on removal. This provides strong validation for the garbage-free optimization.core/src/main/java/io/questdb/std/LowerCaseUtf8SequenceObjHashMap.java (2)
88-109: LGTM! Well-designed API for garbage-free key storage.The new
putImmutablemethod is properly documented with clear lifecycle requirements (lines 94-99), and the implementation inputImmutableAtcorrectly stores the key instance without defensive copying. This achieves the stated performance goal of reducing allocation rates.Also applies to: 131-145
39-39: LGTM! Type changes enable polymorphic key storage.The list type change from
ObjList<Utf8String>toObjList<Utf8Sequence>(line 39) and correspondingkeys()return type update (line 67) properly support storing anyUtf8Sequenceimplementation, enabling the garbage-free optimization while maintaining type safety.Also applies to: 67-67
|
Note Unit test generation is an Early Access feature. Expect some limitations and changes as we gather feedback and continue to improve it. Generating unit tests... This may take up to 20 minutes. |
|
Caution The CodeRabbit agent failed during execution: Clone operation failed |
|
@CodeRabbit try again, something went wrong? does this change really reduce allocations? |
🧩 Analysis chain🏁 Script executed: Length of output: 5932 🏁 Script executed: Length of output: 445 🏁 Script executed: Length of output: 5978 @bluestreak01, yes, this change definitely reduces allocations! Let me explain how: Allocation Reduction AnalysisBefore (using public boolean putAt(int index, Utf8Sequence key, V value) {
Utf8String onHeapKey = Utf8String.newInstance(key); // ❌ ALLOCATION
return putImmutableAt(index, onHeapKey, value);
}Every time a header was stored, the map created a defensive copy by allocating a new After (using public boolean putImmutable(Utf8Sequence key, V value) {
return putImmutableAt(keyIndex(key), key, value);
}
private boolean putImmutableAt(int index, Utf8Sequence key, V value) {
// ...
keys[index] = key; // ✅ NO ALLOCATION - stores key directly
// ...
}The key is now stored directly without creating a copy, eliminating one allocation per header. In HttpHeaderParser: // Changed from:
headers.put(headerName, ...)
// To:
headers.putImmutable(headerName, ...)Since HTTP requests typically have 5-15 headers, this eliminates 5-15 allocations per request during header parsing. The change is safe because This is a clean performance optimization that reduces GC pressure in the hot path of HTTP request processing. ✅ Actions performedReview triggered.
|
|
@CodeRabbit is there a regression where this map is used elsewhere and there is a risk of storing mutable copies ? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 270281 🏁 Script executed: Length of output: 10655 🏁 Script executed: Length of output: 35503 @bluestreak01 Short answer: with the current call sites, no regression found.
I also scanned the tree and didn’t find any other non-test usages or any other putImmutable(...) calls. So the only place we store a pooled/mutable DirectUtf8String by reference is HttpHeaderParser, and its lifecycle is contained to a single request, with no reuse before headers.clear(). To prevent future footguns, I’d document (in the map javadoc and call sites) that putImmutable must only be used when:
If you want, I can add a brief comment near headers.putImmutable(...) and in LowerCaseUtf8SequenceObjHashMap.putImmutable(...) reinforcing that contract. |
[PR Coverage check]😍 pass : 19 / 20 (95.00%) file detail
|
This change reduces the allocation rate when parsing HTTP headers. Previously, header parser would carefully pool HTTP headers, only to store them in a map which implicitly created a defensive copy of all keys.
This change extends the map with a method to store keys without creating a defensive copy.