Conversation
📝 WalkthroughWalkthroughInternal little-endian helper Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant HW as HuffmanBitWriter
participant Buf as Output Buffer ([]byte)
participant LE as internal/le.Store64
rect rgb(238,245,255)
note over HW: Writing compressed bits
HW->>LE: Store64(Buf, offset, bits)
note right of LE: Compute address at Buf[offset:]
LE-->>Buf: Write 8 bytes (little-endian)
end
note over HW,LE: Same flow under unsafe-enabled/disabled variants
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
internal/le/unsafe_disabled.go (1)
39-42: Docstring mismatch: now stores at b[i:], not bThe comment still says “Store64 will store v at b.” but the implementation writes at an offset (b[i:]). Update the comment to avoid confusion.
Apply:
-// Store64 will store v at b. +// Store64 will store v at b[i:]. func Store64[I Indexer](b []byte, i I, v uint64) {Optional: consider adding a brief precondition note (len(b) >= int(i)+8) to mirror how call sites rely on headroom.
internal/le/unsafe_enabled.go (1)
49-52: Store64 now writes at an explicit offset — good API for hot pathsThis matches the new call sites and avoids extra slicing/indexing. Given this uses unsafe.Add, it’s worth documenting the precondition that len(b) >= int(i)+8 and that i must be non-negative.
Add a brief precondition comment:
-// Store64 will store v at b[i:]. +// Store64 will store v at b[i:]. +// Precondition: len(b) >= int(i)+8 and i >= 0. func Store64[I Indexer](b []byte, i I, v uint64) {flate/huffman_bit_writer.go (1)
437-439: Approve — no 2‑arg le.Store64 usages found; tiny follow-ups
- Verified: all le.Store64 calls are three-argument (found at flate/huffman_bit_writer.go:438, 854, 881, 903, 928, 949, 1102, 1131).
- Add a short comment by the bytes field declaration documenting the intentional +8 headroom (overwrite-8 / consume-6 invariant).
- Optional: for consistency, change the call using
nat flate/huffman_bit_writer.go:438 tonbytes.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
flate/deflate.go(3 hunks)flate/huffman_bit_writer.go(8 hunks)internal/le/unsafe_disabled.go(1 hunks)internal/le/unsafe_enabled.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
flate/deflate.go (2)
internal/le/unsafe_disabled.go (1)
Load32(20-22)internal/le/unsafe_enabled.go (1)
Load32(26-30)
internal/le/unsafe_disabled.go (2)
internal/le/unsafe_enabled.go (1)
Store64(50-52)internal/le/le.go (1)
Indexer(3-5)
flate/huffman_bit_writer.go (2)
internal/le/unsafe_disabled.go (1)
Store64(40-42)internal/le/unsafe_enabled.go (1)
Store64(50-52)
internal/le/unsafe_enabled.go (2)
internal/le/unsafe_disabled.go (1)
Store64(40-42)internal/le/le.go (1)
Indexer(3-5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
- GitHub Check: Analyze (go)
- GitHub Check: fuzz-other (nounsafe)
- GitHub Check: build (1.25.x, ubuntu-latest)
- GitHub Check: build (1.24.x, windows-latest)
- GitHub Check: build (1.25.x, macos-latest)
- GitHub Check: fuzz-other ("noasm,nounsafe")
- GitHub Check: fuzz-zstd (nounsafe)
- GitHub Check: build (1.23.x, windows-latest)
- GitHub Check: fuzz-zstd ("noasm,nounsafe")
- GitHub Check: fuzz-s2 (nounsafe)
- GitHub Check: build (1.24.x, macos-latest)
- GitHub Check: build (1.25.x, windows-latest)
- GitHub Check: build (1.24.x, ubuntu-latest)
- GitHub Check: build (1.23.x, macos-latest)
- GitHub Check: fuzz-s2 ("noasm,nounsafe")
- GitHub Check: build-special
- GitHub Check: build (1.23.x, ubuntu-latest)
🔇 Additional comments (2)
flate/deflate.go (2)
14-15: Import shift to internal/le looks goodDropping encoding/binary here and centralizing loads via internal/le is appropriate and aligns with the PR intent.
366-366: Swap to le.Load32 retains semantics and improves inliningle.Load32(b, 0) is equivalent to binary.LittleEndian.Uint32(b) and should be faster with unsafe-enabled builds; caller precondition len(b) >= 4 still applies. Confirm there are no remaining binary.LittleEndian.Uint32 call sites in flate/ — the provided ripgrep check returned no matches (empty output); re-run or verify manually.
|
It also could speedup gzip compression, Right? So any plan to release a tag for this performance improvement? Thanks! |
|
@phuslu Yes, but much less so, since entropy coding is only a (minor) part of it. This is difference on "huffman" (last column is MiB/s) This is the fastest compression... I am reviewing code for golang/go#75532 - I will probably release once I've submitted that. For now the stdlib will not include "unsafe" code. |
|
Yes, that's true. We also are a modified k8s branch, it already integrated klauspost/compress for speedup gzip/zstd. |
This PR contains the following updates: | Package | Change | Age | Confidence | |---|---|---|---| | [github.com/klauspost/compress](https://github.com/klauspost/compress) | `v1.18.0` -> `v1.18.1` | [](https://docs.renovatebot.com/merge-confidence/) | [](https://docs.renovatebot.com/merge-confidence/) | --- ### Release Notes <details> <summary>klauspost/compress (github.com/klauspost/compress)</summary> ### [`v1.18.1`](https://github.com/klauspost/compress/releases/tag/v1.18.1) [Compare Source](klauspost/compress@v1.18.0...v1.18.1) #### What's Changed - zstd: Fix incorrect buffer size in dictionary encodes by [@​klauspost](https://github.com/klauspost) in [#​1059](klauspost/compress#1059) - s2: check for cap, not len of buffer in EncodeBetter/Best by [@​vdarulis](https://github.com/vdarulis) in [#​1080](klauspost/compress#1080) - zstd: Add simple zstd EncodeTo/DecodeTo functions by [@​klauspost](https://github.com/klauspost) in [#​1079](klauspost/compress#1079) - zlib: Avoiding extra allocation in zlib.reader.Reset by [@​travelpolicy](https://github.com/travelpolicy) in [#​1086](klauspost/compress#1086) - gzhttp: remove redundant err check in zstdReader by [@​ryanfowler](https://github.com/ryanfowler) in [#​1090](klauspost/compress#1090) - Run modernize. Deprecate Go 1.22 by [@​klauspost](https://github.com/klauspost) in [#​1095](klauspost/compress#1095) - flate: Simplify matchlen by [@​klauspost](https://github.com/klauspost) in [#​1101](klauspost/compress#1101) - flate: Add examples by [@​klauspost](https://github.com/klauspost) in [#​1102](klauspost/compress#1102) - flate: Use exact sizes for huffman tables by [@​klauspost](https://github.com/klauspost) in [#​1103](klauspost/compress#1103) - flate: Faster load+store by [@​klauspost](https://github.com/klauspost) in [#​1104](klauspost/compress#1104) - Add notice to S2 about MinLZ by [@​klauspost](https://github.com/klauspost) in [#​1065](klauspost/compress#1065) #### New Contributors - [@​wooffie](https://github.com/wooffie) made their first contribution in [#​1069](klauspost/compress#1069) - [@​vdarulis](https://github.com/vdarulis) made their first contribution in [#​1080](klauspost/compress#1080) - [@​travelpolicy](https://github.com/travelpolicy) made their first contribution in [#​1086](klauspost/compress#1086) - [@​ryanfowler](https://github.com/ryanfowler) made their first contribution in [#​1090](klauspost/compress#1090) **Full Changelog**: <klauspost/compress@v1.18.0...v1.18.1> </details> --- ### Configuration 📅 **Schedule**: Branch creation - Between 12:00 AM and 03:59 AM ( * 0-3 * * * ) (UTC), Automerge - Between 12:00 AM and 03:59 AM ( * 0-3 * * * ) (UTC). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0MS4xNTIuOSIsInVwZGF0ZWRJblZlciI6IjQxLjE1Mi45IiwidGFyZ2V0QnJhbmNoIjoiZm9yZ2VqbyIsImxhYmVscyI6WyJkZXBlbmRlbmN5LXVwZ3JhZGUiLCJ0ZXN0L25vdC1uZWVkZWQiXX0=--> Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/9786 Reviewed-by: Earl Warren <[email protected]> Co-authored-by: Renovate Bot <[email protected]> Co-committed-by: Renovate Bot <[email protected]>
~10% faster huffman encoding.
Summary by CodeRabbit
Performance
Refactor
Chores
Compatibility