1#1
Merged
Merged
Conversation
Signed-off-by: Ayman Bagabas <[email protected]>
Prevent combining deepen <n> with deepen-since or deepen-not, following canonical git's validation in upload-pack.c: `deepen and deepen-since (or deepen-not) cannot be used together` Add validation to both encoder and decoder, with tests for all invalid combinations. Signed-off-by: Ayman Bagabas <[email protected]>
Signed-off-by: Ayman Bagabas <[email protected]>
…atching protocol lines Signed-off-by: Ayman Bagabas <[email protected]>
…achine Signed-off-by: Ayman Bagabas <[email protected]>
Signed-off-by: Ayman Bagabas <[email protected]>
…ents Signed-off-by: Ayman Bagabas <[email protected]>
Signed-off-by: Ayman Bagabas <[email protected]>
Signed-off-by: Ayman Bagabas <[email protected]>
Signed-off-by: Ayman Bagabas <[email protected]>
Signed-off-by: Ayman Bagabas <[email protected]>
Signed-off-by: Ayman Bagabas <[email protected]>
Signed-off-by: Ayman Bagabas <[email protected]>
plumbing: packp, refactor AdvRefs and capability packages
* *: skip gitignored directories during Status walk Status() walks the working tree via the filesystem merkletrie noder, then filters gitignored entries in excludeIgnoredChanges. Every file under a large ignored directory like node_modules is lstat'd and hashed before being dropped, even though CLI git skips such directories at the directory level (dir.c:read_directory_recursive). Plumb an IgnoreMatcher into filesystem.Options so the walker can skip entries that match the matcher AND have no corresponding entry in the index. Tracked entries that happen to match an ignore rule are still walked so their modifications are reported. Patterns are loaded once in diffStagingWithWorktree and reused, so the post-walk filter no longer re-reads them. Benchmark on a tree with 100 tracked source files and N untracked files in a gitignored directory (Apple M4 Max, -benchtime=10x): Untracked | before | after | speedup --------- | ------- | ------ | ------- 0 | 1.5 ms | 1.9 ms | - 1,000 | 7.0 ms | 4.4 ms | 1.6x 5,000 | 18.2 ms | 4.4 ms | 4.2x 20,000 | 58.4 ms | 4.4 ms | 13.3x The small overhead in the 0-untracked case comes from the unchanged ReadPatterns walk plus a small trackedDirs map; the gain comes from the diff walk no longer descending into ignored subtrees. Assisted-by: Claude Opus 4.7 Signed-off-by: Stefan Haubold <[email protected]> * *: test ignored-directory walk skipping in Status Add coverage for the three correctness invariants of the new IgnoreMatcher path in the filesystem noder, plus a Status-level integration test: - TestIgnoredDirIsSkipped: an ignored directory containing only untracked files is not walked. - TestTrackedFileInIgnoredDirReportsModify: a tracked file inside an ignored directory still surfaces as Modify. - TestUntrackedSiblingsInIgnoredDirAreSkipped: when the walker descends into an ignored directory because of a tracked file, untracked siblings are still filtered. - TestStatusReportsModifiedTrackedFileInIgnoredDirectory: the same invariant as the second test, exercised through Worktree.Status(). Assisted-by: Claude Opus 4.7 Signed-off-by: Stefan Haubold <[email protected]> * *: add t.Parallel() to ignore-walk tests The paralleltest linter is enabled in .golangci.yaml and existing tests in utils/merkletrie/filesystem follow the t.Parallel() convention. The three tests added in 9695898 were missing the call, breaking validate-lint on CI. Assisted-by: Claude Opus 4.7 Signed-off-by: Stefan Haubold <[email protected]> * utils: merkletrie/filesystem, ignore IgnoreMatcher when Index is nil shouldSkipIgnored documents that IgnoreMatcher is a no-op when no Index is provided, since without an index there is no way to identify tracked entries. The implementation did not enforce that contract: idxMap nil caused ignored files to be skipped, and trackedDirs nil (a nil-map lookup returning false) caused ignored directories to be skipped too. Add an explicit early return when idxMap is nil, drop the now-dead nil check on the file path, tighten the doc comment on Options.IgnoreMatcher, and cover the contract with a regression test. Assisted-by: Claude Opus 4.7 Signed-off-by: Stefan Haubold <[email protected]> * *: drop redundant excludeIgnoredChanges post-walk filter The filesystem walker now skips ignored, untracked entries during the Status walk, so they never enter the merkletrie. The post-walk filter that removed them again can never observe such an entry, and tracked entries are walked in both implementations, so excludeIgnoredChanges agrees with the new walker on every input. Drop the function and its only caller in diffStagingWithWorktree, and fold the early return into the DiffTree calls. collectIgnorePatterns is kept; it now only feeds the IgnoreMatcher. Assisted-by: Claude Opus 4.7 Signed-off-by: Stefan Haubold <[email protected]> * utils: merkletrie/filesystem, switch trackedDirs to a set trackedDirs is only ever consulted as a presence check; it never stored a meaningful bool value. Use map[string]struct{} to make the intent explicit and avoid the per-entry byte. Assisted-by: Claude Opus 4.7 Signed-off-by: Stefan Haubold <[email protected]> * *: fold ignored-dir status bench into worktree_status_bench_test.go Keep all Status() benchmarks in a single file. Also refresh the docstrings: they described the pre-PR behavior where ignored files were walked and hashed, which no longer matches reality now that the filesystem walker skips ignored directories. Assisted-by: Claude Opus 4.7 Signed-off-by: Stefan Haubold <[email protected]> * utils: merkletrie/filesystem, use gitignore.Matcher in Options directly Drop the local IgnoreMatcher interface and reference gitignore.Matcher in Options instead. The interface served only to keep this package free of the gitignore import, but reviewers preferred the simpler direct reference: every real caller plugs in a gitignore.Matcher anyway, so the abstraction was paying for a layering boundary nobody was using. Assisted-by: Claude Opus 4.7 Signed-off-by: Stefan Haubold <[email protected]> * utils: merkletrie/filesystem, walk submodules inside ignored directories shouldSkipIgnored consulted trackedDirs in the directory branch, but trackedDirs only records *parent* paths of index entries. A submodule's own path is the index entry, so its directory was pruned by the walker when its parent matched an ignore rule, and modifications to the submodule were silently dropped from Status(). Check idxMap before falling back to trackedDirs: an entry whose own path is in the index is tracked, whether it is a regular file or a directory-shaped entry like a submodule. Add a regression test that places a submodule under an ignored directory. Assisted-by: Claude Opus 4.7 Signed-off-by: Stefan Haubold <[email protected]> * utils: merkletrie/filesystem, expand ignore-walk test coverage Add three regression tests that exercise edge cases of the ignore-walk optimization, modeled on equivalent scenarios in upstream git's t/t7061-wtstatus-ignore.sh and t/t7521-ignored-mode.sh: - TestDeeplyNestedTrackedFileInIgnoredDir: a tracked file three directories deep under an ignored top-level dir, exercising the parent-chain population of trackedDirs. - TestFilePatternIgnoreSkipsUntrackedSiblings: a *.log file pattern rather than a directory pattern, ensuring the file branch of shouldSkipIgnored handles per-file ignore rules. - TestEmptyIgnoredDirIsSkipped: an ignored directory with no entries on disk is still pruned without surfacing as a child. Assisted-by: Claude Opus 4.7 Signed-off-by: Stefan Haubold <[email protected]> --------- Signed-off-by: Stefan Haubold <[email protected]>
build: Update module github.com/pjbgf/sha1cd to v0.6.0 (main)
internal: servers, add git server implementation
Signed-off-by: Paulo Gomes <[email protected]>
Signed-off-by: Paulo Gomes <[email protected]>
Signed-off-by: Paulo Gomes <[email protected]>
Signed-off-by: Paulo Gomes <[email protected]>
Signed-off-by: Paulo Gomes <[email protected]>
Previously go-git flush size was hard-coded to 32, which could be rather inefficient for very large negotiations. This change aligns the behaviour with upstream, growing exponentially that number when it is safe to do so, as per next_flush() (fetch-pack.c:277-291). The initial flush size is now 16 - which is also aligned with upstream. ref: https://github.com/git/git/blob/9f223ef1c026d91c7ac68cc0211bde255dda6199/fetch-pack.c Signed-off-by: Paulo Gomes <[email protected]>
… git Add applyServerACKs to properly handle multi-ack-detailed responses: - Reset in_vain on ACK_continue and ACK_ready (fetch-pack.c:550-552) - Re-queue ACK_common haves for stateless RPC rounds (fetch-pack.c:533-542) - Suppress duplicate ACK_common to avoid re-queuing already-known objects - Track ACK_ready and stop negotiation early (fetch-pack.c:555-556, 569-570) - Send a terminal done round after ACK_ready (fetch-pack.c:577-579) - Clamp batch size to remaining in-vein budget after got_continue Assisted-by: Claude Opus 4.6 <[email protected]> Signed-off-by: Paulo Gomes <[email protected]>
Signed-off-by: Paulo Gomes <[email protected]>
Remove use of `ChrootOS`
plumbing: transport, Align flush size with upstream git
Signed-off-by: Paulo Gomes <[email protected]>
Signed-off-by: Paulo Gomes <[email protected]>
`NewLazyIndexWithPool` is the pool-aware variant of [NewLazyIndex]: it forwards a `*fdpool.Pool` to both the idx and rev [sharedfile.SharedFile]s so a storage-wide LRU bounds the FD budget held by the lazy index. `NewLazyIndex` remains the pool-less convenience over a `nil` pool. The change is purely additive at the surface; the per-source FD lifecycle lives inside the [sharedfile.SharedFile] primitive. Both shared files receive the same pool. Eviction acts on each SharedFile independently — the pool has no idx-versus-rev affinity — so an active .idx is not forced to keep its .rev FD warm, and vice versa. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
`NewWithPool` is the pool-aware variant of [New]: it forwards a `*fdpool.Pool` to the .pack [sharedfile.SharedFile] so a storage-wide LRU bounds the open pack descriptors across many PackHandles. `New` remains the pool-less convenience over a `nil` pool. The change is purely additive at the surface; the per-source FD lifecycle lives inside the [sharedfile.SharedFile] primitive. Only the .pack FD is threaded here. The .idx and .rev FDs are managed by the [idxfile.LazyIndex] that [PackHandle.Index] lazily constructs; storages that want a pool-wired index ride a separate code path (see `storage/filesystem`'s `ObjectStorage.loadLazyIndex`). Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
Wire the storage-wide `*fdpool.Pool` from `Storage` through `DotGit` to both pack-FD (`internal/packhandle`) and idx-FD (`plumbing/format/idxfile`) sharedFiles. `Options.MaxOpenDescriptors` now governs the pool capacity: zero selects `defaultMaxOpenDescriptors` (256, ~85 concurrently-hot packs at 3 FDs each), negative values yield a no-op pool so callers can disable pooling without special-casing the construction path. Replaces the previous no-op stub. `Options.Pool` accepts a caller-constructed `*fdpool.Pool` so multiple `Storage` instances in the same process can share one FD budget. Useful for callers that spawn many short-lived storages concurrently (GitOps controllers reconciling N repositories, daemons handling parallel requests). When nil the storage allocates its own pool from `MaxOpenDescriptors`. `dotgit.Options` gains a `Pool` field that the packhandle constructor consumes via `packhandle.NewWithPool`. `ObjectStorage.loadLazyIndex` switches to `idxfile.NewLazyIndexWithPool` and forwards the same pool. `DotGit.Alternates` propagates the parent's pool to alternate dotgits so the storage-wide budget covers alternates too. Tests inject a `*fdpool.Pool` via `Options.Pool` and inspect `pool.Stats()` directly to verify the wiring, mirroring the shape an external caller uses to observe the pool. Coverage spans the alternate inheritance path, pool sharing across multiple storages, and the disabled-pool path. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
Three benches characterise the pool wiring at increasing
scope. All numbers Apple M4 Pro, single -benchtime=1s run;
relative shapes are stable but absolute ns/op drift across
runs and benchstat against a baseline is the correct
methodology for quantitative comparisons.
`BenchmarkFDPool_WorkingSetFitsInPool` and
`BenchmarkFDPool_Eviction_WorkingSetExceedsPool` target the
pool directly. The fit-in-pool case (wset=16, cap=64, all
hits) runs at ~12 ns/op with zero allocations; the eviction
case (wset=32, cap=16, every Touch evicts) costs ~46 ns/op
with a single `list.PushFront` allocation per round. These
numbers are the floor — they set a ceiling for the storage-
level cost we can attribute to the pool.
`BenchmarkSharedFileAcquireReleaseWithPool` in
`internal/sharedfile` measures the pool-notify overhead on
the SharedFile hot path. Touch on a warm SharedFile is
allocation-free and adds roughly 7 ns/op over the pool-less
baseline (`BenchmarkSharedFileAcquireRelease` ~7 ns/op vs
the pool-wired variant ~14 ns/op).
`BenchmarkStorage_PoolPressure` exercises the full stack
against a 32-pack synthetic fixture with
`cache.NewObjectLRU(0)` so every read traverses the FD
layer. Two sub-benchmarks compare regimes:
NoChurn (cap=256, no eviction) ~12.5 us/op 5.1 KB/op 88 allocs/op
Churn (cap=8, eviction) ~17.9 us/op 5.3 KB/op 91 allocs/op
The ~43% delta between Churn and NoChurn is the cost of
evicting and reopening pack/idx FDs when the working set
exceeds capacity. Each sub-benchmark injects its own
`*fdpool.Pool` via `Options.Pool` and asserts against
`pool.Stats().Evictions` at the end — NoChurn fails if any
eviction fires, Churn fails if no eviction fires — so the
bench refuses to silently stop exercising the pool.
`BenchmarkObjectStorage_ParallelReads` complements the
storage benches by measuring concurrent EncodedObject
throughput on the RLock'd read path.
Assisted-by: Claude Opus 4.7
Signed-off-by: Hidde Beydals <[email protected]>
`BenchmarkAlternatesObjectLookup` constructed its storage via the low-level `NewObjectStorage(dotgit.NewWithOptions(...))` path, which bypasses the FD pool that `NewStorageWithOptions` installs. The preamble noted the goal was to mirror `PlainClone(Shared: true)` behaviour, but the chosen construction shape missed the public API's wiring. Switching to `NewStorageWithOptions` threads the pool through to the alternate's PackHandles and gives the benchmark the same FD lifecycle real consumers see. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
Touch's eviction path discards ReleaseNow's error by design (the Member contract states this explicitly). A future reader comparing this to errors.Join in packhandle.doClose may flag the asymmetry as an oversight. Add a sentence to the existing lock-ordering comment block recording the intentional contrast so the pattern reads as a documented choice rather than missing error handling. No behaviour change; documentation only. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
The Options.MaxOpenDescriptors godoc claimed the pool bounds ".pack/.idx/.rev descriptors across the entire Storage", which overstates coverage — pack-write FDs (PackWriter) are short-lived and not pooled. Reword to name the read side explicitly and add a note that the 256 default assumes go-git holds the dominant share of the process's FD budget; callers running alongside other FD-heavy subsystems should size against their full budget. Options.Pool gains a paragraph naming git.Open, git.Clone, and git.Init as the layered entry points for callers who want to share an *fdpool.Pool across Storage instances. The path-based wrappers (git.PlainOpen, git.PlainClone) construct their own Storage internally and so do not accept an injected pool; that is by design (they exist to hide storage internals). Both Options.Pool fields (filesystem and dotgit) gain a stability disclaimer: the field's API stability tracks fdpool.Pool's, not this package's, because fdpool lives under x/ and may change without semver. Consumers reading these fields should treat them as experimental on the same timeline. No behaviour change; documentation only. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
Stats already exposes Hits and Evictions; operators monitoring the pool could not distinguish clean evictions from those where Member.ReleaseNow returned a non-nil error. Add EvictionFailures so the failure rate is visible without callers building their own wrapping. The eviction itself still completes (the Member is removed from the LRU regardless of ReleaseNow's return) so Evictions still counts the attempt; EvictionFailures is the subset that failed to close cleanly. This matches how operators expect cumulative counters to compose for rate calculations. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
Pool.Touch's eviction previously took the LRU tail unconditionally. Under sustained concurrent load on a small set of hot packs, this caused churn: the LRU tail could be a pinned SharedFile whose FD lingered until refs drained (via the immediateClose latch), then needed reopening on the next Acquire — which itself evicted whoever was at the new tail. The result was extra close+reopen cycles on hot packs without bounding total open FDs more tightly. Add an optional Pinnable interface and a two-pass victim selection in Touch: walk the LRU back-to-front, prefer the LRU-most Member whose Pinned() returns false; fall back to the LRU tail if every Member is pinned. Matches canonical Git's find_lru_pack (packfile.c:482-530). The walk stops before reaching the just-inserted front element so eviction never targets the caller's own Member. SharedFile implements Pinned() as refs > 0. The check takes SharedFile.mu while pool.mu is held, which reverses the "Member→Pool only" lock-acquire rule the existing comment documented. The actual deadlock-avoidance invariant is "Member calls into Pool only after releasing Member.mu" (see Acquire, Close); the new Pool→Member call through Pinned is deadlock-free because no Member call holds s.mu while waiting for p.mu. Update the comment to reflect this. Stats gains PinnedSkips, a cumulative count of pinned Members the walk passed over. Non-zero values surface churn under load. Members that do not implement Pinnable are treated as unpinned, preserving the unconditional-LRU behaviour for any future Member implementation. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
The existing eviction benches assert evictions occurred (or did not) post-loop, but say nothing about Stats.Active relative to capacity. A regression in Touch where the LRU grows past the cap would slip silently past those assertions: evictions still fire, just not enough to keep the working set bounded. Add a post-loop check that Active <= capacity in both benches. Post-loop only — an in-b.Loop() check would distort the timings the bench is trying to measure. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
NewLazyIndexWithPool was exercised only transitively through the storage/filesystem integration tests. A regression in the wiring between LazyIndex's two SharedFiles (idx and rev) and the pool would surface two layers up rather than at the idxfile layer where the change lives, making the failure mode harder to attribute. Add a focused test that constructs two pool-wired LazyIndexes against a real fixture, forces eviction by sizing the pool to exactly one LazyIndex's worth of SharedFiles, and verifies that a read against the evicted LazyIndex reopens its FDs through the stored openers without surfacing an error. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
The existing concurrent ReleaseNow test verifies that Acquire, Release, and ReleaseNow compose without panic or deadlock, but the in-flight body never actually reads from the file. The refcount + immediateClose latch contract guarantees an in-flight read sees a valid FD even if ReleaseNow fires mid-read; a ReleaseNow that closes the FD out from under a reader would slip past tests that only check call composition. Add a stress test where N reader goroutines do Acquire → ReadAt → verify bytes → Release in a loop while M evictor goroutines fire ReleaseNow. A wrong byte slice or a ReadAt error would surface a regression in the latch — the read-back assertion is the load-bearing check, not the absence of panic. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
BenchmarkPackHandleAcquireGrace measures the warm path where the SharedFile's grace timer keeps the .pack FD open across iterations. The cold counterpart — reopen after the grace timer naturally fires — is the workload pattern the grace period is meant to amortize against, but no bench covers it directly. BenchmarkPackHandleAcquireAfterGraceExpiry fills the gap with a wall-clock-accurate measurement: each iteration sleeps past the 1s grace window with the timer paused, so only the subsequent OpenPackReader → ReadAt → Close is counted. The delta against the warm bench is one full pack-SharedFile reopen including the time.AfterFunc-driven close callback. The bench is naturally slow at default -benchtime=1s (each iteration's sleep dominates); the docstring directs users to override via -benchtime=Nx. BenchmarkPackHandleCloseIdleReopen remains the fast proxy for routine runs, fast-forwarding the grace timer via soft-close to keep the cycle synchronous. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
Options.MaxOpenDescriptors and Options.Pool overlapped: both shaped the storage-wide FD budget, but only Pool was the actual primitive — the integer knob just picked a capacity for the per-Storage Pool that NewStorageWithOptions constructed when Pool was nil. Carrying both forced the doc to spell out their interaction (zero vs negative, ignored when Pool is non-nil) without giving callers any expressiveness Pool did not already provide. Drop the field. Nil Pool now selects defaultPoolCapacity (256). Disabling pooling becomes Pool: fdpool.New(0) — same no-op-pool branch a negative MaxOpenDescriptors used to take. The disabled- pool test injects the no-op pool explicitly. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
Pool kept an O(1) Member-to-list.Element map for Touch lookups, making Member an implicit map key. The doc said as much but the constraint is invisible at the type system level: a Member whose concrete type is non-comparable panics on first Touch with runtime "hash of unhashable type" rather than a contract error. Internal callers happened to satisfy it (*SharedFile, *PackHandle are both pointer types) but external consumers — fdpool is under x/ — have no compile-time safety net. Introduce a Handle type that owns the per-Member back-reference into the LRU list. Touch and Forget take a *Handle alongside the Member; the Pool reads and writes Handle fields under p.mu and stores both Member and Handle in the list element so eviction can clear the victim's h.elem before invoking ReleaseNow. Clearing under p.mu is load-bearing: without it, a re-Touch on a just-evicted Member would call MoveToFront on the removed element (a no-op in container/list) and leave the Member silently unregistered. Member loses its comparable requirement. failingMember now embeds a slice so the concrete type is non-comparable on purpose; under the old design that test would have panicked at registration. SharedFile carries a poolHandle field and passes its address through to Touch/Forget; the SharedFile→Pool lock-ordering invariant is unchanged. Bench on Apple M4 Pro (-benchtime=10000x): WorkingSetFitsInPool 10.47 ns/op, 0 allocs; Eviction_WorkingSetExceedsPool 59.28 ns/op, 2 allocs (entry + list.Element per insert) — the hot-path allocation profile is preserved. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
HashesWithPrefix iterated s.index without holding muI: a concurrent Reindex swap or PackfileWriter.Notify insert could race with the iteration, producing a Go-level data race on the map's internal structure. Snapshot the LazyIndex pointers under a brief muI.RLock and iterate the snapshot afterwards. The borrowed values stay alive for the duration of the loop via the local slice; the .idx and .rev FDs are reference-counted by sharedfile and governed by the fdpool LRU or grace timer, not by removal from s.index. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
storage: pack read optimisations and storage-wide LRU FD pool
The OFS-delta validation rejected only negative offsets larger than the entry's own pack offset, missing the boundary case where the encoded negative offset equals the entry's own offset. The resolved base offset is then zero, which points at the pack signature byte instead of a real object header. Canonical Git rejects this in packfile.c[1] with `base_offset <= 0 || base_offset >= delta_obj_offset`; go-git was using a strict `>` instead of `>=`. The same off-by-one appeared in the two OFS-delta varint readers under storage/filesystem/mmap (`(*ondemandObject).computeTypeAndSize` and `(*ondemandObject).resolveDelta`), so this commit sweeps both sites alongside the scanner. Add a scanner test covering an OFS-delta whose encoded negative offset equals its own pack offset, and a fuzz corpus seed of the same probe. [1]: https://github.com/git/git/blob/94f057755b/packfile.c#L1289-L1290 Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
The OFS-delta base-offset predicate landed in three sites in the fix for the off-by-one: the scanner header decode and the on-demand mmap object's two delta-resolution paths. Each carried a duplicated copy of the canonical-Git citation and the same boundary condition. Consolidate them behind one exported helper, ValidateOFSDeltaBase, so a future audit reads the canonical condition once. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
Parse() resolved all REF-deltas in one loop and OFS-deltas in another, processing the two delta types as independent sets. A REF-delta whose base is an in-pack OFS-delta therefore failed to resolve: the REF pass looked up its base hash before the OFS-delta had been applied, since the OFS-delta's hash is unknown at scan time. The lookup misclassified the in-pack base as a thin-pack external reference and the chain never produced a real parent. Canonical Git's `threaded_second_pass` in builtin/index-pack.c[1] walks the delta DAG from non-delta bases, advancing the REF and OFS children of every parent together rather than in two sweeps. Port that shape: index the pending delta lists by parent offset and parent hash, then visit every reachable delta depth-first from each non-delta base. REF-deltas not reached by the walk fall through to the existing thin-pack placeholder path; an OFS-delta whose recorded negative offset matches no in-pack object header is reported as malformed input. A child reachable through more than one parent (two non-deltas with identical content, or an OFS-delta resolving to the same hash as a non-delta in the pack) is processed only on the first visit. Cover the case with a parser test that constructs a pack with the shape `[base blob, OFS-delta(base=blob), REF-delta(base= OFS-delta-hash)]` and asserts all three objects resolve, plus a fuzz corpus seed of the same probe. [1]: https://github.com/git/git/blob/94f057755b/builtin/index-pack.c#L1103 Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
Parser.Parse mutates state owned by the Parser: the per-delta parent pointers it links during depth-first delta resolution, and the cache maps that index resolved objects by hash and offset. Neither is reset on entry. A second call against the same Parser would see the prior call's state — whether that call succeeded or failed mid-walk — and produce undefined results. The constraint is not new; the new `c.parent != nil` skip guard in resolveDeltas makes its consequences slightly more visible. Spell it out in the type's godoc so callers don't reach for Parse a second time and silently hit the leftover state. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
The encoder emits every delta in a single pack with the same kind: all OFS_DELTA by default, all REF_DELTA when useRefDeltas is set. The parser now resolves packs that mix the two kinds in one chain, because mixed-kind packs occur in the wild — repacks across servers with differing --delta-base-offset settings, thin-pack splices, and third-party tooling all produce them. Add a short comment at the write site so the next reader of writeDeltaHeader doesn't file the asymmetry between read- and write-side delta-kind support as a defect. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
`deltaEncodeSize` in `diff_delta.go` and `writeTestObjectHeader` in `internal_test.go` each hand-rolled the same 7-bit-chunk LEB128 loop as the writer counterpart to `util.DecodeLEB128`. Route both through `util.EncodeLEB128` / `EncodeLEB128ToWriter` so a single overflow-bounded implementation backs every site that emits an LEB128 length, mirroring the decoder pair already exported from the same package. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
plumbing: format/packfile, fix delta-base parsing bugs
Materialising an `FSObject` returned by `EncodedObject` previously fell back to a direct `o.fs.Open(packPath)` on every `Reader()` call, bypassing the per-pack file-descriptor pool introduced in `billy.Filesystem` that pays an mmap/munmap on each Open, the regression amplified to ~400x more pack opens over 500 reads on the basic fixture[1]. The cure threads the dotgit-cached `*packhandle.PackHandle` into the `*packfile.Packfile` via a new `WithPackHandle` option taking a `PackHandleResolver`. `objectFromHeader` captures the resolver in the `FSObject.acquireRandom` closure, so every read acquires a fresh cursor against the pool-owned `SharedFile`. The resolver re-runs on every `Reader` call, which lets cached FSObjects survive resets of dotgit's pack-handle store such as the one `NewObjectPack` triggers before a repack. The internal `*packhandle.PackHandle` stays internal — the public `packfile.PackHandle` interface returns `io.ReadSeekCloser` and `RandomReader`, and `DotGit.PackHandle` returns the interface through an unexported adapter so callers can't accidentally call `Close()` on a dotgit-owned handle. `Packfile.Close` releases only the scanner cursor; the resolver-owned handle outlives the Packfile. [1]: #2153 (comment) Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <[email protected]>
Cache the *config.Config result and thread it through internal
functions instead of repeatedly calling w.r.Config() for each file
during checkout, reset, status, and add operations.
Previously, each worktree operation (Reset, Status, Add) called
Config() once per file plus overhead, resulting in N+4 disk I/O
operations (open, read, parse .git/config) where N is the number of
files. Now Config() is called once per public entry point and passed
to private functions via a new cfg parameter.
Changes:
- Reset() calls Config once; threaded through resetWorktreeToTree(),
resetWorktree(), checkoutChange(), checkoutChangeRegularFile(),
and checkoutFile()
- StatusWithOptions(), doAdd(), AddGlob() call Config once; threaded
through diffStagingWithWorktree(), checkoutChange(), and related
functions
- autoAddModifiedAndDeleted() passes cfg through checkoutFile()
No public API changes. No new struct fields.
Benchmark results (benchstat):
bench-before.txt bench-after.txt
sec/op sec/op
CloneLargeRepo-12 6.093 ± 38% 1.946 ± 15% -68.06%
CloneDeepRepo-12 3.878 ± 72% 2.143 ± 7% -44.75%
B/op B/op
CloneLargeRepo-12 151.1Mi ± 0% 137.5Mi ± 0% -9.02%
CloneDeepRepo-12 163.7Mi ± 0% 150.1Mi ± 0% -8.35%
allocs/op allocs/op
CloneLargeRepo-12 1013.4k ± 0% 805.3k ± 0% -20.54%
CloneDeepRepo-12 1113.7k ± 0% 905.6k ± 0% -18.69%
Assisted-by: OpenCode with Claude Opus 4.6 <[email protected]>
Signed-off-by: Cedric BAIL <[email protected]>
git: worktree, hoist Config() calls out of per-file loops
storage: filesystem, pool FSObject reads via PackHandle
| // Printf prints the given message only if the target is enabled. | ||
| func (t Target) Printf(format string, args ...any) { | ||
| if t.Enabled() { | ||
| _ = logger.Output(2, fmt.Sprintf(format, args...)) |
| go-version: ${{ matrix.go-version }} | ||
|
|
||
| - name: Install Wasmtime | ||
| uses: bytecodealliance/actions/wasmtime/setup@v1 |
|
|
||
| func (b *Backend) logf(format string, v ...any) { | ||
| if b.ErrorLog != nil { | ||
| b.ErrorLog.Printf(format, v...) |
| pos += int64(consumed) | ||
|
|
||
| //nolint:staticcheck | ||
| base, err = o.scanner.GetByOffset(baseOffset) //nolint:ineffassign |
| // Printf prints the given message only if the target is enabled. | ||
| func (t Target) Printf(format string, args ...any) { | ||
| if t.Enabled() { | ||
| _ = logger.Output(2, fmt.Sprintf(format, args...)) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
1