Skip to content

plumbing: format/packfile, fix delta-base parsing bugs#2163

Merged
pjbgf merged 6 commits into
go-git:mainfrom
hiddeco:fix/packfile-delta-base
May 29, 2026
Merged

plumbing: format/packfile, fix delta-base parsing bugs#2163
pjbgf merged 6 commits into
go-git:mainfrom
hiddeco:fix/packfile-delta-base

Conversation

@hiddeco

@hiddeco hiddeco commented May 28, 2026

Copy link
Copy Markdown
Member

No description provided.

@hiddeco hiddeco force-pushed the fix/packfile-delta-base branch from 76aac6e to b8e766d Compare May 28, 2026 14:38
Comment thread plumbing/format/packfile/parser_test.go Outdated
hiddeco added 5 commits May 29, 2026 14:28
The OFS-delta validation rejected only negative offsets larger
than the entry's own pack offset, missing the boundary case where
the encoded negative offset equals the entry's own offset. The
resolved base offset is then zero, which points at the pack
signature byte instead of a real object header.

Canonical Git rejects this in packfile.c[1] with
`base_offset <= 0 || base_offset >= delta_obj_offset`; go-git was
using a strict `>` instead of `>=`. The same off-by-one appeared
in the two OFS-delta varint readers under storage/filesystem/mmap
(`(*ondemandObject).computeTypeAndSize` and `(*ondemandObject).resolveDelta`),
so this commit sweeps both sites alongside the scanner.

Add a scanner test covering an OFS-delta whose encoded negative
offset equals its own pack offset, and a fuzz corpus seed of the
same probe.

[1]: https://github.com/git/git/blob/94f057755b/packfile.c#L1289-L1290

Assisted-by: Claude Opus 4.7
Signed-off-by: Hidde Beydals <[email protected]>
The OFS-delta base-offset predicate landed in three sites in the fix
for the off-by-one: the scanner header decode and the on-demand mmap
object's two delta-resolution paths. Each carried a duplicated copy
of the canonical-Git citation and the same boundary condition.
Consolidate them behind one exported helper, ValidateOFSDeltaBase,
so a future audit reads the canonical condition once.

Assisted-by: Claude Opus 4.7
Signed-off-by: Hidde Beydals <[email protected]>
Parse() resolved all REF-deltas in one loop and OFS-deltas in
another, processing the two delta types as independent sets. A
REF-delta whose base is an in-pack OFS-delta therefore failed to
resolve: the REF pass looked up its base hash before the OFS-delta
had been applied, since the OFS-delta's hash is unknown at scan
time. The lookup misclassified the in-pack base as a thin-pack
external reference and the chain never produced a real parent.

Canonical Git's `threaded_second_pass` in builtin/index-pack.c[1]
walks the delta DAG from non-delta bases, advancing the REF and
OFS children of every parent together rather than in two sweeps.
Port that shape: index the pending delta lists by parent offset
and parent hash, then visit every reachable delta depth-first
from each non-delta base. REF-deltas not reached by the walk
fall through to the existing thin-pack placeholder path; an
OFS-delta whose recorded negative offset matches no in-pack
object header is reported as malformed input.

A child reachable through more than one parent (two non-deltas
with identical content, or an OFS-delta resolving to the same
hash as a non-delta in the pack) is processed only on the first
visit.

Cover the case with a parser test that constructs a pack with
the shape `[base blob, OFS-delta(base=blob), REF-delta(base=
OFS-delta-hash)]` and asserts all three objects resolve, plus a
fuzz corpus seed of the same probe.

[1]: https://github.com/git/git/blob/94f057755b/builtin/index-pack.c#L1103

Assisted-by: Claude Opus 4.7
Signed-off-by: Hidde Beydals <[email protected]>
Parser.Parse mutates state owned by the Parser: the per-delta parent
pointers it links during depth-first delta resolution, and the cache
maps that index resolved objects by hash and offset. Neither is reset
on entry. A second call against the same Parser would see the prior
call's state — whether that call succeeded or failed mid-walk — and
produce undefined results.

The constraint is not new; the new `c.parent != nil` skip guard in
resolveDeltas makes its consequences slightly more visible. Spell it
out in the type's godoc so callers don't reach for Parse a second
time and silently hit the leftover state.

Assisted-by: Claude Opus 4.7
Signed-off-by: Hidde Beydals <[email protected]>
The encoder emits every delta in a single pack with the same kind:
all OFS_DELTA by default, all REF_DELTA when useRefDeltas is set.
The parser now resolves packs that mix the two kinds in one chain,
because mixed-kind packs occur in the wild — repacks across servers
with differing --delta-base-offset settings, thin-pack splices, and
third-party tooling all produce them. Add a short comment at the
write site so the next reader of writeDeltaHeader doesn't file the
asymmetry between read- and write-side delta-kind support as a
defect.

Assisted-by: Claude Opus 4.7
Signed-off-by: Hidde Beydals <[email protected]>
@hiddeco hiddeco force-pushed the fix/packfile-delta-base branch from b8e766d to 77f41a2 Compare May 29, 2026 12:29
`deltaEncodeSize` in `diff_delta.go` and `writeTestObjectHeader` in
`internal_test.go` each hand-rolled the same 7-bit-chunk LEB128
loop as the writer counterpart to `util.DecodeLEB128`. Route both
through `util.EncodeLEB128` / `EncodeLEB128ToWriter` so a single
overflow-bounded implementation backs every site that emits an
LEB128 length, mirroring the decoder pair already exported from
the same package.

Assisted-by: Claude Opus 4.7
Signed-off-by: Hidde Beydals <[email protected]>
@hiddeco hiddeco force-pushed the fix/packfile-delta-base branch from 77f41a2 to 077d56f Compare May 29, 2026 13:05

@pjbgf pjbgf left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hiddeco thanks for working on this. 🙇

@pjbgf pjbgf merged commit f3634c6 into go-git:main May 29, 2026
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants