git: add repository and remote archive support #2013

Merged
pjbgf merged 1 commit into go-git:main from aymanbagabas:archive-git
May 13, 2026

@aymanbagabas

Member

This implements the high-level porcelain side of git-upload-archive and the APIs equivalent to git archive. It adds repo.Archive(opts) and git.ArchiveRemote(url, opts) to create archives of local and remote repositories.

This continues the work introduced in #1986.

Comment thread archive_test.go
"github.com/stretchr/testify/require"
)

func TestArchiveRemote(t *testing.T) {

Member Author

We already test the remote transport logic in our git transport tests. This test ensures that the high-level API validates its arguments.

@pjbgf pjbgf left a comment

Member

Overall, LGTM. Let's just tighten the validation on ArchiveOptions please as per comment below.

Comment thread repository.go
}

// Validate validates the ArchiveOptions.
func (o *ArchiveOptions) Validate() error {

Member

Let's add stricter rules here please, for example:

  • Format must be empty or one of tar, tar.gz, tgz, zip.
  • Prefix must be empty or a valid path component, not containing "..".
  • Treeish must be a valid SHA-1/SHA-256 hex string, or a valid heads name.
  • Paths must be empty, or valid relative paths not containing "..".
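
As an illustration only, the Format and Treeish rules proposed above could be sketched with the standard library as follows. The names checkFormat, looksLikeObjectID, and supportedFormats are hypothetical and are not taken from the pull request. The sketch reflects the rules as proposed in this comment, which is not necessarily what was merged, and it covers only the hex half of the Treeish rule.

```go
package main

import (
	"fmt"
	"regexp"
)

// supportedFormats lists the formats named in the review comment.
// The empty string stands for "no format given". Hypothetical name.
var supportedFormats = map[string]bool{
	"": true, "tar": true, "tar.gz": true, "tgz": true, "zip": true,
}

// hexObjectID matches a full-length SHA-1 (40 hex digits) or
// SHA-256 (64 hex digits) object ID.
var hexObjectID = regexp.MustCompile(`^(?:[0-9a-fA-F]{40}|[0-9a-fA-F]{64})$`)

// checkFormat is a hypothetical helper that rejects any format
// outside the proposed allow-list.
func checkFormat(format string) error {
	if !supportedFormats[format] {
		return fmt.Errorf("unsupported archive format %q", format)
	}
	return nil
}

// looksLikeObjectID reports whether s is a full-length hex object ID.
// Branch names such as "master" need a separate ref-name check.
func looksLikeObjectID(s string) bool {
	return hexObjectID.MatchString(s)
}

func main() {
	fmt.Println(checkFormat("zip"))
	fmt.Println(checkFormat("tar.xz"))
	fmt.Println(looksLikeObjectID("master"))
}
```

An allow-list like this suits a local implementation whose set of formats is known in advance. It would be too strict for a remote server that supports additional formats.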

@aymanbagabas aymanbagabas May 6, 2026

Member Author

Hey @pjbgf,

I made some changes to validate the treeish during git.Archive, before calling WriteArchive. The remaining rules are intentional, each for a different reason.

  • Format is validated in git.Archive rather than in Validate, because a remote server reached through git.ArchiveRemote can support a different set of archive formats than the local implementation.
  • The Prefix here is the path component prepended to the files being archived. Canonical git allows prefixes containing ../ and leaves unsafe, non-normalized prefixes to be the responsibility of the extractor. Many tar and unzip tools refuse to extract to ../ by default.
  • Since we moved tree resolution to before the WriteArchive call, we check that the treeish resolves to a valid tree, given either a valid SHA-1/SHA-256 hash or a valid head name.
  • Paths are just filters. If a path doesn't exist, we already return "pathspec did not match any files", as canonical git does. This check cannot happen in Validate without resolving the tree and starting to write the archive.

Member Author

@pjbgf Prefix now rejects paths with .. components and absolute paths.
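
For readers following along, a check of this kind might look roughly like the sketch below. It is an assumption about the shape of such a rule, not the code from this pull request; validatePrefix and errUnsafePrefix are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"path"
	"strings"
)

// errUnsafePrefix is a hypothetical sentinel error for rejected prefixes.
var errUnsafePrefix = errors.New("archive prefix must be relative and must not contain \"..\" components")

// validatePrefix is a hypothetical helper. It accepts an empty prefix,
// rejects absolute paths, and rejects any ".." path component.
func validatePrefix(prefix string) error {
	if prefix == "" {
		return nil
	}
	// Archive entry names use forward slashes, so the slash-based
	// path package is used instead of path/filepath.
	if path.IsAbs(prefix) {
		return errUnsafePrefix
	}
	for _, part := range strings.Split(prefix, "/") {
		if part == ".." {
			return errUnsafePrefix
		}
	}
	return nil
}

func main() {
	for _, p := range []string{"", "project-1.0/", "../escape/", "/abs/", "a/../b/"} {
		fmt.Printf("%q -> %v\n", p, validatePrefix(p))
	}
}
```

Comparing whole components rather than searching for the substring ".." keeps names such as "a..b/" valid.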

Copilot AI review requested due to automatic review settings May 6, 2026 14:19

Copilot AI left a comment

Contributor

Pull request overview

Adds porcelain-level archive creation APIs to the git package, enabling git archive-like behavior for local repositories and git archive --remote-like behavior via git-upload-archive, building on the existing transport-layer support.

Changes:

  • Introduces Repository.Archive(opts) / Repository.ArchiveContext(ctx, opts) for streaming local repository archives.
  • Adds ArchiveRemote(url, opts) / ArchiveRemoteContext(ctx, url, opts) to request archives from remote repositories using git-upload-archive.
  • Refactors internal archive generation to resolve tree-ish metadata separately and expands tests around formats/prefix/path filtering.
Summary per file:

| File | Description |
| --- | --- |
| repository.go | Adds ArchiveOptions and local repository archive streaming APIs. |
| archive.go | Adds remote archive entrypoints that handshake and invoke git-upload-archive. |
| internal/archive/archive.go | Refactors archive generation API to accept a resolved tree + commit metadata; adds ErrUnsupportedFormat. |
| plumbing/transport/upload_archive.go | Updates server-side UploadArchive to resolve tree-ish before writing the archive. |
| repository_test.go | Adds coverage for local repository archive generation (formats/prefix/path filtering/cancellation). |
| archive_test.go | Adds unit tests for ArchiveOptions.Validate and basic ArchiveRemote argument/validation behavior. |

Copilot's findings

Comments suppressed due to low confidence (1)

archive_test.go:122

  • These parallel subtests capture the range variable tt. Since tt is reused across iterations, t.Parallel() can lead to subtests using the wrong url/options/wantErr values. Rebind tt := tt (or switch to an index loop) before t.Run.
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			t.Parallel()
			_, err := ArchiveRemote(tt.url, tt.opts)
  • Files reviewed: 6/6 changed files
  • Comments generated: 6
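
The rebinding suggested in the suppressed comment can be shown with a small standard-library sketch. It uses goroutines in place of parallel subtests, and runAll is a hypothetical name. Note that since Go 1.22 the range variable is scoped per iteration, so the rebinding only matters for modules whose go directive targets an older version.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// runAll starts one goroutine per name, much as parallel subtests run
// concurrently, and returns the names the goroutines observed, sorted.
func runAll(names []string) []string {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		seen []string
	)
	for _, name := range names {
		name := name // rebind: needed before Go 1.22, harmless afterwards
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			seen = append(seen, name)
		}()
	}
	wg.Wait()
	sort.Strings(seen)
	return seen
}

func main() {
	fmt.Println(runAll([]string{"zip", "tar", "tgz"}))
}
```

With the rebinding in place, each closure sees its own copy of the value regardless of the Go version in use.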

Comment thread repository_test.go
Comment thread archive_test.go
Comment thread archive.go
Comment on lines +58 to +61
return arch.Archive(ctx, &transport.ArchiveRequest{
Args: args,
Progress: o.Progress,
})
Comment thread archive.go
Comment thread archive.go
Comment thread repository.go Outdated

Copilot AI left a comment

Contributor

Copilot's findings

  • Files reviewed: 6/6 changed files
  • Comments generated: 7

Comment thread archive.go
Comment thread archive.go
Comment thread repository.go Outdated
Comment thread repository.go Outdated
Comment thread repository_test.go Outdated
Comment thread archive_test.go
Comment thread internal/archive/archive.go

Copilot AI left a comment

Contributor

Copilot's findings

  • Files reviewed: 7/7 changed files
  • Comments generated: 7

Comment thread repository.go Outdated
Comment thread repository.go
Comment thread archive.go
Comment thread archive.go
Comment thread plumbing/transport/pack_stream.go Outdated
Comment thread archive_test.go
Comment on lines +117 to +121
name: "remote allows custom format",
url: "file:///tmp/repo.git",
opts: &ArchiveOptions{Format: "tar.xz", Treeish: "master"},
wantErr: "",
},

Member Author

Here, file is the transport: our client resolves the URL to our default file transport. Custom transports can be added via ClientOptions. If a transport is unsupported or invalid, we return an error while constructing the client.
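
The idea of resolving a transport from the URL scheme can be illustrated with a minimal sketch. The transports table and resolveTransport are hypothetical and do not reflect go-git's actual client code; they only show the general pattern of failing early when a scheme is unknown.

```go
package main

import (
	"fmt"
	"net/url"
)

// transports is a hypothetical scheme-to-transport table.
var transports = map[string]string{
	"file":  "local file transport",
	"ssh":   "ssh transport",
	"https": "http transport",
}

// resolveTransport is a hypothetical helper that picks a transport by
// URL scheme and returns an error when the scheme is not registered.
func resolveTransport(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", fmt.Errorf("parse url: %w", err)
	}
	name, ok := transports[u.Scheme]
	if !ok {
		return "", fmt.Errorf("unsupported transport %q", u.Scheme)
	}
	return name, nil
}

func main() {
	fmt.Println(resolveTransport("file:///tmp/repo.git"))
	fmt.Println(resolveTransport("gopher://example.com/repo.git"))
}
```

In this sketch the archive format never takes part in transport selection, which is consistent with a custom format such as tar.xz passing client-side checks for a remote.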

Comment thread archive_test.go Outdated
@aymanbagabas aymanbagabas force-pushed the archive-git branch 2 times, most recently from ff3e445 to c1dafd7 May 9, 2026 18:34
@aymanbagabas aymanbagabas requested a review from Copilot May 9, 2026 18:34

Copilot AI left a comment

Contributor

Copilot's findings

Comments suppressed due to low confidence (2)

archive_test.go:136

  • The subtests run with t.Parallel() but capture the range variable tt from the outer loop. Rebind tt := tt inside the loop before t.Run to avoid all subtests using the last case / racing.
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			t.Parallel()
			_, err := ArchiveRemote(tt.url, tt.opts)
			if tt.wantErr == "" {

archive_test.go:213

  • The subtests run with t.Parallel() but capture the range variable tt from the outer loop. Rebind tt := tt inside the loop before t.Run (this also ensures each subtest uses the intended tt.opts pointer).
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			t.Parallel()

			rc, err := ArchiveRemote(repoURL, tt.opts)
			require.NoError(t, err)
  • Files reviewed: 7/7 changed files
  • Comments generated: 5

Comment thread repository.go
Comment thread repository_test.go
Comment thread repository_test.go
Comment thread archive_test.go
Comment thread archive_test.go
Comment on lines +151 to +156
base := t.TempDir()
repoFS := test.PrepareRepository(t, fixtures.Basic().One(), base, "basic.git")
repoPath, err := filepath.Abs(repoFS.Root())
require.NoError(t, err)
repoURL := "file://" + repoPath

Comment thread archive.go

@pjbgf pjbgf left a comment

Member

Tiny nit on defer sess.Close(). Apart from that, please rebase it.

@pjbgf pjbgf left a comment

Member

@aymanbagabas thanks for working on this. 🙇

@pjbgf pjbgf merged commit 5f8076d into go-git:main May 13, 2026
17 checks passed
3 participants