plumbing: transport, add git-upload-archive support #1986

Merged
pjbgf merged 20 commits into go-git:main from aymanbagabas:transport-archive
Apr 24, 2026

Conversation

@aymanbagabas

Member

This PR adds support for the git-upload-archive service to the transport layer, allowing fetching of archives from remote git repositories over SSH, HTTP, and local file transports.

Background

The git-upload-archive protocol command is used by git archive --remote to fetch archives from remote repositories without cloning them. This is useful for CI/CD pipelines, release automation, and tools that need specific files from a repo.

Changes

Client-Side Types

Archiver interface - Sessions that implement this can perform archive operations:

type Archiver interface {
    Archive(ctx context.Context, req *ArchiveRequest) (io.ReadCloser, error)
}

ArchiveRequest - Client archive request:

  • Args []string: arguments sent as "argument <arg>" pkt-lines (format, prefix, tree-ish, paths)
  • Progress sideband.Progress: optional progress callback for status messages
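
A minimal sketch of how a caller might consume these types. The `Archiver` and `ArchiveRequest` definitions below are local stand-ins that mirror the description above; modelling `Progress` as an `io.Writer` is an assumption, and `fakeSession`, `archiveTo` and `demo` are hypothetical helpers, not part of this PR.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"fmt"
	"io"
	"strings"
)

// ArchiveRequest is a local stand-in for the request type described above.
type ArchiveRequest struct {
	Args     []string
	Progress io.Writer // assumption: the progress sink is modelled as an io.Writer
}

// Archiver is a local copy of the interface from the PR description.
type Archiver interface {
	Archive(ctx context.Context, req *ArchiveRequest) (io.ReadCloser, error)
}

// fakeSession is a hypothetical session that returns canned bytes.
type fakeSession struct{ payload string }

func (s fakeSession) Archive(_ context.Context, req *ArchiveRequest) (io.ReadCloser, error) {
	if req.Progress != nil {
		fmt.Fprintf(req.Progress, "archiving %s\n", strings.Join(req.Args, " "))
	}
	return io.NopCloser(strings.NewReader(s.payload)), nil
}

var errNoArchive = errors.New("session does not support archive")

// archiveTo checks whether sess implements Archiver and, if it does,
// streams the archive into w and returns the number of bytes copied.
func archiveTo(ctx context.Context, sess any, req *ArchiveRequest, w io.Writer) (int64, error) {
	a, ok := sess.(Archiver)
	if !ok {
		return 0, errNoArchive
	}
	rc, err := a.Archive(ctx, req)
	if err != nil {
		return 0, err
	}
	defer rc.Close()
	return io.Copy(w, rc)
}

// demo runs one request against the fake session and returns the archive
// bytes and the progress output as strings.
func demo() (string, string, error) {
	var out, prog bytes.Buffer
	req := &ArchiveRequest{Args: []string{"--format=tar", "HEAD"}, Progress: &prog}
	if _, err := archiveTo(context.Background(), fakeSession{payload: "tar-bytes"}, req, &out); err != nil {
		return "", "", err
	}
	return out.String(), prog.String(), nil
}

func main() {
	data, prog, err := demo()
	fmt.Printf("%q %q %v\n", data, prog, err)
}
```

The type assertion keeps callers working with sessions that do not implement archive support: they get a plain error instead of a panic.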

Server-Side Types

UploadArchiveRequest - Server archive request configuration:

  • AllowUnreachable bool: when true, disables the default security restrictions and lets clients request arbitrary SHA-1 expressions

UploadArchive function - Server-side handler for the git-upload-archive protocol.

Backend Support

SSH/TCP/Unix transports - Backend now dispatches to UploadArchive:

case transport.UploadArchiveService:
    return transport.UploadArchive(...)

HTTP - Left for future protocol-v2 work (HTTP archive requires command-based protocol)

Wire Protocol

Implements the git-upload-archive protocol:

  • Client sends one "argument <arg>" pkt-line per argument (format, prefix, tree-ish, etc.), followed by a flush packet
  • Server replies with "ACK" on success or "NACK <reason>" on failure
  • Archive data streamed via sideband (channel 1 = data, channel 2 = progress)
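
The framing above can be sketched as follows. This is an illustration of the pkt-line and sideband layout only, not the code in this PR; `pktLine`, `encodeArchiveRequest` and `splitSideband` are hypothetical names, and the trailing newline on each argument line follows what canonical git's client sends.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// flushPkt is the special zero-length pkt-line that ends a section.
const flushPkt = "0000"

// pktLine frames payload as a pkt-line: four hex digits holding the total
// length (payload plus the 4-byte length header), followed by the payload.
func pktLine(payload string) string {
	return fmt.Sprintf("%04x%s", len(payload)+4, payload)
}

// encodeArchiveRequest writes one "argument <arg>\n" pkt-line per argument
// and terminates the request with a flush packet.
func encodeArchiveRequest(args []string) string {
	var b strings.Builder
	for _, a := range args {
		b.WriteString(pktLine("argument " + a + "\n"))
	}
	b.WriteString(flushPkt)
	return b.String()
}

// splitSideband splits a sideband pkt-line payload into its channel byte
// (1 = data, 2 = progress) and the remaining bytes.
func splitSideband(payload []byte) (byte, []byte, error) {
	if len(payload) == 0 {
		return 0, nil, errors.New("empty sideband packet")
	}
	return payload[0], payload[1:], nil
}

func main() {
	fmt.Printf("%q\n", encodeArchiveRequest([]string{"--format=tar", "HEAD"}))

	ch, data, err := splitSideband([]byte{2, 'h', 'i'})
	fmt.Println(ch, string(data), err)
}
```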

Supported Formats

Formats and options parsed from args:

  • tar, tar.gz, tgz, zip
  • --prefix=path/ adds prefix to archive entries
  • --list lists the available archive formats
  • Tree-ish (commit, tag, branch) specifies what to archive
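
A rough sketch of how such arguments could be parsed on the server side. `archiveArgs` and `parseArchiveArgs` are hypothetical names, the parser only knows the options listed above, and the `tar` default is an assumption carried over from `git archive`; the real implementation may accept more flags.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// archiveArgs holds the pieces extracted from the argument list.
type archiveArgs struct {
	Format  string
	Prefix  string
	List    bool
	TreeIsh string
	Paths   []string
}

// parseArchiveArgs splits args into options and positionals. The first
// positional is the tree-ish; any remaining ones are path filters.
func parseArchiveArgs(args []string) (archiveArgs, error) {
	out := archiveArgs{Format: "tar"} // assumption: tar is the default format
	var positional []string
	for _, a := range args {
		switch {
		case a == "--list":
			out.List = true
		case strings.HasPrefix(a, "--format="):
			out.Format = strings.TrimPrefix(a, "--format=")
		case strings.HasPrefix(a, "--prefix="):
			out.Prefix = strings.TrimPrefix(a, "--prefix=")
		case strings.HasPrefix(a, "-"):
			return out, fmt.Errorf("unsupported option %q", a)
		default:
			positional = append(positional, a)
		}
	}
	switch out.Format {
	case "tar", "tar.gz", "tgz", "zip":
	default:
		return out, fmt.Errorf("unsupported format %q", out.Format)
	}
	if out.List {
		return out, nil // --list needs no tree-ish
	}
	if len(positional) == 0 {
		return out, errors.New("missing tree-ish")
	}
	out.TreeIsh = positional[0]
	out.Paths = positional[1:]
	return out, nil
}

func main() {
	parsed, err := parseArchiveArgs([]string{"--format=tar.gz", "--prefix=go-git/", "v5.12.0", "README.md"})
	fmt.Printf("%+v %v\n", parsed, err)
}
```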

Testing

  • TestArchive_Tar, TestArchive_TarGz, TestArchive_Zip (file transport)
  • TestGitTransport_Archive (git daemon transport)
  • TestSSHTransport_Archive (SSH transport)

All tests verify archive contents against git's output.

Security

The uploadArchive.allowUnreachable config option controls whether arbitrary SHA-1 expressions are allowed. By default, only ref targets (e.g., v1.0, main) and ref:path sub-tree syntax are allowed.
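
The default restriction can be illustrated with a small check. This is a simplified sketch, not the PR's logic: `treeIshAllowed` is a hypothetical helper, and `refs` is assumed to be a set of short ref names, whereas a real server resolves names through its ref lookup rules.

```go
package main

import (
	"fmt"
	"strings"
)

// treeIshAllowed reports whether treeIsh may be archived. With
// allowUnreachable set, everything is accepted. Otherwise the part before
// an optional ":path" suffix must name a known ref.
func treeIshAllowed(treeIsh string, refs map[string]bool, allowUnreachable bool) bool {
	if allowUnreachable {
		return true
	}
	ref, _, _ := strings.Cut(treeIsh, ":")
	return refs[ref]
}

func main() {
	refs := map[string]bool{"main": true, "v1.0": true}
	for _, t := range []string{"v1.0", "v1.0:Documentation", "main~1", "deadbeef"} {
		fmt.Printf("%-20s allowed=%v\n", t, treeIshAllowed(t, refs, false))
	}
}
```

Expressions such as `main~1` or a bare object ID are rejected here because they do not name a ref directly, which matches the intent of the default described above.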

Follow-up Work

A future PR will add porcelain-level support:

  • git.Archive() convenience method
  • ArchiveOptions with parametrized fields:
    • Format: tar, tar.gz, zip
    • Prefix: prefix path for archive entries
    • Remote: tree-ish reference to archive
    • Paths: optional path filter
    • Additional git archive --remote flags
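
To make the planned mapping concrete, here is a hypothetical sketch of how such options could be turned into the `Args` slice used by the transport layer. Nothing below exists yet: the `ArchiveOptions` struct and its `Args` method are illustrative guesses based only on the field list above.

```go
package main

import (
	"errors"
	"fmt"
)

// ArchiveOptions is a hypothetical shape for the follow-up porcelain API.
type ArchiveOptions struct {
	Format string   // tar, tar.gz, zip
	Prefix string   // prefix path for archive entries
	Remote string   // tree-ish reference to archive
	Paths  []string // optional path filter
}

// Args converts the options into git archive style arguments:
// flags first, then the tree-ish, then any path filters.
func (o ArchiveOptions) Args() ([]string, error) {
	if o.Remote == "" {
		return nil, errors.New("a tree-ish is required")
	}
	var args []string
	if o.Format != "" {
		args = append(args, "--format="+o.Format)
	}
	if o.Prefix != "" {
		args = append(args, "--prefix="+o.Prefix)
	}
	args = append(args, o.Remote)
	return append(args, o.Paths...), nil
}

func main() {
	opts := ArchiveOptions{Format: "zip", Prefix: "go-git/", Remote: "main", Paths: []string{"plumbing"}}
	args, err := opts.Args()
	fmt.Println(args, err)
}
```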

Usage Example

sess, err := client.Dial("git@github.com:go-git/go-git.git")
if err != nil {
    return err
}
req := &transport.ArchiveRequest{
    // Args mirror the git archive --remote command line.
    Args: []string{"--format=tar.gz", "--prefix=go-git/", "v5.12.0"},
}
rc, err := sess.Archive(ctx, req)
if err != nil {
    return err
}
defer rc.Close()
// Write rc to file or pipe to tar/gzip

Supersedes: #1985

@aymanbagabas aymanbagabas requested review from Copilot and pjbgf April 13, 2026 15:01

Copilot AI left a comment

Contributor

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

Comment thread plumbing/transport/upload_archive.go Outdated
Comment on lines +27 to +36
// AllowUnreachable when true disables the default security restrictions
// and allows clients to use arbitrary SHA-1 expressions. By default,
// only direct ref targets (e.g. v1.0, main) and ref:path sub-tree
// syntax (e.g. v1.0:Documentation) are allowed.
//
// This serves as the default value. When the repository-level config
// uploadArchive.allowUnreachable is explicitly set, it always overrides
// this value (both true → false and false → true).
// See https://git-scm.com/docs/git-upload-archive
AllowUnreachable bool

Member Author

This is usually derived from global/system config then overridden by repository config. Alternatively, we'd use config.LoadConfig.

cc/ @pjbgf

Member

Let's start with the current cfg, err := st.Config(), which is already in place, as that's the safest approach to begin with.

Member Author

Cool, I will drop this option and add a comment to investigate this later and derive it from global/system config before repository config.

@pjbgf pjbgf left a comment

Member

@aymanbagabas thanks for looking into this, here's some feedback:

Comment thread plumbing/transport/upload_archive.go Outdated
return (mode | 0o777) &^ defaultUmask
}

func writeTarArchive(st storage.Storer, w io.Writer, tree *object.Tree, prefix string, pathFilter []string, modTime time.Time) error {

Member

I think we are missing this:

Additionally the commit ID is stored in a global extended pax header if the tar format is used; it can be extracted using git get-tar-commit-id. In ZIP files it is stored as a file comment.
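
For reference, a minimal sketch of the tar case using only Go's standard `archive/tar` package. The helper names are hypothetical and this is not the code that landed in the PR; the `pax_global_header` entry name and the `comment` record key follow what canonical git writes.

```go
package main

import (
	"archive/tar"
	"bytes"
	"errors"
	"fmt"
	"io"
)

// writeCommitHeader emits a pax global extended header whose "comment"
// record carries the commit ID.
func writeCommitHeader(tw *tar.Writer, commitID string) error {
	return tw.WriteHeader(&tar.Header{
		Typeflag:   tar.TypeXGlobalHeader,
		Name:       "pax_global_header",
		PAXRecords: map[string]string{"comment": commitID},
	})
}

// readCommitID returns the "comment" record from the first header of a
// tar stream, provided that header is a pax global header.
func readCommitID(r io.Reader) (string, error) {
	hdr, err := tar.NewReader(r).Next()
	if err != nil {
		return "", err
	}
	if hdr.Typeflag != tar.TypeXGlobalHeader {
		return "", errors.New("archive does not start with a global header")
	}
	return hdr.PAXRecords["comment"], nil
}

// roundTrip writes a tiny archive with the commit ID and reads it back.
func roundTrip(commitID string) (string, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	if err := writeCommitHeader(tw, commitID); err != nil {
		return "", err
	}
	body := []byte("hello\n")
	hdr := &tar.Header{Name: "README", Typeflag: tar.TypeReg, Mode: 0o664, Size: int64(len(body))}
	if err := tw.WriteHeader(hdr); err != nil {
		return "", err
	}
	if _, err := tw.Write(body); err != nil {
		return "", err
	}
	if err := tw.Close(); err != nil {
		return "", err
	}
	return readCommitID(&buf)
}

func main() {
	id, err := roundTrip("0123456789abcdef0123456789abcdef01234567")
	fmt.Println(id, err)
}
```

For ZIP output, the standard `archive/zip` writer offers `SetComment`, which could carry the same ID as the archive comment.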

@aymanbagabas aymanbagabas requested a review from pjbgf April 15, 2026 03:29
@aymanbagabas aymanbagabas force-pushed the transport-archive branch 2 times, most recently from 5ef59da to cebfcdd Compare April 16, 2026 00:08
Comment thread internal/archive/archive.go Outdated
aymanbagabas and others added 19 commits April 24, 2026 15:28
Add git-upload-archive service support to transport layer. This allows
fetching archives (tarballs) from remote git repositories over various
transports (SSH, HTTP, and local file).

Expose new Archivable interface via transport.Session and implement:
- ArchiveRequest type for archive configuration (format, compression, prefix)
- UploadArchiveRequest/UploadArchive types for wire protocol
- Support for archive filters (tree and commit filters)
- Backend dispatch for SSH/TCP/Unix transports
- Tests for file, git, and SSH transports

HTTP backend support left for future protocol-v2 work.

The previous implementation was losing file permission details by converting
all files to either 0o644 or 0o755, discarding the actual git mode bits
including group-writable.

Added applyUmask() functions that extract Unix permission bits from the
git mode and apply the default umask (0o002) following canonical git's
tar.umask behavior. This preserves the original file permissions from
the tree being archived.

Also added tests to verify file permissions are correctly preserved in
both tar and zip archives.
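
As an illustration of the umask rule, the sketch below follows what canonical git's tar writer does for regular files as far as I recall: widen the mode to 0666 (or 0777 when the owner-executable bit is set), then clear the umask bits. `tarPerm` is a hypothetical name, and the PR's own `applyUmask()` may be implemented differently.

```go
package main

import "fmt"

// defaultUmask mirrors the default tar.umask value mentioned above.
const defaultUmask = 0o002

// tarPerm derives the permission bits written to the archive from a git
// file mode such as 0o100644 or 0o100755.
func tarPerm(mode uint32) uint32 {
	base := uint32(0o666)
	if mode&0o100 != 0 { // owner-executable bit is set
		base = 0o777
	}
	return ((mode & 0o777) | base) &^ defaultUmask
}

func main() {
	for _, m := range []uint32{0o100644, 0o100755} {
		fmt.Printf("%o -> %o\n", m, tarPerm(m))
	}
}
```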
@pjbgf pjbgf force-pushed the transport-archive branch from 6fa62ef to 4928734 Compare April 24, 2026 14:28

@pjbgf pjbgf left a comment

Member

@aymanbagabas thanks for the great effort putting this together. 🙇🙇🙇

@pjbgf pjbgf merged commit 5b47ae8 into go-git:main Apr 24, 2026
16 checks passed