
Remote package repositories #50650

Merged
haampie merged 1 commit into develop from hs/feature/git-package-repos on Jun 3, 2025

@haampie
Member

@haampie haampie commented May 27, 2025

Adds support for Spack package repositories from external git repos.

repos:
  my_repo:
    git: https://example.com/example/example.git
    destination: ~/example  # optional
    paths:  # optional
    - subdir/spack_repo/example/x
    - subdir/spack_repo/example/y

If destination is not configured, Spack clones the git repo to ~/.spack/git_repos/{hash(repository)}.
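
As a rough illustration of how a hash-based default clone directory could be derived, here is a minimal sketch. The function name `default_destination` and the choice of truncated SHA-256 are assumptions made for this example; the description above only says the directory name is some hash of the repository.

```python
import hashlib
from pathlib import Path


def default_destination(
    git_url: str, root: Path = Path.home() / ".spack" / "git_repos"
) -> Path:
    """Return a stable, per-URL clone directory under ``root``.

    Truncated SHA-256 is an assumption of this sketch; the hash Spack
    actually uses is not specified in the description above.
    """
    digest = hashlib.sha256(git_url.encode("utf-8")).hexdigest()[:32]
    return root / digest
```

The point of hashing the URL is that the same remote always maps to the same directory, while two different remotes do not collide on a shared name.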


Package repositories can put a file spack-repo-index.yaml in their root with relative paths to roots of package repositories (i.e. directories that contain repo.yaml):

repo_index:
  paths:
  - subdir/spack_repo/example/x
  - subdir/spack_repo/example/y

so users don't have to put that under paths in config. The spack-repo-index.yaml is simply a list of paths under repo_index. The idea is to avoid duplicating data such as "namespace", which is already specified in <git repo>/<repo path>/repo.yaml, so that there are never two sources of truth that can go out of sync.

Further, paths in user config take precedence, which allows users to enable specific package repositories when the git monorepo provides more than one.
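
The precedence rule can be sketched as follows. This is an illustration only: `resolve_repo_paths` is a hypothetical helper, and it takes the already-parsed contents of spack-repo-index.yaml as a plain mapping rather than reading the file itself.

```python
from typing import Any, List, Mapping, Optional


def resolve_repo_paths(
    config_paths: Optional[List[str]],
    index_data: Optional[Mapping[str, Any]],
) -> List[str]:
    """Pick package repository roots for a cloned git repo.

    ``config_paths`` is the user's ``repos:<name>:paths`` (or None), and
    ``index_data`` is the parsed spack-repo-index.yaml (or None if absent).
    """
    # Paths from user config win, so users can enable a subset of a monorepo.
    if config_paths:
        return list(config_paths)
    # Otherwise fall back to what the git repo advertises in its index file.
    if index_data is not None:
        paths = index_data.get("repo_index", {}).get("paths", [])
        if paths:
            return list(paths)
    # No recursive scan for repo.yaml: without either source, give up.
    raise ValueError("no package repository paths configured or indexed")
```

Raising an error in the last case is a choice made for this sketch; it mirrors the statement that paths must be given explicitly when no index file exists.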


Remote package repositories are cloned/initialized in:

  • spack.main
  • spack repo add

This is process safe due to the lock in $SPACK_USER_CACHE_PATH/package-repository.lock; only one process can clone at a time.
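
To illustrate the clone-under-lock pattern, here is a minimal sketch. It uses `fcntl.flock` from the Python standard library (POSIX only) as a stand-in; Spack has its own lock implementation, and the helper name `clone_once` and the `clone` callback are hypothetical.

```python
import fcntl
from pathlib import Path
from typing import Callable


def clone_once(lock_file: Path, destination: Path, clone: Callable[[Path], None]) -> bool:
    """Run ``clone(destination)`` under an exclusive lock if it is missing.

    Returns True if this call performed the clone, False if the
    destination already existed once the lock was acquired.
    """
    lock_file.parent.mkdir(parents=True, exist_ok=True)
    with open(lock_file, "a") as handle:
        # Blocks until any other process holding the lock releases it.
        fcntl.flock(handle, fcntl.LOCK_EX)
        try:
            # Re-check after acquiring: another process may have cloned already.
            if destination.exists():
                return False
            clone(destination)
            return True
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

Checking for the destination only after the lock is held is what prevents two processes from both deciding to clone.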


The spack repo add command has a few new flags and a new positional arg:

spack repo add [-h] [--name NAME] [--repo-path REPO_PATH] [--scope ...] path_or_repo [destination]

The signature is similar to the familiar git clone <repository> <destination>.

The path_or_repo argument is detected as a remote git repo if it contains a : not preceded by a /, which is what git does as well. If in the future we would support package repositories other than local file paths and remote git repos, we can resolve ambiguities with a new flag [--git | --path | --<other-type>], but this is currently unnecessary.
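
The detection rule can be written down in a few lines. This is a sketch of the rule as stated above, not the code from this PR; the function name `looks_like_git_remote` is hypothetical, and "preceded by a /" is read as "a / occurs somewhere before the first :".

```python
def looks_like_git_remote(path_or_repo: str) -> bool:
    """Return True if the argument contains a ':' with no '/' before it."""
    colon = path_or_repo.find(":")
    if colon == -1:
        return False  # no colon at all: treat as a local path
    slash = path_or_repo.find("/")
    # Remote if there is no slash, or the first colon comes before the first slash.
    return slash == -1 or colon < slash
```

Under this rule both `https://example.com/example/example.git` and the scp-like `git@example.com:example/example.git` count as remote, while a local path such as `./my:repo` does not.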

The positional destination argument allows users to pick their own clone path, and only applies in case of git repos.

The flag --repo-path corresponds to repos:<name>:paths and can be repeated to select specific package repositories in a git monorepo. It is required if the git repo does not provide a spack-repo-index.yaml file in its root. Spack will never scan for repo.yaml files recursively; it relies only on spack-repo-index.yaml and --repo-path to find package repository roots inside a git repo.

The flag --name applies to the config name/key in repos.yaml under repos:<name>. This flag is optional in the common case of adding just one package repository: it is set to the package repository's namespace. In case of monorepos with multiple package repositories, the --name flag is required.
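
The naming rule could look roughly like this. The helper `config_key` is hypothetical and only restates the paragraph above; it assumes the namespaces of the selected package repositories are already known.

```python
from typing import List, Optional


def config_key(name: Optional[str], namespaces: List[str]) -> str:
    """Choose the key to use under ``repos:`` for a newly added repository."""
    if name is not None:
        return name  # an explicit --name always wins
    if len(namespaces) == 1:
        return namespaces[0]  # common case: default to the namespace
    # Several package repositories (or none): there is no obvious default.
    raise ValueError("--name is required when adding multiple package repositories")
```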

@spackbot-app spackbot-app bot added the commands, core (PR affects Spack core functionality), and defaults labels May 27, 2025
@haampie haampie force-pushed the hs/feature/git-package-repos branch from 7406385 to fb3085c Compare May 28, 2025 09:23
@haampie haampie force-pushed the hs/feature/git-package-repos branch 5 times, most recently from 67ccd6a to c7c6178 Compare May 28, 2025 13:41
@spackbot-app spackbot-app bot added the tests (General test capability(ies)) label May 28, 2025
@haampie haampie force-pushed the hs/feature/git-package-repos branch 5 times, most recently from e966548 to abe0330 Compare May 28, 2025 14:05
@haampie haampie requested a review from Copilot May 28, 2025 14:15

Copilot AI left a comment


Pull Request Overview

Adds support for defining and managing remote git-based Spack package repositories alongside local ones.

  • Defines new RepoDescriptor classes to parse, initialize, and construct local and remote repos.
  • Extends the YAML schema to allow git objects with repository, destination, and repo_path.
  • Updates spack repo add/list commands, command-line flags, and shell completions for the new options.

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| share/spack/spack-completion.fish | Added --name and --repo-path options for spack repo add, updated help text. |
| share/spack/spack-completion.bash | Included --name and --repo-path in bash completion for spack repo add. |
| lib/spack/spack/test/repo.py | New tests for parsing local and git-based repo descriptors. |
| lib/spack/spack/schema/repos.py | Extended repos schema to support remote git repository definitions. |
| lib/spack/spack/repo.py | Introduced RepoDescriptor hierarchy and parsing logic for remote repos. |
| lib/spack/spack/main.py | Integrated auto-fetching of remote repos in the main setup. |
| lib/spack/spack/cmd/repo.py | Refactored repo add into _add_repo, added support for --name, --repo-path, and destination. |

Comments suppressed due to low confidence (3)

lib/spack/spack/test/repo.py:587

  • No test covers the case where repo_path is provided as a list of strings. Add a test that passes repo_path: ['one','two'] to verify descriptor.relative_repo_paths matches both entries.
descriptor = spack.repo.parse_config_descriptor(

lib/spack/spack/test/repo.py:568

  • There is no test for the default destination path when destination is omitted. Add a test that omits destination and asserts that descriptor.destination is set to a hash-based cache directory.
def test_parse_config_descriptor_git_1(tmp_path: pathlib.Path):

share/spack/spack-completion.fish:2774

  • The completion for positional arguments (path_or_repo and destination) was removed. Add back directory completions for those positions, e.g.:
    complete -c spack -n '__fish_spack_using_command_pos 0 repo add' -f -a '(__fish_complete_directories)'
    and a similar line for the destination argument.
set -g __fish_spack_optspecs_spack_repo_add h/help name= repo-path= scope=

@tgamblin
Member

High level thoughts on the schema:

  1. We already have attributes that can determine how to fetch something: git, url, svn, etc. from the fetchers. I think it would be good to make this consistent. So, instead of the schema above, sticking to the convention users already know from version() makes more sense to me:
repos:
  my_repo:
    # you can run even these attrs through `from_kwargs` in `fetch_strategy.py`
    git: https://example.com/example/example.git
    # optional (and not necessarily all implemented in this PR)
    branch: ...
    tag: ...
    commit: ...

    # local_path might be more descriptive but I guess destination is ok
    destination: ~/example  # optional

The way repo_path currently works seems good to me, though shouldn't it be repo_paths?

On spack-repo-index.yaml -- the schema seems a bit tightly constrained and isn't symmetric with repos.yaml. I think this works better:

repo_index:
     paths:
         - subdir/spack_repo/example/x
         - subdir/spack_repo/example/y

Now paths: mirrors repo_paths:, and we have room to add to the repo_index schema in the future.

@tldahlgren tldahlgren added this to the v1.0.0 milestone May 30, 2025
@haampie haampie force-pushed the hs/feature/git-package-repos branch 4 times, most recently from 7b2e675 to a329d74 Compare June 2, 2025 16:09
@haampie haampie changed the title [wip] remote package repositories Remote package repositories Jun 2, 2025
@haampie haampie force-pushed the hs/feature/git-package-repos branch from a329d74 to ad14202 Compare June 2, 2025 16:18
@haampie haampie requested a review from Copilot June 2, 2025 16:19

Copilot AI left a comment


Pull Request Overview

Adds support for cloning and registering Spack package repositories from remote git URLs, including selective sub-repo paths and index file discovery. Key changes:

  • Extend repos.yaml schema and parse_config_descriptor to accept git URLs with optional destination and repo_paths.
  • Implement RemoteRepoDescriptor for fetching, indexing, and constructing remote repos, and integrate into spack.main and spack repo commands.
  • Update CLI (spack repo add/remove/list), completions (fish/bash), and tests to cover the new functionality.

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| share/spack/spack-completion.fish | Updated repo add completion description for plural repositories |
| share/spack/spack-completion.bash | Added --name and --repo-path options to bash completion |
| lib/spack/spack/test/repo.py | Added tests for parsing git vs. local descriptors and index file |
| lib/spack/spack/schema/repos.py | Expanded JSON schema to support git descriptor objects |
| lib/spack/spack/repo.py | Introduced RemoteRepoDescriptor, lock helper, and index reader |
| lib/spack/spack/main.py | Automatically initialize remote repos in the main entry point |
| lib/spack/spack/cmd/repo.py | Extended CLI parser and logic for git-based repo add/remove |

Comments suppressed due to low confidence (1)

@haampie
Member Author

haampie commented Jun 2, 2025

Should repo_paths then just be called paths for symmetry?

@haampie haampie force-pushed the hs/feature/git-package-repos branch 2 times, most recently from f7470d0 to 8e8707d Compare June 2, 2025 17:47
@haampie haampie force-pushed the hs/feature/git-package-repos branch from 8e8707d to aa41bd2 Compare June 3, 2025 06:29
@haampie haampie merged commit 0b8d9f9 into develop Jun 3, 2025
30 of 34 checks passed
@haampie haampie deleted the hs/feature/git-package-repos branch June 3, 2025 07:20
Member

@tgamblin tgamblin left a comment


LGTM

kshea21 pushed a commit to kshea21/spack that referenced this pull request Jun 18, 2025
Adds support for Spack package repositories from external git repos.

```yaml
repos:
  my_repo:
    git: https://example.com/example/example.git
    destination: ~/example  # optional
    paths:  # optional
    - subdir/spack_repo/example/x
    - subdir/spack_repo/example/y
```

If `destination` is not configured, Spack clones the git repo to `~/.spack/git_repos/{hash(repository)}`.

Package repositories can put a file `spack-repo-index.yaml` in their root with relative paths to roots of package repositories (i.e. directories that contain `repo.yaml`):

```yaml
repo_index:
  paths:
  - subdir/spack_repo/example/x
  - subdir/spack_repo/example/y
```

so users don't have to put that under `paths` in config. The `spack-repo-index.yaml` is simply a list of paths under `repo_index`. The idea is to avoid duplicating data such as "namespace", which is already specified in `<git repo>/<repo path>/repo.yaml`, so that there are never two sources of truth that can go out of sync.

Further, `paths` in user config take precedence, which allows users to enable specific package repositories when the git monorepo provides more than one.

Remote package repositories are cloned/initialized in:
* `spack.main`
* `spack repo add`

This is process safe due to the lock in `$SPACK_USER_CACHE_PATH/package-repository.lock`; only one process can clone at a time.

The `spack repo add` command has a few new flags and a new positional arg:

```
spack repo add [-h] [--name NAME] [--path PATH] [--scope ...] path_or_repo [destination]
```

The signature is similar to the familiar `git clone <repository> <directory>`.

The `path_or_repo` argument is detected as a remote git repo if it contains a `:` not preceded by a `/`, which is what git does as well. If in the future we would support package repositories other than local file paths and remote git repos, we can resolve ambiguities with a new flag `[--git | --path | --<other-type>]`, but this is currently unnecessary.

The positional `destination` argument allows users to pick their own clone path, and only applies in case of git repos.

The flag `--path` corresponds to `repos:<name>:paths` and can be repeated to select specific package repositories in a git monorepo. It is required if the git repo does not provide a `spack-repo-index.yaml` file in its root. Spack will never scan for `repo.yaml` files recursively; it relies only on `spack-repo-index.yaml` and `--path` to find package repository roots inside a git repo.

The flag `--name` applies to the *config* name/key in `repos.yaml` under `repos:<name>`. This flag is optional in the common case of adding just one package repository: it is set to the package repository's namespace. In case of monorepos with multiple package repositories, the `--name` flag is required.

Labels

commands, core (PR affects Spack core functionality), defaults, shell-support, tests (General test capability(ies))


4 participants