Discussion: Move built-in packages to a separate repository #47480
Description
Supersedes #1773
@tgamblin @becker33 @haampie @alalazo @psakievich @citibeth
Contents
Summary
The Spack repository contains a significant number of built-in package recipes and associated files that are needed for installing software. This discussion proposes to move those files into a separate repository and modify the core software to support the change.
Task Checklist
The following is a checklist of the high-level tasks described in Approach. Each task should correspond to an approach subsection.
- Explore Relevant Features of Other Products (@alalazo, @scheibelp)
- Document User Stories or Use Cases (@scheibelp, @becker33, @zackgalbreath )
- Move Build Systems into BuiltIn (@haampie)
- Assess Untangling Unit Tests (@tldahlgren)
- Untangle Compiler Wrappers (@alalazo)
- Support Repository Compatibility (@becker33, @zackgalbreath )
- Determine Support for Bootstrapping (@becker33)
- Assess Untangling GitHub Actions and CI (@zackgalbreath/Kitware)
- Separate the Builtin Repository (@zackgalbreath/Kitware, @haampie)
- Support and Access Package Repositories (@haampie )
- Confirm Package-Related Commands Work
- Add Repository CLI Commands .. tied to use cases; defer to step 2
Back to Contents
Rationale
Spack packages specify the versions, options, and processes required for installing and optionally testing software. Their directories may contain additional files, such as patches and custom tests. For some, the number of associated files is a significant hurdle while package drift is a key issue for others. This change is also considered by the community as a requirement of Spack 1.0.
With over 8,340 packages, the files associated with built-in packages currently comprise 86% of a fresh clone of the Spack repository. Embedding the ever-growing number of packages in the same repository as the core software creates a burden for some installation environments, especially those with space and/or i-node constraints.
Another problem with maintaining built-in packages in the same repository is that packages generally evolve independently, which can be problematic for projects that rely on a specific version (or commit) of a package (e.g., for software they've already installed) but want or need to get updates to the core Spack software. This "drift" in the package implementation can lead to the unanticipated and possibly undesirable installation of new versions of dependent software.
Maintaining packages in the same repository as the core software leads to problems for many Spack users. These problems range from excessive use of file space/i-nodes to the effects on project development associated with package implementation "drift".
Back to Contents
Description
While Spack already supports multiple package repositories (https://spack.readthedocs.io/en/latest/repositories.html), there are additional considerations beyond simply moving the built-in packages folder into a separate repository (https://docs.github.com/en/get-started/using-git/splitting-a-subfolder-out-into-a-new-repository) and configuring the new repository. Hence, the issues that need to be discussed here include:
- Which directories and files need to be moved?
- How will PR checks be affected?
- How will spack fetch packages for bootstrapping? Where?
- What is the process for fetching and configuring the package repository?
- How will updating separate repositories be managed and changes synchronized?
- What changes, if any, are needed to the many commands that work on the built-in repository?
Back to Contents
Affected Directories and Files
Which directories and files need to be moved?
At a minimum the new package repository shall contain the build systems ($SPACK_ROOT/lib/spack/spack/build_systems) and built-in packages (under $SPACK_ROOT/var/spack/repos/builtin/packages). Additional files may need to be moved depending on the approach chosen for handling CI checks (described in the next section).
Mock packages are not to be moved to the new repository since they are integral to Spack's unit tests.
Back to Contents
PR Checks
How will PR checks be affected?
Pull requests (PRs) trigger a number of checks related to packages. One of the checks uses GitLab CI pipelines to rebuild modified packages to maintain the build cache. Our plan is to migrate these pipelines to the new packages repository (away from core Spack). Unit tests in the core Spack repo will be responsible for properly exercising the package API to maintain compatibility between these two repos.
Spack performs additional checks, including style checks and audits. Those relevant only to packages would need to be moved to the new repository, while common ones would need to be copied.
At some point in the not-too-distant future we hope to trigger running stand-alone tests for modified packages. This check could continue as originally envisioned to be part of the CI workflow approach above or could be separated (assuming it can be triggered after any build cache updates or any non-build cache builds that might be added by @bernhardkaindl as discussed in the 2024 Nov 6 Technical Steering Committee meeting).
Back to Contents
Bootstrapping
How will spack fetch packages for bootstrapping? Where?
Spack has several sets of requisite software that must be bootstrapped:
- clingo (https://spack.readthedocs.io/en/latest/getting_started.html#bootstrapping-clingo);
- gnupg; and
- patchelf.
Back to Contents
Package Repository
What processes will be used for fetching and configuring the package repository?
Spack will need the ability to configure the package repository and a process for fetching packages. Spack currently supports multiple package repositories through entries in a repos.yaml file (see the docs at https://spack.readthedocs.io/en/latest/repositories.html). However, those entries are required to be on the file system.
If that mechanism is retained, then it will need to support remote package repositories (specified with URLs that can include a commit hash), and the default configuration ($SPACK_ROOT/etc/spack/defaults/repos.yaml) will need to be modified to point to the new package repository. The relevant packages would at least need to be fetched from that location and would likely be cached locally.
Alternatively, spack could support a package registry with a modified or alternate representation for the configuration (such as that used by cargo, which supports named registries with URLs).
Progress:
- New repository: https://github.com/spack/spack-packages
- sync packages #50322
Back to Contents
Repository Updates and Synchronization
How will updating separate repositories be managed and changes synchronized?
There are changes to Spack core that require changes to packages and/or, if moved, CI stacks. Some features, such as adding support for stand-alone tests, are optional while others, such as a change in syntax (e.g., maintainers) or the schema for configuring CI stacks, are not. Mandatory changes to packages resulting from changes to the core require coordination of updates to both repositories.
One approach to managing updates and synchronization is to include the package repository as a git submodule of the Spack (core) repository. While this approach makes it somewhat easier to synchronize changes to both repositories (since you can clone the project repository under the core repository), its use of git is more advanced than many contributors may be accustomed to (https://git-scm.com/book/en/v2/Git-Tools-Submodules).
An alternative is to rely on some form of dependency management. (TBD: What would this look like? Would we support synchronization features related to the repository configuration? CI checks?)
Back to Contents
Package Commands
What changes, if any, are needed to the many commands that work on the built-in repository?
Commands that expect to operate on built-in packages shall continue to do so, which will need to be confirmed (https://spack.readthedocs.io/en/latest/command_index.html#command-reference). These include package creation (e.g., create), query (e.g., find, info), build (e.g., gc), and developer (e.g., blame, pkg) commands.
Approach
A significant amount of work is needed to accomplish a well-designed, implemented, and supported split of the repository. We split the work into more manageable tasks. The high-level tasks, not in priority order, are described here.
Explore Relevant Features of Other Products
There are several tools having relevant features that we would like explored in terms of processing. These can include:
- homebrew (https://github.com/Homebrew/brew)
- Nix (https://github.com/nixos)
- pixi (https://github.com/prefix-dev/pixi)
- rye (https://github.com/astral-sh/rye)
- uv (https://github.com/astral-sh/uv)
And possibly:
- apt (https://salsa.debian.org/apt-team/apt)
- cargo (https://github.com/rust-lang/cargo)
- npm (https://github.com/npm)
Back to Contents
Document User Stories or Use Cases
It would help everyone working on these tasks to have a better understanding of the use cases this work addresses, including what the features being implemented here are for and how people expect to use them.
Anticipated Use Cases
1. pinned Spack + pinned Packages
   - Intended audience: users that greatly value stability and are okay using older versions of software.
   - Pros:
     - Most stable
   - Cons:
     - Cannot install new versions of packages as they are released
     - Cannot benefit from new features and performance improvements as they are added to Spack
   - CI requirements:
     - one-time thorough installation testing (and build cache population) for packages of interest when releases are created.
2. pinned Spack + floating Packages
   - Intended audience: users that value stability but want to keep their dependencies as up-to-date as possible.
   - Pros:
     - Can quickly and easily install new versions of packages.
     - Minimize exposure to temporary regressions in the core Spack code base.
   - Cons:
     - “Early adopter” for packages: willing to accept temporary regressions in package definitions.
   - CI requirements:
     - CI pipelines using a recently verified version of Spack to test relevant, proposed changes to the Packages repository before they are merged. This is essentially our existing GitLab CI pipelines, which will be relocated to the Packages repo.
3. floating Spack + pinned Packages
   - Intended audience: users with pinned dependency versions that are eager to adopt new functionality and performance improvements for Spack.
   - Pros:
     - Access to latest Spack features & improvements
     - Stable package recipes
   - Cons:
     - Can’t easily install new versions of dependencies
     - Willingness to accept temporary regressions in core Spack
   - CI requirements:
     - Unit tests in Spack that thoroughly exercise the Package API.
     - More generally, Spack's existing pytest suite.
4. floating Spack + floating Packages
   - Intended audience: users that want cutting edge dependency versions and the latest features Spack has to offer.
   - Pros:
     - access to both the latest features in Spack, as well as most up-to-date versions of package recipes.
   - Cons:
     - willingness to accept some instability as regressions are addressed in either repo.
   - CI requirements:
     - This use case will automatically benefit from CI supporting use cases 2) and 3).
     - Will additionally require acceptance tests where we periodically (weekly?) do a “rebuild everything” with bleeding edge versions of both repos.
     - If these acceptance tests pass, we bump the pinned version of Spack used for testing in the packages repo (and vice versa, if we end up deciding to test some packages from core Spack).
     - When acceptance testing fails we will try to fix it quickly to resume automated snapshot releases.
Back to Contents
Move Build Systems into BuiltIn Package Repository
Determine what needs to be moved from the core's build systems and into the builtin package repository. This task will include any restructuring and refactoring of existing modules and updates/additions to the corresponding unit tests.
Since these changes will impact how packages utilize the core's API, packages that violate the new API will need to be identified and their upgrade status tracked. One proposed option is to maintain a file that records individual packages that violate the new API. One example that was given was dbcsr, which utilizes the protected (some say private) function self._if_ninja_target_execute().
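One way such a record could be seeded is with a crude static scan of recipe sources. The following is only a sketch under stated assumptions: the function name `protected_self_accesses` and the sample recipe text are hypothetical (the sample is not the real dbcsr recipe), and the heuristic simply flags underscore-prefixed attributes accessed on `self` that the file does not define as functions itself, so it will also flag a recipe's own private data attributes.

```python
import ast


def protected_self_accesses(source: str) -> set:
    """Return underscore-prefixed names accessed on `self` that are not
    defined as functions in the same source (a rough heuristic)."""
    tree = ast.parse(source)

    # Functions defined in this file are assumed to be the recipe's own helpers.
    defined_here = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }

    hits = set()
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == "self"
            and node.attr.startswith("_")
            and not node.attr.startswith("__")  # skip dunder names
            and node.attr not in defined_here
        ):
            hits.add(node.attr)
    return hits


# Hypothetical recipe text, for illustration only.
SAMPLE_RECIPE = '''
class Dbcsr:
    def _local_helper(self):
        return "ok"

    def check(self):
        self._local_helper()
        self._if_ninja_target_execute("test")
'''

violations = protected_self_accesses(SAMPLE_RECIPE)
print(sorted(violations))  # prints ['_if_ninja_target_execute']
```

The output of a scan like this could be written to the tracking file and reviewed by hand, since a static heuristic cannot tell whether a flagged name really belongs to the core.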
See #47480 (comment) for a snapshot of imports that could be affected by this task.
Progress
- package api: drop wildcard re-export #48760
- builtin: remove redundant imports #48765
- spack.package: wrap llnl.util.tty #48793
- spack.package: re-export EnvironmentModifications / Prefix #48792
Back to Contents
Assess Untangling Unit Tests
Determine what aspects of unit tests are tied to packages in the builtin package repository (e.g., testing ecosystem) and which are specific to the core software. Come up with solutions to any issues such that the (remaining/revised) core unit testing process works after the builtin package repository is moved to a new GitHub repository.
Make as many changes as possible prior to the actual repository split and create a plan for the remaining tasks.
Progress
- #48232 (Umbrella PR)
- RepoSplit/tests: update repo tests relying on builtin package repo to only use mock repos #48926
- RepoSplit/core: flag core uses of builtin #48927
- RepoSplit/tests: flag/update more unit tests away from relying on the builtin repository #48930.. has since been split into multiple independent PRs
- Decide what to do with symlinks from builtin.mock to builtin; see Turn compilers into nodes #45189 (comment) (see RepoSplit/tests: replace mock symlinks with builtin snapshots #50478)
Back to Contents
Untangle Compiler Wrappers
Compiler wrappers are part of the package hash so this work needs to be assessed to determine what should be part of the core and what should be moved to the separate builtin package repository. Implement solutions to any issues that arise, which may include restructuring and refactoring existing modules (e.g., establishing a compiler interface in Spack core) and updates/additions to the corresponding unit tests.
- Move the compiler-wrapper into its own package
- Try to make the compiler wrapper less coupled to Spack internals. In Turn compilers into nodes #45189 the compiler-wrapper package does not depend on Spack's configuration, but it still sets a few environment variables that are needed by Spack. See discussion in Turn compilers into nodes #45189 (comment)
Warning: This task is dependent on maturing support for compilers as dependencies.
Back to Contents
Support Repository Compatibility
After the split, we anticipate maintaining the following independent versions:
- Spack version (semver, major.minor.patch)
  - We expect to maintain the current cadence of two releases per year.
- Spack package API version (semver, major.minor)
  - The package API includes the spack.package module, which is a small subset of Spack's scripting API. It also includes the structure of a repository (both on the filesystem and as Python modules).
  - The minor version is bumped if the package API is extended in a backward compatible way. For example: a new directive is added.
  - The major version is bumped if the package API has a breaking change. For example: a directive is removed, a directive's function signature is changed in a breaking way, or the repository filesystem layout is changed in an incompatible way. We will strive to avoid major version bumps as much as possible.
  - Each package repo will define the package API version for recipes contained within. If a repo specifies compatibility with package API version 1.3 it means >=1.3 and <2.
  - CI for the package repo will need to use an appropriate version of Spack (one that can understand this package API version).
- Spack package repo version (date + patch release, e.g. 2025-06.2)
  - Used for binary caches
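The ">=1.3 and <2" rule for the package API version can be sketched as a small check. This is a minimal illustration, not Spack's implementation: the function names are hypothetical, and it assumes Spack advertises a single package API version, whereas the proposal allows for Spack understanding more than one.

```python
def parse_api_version(text: str) -> tuple:
    """Split a 'major.minor' package API version into integers."""
    major, minor = text.split(".")
    return int(major), int(minor)


def spack_can_read(repo_api: str, spack_api: str) -> bool:
    """True if a Spack supporting `spack_api` satisfies a repo declaring `repo_api`.

    A repo declaring 1.3 needs an API that is >=1.3 and <2, i.e. the same
    major version and an equal or newer minor version.
    """
    repo_major, repo_minor = parse_api_version(repo_api)
    spack_major, spack_minor = parse_api_version(spack_api)
    return spack_major == repo_major and spack_minor >= repo_minor


print(spack_can_read("1.3", "1.5"))  # prints True: newer minor, same major
print(spack_can_read("1.3", "1.2"))  # prints False: Spack's API is too old
print(spack_can_read("1.3", "2.0"))  # prints False: major version changed
```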
Compatibility Guarantees
All versions of Spack within a major release stream (e.g. 1.x.y) will be able to understand the package API version supported by the initial major release (1.0.0). If at some point in the future Spack drops support for a package API version, this will require a major version bump for Spack.
Every Spack release will be able to read packages for the prior two years' worth of Package Repo releases.
When a backward incompatible change to the package API is deemed necessary, we will release a final supporting version of Spack and the official Package repo before bumping the package API version.
Spack will define and announce what Package API version(s) it understands.
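The two-year guarantee can be illustrated with a date check on package repo versions. The sketch below assumes the "date + patch release" format is YYYY-MM.patch, as the 2025-06.2 example suggests, and that the window is measured in whole months from the Spack release date; both are assumptions, and the function names are hypothetical.

```python
import re
from datetime import date


def parse_repo_version(version: str) -> tuple:
    """Parse an assumed 'YYYY-MM.patch' repo version into (year, month, patch)."""
    match = re.fullmatch(r"(\d{4})-(\d{2})\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized package repo version: {version!r}")
    year, month, patch = (int(group) for group in match.groups())
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in package repo version: {version!r}")
    return year, month, patch


def within_support_window(repo_version: str, spack_release: date, years: int = 2) -> bool:
    """True if the repo release is at most `years` years older than the Spack release.

    Repo releases newer than the Spack release return False, since the
    guarantee only covers prior releases.
    """
    year, month, _patch = parse_repo_version(repo_version)
    months_old = (spack_release.year - year) * 12 + (spack_release.month - month)
    return 0 <= months_old <= years * 12


print(within_support_window("2025-06.2", date(2026, 11, 1)))  # prints True (17 months old)
print(within_support_window("2023-01.0", date(2026, 11, 1)))  # prints False (46 months old)
```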
Back to Contents
Determine Support for Bootstrapping
As described in Bootstrapping, Spack relies on a core set of packages. Determine what is needed to support bootstrapping once the builtin package repository is no longer available.
Back to Contents
Assess Untangling GitHub Actions and CI
As discussed in PR Checks, Spack has GitHub Actions that result in a variety of checks against the proposed changes. Determine which checks are common to the core and package repositories, specific to the core software, and specific to the builtin package repository. Document where each belongs and what is needed to ensure they are being performed for the proper repositories.
Back to Contents
Separate the Builtin Repository
Once the groundwork is in place in terms of the API and a plan is in place regarding GitHub Actions and CI, create the new builtin repository and configure the GitHub Actions and CI accordingly. Be mindful of the paths listed in Affected Directories and Files.
Ensure that the build (and test) outputs continue to get reported in CDash at https://cdash.spack.io/index.php?project=Spack+Testing.
Progress:
- https://github.com/spack/spack-packages
- sync packages #50322
Back to Contents
Support and Access Package Repositories
One of the goals is to support the builtin package repository in the same manner as other package repositories. So there will be no use of git submodules or subsets. This means that not only will builtin packages be moved into and retrieved from a separate (GitHub) repository, but the mechanisms for staging, ensuring compatibility, moving package repositories to their local (cache) location, and accessing the packages from the local cache need to be designed and implemented.
Spack already supports caching of included remote files for environments but the goal is to design and implement the processes such that they can be used to support not only package repositories but mirrors and environments.
Preliminary ideas for specifying repositories have the following forms:
repos.yaml:
repos:
- url: https://github.com/my/package/repo.git
  ref: 1.0.0
  namespace: builtin
  path: /path/to/local/repo
Repositories can also be overridden in a spack.yaml file:
spack:
  repos::
  - url: https://github.com/other/package/repo.git
    ref: develop
    namespace: builtin
    path: /path/to/local/builtin/repo
If a path is not given, then it would default to one specified/calculated by Spack.
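How Spack would calculate the default location is not specified here. As one possible sketch, the path could be derived deterministically from the url and ref so that different pins of the same repository do not collide. The function names, the `package_repos` directory name, and the hashing scheme below are all hypothetical.

```python
import hashlib
from pathlib import Path


def default_repo_path(url: str, ref: str, cache_root: str) -> Path:
    """Derive a stable cache directory from the repository url and ref."""
    digest = hashlib.sha256(f"{url}@{ref}".encode("utf-8")).hexdigest()[:16]
    return Path(cache_root) / "package_repos" / digest


def resolve_repo_path(entry: dict, cache_root: str) -> Path:
    """Use the entry's explicit path if given, else fall back to the default."""
    explicit = entry.get("path")
    if explicit:
        return Path(explicit)
    return default_repo_path(entry["url"], entry["ref"], cache_root)


# Entries mirroring the preliminary repos.yaml form above.
with_path = {
    "url": "https://github.com/my/package/repo.git",
    "ref": "1.0.0",
    "namespace": "builtin",
    "path": "/path/to/local/repo",
}
without_path = {key: value for key, value in with_path.items() if key != "path"}

print(resolve_repo_path(with_path, "/tmp/spack-cache"))
print(resolve_repo_path(without_path, "/tmp/spack-cache"))
```

Keying the directory on both url and ref means switching a repo from one ref to another lands in a fresh location, at the cost of extra disk usage; a real design might instead share one clone per url.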
See also Package Repository.
Warning: This task relies on the work done in Support Repository Compatibility.
Back to Contents
Confirm Package-Related Commands Work
Confirm that all of the package-related commands continue to work as before with the separated builtin package repository.
Back to Contents
Add Repository CLI Commands
Design and implement new CLI (sub)commands for interacting with the repositories. At a minimum, there should be the ability to retrieve and update repositories. Preliminary ideas include:
$ spack repo get [<repo-name>] # <repo-name> assumes presence in `repos.yaml`; get all repos if no name provided
$ spack update <repo-name> [<version>] # Update the named repository to the specified version
$ spack update --advance-packages # Alternate proposal to update spack *and* get the latest package repo(s) commits
In the case of the first syntax for spack update, the <version> is optional because a repo with a reference to a branch (e.g., develop) would only need to retrieve the latest commits.
See #47480 (comment) for the alternate spack update syntax.
The update command would need to update repos.yaml if the version is different from that provided. The command should also ensure a new version is compatible with Spack's version.
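The decision logic described for the update command might look roughly like the following sketch. It operates on an in-memory entry shaped like the preliminary repos.yaml form; the function name `plan_repo_update` and the `is_compatible` callback are hypothetical, and fetching commits and writing repos.yaml are left out.

```python
from typing import Callable, Optional, Tuple


def plan_repo_update(
    entry: dict,
    version: Optional[str],
    is_compatible: Callable[[str], bool],
) -> Tuple[dict, bool]:
    """Return (entry to use, whether repos.yaml needs rewriting).

    With no version (e.g., a branch ref such as develop) or an unchanged
    version, only new commits need fetching and the configuration stays as is.
    """
    if version is None or version == entry["ref"]:
        return dict(entry), False

    # Refuse versions that the running Spack cannot understand.
    if not is_compatible(version):
        raise ValueError(f"repository version {version} is not compatible with this Spack")

    updated = dict(entry)
    updated["ref"] = version
    return updated, True


entry = {"url": "https://github.com/my/package/repo.git", "ref": "1.0.0", "namespace": "builtin"}

new_entry, changed = plan_repo_update(entry, "1.1.0", lambda version: True)
print(new_entry["ref"], changed)  # prints 1.1.0 True
```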
Back to Contents
Additional information
- Cargo registries: https://doc.rust-lang.org/cargo/reference/registries.html
- Git submodules: https://git-scm.com/book/en/v2/Git-Tools-Submodules
- Spack's package repository configuration: https://spack.readthedocs.io/en/latest/repositories.html
Back to Contents
General information
- I have searched the issues of this repo and believe this is not a duplicate of an open issue
Back to Contents