mirror create --all can mirror everything #12940
Merged
tgamblin merged 67 commits into spack:develop on Oct 26, 2019
…ring copies of the same resource
… to store, it just downloads each spec that was provided to it
…nk of code can move out of the disable_compiler_existence_check context_manager
… temporary failure, so this adds a small number of retries for each spec that is mirrored
… with a concrete version
… mirror vs. using a per-package directory which will allow multiple packages to refer to the same resource and have it be reused
…s') and also a case where I was using list.append instead of list.extend
…cher has no digest
… a mirror_path property
…r-package mirror directory by default
…ry being the most-preferred)
…olidate this with Spec._spec_hash)
…res concretization
…ackage do_stage/fetch/patch functions do actually require a concrete spec
…te archive mirror id to use a directory based on the digest rather than a hash
…an argparse allows for with its 'create_mutually_exclusive_group' function)
… '--versions-per-spec'; specs are no longer concretized at all unless '--dependencies' is also specified
…d the user doesn't specify '--all'
tgamblin approved these changes on Oct 26, 2019
jrmadsen pushed a commit to jrmadsen/spack that referenced this pull request on Oct 30, 2019
Support mirroring all packages with `spack mirror create --all`.
In this mode there is no concretization:
* Spack pulls every version of every package into the created mirror.
* It also makes multiple attempts for each package/version combination
(if there is a temporary connection failure).
* It continues if all attempts fail, i.e., it makes a best effort to
fetch everything, even if every attempt to fetch one package fails.
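The retry-and-continue behavior described above can be sketched roughly as follows. This is a minimal illustration, not Spack's actual implementation: `fetch_with_retries`, `mirror_all`, the default of three attempts, and the `name@version` labels are all hypothetical names chosen for the example.

```python
import time


def fetch_with_retries(fetch, attempts=3, delay=0.0):
    """Call ``fetch`` up to ``attempts`` times; return True on success.

    ``fetch`` is any zero-argument callable that raises on failure
    (a stand-in for fetching one package/version combination).
    """
    for attempt in range(1, attempts + 1):
        try:
            fetch()
            return True
        except Exception as err:  # a real tool would catch a narrower error type
            print("attempt %d/%d failed: %s" % (attempt, attempts, err))
            if attempt < attempts and delay:
                time.sleep(delay)
    return False


def mirror_all(fetchers, attempts=3):
    """Best-effort loop over a dict mapping 'name@version' to a fetch callable.

    A permanent failure for one entry is recorded and the loop moves on,
    so one bad package does not stop the whole mirror from being built.
    """
    present, failed = [], []
    for label, fetch in fetchers.items():
        if fetch_with_retries(fetch, attempts):
            present.append(label)
        else:
            failed.append(label)
    return present, failed
```

Returning the failed labels rather than raising lets the caller report what could not be fetched once the whole run has finished.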
This also changes mirroring logic to prefer storing sources by their hash
or by a unique name derived from the source. For example:
* Archives with checksums are named by the sha256 sum, e.g.,
`archive/f6/f6cf3bd233f9ea6147b21c7c02cac24e5363570ce4fd6be11dab9f499ed6a7d8.tar.gz`
vs the previous `<package-name>-<package-version>.tar.gz`
* VCS repositories are stored by a path derived from their URL,
e.g. `git/google/leveldb.git/master.tar.gz`.
The new mirror layout allows different packages to refer to the same
resource or source without duplicating that download in the
mirror/cache. This change is not essential to mirroring everything but is
expected to save space when mirroring packages that all use the same
resource.
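As a rough illustration of the naming scheme, the two helpers below build mirror-relative paths of the shapes shown above. They are a sketch under stated assumptions, not Spack's code: the function names are hypothetical, and the repository URL used in the usage note is assumed for the example.

```python
import posixpath
from urllib.parse import urlparse


def archive_mirror_path(sha256, extension="tar.gz"):
    """Path for a checksummed archive: archive/<2-letter prefix>/<sha256>.<ext>."""
    return posixpath.join("archive", sha256[:2], "%s.%s" % (sha256, extension))


def vcs_mirror_path(kind, url, ref, extension="tar.gz"):
    """Path for a VCS checkout: <kind>/<path part of URL>/<ref>.<ext>.

    ``kind`` is the fetch strategy name (e.g. "git" or "svn") and ``ref``
    is a branch name or commit hash.
    """
    repo_path = urlparse(url).path.strip("/")
    return posixpath.join(kind, repo_path, "%s.%s" % (ref, extension))
```

With these helpers, `vcs_mirror_path("git", "https://github.com/google/leveldb.git", "master")` yields `git/google/leveldb.git/master.tar.gz`. Because neither path contains a package name, two packages that use the same source map to the same file.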
The new structure of the mirror is:
```
<base directory>/
  _source-cache/                <-- the _source-cache directory is new
    archive/                    <-- archives/resources/patches stored by hash
      00/                       <-- 2-letter sha256 prefix
        002748bdd0319d5ab82606cf92dc210fc1c05d0607a2e1d5538f60512b029056.tar.gz
      01/
        0154c25c45b5506b6d618ca8e18d0ef093dac47946ac0df464fb21e77b504118.tar.gz
        0173a74a515211997a3117a47e7b9ea43594a04b865b69da5a71c0886fa829ea.tar.gz
      ...
    git/
      OpenFAST/
        openfast.git/
          master.tar.gz         <-- repo by branch name
      PHASTA/
        phasta.git/
          11f431f2d1a53a529dab4b0f079ab8aab7ca1109.tar.gz  <-- repo by commit
      ...
    svn/                        <-- each fetch strategy has its own subdirectory
      ...
  openmpi/                      <-- the remaining package directories have the old format
    openmpi-1.10.1.tar.gz       <-- human-readable name is symlink to _source-cache
```
In addition to the archive names as described above, `mirror create` now
also creates symlinks with the old format to help users understand which
package each mirrored archive is associated with, and to allow mirrors to
work with old spack versions. The symlinks are relative so the mirror
directory can still itself be archived.
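The sketch below shows why a relative link target survives moving or archiving the mirror directory. It is an assumption-laden illustration, not the code from this change: `link_legacy_name` and its parameters are hypothetical, and it relies only on the standard `os` module on a platform that supports symlinks.

```python
import os


def link_legacy_name(mirror_root, cache_rel_path, legacy_rel_path):
    """Create <mirror_root>/<legacy_rel_path> as a relative symlink to
    <mirror_root>/<cache_rel_path> and return the link target."""
    target = os.path.join(mirror_root, cache_rel_path)
    link = os.path.join(mirror_root, legacy_rel_path)
    link_dir = os.path.dirname(link)
    os.makedirs(link_dir, exist_ok=True)

    # Express the target relative to the directory holding the link, so the
    # link never embeds the absolute location of the mirror.
    rel_target = os.path.relpath(target, start=link_dir)

    if os.path.lexists(link):
        os.remove(link)  # replace a stale link from an earlier run
    os.symlink(rel_target, link)
    return rel_target
```

For a link at `openmpi/openmpi-1.10.1.tar.gz`, the stored target has the form `../_source-cache/archive/<prefix>/<sha256>.tar.gz`, which resolves the same way wherever the mirror root ends up.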
Other improvements:
* `spack mirror create` will not re-download resources that have already
been placed in the mirror.
* When creating a mirror, the resources downloaded to the mirror will not
be cached (things are not stored twice).
Add support for mirroring all packages with `spack mirror create --all`. In this mode there is no concretization: Spack pulls every version of every package into a mirror. It makes multiple attempts for each package/version combination (in case fetching a given version fails temporarily) but also continues if all attempts fail (if the failure for a given version is permanent).
This includes an update to the mirroring logic to prefer storing each source under a name that is unique and derived entirely from the source (i.e. the cached name does not include the package name). Previously the name might have been `<package-name>-<package-version>.tar.gz`.
This allows different packages to refer to the same resource or source without duplicating that download in the mirror/cache. This change is not essential to mirroring everything but is expected to save space when mirroring many versions of packages which all use the same resource.
The new structure of the mirror is:
When creating a mirror with archive names as described above, the mirror creation logic now also creates symlinks with the old format in order to help users understand which package each mirrored archive is associated with; the symlinks are relative so the mirror directory can still itself be archived.
Other changes include:
* `spack mirror create` will not re-download resources that have already been placed in it
TODOs:
* `--all-versions` option which you can use to collect all versions/dependencies of a set of root packages, which offers an alternative to caching everything) (possibly in a later PR)
I presume that users likely don't want every version of every package by default, but this is difficult to determine up front without concretizing packages (which would be expensive when we are talking about all packages). If a decent approximate solution can be found I'd like to add it in as an option (but still also allow downloading all versions as an option).