Stack the current spack repository on top of a remote repository that contains pre-built packages. Essentially, this adds installed packages in remote read-only spack installations (on the same system) to the current spack repository. Motivation: In our group we have a set of pre-built packages that reside in their own spack repository and are available system-wide in a read-only fashion. Up until now there seemed to be no "proper" way to use these packages as dependencies for locally built specs in a separate spack repository.
We have exactly the same problem at my site, but opted to use relocation of binaries instead of sym-linking software or stacking the db. To foster the discussion, what we have in mind is:
In practice a user should do something like this:
# 1. Clone Spack and activate shell support
$ git clone https://github.com/spack/spack.git
# 2. Symlink or copy part of our system configuration (modules.yaml, packages.yaml, compilers.yaml)
# 3. Load the module(s) they want to re-use or modify
$ module load gcc hdf5
# 4. Reproduce the installation via spec.yaml
$ spack install --use-cache -f ${HDF5_SPEC_YAML}
This won't exactly reuse the same software installed on the system but, on the other hand, it provides more isolation from it. For instance, you could decide for any reason to remove a package, and that won't affect somebody who installed it in their own instance of Spack.
* updated section and level
lib/spack/spack/cmd/stack.py
Outdated
| level = "long" # TODO: re-check what 'level' is supposed to mean | ||
| def setup_parser(sp): |
You really do like your 2 letter variable names^^
I feel like there should be a "check remote" command; it should essentially just check the existence of the original file for each symlink. A poor man's implementation of this might be a simple: remove all links and stack again. There should also probably be a section on "what probably happened if stuff randomly breaks" in the help section.
Hmm.. in what way would it be different from just re-stacking? All symlinks with the same hash will be (deleted and) re-created.
I have yet to check what happens when reindexing the repository with dangling symlinks.
Maybe it would be better to add a "remove all dangling symlinks"-phase to reindex instead?
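A possible shape for such a "remove all dangling symlinks" phase, as a minimal standalone sketch. It only uses the standard library; the function names are made up for illustration and are not part of Spack, and whether this belongs in reindex or in a separate "check remote" command is left open.

```python
import os


def find_dangling_symlinks(root):
    """Return paths under root that are symlinks whose target no longer exists."""
    dangling = []
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        # os.walk lists symlinks to existing directories in dirnames and all
        # other symlinks (including broken ones) in filenames, so check both.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() looks at the link itself; exists() follows the link
            # and is False when the target is gone.
            if os.path.islink(path) and not os.path.exists(path):
                dangling.append(path)
    return sorted(dangling)


def remove_dangling_symlinks(root):
    """Delete dangling symlinks under root and return the removed paths."""
    removed = find_dangling_symlinks(root)
    for path in removed:
        os.remove(path)
    return removed
```

Compared with re-stacking, this only touches links whose remote target has disappeared and leaves everything else alone.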
lib/spack/spack/cmd/stack.py
Outdated
| setup_parser.parser = sp | ||
| sp.add_argument( | ||
| '-v', '--verbose', action='store_true', default=False, |
General question: should there be a difference between -d and -v?
| """ | ||
| config = spack.config.get_config("config") | ||
| # NOTE: This has to be kept in sync with spack/store.py! |
add a description of why this is necessary? Come to think of it: Why is it necessary at all? Or are you referring to the fact that the paths in the remote must look the same as in the local spack?
It is necessary to create the directory layout the same way as the default spack does, so if there ever were a switch away from YamlDirectoryLayout, this file would have to be kept in sync.
| # NOTE: This has to be kept in sync with spack/store.py! | ||
| layout = spack.directory_layout.YamlDirectoryLayout( | ||
| canonicalize_path(osp.join(remote, 'opt', 'spack')), | ||
| hash_len=config.get('install_hash_length'), |
this would use the hash_len of the current spack installation. What happens if the remote one has a different setting?
Well, we are on the same machine, so I suppose there would be a system-wide hash_len setting that gets loaded automatically via the config machinery. If someone cares enough to change the default, they will care enough to make it system-wide, so it should not be our concern imho.
Ack, but if someone did this, it should tell the user what the problem is and not produce some weird traceback ending somewhere deep inside spack.
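One way to fail early with a readable message, sketched under assumptions: `StackConfigError` and `check_hash_length` are hypothetical names, and how the remote's `install_hash_length` would actually be read is deliberately not shown.

```python
class StackConfigError(Exception):
    """Hypothetical error type for incompatible local/remote settings."""


def check_hash_length(local_len, remote_len, remote):
    """Raise a descriptive error if local and remote hash lengths differ."""
    if local_len != remote_len:
        raise StackConfigError(
            "Cannot stack on {0}: it uses install_hash_length={1}, but the "
            "local configuration uses {2}. Set both to the same value."
            .format(remote, remote_len, local_len))
```

Calling this before building the layout would turn a mismatch into a single explanatory line for the user.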
| layout = spack.directory_layout.YamlDirectoryLayout( | ||
| canonicalize_path(osp.join(remote, 'opt', 'spack')), | ||
| hash_len=config.get('install_hash_length'), | ||
| path_scheme=config.get('install_path_scheme')) |
| if osp.exists(tgt): | ||
| if osp.islink(tgt): | ||
| os.remove(tgt) | ||
| else: |
there should probably be a flag on how to handle this case. In general it should still be a valid stacked spack if it has some of the remote's packages locally, but this might indicate that something is wrong -> allow for raising an error here?
Well, if the hash is the same it should be the same installed spec, just in a different location, so we can just change the link and if the file is installed locally we do not overwrite it and print a warning - what would you want the switch to do, exactly?
Basically just two options: always keep the existing link, or always take the new link.
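To make the two options concrete, here is a rough sketch of per-spec linking with such a switch. `link_spec` and `keep_existing` are illustrative names only, not the code in this PR, and the sketch assumes Python 3.

```python
import os


def link_spec(src, tgt, keep_existing=False):
    """Symlink tgt -> src and report what happened.

    Returns one of 'linked', 'relinked', 'kept' or 'skipped-local'.
    """
    # Check islink() first: it is also True for dangling links, which
    # os.path.exists() would report as missing.
    if os.path.islink(tgt):
        if keep_existing:
            return "kept"
        os.remove(tgt)
        os.symlink(src, tgt)
        return "relinked"
    if os.path.exists(tgt):
        # A real directory means the spec is installed locally: never overwrite.
        return "skipped-local"
    parent = os.path.dirname(tgt)
    if parent:
        os.makedirs(parent, exist_ok=True)
    os.symlink(src, tgt)
    return "linked"
```

Returning a status string instead of printing lets the caller decide whether a locally installed spec should only produce a warning or raise an error.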
lib/spack/spack/cmd/stack.py
Outdated
| continue | ||
| fs.mkdirp(osp.dirname(tgt)) | ||
| if verbose: | ||
| tty.debug("Linking {} -> {}".format(src, tgt)) |
I'm not sure if this shouldn't be an info and debug information should be something like the tgt and src paths above
lib/spack/spack/cmd/stack.py
Outdated
| num_packages += 1 | ||
| if verbose: | ||
| tty.info("Added {} packages from {}".format(num_packages, remote)) |
This should always be reported, even if it's just to make sure that the user knows that something did happen.
lib/spack/spack/cmd/stack.py
Outdated
| spack.store.db.reindex(spack.store.layout) | ||
| if args.verbose: |
| verbose=args.verbose) | ||
| for remote in args.remotes)) | ||
| spack.store.db.reindex(spack.store.layout) |
Add comment on why this is necessary. Also add an optional info-message?
@alalazo If I understand you correctly the user "only" copies/symlinks part of the configuration data and then builds the software on his own? Or does each user copy the binaries? If so, wouldn't the I guess what I want the |
* removed `--verbose` argument
* changed all verbose statements to call `tty.debug`, left some info statements
* added `-n`/`--no-stack-if-exists` argument that will ignore present symlinks if they exist. The default is to point all existing symlinks to the present remote repository.
Correct, but with two caveats:
We are starting to experiment with this, and the idea is to have the workflow stable and in production in July. Note that Spack already has every feature needed to support it (and at the moment the experiments we are conducting are proceeding without issues).
No, Spack uses |
Since our installs are rather large, it makes sense for people to use the cached version without copying. However, relocating packages in the event the cache gets outdated seems to be a desirable feature.
Ah, I was not aware of this functionality.. Will you push your workflow upstream once it is ready? (So far I only found Up until then, |
Just FYI, the command that exposes this functionality is
Creation of symlink was attempted even though target was already present.
Superseded by #8014
Stack the local spack repository on top of a remote repository that contains pre-built packages, thereby avoiding building packages twice.
Essentially, this adds installed packages in remote read-only spack installations (on the same system) to the current spack repository.
Motivation:
In our group we have a set of pre-built packages that reside in their own spack repository and are available system-wide in a read-only fashion. Up until now there seemed to be no "proper" way to use these packages as dependencies for locally built specs in a separate spack repository.
This is especially useful when debugging new package.pys against the installed set of packages. Previously, the whole spack database had to be built a second time.
Implementation:
We symlink all installed specs from the remote repository into the local opt/spack path and reindex.
Because everything is linked via RPATHs, the remote packages have to stay where they are (hence no option to use hardlinks right now). When compiling packages in the local spack repository, the RPATHs might point to symlinks; however, I do not expect this to be an issue.
Comments?
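For readers who want the gist of the implementation without opening the diff, the path mapping can be sketched roughly as follows. This is a simplified standalone illustration, not the code in this PR: `stack_prefixes` is a made-up name, the list of installed prefixes is assumed to be given, and the reindex that follows is only indicated in a comment. It assumes Python 3.

```python
import os


def stack_prefixes(remote_root, local_root, prefixes):
    """Symlink each remote install prefix to the same relative path locally.

    remote_root and local_root stand for the respective opt/spack
    directories. Returns the number of newly created links.
    """
    linked = 0
    for prefix in prefixes:
        rel = os.path.relpath(prefix, remote_root)
        tgt = os.path.join(local_root, rel)
        # lexists() is also True for dangling links, so nothing is clobbered.
        if os.path.lexists(tgt):
            continue
        os.makedirs(os.path.dirname(tgt), exist_ok=True)
        # The link points at the remote prefix, which must stay in place
        # because the binaries there carry RPATHs to their real location.
        os.symlink(prefix, tgt)
        linked += 1
    # In the actual command, the local database is reindexed after linking.
    return linked
```

Keeping the same relative path on both sides is what lets the local installation treat the linked prefixes like its own.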