Spack stacks: Allow combinatorial Environment specifications.#11057
Conversation
scheibelp
left a comment
I'm still reading through this but have some questions for now.
```python
    tty.msg('No root specs')
else:
    tty.msg('Root specs')
    # TODO: Change this to not print extraneous deps and variants
```
What specs/variants count as extraneous?
Ones that are matrixed to the spec as a constraint but are not valid for that spec. Like if you have a matrix of [zlib, hdf5+mpi] X [^mvapich2, ^openmpi], the ^openmpi or ^mvapich2 constraint on zlib is extraneous because it will be filtered out by the specialized stack concretization method.
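The filtering described here can be sketched in plain Python. This is a hypothetical illustration of the semantics, not Spack's actual implementation; `depends_on_mpi` is a stand-in for the real validity check against the package recipes:

```python
from itertools import product

# Cross the package list with the constraint list, then drop
# constraints that don't apply to a given package.
packages = ["zlib", "hdf5+mpi"]
mpi_constraints = ["^mvapich2", "^openmpi"]

def depends_on_mpi(pkg):
    # Stand-in for a real validity check; here we simply assume
    # only hdf5+mpi can take an MPI dependency.
    return "hdf5" in pkg

specs = []
for pkg, mpi in product(packages, mpi_constraints):
    if depends_on_mpi(pkg):
        specs.append(f"{pkg} {mpi}")
    else:
        # The MPI constraint is extraneous for zlib and is filtered out.
        specs.append(pkg)

# The duplicate bare 'zlib' entries collapse to a single spec.
print(sorted(set(specs)))
```

Run as-is, this prints three specs: both MPI variants of hdf5+mpi, and a single zlib with no MPI constraint.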
Just read the description. This PR is great and I suspect it will simplify CI scripts by a great amount. I have one comment though:

In my experience, when tuning the deployment of an entire system one might have to specify quite long specs in order to get the details right on dependents etc. In such cases it's easy to forget a variant and end up with e.g. a serial configuration of a package instead of a parallel one. In that case I'd like the concretization of …

I'm testing this branch at ORNL with a massive environment that covers essentially all the software that we deploy on Summit, and have uncovered a couple of issues (mostly in packages) and unexpected behavior regarding matrix outer products (nested matrix expansions don't seem to work, but fail confusingly at a point during concretization later than I would expect). I'm not sure if any of the issues I've seen are already covered in discussion. @becker33, what's the best way for me to pass feedback to you? Comments in this MR, new issues (if possibly duplicates), and what info/files/logs would be most helpful for reproducers and debugging?
@mpbelhorn let's start with comments here. I'm not sure what logs will be most useful to start with until I see what was failing and how. Separate issues would be good for the build issues with individual packages, but let's take care of the stacks bugs here. Can you post some of the issues? I'm especially interested in the issue with matrix products, but let's get all the issues out here so I can start debugging as much as possible.
Still using stacks with good success (thanks!) and wanted to try out a more complete deployment on a system. I have a couple of questions / potential issues I experienced when using stacks for building/installing compilers. For instance, I want to begin the stack with some matrix of compilers, then in subsequent builds leverage these newly built compilers. Though all the compilers build correctly, if I attempt to use them they are obviously not defined under ….

Along this line I'm seeing errors from spack relating to ….

I can organize and include additional details / logs; I just want to make sure this approach could be supported and that I wasn't going about it the incorrect way.
I wrote the feature to allow compilers to automatically be built and placed in the configuration as they're requested. I believe that has been merged, but it is not in this branch yet; I can double check on that. With a combinatorial stack you will get errors with the default view, which is designed for an environment. I haven't thought of any good way to set a default for stacks other than to require you to configure it. Use the ….
Thanks for the details on …. It doesn't appear that the compiler functionality is merged into this branch yet, but it sounds like what would make sense to see. I threw together a simplified example of what I had tried to do and uploaded it (https://gist.github.com/paulbry/139a822eb40e269a4cf79031322582c8), but it's probably unnecessary.
This looks neat, though it's not directly useful for my use case. Looking at it with a fresh set of eyes, I'm a bit confused by two things in this bit of YAML where you introduce definitions:

```yaml
spack:
  definitions:
    - first: [libelf, libdwarf]
    - compilers: ['%gcc', '^intel']
    - second:
      - $first
      - matrix:
        - [zlib]
        - [$compilers]
  specs:
    - $second
    - cmake
```
Is there a related (or similarly named) PR/feature-request somewhere that would let me use one Spack tree to build e.g. compilers and related tools, and then have a number of other trees that use those compilers (perhaps as externals or ???)?
Yep, that's a typo!
I think you want to look at #8772
We designed the syntax for stacks to dereference the definitions in place; that is, a reference expands to its elements at the same level as the surrounding entries. For example, imagine you have some packages you want to compile with every compiler, and some you want to only build once. If we did not expand references in place, the specs key would need to be …. We didn't want to require a matrix heading to dereference a single list by itself, which is why we went with the (admittedly slightly counterintuitive) implementation we chose.
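A sketch of the in-place expansion described here (package and list names are illustrative, not taken from the PR):

```yaml
spack:
  definitions:
    - packages: [hdf5, zlib]
    - compilers: ['%gcc', '%clang']
  specs:
    - matrix:
      - [$packages]
      - [$compilers]
    - cmake   # built once, outside the matrix
```

Because `$packages` and `$compilers` expand in place inside the matrix rows, `cmake` can sit alongside the matrix without being crossed against the compilers.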
Is there a way to set a preferred compiler to bootstrap the others? For instance, suppose I have …:

```
$ # Will compile [email protected] %[email protected]
$ spack install zlib %[email protected]
$ # Will compile [email protected] %[email protected]
$ spack install zlib %[email protected]
```

or the second compiler will be …
alalazo
left a comment
A few trivial things reported by flake8 on Python 3.7
Based on spack#11057
Unless I'm misreading the question, this can be accomplished using package preferences (either the …).
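For example, a packages.yaml along these lines expresses such a preference (compiler names and versions are placeholders; whether this covers the bootstrap scenario above depends on the setup):

```yaml
packages:
  all:
    compiler: [gcc@8.2.0, gcc@4.8.5]  # site-wide compiler preference order
  zlib:
    compiler: [gcc@4.8.5]             # per-package override
```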
@becker33 Thanks, will try that and report here if there's any unforeseen impact (…).
@alalazo also addressed the last of your review comments. |
I'm dubious about this PR as it's currently formulated. A Spack Environment isn't just a list of packages to build, but also a way to load those packages together and use them together. Conflicts can happen in those lists of packages, but they are generally rare, and typically occur in lower-level libraries that nobody uses directly anyway (and that are RPATH'd in, so the conflict doesn't matter).

In addition to being a set of packages, an environment is also "rendered", meaning put together as something someone can use. This is done either by creating a view, or a list of modules to load, or (ideally in the future) a single module that loads the ENTIRE environment at once. Rendering an environment this way eliminates one of the big weaknesses originally inherent in Spack: that users were forced to contend with meaningless hashes when loading modules. No more when you render an environment: either you use the view (whose location you know), or you use the module-loading file (or module itself) that's named after the environment. Hashes are now hidden away, and you never get weird module conflicts or have to fiddle with LMod either. Create the environment, render the environment, and load the environment.

In the examples presented in this PR, it seems that each combinatorial set of packages should be loaded separately. You wouldn't load [[email protected]@1.0] in the same environment as [[email protected]@1.0]. Therefore, I'm wondering if this functionality would be better implemented as a combinatorial way to produce a number of separate environments, rather than matrices in the same environment. I can think of many ways to do this; for example, some kind of templating on a ….
citibeth
left a comment
There was a problem hiding this comment.
I'm quite dubious about this PR: (a) It's a lot of complexity and simpler/more powerful solutions are probably possible, (b) it breaks a fundamental feature of environments, without suggesting how to fix it. I wish I had seen it earlier, and hope that the momentum behind this PR so far does not prevent us from stepping back and thinking things through.
Environments are sets of packages that are meant to be loaded together (and frequently concretized together, unless there are conflicts between low-level libraries within the environment). Environments don't just build the set of packages; but also generate load scripts / views, solving a longstanding problem of how to load packages reliably once they've been built.
- Please consider/explain how package loading will work with separate matrices of packages that by definition cannot / should not be loaded together, but are still built within one environment. As it stands right now, this breaks a core feature of environments. It seems that at the very least, some kind of extension for environment rendering (creation of views or module load scripts) needs to be thought through.
- Please consider the alternative of using standard templating procedures to combinatorially generate a number of environment `spack.yaml` files. In doing so, please consider the complexity/benefit of that scheme vs. the one in this PR. Because... the matrix generation / concretization rules in this PR seem rather complicated, making it hard for the casual user to really understand what's going on. Read through the comments for examples involving definitions, build-once packages, etc. to see what I mean. In contrast, a template-based system may be complicated, but the user can always look at the environments that get rendered to know exactly what the result is. Templating is also inherently more flexible than the current approach.
- If templating is used, then maybe we need a notion of lightweight "environment groups" --- i.e. a set of environments that we want to concretize / render / build all with one Spack command. Maybe we can do that using something simple, such as wildcards that match environment names.
```python
"""Check that the expected install directories exist."""
assert os.path.exists(str(viewdir.join('.spack', 'mpileaks')))
# Check that dependencies got in too
assert os.path.exists(str(viewdir.join('.spack', 'libdwarf')))
```
@becker33: I reviewed again; everything looks good but this line. I even rebased it for you. Why was the check for libdwarf removed?
Note: This used to be spread over many tests, but @tldahlgren consolidated the logic, and I applied the change to the consolidated function here. Are we checking that dependencies are linked in properly somewhere else?
Because we changed the default semantics for environment views to only link the roots. See 5905b54. We discussed this about two weeks ago, based on a request from @lee218llnl
There are options to tailor this behavior in that commit, and tests for the behavior of both options.
Default is still to link everything, right?
Nevermind -- found it. More change requests below.
lib/spack/spack/environment.py
```python
# Magic names
user_speclist_name = 'specs'
def_view_name = 'default'
def_view_link = 'roots'
```
Make this `all` by default -- otherwise regular environments are not going to work for python (and likely other things). e.g., I think this is a pretty common set of requirements:

```yaml
spack:
  specs:
    - py-numpy
    - py-pandas
```

But if you link only roots by default, the environment won't even work. I think `roots` is still only for when you know what you're doing.
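For reference, a hedged sketch of how the view configuration discussed here might be spelled in spack.yaml (the key names follow this PR's view-linking feature and the root path is the usual default; treat both as assumptions):

```yaml
spack:
  specs:
    - py-numpy
    - py-pandas
  view:
    default:
      root: .spack-env/view
      link: all   # link all packages into the view; 'roots' links only root specs
```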
lib/spack/spack/environment.py
```python
# Magic names
user_speclist_name = 'specs'
def_view_name = 'default'
```
Rename:

- `def_view_name` -> `default_view_name`
- `def_view_link` -> `default_view_link`
Also the comment "magic names" isn't the greatest comment. I'd put a descriptive comment on each of these.
Ok, but fair warning it's going to result in really ugly shenanigans to make lines fit under the character limit later. That's why I shortened them.
The shenanigans weren't as bad as I remembered, but it does hurt my soul to create a new variable just to have a shorter name.
Ok this all LGTM now.
Last request: refactor the commits to look intelligible, so we can rebase it. IMHO this is too much stuff to just squash into one commit.
Up to you how the features should break down, but the simplest thing might be the feature, tests, and docs. I know you have a few features (like the view linking preferences) already factored into commits -- keep those separate if you can.
- stack syntax in env schema
- switch environment specs over to SpecList object
- add stack functionality to environments
- handle definition extensions through stack.yaml and SpecList
- implement conditional definitions
- tests
I've refactored the commits here, and assuming the tests pass, this is good to merge.
- ensure that `Spec('foo').constrain('foo ^bar')` works
- prior to stacks implementation, this constraint would have done nothing.
This adds the notion of a default view and other views in environments
- Change old spec expressions to use Spack's new spec formatting syntax.
- remove redundant code in Environment.__init__
- use socket.gethostname() instead of which('hostname')
- refactor updating SpecList references
- refactor 'specs' literals to a single variable for default list name
- use six.string_types for python 2/3 compatibility
Bug relates to the interplay between:

1. random dict orders in python 3.5
2. bugfix in initial implementation of stacks for `_concretize_dependencies` when `self._dependencies` is empty
3. bug in coconcretization algorithm computation of split specs

Result was transient hang in coconcretization. Fixed #3 (bug in coconcretization) to resolve.
Resolves #8835
This PR requires an update after #10017 is merged. I've marked WIP for now.
Allow users to specify combinatorial sets of packages for Environments.
This feature is aimed at facilities deployment people, to enable Spack usage at the facility level.
From the documentation (with only slight formatting changes):
"
Entries in the `specs` list can be individual abstract specs or a spec matrix.

A spec matrix is a yaml object containing multiple lists of specs, and evaluates to the cross-product of those specs. Spec matrices also contain an `excludes` directive, which eliminates certain combinations from the evaluated result.
The following two Environment manifests are identical:
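As an illustration (package and compiler names are placeholders, not the original example), a matrix form:

```yaml
spack:
  specs:
    - matrix:
      - [zlib, hdf5]
      - ['%gcc', '%clang']
```

would be equivalent to listing the cross-product explicitly:

```yaml
spack:
  specs:
    - zlib %gcc
    - zlib %clang
    - hdf5 %gcc
    - hdf5 %clang
```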
Spec matrices can be used to install swaths of software across various
toolchains.
The concretization logic for spec matrices differs slightly from the
rest of Spack. If a variant or dependency constraint from a matrix is
invalid, Spack will reject the constraint and try again without
it. For example, the following two Environment manifests will produce
the same specs:
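For instance (a hedged sketch, not the original example): zlib has no MPI dependency, so the `^openmpi` constraint below would be rejected for it and retried without, meaning that:

```yaml
spack:
  specs:
    - matrix:
      - [zlib, hdf5+mpi]
      - ['^openmpi']
```

would concretize to the same specs as:

```yaml
spack:
  specs:
    - zlib
    - hdf5+mpi ^openmpi
```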
This allows one to create toolchains out of combinations of
constraints and apply them somewhat indiscriminately to packages,
without regard for the applicability of the constraint.
The last type of possible entry in the specs list is a reference.
The Spack Environment manifest yaml schema contains an additional heading, `definitions`. Under `definitions` is an array of yaml objects. Each object has one or two fields. The one required field is a name, and the optional field is a `when` clause.

The named field is a spec list. The spec list uses the same syntax as the `specs` entry. Each entry in the spec list can be a spec, a spec matrix, or a reference to an earlier named list. References are specified using the `$` sigil, and are "splatted" into place (i.e. the elements of the referent are at the same level as the elements listed separately). As an example, the following two manifest files are identical.
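For example (an illustrative sketch, not the original manifests), a reference splatted into the specs list:

```yaml
spack:
  definitions:
    - my_packages: [zlib, hdf5]
  specs:
    - $my_packages
    - cmake
```

is equivalent to:

```yaml
spack:
  specs:
    - zlib
    - hdf5
    - cmake
```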
N.B. A named spec list in the definitions section may only refer to a named list defined above itself. Order matters.
In short files like the example, it may be easier to simply list the
included specs. However for more complicated examples involving many
packages across many toolchains, separately factored lists make
Environments substantially more manageable.
Additionally, the `-l` option to the `spack add` command allows one to add to named lists in the definitions section of the manifest file directly from the command line.
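A hypothetical session (the environment name and spec are illustrative, and the exact flag syntax is an assumption based on the description above):

```
$ spack -e myenv add -l compilers '%clang'
```

This would append `%clang` to the named list `compilers` in the manifest's definitions section, rather than to the `specs` list.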
The `when` directive can be used to conditionally add specs to a named list. The `when` directive takes a string of python code referring to a restricted set of variables, and evaluates to a boolean. The specs listed are appended to the named list if the `when` string evaluates to `True`. In the following snippet, the named list `compilers` is `['%gcc', '%clang', '%intel']` on `x86_64` systems and `['%gcc', '%clang']` on all other systems.

N.B. Any definitions with the same named list with true `when` clauses (or absent `when` clauses) will be appended together.

The valid variables for a `when` clause are:

- `platform`. The platform string of the default Spack architecture on the system.
- `os`. The os string of the default Spack architecture on the system.
- `target`. The target string of the default Spack architecture on the system.
- `architecture` or `arch`. The full string of the default Spack architecture on the system.
- `re`. The standard regex module in python.
- `env`. The user environment (usually `os.environ` in python).
- `hostname`. The hostname of the system (if `hostname` is an executable in the user's PATH).
"
whenclause are:platform. The platform string of the default Spack architecture on the system.os. The os string of the default Spack architecture on the system.target. The target string of the default Spack architecture on the system.architectureorarch. The full string of the default Spack architecture on the system.re. The standard regex module in python.env. The user environment (usuallyos.environin python).hostname. The hostname of the system (ifhostnameis an executable in the user's PATH)."