Spack Environments (part 4): command-line, spack.yaml, and spack.lock#9612
citibeth
left a comment
This is looking good! I've left a lot of comments, etc. The summary and task list here is a succinct and specific version of what I have to say. I would suggest focusing mainly on this, and skimming over the other text for additional context, details, etc.
- Address requests for expanded explanations in documentation.
- If two included config files set the same key or specify the same package in the `packages.yaml` section, establish a precedence between them. Don't just quit with an error (as it worked in the past), which prevented any kind of overriding of package versions between configs.
- Make config and spec precedence follow the same order as listed in the `spack.yaml` file (i.e. earlier vs. later has higher precedence).
- Update precedence rules for packages loading in an environment so that explicitly named packages always take precedence over implicit packages.
- Generate a single module for an environment. Remove `spack env loads`, which always felt like a hack.
- If `spack env loads` is not removed, make sure that Spack module generation works within an activated environment. It should re-generate modules for just that environment.
- Integrate Spack Environment garbage collection: scheibelp#1
- Change YAML grammar to accommodate future inclusion of Spack Setup. See for example: https://github.com/citibeth/spack/blob/efischer/giss/var/spack/environments/twoway-dev-gibbs/env.yaml
- Move description of `spack.yaml` inside a repo into a separate section of the manual. I'm not sure it's really Spack Environments, even if it relies on some of the same infrastructure to implement.
- Make UI consistent. See explanation and additional tasks below.
UI Consistency
The current UI is inconsistent in a few ways:
- Some operations on a single environment exist as `spack env` sub-commands (eg: `spack env add`), whereas others exist as top-level commands (eg: `spack install`). Still others are top-level commands that take the environment name as an ad-hoc `-e` flag (eg: `spack cd`).
- Some `spack env` sub-commands duplicate functionality of a top-level command, but with a different name. eg: `spack -e <ENV> find` is essentially the same as `spack env status <ENV>`.
- Once a user has run `spack env activate <ENV>`, they are still required to (unnecessarily) give their env name again when they wish to modify it (eg: `spack env add`). Moreover, this requirement to re-state the env name exists for some things but not others.
The UI should be updated to conform to the following principles:
- Operations that deal with multiple environments, or managing or choosing environments (and only such operations), should be sub-commands of `spack env`.
- Operations that read/write a single environment should exist as a top-level command. They can be used in one of two ways, in a consistent manner: (a) `spack -e [ENV] <command>` or (b) `spack env activate [ENV]; ...; spack <command>`. Corollaries:
  - All `spack env` sub-commands should take an environment: eg `spack env <sub-command> [ENV]`. Exceptions are:
    - `spack env deactivate`, which complements `spack env activate`. There's no need for this to be a `spack env` sub-command, but it seems to make sense.
    - `spack env list`, which lists Spack Environments.
  - `spack -e [ENV] env <sub-command> [ENV]...` should always be an error, because there's no point in specifying the env name twice.
  - If `spack env <sub-command> [ENV]` is run within a `spack env activate` context, then the activated env should be ignored.
- If a top-level command makes sense either with or without an environment (eg: `spack install`), then the env and non-env version should be merged together in one command.
- It's OK if a top-level command ONLY makes sense with an environment (eg: `spack add`). Running it without first activating an env (either through `spack env activate` or through the `spack -e` prefix) should raise a runtime error.
- We should think long and hard whenever a `spack env` sub-command has the same name as a top-level command. If their functionality is the same, should they just be merged? If their functionality is different, will it be confusing to users? For example, `spack activate` and `spack env activate` are two totally different things.
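The principles above can be read as a small set of dispatch rules. The following Python sketch is only an illustration of those rules, not Spack's actual argument handling: `ENV_ONLY_COMMANDS`, `UsageError`, `resolve_env`, and `check_command` are hypothetical names, the command set contains only examples mentioned in this comment, and letting `-e` win over an activated environment is an assumption.

```python
# Hypothetical sketch of the proposed UI rules; not Spack's real CLI code.

# Top-level commands that only make sense inside an environment
# (examples taken from this comment).
ENV_ONLY_COMMANDS = {"add", "remove", "concretize"}


class UsageError(Exception):
    """Raised when a command line violates one of the proposed rules."""


def resolve_env(dash_e=None, activated=None):
    """Pick the environment for a top-level command.

    `dash_e` is the value given to `spack -e ENV`; `activated` is the
    environment set by `spack env activate ENV`. Assumption: an explicit
    -e wins over an activated environment.
    """
    return dash_e if dash_e is not None else activated


def check_command(command, dash_e=None, activated=None, subcommand_env=None):
    """Return the environment a command should act on, or raise UsageError."""
    if command == "env":
        # `spack -e ENV env <sub-command> [ENV]` names the env twice: an error.
        if dash_e is not None:
            raise UsageError("do not combine 'spack -e ENV' with 'spack env'")
        # `spack env` sub-commands use their own [ENV]; the activated
        # environment is ignored.
        return subcommand_env
    env = resolve_env(dash_e, activated)
    if command in ENV_ONLY_COMMANDS and env is None:
        raise UsageError("'spack %s' requires an environment" % command)
    return env
```

With these rules, `check_command("add")` raises an error, while `check_command("install")` is allowed both with and without an environment.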
With that in mind, I request the following changes:
- Make sure you raise an error if `spack env destroy [ENV]` is called within a `spack env activate [ENV]` context. You should really de-activate your env before destroying it.
- Consider that users can/will put stuff that Spack doesn't understand in a Spack Environment directory. `spack env destroy` should do an `ls` of the entire directory and ask the user to confirm deletion.
- Consider renaming `spack env destroy` to `spack env rm`.
- Get rid of `despacktivate`. That's a heavyweight solution to a non-problem. Users who want it can do `alias despacktivate='spack env deactivate'`.
- Merge `spack env status [ENV]` into the top-level command `spack find`.
- Consider renaming `spack find` to `spack status`, since it's always been highly confusable with `spack list`.
- Move `spack env add SPEC` to the top-level command `spack add SPEC`. Same for `spack env remove`. It's an error if these are run without an environment.
- Remove `spack env install`; this functionality is already in `spack install`.
- Move `spack env concretize` to the top-level command `spack concretize`. It's an error if it's run without an env.
- Merge `spack env loads` into the existing "top-level" `spack module loads`.
- Consider moving `spack module loads` to a top-level `spack loads`, since that functionality has always been in a strange place.
- Move `spack env stage [ENV]` to `spack stage`.
- Merge `spack location -e ENV` into `spack location` (i.e. it can be run, among other ways, with `spack -e ENV location`).
- Merge `spack cd -e ENV` into just `spack cd`.
- Merge `spack env uninstall` into just `spack uninstall`.
lib/spack/docs/environments.rst
Outdated
| ncview: # Highest precedence | ||
| netcdf: | ||
| nco: # Lowest precedence | ||
| py-sphinx: |
Here we have a situation where configs take precedence if they're listed later, but specs take precedence if they're listed earlier. This is clearly an oversight in the API, which needs to be fixed. I would suggest making spec precedence work the same way as config precedence, since config precedence is an already-merged feature elsewhere.
lib/spack/docs/environments.rst
Outdated
| #. Spack Environments may contain more than one version of the same | ||
| package; but only a single module for a package may be loaded. | ||
| Modules that occur in earlier specs listed in an environment take | ||
| precedence over modules that occur later. |
Nobody would (knowingly) put two versions of the same package in an environment. So explain why this would be a good thing: because your environment contains A and [email protected], and because A->[email protected].
In this case, it seems that the precedence rules are incomplete. If the user has placed A and [email protected] in an environment, then they would expect that A and [email protected] would both be loaded, regardless of which order they are listed in the file. Users might not even know that A depends on [email protected], and will be mystified when they get [email protected] instead of [email protected].
Of course... if the same package is explicitly listed twice, then we SHOULD choose between them based on order. Similarly, if the same package is implicitly listed twice, there should also be an order-based precedence. But explicitly listed packages should always take precedence over implicitly listed packages.
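A minimal Python sketch of the precedence rule proposed here may help. It is not Spack code: `choose_modules` is a hypothetical helper, specs are reduced to `(name, version)` tuples, and the version strings used with it are made up for illustration.

```python
def choose_modules(roots):
    """Choose one version per package name for loading.

    `roots` is an ordered list of (explicit_spec, dependencies) pairs. Each
    spec is a (name, version) tuple, and `dependencies` lists the specs
    pulled in implicitly by that root. Rules, as proposed in this comment:
      1. explicitly listed packages always beat implicitly listed ones;
      2. within the same class, the spec listed earlier wins.
    """
    chosen = {}
    # Pass 1: explicit specs in file order; the first occurrence wins.
    for (name, version), _deps in roots:
        chosen.setdefault(name, version)
    # Pass 2: implicit specs only fill in names that are still missing.
    for _spec, deps in roots:
        for name, version in deps:
            chosen.setdefault(name, version)
    return chosen
```

If an environment lists `A` (which depends on one version of `B`) and also lists a different version of `B` explicitly, the explicitly listed `B` is chosen regardless of the order of the two entries.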
This is in the old docs but it's still true.
Comments based on the Docs in the main PR
If you do all your work in Spack Environments, then it is now possible to garbage collect unnecessary packages. This is important on shared supercomputers with inode limits. BUT... Spack has to know about all the environments it is supposed to keep around. Creating environments outside of Spack will break garbage collection (as would using Spack without environments). The idea of creating an environment outside of Spack is an obvious feature; but I would want to have a clear idea of why we need it??? In any case, the downside of this feature should be documented.
This should have been said right at the top with
I like that the former contents of the
It might be better to just describe Spack Environments; and then in a later place, relate Spack Environments to how other package managers do environments. Many of us are not familiar with other package managers, and I think it weakens our brand to always compare ourselves to them. I find the seemingly random references to other package managers to be jarring.
Good
Why does
Would use of this feature limit portability of Spack Environments between different systems? Actually I suppose that any path-based features could... Something should be said somewhere that when you port a Spack Environment, the paths need to be manually checked / updated. Actually, there probably needs to be a whole section on porting environments, once we (someone) have gotten practice with it.
Can this do paths based on env vars, eg:
Note the change in syntax (YAML grammar) required to accommodate Spack Setup (coming in a future PR on the heels of this one).
Wouldn't it be great if this could be the case for the rest of Spack as well and we didn't have to separate our (now numerous) configs into 5 different files each...????
We should expand more on why this is a good idea. You can fork Spack, then add your own environments to it. Now you have a single turn-key Spack solution for your users. They clone your fork and install the environment(s) you included in it. They don't have to do any crazy setup or configuration. When you want to upgrade your users, you can do it in a controlled manner by updating your environment(s) in the repo, pulling from the latest Spack upstream, etc. And you can test it all before deploying your updated fork to users.
Woah... I feel we just moved into an entirely new feature space, this is not really Spack Environments anymore. So... it looks like
Isn't this normally what happens when you install a package, via What would make more sense to me is if third-party repos can ship a combination of Spack Configs and Spack Environments, allowing them to set up a recommended configuration of software versions for their software (as opposed to trying to encode all that in In general... I think this feature needs more thought, more documented/expected use cases, etc.
It seems from the docs this is not a lockfile, so why is it named like one? Is this just the old
In which directory will that environment be created? Does it copy the
Again... no need to keep referencing other environment managers. It just confuses those of us who are not familiar with them.
Suppose I do:
Question: Will comments from (1) be preserved in (4)? If not... is there a procedure that allows for this to happen?
Huh?
It is inconsistent if activating an env with views includes the env into your
These are solid improvements.
I believe this is heading in the right direction. If functionality is missing, it should be possible to add/tweak it in the future, without radically altering the UI.
I feel we have confusion reigning. We now have the following ways to run something (including maybe a shell) with different env var settings:
Meanwhile, PS: Does |
|
I'll just mark the comments on the docs resolved -- the rst docs still need updating, and the PR is supposed to be a high-level version of what they'll look like. |
|
Yes... I would take the "requested changes" section seriously, the rest is commentary. |
There are basically two reasons to do this:
FWIW, the managed package thing also comes up with spack chains (#8772). We decided to ignore it there but we plan to have commands that will copy needed packages into a sub-spack, instead of just linking. We could do something similar for environments but I have not thought it through. The model that comes immediately to mind is npm's local vs. global modules. |
but I like the comments! |
|
@citibeth <https://github.com/citibeth>: FYI, I really want spack activate
to load the entire environment *without* modules, so that you can use
Spack the same way everywhere without a lot of bootstrapping. So I
basically agree with this and think virtual envs are a better interface
than piecemeal modules.
Do you mean `spack activate` or `spack env activate`? (I'll assume you
mean `spack env activate`...) IMHO there's nothing wrong with this. BUT
we seem to be loading two things into `spack env activate`, which we might
want to keep separate (or allow the user to separate):
1. Enter into an environment context that allows us to add, concretize,
etc. stuff in a Spack Environment.
2. Load the results of an already-built Spack Environment.
Meaning (2) is not always desired; especially when a Spack Environment is
not yet fully built.
Another issue is that I've encountered a LOT of people who are intimidated
by Spack. Tell them they have to run the `spack` command to load your
environment, and you will lose audience. Tell them that you've created
this nifty module named `myenv` that they can load with their trusted
`module load` command --- and they will be very happy.
Once an environment has been built, it should always be possible to use it
without running Spack.
|
|
Yep I mean
I think this can be an option on activate (activate with and without loading the results). I agree you want this for things like x-compiled environments.
I think it would be easy to also generate modules for environments, with corresponding names, which would provide a familiar interface for HPC people. |
|
1. I agree we should document that they can't be gc'd (though we don't
have a gc yet 😄).
We do have a GC, it's just sitting in the queue waiting to get merged:
scheibelp#1
In any case, garbage collecting with an external environment shouldn't be
so hard. For example: External environments get "registered" and
"unregistered." Any registered environments participate in GC. External
environments are automatically registered when created. If the external
environment is deleted but not explicitly unregistered, Spack will act like
it's been unregistered. It will produce a warning, but not unregister the
environment itself.
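The register/unregister scheme described above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not an existing Spack interface: `EnvRegistry` and its methods are hypothetical names, and environments are identified simply by their directory path.

```python
import os
import warnings


class EnvRegistry:
    """Hypothetical registry of external environment directories."""

    def __init__(self):
        self._paths = set()

    def register(self, path):
        # Would be called automatically when an external environment is created.
        self._paths.add(path)

    def unregister(self, path):
        self._paths.discard(path)

    def live_environments(self):
        """Return registered environments that still exist on disk.

        A registered path that has been deleted is treated as unregistered
        for this run and produces a warning, but it stays in the registry.
        """
        live = []
        for path in sorted(self._paths):
            if os.path.isdir(path):
                live.append(path)
            else:
                warnings.warn("registered environment is missing: %s" % path)
        return live
```

A garbage collector would then keep every package referenced by any environment in `live_environments()` and consider the rest for removal.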
1. You need this feature to support sticking a spack.yaml in a project
repo, and I'm trying to leverage what we're doing for environments for UI
*and* developer workflow.
Ahh... that's a good thing. I look forward to seeing how some of this,
and the scripting, turns out. I never imagined scripting environments; to
me, the environment is already a script. But I suppose someone will find
it useful.
…-- Elizabeth
|
|
On Tue, Oct 23, 2018 at 12:09 PM Todd Gamblin ***@***.***> wrote:
Yep I mean spack env activate. I really want to get rid of spack activate
once environments are fully fleshed out.
I'm all for getting rid of features we no longer need. So... supposing we
have Views for environments. Then would we get rid of non-environment
Spack Views and `spack activate` (and also non-environment `spack module
loads`?)
After that is done... I'll do this with respect to Python. It seems the
old `spack activate` is now equivalent to an environment in which things
that `extends(python)` are placed into it. Maybe there's an automated way
to do this: somehow set up a `python` environment, and then automatically
add any Python package you install into it, even if you didn't
explicitly specify Spack Environments.
More generally would be the concept of auto-environment, in which certain
packages get automatically added to certain environments according to a set
of rules when you run plain `spack install`. Not sure if this is a good
idea or a terrible idea. But I probably wouldn't use it.
I think it would be easy to also generate modules for environments, with
corresponding names, which would provide a familiar interface for HPC
people.
Me too, I agree. I would deprecate the `spack module loads` / `spack env
loads` stuff then.
…-- Elizabeth
|
scheibelp
left a comment
Some preliminary comments, mostly on documentation.
I do have one request to change the code though: Database.query IMO should stay the same as it is and functions using it should take care to filter after the query.
For other folks reviewing this, FYI that several files were simply renamed, which accounts for at least 700 or so lines of the diff. This includes
- `spack/util/environment.py` (moved from `spack/environment.py`; this may in particular be confusing because now `environment.py` is used to manage the Spack environment introduced in this PR)
- `spack/cmd/build_env.py` (renamed from `spack/cmd/env.py`)
- `spack/test/cmd/build_env.py`
lib/spack/docs/environments.rst
Outdated
|
|
||
| $ spack env myenv create | ||
|
|
||
| Spack then creates the following files: |
IMO this section is more detail than what should initially be presented to the user for using environments. This is the first time they are seeing commands. Ideally the most-common 5 or so commands should all fit together in one (non-huge) screen. Listing the files is important but can be done later.
lib/spack/docs/environments.rst
Outdated
| The following files may be added to this directory by the user or | ||
| Spack: | ||
|
|
||
| * ``env.yaml``: Additional environment specification and configuration |
At whatever point we get into this level of detail, it is worth being explicit that env.yaml controls normal Spack configuration that is customized on a per-environment basis. i.e. I want to distinguish the customization of the environment with customization of Spack associated with the environment. This should also be mentioned in the section about advantages.
Also config/ is missing
lib/spack/docs/environments.rst
Outdated
|
|
||
| All environments are stored in the ``var/spack/environments`` folder. | ||
|
|
||
| The following files may be added to this directory by the user or |
"this directory" refers to var/spack/environments since that is the most recent object. It can be clarified with: The following files may be added to var/spack/environments/myenv
lib/spack/docs/environments.rst
Outdated
| .. note:: | ||
|
|
||
| #. If ``env.yaml`` exists, then Spack will no longer automatically | ||
| load from the default environment ``config/`` directory. This is a |
It might be a bug but stating it as such is more distracting than helpful here. Just suggesting the fix is sufficient. We can add a TODO or issue to address this later.
lib/spack/docs/environments.rst
Outdated
| elsewhere. | ||
|
|
||
|
|
||
| Initializing an Environment from a Template |
The functionality referred to in this section is gone, so this section should be removed.
lib/spack/llnl/util/lang.py
Outdated
| Args: | ||
| dicts (list): list of dictionaries | ||
|
|
||
| Return: (dict): a new ``dict`` ``update()``'d with each ``dict`` in |
This is a bit literal. I'd prefer return: a combination of the dictionaries
Also: mention which dict takes precedence if the same key is defined in multiple dictionaries.
This is still too literal IMO, and still doesn't mention precedence.
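For illustration, a docstring along the requested lines might read as in the sketch below. The function name `merge_dicts` and the body are assumptions, since the actual function in `lib/spack/llnl/util/lang.py` is not shown in this thread; the "last one wins" rule follows from applying `update()` with each dictionary in list order.

```python
def merge_dicts(dicts):
    """Combine a list of dictionaries into a single new dictionary.

    Args:
        dicts (list): list of dictionaries, in order of increasing precedence

    Returns:
        (dict): a combination of the dictionaries. If the same key is
        defined in more than one dictionary, the value from the dictionary
        that appears last in ``dicts`` takes precedence.
    """
    result = {}
    for d in dicts:
        result.update(d)  # later dictionaries overwrite earlier keys
    return result
```

For example, `merge_dicts([{"a": 1, "b": 2}, {"b": 3}])` keeps `a` from the first dictionary and takes `b` from the second.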
|
|
||
| _arguments['recurse_dependents'] = Args( | ||
| '-R', '--dependents', action='store_true', dest='dependents', | ||
| help='also uninstall any packages that depend on the ones given ' |
This help message appears to be a bit specific given that it appears in common/arguments.py
| # TODO: like installed and known that can be queried? Or are | ||
| # TODO: these really special cases that only belong here? | ||
|
|
||
| # TODO: handling of hashes restriction is not particularly elegant. |
IMO environment-constrained queries can just filter the list of specs after the fact. In particular I care because I know of at least one other PR touching this interface: #8772 (Spack chain).
@tgamblin I still think Database should not be modified in this PR. I expanded on that at #9612 (comment)
@scheibelp: I didn't see a counterargument to the performance point I made. The reason I think this should be in query is because it really is just another constraint (subset of specs) and query is a general query function.
If you keep this outside query, then every query does as many spec compares as there are installed packages. If you push the hash comparison into query(), then we can quickly eliminate specs that aren't in the set of hashes, and only do spec comparisons on the things in the subset of hashes.
I could see having two query functions if they were separable and you could put the fast one first, but right now that's not possible without a bigger refactor of Database. Thoughts?
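The performance argument can be illustrated with a toy model in Python. This is not the real `Database.query`: the two functions, the `matches` callback, and the use of plain strings as specs are all hypothetical, and the counter only stands in for the cost of a spec comparison.

```python
def query_filter_after(installed, matches, env_hashes):
    """Compare every installed spec, then filter by environment hash."""
    compared = 0
    results = []
    for spec_hash, spec in installed.items():
        compared += 1  # one (expensive) spec comparison per installed package
        if matches(spec) and spec_hash in env_hashes:
            results.append(spec)
    return results, compared


def query_filter_first(installed, matches, env_hashes):
    """Skip specs outside the environment before doing the spec comparison."""
    compared = 0
    results = []
    for spec_hash, spec in installed.items():
        if spec_hash not in env_hashes:
            continue  # cheap set lookup, no spec comparison needed
        compared += 1
        if matches(spec):
            results.append(spec)
    return results, compared
```

Both functions return the same specs; the second performs only as many spec comparisons as there are hashes in the environment.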
I'd rather this were refactored into 2-3 functions (in a separate PR) and that it stayed as simple as possible for now (even at the cost of find being a bit slower than it could be in an environment):
- A function with the bulk of this logic that doesn't use `self._data` and just looks through a set of specs that it is given
- A function that runs the first function with a set of specs
Then you can call the second function with `self._data` or "the set of specs in an environment"
IMO this is the sort of speedup that could be added later: I'd prefer to reduce complexity added by this PR since it adds quite a lot already.
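One possible shape for that refactor, sketched in Python with stand-in types: the `Database` class here is a minimal placeholder rather than Spack's real class, and `query_specs` and `query_environment` are hypothetical names.

```python
def query_specs(specs, predicate):
    """Core query logic: works on any iterable of specs, with no database state."""
    return [spec for spec in specs if predicate(spec)]


class Database:
    """Stand-in for the real database; only what the sketch needs."""

    def __init__(self, data):
        self._data = data  # maps hash -> spec

    def query(self, predicate):
        # Public interface unchanged: query everything that is installed.
        return query_specs(self._data.values(), predicate)


def query_environment(db, env_hashes, predicate):
    """Query only the specs that belong to an environment."""
    env_specs = [db._data[h] for h in env_hashes if h in db._data]
    return query_specs(env_specs, predicate)
```

Under this split, environment-specific code never needs to change `Database.query` itself, which keeps the interface stable for other PRs that touch it.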
|
@citibeth I think there are several requests you make that can be done later:
Potentially useful but this can be done later
This should happen after this PR is merged. GC is a great feature but not essential for core environment functionality. Comments on UI consistency:
I don't think this is true: if an environment is active, That being said, there is the possibility of inconsistency for To make this consistent with other commands, the @tgamblin on that count, why was -e added to Spack's main.py vs. managed as a common arg (added to cmd/common/arguments.py)? I'm guessing it has something to do with config initialization but it would be helpful if you stated it here.
I think that in different cases each of these approaches makes sense. This depends on whether:
Regarding |
|
EDIT: This is an awesome feature! Thanks to everyone that helped bring it to life.
I don't know the origin of the lock, but I know Gem and NPM use "lock files" to store the concrete package versions installed. I agree that the naming is confusing, and it might be best to break with convention here and name it something else. Like maybe spack.concrete?
I agree that the
👍 👍
👍 👍 (The stuff below is largely an echo of @scheibelp's comment above)
I wouldn't hold up this PR waiting on this feature to be implemented. I think this should be a follow-on PR.
Not that I have any great alternatives in mind, but this smells off to me. It seems weird to have top-level commands that only work inside an environment. These commands seem ripe for sticking under More specifically, I have an opposition to adding Sort of related: does anyone have a spack "cheat sheet" of commands? Having them all laid out in one place and organized by topic could be useful for this kind of discussion. |
|
On Tue, Oct 23, 2018 at 7:51 PM Peter Scheibel ***@***.***> wrote:
@citibeth <https://github.com/citibeth> I think there are several
requests you make that can be done later:
Great. Things that can be done later... I'll pull them off this PR once
it's merged.
If spack env loads is not removed, make sure that Spack module generation
works within an activated environment. It should re-generate modules for
just that environment.
spack env loads only loads modules for specs in the environment. It
doesn't affect module generation (or regeneration). If there is
module-specific configuration, then it occurs to me that the module files
for different environments will all be merged into the same directory (with
potentially different schemes). IMO this can also be handled later.
Sorry I was not clear about what I need. I need `spack module refresh`,
run from within an environment (eg: `spack -e <ENV> module refresh`), to
regenerate modules for *just files used in that environment*, instead of
everything Spack has ever built. I imagine this should be a really easy
feature to add.
Comments on UI consistency:
Once a user has run spack env activate , they are still required to
(unnecessarily) give their env name again when they wish to modify it (eg:
spack env add)
I don't think this is true: if an environment is active, spack env add
<spec> will add the spec to the current active environment (see
cmd/env.get_env).
Sorry... the issue I mention here is still an issue, but I pointed to
something that is not an example of it. `spack env concretize [ENV]` would
be the right example here. There are too many ways to do things now, some
of them self contradictory:
1. spack env activate foo; spack env concretize bar
2. spack env activate bar; spack env concretize
3. spack -e foo env concretize bar # What does this even mean???
Or:
1. `spack location -e ENV`
2. `spack -e ENV location`
That being said, there is the possibility of inconsistency for spack env add:
the spack env add command has a -e option,
A few points here:
1. You have already put a global `-e` option on the whole Spack command.
I can say `spack -e ENV subcommand ...` for any command, anywhere. Given
that, a second `-e` option on commands (or sub-commands) is redundant and
confusing, and invites nonsense (eg: `spack -e foo cd -e bar`). I don't
see any reason why ANY command has an `-e` option.
2. Why do we say `spack install` but `spack env add`? This is completely
arbitrary. What does adding `env` on the command line add, other than
requiring the user to type 4 extra characters and remember which commands
do / do not require `env`? If `spack install` installs into the currently
selected environment, then `spack add` should add into the currently
selected environment. You go down this logic, and realize that a lot of
things, even environment-related things, should probably just be top-level
commands.
@tgamblin <https://github.com/tgamblin> on that count, why was -e added
to Spack's main.py vs. managed as a common arg (added to
cmd/common/arguments.py)?
Because `spack -e FOO command...` is the same as `spack env activate FOO;
spack command`. You can do the latter for any command, even when the
command itself is insensitive to environment (eg `spack list`). So there's
really no harm in being able to set the environment for any command using
`spack -e`, even if the command doesn't use it. In particular, if I always
work with environment FOO, I can add `alias spack='spack -e FOO'` to my
`.bashrc` and then everything I do will work as if the FOO environment is
all there is, and I won't be bothered by all the other installed packages
hanging around Spack. If `spack -e FOO command` bombs out for some
commands, this simple alias no longer works.
In other words: having a common arg is both easier and more powerful.
I'm guessing it has something to do with config initialization but it
would be helpful if you stated it here.
Some operations on a single environment exist as spack env sub-commands
(eg: spack env add), whereas others exist as top-level commands (eg: spack
install). Still others are top-level commands that take the environment
name as an ad-hoc -e flag (eg: spack cd).
I think that in different cases each of these approaches makes sense. This
depends on whether:
- a given action only makes sense within an environment
- an existing Spack command could be modified to operate in the
context of an environment
I understand this logic here. But it is requiring the user to remember a
distinction that the user really doesn't care about. If I routinely work
within an environment (as most Spack Environment users will do), then I
don't really care that *some* of the commands I'm using wouldn't work in
non-environment Spack.
If I *don't* work with an environment, then I *do* need to know that some
top-level commands require an environment. But I believe this is better
accomplished by putting such commands in an "environment only" category in
`spack help`.
I am suggesting a different way to decide which commands should / shouldn't
be part of `spack env`: Commands that manipulate between environments are
under `spack env`. Commands that read or modify a single environment are
top-level. It's like the difference between messing around with
files/folders in Macintosh Finder (`spack env COMMAND`) vs. double-clicking
one of those files to launch TextEdit and view/edit that single file
(top-level `spack COMMAND`). I believe that this distinction is more
relevant to the user, and easier to remember, than the current one.
I think having two top level commands spack add and spack install would
be confusing.
Well... currently we have `spack env install` and `spack install`, and they
do (or should do) exactly the same thing. That is confusing. It also
introduces possible consistency problems going forward, unless the two
commands call each other and use the same code under the hood.
spack -e <env> install vs. spack env install vs. spack env install <env
name>:
You forgot `spack -e <env1> install <env2>`. Which would be an error?
That requires more code to check for the error; whereas it's less code to
just not allow the user to write nonsense to begin with.
spack install exists separate from environments and intends to install a
single spec; it can take on additional meaning in the context of an active
environment. spack env install installs all specs in an environment and
depends on activation of the environment; spack env install <env name>
does the same without requiring activation of the environment.
OK yes, I am suggesting some overloading of `spack install`. But it makes
sense to me:
1. No environment, `spack install <spec>`: Concretizes and builds a single
spec.
2. No environment, `spack install`: Error
3. With environment: `spack install <spec>`: Concretize and build a single
spec, and add it to the environment.
4. With environment: `spack install`: Concretize (if not already) and build
the entire environment.
I think people will find (4) quite natural. Either they'll do it after
`spack env activate`, or they'll do it with `spack -e FOO install`. Both
make sense to me.
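The four cases can be written down as a small decision function. This Python sketch only describes the proposed behaviour as strings; `install_action` is a hypothetical helper and does not call any Spack code.

```python
def install_action(env=None, spec=None):
    """Describe what `spack install` would do in each of the four cases."""
    if env is None:
        if spec is None:
            # Case 2: no environment and no spec is an error.
            raise ValueError("spack install: a spec is required outside an environment")
        # Case 1: no environment, one spec.
        return "concretize and build %s" % spec
    if spec is not None:
        # Case 3: environment plus a spec.
        return "concretize and build %s, then add it to environment %s" % (spec, env)
    # Case 4: environment, no spec.
    return "concretize (if needed) and build all of environment %s" % env
```

The `env` argument would come from either `spack env activate` or `spack -e FOO`, so both ways of selecting an environment reach the same code path.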
Of the 3, only the 3rd is redundant (with the 2nd), but it is also the one
that is the least important and the easiest to control.
We should aim to have zero redundancies. Spack already has a LARGE number
of commands that aren't used very often. I don't want it to have extra
commands that do the same thing "just because."
In "Spack Environments Round 1", I went through this evolution:
1. Create a new sub-command under `spack env` (eg `spack env install`).
Get it working
2. Realize I need some features of the main `spack install` that I don't
have here. Start pulling them over.
3. Realize that there are a LOT of features in `spack install` that I'll
need to pull over. Start factoring things out so I can share code between
`spack env install` and `spack install`.
4. Realize it would be simpler and cleaner to just have one `install`
command. Then I don't have to do clever factoring / code reuse tricks.
Regarding spack cd: Other than spack cd, all Spack commands customize
their environment by setting spack -e <env> command ... rather than spack
command -e <env> .... I think spack cd is a special case.
Two things:
1. `spack location -e` also has this property.
2. What is the justification for a special case? Why does the user have to
remember that a few special cases use an `-e` flag after the sub-command?
And what happens if the user writes `spack -e FOO cd -e BAR`???
This is already something of a catch-all command: it lets you relocate
your CWD to various locations that are relevant to Spack (it is based on spack
location). Unlike spack install it isn't necessarily related to the
current environment. For example spack cd -P takes you to the package
repository directory.
I see. Well, I don't know what to think of it. It's probably something I
would use rarely if ever. Because `cd` works just as well, and is standard
Unix, and that's one fewer Spack command I have to remember.
- Elizabeth
|
|
On Tue, Oct 23, 2018 at 9:50 PM Stephen Herbein ***@***.***> wrote:
It seems from the docs this is not a lockfile, so why is it named like
one? Is this just the old environment.json file rehashed?
I don't know the origin of the lock, but I know Gem and NPM use "lock
files" to store the concrete package versions installed. I agree that the
naming is confusing, and it might be best to break with convention here and
name it something else. Like maybe spack.concrete?
I have no further opinions on this issue. It's just something that stuck
out to me.
Consider renaming spack find to spack status, since it's always been
highly confusable with spack list.
I agree that the list vs find naming is very confusing, but based on the
open issue (#4159) I don't
think there is consensus on what to change them to. I would recommend
against changing the names of any top-level commands in this PR, as that
would be a large breaking change. Maybe spack env status could become spack
env find if consistency is of utmost importance.
Or better yet... `spack find`, when run from within a Spack Environment,
would do what `spack env status` currently does. And we can put off
renaming `spack find` for another day.
(The stuff below is largely an echo of @scheibelp's comment above)
Integrate Spack Environment garbage collection: scheibelp#1
I wouldn't hold up this PR waiting on this feature to be implemented. I
think this should be a follow-on PR.
I didn't put it in Spack Environments #3 because that PR was "about to be
merged." Now 6 months later it's been re-done and changed a lot. Had I
known that, I would have thrown it in the bin 6 months ago. At this point,
I have no idea what schedule is in mind for merging Spack Environments.
But either way... if this PR actually gets merged soon, then tackling GC in
a later PR should be no big problem.
Move spack env add SPEC to the top-level command spack add SPEC. Same for
spack env remove. It's an error if these are run without an environment.
Move spack env concretize to the top-level command spack concretize. It's
an error if it's run without an env.
Not that I have any great alternatives in mind, but this smells off to me.
It seems weird to have top-level commands that only work inside an
environment. These commands seem ripe for sticking under spack env.
See discussion above. Consider how it will seem to you a month down the
line, after you realize that Spack Environments are great and you convert
all your workflow to them, and you've stopped worrying about whether
something is an "environment" command or not. And by then you're just
pretending that what you can see in your environment is ALL of Spack; or at
least ALL of your Spack installation that you personally care about.
More specifically, I have an opposition to adding spack add as a top-level
command. Both spack add and spack install, which semantically are very
similar, would be top-level commands. I could see having both as top-level
commands being confusing to new users, just as list/find and clean/purge
can be confusing.
We already have `spack help` for new users, and `spack help --all` for
power users. Top-level Spack commands requiring a Spack Environment would
only be seen under `spack help --all`, and would go under a `Spack
Environments` section. Something like this:
```
Spack Environments:
env Create, destroy and manage environments
add Add a package to an environment
```
If we allow commands to be listed twice under `spack help --all`, then we
could even have ALL environment-relevant stuff in one section under the
help. Something like this is very clear to me:
```
Spack Environments:
[Except for env, all commands in this section require the spack -e flag]
env Create, destroy and manage environments
add Add a package to an environment
install Add, concretize and build a package in an environment
```
And so on...
Sort of related: does anyone have a spack "cheat sheet" of commands?
Having them all laid out in one place and organized by topic could be
useful for this kind of discussion.
Run `spack help --all`
|
|
I wanted to add a note on "lockfiles". The manifest/lockfile model has already become a thing -- it refers to locking the resolved versions of dependencies in most of the language-specific package managers out there right now. Here are some references:
Honorable mentions:
I went with I do think we did better than these package managers by having notions of "abstract" and "concrete" specs and "concretization". I think that's a clearer concept than "dependency resolution", which is what most of the other tools call it. Dependency resolution typically only refers to setting specific versions, while concretization does that and compilers, variants, flags, etc., and I think that's an important distinction. I don't think it's hard or unintuitive to say "the |
|
@tgamblin I think the most important outcome so far on the UI discussion is that the extra -e option should be removed from all (less important) I argue below that
You can already activate the environment-aware version of any spack command by first doing
I didn't. That command is not valid. I do think though that users may end up getting confused about this unless one of the environment specification mechanisms is removed.
The argument you are quoting in #9612 (comment) agrees with your response, which appears to be phrased as though it is a counterargument, so I am confused. |
|
Relatively speaking, the rest of this is less important.
I'm not sure if the arguments behind my change requests have been
considered seriously. For example, I was hoping for more engagement /
discussion of top-level vs. env subcommands. I still think users will find
it confusing, and that we will regret the current setup --- most likely
some time after we are able to do much about it.
The argument for the current way of doing things is that it helps users
distinguish which commands are / are not environment-sensitive. I agree
that people implementing Spack Environments care about that; but it is a
distinction that *users* will not care about. Baking it into our UI will
make Spack harder to use. We need to get out of the implementor mindset
and think of it from the users' point of view.
To see why users won't care about this distinction, imagine two typical
kinds of users with respect to Spack Environments:
1. User thinks "maybe I should use Spack Environments." So they set one up
and enter it; either through `spack env activate FOO` or with `alias
spack='spack -e FOO'`. Once they've done that, the fact that they set up an
environment --- or that there might be other users doing other stuff on the
same Spack --- fades into the background. At this point, non-environment
sensitive commands become *commands that would still be useful even if they
were not using an environment*. Because the user IS using an environment,
so they really don't care about that hypothetical. The
environment-wielding user just wants to know what commands work, given the
choices they've made.
2. User does NOT use environments. We definitely do not want to confuse
them with environment-sensitive commands that they cannot use. But just
making a command top-level does NOT automatically put it under the nose of
all users. Users discover commands either by reading the manual, or
through `spack help`, or through `spack help --all`. The manual should
have a section on Spack Environments, so only users who want to use
environments will be introduced to environment-sensitive commands.
Similarly, env-related commands will be shown in `spack help --all`, and
not `spack help`. In `spack help --all`, all env-related commands will be
placed together, alerting non-env-using users to ignore them.
Or put it another way... I think Spack Environments will be big. I think
after a "break-in" period, we will all use them. One could even imagine a
future where EVERYTHING happens in an environment: even if the user does
nothing, then a default "global" environment would be set up. If this is
really going to be a central feature of Spack, then we should not be
shoe-horning it into a sub-command. Especially when we don't have to.
Spack Environments would be better thought of as a "spice" that can
potentially pervade all aspects of Spack. Over time, I think that will be
increasingly true.
|
|
At this point, I'm probably the Spacker with the most experience using
Spack Environments (the unmerged version). I currently maintain 4
variations of one environment, and 1 variation of another environment.
Both have about 100 packages in them.
My overall impression of that iteration of Spack Environments was that
basically everything was there but it was a little awkward to use. In many
ways, the `spack env` subcommands replicated the top-level commands ---
except for where they didn't, in surprising ways. The experience wasn't
seamless.
I concluded that the best, most seamless approach to Spack Environments would
be to build them into the top-level commands. So doing stuff IN an
environment should be as similar as possible to doing stuff NOT in an
environment. This seamlessness will also help uptake/adoption of
environments, because users won't have to re-learn stuff. They'll just
have to learn a few more commands they didn't necessarily use before.
|
|
Another possible change would be to get rid of `spack env add` / `spack
add` and turn it into an option on an env-sensitive `spack install`.
Again... the procedure of "add", "concretize", "install" all makes sense
and users should be able to control them separately. But once I became
comfortable with Spack Environments, I wanted to type fewer commands to get
things working. I found myself typing too many add/concretize/install
commands in succession, when maybe one command would have sufficed.
My guess is that if we do things right, then env users will mostly just do
`spack install` as they always have, and not worry too much about the steps
going on under the hood. `spack add` will be used only occasionally, and
can be folded into `spack install` with flags that cause it to just add to
the environment, and not concretize/install immediately. Similarly,
someone might want to add-and-concretize, but not build.
|
- uninstall now:
  - restricts its spec search to the current environment
  - removes uninstalled specs from the current environment
  - reports envs that still need specs you're trying to uninstall
- removed spack env uninstall command
- updated tests
- `spack env status` used to show install status; consolidate that into `spack find`.
- `spack env status` will still print out whether there is an active environment
- split 'environment' section into 'environments' and 'modules'
- move location to 'query packages' section
- move cd to developer section
- --env-dir no longer has a short option (was -E)
- -E now means "run without an environment" (no longer same as --env-dir)
- -D now means "run with this directory environment"
- remove short options for many infrequently used top-level commands
- The `Spec` class maintains a special `_patches_in_order_of_appearance` attribute on patch variants, but it was not preserved when specs are copied.
- This caused issues for some builds
- Add special logic to `Spec` to preserve this variant on copy
- TODO: in the long term we should get rid of the special variant and make it the responsibility of one of the variant classes.
- args weren't being delegated properly from CommentedMap to OrderedDict
- spack.yaml files in the current directory were picked up inconsistently -- make this a sure thing by moving that logic into find_environment() and moving find_environment() to main()
- simplify arguments to Spack command:
  - remove short args for infrequently used commands (--pdb/-D, -P, -s)
  - `spack -D` now forces an env with a directory
- to avoid changing spec hashes drastically, only add this attribute to differentiated abstract specs.
- otherwise assume that read-in specs are concrete
- all commands (except `spack find`, through `ConstraintAction`) now go
through get_env() to get the active environment
- ev.active was hard to read -- and the name wasn't descriptive.
- rename it to _active_environment to be more descriptive and to strongly
indicate that spack.environment manages it
a593bf9 to
84292e1
Compare
|
@scheibelp @becker33: I integrated the We are still having an issue where Python 2.6 builds are hanging and we don't know why. We've seen this in some other builds as well. If it persists on |
|
At any rate, this is merged! Environments are in! Docs, a tutorial, and some new feature additions are coming soon. |
|
@tgamblin are there any plans to marry a repo's That would allow software developers to in-source describe and update dependencies instead of writing a lot of "if between release X and Y, depends on [email protected]:2.5, else [email protected] and if Y>3.4 also adds dependency on W" inside a single |
|
From the timestamp, it looks like some midnight oil was burned on Spack Environments. Thank you Peter and Todd for all your hard work. This is truly a glorious day in Spack Land! |
@ax3l: I think there's a long-term path there but right now that is hard. We're not a distributed package management system (yet) and the HPC ecosystem needs some packages to be curated. There is probably a balance where some packages can be managed more like they are in registry-based systems, where you have a |
|
+1 I tend to agree with the need for curation in the various environments or stacks as a starting guide for others to modify locally |
The goal is to try and get the full_hash computed during the 'buildcache check' to match the one computed (or looked up) during the 'buildcache create' if nothing else has changed. Not taking patches into account during the latter was causing packages to rebuild on every pipeline, even when it was unnecessary.
# If the command-line scope is present, it should always
# be the scope of highest precedence
Except when it shouldn't... I'd rather see this reverted.
This supersedes #8231, reworks the API, and adds a lot of features.
Spack environments fill a number of needs:
a. Functionally, from the abstract specs of the prior installation (i.e., with a `spack.yaml` file)
b. Exactly, from the concrete specs that were previously installed (i.e., with what `bundler`, `npm`, `cargo`, and `pipenv` call a "lockfile")
I've attempted to refactor the code so that one basic `Environment` concept can provide all of these. This builds on the prior work by @scheibelp and @citibeth, and it attempts to integrate some of what is described at #7944.
I think it is easiest to describe what's implemented here in terms of two workflows:
Command-line environment usage
You can use environments to work with a subset of packages, like you'd normally work with spack:
You can incrementally add things to environments:
Environments also let you concretize groups of specs at the same time:
Environments are created, by default, in `$spack/var/spack/environments`, and you refer to them by name. You can also create environments external to Spack in directories:
Also, you don't have to activate them to use them. Spack has a new `-e`/`--env` option you can use to execute any Spack command in a specific environment:
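As a rough illustration of the precedence involved (not Spack's actual implementation), the following Python sketch picks the environment a command would run in. The function name `resolve_environment`, its parameters, and the choice to check a `spack.yaml` in the current directory last are assumptions made for this example; the only rule taken from this PR is that `-e` takes precedence over the environment set by `spack env activate`.

```python
from typing import Optional


def resolve_environment(cli_env: Optional[str],
                        activated_env: Optional[str],
                        cwd_manifest_dir: Optional[str] = None) -> Optional[str]:
    """Pick the environment a command should use, or None for no environment.

    cli_env:          value passed with -e/--env, if any
    activated_env:    environment set earlier by `spack env activate`, if any
    cwd_manifest_dir: directory holding a spack.yaml, if the current
                      directory has one (assumed lowest priority here)
    """
    if cli_env is not None:
        # An explicit -e always wins over an activated environment.
        return cli_env
    if activated_env is not None:
        return activated_env
    return cwd_manifest_dir
```

For example, `resolve_environment("FOO", "BAR")` returns `"FOO"`, so `spack -e FOO install` would act on `FOO` even when `BAR` is activated.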
`spack.yaml`/`spack.lock`: environments in the filesystem
"Named" environments in `var/spack/environments` and directory environments are both just directories with two special files: `spack.yaml` and `spack.lock`.
`spack.yaml`
`spack.yaml` describes the specs you want in your environment. It's created when you do `spack env create <name>`, and it describes the specs you've added so far:
This is what other project dependency managers call a manifest file -- a list of the things you want to install with a project. You can maintain it by hand, or `spack env create`, `spack env add`, and `spack env remove`, `spack install`, etc. will all update the `spack.yaml` file when you use them.
In Spack, the `spack.yaml` file can also contain configuration:
The sections in the file can be anything from the regular Spack configuration files, so you can do some sophisticated things here if you're creative. You can also include configs from elsewhere:
The included items can either be Spack configuration scopes (directories with `packages.yaml`, `config.yaml`, `compilers.yaml`, etc.) or files with all of those config sections merged into a single file (like in the `spack.yaml` file above).
Notice that you can have relative paths. Those are relative to the `spack.yaml` file, so you can put it and its associated configuration in a git repository if you want to.
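A minimal Python sketch of the relative-path rule, assuming included paths are plain strings: relative entries are resolved against the directory containing `spack.yaml`, and absolute entries are left alone. The helper name `resolve_includes` and the example paths are hypothetical, and this is not the code Spack uses.

```python
import os
from typing import List


def resolve_includes(manifest_path: str, includes: List[str]) -> List[str]:
    """Resolve include entries relative to the directory of the manifest file."""
    base = os.path.dirname(os.path.abspath(manifest_path))
    resolved = []
    for entry in includes:
        if os.path.isabs(entry):
            # Absolute paths are used as given.
            resolved.append(os.path.normpath(entry))
        else:
            # Relative paths are anchored at the spack.yaml directory,
            # not at the current working directory.
            resolved.append(os.path.normpath(os.path.join(base, entry)))
    return resolved
```

Because nothing depends on the caller's working directory, a repository that ships `spack.yaml` next to its config files keeps working wherever it is cloned.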
`spack install`, `spack spec`, and other commands will use the configuration from your environment if it is active. So you can easily maintain multiple sets of configurations in environments, then switch quickly between them by activating the one you want to use.
`spack.yaml` in code repositories
As is common elsewhere in the dependency management world (Pipenv, Cargo, Bundler, etc.), you can put a `spack.yaml` file in your project's repository and use it to make bootstrapping dependencies easier for your users:
`spack install`, when called without arguments in a repo that has a `spack.yaml`, will concretize and install all the specs in the `spack.yaml` file.
We chose the name `spack.yaml` to make it clear that this is a file that Spack understands, and so that it would be distinct from other files at the top level of a repo.
`spack.lock` and reproducing environments
The `spack.lock` "lockfile" is created whenever you install or concretize an environment. If you run `spack install` as suggested in the previous section, it will produce a file called `spack.lock` alongside `spack.yaml`. This contains both the abstract specs that were used to install the packages, and the full, concretized specs of these packages and their dependencies. It's intended to allow you to reproduce an environment exactly as it was built by someone else.
If you send someone a `spack.yaml` or `spack.lock` file, they can create a new environment from these files with commands like these:
`spack env create myenv spack.yaml`
or:
If you create an environment from `spack.yaml`, you'll get a new environment with the same root specs (like a pip requirements file), and it will be re-concretized when you install it on a new machine. If you create it from `spack.lock`, you'll get that and the concrete specs to reproduce things exactly. Currently that has to be completely exact, but in the future we'll support generating something "as close as possible" to the original environment on a new host.
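The difference between the two files can be modeled with a toy Python example. This is a sketch under stated assumptions, not Spack's file format or API: `Manifest`, `Lockfile`, `specs_to_install`, and the caller-supplied `concretize` function are all hypothetical, and the spec strings in any usage are made up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class Manifest:
    """Toy stand-in for spack.yaml: only the abstract root specs."""
    roots: List[str]


@dataclass
class Lockfile:
    """Toy stand-in for spack.lock: the roots plus their concretized results."""
    roots: List[str]
    concrete: Dict[str, str]  # abstract root -> fully concretized spec


def specs_to_install(source: Union[Manifest, Lockfile],
                     concretize: Callable[[str], str]) -> List[str]:
    """Return the concrete specs a new environment would install.

    From a lockfile, the recorded concrete specs are reused unchanged.
    From a manifest, each root is re-concretized on the current machine.
    """
    if isinstance(source, Lockfile):
        return [source.concrete[root] for root in source.roots]
    return [concretize(root) for root in source.roots]
```

The lockfile branch never calls `concretize`, which is what makes the result independent of the host it is recreated on.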
Using environments
You can currently use environments by running `spack env loads` and sourcing the resulting file:
`spack env loads` generates a single file with `module load` calls for all packages in the environment. You need a module system in your environment to use it.
TODO: views. We will also be adding a view to each environment. A view is a single prefix with all packages symlinked into it, like a Python virtual environment. Activating an environment will add paths for this directory. @scheibelp will be adding that after he reviews this PR.
Some more technical details
On a technical level, this differs from the prior implementation in #8231 in a few important ways:
- `spack.yaml`, and you can use it to initialize a new environment or to control an existing one.
- `spack.yaml` is updated with new abstract specs when you use `spack env add`, `spack env remove`, etc., and comments are preserved through these updates.
- `spack.lock` always contains the results of the last time the environment was concretized (both inputs and outputs).
- `spack.yaml` is the human-editable file with inputs, and `spack.lock` is machine-readable and exact. They're kept in sync.
- `spack find`, `spack spec`, etc.) solves most of the issues.
Summary:
Move the old `spack env` command to `spack build-env`
Add a new `spack env` command:
- `spack env create ENV`: create a new environment
- `spack env destroy ENV`: destroy an environment
- `spack env list`: list available environments
- `spack env status [ENV]`: get a list of what's been added/installed to this environment
- `spack env activate ENV`: activate the named environment (makes ENV args implicit)
- `spack env deactivate` OR `despacktivate`: deactivate the currently activated environment
- `spack env add SPEC`: add a spec to the current environment
- `spack env remove SPEC`: remove a spec from the current environment
- `spack env install`: concretize (see below) and install all specs in an environment (you can optionally just install already concretized specs)
- `spack env concretize [ENV]`: concretize all specs in the environment and write a `spack.lock` file
- `spack install SPEC`: if an environment is activated, this now installs into the active environment
- `spack install`: if a `spack.yaml` file is found in the current directory, this concretizes and installs all specs in that `yaml` file -- so you can keep an environment in a git repo outside Spack.
- `spack find`: if an environment is active, `spack find` shows only specs installed in the current environment
- `spack spec`: if an environment is active, `spack spec` and other commands concretize using configuration from the active environment
- `spack env loads [ENV]`: generate a script that loads all modules for an environment
- `spack env stage [ENV]`: stage all specs in an environment
- `spack location -e ENV`: get the location of an environment
- `spack cd -e ENV`: cd to an environment's directory
- `spack env uninstall`: uninstall all specs from an environment
The `spack` command itself now has a `-e` option that you can use to specify an environment on the command line. This takes precedence over the current environment from `spack env activate ENV`.
TODO: