Remove spack load dependence on modules #14062

Merged
tgamblin merged 18 commits into develop from features/load-without-modules on Jan 23, 2020

@becker33
Member

@becker33 becker33 commented Dec 10, 2019

Previously the spack load command was a wrapper around module load. This required some bootstrapping of modules to make spack load work properly.

With this PR, the spack shell function handles the environment modifications necessary to add packages to your user environment. This removes the dependence on environment modules or lmod and removes the requirement to bootstrap spack (beyond using the setup-env scripts).

Included in this PR is support for macOS when using Apple's System Integrity Protection (SIP), which is enabled by default in modern macOS versions. SIP clears the LD_LIBRARY_PATH and DYLD_LIBRARY_PATH variables on process startup for executables that live in /usr (but not /usr/local), /System, /bin, and /sbin, among other system locations. Because Spack is executed using /bin/sh and /usr/bin/python, it cannot know the LD_LIBRARY_PATH of the calling process. The spack shell function now manually forwards these two variables, if they are present, as SPACK_<VAR> and recovers those values on startup.
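The recovery step described above can be sketched in Python. This is an illustration only, not the code in this PR: the helper name `recover_forwarded_vars` and the example paths are hypothetical, and only the `SPACK_<VAR>` naming comes from the description.

```python
# Hypothetical sketch of recovering forwarded variables; not Spack's actual code.
FORWARDED_VARS = ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH")


def recover_forwarded_vars(env):
    """Return a copy of env with each SPACK_<VAR> value copied back to <VAR>."""
    recovered = dict(env)
    for var in FORWARDED_VARS:
        forwarded = recovered.get("SPACK_" + var)
        if forwarded is not None:
            # The shell function forwarded this value before SIP cleared it.
            recovered[var] = forwarded
    return recovered


# Simulated environment as seen by a SIP-protected interpreter: the original
# LD_LIBRARY_PATH is gone, but the forwarded copy survived.
seen_by_python = {
    "PATH": "/usr/bin",
    "SPACK_LD_LIBRARY_PATH": "/opt/example/lib",  # made-up path
}
restored = recover_forwarded_vars(seen_by_python)
```

Variables that were never forwarded are left alone, which matches the "if they are present" condition in the description.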

@adamjstewart
Member

Couldn't resist making a meme for this PR...

In all seriousness, I support this PR, and think it's the right direction to go if we want to continue to make Spack easy to set up and use. My only concern is that Environment Modules/Lmod are massive projects, and we're attempting to roll our own module management system here. How many of Lmod's features do we plan on supporting? How large of a codebase will this become? Using a wrapper around module load introduces complexity of its own (not sure if we even support Env Mod 4 yet), so maybe this is actually simpler than it used to be. Just want to make sure we aren't digging ourselves deeper here.

@becker33
Member Author

@adamjstewart I think we get most of this feature "for free" from other things we have to do. We already need to know the environment changes necessary to load a package (because we generate modules) and we need to know how to reverse those changes (because we deactivate environments). So far, this feature is just a thin shell script shim layer. I do think it makes sense to add tracking for which hashes are loaded, but that's also pretty trivial, and can be hooked up to the find command.

The previous state of affairs:

[meme image]

@hartzell
Contributor

Will there be something analogous to the module system's ability to filter things from the modulefiles? E.g.

modules:
  lmod:
    all:
      filter:
        environment_blacklist: 'LD_LIBRARY_PATH'
    bowtie2:
      environment:
        LD_LIBRARY_PATH: '$prefix/lib'

Some of us (ok, at least I...) consider setting LD_LIBRARY_PATH and etc. in the modulefiles to be an antipattern, one that throws away all of the work Spack does to get the RPATH bits correct.

This was pretty widely discussed back in #3955.

I recognize the utility of setting those variables if you're supporting folks who are linking against libraries in the Spack tree. I, on the other hand, am simply trying to use the things I've built with Spack with minimal side effects. #3955 documents LD_LIBRARY_PATH breaking yum, we ran into other cases on the linux boxes and I've since tripped over issues on my Mac that were fixed by adding DYLD_LIBRARY_PATH to the blacklist.

I've been running with this for a while now, without the exception for python, and haven't been having any trouble.

modules:
  lmod:
    all:
      filter:
        environment_blacklist: ['CPATH', 'LIBRARY_PATH', 'LD_LIBRARY_PATH', 'DYLD_LIBRARY_PATH']
      environment:
        set:
          '${PACKAGE}_ROOT': '${PREFIX}'

Sorry, no GIF....

@hartzell
Contributor

Echoing a bit what I hear @adamjstewart asking, I'm worried about Spack trying to do too many things.

The only feature that I've seen for Spack load that isn't trivial w/ Lmod modules is the ability to cook up an arbitrary spec and have it load the appropriate module. In the few cases where my user community needed that feature I've used the module system's ability to add suffixes to provide simple names that my users can figure out w/out documentation, e.g. ...-without-X for the non-X variant of an application. I've never been eager to try to explain Spack's spec language to my end-user communities.

Beyond that it seems that it's duplicating existing work and building a monolithic system.

I suppose what I'm asking is: what is the end goal for this set of features?

  • Will I be able to use it to also set up things that aren't part of Spack?
  • Will it support blacklisting, etc.?
  • Can we give things easy-to-understand names, synonyms?
  • Will it support hierarchical modules that make foot-shooting difficult?
    • If it does, will it fix Lmod's/our problems with things that are built with e.g. D and end up in their only branch of the tree?

@becker33
Member Author

> I suppose what I'm asking is: what is the end goal for this set of features?
>
> * Will I be able to use it to also set up things that aren't part of Spack?
> * Will it support blacklisting, etc.?
> * Can we give things easy-to-understand names, synonyms?
> * Will it support hierarchical modules that make foot-shooting difficult?
>   * If it does, will it fix Lmod's/our problems with things that are built with e.g. D and end up in their only branch of the tree?

I do not think we should enable this for non-spack software. We will still generate modules for Spack-installed software, and those modules will be interoperable with the greater module ecosystem.

In this PR I've refactored the prefix_inspections from lib/spack/spack/environment.py. The plan is to merge those with the module configuration prefix inspections and give them a shared configuration format at some later date.

On your last two points, the answer to both is environments. Spack allows environments with pithy names, and an environment with concretization: together manages coherency to avoid all foot-shooting incidents. The effect of spack load foo in this PR is exactly the same as spack env activate foo-env, where foo-env is an environment containing only the package foo.

@becker33
Member Author

close-cycling to restart the tests, as they didn't start properly on the last push

@tgamblin
Member

> The only feature that I've seen for Spack load that isn't trivial w/ Lmod modules is the ability to cook up an arbitrary spec and have it load the appropriate module.

@hartzell: not sure if it's clear from the PR, but the only thing spack load and spack unload ever did was load/unload a single spec -- using the module system. The only thing this PR does is make it so that we don't need modules to do that.

We're definitely not getting rid of modules; all those customizations will still be possible by generating modules, and we'll still do that. Users can still use modules through the module command, not through spack load. Do your users actually use spack load? If not, this won't affect them.

The overarching goal here is to give users some easy ways to use Spack that a) don't require setting up an entire module system, and b) work the same on your laptop, HPC machine, etc. spack load is for loading a one-off and spack environments provide more consistency (via concretization) than modules can actually provide. Lmod's a hierarchy and it keeps you safe for MPI and your compiler -- dependency resolution for environments handles the whole DAG.

Does this and @becker33's comment alleviate your concerns?

tty.msg(*msg)
return 1

if args.recurse_dependencies:
Member

I think this should default to true -- it was always puzzling that spack load didn't load them by default -- I think the original intent was that the module system would handle loading recursive deps. Since we're not loading modules anymore, I think the default should be to load recursive deps (if they are not already loaded).

Member Author

The -r option is always default False on other packages.

I think it will be better to replace it with a --only argument like the install command has. Default will be package and dependencies, but --only can specify only dependencies or only the package itself.

Member

I like this idea. --only does more and has the right default.
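The `--only` option proposed in this thread could look roughly like the following argparse sketch. It is an assumption-laden illustration, not the code that was merged: the parser name `load-sketch`, the `only_type` helper, and the comma-separated syntax are hypothetical; only the two values (package, dependencies) and the default of both come from the discussion.

```python
import argparse

# The two things the thread says --only should be able to select.
VALID_PARTS = ("package", "dependencies")


def only_type(value):
    """Parse a comma-separated --only value into a tuple of valid parts."""
    parts = tuple(part.strip() for part in value.split(","))
    for part in parts:
        if part not in VALID_PARTS:
            raise argparse.ArgumentTypeError("invalid --only value: %r" % part)
    return parts


def make_parser():
    """Build a hypothetical parser; 'load-sketch' is not a real Spack command."""
    parser = argparse.ArgumentParser(prog="load-sketch")
    parser.add_argument(
        "--only", type=only_type, default=VALID_PARTS,
        help="what to load: package, dependencies, or both (default: both)")
    parser.add_argument("spec", help="spec to load")
    return parser


args_default = make_parser().parse_args(["foo"])
args_deps = make_parser().parse_args(["--only", "dependencies", "foo"])
```

With this shape, omitting the flag loads both the package and its dependencies, which is the default the reviewers asked for.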

import spack.util.environment
import spack.user_environment as uenv

description = "remove package from the user environment variables`"
Member

I think you can omit "variables"

Member Author

Are you sure? I added it to differentiate from Spack environments, which could be "the user's active environment"

Member

I guess I don't like the "environment variables" part since you would normally say "the user's environment". But we have an overload with Spack environments. @adamjstewart: any suggestions for how to disambiguate this?

import spack.util.environment
import spack.user_environment as uenv

description = "add package to the user environment variables"
Member

omit "variables"

env_mod.extend(
uenv.environment_modifications_for_spec(spec).reversed())
env_mod.remove_path(uenv.spack_loaded_hashes_var, spec.dag_hash())
cmds = env_mod.shell_modifications(args.shell)
Member

In the current state, if you load something, then change the package so that the env mods would change, then unload the package, can any of these operations fail? @alalazo?

I was worried about this and I think EnvironmentModifications is actually written so that things like RemovePath don't fail when the path to be removed is not present. Specifically, RemovePath.execute() doesn't try to remove the thing that should be removed -- it just takes all the paths that are not the thing, which is great. So, maybe this works fine and I do not need to worry.
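The "keep everything that is not the thing" behavior described above can be illustrated with a small standalone function. This is not Spack's `RemovePath` implementation; `remove_path` here is a hypothetical helper that assumes a colon-separated variable.

```python
def remove_path(value, path, sep=":"):
    """Return value with every occurrence of path filtered out.

    Nothing is searched for and deleted; entries that differ from path are
    kept, so a missing path simply leaves the value unchanged.
    """
    kept = [entry for entry in value.split(sep) if entry and entry != path]
    return sep.join(kept)


# Made-up paths for illustration.
after_removal = remove_path("/a/bin:/b/bin:/c/bin", "/b/bin")
already_absent = remove_path("/a/bin:/c/bin", "/b/bin")
```

Because the second call returns its input unchanged instead of raising, unloading after the package's environment modifications have changed would not fail under this scheme.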

print_module_placeholder_help()
env = ev.get_env(args, 'unload')
specs = list(map(lambda spec: spack.cmd.disambiguate_spec(spec, env),
spack.cmd.parse_specs(args.specs)))
Member

  1. What happens if you load something, enter an environment, and then try to unload it? Seems like it should still succeed.
  2. Should loaded specs continue to be loaded if you activate an environment? Or should env activate/deactivate unload/load everything in SPACK_LOADED_HASHES? People are going to try to use these things together.
  3. Shouldn't this just look in SPACK_LOADED_HASHES to disambiguate? Why does it need to search everything?
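The third question suggests disambiguating against the loaded hashes rather than against everything installed. A minimal sketch of that idea follows, with loud assumptions: it treats SPACK_LOADED_HASHES as a colon-separated list, represents specs as plain (name, hash) pairs, and uses made-up hash values. The helper names are hypothetical; real Spack works with Spec objects and `spack.cmd.disambiguate_spec`.

```python
def loaded_hashes(env, var="SPACK_LOADED_HASHES", sep=":"):
    """Return the hashes recorded in the (assumed colon-separated) variable."""
    return [h for h in env.get(var, "").split(sep) if h]


def disambiguate_loaded(name, installed, env):
    """Pick the single loaded (name, hash) pair matching name.

    installed is a list of (name, hash) pairs standing in for real specs.
    """
    loaded = set(loaded_hashes(env))
    matches = [(n, h) for n, h in installed if n == name and h in loaded]
    if len(matches) != 1:
        raise ValueError("%s matches %d loaded specs" % (name, len(matches)))
    return matches[0]


# Two zlib installs exist, but only one of them is loaded.
installed = [("zlib", "abc123"), ("zlib", "def456"), ("cmake", "999fff")]
env = {"SPACK_LOADED_HASHES": "def456:999fff"}
chosen = disambiguate_loaded("zlib", installed, env)
```

Restricting the search this way means an ambiguous name such as zlib resolves to the one copy that is actually loaded.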

@hartzell
Contributor

hartzell commented Dec 16, 2019

[edit: rainbox -> rainbows]

@tgamblin --

> Does this and @becker33's comment alleviate your concerns?

The main alleviant [SIC?] is the trust that I have that all y'all will
build something that works.

> @hartzell: not sure if it's clear from the PR, but the only thing
> spack load and spack unload ever did was load/unload a single spec --
> using the module system. The only thing this PR does is make it so
> that we don't need modules to do that.

Well..... There are a bunch of Issues and PRs that talk about how
loading a single modulefile isn't really very useful and we have to
load their dependencies (link/run, run, what-have-you) in order to do
anything useful. Then there's spack load -r.

Seems like a slippery slope; writing and supporting more code that
works just up until there's a sudden phase change and users hear
"Well, you need to use modules/environments/activate/*".

> [...]
> The overarching goal here is to give users some easy ways to use Spack
> that a) don't require setting up an entire module system, and b) work
> the same on your laptop, HPC machine, etc. [...]

Sure, but a hypothetical/mythical

spack install emacs
spack install lmod
. $(spack setup-lmod --with-unicorn --double-rainbows --but-seriously)
module load emacs

would (at the cost of requiring lmod in addition to whatever else
you've built) provide a more continuous path towards a fluidly
functioning system (e.g. one that doesn't need a chain of activations
in order to get flake8 to run).

I'll bet a beer if/when we ever get to meet that eventually spack load
will have the bulk of the functionality of Lmod.

Rather than make spack load work without modules, why not use
modules to make spack work w/out spack load? Yikes, keep reading!

But, the more I argue, the more I put the lie to my opening statement.

Second guessing and arm chair quarterbacking...

There's more than one way to do things and every set of choices
comes complete with its own special corner for painting-into. Your
engineering intuition, use cases, users, available resources, and
priorities are yours and mine are mine. I live with (and perhaps even
have learned to like) the squeaks, groans, and itches that my
approaches generate, but it is known that it all still squeaks,
groans, and itches....

You've built something great and I'm thankful to have it in my
toolbox.

Member

@tgamblin tgamblin left a comment

@becker33: this is looking great -- one last request is to make a pass over the docs and see if anything needs changing.

I figure we can save modifying setup-env.sh for a separate PR -- do you agree?

@tgamblin
Member

clarification on this:

> I figure we can save modifying setup-env.sh for a separate PR -- do you agree?

I am talking about the part that looks for an lmod/modules installation. I know you already modified load and unload.

@becker33
Member Author

@tgamblin yes I agree. I will look over the documentation either tonight or tomorrow.

@becker33
Member Author

This is going to require rewriting a bunch of documentation. I'll get started on it tomorrow.

@aweits
Contributor

aweits commented Jan 4, 2020

Just tested this branch (doing more homework for #14348). If a package is renamed or deleted (but is installed), it will no longer 'spack load'.

@tgamblin tgamblin force-pushed the features/load-without-modules branch 2 times, most recently from 36c4b24 to 740acb6 Compare January 23, 2020 05:58
@tgamblin tgamblin force-pushed the features/load-without-modules branch from 740acb6 to 640c63a Compare January 23, 2020 05:59
@tgamblin
Member

@aweits: the installed-package issue is an issue with the module-based spack load, so we'll fix that one in a separate PR.

@tgamblin tgamblin merged commit c9e01ff into develop Jan 23, 2020
@tgamblin tgamblin deleted the features/load-without-modules branch January 23, 2020 06:41
@roblatham00
Contributor

This change completely alters our workflow. We liked how "spack load -r packagename" loaded the requested module and all its dependencies. We also liked that it abstracted over environment-modules vs lmod.

spack module lmod loads margo does not work for us on summit, telling us "no module available", even though "spack load margo" gives no such error (and "spack load margo" does not load dependencies).

@carns
Contributor

carns commented Jan 31, 2020

Suppose we were to ignore the lmod modules and let spack manage what packages are loaded on its own. Once we have done a spack load -r margo (or similar), how can we list it and any other packages it may have loaded (e.g., the equivalent of what we would have done with module list if using lmod)? For example, if someone wants to be able to switch between versions with different variant flags, or different cflags?

For some context, we need to rapidly figure out how to fix our documented workflow for a tutorial. One solution would be to go back to a tagged spack release, but then we lose updates to packages that are only present in origin/develop because package descriptions are embedded in the spack repo.

@tgamblin
Member

> We liked how "spack load -r packagename" loaded the requested module and all its dependencies. We also liked that it abstracted over environment-modules vs lmod.

This PR actually doesn’t change how modules are generated, and it doesn’t tie anything to lmod - it just makes spack load do what modules would’ve done, without going through modules. If you want tcl modules, what about:

spack module tcl loads margo

Or just load the package using module load based on what’s in module avail? All the same modules should be there.

@tgamblin
Member

tgamblin commented Jan 31, 2020

> Once we have done a spack load -r margo (or similar), how can we list it and any other packages it may have loaded (e.g., the equivalent of what we would have done with module list if using lmod)?

Try:

spack find --loaded

That will give you the list of loaded packages. You can unload them with spack unload, or all of them with spack unload -a.

The other way you could do this is to build your package in an environment, and spack env activate and spack env deactivate the environment.

@tgamblin
Member

I’m a bit confused about the use of lmod here, as spack load was always searching tcl modules. I guess it was going through Lmod because that’s what was installed on summit, but what it actually loaded was the tcl module file (which lmod understands) — so also try my suggestion above to use spack module tcl loads, as that may require the least number of changes.

@carns
Contributor

carns commented Jan 31, 2020

The spack find --loaded option is probably helpful in the short term; we'll try that out.

spack module tcl loads margo is kind of a handful for users to remember in a tutorial context, especially if what it really translates to in order to enact the load is something even more verbose like source <(spack module tcl loads --dependencies margo).

We have a few meta-problems to circle back on later, but I'll comment on them quickly here:

The spack documentation is clear that it shouldn't be necessary to load dependencies (and indeed we can see that rpath has correct paths where needed), but all of our packages also use pkgconfig, and right now nothing is putting the .pc dependencies in place unless we instruct spack to load dependencies.

More generally, when we run into these kinds of speed bumps, we can usually find a quick solution (like the find --loaded above), but the fluidity of the syntax / best practice is a challenge from our perspective when we need to track origin/develop to get up-to-date package versions. In a perfect world I would like to lock onto a stable released version of spack itself while still having access to updated packages so that it's easier to pick up new versions of packages without risking breakage in our overall workflow at an inopportune moment.

As it stands we've had some bad luck with the timing of changes to spack behavior. In this case, for example, it's not that we care about lmod, it's just that we expected to be able to module list to see what's going on, and it stopped working one day without it being immediately obvious what the replacement was.

@aprokop
Contributor

aprokop commented Mar 19, 2020

With the previous behavior one could use module save and module restore to keep track of the exact environment (which is a mix of spack and non-spack modules). The new approach makes it impossible. Does anyone have workarounds for that?

@tgamblin
Member

> The new approach makes it impossible.

Is it impossible, or impossible with spack load? You can still do all that by module load-ing what you want.

@aprokop
Contributor

aprokop commented Mar 19, 2020

You are right, I meant impossible with spack load. I was trying to avoid spack module lmod loads thingie which seems to behave differently (Error: the key "core_compilers" must be set in modules.yaml). I guess I'll need to fix that. Thanks.

@hartzell
Contributor

hartzell commented Mar 20, 2020

Here's a gist that demonstrates an alternative approach, if you're going to go the lmod route.

It doesn't expose the user to the spack command (although it uses it to discover the installed location of the lmod package). Users liked it because they were already familiar with lmod and it gave me a clean way to use modulefiles to set up other non-spack (sigh) software.

In my environment I had a plethora of spack trees, built on different dates w/ different sets of packages for different use cases. The name of the "default" release lives in a file named current, users can override it by setting APPS_DIR before sourcing the script. It assumes that the core compiler is gcc (overridable). There are a bunch of other knobs for adventurous users.

Here's some of the commentary from the script:

# The typical user's .bash_profile would include:
#
# APPS_MODULES="emacs git htop the-silver-searcher"   # and etc....
# source /moose/apps/init.sh
#
# Even setting APPS_MODULES is optional if user is ok with default list...
#
# That will:
#
# 1. Fetch the name of the "current" spack tree from the Well Known
#    File if the user hasn't explicitly provided one.
# 2. Use the spack executable in that tree to ask for the location of
#    the Lmod directory in that tree.
# 3. Load the Lmod bash initialization file from within that Lmod dir.
# 4. Tell Lmod to use the Core dir that spack built.
# 5. Tell Lmod to use our directory of handcrafted module files.
# 6. Load the core compiler module, which makes the bulk of the
#    modulefiles accessible.
# 7. Load the user's modulefiles.
#
# There are a few things that users can customize:
# Common:
#   APPS_MODULES - your list of modules to load
#      e.g. APPS_MODULES="emacs git htop etc etc etc..."
# Less common:
#   APPS_DIR - path to top of a Spack apps tree, e.g. a team specific tree
#              (must include lmod)
#   APPS_SPECIAL_MODULES_DIR - path to dir of snowflake modulefiles
# Uncommon:
#   APPS_CORE_COMPILER - Compiler choice for lmod hierarchical scheme
#
