Skip to content

Change from modulecmd interface to module interface#8570

Merged
tgamblin merged 25 commits intodevelopfrom
features/use-module-interface
May 9, 2019
Merged

Change from modulecmd interface to module interface#8570
tgamblin merged 25 commits intodevelopfrom
features/use-module-interface

Conversation

@becker33
Copy link
Copy Markdown
Member

@becker33 becker33 commented Jun 25, 2018

Fixes #6451.
Fixes #7884.
Fixes #8193.

Update Spack core to use new module function instead of get_module_cmd

Previously, Spack relied on either examining the bash module() function or using the which command to find the underlying executable for modules.

As @mgsternberg commented in detail on #6451, more complicated module systems do not allow for the sort of simple analysis we were doing.

This PR uses the module function directly and copies environment changes from the resulting subprocess back into Spack. This should provide a future-proof implementation for changes to the logic underlying the module system on various HPC systems.

@adamjstewart
Copy link
Copy Markdown
Member

I'm testing this on Blue Waters now. I can confirm that it gets rid of the pesky error message from #8193, but I'm encountering a few other problems. With Python 2.6.9:

$ spack -d install zlib
...
==> Error: TypeError: 'str' object is not callable

/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/build_environment.py:694, in child_process:
        691            tb_string = traceback.format_exc()
        692
        693            # build up some context from the offending package so we can
  >>    694            # show that, too.
        695            package_context = get_package_context(tb)
        696
        697            build_log = None


Traceback (most recent call last):
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/build_environment.py", line 676, in child_process
    setup_package(pkg, dirty=dirty)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/build_environment.py", line 585, in setup_package
    pkg.architecture.platform.setup_platform_environment(pkg, spack_env)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/platforms/cray.py", line 109, in setup_platform_environment
    module('unload', module)
TypeError: 'str' object is not callable

With Python 3.5.4:

$ spack -d install zlib
...
Traceback (most recent call last):
  File "/u/sciteam/stewart1/spack/bin/spack", line 85, in <module>
    sys.exit(spack.main.main())
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/main.py", line 580, in main
    parser.add_command(cmd_name)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/main.py", line 288, in add_command
    module = spack.cmd.get_module(cmd_name)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/cmd/__init__.py", line 128, in get_module
    level=0)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/cmd/install.py", line 36, in <module>
    import spack.cmd.common.arguments as arguments
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/cmd/common/arguments.py", line 30, in <module>
    import spack.modules
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/modules/__init__.py", line 32, in <module>
    from .dotkit import DotkitModulefileWriter
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/modules/dotkit.py", line 29, in <module>
    from .common import BaseConfiguration, BaseFileLayout
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/modules/common.py", line 69, in <module>
    configuration = spack.config.get('modules')
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/config.py", line 515, in get
    return config.get(path, default, scope)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/llnl/util/lang.py", line 556, in __getattr__
    return getattr(self.instance, name)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/llnl/util/lang.py", line 552, in instance
    self._instance = self.factory()
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/config.py", line 492, in _config
    platform = spack.architecture.platform().name
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/llnl/util/lang.py", line 182, in __call__
    self.cache[args] = self.func(*args)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/architecture.py", line 498, in platform
    return platform_cls()
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/platforms/cray.py", line 68, in __init__
    for target in self._avail_targets():
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/platforms/cray.py", line 159, in _avail_targets
    _fill_craype_targets_from_modules(targets, craype_modules)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/platforms/cray.py", line 48, in _fill_craype_targets_from_modules
    if 'craype-' in mod:
TypeError: a bytes-like object is required, not 'str'

@mgsternberg
Copy link
Copy Markdown
Contributor

Cool – Thank you for implementing the idea so quickly!

One caveat: The approach puts the onus on local admins/users (reasonably so, I might add) to have the module shell command be available under both login and non-login shells. One hopes that a system-level installation of an env. mod./Lmod/Foo-mod package does the right thing, and DYI-ers should know what they're doing.

@becker33
Copy link
Copy Markdown
Member Author

@adamjstewart The python 2.6.9 issue should be fixed now. It was a name shadowing issue.
The python 3 issue should be fixed as well. I didn't realize subprocess was return bytes in python3, I now convert them to strings properly in both python2 and 3.

@mgsternberg Yes, we do have to rely on the module function to be set up reasonably for this approach to work. Out of the box modules do work properly for this, so I think it's safe to count on DIYers to know what they're doing, as you said. If it becomes an issue, we could look into using the executable argument to subprocess.Popen to launch something more like a login shell.

@adamjstewart
Copy link
Copy Markdown
Member

Thanks @becker33, the Python 2 issue is now fixed. However, I'm now encountering new problems with Python 3:

$ spack -d install zlib
...
==> Error: TypeError: the JSON object must be str, not 'bytes'

/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/build_environment.py:694, in child_process:
        691            tb_string = traceback.format_exc()
        692
        693            # build up some context from the offending package so we can
  >>    694            # show that, too.
        695            package_context = get_package_context(tb)
        696
        697            build_log = None


Traceback (most recent call last):
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/build_environment.py", line 676, in child_process
    setup_package(pkg, dirty=dirty)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/build_environment.py", line 585, in setup_package
    pkg.architecture.platform.setup_platform_environment(pkg, spack_env)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/platforms/cray.py", line 109, in setup_platform_environment
    module('unload', mod)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/util/module_cmd.py", line 48, in module
    env_dict = json.loads(module_p.communicate()[0])
  File "/mnt/bwpy/single/usr/lib/python3.5/json/__init__.py", line 312, in loads
    s.__class__.__name__))
TypeError: the JSON object must be str, not 'bytes'

@becker33
Copy link
Copy Markdown
Member Author

becker33 commented Jul 2, 2018

@adamjstewart The python 2.6.9 issue should be fixed now. It was a name shadowing issue.
The python 3 issue should be fixed as well. I didn't realize subprocess was return bytes in python3, I now convert them to strings properly in both python2 and 3.

@mgsternberg Yes, we do have to rely on the module function to be set up reasonably for this approach to work. Out of the box modules do work properly for this, so I think it's safe to count on DIYers to know what they're doing, as you said. If it becomes an issue, we could look into using the executable argument to subprocess.Popen to launch something more like a login shell.

@tgamblin
Copy link
Copy Markdown
Member

I'm ok with what's implemented here but I really wish there were more tests.

@becker33: if you could assume that, say, lmod (or modules) was available, and module worked, could you add tests for this functionality? We can install those things in the Travis environment and just skip them if module is not available. Look at the tests for svn and hg for how to mark tests that way.

@mpbelhorn @mamelara: do you mind testing this to ensure that it still works at your sites:

@tgamblin
Copy link
Copy Markdown
Member

oh and @adamjstewart: does this work well on BW now?

@adamjstewart
Copy link
Copy Markdown
Member

I'll test this tomorrow or next week. Can you rebase in the meantime?

@mpbelhorn
Copy link
Copy Markdown
Contributor

mpbelhorn commented Jul 26, 2018 via email

@mamelara
Copy link
Copy Markdown
Member

LGTM. Tested it out on Cori and Edison.

@tgamblin
Copy link
Copy Markdown
Member

@mpbelhorn: ping

@mpbelhorn
Copy link
Copy Markdown
Contributor

mpbelhorn commented Jul 31, 2018

Running into a problem on Titan. The platform.cray.targets attribute is not being populated:

[email protected]:.../belhorn/titan/clean_upstream_spack [features/use-module-interface|OK]
$ ./bin/spack -d arch        
==> '/usr/bin/env' '-' 'USER=belhorn' 'HOME=/ccs/home/belhorn' '/bin/bash' '--noprofile' '--norc' '-c' '. /etc/profile; module list -lt'
==> Found default modules:
       eswrap/1.3.3-1.020200.1280.0
       craype-network-gemini
       pgi/18.4.0
       craype/2.5.13
       cray-libsci/16.11.1
       udreg/2.3.2-1.0502.10518.2.17.gem
       ugni/6.0-1.0502.10863.8.28.gem
       pmi/5.0.12
       dmapp/7.0.1-1.0502.11080.8.74.gem
       gni-headers/4.0-1.0502.10859.7.8.gem
       xpmem/0.1-2.0502.64982.5.3.gem
       dvs/2.5_0.9.0-1.0502.2188.1.113.gem
       alps/5.2.4-2.0502.9774.31.12.gem
       rca/1.0.0-2.0502.60530.1.63.gem
       atp/2.1.1
       PrgEnv-pgi/5.2.82
       cray-mpich/7.6.3
       craype-interlagos
       lustredu/1.4
       xalt/0.7.5
       git/2.13.0
       module_msg/0.1
       modulator/1.2.0
       hsi/5.0.2.p1
       DefApps
Traceback (most recent call last):
  File "./bin/spack", line 85, in <module>
    sys.exit(spack.main.main())
  File "/autofs/nccs-svm1_sw/.testing/belhorn/titan/clean_upstream_spack/lib/spack/spack/main.py", line 608, in main
    return _main(command, parser, args, unknown)
  File "/autofs/nccs-svm1_sw/.testing/belhorn/titan/clean_upstream_spack/lib/spack/spack/main.py", line 495, in _main
    setup_main_options(args)
  File "/autofs/nccs-svm1_sw/.testing/belhorn/titan/clean_upstream_spack/lib/spack/spack/main.py", line 377, in setup_main_options
    spack.config.set('config:debug', True, scope='command_line')
  File "/autofs/nccs-svm1_sw/.testing/belhorn/titan/clean_upstream_spack/lib/spack/spack/config.py", line 523, in set
    return config.set(path, value, scope)
  File "/autofs/nccs-svm1_sw/.testing/belhorn/titan/clean_upstream_spack/lib/spack/llnl/util/lang.py", line 556, in __getattr__
    return getattr(self.instance, name)
  File "/autofs/nccs-svm1_sw/.testing/belhorn/titan/clean_upstream_spack/lib/spack/llnl/util/lang.py", line 552, in instance
    self._instance = self.factory()
  File "/autofs/nccs-svm1_sw/.testing/belhorn/titan/clean_upstream_spack/lib/spack/spack/config.py", line 492, in _config
    platform = spack.architecture.platform().name
  File "/autofs/nccs-svm1_sw/.testing/belhorn/titan/clean_upstream_spack/lib/spack/llnl/util/lang.py", line 182, in __call__
    self.cache[args] = self.func(*args)
  File "/autofs/nccs-svm1_sw/.testing/belhorn/titan/clean_upstream_spack/lib/spack/spack/architecture.py", line 498, in platform
    return platform_cls()
  File "/autofs/nccs-svm1_sw/.testing/belhorn/titan/clean_upstream_spack/lib/spack/spack/platforms/cray.py", line 82, in __init__
    self.add_target(name, self.targets[safe_name])
KeyError: 'interlagos'

[email protected]:.../belhorn/titan/clean_upstream_spack [features/use-module-interface|OK]
$ git ls
ac0c14688 | flake [Gregory Becker] (HEAD -> features/use-module-interface, origin/features/use-module-interface)
df6fc5cb5 | updated tests for faked modules [Gregory Becker]
27d455694 | Fixed python3 byte/string handling bug [Gregory Becker]
03215e873 | flake [Gregory Becker]
b8d6a76ce | flake fixes [Gregory Becker]
16ffd68ba | Change from modulecmd interface to module interface [Gregory Becker]
5be89bfb7 | poppler needs it test repo URL updated (#8519) [Adam J. Stewart]
398afa460 | pytest: add _pytest/_version.py and LICENSE [Todd Gamblin]
137456fbf | externals: move spack.util.ordereddict to external/ordereddict_backport [Todd Gamblin]
508bbf407 | New package xxhash. (#8530) [Adam J. Stewart]
b477868a5 | Add kealib 1.4.8 (#8522) [GitHub]
5b309ea6a | openPMD-api: update dependencies (#8528) [Adam J. Stewart]
e6834301b | glib: Update to 2.56.1 (#8523) [Adam J. Stewart]
f9683419f | mitofates: needs perl at runtime (#8526) [Adam J. Stewart]
5c731f54b | Bump ZeroMQ from v4.2.2 to v4.2.5 (#8514) [Massimiliano Culpo]

It looks like the output of this line on Titan is

/bin/sh: module: command not found

which is not surprising to me since in my interactive shell, module is just a shell function:

[email protected]:.../belhorn/titan/clean_upstream_spack [features/use-module-interface|OK]
$ which module
module () {
        eval `/sw/xk6/environment-modules/3.2.10.3/sles11.3_gnu4.9.0/bin/modulecmd zsh $*`
}

I would expect spack.util.module_cmd.module to call the python interface to tcl modules:

import os
exec(open('/usr/share/Modules/init/python.py').read())
module('load, 'modulefile', 'modulefile', '...')

The only reason I didn't do this is because I didn't want to inherit the currently loaded modules when trying to probe for the system default modules and so subprocess a shell which invokes module. @mamelara, I guess Cori and Edison have an actual module command? Nevermind, Cori looks the same as Titan to me.

In theory, a correctly configured TCL Environment Module system should have the init scripts located at $MODULESHOME/init, which is the case on Titan.

@mpbelhorn
Copy link
Copy Markdown
Contributor

I see exactly the same behavior on Cori as I do on Titan:

belhorn@cori07:~/development 
$ git clone https://github.com/spack/spack.git
Cloning into 'spack'...
remote: Counting objects: 120993, done.
remote: Compressing objects: 100% (48/48), done.
remote: Total 120993 (delta 28), reused 8 (delta 3), pack-reused 120941
Receiving objects: 100% (120993/120993), 40.76 MiB | 2.01 MiB/s, done.
Resolving deltas: 100% (58525/58525), done.
belhorn@cori07:~/development 
$ cd spack 
belhorn@cori07:~/development/spack [develop|OK]
$ git checkout features/use-module-interface
Branch features/use-module-interface set up to track remote branch features/use-module-interface from origin.
Switched to a new branch 'features/use-module-interface'
belhorn@cori07:~/development/spack [features/use-module-interface|OK]
$ ./bin/spack compilers
Traceback (most recent call last):
  File "./bin/spack", line 85, in <module>
    sys.exit(spack.main.main())
  File "/global/u2/b/belhorn/development/spack/lib/spack/spack/main.py", line 580, in main
    parser.add_command(cmd_name)
  File "/global/u2/b/belhorn/development/spack/lib/spack/spack/main.py", line 291, in add_command
    module.setup_parser(subparser)
  File "/global/u2/b/belhorn/development/spack/lib/spack/spack/cmd/compilers.py", line 34, in setup_parser
    scopes = spack.config.scopes()
  File "/global/u2/b/belhorn/development/spack/lib/spack/spack/config.py", line 528, in scopes
    return config.scopes
  File "/global/u2/b/belhorn/development/spack/lib/spack/llnl/util/lang.py", line 556, in __getattr__
    return getattr(self.instance, name)
  File "/global/u2/b/belhorn/development/spack/lib/spack/llnl/util/lang.py", line 552, in instance
    self._instance = self.factory()
  File "/global/u2/b/belhorn/development/spack/lib/spack/spack/config.py", line 492, in _config
    platform = spack.architecture.platform().name
  File "/global/u2/b/belhorn/development/spack/lib/spack/llnl/util/lang.py", line 182, in __call__
    self.cache[args] = self.func(*args)
  File "/global/u2/b/belhorn/development/spack/lib/spack/spack/architecture.py", line 498, in platform
    return platform_cls()
  File "/global/u2/b/belhorn/development/spack/lib/spack/spack/platforms/cray.py", line 82, in __init__
    self.add_target(name, self.targets[safe_name])
KeyError: 'haswell'

@mamelara
Copy link
Copy Markdown
Member

mamelara commented Jul 31, 2018

Just to note, I did not checkout a fresh copy of Spack so it could be that my configuration is much different.

mamelara@cori07 spack (use-module-interface)$ spack compilers
==> Available compilers
-- cce cnl6-any -------------------------------------------------
[email protected]  [email protected]  [email protected]

-- gcc cnl6-any -------------------------------------------------
[email protected]  [email protected]  [email protected]  [email protected]  [email protected]  [email protected]  [email protected]

-- gcc sles12-x86_64 --------------------------------------------
[email protected]

-- intel cnl6-any -----------------------------------------------
[email protected]  [email protected]  [email protected]  [email protected]  [email protected]  [email protected]  [email protected]

-- pgi cnl6-any -------------------------------------------------
pgi@2017
mamelara@cori07 spack (use-module-interface)$

Either way something must have broke and I didn't see it with my own copy of spack.

Copy link
Copy Markdown
Member

@adamjstewart adamjstewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just did a quick test (spack install zlib) with Python 2.6.9 and 3.5.5 on Blue Waters. I'm still seeing the Python 3 error message:

$ spack -d install zlib %gcc
...
==> Installing zlib dependencies
==> Installing zlib
==> '/usr/bin/env' '-' 'USER=stewart1' 'HOME=/u/sciteam/stewart1' '/bin/bash' '--noprofile' '--norc' '-c' '. /etc/profile; module list -lt'
==> Found default modules:
       modules/3.2.10.4
       eswrap/1.3.3-1.020200.1280.0
       cce/8.4.6
       craype-network-gemini
       craype/2.5.8
       cray-libsci/16.11.1
       udreg/2.3.2-1.0502.10518.2.17.gem
       ugni/6.0-1.0502.10863.8.28.gem
       pmi/5.0.10-1.0000.11050.179.3.gem
       dmapp/7.0.1-1.0502.11080.8.74.gem
       gni-headers/4.0-1.0502.10859.7.8.gem
       xpmem/0.1-2.0502.64982.5.3.gem
       dvs/2.5_0.9.0-1.0502.2188.1.113.gem
       alps/5.2.4-2.0502.9774.31.12.gem
       rca/1.0.0-2.0502.60530.1.63.gem
       atp/2.0.4
       PrgEnv-cray/5.2.82
       cray-mpich/7.5.0
       craype-interlagos
       torque/6.0.4
       moab/9.1.2-sles11
       java/jdk1.8.0_51
       globus/5.2.5
       gsissh/6.2p2
       xalt/0.7.6.local
       scripts
       OpenSSL/1.0.2m
       cURL/7.59.0
       git/2.17.0
       wget/1.19.4
       user-paths
       gnuplot/5.0.5
       darshan/3.1.3
==> Error: TypeError: the JSON object must be str, not 'bytes'

/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/build_environment.py:694, in child_process:
        691            tb_string = traceback.format_exc()
        692
        693            # build up some context from the offending package so we can
  >>    694            # show that, too.
        695            package_context = get_package_context(tb)
        696
        697            build_log = None


Traceback (most recent call last):
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/build_environment.py", line 676, in child_process
    setup_package(pkg, dirty=dirty)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/build_environment.py", line 585, in setup_package
    pkg.architecture.platform.setup_platform_environment(pkg, spack_env)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/platforms/cray.py", line 109, in setup_platform_environment
    module('unload', mod)
  File "/mnt/a/u/sciteam/stewart1/spack/lib/spack/spack/util/module_cmd.py", line 48, in module
    env_dict = json.loads(module_p.communicate()[0])
  File "/mnt/bwpy/single/usr/lib/python3.5/json/__init__.py", line 312, in loads
    s.__class__.__name__))
TypeError: the JSON object must be str, not 'bytes'

@becker33 becker33 force-pushed the features/use-module-interface branch from 34a8711 to 031cf9a Compare August 28, 2018 17:57
@becker33
Copy link
Copy Markdown
Member Author

@adamjstewart your bug should be gone now.

@becker33
Copy link
Copy Markdown
Member Author

@mpbelhorn One reason not to use the python interface to modules is that neither TCL modules nor Lmod have python3 compliant python interfaces. If we could make it work, it would be preferable to this method.

@becker33 becker33 force-pushed the features/use-module-interface branch from d2527dd to 039f477 Compare May 9, 2019 21:13
@tgamblin
Copy link
Copy Markdown
Member

tgamblin commented May 9, 2019

Merging this because it seems to fix an awful lot of issues. But @becker33: can you follow up with @floli about the issue he sees above? I think it is unrelated but I could be wrong.

@becker33
Copy link
Copy Markdown
Member Author

becker33 commented May 9, 2019

@floli Does this bug still manifest for you?

@tgamblin tgamblin merged commit 3d3cea1 into develop May 9, 2019
@floli
Copy link
Copy Markdown

floli commented May 10, 2019

Ok, using a fresh checkout of develop, I could not reproduce the error message from #8570 (comment).

@adamjstewart adamjstewart deleted the features/use-module-interface branch May 10, 2019 18:18
scheibelp pushed a commit that referenced this pull request May 14, 2019
Remove a vestigial print statement introduced in #8570
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Could not detect module function from bash. Trying to detect modulecmd from which

9 participants