Speedup environment activation#13557
Conversation
|
@adamjstewart Can you try this and report back? I'll try to see if I can improve it further in terms of performance. |
|
I think a refactor of |
|
Here are the same benchmarks on my laptop. From my home directory: $ time spack help
...
real 0m2.366s
user 0m0.728s
sys 0m0.298sFrom the directory containing the large $ time spack help
...
real 0m6.649s
user 0m4.290s
sys 0m1.054sDefinitely a drastic improvement over |
|
@adamjstewart So far I tried to optimize for your small environment. Making a small break while the big one is building 🙂 |
|
@adamjstewart Tried on the big environment. I might shave something more if we are fine losing the information in: spack/lib/spack/spack/util/environment.py Lines 338 to 348 in 1114ae9 in cases where we modify the environment a lot. If need be I can compute the information only in debug mode, but if the timings are good enough without this change I would keep the PR as it is. Let me know. |
|
I'm not sure how that function is used, so I don't know if we can skip it or not. |
|
@adamjstewart If there are warnings on e.g. the same variable set two times in a DAG (so the first "set" won't have any effect) it will print context to tell you where both set took place. It would be good to keep it as it is, so if the performance is acceptable right now I'd avoid making any change. |
|
Yeah, I guess it depends on how much of a difference it makes. The current PR drops us from 2 minutes to 6 seconds. If it would drop us from 6 seconds to 3 seconds, it would be worth it in my mind. |
You didn't try the latest 2 commits 😉 |
|
Okay, with the latest commit, I'm down to 4 seconds. |
tgamblin
left a comment
There was a problem hiding this comment.
I tried cherry-picking this into our new tutorial branch but it breaks the tutorial with
==> Error: name 'spack_cc' is not defined
I haven't had a chance to dig into it, but I moved it to 0.13.2 to be on the safe side.
6bfa4e2 to
0aa8f7c
Compare
0aa8f7c to
0cd2610
Compare
tgamblin
left a comment
There was a problem hiding this comment.
@alalazo: These are the functions in packages where spack_cc is used in instance methods within packages:
$ git grep -ph 'spack_cc\b' | grep def| sort | uniq -c | sort -rn
11 def edit(self, spec, prefix):
8 def install(self, spec, prefix):
6 def build(self, spec, prefix):
4 def configure_args(self):
3 def setup_dependent_environment(self, spack_env, run_env, dependent_spec):
2 def setup_dependent_environment(self, *args):
2 def make_defs(self):
1 def setup_environment(self, spack_env, run_env):
1 def setup_dependent_package(self, module, dependent_spec):
1 def setup_dependent_build_environment(self, env, dependent_spec):
1 def setup_build_environment(self, env):
1 def patch(self):
1 def filter_compilers(self):
1 def cmake_args(self):
1 def build_targets(self):
1 def build_libsqlitefunctions(self):
1 def build_args(self, spec, prefix):To get git grep -p to work properly for me I had to add this to my git attributes:
$ cat ~/src/spack/.git/info/attributes
*.py diff=pythonIf you do that I think it's a decent way of seeing where the various module variables are used. I think we need to either:
- ensure that these things are available in package instance methods even when they're called outside the build, or
- be clearer about their availability and fix the packages.
I think for now (1) makes the most sense. Is reconstructing the compiler paths really all that slow? What's the costly part of that operation?
0cd2610 to
3dfaa89
Compare
The most expensive calls seems to be the ones trying to recover compilers and other variables that are needed only at build time. Avoid to compute them wherever possible.
When setting up dependencies the `home` property of the Python package might be called a large number of times. Since its computation is expensive here we cache it based on the DAG hash of the Python spec.
This reverts commit c2f6113.
3dfaa89 to
d29f467
Compare
|
@adamjstewart: how fast is this for you now? |
|
On my laptop: Before (develop)From my home directory: $ time spack help
...
real 0m2.871s
user 0m0.719s
sys 0m0.307sFrom a directory containing a large $ time spack help
...
real 3m22.219s
user 3m10.927s
sys 0m3.275sAfter (fixes/speedup_environment_activation)From my home directory: $ time spack help
...
real 0m1.972s
user 0m0.682s
sys 0m0.271sFrom a directory containing a large $ time spack help
...
real 0m5.015s
user 0m3.607s
sys 0m0.612sLet me know if you want benchmarks on Blue Waters where the problem is even more pronounced. |
* Add a transaction around repeated calls to `spec.prefix` in the activation process * cache the computation of home in the python package to speed up setting deps * ensure that module-scope variables are only set *once* per module
v0.13.2 This release contains major performance improvements for Spack environments, as well as some bugfixes and minor changes. * allow missing modules if they are blacklisted (#13540) * speed up environment activation (#13557) * mirror path works for unknown versions (#13626) * environments: don't try to modify run-env if a spec is not installed (#13589) * use semicolons instead of newlines in module/python command (#13904) * verify.py: os.path.exists exception handling (#13656) * Document use of the maintainers field (#13479) * bugfix with config caching (#13755) * hwloc: added 'master' version pointing at the HEAD of the master branch (#13734) * config option to allow gpg warning suppression (#13744) * fix for relative symlinks when relocating binary packages (#13727) * allow binary relocation of strings in relative binaries (#13724)
v0.13.2 This release contains major performance improvements for Spack environments, as well as some bugfixes and minor changes. * allow missing modules if they are blacklisted (#13540) * speed up environment activation (#13557) * mirror path works for unknown versions (#13626) * environments: don't try to modify run-env if a spec is not installed (#13589) * use semicolons instead of newlines in module/python command (#13904) * verify.py: os.path.exists exception handling (#13656) * Document use of the maintainers field (#13479) * bugfix with config caching (#13755) * hwloc: added 'master' version pointing at the HEAD of the master branch (#13734) * config option to allow gpg warning suppression (#13744) * fix for relative symlinks when relocating binary packages (#13727) * allow binary relocation of strings in relative binaries (#13724)
* Add a transaction around repeated calls to `spec.prefix` in the activation process * cache the computation of home in the python package to speed up setting deps * ensure that module-scope variables are only set *once* per module
This is a direct followup to #13557 which caches additional attributes that were added in #24095 that are expensive to compute. I had to reopen #25556 in another PR to invalidate the GitLab CI cache, but see #25556 for prior discussion. ### Before ```console $ time spack env activate . real 2m13.037s user 1m25.584s sys 0m43.654s $ time spack env view regenerate ==> Updating view at /Users/Adam/.spack/.spack-env/view real 16m3.541s user 10m28.892s sys 4m57.816s $ time spack env deactivate real 2m30.974s user 1m38.090s sys 0m49.781s ``` ### After ```console $ time spack env activate . real 0m8.937s user 0m7.323s sys 0m1.074s $ time spack env view regenerate ==> Updating view at /Users/Adam/.spack/.spack-env/view real 2m22.024s user 1m44.739s sys 0m30.717s $ time spack env deactivate real 0m10.398s user 0m8.414s sys 0m1.630s ``` Fixes #25555 Fixes #25541 * Speedup environment activation, part 2 * Only query distutils a single time * Fix KeyError bug * Make vermin happy * Manual memoize * Add comment on cross-compiling * Use platform-specific include directory * Fix multiple bugs * Fix python_inc discrepancy * Fix import tests
fixes #13554