Speedup environment activation, part 2 #25556
adamjstewart wants to merge 10 commits into spack:develop
Conversation
Instead of caching this by DAG hash, can you just make get_python_inc and get_python_lib delegate to a single method to get all the information you need from the underlying python and cache that?
The expensive thing here is running self.command at all. Make a function like get_python_data() that runs something like:
```python
import json
from distutils import sysconfig
from distutils.sysconfig import get_python_inc, get_python_lib

results = [
    fn(plat_specific=ps) for fn, ps in [
        (get_python_lib, True),
        (get_python_lib, False),
        (get_python_inc, True),
        (get_python_inc, False),
    ]
]
# Print one JSON document so the interpreter only has to start once.
print(json.dumps({
    "PREFIX": sysconfig.PREFIX,
    "EXEC_PREFIX": sysconfig.EXEC_PREFIX,
    "results": results,
}))
```

You don't need to call this for every package in the DAG -- all distutils is doing is replacing PREFIX with your prefix (for plat_specific=False) and EXEC_PREFIX with your prefix (for plat_specific=True). I don't think the results are prefix-dependent at all.
Mark the function that runs this with @llnl.util.lang.memoized and you can avoid implementing all the caching yourself.

This approach should call self.command once for a given python package instead of O(nodes) times like yours does. Python startup is very expensive, so we should avoid it as much as possible.
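A minimal sketch of what a memoizing decorator like @llnl.util.lang.memoized does (this is a stand-in, not Spack's actual implementation): cache results per argument tuple, so the expensive call -- here standing in for launching the python interpreter -- runs at most once per distinct input.

```python
import functools

def memoized(func):
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        # Only pay for the underlying call on a cache miss.
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper

calls = []

@memoized
def get_python_data(python_exe):
    # Stand-in for the expensive subprocess call to the interpreter.
    calls.append(python_exe)
    return {"PREFIX": "/usr", "EXEC_PREFIX": "/usr"}

a = get_python_data("/usr/bin/python3")
b = get_python_data("/usr/bin/python3")  # served from the cache
```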
Comment: we should really avoid calling

I think my latest commit is close to what @tgamblin is looking for. Unfortunately, it doesn't work and I haven't yet figured out why.

@adamjstewart could you also see if https://github.com/spack/spack/commit/383cbcb0521a3c7e9168fb166ab6a6bc6cde72d5.patch changes anything for you? For me it saves another 0.4 seconds, which probably isn't really measurable in your env :p but who knows
@haampie thanks for the suggestion, that commit does indeed help quite a bit with activation/deactivation, although less with env update:

```console
$ time spack env activate .

real    0m29.377s
user    0m20.498s
sys     0m7.718s

$ time spack env view regenerate
==> Updating view at /Users/Adam/.spack/.spack-env/view

real    15m54.769s
user    10m23.846s
sys     4m55.371s

$ time spack env deactivate

real    0m55.149s
user    0m35.586s
sys     0m15.777s
```

No idea what the repercussions of that commit are though.

Wait, that's 78% faster activate with only that commit and not this PR? That would make a more compelling argument for #25507 than the results I've collected there 😅
Yes, that's only with your commit, not with this PR. No idea why both seem to be non-orthogonal. Once I get this PR working again I'll try combining them.
I don't think we need the trace checking. If it speeds things up that much we should get rid of it as @haampie suggests. That said, I wonder what the percent speedup is with this PR. #24095 added a bunch of redundancy so I suspect the percent speedup will be smaller when this is fixed. @alalazo: thoughts on getting rid of inspect in env modifications?
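A hypothetical micro-benchmark (function names made up; absolute numbers will vary by machine) illustrating why inspect-based trace checking is costly: `inspect.stack()` walks and resolves every frame on each call, which dwarfs a plain function call.

```python
import inspect
import timeit

def with_trace():
    # Resolve the full call stack, as caller-tracking code typically does.
    return inspect.stack()[0].function

def without_trace():
    # The same kind of trivial work with no frame inspection at all.
    return "with_trace"

t_trace = timeit.timeit(with_trace, number=200)
t_plain = timeit.timeit(without_trace, number=200)
```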
Nope, even if I remove
Okay, with the latest changes we're about the same as with my first commit:

```console
$ time spack env activate .

real    0m7.588s
user    0m6.731s
sys     0m0.803s

$ time spack env view regenerate
==> Updating view at /Users/Adam/.spack/.spack-env/view

real    2m33.586s
user    1m49.626s
sys     0m32.851s

$ time spack env deactivate

real    0m9.635s
user    0m7.581s
sys     0m1.456s
```

When stacked with @haampie's suggestion to remove trace checking, we get:

```console
$ time spack env activate .

real    0m7.077s
user    0m6.362s
sys     0m0.537s

$ time spack env view regenerate
==> Updating view at /Users/Adam/.spack/.spack-env/view

real    1m25.397s
user    1m22.537s
sys     0m1.675s

$ time spack env deactivate

real    0m7.497s
user    0m6.633s
sys     0m0.617s
```
@spackbot run pipeline

I've started that pipeline for you!
Does anyone have any idea why GitLab is failing? Same error message twice now, so I'm guessing this PR broke something.

Here's the log. This is the error:
I've started that pipeline for you! |
Force-pushed 7fb14dc to e3de17d
Still a lot of placeholder stuff in the build logs. I think I have to rebase for GitLab changes to take effect. Rebasing...
GitLab failure is coming from
@scottwittenburg GitLab CI is awesome because without it we would have merged multiple bugs that would've completely broken Spack's Python builds! Pretty sure things are better now. Going to build my entire Python stack with this PR to make sure.
Successfully installed an environment containing 250+ Python packages using this branch. I'm not sure why the GitLab tests are still failing, but this PR should be working now.
Downloaded the build output log from one of the failed builds, here's the error:
Yep, that was happening before when I was installing cython to the wrong location, but that should be fixed now. GitLab doesn't cache "successful" installations from previous commits, does it?
@scottwittenburg: I think the issue here is that @adamjstewart is making core changes to PythonPackage, but the packages are getting cached based on sub-package-specific factors. Full hash doesn't include the build system superclasses right now. So I'm guessing that this isn't a GitLab caching issue -- it's that the full hash isn't capturing the changes.

@adamjstewart I think the thing to do here is to open one more PR from this branch, now that you're sure it works. Can you try that? The pipeline for that PR won't have full hashes cached for this work and things changed here will get a final rebuild.

We do not yet have a way to rebuild everything in a PR... but we might need that and a way to invalidate existing caches for a PR (note that the first is a much bigger hammer). I could also "just" fix the full hash, but that is probably best done in a separate PR.
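A hypothetical sketch of what including build-system superclasses in the full hash could look like (the hashing scheme and class names are illustrative, not Spack's actual implementation): walk the MRO and fold each class's own attributes into one digest, so an edit to a base class like PythonPackage changes the hash of every package that inherits from it.

```python
import hashlib

def full_hash(pkg_class):
    """Hash a package class together with every superclass in its MRO."""
    h = hashlib.sha256()
    for cls in pkg_class.__mro__:
        if cls is object:
            continue
        h.update(cls.__name__.encode())
        # Fold in each class's own (non-inherited, non-dunder) attributes.
        for name in sorted(vars(cls)):
            if not name.startswith("__"):
                h.update(name.encode())
                h.update(repr(vars(cls)[name]).encode())
    return h.hexdigest()

# Two hierarchies whose package bodies are identical, but whose
# build-system base classes differ:
class OldBuildSystem:
    build_flag = "--build-backend=v1"

class NewBuildSystem:
    build_flag = "--build-backend=v2"

class PkgOnOld(OldBuildSystem):
    version = "1.0"

class PkgOnNew(NewBuildSystem):
    version = "1.0"
```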
I see, so based on that, what I could do is wipe out the per-PR mirror contents (where, as @adamjstewart suggested, successful builds from previous pushes are cached), and then we could ask spackbot to run the pipeline again. I can also just remove a subset of the binaries if you guys think deleting everything is too heavy-handed.
It would be great if we could ask spackbot to rebuild everything from source. That will require some support in spack.
Ah, nevermind, I see you just opened a new PR. 👍
@scottwittenburg looks like @adamjstewart has made a PR for this, but a spackbot command for exactly that (removing all caches for a PR) would be huge. And another for rebuilding everything would also be good (I think this one is harder). We can wait on that this time, but note here that if we enable PR graduation for PRs like this one, and their caches are actually wrong, then we'll be promoting bad stuff to the shared PR cache. I think the right thing to do here is to make the full hash better. But I suspect we'll still want this kind of thing for real core changes -- even ones outside of the build system packages.
This is a direct followup to #13557 which caches additional attributes that were added in #24095 that are expensive to compute. I had to reopen #25556 in another PR to invalidate the GitLab CI cache, but see #25556 for prior discussion.

### Before

```console
$ time spack env activate .

real    2m13.037s
user    1m25.584s
sys     0m43.654s

$ time spack env view regenerate
==> Updating view at /Users/Adam/.spack/.spack-env/view

real    16m3.541s
user    10m28.892s
sys     4m57.816s

$ time spack env deactivate

real    2m30.974s
user    1m38.090s
sys     0m49.781s
```

### After

```console
$ time spack env activate .

real    0m8.937s
user    0m7.323s
sys     0m1.074s

$ time spack env view regenerate
==> Updating view at /Users/Adam/.spack/.spack-env/view

real    2m22.024s
user    1m44.739s
sys     0m30.717s

$ time spack env deactivate

real    0m10.398s
user    0m8.414s
sys     0m1.630s
```

Fixes #25555
Fixes #25541

* Speedup environment activation, part 2
* Only query distutils a single time
* Fix KeyError bug
* Make vermin happy
* Manual memoize
* Add comment on cross-compiling
* Use platform-specific include directory
* Fix multiple bugs
* Fix python_inc discrepancy
* Fix import tests