Skip to content

spec: Faster __hash__#51536

Merged
haampie merged 5 commits intospack:developfrom
vbrunini:spec_cache_pyhash
Jan 16, 2026
Merged

spec: Faster __hash__#51536
haampie merged 5 commits intospack:developfrom
vbrunini:spec_cache_pyhash

Conversation

@vbrunini
Copy link
Copy Markdown
Contributor

@vbrunini vbrunini commented Nov 7, 2025

Split out of #51518 based on reviewer comments @haampie

vbrunini added a commit to vbrunini/spack that referenced this pull request Nov 8, 2025
Combined with spack#51535, spack#51536, spack#51367 and the reduced install
task sleep time from spack#51491 this reduces the time for a no-op install of
a development environment with 94 packages (including dependencies) from 8s
to 2.8s for me. It also provides a similar reduction in time for
`spack build-env --dump env.txt mypkg` which we frequently use to start a
shell with the build environment.

Signed-off-by: Victor Brunini <[email protected]>
@haampie
Copy link
Copy Markdown
Member

haampie commented Nov 8, 2025

Specs are mutable, this breaks the invariant x == y => hash(x) == hash(y). A better approach is to make the hash function cheaper (at the possible cost of more collisions in dicts and sets)

@vbrunini vbrunini changed the title spec: Cache result of __hash__ for non-concrete specs. spec: Faster __hash__ Nov 10, 2025
@vbrunini
Copy link
Copy Markdown
Contributor Author

Specs are mutable, this breaks the invariant x == y => hash(x) == hash(y). A better approach is to make the hash function cheaper (at the possible cost of more collisions in dicts and sets)

Tangent to this PR, but I started looking at the concretize setup time for my development environment as well and it spends a large fraction of time formatting Spec's to strings repeatedly for the same Spec's. I hacked in a similar change to the initial pass of caching the hash from this PR, but now caching the result of __str__. It would have some similar issues with Spec mutability I think, but it reduced concretize setup time from ~30s to ~20s.

@haampie
Copy link
Copy Markdown
Member

haampie commented Nov 21, 2025

Coming back to this (sorry for the delay).

Related to your last comment, I'm wondering whether we could define

class ImmutableSpec(Spec):
  def __hash__(self):
    try:
      return self._immutable_hash
    except AttributeError:
      h = super().__hash()
      self._immutable_hash = h
      return h
  # and the same for __str__

and then construct objects like these in directives.py.

Another possible advantage is that you can define the empty spec as a constant:

EmptySpec = ImmutableSpec()

and use that in directives that don't have a when condition, instead of constructing a new empty spec. That is advantageous because Python has a fast path in dicts and sets when x is y.

@haampie
Copy link
Copy Markdown
Member

haampie commented Nov 21, 2025

And regarding __str__, before caching it, it might be good to know what the call sites are. For all we know it's unused, for example in tty.debug(f"... {spec}") with debug messages disabled, but the string is constructed.

@haampie
Copy link
Copy Markdown
Member

haampie commented Nov 21, 2025

I've tried this PR, and my suggestion, but neither are significant improvements.

@haampie
Copy link
Copy Markdown
Member

haampie commented Nov 21, 2025

Also, just to confirm my original concern (no longer relevant after the change in this PR): we do mutate specs in various places, so freezing certain properties can be dangerous. I noticed the following examples:

  • directives.py: when_spec.name = pkg.name
  • asp.py: def named_spec -- here, after mutating name, __str__ is called

@haampie
Copy link
Copy Markdown
Member

haampie commented Jan 14, 2026

@vbrunini there have been many PRs recently aimed to speed up the Python bits of Spack, from #51819 to micro optimizations like #51841. Setup time has reduced by >25% over the last week.

We also socialized the idea that, by convention, Spec objects should be immutable, whether abstract or concrete. At this point:

  • named_spec was removed, as well as another instance, so the solver is no longer mutating specs.
  • directives.py mutation remains but is not particularly bad as it's part of initialization of a new Spec object.
  • ClosedOpenRange was already immutable and does cache its str and hash value: version_types.py: cache ClosedOpenRange __str__ and __hash__ #51832

I'm still not sure whether caching __hash__ and __str__ is currently best, maybe it's too early. Also, it would be good to check with a histogram how often __str__ and __hash__ are actually called per instance.

Rebasing your PR I see about a 4.38% reduction in package load time (best of 5):

for i in $(seq 5); do time python3 -c 'import spack.repo; x = list(spack.repo.PATH.all_package_classes())'; done

For concretization I see a 6.3% reduction in setup time.

I would consider that significant.

Do you wanna revisit this PR?

@haampie
Copy link
Copy Markdown
Member

haampie commented Jan 14, 2026

I think something like this is sufficient?

        if not self._dependencies:  # fast path for trivial abstract specs (followed 94% in my tests)
            return hash((
                self.name,  # cheap
                self.namespace,  # cheap
                self.versions,
                self.variants if self.variants.dict else None,  # avoids expensive function call
                self.architecture,
                self.abstract_hash,  # cheap
            ))

This gives similar performance to caching the hash in the prior commit
but should avoid the issues with Spec's being mutable brought up in review.

Signed-off-by: Victor Brunini <[email protected]>
Signed-off-by: Victor Brunini <[email protected]>
Timing "spack-python -c 'import spack.repo; x = list(spack.repo.PATH.all_package_classes())'"
I see:
Before: 15.5s
After: 14.8s

Signed-off-by: Victor Brunini <[email protected]>
Signed-off-by: Victor Brunini <[email protected]>
Copy link
Copy Markdown
Member

@haampie haampie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good to me.

@haampie haampie enabled auto-merge (squash) January 16, 2026 20:07
@haampie haampie merged commit 7868be7 into spack:develop Jan 16, 2026
31 of 32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants