Skip to content

Introduce a class to represent a "closed set of concrete specs"#40636

Closed
alalazo wants to merge 15 commits intospack:developfrom
alalazo:maintainers/environment-simplify-unification
Closed

Introduce a class to represent a "closed set of concrete specs"#40636
alalazo wants to merge 15 commits intospack:developfrom
alalazo:maintainers/environment-simplify-unification

Conversation

@alalazo
Copy link
Copy Markdown
Member

@alalazo alalazo commented Oct 20, 2023

There are many places in Spack where we need to operate with a set of consistent spec objects, i.e. a set of specs where:

  1. A single DAG hash refers to a single object in memory
  2. Each spec has no references to other specs outside the set

This is, for instance, the case when we need to reconstruct a concrete spec during concretization, in particular if that spec is provided by more than one source of "reusable" specs (see #39590).

Another case is when we concretize or reconstruct environments. In those cases we have custom ad-hoc code to ensure consistency in memory. For concretization that would be:

# Unify the specs objects, so we get correct references to all parents
self._read_lockfile_dict(self._to_lockfile_dict())
# Re-attach information on test dependencies
if tests:
# This is slow, but the information on test dependency is lost
# after unification or when reading from a lockfile.
for h in self.specs_by_hash:
current_spec, computed_spec = self.specs_by_hash[h], by_hash[h]
for node in computed_spec.traverse():
test_edges = node.edges_to_dependencies(depflag=dt.TEST)
for current_edge in test_edges:
test_dependency = current_edge.spec
if test_dependency in current_spec[node.name]:
continue
current_spec[node.name].add_dependency_edge(
test_dependency.copy(), depflag=dt.TEST, virtuals=current_edge.virtuals
)
results = [
(abstract, self.specs_by_hash[h])
for abstract, h in zip(self.concretized_user_specs, self.concretized_order)
]
return results

while when reading an environment from a lockfile:

# Traverse the root specs one at a time in the order they appear.
# The first time we see each DAG hash, that's the one we want to
# keep. This is only required as long as we support older lockfile
# formats where the mapping from DAG hash to lockfile key is possibly
# one-to-many.
for lockfile_key in self.concretized_order:
for s in specs_by_hash[lockfile_key].traverse():
if s.dag_hash() not in first_seen:
first_seen[s.dag_hash()] = s
# Now make sure concretized_order and our internal specs dict
# contains the keys used by modern spack (i.e. the dag_hash
# that includes build deps and package hash).
self.concretized_order = [
specs_by_hash[h_key].dag_hash() for h_key in self.concretized_order
]
for spec_dag_hash in self.concretized_order:
self.specs_by_hash[spec_dag_hash] = first_seen[spec_dag_hash]

This PR is an attempt at collecting these similar operations together into a class, which can be reused in different places within the codebase. This class is a mapping from hashes to concrete specs that:

  1. Ensures consistency of the specs being managed
  2. Provides methods to add, delete, query specs

The idea is to refine an API for the class in this PR, and use it for Environments, and apply it in later PRs also to Databases and Indexes.

@spackbot-app spackbot-app bot added core PR affects Spack core functionality environments labels Oct 20, 2023
@alalazo alalazo marked this pull request as draft October 30, 2023 11:20
@alalazo alalazo changed the title Use ConcreteSpecsByHash to unify environments Use ConcreteSpecsByHash in environments Nov 2, 2023
@alalazo alalazo force-pushed the maintainers/environment-simplify-unification branch from 6796aef to 74c35d0 Compare November 2, 2023 23:09
@spackbot-app spackbot-app bot added commands stand-alone-tests Stand-alone (or smoke) tests for installed packages tests General test capability(ies) labels Nov 2, 2023
@alalazo alalazo force-pushed the maintainers/environment-simplify-unification branch from 60aa55c to 8016344 Compare November 15, 2023 15:38
@alalazo alalazo changed the title Use ConcreteSpecsByHash in environments Introduce a class to represent a "closed set of concrete specs" Nov 21, 2023
@alalazo alalazo marked this pull request as ready for review November 21, 2023 09:13
@alalazo alalazo force-pushed the maintainers/environment-simplify-unification branch from 8016344 to 4b4ea7d Compare January 5, 2024 16:34
This method receives a list of concrete specs to keep, and removes
everything else.
@alalazo alalazo force-pushed the maintainers/environment-simplify-unification branch from 4b4ea7d to f5687db Compare January 30, 2024 12:23
@haampie
Copy link
Copy Markdown
Member

haampie commented Jan 31, 2024

I'm not immediately a fan of the fact that this always duplicates.

  1. It's redundant in many cases (and then it's just a performance penalty)
  2. I don't know if we store any properties on Spec object like we do on PackageBase instances, like installed_from_binary_cache, that would get lost after _dup.

If you wanna go this way in environments, you should use ConcreteSpecsByHash as a spec builder, instead of first building the Spec objects graph, only to tear them apart and reconstruct them in ConcreteSpecsByHash again.

The overhead of deserializing an environment is really felt in CI with e.g. the e4s env, any command spack -e <env> ... is ultra slow already.

@alalazo alalazo closed this Mar 2, 2026
@alalazo alalazo deleted the maintainers/environment-simplify-unification branch March 2, 2026 09:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

commands core PR affects Spack core functionality environments refactoring stand-alone-tests Stand-alone (or smoke) tests for installed packages tests General test capability(ies)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants