Skip to content

Fix environment reading from lockfile to trust written hashes#25879

Merged
tgamblin merged 7 commits intodevelopfrom
bugfix/old-env-hash-reading
Sep 13, 2021
Merged

Fix environment reading from lockfile to trust written hashes#25879
tgamblin merged 7 commits intodevelopfrom
bugfix/old-env-hash-reading

Conversation

@becker33
Copy link
Copy Markdown
Member

@becker33 becker33 commented Sep 10, 2021

Fixes #25867

@haampie discovered a long-standing bug that had never been triggered prior to #22845 because the hashing algorithm had been stable for multiple years while the bug was in production. The bug was that when reading a concretized environment, Spack did not properly read in the build hashes associated with the specs in the environment. Those hashes were recomputed (and as long as we didn't change the algorithm, were recomputed identically). Spack's policy, though, is never to recompute a hash. Once something is installed, we respect its metadata hash forever -- even if internally Spack changes the hashing method. Put differently, once something is concretized, it has a concrete hash, and that's it -- forever.

When we changed the hashing algorithm for performance in #22845 we exposed the bug. This PR fixes the bug at its source, but properly reading in the cached build hash attributes associated with the specs. I've also renamed some variables in the Environment class methods to make a mistake of this sort more difficult to make in the future.

Includes a regression test.

@nhanford and @tgamblin assisted in debugging.

@spackbot-app spackbot-app bot added environments tests General test capability(ies) labels Sep 10, 2021
@tgamblin
Copy link
Copy Markdown
Member

@bernhardkaindl FYI -- do you mind trying this fix out?

@bernhardkaindl
Copy link
Copy Markdown
Contributor

bernhardkaindl commented Sep 10, 2021

@bernhardkaindl FYI -- do you mind trying this fix out?

  • The KeyError from old environments is completely gone! 👍

When checking out an old spack commit, this is still happening (but we could fix this with PR #25890):

(with -d: InvalidDatabaseVersionError: )Expected database version 5 but found version 6.

However, spack reindex ends execution at the same error. rm opt/spack/.spack-db/index.json is needed instead, after which, reindex is done automatically.

With a reindexed db, "AttributeError: 'str' object has no attribute 'get'" blocks operation when older spack versions try to read the new misc_cache. Like rm opt/spack/.spack-db/index.json, a spack clean -m triggers a regeneration of the misc_cache, but that takes a long time. Here is a debug trace of that:

spack -d install anyword 2>&1 |sed "s@$SPACK_ROOT@\$spack@;s/2021.*]/time]/"
==> [time] Imported install from built-in commands
==> [time] Reading config file $spack/etc/spack/defaults/config.yaml
==> [time] Imported install from built-in commands
==> [time] Reading config file $spack/etc/spack/defaults/repos.yaml
Traceback (most recent call last):
  File "$spack/bin/spack", line 100, in <module>
    sys.exit(spack.main.main())
  File "$spack/lib/spack/spack/main.py", line 797, in main
    return _invoke_command(command, parser, args, unknown)
  File "$spack/lib/spack/spack/main.py", line 525, in _invoke_command
    return_val = command(parser, args)
  File "$spack/lib/spack/spack/cmd/install.py", line 395, in install
    args.spec, concretize=True, tests=tests)
  File "$spack/lib/spack/spack/cmd/__init__.py", line 164, in parse_specs
    spec.concretize(tests=tests)  # implies normalize
  File "$spack/lib/spack/spack/spec.py", line 2564, in concretize
    self._old_concretize(tests)
  File "$spack/lib/spack/spack/spec.py", line 2363, in _old_concretize
    self._expand_virtual_packages(concretizer),
  File "$spack/lib/spack/spack/spec.py", line 2266, in _expand_virtual_packages
    candidates = concretizer.choose_virtual_or_external(spec)
  File "$spack/lib/spack/spack/concretize.py", line 143, in choose_virtual_or_external
    candidates = self._valid_virtuals_and_externals(spec)
  File "$spack/lib/spack/spack/concretize.py", line 98, in _valid_virtuals_and_externals
    candidates = spack.repo.path.providers_for(spec)
  File "$spack/lib/spack/spack/repo.py", line 92, in converter
    return function(self, spec_like, *args, **kwargs)
  File "$spack/lib/spack/spack/repo.py", line 593, in providers_for
    providers = self.provider_index.providers_for(vpkg_spec)
  File "$spack/lib/spack/spack/repo.py", line 577, in provider_index
    self._provider_index.merge(repo.provider_index)
  File "$spack/lib/spack/spack/repo.py", line 991, in provider_index
    return self.index['providers']
  File "$spack/lib/spack/spack/repo.py", line 409, in __getitem__
    self._build_all_indexes()
  File "$spack/lib/spack/spack/repo.py", line 424, in _build_all_indexes
    self.indexes[name] = self._build_index(name, indexer)
  File "$spack/lib/spack/spack/repo.py", line 450, in _build_index
    indexer.read(old) if old else indexer.create()
  File "$spack/lib/spack/spack/repo.py", line 332, in read
    self.index = spack.provider_index.ProviderIndex.from_json(stream)
  File "$spack/lib/spack/spack/provider_index.py", line 294, in from_json
    lambda vpkg, plist: (
  File "$spack/lib/spack/spack/provider_index.py", line 323, in _transform
    for name, mappings in providers.items())
  File "$spack/lib/spack/spack/provider_index.py", line 323, in <genexpr>
    for name, mappings in providers.items())
  File "$spack/lib/spack/spack/provider_index.py", line 321, in <listcomp>
    [transform_fun(vpkg, pset) for vpkg, pset in mapiter(mappings)]
  File "$spack/lib/spack/spack/provider_index.py", line 295, in <lambda>
    spack.spec.Spec.from_node_dict(vpkg),
  File "$spack/lib/spack/spack/spec.py", line 1834, in from_node_dict
    spec.namespace = node.get('namespace', None)
AttributeError: 'str' object has no attribute 'get'

In PR #25890 I've put the v6-generation misc_cache into a new misc_cache subdirectory, and the v6 db in a new path.

specs_by_hash[dag_hash] = Spec.from_node_dict(node_dict)
for build_hash, node_dict in json_specs_by_hash.items():
spec = Spec.from_node_dict(node_dict)
setattr(spec, ht.build_hash.attr, build_hash)
Copy link
Copy Markdown
Member

@tgamblin tgamblin Sep 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This deserves a comment so that people understand why envs were a special case.

Copy link
Copy Markdown
Member

@alalazo alalazo Sep 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the situation with the full hash? It doesn't seem we have that too in the node dict and we don't store that as a key. Should we start storing all the hashes in the node dict?

Copy link
Copy Markdown
Member

@alalazo alalazo Sep 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is failing a unit test checking conversion from lockfile v1 to lockfile v2, so I think a quick solution is:

diff --git a/lib/spack/spack/environment.py b/lib/spack/spack/environment.py
index b678ae1943..fae23335fc 100644
--- a/lib/spack/spack/environment.py
+++ b/lib/spack/spack/environment.py
@@ -1744,9 +1744,10 @@ def _read_lockfile_dict(self, d):
         specs_by_hash = {}
         for build_hash, node_dict in json_specs_by_hash.items():
             spec = Spec.from_node_dict(node_dict)
-            # Build hash is stored as a key, but not as part of the node dict
-            # To ensure build hashes are not recomputed, we reattach here
-            setattr(spec, ht.build_hash.attr, build_hash)
+            if d['_meta']['lockfile-version'] > 1:
+                # Build hash is stored as a key, but not as part of the node dict
+                # To ensure build hashes are not recomputed, we reattach here
+                setattr(spec, ht.build_hash.attr, build_hash)
             specs_by_hash[build_hash] = spec
 
         for build_hash, node_dict in json_specs_by_hash.items():

Since we had a few issues with upgrading the format of files in the past, we might want to investigate more structured solution to upgrading them later (i.e. after this PR is merged)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the situation with the full hash? It doesn't seem we have that too in the node dict and we don't store that as a key. Should we start storing all the hashes in the node dict?

I have an open PR (#25708) that does that. I argue in the description that if we can't recompute hashes from concrete specs from yaml, then we need to write all the hashes to the lock. That PR needs to be rebased now, since I started it before the spec yaml->json PR was merged.

@scottwittenburg
Copy link
Copy Markdown
Contributor

@adrienbernede I'd be curious if this resolves the issue we were attempting to diagnose yesterday.

@adrienbernede
Copy link
Copy Markdown
Contributor

@scottwittenburg looking at this in llnl/radiuss-spack-testing#11

Copy link
Copy Markdown
Member

@alalazo alalazo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's one test failing on upgrade from lockfile v1 to lockfile v2

specs_by_hash[dag_hash] = Spec.from_node_dict(node_dict)
for build_hash, node_dict in json_specs_by_hash.items():
spec = Spec.from_node_dict(node_dict)
setattr(spec, ht.build_hash.attr, build_hash)
Copy link
Copy Markdown
Member

@alalazo alalazo Sep 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is failing a unit test checking conversion from lockfile v1 to lockfile v2, so I think a quick solution is:

diff --git a/lib/spack/spack/environment.py b/lib/spack/spack/environment.py
index b678ae1943..fae23335fc 100644
--- a/lib/spack/spack/environment.py
+++ b/lib/spack/spack/environment.py
@@ -1744,9 +1744,10 @@ def _read_lockfile_dict(self, d):
         specs_by_hash = {}
         for build_hash, node_dict in json_specs_by_hash.items():
             spec = Spec.from_node_dict(node_dict)
-            # Build hash is stored as a key, but not as part of the node dict
-            # To ensure build hashes are not recomputed, we reattach here
-            setattr(spec, ht.build_hash.attr, build_hash)
+            if d['_meta']['lockfile-version'] > 1:
+                # Build hash is stored as a key, but not as part of the node dict
+                # To ensure build hashes are not recomputed, we reattach here
+                setattr(spec, ht.build_hash.attr, build_hash)
             specs_by_hash[build_hash] = spec
 
         for build_hash, node_dict in json_specs_by_hash.items():

Since we had a few issues with upgrading the format of files in the past, we might want to investigate more structured solution to upgrading them later (i.e. after this PR is merged)

@alalazo alalazo self-requested a review September 13, 2021 20:00
@tgamblin
Copy link
Copy Markdown
Member

tgamblin commented Sep 13, 2021

Thanks @bernhardkaindl for trying this out.

I think we should merge the fix with @alalazo's change, and we'll deal with the issues @bernhardkaindl brought up on his PR #25890 and/or on a follow-up.

@tgamblin tgamblin enabled auto-merge (squash) September 13, 2021 20:13
@tgamblin tgamblin merged commit dad69e7 into develop Sep 13, 2021
@tgamblin tgamblin deleted the bugfix/old-env-hash-reading branch September 13, 2021 21:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

environments tests General test capability(ies)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Local test failures after #22845

6 participants