Do not try to compute hashes of concrete specs from yaml#25708
Do not try to compute hashes of concrete specs from yaml#25708scottwittenburg wants to merge 1 commit intospack:developfrom
Conversation
46e2e11 to
4fef6e2
Compare
de74bee to
77222ac
Compare
|
The gitlab pipeline failed because this PR changed the full hashes of specs with dependencies, which revealed that Once #25787 passes and is merged, I'll rebase this on |
77222ac to
06cfcbd
Compare
|
fyi @eugeneswalker |
06cfcbd to
973368e
Compare
lib/spack/spack/cmd/ci.py
Outdated
| tty.debug('job concrete spec path: {0}'.format(job_spec_yaml_path)) | ||
| with open(job_spec_yaml_path, 'w') as fd: | ||
| fd.write(job_spec.to_yaml(hash=ht.build_hash)) | ||
| fd.write(job_spec.to_yaml(hash=ht.full_hash)) |
There was a problem hiding this comment.
Yes, will do.
lib/spack/spack/cmd/ci.py
Outdated
| root_spec_yaml_path = os.path.join(repro_dir, 'root.yaml') | ||
| with open(root_spec_yaml_path, 'w') as fd: | ||
| fd.write(spec_map['root'].to_yaml(hash=ht.build_hash)) | ||
| fd.write(spec_map['root'].to_yaml(hash=ht.full_hash)) |
There was a problem hiding this comment.
JSON.
Do we use the contents of this file for anything right now? I am pretty sure it is just for debugging yes?
There was a problem hiding this comment.
Yes, just for debugging.
| # Assumes no legacy formats, since this was just created. | ||
| spec_dict[ht.dag_hash.name] = s.dag_hash() | ||
| spec_dict = s.node_dict_with_hashes( | ||
| hash=ht.full_hash, dep_hash=ht.build_hash) |
There was a problem hiding this comment.
Yes, maybe we should. But then do we have to change the key used in the spack.lock? And does that mean a new version of the file format we have to support in perpetuity (like when the format changed from dag hash to build hash)?
| if self._hashes_final and not getattr(self, hash.attr, None): | ||
| package_hash = getattr(self, ht.dag_hash.attr, None) | ||
| else: | ||
| package_hash = self.package_hash() |
There was a problem hiding this comment.
It's weird to me that we're assigning dag_hash to package_hash. When we discussed this, I said that we should promote dag_hash to build_hash to full_hash, mainly b/c we sometimes need to index things by those and the weaker hash is still the best identifier we have for the spec. package_hash, though, is a content has of the package.py files. We still don't actually know it -- if we can get away with just leaving it None I'd prefer to.
Do we need to set this if we can instead set full_hash and build_hash?
There was a problem hiding this comment.
It's the computation of the package_hash that's causing the problems right now, let me see if I can just leave it None.
There was a problem hiding this comment.
I think the only place we'd ask for it to be computed is when getting the full hash if the full hash is not set.
There was a problem hiding this comment.
Actually, now I'm wondering: Let's say I have an old spec.yaml (now they're spec.json, but this one's old) for pkg a which depends on some dag hash of pkg b. Recently b was rebuilt with a new full hash, but because the dag hash of b didn't change, I need to splice the new b into a. Doesn't it mean I need to recompute the full hash of a now? Because it's the serialized spec dict that's getting hashed, and if the full hash of a dep of a changes, so must the full hash of a itself?
There was a problem hiding this comment.
I guess that's where promoting the dag_hash is intended to help.
973368e to
f9c9a8d
Compare
|
I think the best way to move forward with this is to drop the splice provenance when pushing new specs to a mirror. I think that can be as simple as setting This is lying to the user, but I think it's the least damaging lie we can tell to keep things working until the reusing concretizer works. We should definitely include a TODO comment saying to get rid of this ASAP. |
Ok, I think this makes sense, thanks. I'll give it a try and just make sure that it has what I think is the desired effect, i.e. we don't get a new DAG hash on the spliced spec as a result.
Yes, I'll definitely do that. |
@becker33 When I initially read your suggestion, my brain interpreted it as |
f9c9a8d to
1b8dd2d
Compare
|
So to follow up here, I'm going to try setting I'm also going to keep the build hash as the key of the |
0cedc7e to
ef9166c
Compare
This PR also changes the spack.lock file to include all the hashes for each spec. The reason for this is because when specs are created from the lock, they are marked as concrete with _hashes_final = True, which, after this change, prevents hashes from being computed again. Additionally, when we write out the concrete spec yaml in gitlab CI, we now include all the hashes in that file, for the same reason as above: we don't want to lose the hashes spack knew before we wrote this file.
ef9166c to
6883473
Compare
|
Am I correct that this is outdated since #28504 has been merged? |
Yes, it is indeed outdated. Closing. |
Some old concrete
spec.yamlfiles in binary mirrors may appear withoutfull_hashorpackage_hash, and when spack encounters these (for example, when runningspack buildcache update-indexon a mirror with old binaries), it tries to compute those hashes, even though it already decided it should not when it set_hashes_final = True. When this happens, spack can easily discover that a patch present at the time thespec.yamlwas written is no longer present in the repository, at which point it crashes with an error similar to the following:This PR solves the problem by making sure spack does not attempt to re-compute hashes when the
_hashes_finalproperty is set toTrue. Instead, spack will just take thedag_hashand use it for any of the unknown hashes.This PR also make sure that going forward, the
spack.lockfiles will contain all the hashes for each spec, as otherwise if we were to reconstitute specs from thespack.lockafter this change, those hashes would be lost forever. However, the lockfile is still keyed on build hash.TODO:
jsonspecs instead ofyamlyaml->json