Allow buildcache specs to be referenced by hash#35042
Allow buildcache specs to be referenced by hash#35042tgamblin merged 50 commits intospack:developfrom
Conversation
There was a problem hiding this comment.
Can we please not make the parser do HTTP requests 😕
The current implementation is already wrong; it shouldn't look in the active env / database.
If you really need this, can you make the parser return a non-concrete Spec with the hash set (maybe in a separate property, like query_hash or user_hash or w/e), or some wrapper/subclass. Yes, that's more work, but the PR is really the wrong direction.
That would also be useful for a lot of other things, like being able to do spack add /xyz without it getting expanded to a 1000 characters of spec description inside spack.yaml.
When these things get passed to concretize or find (in env), then they can be replaced by a concrete spec.
My exact hesitation when Greg suggested putting this here ;).
Yeah that sounds pretty reasonable. If you don't mind, we can perhaps discuss when/where hash validation should occur on Slack when it's not so late in your time zone :). |
I think this will cause problems. We rely pretty heavily on the invariant that a spec with a hash is concrete and vice-versa.
I think we can avoid that without needing to have abstract specs with hashes. We could query only the local index (not check for updates from the mirrors). That would make a certain sort of sense, since anyone using a hash for a remote spec has already done something ( |
ebd44c7 to
144655f
Compare
|
Ideally, I think the parser shouldn't be aware of the state of our installations, or of the hashes / files we have available. I'd really like if we had placeholders for hash and filename specifications that delay the lookup until somebody asks for concretization. |
Do we really? Most commands that care about concrete specs run concretize when they get abstract specs. |
90514a2 to
ee42888
Compare
|
@spackbot fix style |
|
Let me see if I can fix that for you! |
|
I was able to run spack style --fix==> Running style checks on spack
selected: isort, black, flake8, mypy
==> Modified files
lib/spack/spack/binary_distribution.py
lib/spack/spack/environment/environment.py
lib/spack/spack/parser.py
lib/spack/spack/spec.py
lib/spack/spack/test/spec_semantics.py
lib/spack/spack/test/spec_syntax.py
lib/spack/spack/traverse.py
==> Running isort checks
isort checks were clean
==> Running black checks
reformatted lib/spack/spack/test/spec_syntax.py
All done! ✨ 🍰 ✨
1 file reformatted, 6 files left unchanged.
black checks were clean
==> Running flake8 checks
flake8 checks were clean
==> Running mypy checks
Success: no issues found in 577 source files
mypy checks were clean
==> spack style checks were clean
I've updated the branch with style fixes. |
|
Ok on further discussion, I think I agree with @haampie and @alalazo here but I also think it would be better to do that as a follow-on optimization to this. Anyone disagree? There is a corner case I still haven't thought through around @becker33 has addressed 2 and 3; we can reduce the lookups further with the optimization. |
I think we can treat this as something orthogonal to the abstract hashes, and part of the matrix expansion mechanism. Basically, we should document that any abstract hash in a matrix is expanded during the matrix evaluation and must match a single known concrete spec. |
addressed; will do abstract hash reference in follow-on PR
lib/spack/spack/spec.py
Outdated
|
|
||
| spec_by_hash = self.lookup_hash() | ||
|
|
||
| # if spec_by_hash != self or not self.eq_dag(spec_by_hash): |
There was a problem hiding this comment.
@becker33 found another piece of commented code here -- should we get rid of it?
| # reattach nodes that were not otherwise satisfied by new dependencies | ||
| for node in self.traverse(root=False): | ||
| if not any(n._satisfies(node) for n in spec.traverse()): | ||
| spec._add_dependency(node.copy(), deptypes=()) |
There was a problem hiding this comment.
I'm not 100% sure this is the right semantic, because it can return a contradiction. Suppose you have:
foo ^/abc ^[email protected]
and the only lookup of /abc has [email protected] somewhere in the DAG. This logic may give you a spec with a contradiction in it. It could concretize if /abc is a build dependency or if it's zlib only reachable through build dependencies, since we allow multiple versions of build dependencies in a DAG. If /abc and its zlib are link dependencies, though, we know it'll fail to concretize.
On the other hand, we currently do this:
> spack spec tcl ^[email protected] ^[email protected]
==> Error: Cannot depend on incompatible specs '[email protected]' and '[email protected]'IMO this is a useful check for users, since it's most likely a mistake, and if (as we've said) our intent is to make ^ mean "unified spec", i.e., a transitive link or run dependency of the root, not some dependency of a build dep, then it's reporting a real error early. I guess tcl ^[email protected] ^%[email protected] would be ok in our future model.
The inconsistency, I guess, is that we detect this earlier at parse time than we do in lookup_hash. Technically if you had a choice between two /abc's, we could be smart enough to choose the one with the satisfiable zlib configuration here. We do that already for constraints on the hash spec; it's just for transitive abstract hashes that we aren't considering the whole spec. I guess this isn't so bad. It's a corner case where we might complain about an abstract hash being ambiguous when we don't need to, but I think I could come up with cases where you'd have to solve to disambiguate, and I wouldn't expect a lookup to imply a solve.
Ok, I think I convinced myself that this is fine. I just wanted to document this and get others' opinions.
| if not node.name: | ||
| raise spack.error.SpecError( | ||
| f"Spec {node} has no name; cannot concretize an anonymous spec" | ||
| ) |
There was a problem hiding this comment.
I guess we don't currently model this, but I don't think it's unreasonable to concretize, say, hdf5 with some dependency built with gcc, e.g.: hdf5 ^%gcc. develop fails much less gracefully on this, though:
> spack spec zlib ^%gcc
==> Error: Invalid module name:vs. this branch:
> spack spec zlib ^%gcc
==> Error: Spec %gcc has no name; cannot concretize an anonymous specSo I guess this is an improvement and we can "fix" it later if needed.
|
I pushed a commit that gets rid of the commented code, as I think it's correct to do that.
@alalazo @becker33 I'm ok with this PR so if you agree with the changes above and have no further objections, feel free to merge. Again, we can add the hash optimization after. |
| def _lookup_hash(self): | ||
| """Lookup just one spec with an abstract hash, returning a spec from the the environment, | ||
| store, or finally, binary caches.""" | ||
| import spack.environment |
There was a problem hiding this comment.
This needs to be refactored at some point, after v0.20, to try disentangling concepts. The design issue I see is that a spec needs to know about all the global objects where specs can be stored to perform a lookup.
There was a problem hiding this comment.
Yep agree with this. I think having some central place for managing the world of known Specs would be useful.
There was a problem hiding this comment.
I like this idea for improvement, but I'm having trouble thinking where else this information would belong, it's not really a natural fit anywhere. It could live in spack.spec but outside the Spec class, but that doesn't feel satisfying really.
spack#35042 introduced lazy hash parsing, but didn't remove a few attributes from the parser that were needed only for concrete specs This commit removes them, since they are effectively dead code.
#35042 introduced lazy hash parsing, but didn't remove a few attributes from the parser that were needed only for concrete specs This commit removes them, since they are effectively dead code.
spack#35042 introduced lazy hash parsing, but didn't remove a few attributes from the parser that were needed only for concrete specs This commit removes them, since they are effectively dead code.
spack#35042 introduced lazy hash parsing, but didn't remove a few attributes from the parser that were needed only for concrete specs This commit removes them, since they are effectively dead code.
What this PR does
Currently, specs on buildcache mirrors must be referenced by their full description. This PR allows buildcache specs to be referenced by their hashes, rather than their full description.
How it works
Hash resolution has been moved from
SpecParserintoSpec, and now includes the ability to execute aBinaryCacheQueryafter checking the local store, but before concluding that the hash doesn't exist.Side-effects of Proposed Changes
Failures will take longer when nonexistent hashes are parsed, as mirrors will now be scanned.
Other Changes
BinaryCacheIndex.updatehas been modified to fail appropriately only when mirrors have been configured.mutable_empty_configso they don't needlessly search mirrors.BinaryCacheQuery, and more documentation has been added to the hash resolution functions added toSpec.