Skip to content

Allow buildcache specs to be referenced by hash#35042

Merged
tgamblin merged 50 commits intospack:developfrom
nhanford:features/ref-buildcache-spec-by-hash
May 12, 2023
Merged

Allow buildcache specs to be referenced by hash#35042
tgamblin merged 50 commits intospack:developfrom
nhanford:features/ref-buildcache-spec-by-hash

Conversation

@nhanford
Copy link
Copy Markdown
Contributor

@nhanford nhanford commented Jan 19, 2023

What this PR does

Currently, specs on buildcache mirrors must be referenced by their full description. This PR allows buildcache specs to be referenced by their hashes, rather than their full description.

How it works

Hash resolution has been moved from SpecParser into Spec, and now includes the ability to execute a BinaryCacheQuery after checking the local store, but before concluding that the hash doesn't exist.

Side-effects of Proposed Changes

Failures will take longer when nonexistent hashes are parsed, as mirrors will now be scanned.

Other Changes

  • BinaryCacheIndex.update has been modified to fail appropriately only when mirrors have been configured.
  • Tests of hash failures have been updated to use mutable_empty_config so they don't needlessly search mirrors.
  • Documentation has been clarified for BinaryCacheQuery, and more documentation has been added to the hash resolution functions added to Spec.

@nhanford nhanford added feature A feature is missing in Spack specs binary-packages labels Jan 19, 2023
@nhanford nhanford requested a review from becker33 January 19, 2023 21:14
@nhanford nhanford self-assigned this Jan 19, 2023
@spackbot-app spackbot-app bot added core PR affects Spack core functionality tests General test capability(ies) labels Jan 19, 2023
Copy link
Copy Markdown
Member

@haampie haampie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please not make the parser do HTTP requests 😕

The current implementation is already wrong; it shouldn't look in the active env / database.

If you really need this, can you make the parser return a non-concrete Spec with the hash set (maybe in a separate property, like query_hash or user_hash or w/e), or some wrapper/subclass. Yes, that's more work, but the PR is really the wrong direction.

That would also be useful for a lot of other things, like being able to do spack add /xyz without it getting expanded to a 1000 characters of spec description inside spack.yaml.

When these things get passed to concretize or find (in env), then they can be replaced by a concrete spec.

@nhanford
Copy link
Copy Markdown
Contributor Author

Can we please not make the parser do HTTP requests 😕

My exact hesitation when Greg suggested putting this here ;).

The current implementation is already wrong; it shouldn't look in the active env / database.

If you really need this, can you make the parser return a non-concrete Spec with the hash set (maybe in a separate property, like query_hash or user_hash or w/e), or some wrapper/subclass. Yes, that's more work, but the PR is really the wrong direction.

That would also be useful for a lot of other things, like being able to do spack add /xyz without it getting expanded to a 1000 characters of spec description inside spack.yaml.

When these things get passed to concretize or find (in env), then they can be replaced by a concrete spec.

Yeah that sounds pretty reasonable. If you don't mind, we can perhaps discuss when/where hash validation should occur on Slack when it's not so late in your time zone :).

@becker33
Copy link
Copy Markdown
Member

If you really need this, can you make the parser return a non-concrete Spec with the hash set (maybe in a separate property, like query_hash or user_hash or w/e), or some wrapper/subclass. Yes, that's more work, but the PR is really the wrong direction.

I think this will cause problems. We rely pretty heavily on the invariant that a spec with a hash is concrete and vice-versa.

Can we please not make the parser do HTTP requests 😕

I think we can avoid that without needing to have abstract specs with hashes. We could query only the local index (not check for updates from the mirrors). That would make a certain sort of sense, since anyone using a hash for a remote spec has already done something (spack buildcache list) that would put the spec in the local cache.

@nhanford nhanford changed the title Allow buildcache specs to be referenced by hash [WIP] Allow buildcache specs to be referenced by hash Feb 1, 2023
@tgamblin tgamblin force-pushed the features/ref-buildcache-spec-by-hash branch from ebd44c7 to 144655f Compare February 15, 2023 22:39
@alalazo
Copy link
Copy Markdown
Member

alalazo commented Feb 17, 2023

Ideally, I think the parser shouldn't be aware of the state of our installations, or of the hashes / files we have available. I'd really like if we had placeholders for hash and filename specifications that delay the lookup until somebody asks for concretization.

@haampie
Copy link
Copy Markdown
Member

haampie commented Feb 17, 2023

We rely pretty heavily on the invariant that a spec with a hash is concrete and vice-versa.

Do we really? Most commands that care about concrete specs run concretize when they get abstract specs.

@tgamblin
Copy link
Copy Markdown
Member

@spackbot fix style

@spackbot-app
Copy link
Copy Markdown

spackbot-app bot commented May 11, 2023

Let me see if I can fix that for you!

@spackbot-app
Copy link
Copy Markdown

spackbot-app bot commented May 11, 2023

I was able to run spack style --fix for you!

spack style --fix
==> Running style checks on spack
  selected: isort, black, flake8, mypy
==> Modified files
  lib/spack/spack/binary_distribution.py
  lib/spack/spack/environment/environment.py
  lib/spack/spack/parser.py
  lib/spack/spack/spec.py
  lib/spack/spack/test/spec_semantics.py
  lib/spack/spack/test/spec_syntax.py
  lib/spack/spack/traverse.py
==> Running isort checks
  isort checks were clean
==> Running black checks
reformatted lib/spack/spack/test/spec_syntax.py
All done! ✨ 🍰 ✨
1 file reformatted, 6 files left unchanged.
  black checks were clean
==> Running flake8 checks
  flake8 checks were clean
==> Running mypy checks
Success: no issues found in 577 source files
  mypy checks were clean
==> spack style checks were clean
Keep in mind that I cannot fix your flake8 or mypy errors, so if you have any you'll need to fix them and update the pull request. If I was able to push to your branch, if you make further changes you will need to pull from your updated branch before pushing again.

I've updated the branch with style fixes.

@tgamblin
Copy link
Copy Markdown
Member

Ok on further discussion, I think I agree with @haampie and @alalazo here but I also think it would be better to do that as a follow-on optimization to this. Anyone disagree? There is a corner case I still haven't thought through around excludes: in environments.

@becker33 has addressed 2 and 3; we can reduce the lookups further with the optimization.

@alalazo
Copy link
Copy Markdown
Member

alalazo commented May 11, 2023

There is a corner case I still haven't thought through around excludes: in environments.

I think we can treat this as something orthogonal to the abstract hashes, and part of the matrix expansion mechanism. Basically, we should document that any abstract hash in a matrix is expanded during the matrix evaluation and must match a single known concrete spec.

tgamblin
tgamblin previously approved these changes May 11, 2023
@tgamblin tgamblin dismissed haampie’s stale review May 11, 2023 22:38

addressed; will do abstract hash reference in follow-on PR

Copy link
Copy Markdown
Member

@tgamblin tgamblin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@becker33 one last actual change request and two comments (really for you and @alalazo to look at)


spec_by_hash = self.lookup_hash()

# if spec_by_hash != self or not self.eq_dag(spec_by_hash):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@becker33 found another piece of commented code here -- should we get rid of it?

Comment on lines +1880 to +1883
# reattach nodes that were not otherwise satisfied by new dependencies
for node in self.traverse(root=False):
if not any(n._satisfies(node) for n in spec.traverse()):
spec._add_dependency(node.copy(), deptypes=())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure this is the right semantic, because it can return a contradiction. Suppose you have:

foo ^/abc ^[email protected]

and the only lookup of /abc has [email protected] somewhere in the DAG. This logic may give you a spec with a contradiction in it. It could concretize if /abc is a build dependency or if it's zlib only reachable through build dependencies, since we allow multiple versions of build dependencies in a DAG. If /abc and its zlib are link dependencies, though, we know it'll fail to concretize.

On the other hand, we currently do this:

> spack spec tcl ^[email protected] ^[email protected]
==> Error: Cannot depend on incompatible specs '[email protected]' and '[email protected]'

IMO this is a useful check for users, since it's most likely a mistake, and if (as we've said) our intent is to make ^ mean "unified spec", i.e., a transitive link or run dependency of the root, not some dependency of a build dep, then it's reporting a real error early. I guess tcl ^[email protected] ^%[email protected] would be ok in our future model.

The inconsistency, I guess, is that we detect this earlier at parse time than we do in lookup_hash. Technically if you had a choice between two /abc's, we could be smart enough to choose the one with the satisfiable zlib configuration here. We do that already for constraints on the hash spec; it's just for transitive abstract hashes that we aren't considering the whole spec. I guess this isn't so bad. It's a corner case where we might complain about an abstract hash being ambiguous when we don't need to, but I think I could come up with cases where you'd have to solve to disambiguate, and I wouldn't expect a lookup to imply a solve.

Ok, I think I convinced myself that this is fine. I just wanted to document this and get others' opinions.

Comment on lines +2861 to +2864
if not node.name:
raise spack.error.SpecError(
f"Spec {node} has no name; cannot concretize an anonymous spec"
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we don't currently model this, but I don't think it's unreasonable to concretize, say, hdf5 with some dependency built with gcc, e.g.: hdf5 ^%gcc. develop fails much less gracefully on this, though:

> spack spec zlib ^%gcc
==> Error: Invalid module name:

vs. this branch:

> spack spec zlib ^%gcc
==> Error: Spec %gcc has no name; cannot concretize an anonymous spec

So I guess this is an improvement and we can "fix" it later if needed.

@tgamblin
Copy link
Copy Markdown
Member

I pushed a commit that gets rid of the commented code, as I think it's correct to do that.

git diff develop... | grep '^+\s*#' doesn't return anything so I think we're done with commented code for real this time.

@alalazo @becker33 I'm ok with this PR so if you agree with the changes above and have no further objections, feel free to merge. Again, we can add the hash optimization after.

Copy link
Copy Markdown
Member

@alalazo alalazo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can improve the current code, but I agree merging this and have following PRs after v0.20 I'll leave to @becker33 to give a final look

def _lookup_hash(self):
"""Lookup just one spec with an abstract hash, returning a spec from the the environment,
store, or finally, binary caches."""
import spack.environment
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be refactored at some point, after v0.20, to try disentangling concepts. The design issue I see is that a spec needs to know about all the global objects where specs can be stored to perform a lookup.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep agree with this. I think having some central place for managing the world of known Specs would be useful.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea for improvement, but I'm having trouble thinking where else this information would belong, it's not really a natural fit anywhere. It could live in spack.spec but outside the Spec class, but that doesn't feel satisfying really.

@tgamblin tgamblin merged commit eef2536 into spack:develop May 12, 2023
alalazo added a commit to alalazo/spack that referenced this pull request Aug 22, 2023
spack#35042 introduced lazy hash parsing, but didn't remove a
few attributes from the parser that were needed only for
concrete specs

This commit removes them, since they are effectively
dead code.
haampie pushed a commit that referenced this pull request Aug 24, 2023
#35042 introduced lazy hash parsing, but didn't remove a
few attributes from the parser that were needed only for
concrete specs

This commit removes them, since they are effectively
dead code.
dyokelson pushed a commit to dyokelson/spack that referenced this pull request Aug 24, 2023
spack#35042 introduced lazy hash parsing, but didn't remove a
few attributes from the parser that were needed only for
concrete specs

This commit removes them, since they are effectively
dead code.
mpokorny pushed a commit to mpokorny/spack that referenced this pull request Sep 18, 2023
spack#35042 introduced lazy hash parsing, but didn't remove a
few attributes from the parser that were needed only for
concrete specs

This commit removes them, since they are effectively
dead code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

binary-packages commands core PR affects Spack core functionality dependencies environments feature A feature is missing in Spack libraries specs tests General test capability(ies)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants