Allow buildcache specs to be referenced by hash by nhanford · Pull Request #35042 · spack/spack

nhanford · 2023-01-19T21:14:36Z

What this PR does

Currently, specs on buildcache mirrors must be referenced by their full description. This PR allows buildcache specs to be referenced by their hashes, rather than their full description.

How it works

Hash resolution has been moved from SpecParser into Spec, and now includes the ability to execute a BinaryCacheQuery after checking the local store, but before concluding that the hash doesn't exist.
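The lookup order described above can be pictured with a small sketch. This is not the code in the PR: `make_source`, `resolve_hash`, the hashes, and the spec strings are all hypothetical stand-ins, and real Spack specs are objects rather than strings. The sketch only shows the order of the fallback and that the failure is raised after every source has been consulted.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical lookup type: maps a hash prefix to a spec string, or None on a miss.
Lookup = Callable[[str], Optional[str]]


def make_source(label: str, known: Dict[str, str], calls: List[str]) -> Lookup:
    """Build a toy lookup that records each time it is consulted."""

    def lookup(hash_prefix: str) -> Optional[str]:
        calls.append(label)
        for full_hash, spec in known.items():
            if full_hash.startswith(hash_prefix):
                return spec
        return None

    return lookup


def resolve_hash(hash_prefix: str, sources: List[Lookup]) -> str:
    """Try each source in order; fail only after all of them miss."""
    for source in sources:
        spec = source(hash_prefix)
        if spec is not None:
            return spec
    raise KeyError(f"no spec with hash /{hash_prefix}")


# Illustrative data only: the hashes and versions are made up.
calls: List[str] = []
local_store = make_source("store", {"abcdef": "zlib@1.2.13"}, calls)
binary_cache = make_source("buildcache", {"fedcba": "hdf5@1.14.0"}, calls)

print(resolve_hash("fed", [local_store, binary_cache]))
print(calls)  # shows which sources were consulted, in order
```

A hash found in the first source never reaches the second one, while a nonexistent hash walks the whole list before failing.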

Side-effects of Proposed Changes

Failures will take longer when nonexistent hashes are parsed, as mirrors will now be scanned.

Other Changes

BinaryCacheIndex.update has been modified to fail appropriately only when mirrors have been configured.
Tests of hash failures have been updated to use mutable_empty_config so they don't needlessly search mirrors.
Documentation has been clarified for BinaryCacheQuery, and more documentation has been added to the hash resolution functions added to Spec.

haampie

Can we please not make the parser do HTTP requests 😕

The current implementation is already wrong; it shouldn't look in the active env / database.

If you really need this, can you make the parser return a non-concrete Spec with the hash set (maybe in a separate property, like query_hash or user_hash or w/e), or some wrapper/subclass. Yes, that's more work, but the PR is really the wrong direction.

That would also be useful for a lot of other things, like being able to do spack add /xyz without it getting expanded to a 1000 characters of spec description inside spack.yaml.

When these things get passed to concretize or find (in env), then they can be replaced by a concrete spec.
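A minimal sketch of the alternative suggested in this comment, assuming a toy grammar and made-up names (`AbstractSpec`, `parse`, `concretize`, and the `resolve` callback are hypothetical, not Spack's API): the parser only records the hash, and the lookup happens later, when the caller asks for a concrete spec.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AbstractSpec:
    """Hypothetical stand-in for a parsed, non-concrete spec."""

    name: Optional[str] = None
    abstract_hash: Optional[str] = None  # recorded by the parser, not resolved there


def parse(text: str) -> AbstractSpec:
    """Toy parser for `name`, `/hash`, or `name/hash`; performs no lookups."""
    name, _, hash_prefix = text.partition("/")
    return AbstractSpec(name=name or None, abstract_hash=hash_prefix or None)


def concretize(spec: AbstractSpec, resolve: Callable[[str], str]) -> str:
    """Resolve the recorded hash only now, using a lookup supplied by the caller."""
    if spec.abstract_hash is None:
        raise ValueError("this toy concretizer only handles specs with a hash")
    return resolve(spec.abstract_hash)


# Illustrative lookup table: the hash and version are made up.
known = {"xyz": "zlib@1.2.13"}
parsed = parse("/xyz")
print(parsed)
print(concretize(parsed, known.__getitem__))
```

Under this design, something like `spack add /xyz` could store the short form and defer expansion, which is the benefit the comment points at.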

nhanford · 2023-01-19T23:04:02Z

> Can we please not make the parser do HTTP requests 😕

My exact hesitation when Greg suggested putting this here ;).

> The current implementation is already wrong; it shouldn't look in the active env / database.
>
> If you really need this, can you make the parser return a non-concrete Spec with the hash set (maybe in a separate property, like query_hash or user_hash or w/e), or some wrapper/subclass. Yes, that's more work, but the PR is really the wrong direction.
>
> That would also be useful for a lot of other things, like being able to do spack add /xyz without it getting expanded to a 1000 characters of spec description inside spack.yaml.
>
> When these things get passed to concretize or find (in env), then they can be replaced by a concrete spec.

Yeah that sounds pretty reasonable. If you don't mind, we can perhaps discuss when/where hash validation should occur on Slack when it's not so late in your time zone :).

becker33 · 2023-01-20T18:15:25Z

> If you really need this, can you make the parser return a non-concrete Spec with the hash set (maybe in a separate property, like query_hash or user_hash or w/e), or some wrapper/subclass. Yes, that's more work, but the PR is really the wrong direction.

I think this will cause problems. We rely pretty heavily on the invariant that a spec with a hash is concrete and vice-versa.

> Can we please not make the parser do HTTP requests 😕

I think we can avoid that without needing to have abstract specs with hashes. We could query only the local index (not check for updates from the mirrors). That would make a certain sort of sense, since anyone using a hash for a remote spec has already done something (spack buildcache list) that would put the spec in the local cache.
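The distinction drawn here, between searching a local copy of the index and refreshing it from the mirrors, can be sketched as follows. `LocalIndex`, `fake_mirror`, and the data are hypothetical; the only assumption taken from the comment is that some earlier command (such as `spack buildcache list`) has already refreshed the local copy.

```python
from typing import Callable, Dict, List


class LocalIndex:
    """Toy index: `update` talks to the mirror, `find` never does."""

    def __init__(self, fetch_remote: Callable[[], Dict[str, str]]):
        self._fetch_remote = fetch_remote
        self._cached: Dict[str, str] = {}

    def update(self) -> None:
        """Refresh the local copy; the only method that touches the mirror."""
        self._cached = dict(self._fetch_remote())

    def find(self, hash_prefix: str) -> List[str]:
        """Search the local copy only; never triggers a fetch."""
        return [
            spec for full, spec in self._cached.items() if full.startswith(hash_prefix)
        ]


fetches: List[str] = []


def fake_mirror() -> Dict[str, str]:
    """Stand-in for a network request; records that it was called."""
    fetches.append("fetch")
    return {"abcdef": "zlib@1.2.13"}  # made-up hash and version


index = LocalIndex(fake_mirror)
before = index.find("abc")  # nothing cached yet, and no fetch happens
index.update()              # what an explicit listing command would do
after = index.find("abc")   # answered from the local copy
```

With this split, a hash lookup at parse time costs a local search, and only an explicit update pays for the network round trip.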

alalazo · 2023-02-17T08:08:17Z

Ideally, I think the parser shouldn't be aware of the state of our installations, or of the hashes / files we have available. I'd really like if we had placeholders for hash and filename specifications that delay the lookup until somebody asks for concretization.

haampie · 2023-02-17T09:47:07Z

> We rely pretty heavily on the invariant that a spec with a hash is concrete and vice-versa.

Do we really? Most commands that care about concrete specs run concretize when they get abstract specs.

tgamblin · 2023-05-11T17:57:56Z

@spackbot fix style

spackbot-app · 2023-05-11T17:57:59Z

Let me see if I can fix that for you!

spackbot-app · 2023-05-11T17:58:58Z

I was able to run spack style --fix for you!

spack style --fix

==> Running style checks on spack
  selected: isort, black, flake8, mypy
==> Modified files
  lib/spack/spack/binary_distribution.py
  lib/spack/spack/environment/environment.py
  lib/spack/spack/parser.py
  lib/spack/spack/spec.py
  lib/spack/spack/test/spec_semantics.py
  lib/spack/spack/test/spec_syntax.py
  lib/spack/spack/traverse.py
==> Running isort checks
  isort checks were clean
==> Running black checks
reformatted lib/spack/spack/test/spec_syntax.py
All done! ✨ 🍰 ✨
1 file reformatted, 6 files left unchanged.
  black checks were clean
==> Running flake8 checks
  flake8 checks were clean
==> Running mypy checks
Success: no issues found in 577 source files
  mypy checks were clean
==> spack style checks were clean

Keep in mind that I cannot fix your flake8 or mypy errors, so if you have any you'll need to fix them and update the pull request. If I was able to push to your branch, you will need to pull from your updated branch before pushing any further changes.

I've updated the branch with style fixes.

tgamblin · 2023-05-11T18:02:31Z

Ok on further discussion, I think I agree with @haampie and @alalazo here but I also think it would be better to do that as a follow-on optimization to this. Anyone disagree? There is a corner case I still haven't thought through around excludes: in environments.

@becker33 has addressed 2 and 3; we can reduce the lookups further with the optimization.

alalazo · 2023-05-11T18:11:29Z

> There is a corner case I still haven't thought through around excludes: in environments.

I think we can treat this as something orthogonal to the abstract hashes, and part of the matrix expansion mechanism. Basically, we should document that any abstract hash in a matrix is expanded during the matrix evaluation and must match a single known concrete spec.
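As a rough illustration of that rule, here is a toy matrix expansion in which every `/hash` entry must resolve to exactly one known spec before the cross product is taken. The helper names, the string-based specs, and joining entries with a space are assumptions made for the sketch; Spack's actual matrix handling works on spec objects.

```python
from itertools import product
from typing import Dict, List


def resolve_unique(hash_prefix: str, known: Dict[str, str]) -> str:
    """Return the single known spec whose hash starts with the prefix."""
    matches = sorted(
        spec for full, spec in known.items() if full.startswith(hash_prefix)
    )
    if len(matches) != 1:
        raise ValueError(
            f"/{hash_prefix} matches {len(matches)} known specs, expected exactly 1"
        )
    return matches[0]


def expand_matrix(rows: List[List[str]], known: Dict[str, str]) -> List[str]:
    """Resolve hash entries first, then take the cross product of the rows."""
    resolved_rows = [
        [resolve_unique(e[1:], known) if e.startswith("/") else e for e in row]
        for row in rows
    ]
    return [" ".join(combo) for combo in product(*resolved_rows)]


# Illustrative data only: hashes and versions are made up.
known = {"abc123": "zlib@1.2.13", "abd456": "hdf5@1.14.0"}
expanded = expand_matrix([["/abc", "tcl"], ["%gcc", "%clang"]], known)
print(expanded)
```

An ambiguous prefix such as `/ab`, which matches both entries above, is rejected during expansion instead of being carried forward.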

addressed; will do abstract hash reference in follow-on PR

tgamblin

@becker33 one last actual change request and two comments (really for you and @alalazo to look at)

tgamblin · 2023-05-12T07:50:41Z

lib/spack/spack/spec.py

+
+        spec_by_hash = self.lookup_hash()
+
+        # if spec_by_hash != self or not self.eq_dag(spec_by_hash):


@becker33 found another piece of commented code here -- should we get rid of it?

tgamblin · 2023-05-12T08:23:59Z

lib/spack/spack/spec.py

+        # reattach nodes that were not otherwise satisfied by new dependencies
+        for node in self.traverse(root=False):
+            if not any(n._satisfies(node) for n in spec.traverse()):
+                spec._add_dependency(node.copy(), deptypes=())


I'm not 100% sure this is the right semantic, because it can return a contradiction. Suppose you have:

foo ^/abc ^[email protected]

and the only lookup of /abc has [email protected] somewhere in the DAG. This logic may give you a spec with a contradiction in it. It could concretize if /abc is a build dependency or if it's zlib only reachable through build dependencies, since we allow multiple versions of build dependencies in a DAG. If /abc and its zlib are link dependencies, though, we know it'll fail to concretize.

On the other hand, we currently do this:

> spack spec tcl ^[email protected] ^[email protected]
==> Error: Cannot depend on incompatible specs '[email protected]' and '[email protected]'

IMO this is a useful check for users, since it's most likely a mistake, and if (as we've said) our intent is to make ^ mean "unified spec", i.e., a transitive link or run dependency of the root, not some dependency of a build dep, then it's reporting a real error early. I guess tcl ^[email protected] ^%[email protected] would be ok in our future model.

The inconsistency, I guess, is that we detect this earlier at parse time than we do in lookup_hash. Technically if you had a choice between two /abc's, we could be smart enough to choose the one with the satisfiable zlib configuration here. We do that already for constraints on the hash spec; it's just for transitive abstract hashes that we aren't considering the whole spec. I guess this isn't so bad. It's a corner case where we might complain about an abstract hash being ambiguous when we don't need to, but I think I could come up with cases where you'd have to solve to disambiguate, and I wouldn't expect a lookup to imply a solve.

Ok, I think I convinced myself that this is fine. I just wanted to document this and get others' opinions.
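To make the corner case concrete, here is a toy model of the "reattach" step, assuming nodes are just a name plus an optional version. `Node`, `satisfies`, `reattach`, and the version numbers are illustrative and are not Spack's implementation; the point is only that a requested node which nothing in the looked-up DAG satisfies gets appended, which can leave two incompatible entries for the same package.

```python
from typing import List, NamedTuple, Optional


class Node(NamedTuple):
    name: str
    version: Optional[str] = None  # None means "any version"


def satisfies(candidate: Node, constraint: Node) -> bool:
    """True if `candidate` has the same name and a compatible version."""
    if candidate.name != constraint.name:
        return False
    return constraint.version is None or candidate.version == constraint.version


def reattach(looked_up: List[Node], requested: List[Node]) -> List[Node]:
    """Append every requested node that no looked-up node satisfies."""
    result = list(looked_up)
    for node in requested:
        if not any(satisfies(n, node) for n in looked_up):
            result.append(node)
    return result


def duplicated_names(nodes: List[Node]) -> List[str]:
    """Names that appear more than once, in first-repeat order."""
    seen, dupes = set(), []
    for n in nodes:
        if n.name in seen and n.name not in dupes:
            dupes.append(n.name)
        seen.add(n.name)
    return dupes


# Hypothetical versions: the DAG found for /abc already contains one zlib,
# while the user asked for a different one.
looked_up = [Node("abc", "1.0"), Node("zlib", "1.2.11")]
requested = [Node("zlib", "1.2.13")]
merged = reattach(looked_up, requested)
print(duplicated_names(merged))
```

Whether such a duplicate is a real contradiction depends on the dependency types involved, as discussed above; the lookup step itself does not decide that.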

tgamblin · 2023-05-12T08:37:12Z

lib/spack/spack/spec.py

+            if not node.name:
+                raise spack.error.SpecError(
+                    f"Spec {node} has no name; cannot concretize an anonymous spec"
+                )


I guess we don't currently model this, but I don't think it's unreasonable to concretize, say, hdf5 with some dependency built with gcc, e.g.: hdf5 ^%gcc. develop fails much less gracefully on this, though:

> spack spec zlib ^%gcc
==> Error: Invalid module name:

vs. this branch:

> spack spec zlib ^%gcc
==> Error: Spec %gcc has no name; cannot concretize an anonymous spec

So I guess this is an improvement and we can "fix" it later if needed.

tgamblin · 2023-05-12T08:46:52Z

I pushed a commit that gets rid of the commented code, as I think it's correct to do that.

git diff develop... | grep '^+\s*#' doesn't return anything so I think we're done with commented code for real this time.
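For anyone who wants the same check without a shell pipeline, a small Python equivalent of that `grep` pattern might look like this. `added_comment_lines` and the sample diff are illustrative; like the original command, it flags every added line that starts with a comment, not only commented-out code.

```python
import re
from typing import List

# Same pattern as the grep above: an added line whose first
# non-whitespace character is '#'.
ADDED_COMMENT = re.compile(r"^\+\s*#")


def added_comment_lines(diff_text: str) -> List[str]:
    """Return the added lines of a unified diff that are comments."""
    return [line for line in diff_text.splitlines() if ADDED_COMMENT.match(line)]


sample_diff = """\
+++ b/lib/spack/spack/spec.py
+        spec_by_hash = self.lookup_hash()
+        # if spec_by_hash != self or not self.eq_dag(spec_by_hash):
-        # a removed comment is ignored
"""

for line in added_comment_lines(sample_diff):
    print(line)
```

The `+++` file header is not reported, because the character after the first `+` is neither whitespace nor `#`.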

@alalazo @becker33 I'm ok with this PR so if you agree with the changes above and have no further objections, feel free to merge. Again, we can add the hash optimization after.

alalazo

I think we can improve the current code, but I agree with merging this and having follow-up PRs after v0.20. I'll leave it to @becker33 to give it a final look.

alalazo · 2023-05-12T14:28:25Z

lib/spack/spack/spec.py

+    def _lookup_hash(self):
+        """Lookup just one spec with an abstract hash, returning a spec from the environment,
+        store, or finally, binary caches."""
+        import spack.environment


This needs to be refactored at some point, after v0.20, to try disentangling concepts. The design issue I see is that a spec needs to know about all the global objects where specs can be stored to perform a lookup.

Yep agree with this. I think having some central place for managing the world of known Specs would be useful.

I like this idea for improvement, but I'm having trouble thinking of where else this information would belong; it's not really a natural fit anywhere. It could live in spack.spec but outside the Spec class, but that doesn't really feel satisfying.

spack#35042 introduced lazy hash parsing, but didn't remove a few attributes from the parser that were needed only for concrete specs. This commit removes them, since they are effectively dead code.

nhanford added feature A feature is missing in Spack specs binary-packages labels Jan 19, 2023

nhanford requested a review from becker33 January 19, 2023 21:14

nhanford self-assigned this Jan 19, 2023

spackbot-app bot added core PR affects Spack core functionality tests General test capability(ies) labels Jan 19, 2023

haampie previously requested changes Jan 19, 2023

nhanford changed the title ~~Allow buildcache specs to be referenced by hash~~ [WIP] Allow buildcache specs to be referenced by hash Feb 1, 2023

tgamblin force-pushed the features/ref-buildcache-spec-by-hash branch from ebd44c7 to 144655f Compare February 15, 2023 22:39

nhanford force-pushed the features/ref-buildcache-spec-by-hash branch from 90514a2 to ee42888 Compare February 21, 2023 17:41

spackbot-app bot added commands conflicts defaults dependencies documentation Improvements or additions to documentation environments libraries new-package new-variant new-version patch python R update-package utilities labels Feb 28, 2023

spec hash lookup: disambiguate hashes by other attributes

7931840

[@spackbot] updating style on behalf of nhanford

d160529

tgamblin previously approved these changes May 11, 2023

update error message in test

570ff91

becker33 dismissed tgamblin’s stale review via 570ff91 May 11, 2023 18:28

spackbot-app bot added the commands label May 11, 2023

line length

b6f66b0

remove commented code

883ebd4

tgamblin requested changes May 12, 2023

remove commented code

0d64573

tgamblin approved these changes May 12, 2023

alalazo reviewed May 12, 2023

tgamblin merged commit eef2536 into spack:develop May 12, 2023

greenc-FNAL mentioned this pull request May 18, 2023

Merge from upstream/develop 2023-05-18 FNALssi/spack#140

Merged

haampie mentioned this pull request Jul 7, 2023

Fix multiple quadratic complexity issues in environments #38771

Merged

melven mentioned this pull request Jul 18, 2023

Concretizer for environments ignores specific hashes in v0.20.1 (worked in v0.19.2) #38955

Open

alalazo mentioned this pull request Aug 22, 2023

Remove leftover attributes from parser #39574

Merged

becker33 mentioned this pull request Jun 5, 2024

check remote mirrors for hashes provided by the user #22503

Closed

tgamblin mentioned this pull request Sep 26, 2024

[WIP] Search binary indices for spec hashes parsed from command line #26863

Closed



7 participants
