Conversation
8a35394 to
868f8cc
Compare
c79ef8a to
68a438f
Compare
|
@spackbot run pipeline |
|
I've started that pipeline for you! |
|
@tgamblin Anything jump out at you about the max recursion depth exceeded error while computing the package hash, here? I reproduced that outside of gitlab ci by activating the environment and concretizing it. I see this message in there: I'm investigating, but let me know if this rings and bells for you. |
|
Another question @tgamblin: It seems commit 80649c0b54 introduced the failure I mentioned on Friday regarding: ... which is happening in |
dag_hash()dag_hash everywhere and make it work like full_hash
|
fyi @eugeneswalker this is the work that should get us full hash everywhere and cut out all the spec splicing during |
|
@becker33 I didn't bump the db version on this PR or add the spec dict format version to the database, as you had suggested, because the purpose I thought you suggested it for sort of went away. I was worried about how to interpret While there's still some debug messaging I need to remove from this PR and a couple failing tests, maybe you could take a look at it to provide some initial feedback? |
|
There are conflicts now, I'm going to rebase to fix. |
5375dad to
0884de2
Compare
Issue described in the following PR comment: #28504 (comment) Solution described in subsequent comment: #28504 (comment)
Without some enforcement of spec ordering, python 2 produced different results in the affected test than did python 3. This change makes the arbitrary but reproducible decision to sort the specs by their lockfile key alphabetically.
`hashes_final` was used to indicate when a spec was concrete but possibly lacked `full_hash` or `build_hash` fields. This was only necessary because older Spacks didn't generate them, and we want to avoid recomputing them, as we likely do not have the same package files as existed at concretization time. Now, we don't need to do that -- there is only the DAG hash and specs are either concrete and have a `dag_hash`, or not concrete and have no `dag_hash`. There's no middle ground.
…ocol `spack monitor` expects a field called `spec_full_hash`, so we shouldn't change that. Instead, we can pass a `dag_hash` (which is now the full hash) but not change the field name.
Some test cases had to be modified in a kludgy way so that abstract specs made concrete would have versions on them. We shouldn't *need* to do this, as the only reason we care is because the content hash has to be able to get an archive for a version. This modifies the content hash so that it can be called on abstract specs, including only relevant content. This does NOT add a partial content hash to the DAG hash, as we do not really want that -- we don't need in-memory spec hashes to need to load package files. It just makes `Package.content_hash()` less prickly and tests easier to understand.
…tests
This removes all but one usage of runtime hash. The runtime hash was being used to write
historical lockfiles for tests, but we don't need it for that; we can just save those
lockfiles.
- [x] add legacy lockfiles for v1, v2, v3
- [x] fix bugs with v1 lockfile tests (the dummy lockfile we were writing was not actually
a v1 lockfile because it used the new spec file format).
- [x] remove all but one runtime_hash usage -- that one needs a small rework of the
concretizer to really fix, as it's about separate concretization of build
dependencies.
- [x] Document the history of the lockfile format in `environment/__init__.py`
- [x] update test to use `build_hash` instead of `dag_hash`, as we're testing for
graph structure, and specifically NOT testing for package changes.
- [x] make hash descriptors callable on specs to simplify syntax for invoking them
- [x] make `Spec.spec_hash()` public
With the original DAG hash, we did not store build dependencies in the database, but
with the full DAG hash, we do. Previously, we'd never tell the concretizer about build
dependencies of things used by hash, because we never had them. Now, we have to avoid
telling the concretizer about them, or they'll unnecessarily constrain build
dependencies for new concretizations.
- [x] Make database track all dependencies included in the `dag_hash`
- [x] Modify spec_clauses so that build dependency information is optional
and off by default.
- [x] `spack diff` asks `spec_clauses` for build dependencies for completeness
- [x] Modify `concretize.lp` so that reuse optimization doesn't affect fresh
installations.
- [x] Modify concretizer setup so that it does *not* prioritize installed versions
over package versions. We don't need this with reuse, so they're low priority.
- [x] Fix `test_installed_deps` for full hash and new concretizer (does not work
for old concretizer with full hash -- leave this for later if we need it)
- [x] Move `test_installed_deps` mock packages to `builtin.mock` for easier debugging
with `spack -m`.
- [x] Fix `test_reuse_installed_packages_when_package_def_changes` for full hash
The database now stores full hashes, so we need to adjust the criteria we use to determine if something can be uninstalled. Specifically, it's ok to uninstall thing that have remaining build-only dependents.
We previously had checks in `directory_layout` to check for build-dependency
conflicts when we weren't storing build dependencies. We don't need
those anymore; we can just rely on the DAG hash now that it includes everything
we know about each spec.
- [x] Remove vestigial code for checking installed spec against concrete spec
in `ensure_installed()`
- [x] Remove `SpecHashCollisionError` -- if specs have the same hash now, they're
the same as far as `DirectoryLayout` should be concerned.
- [x] Convert spec comparison to `dag_hash()` comparison when adding extensions.
|
Oh, and we should wait for #30396 to rebase this one. |
alalazo
left a comment
There was a problem hiding this comment.
I think all I could see is addressed. So from my point of view let's 🤞 and merge
fixes #21491.
fixes #23428.
fixes #30310.
For a long time, Spack has used a coarser hash to identify packages than it likely should. Packages are identified by
dag_hash(), which includes only link and run dependencies. Build dependencies are stripped before hashing, and we have notincluded hashes of build artifacts or thepackage.pyfiles used to build. This means the DAG hash actually doesn't represent all the things Spack can build, and it reduces reproducibility.We did this because, in the early days, users were (rightly) annoyed when a new version of CMake, autotools, or some other build dependency would necessitate a rebuild of their entire stack. Coarsening the hash avoided this issue and enabled a modicum of stability when only reusing packages by hash match.
Now that we have
--reuse, we don't need to be so careful. Users can avoid unnecessary rebuilds much more easily, and we can add more provenance to the spec without worrying that frequent hash changes will cause too many rebuilds.This PR makes the following changes:
Spec.dag_hash()include build, run, and link dependencies and the package hash (it is now equivalent tofull_hash()).build_hash()andfull_hash()methods where needed.dag_hash()spack cito not splice specs on updateThere are also some bugfixes for problems discovered during the switch:
package_hash()into_node_dict()unless the spec is concrete (fixes breaks on abstract specs)Spec.patchesis memoized. Usingllnl.util.lang.memoizedonSpecobjects causes specs to be stored in adict, which means they need a hash. But,dag_hash()now includes patchsha256's via the package hash, which can lead to infinite recursion.