ASP-based solver: unique objects for reused specs #39590

haampie merged 9 commits into spack:develop
Conversation
Force-pushed from 156e802 to a8e99c7.
Force-pushed from a8e99c7 to 7f8c44a.
lib/spack/spack/solver/asp.py (outdated)

```python
        return False

    # Unify objects in the container
    for edge in reversed(spack.traverse.traverse_edges_topo([spec], direction="children")):
```
why edges / double loop / topo order?
FYI: if the goal is to early-exit on a sub-DAG added before, don't use topo order, since it has to traverse the full DAG to order things :) So just use DFS and don't recurse into children that were already added (that means using a visitor directly, since an iterator can't be stopped from recursing; if that's too verbose, do explicit recursion or keep a stack).
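A minimal, self-contained sketch of the traversal the reviewer suggests here — an explicit stack that skips sub-DAGs already added, so previously-seen subtrees are never descended into. The `Node` class and the use of names as stand-ins for DAG hashes are illustrative assumptions, not Spack's actual `Spec` API:

```python
class Node:
    """Toy DAG node; ``name`` stands in for a spec's DAG hash."""

    def __init__(self, name, deps=()):
        self.name = name
        self.deps = list(deps)


def add_new_subdags(root, seen):
    """DFS from ``root`` that does not recurse into sub-DAGs already in ``seen``.

    Returns the names added by this call, in visit order.
    """
    added = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.name in seen:  # early exit: this whole sub-DAG was added before
            continue
        seen[node.name] = node
        added.append(node.name)
        stack.extend(node.deps)
    return added


# a -> b -> c; once b's sub-DAG is added, visiting from a only touches "a"
c = Node("c")
b = Node("b", [c])
a = Node("a", [b])
seen = {}
add_new_subdags(b, seen)
print(add_new_subdags(a, seen))  # → ['a']
```

This is linear in the number of nodes actually added, instead of re-walking (and re-ordering) the full DAG on every call.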
> why edges?

Because I forgot to update this function.
> if the goal is to early exit on sub-dag added before

No, the goal is to unify objects for sub-DAGs added before, similarly to a roundtrip to/from the DB. That unifies the two Python objects representing the same DAG we have in memory.
Can you do early exit though? It doesn't make sense to iterate over everything and skip every node from each subdag seen before.
I mean, you must do this. Your current implementation effectively copies a database in O(n^2) time.
Actually, ReusableSpecsByHash is rather wasteful: it always copies everything.
By design daghash(x) == daghash(y) => id(x) == id(y) for all x and y from the same database. So the issue is just about merging databases (and adding small number of specs from say spec.json files).
So, why not duplicate only dependents of specs that occur in multiple databases? There's already a dictionary of dag hashes per db.
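A small illustrative sketch of the point made above: since within a single database equal DAG hashes already imply identical Python objects, only hashes that occur in two or more databases can produce duplicate objects, and only those (and their dependents) would need unification. The dict-of-dicts layout is an assumption for illustration, not Spack's real data structures:

```python
def hashes_needing_unification(databases):
    """Return DAG hashes that occur in two or more databases.

    Each database is modeled as a dict mapping dag_hash -> spec object;
    by the invariant daghash(x) == daghash(y) => id(x) == id(y) within one
    database, hashes unique to a single database need no work.
    """
    seen, duplicated = set(), set()
    for db in databases:
        for dag_hash in db:
            if dag_hash in seen:
                duplicated.add(dag_hash)
            seen.add(dag_hash)
    return duplicated


db_a = {"h1": "spec1", "h2": "spec2"}
db_b = {"h2": "spec2-dup", "h3": "spec3"}
print(sorted(hashes_needing_unification([db_a, db_b])))  # → ['h2']
```

Only `h2` appears in both databases, so only its two objects (and the dependents pointing at the copy from one side) would need reconciling, rather than copying every spec from every database.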
haampie left a comment:
Let's avoid copies of all specs from all databases, and instead exploit the invariant that each database has one spec instance per unique DAG hash: if there are two databases A and B, it's only necessary to copy those nodes that are unique to A but have dependencies in common with B.
I'll look into that optimization, but for reference this is what I get after adding:

```
$ spack mirror add v0.20.1 https://binaries.spack.io/v0.20.1
```

(results attached: radiuss-pr.csv) That is the reason why this implementation seemed viable. I'll try with a bigger buildcache (more traversal and copies) and see what I get.

Sure, but these things add up over time, so why not implement it efficiently from the start, if that's not difficult to achieve? At the very least, make it linear time...
Force-pushed from ef06f0d to 1970a6f.
I implemented an explicit visitor. With that, using a setup like:

```
$ spack mirror add v0.20.1 https://binaries.spack.io/v0.20.1
$ spack mirror add E4S https://cache.e4s.io
```

(results attached: radiuss-develop.csv) The case of
Hm okay. The current implementation is roughly equivalent to
Force-pushed from 34b99e3 to 16b3a2f.
Reused specs used to be referenced directly into the built spec. This might cause issues like in spack#39570 where two objects in memory represent the same node, because two reused specs were loaded from different sources but referred to the same spec by DAG hash. Here the issue is solved by registering the concrete specs to be reused in a container that ensures their consistency.
Since the test is brittle, just add the spec to a temporary store
Force-pushed from 16b3a2f to c4d84b5.
lib/spack/spack/test/concretize.py

```python
    assert spack.solver.asp._is_checksummed_version((v, v_opts)) == checksummed


class TestConcreteSpecsByHash:
```
Just to group tests for the ConcreteSpecsByHash class. During development, this can be used to run all related unit tests:

```
$ spack unit-test lib/spack/spack/test/concretize.py::TestConcreteSpecsByHash
```
Reused specs used to be referenced directly into the built spec. This might cause issues like in issue 39570 where two objects in memory represent the same node, because two reused specs were loaded from different sources but referred to the same spec by DAG hash. The issue is solved by copying concrete specs to a dictionary keyed by dag hash.


fixes #39570
Reused specs used to be referenced directly into the built spec.
This might cause issues like in #39570 where two objects in memory represent the same node, because two reused specs were loaded from different sources but referred to the same spec by DAG hash.
Here the issue is solved by registering the concrete specs to be reused in a container that ensures their consistency.
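The container approach described above can be sketched as a dictionary keyed by DAG hash that always hands back the first-stored object, so two specs loaded from different sources (e.g. a database and a spec.json file) collapse into one Python object. Names here are illustrative; Spack's real class (`ConcreteSpecsByHash` in `lib/spack/spack/solver/asp.py`) differs in detail:

```python
class SpecsByHash:
    """Toy container ensuring one canonical object per DAG hash."""

    def __init__(self):
        self._data = {}

    def add(self, dag_hash, spec):
        """Store ``spec`` under ``dag_hash`` if new; return the canonical object."""
        return self._data.setdefault(dag_hash, spec)

    def __getitem__(self, dag_hash):
        return self._data[dag_hash]


container = SpecsByHash()
from_db = {"name": "zlib"}    # stand-in for a spec loaded from a database
from_json = {"name": "zlib"}  # equal spec loaded from a spec.json file

first = container.add("abc123", from_db)
second = container.add("abc123", from_json)
print(first is second)  # → True: both callers now share one object
```

Callers that register a reused spec must keep the returned object rather than the one they passed in; that is what guarantees a single in-memory node per DAG hash.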