Skip to content

Add virtual information on DAG edges#34821

Merged
becker33 merged 6 commits intospack:developfrom
alalazo:features/keep_track_of_virtuals_on_edges
Jun 15, 2023
Merged

Add virtual information on DAG edges#34821
becker33 merged 6 commits intospack:developfrom
alalazo:features/keep_track_of_virtuals_on_edges

Conversation

@alalazo
Copy link
Copy Markdown
Member

@alalazo alalazo commented Jan 5, 2023

depends on #34807
closes #35258
fixes #34886

⚠️ Bumps format of specfile, and thus of lockfile and database ⚠️

Currently virtual dependencies are used only during concretization to select a provider. No annotation is kept on edges to reflect that a leaf node is providing a virtual dependency. This PR adds virtual information on edges, so we can keep track of which spec is providing what virtual both in memory and in the JSON / YAML representation.

An example entry in JSON format is:

        "dependencies": [
          {
            "name": "cmake",
            "hash": "iqpoju67adln3yqzvqzccrtkjmpc666m",
            "parameters": {
              "deptypes": [
                "build"
              ],
              "virtuals": []
            }
          },
          {
            "name": "openmpi",
            "hash": "p23v372ij7bomrarjrhcv4uw6ieahjt3",
            "parameters": {
              "deptypes": [
                "build",
                "link"
              ],
              "virtuals": [
                "mpi"
              ]
            }
          },
          {
            "name": "pkgconf",
            "hash": "i4avrindvhcamhurzbfdaggbj2zgsrrh",
            "parameters": {
              "deptypes": [
                "run"
              ],
              "virtuals": [
                "pkgconfig"
              ]
            }
          },
          {
            "name": "zlib",
            "hash": "nizxi5u5bbrzhzwfy2qb7hatlhuswlrz",
            "parameters": {
              "deptypes": [
                "build",
                "link"
              ],
              "virtuals": []
            }
          }
        ],

Modifications:

  • Add a virtuals attribute to DependencySpec
  • Modify Spec.to_node_dict to output information on virtuals in the `"dependencies" section
  • Add unit tests (round-trip specs / check annotation)
  • Modified DependencySpec to hold an arbitrary number of attributes (98b9806)
  • Separated loading functions from dict according to specfile version, add unit-tests to ensure we can read old specfiles (f19daa8)
  • Reconstruct virtuals from old specfile formats (0b2db30)

@spackbot-app spackbot-app bot added build-systems core PR affects Spack core functionality environments tests General test capability(ies) labels Jan 5, 2023
@alalazo alalazo force-pushed the features/keep_track_of_virtuals_on_edges branch 3 times, most recently from 528dddc to 857754f Compare January 9, 2023 11:05
@alalazo alalazo added this to the v0.20.0 milestone Jan 9, 2023
@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Jan 9, 2023

@tgamblin Do we want to reconstruct virtuals on edges when reading "old" specs in JSON format? I think that is possible in principle, since specs have a unique provider.

@haampie
Copy link
Copy Markdown
Member

haampie commented Jan 10, 2023

We could also consider printing this info in spack spec:

pkg
    ^openblas (provides lapack, blas)

@tgamblin tgamblin self-assigned this Jan 10, 2023
@alalazo alalazo marked this pull request as draft January 11, 2023 17:59
@alalazo alalazo force-pushed the features/keep_track_of_virtuals_on_edges branch 2 times, most recently from dee0025 to 0b2db30 Compare January 18, 2023 15:52
@alalazo alalazo marked this pull request as ready for review January 18, 2023 16:38
@alalazo alalazo force-pushed the features/keep_track_of_virtuals_on_edges branch from 0b2db30 to 0d01503 Compare January 19, 2023 13:39
@alalazo alalazo mentioned this pull request Jan 19, 2023
2 tasks
@alalazo alalazo force-pushed the features/keep_track_of_virtuals_on_edges branch from 0d01503 to c2207c9 Compare January 20, 2023 09:51
alalazo added a commit to alalazo/spack that referenced this pull request Jan 23, 2023
Extracted from spack#34821

This commit makes explicit the format version of the spec file
we are reading from.

Before there were different functions capable of reading some
part of the spec file at multiple format versions. The decision
was implicit, since checks were based on the structure of the
JSON without ever checking a format version number.

The refactor makes also explicit which spec file format is used
by which database and lockfile format, since the information is
stored in global mappings.

To ensure we don't change the hash of old specs, JSON representations
of specs have been added as data. A unit tests checks that we read
the correct hash in, and that the hash stays the same when we
re-serialize the spec using the most recent format version.
alalazo added a commit to alalazo/spack that referenced this pull request Jan 23, 2023
Extracted from spack#34821

This commit makes explicit the format version of the spec file
we are reading from.

Before there were different functions capable of reading some
part of the spec file at multiple format versions. The decision
was implicit, since checks were based on the structure of the
JSON without ever checking a format version number.

The refactor makes also explicit which spec file format is used
by which database and lockfile format, since the information is
stored in global mappings.

To ensure we don't change the hash of old specs, JSON representations
of specs have been added as data. A unit tests checks that we read
the correct hash in, and that the hash stays the same when we
re-serialize the spec using the most recent format version.
@alalazo alalazo force-pushed the features/keep_track_of_virtuals_on_edges branch from c2207c9 to 5a41fb2 Compare January 26, 2023 10:41
Comment on lines +165 to +167
# TODO: Check this with Peter
parent_spec._add_dependency(dep_spec, deptypes=deptypes, virtuals=())
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scheibelp Is ☝️ correct when reading from a Cray manifest? Or are there cases where we read DAGs that should include virtual dependency annotations on edges?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Virtual information would not be included in the manifest currently. If virtuals=(), does that mean that the dependency provides no virtuals (i.e. spec["mpi"] would fail for a dag that contains openmpi)? Or does it default to providing all the virtuals that it could provide?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means that the dependency provides no virtuals on the edge being added. Currently I didn't change any type of computation internal to Spack, so spec["mpi"] still works without looking at edge attributes - but I'm pretty sure that will change in the future.

When reading from previous formats I have added a function that tries to reconstruct the virtual attributes on edges:

spack/lib/spack/spack/spec.py

Lines 4808 to 4817 in 5a41fb2

def _reconstruct_virtuals_on_edges(spec):
"""Reconstruct virtuals on edges. Used to read from old DB
and reindex.
"""
possible_virtuals = [x for x in spec.package.dependencies if Spec(x).virtual]
for vspec in possible_virtuals:
if vspec in spec:
name = spec[vspec].name
for edge in spec.edges_to_dependencies(name=name):
edge.update_virtuals([vspec])

Should we do something similar for Cray too (or reuse the same function if possible)?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means that the dependency provides no virtuals on the edge being added.

I think we should probably default to inferring that all possible virtuals are provided, but perhaps the next bit will provide clarity on why I think that (and if I'm wrong).

Should we do something similar for Cray too (or reuse the same function if possible)?

Cray (specifically specs read from manifests on Cray machines) is not different from reading/using specs that were installed prior to this merge. I am confused by the notion that Cray-specific logic is needed:

  • If a new install directly depends on a spec read from the manifest, I assume this PR could assign virtual information to the edges
  • However, DAGs read from the manifest do not encode the virtual information on the edges, so Specs using these cannot select a subset of virtual functionality (i.e., if X -> foo and foo provides blas and lapack, any package that uses X must use foo for both blas and lapack)
    • I don't anticipate this will be a problem, as the manifest is primarily for describing things like cray-libsci that are direct providers

Does that resolve the issue? Or am I missing something?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alalazo it seems like this would need the virtual inference that you use for reading old spec formats here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 9c31dea It would be good if someone can suggest how to add a test that checks virtuals are reconstructed properly1

Footnotes

  1. I stepped in with a debugger and I believe it works, but it would be good to test it I guess.

@alalazo alalazo force-pushed the features/keep_track_of_virtuals_on_edges branch 2 times, most recently from 0fa8e54 to 0d8a9eb Compare May 4, 2023 18:10
@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented May 10, 2023

For reference, moving from the current develop to this PR is fine and any write operation will trigger an upgrade of:

  • The database
  • The lockfiles
  • The spec.json files

where appropriate.

When moving from this PR back to develop, we'll see the following errors.

Develop (e70755f)

Bad error message for databases, understandable error message for environments.

For databases:

$ spack find
==> Error: Version("7")

For environments:

$ spack concretize
==> Error: Spack 0.20.0.dev0 cannot read environment lockfiles using the v5 format

releases/v0.19 (9e8e725)

Understandable error message for databases, bad error message for environments.

For databases:

$ spack find
==> Error: Expected database version 6 but found version 7.
`spack reindex` may fix this, or you may need a newer Spack version.

For environments:

$ spack concretize
==> Error: 'type'

So not sure how to fix these properly. I'll post some tests with buildcaches tomorrow.

@tgamblin tgamblin modified the milestones: v0.20.0, v0.21.0 May 10, 2023
@tgamblin
Copy link
Copy Markdown
Member

moving this to 0.21, as it's a large change and we don't need it for 0.20. I think we need to get the safeguards in develop, 0.20, 0.19.3, then get this in soon after 0.20

@alalazo alalazo force-pushed the features/keep_track_of_virtuals_on_edges branch from b7929e7 to 141357c Compare June 6, 2023 13:34
@alalazo alalazo removed this from the v0.21.0 milestone Jun 7, 2023
@alalazo alalazo self-assigned this Jun 7, 2023
Copy link
Copy Markdown
Member

@becker33 becker33 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably note in the docs that this extension to the syntax doesn't work well with the old concretizer -- I only know of one user still using it and I think that will be zero pretty soon, but still worth noting. It looks like the syntax will work, but the virtual splitting looks like it will not. Is that accurate?

Comment on lines +165 to +167
# TODO: Check this with Peter
parent_spec._add_dependency(dep_spec, deptypes=deptypes, virtuals=())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alalazo it seems like this would need the virtual inference that you use for reading old spec formats here.

continue
current_spec[node.name].add_dependency_edge(
test_dependency.copy(), deptypes="test"
test_dependency.copy(), deptypes="test", virtuals=()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it would lose information if the package has a test dependency on a virtual? Am I missing something?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in d594511

Comment on lines +2176 to +2180
def test_virtuals_are_annotated_on_edges(self, default_mock_concretization):
spec = default_mock_concretization("mpileaks ^mpich")

edges = spec.edges_to_dependencies(name="mpich")
assert len(edges) == 1 and edges[0].virtuals == ("mpi",)
edges = spec.edges_to_dependencies(name="callpath")
assert len(edges) == 1 and edges[0].virtuals == ()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this test the case in which the dependency is found entirely by spack as well? Something like

Suggested change
def test_virtuals_are_annotated_on_edges(self, default_mock_concretization):
spec = default_mock_concretization("mpileaks ^mpich")
edges = spec.edges_to_dependencies(name="mpich")
assert len(edges) == 1 and edges[0].virtuals == ("mpi",)
edges = spec.edges_to_dependencies(name="callpath")
assert len(edges) == 1 and edges[0].virtuals == ()
@pytest.mark.parametrize("spec", ["mpileaks", "mpileaks ^mpich"])
def test_virtuals_are_annotated_on_edges(self, spec, default_mock_concretization):
spec = default_mock_concretization(spec)
mpiname = spec["mpi"].name
edges = spec.edges_to_dependencies(name=mpiname)
assert len(edges) == 1 and edges[0].virtuals == ("mpi",)
edges = spec.edges_to_dependencies(name="callpath")
assert len(edges) == 1 and edges[0].virtuals == ()

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 29a8457

yield self.parent.name if self.parent else None
yield self.spec.name if self.spec else None
yield self.deptypes
yield self.virtuals
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be more forward thinking to yield a dict of self.parameters?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can extend that when we'll work on use_variants and we'll introduce parameters other than these two. Using a dict requires us to make the object hashable, since:

>>> {} < {}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'dict' and 'dict'

# both the parent and the child. When we update this object they'll
# both see the deptype modification.
edge.add_type(deptypes)
edge.update_deptypes(deptypes=deptypes)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be able to update the virtual information here as well, is there a reason that's not implemented as part of this?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 427bb95

@alalazo alalazo force-pushed the features/keep_track_of_virtuals_on_edges branch from 141357c to 9c31dea Compare June 14, 2023 08:18
@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Jun 14, 2023

We should probably note in the docs that this extension to the syntax doesn't work well with the old concretizer -- I only know of one user still using it and I think that will be zero pretty soon, but still worth noting. It looks like the syntax will work, but the virtual splitting looks like it will not. Is that accurate?

This PR doesn't introduce any change in the spec syntax. It's just a preliminary change in the internal representation. #35322 is coded on top of this PR and introduces that change. Are you referring to that PR?

@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Jun 14, 2023

Do we need to take any action on the develop buildcache when we merge this? In my opinion it may be worth wiping the develop buildcache out and rebuild it from scratch after this PR is merged (so that we'll move every spec to the new format), but let me know if you disagree.

Also, people using the develop buildcache with a commit preceding the eventual merge, will be unable to continue doing so. I think that is fine, since we also have develop snapshots, but this is worth noting explicitly.

@becker33
Copy link
Copy Markdown
Member

This PR doesn't introduce any change in the spec syntax. It's just a preliminary change in the internal representation. #35322 is coded on top of this PR and introduces that change. Are you referring to that PR?

Yes, I was conflating with that PR.

Copy link
Copy Markdown
Member

@becker33 becker33 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now

@becker33 becker33 merged commit f27d012 into spack:develop Jun 15, 2023
@alalazo alalazo deleted the features/keep_track_of_virtuals_on_edges branch June 15, 2023 14:26
alalazo added a commit to alalazo/spack that referenced this pull request Aug 18, 2023
Since spack#34821 we are annotating virtual dependencies on
DAG edges, and reconstructing virtuals in mempry when
we read a concrete spec from previous formats.

Therefore, we can remove a TODO in asp.py, and rely on
"virtual_on_edge" facts to be imposed.
tgamblin pushed a commit that referenced this pull request Aug 22, 2023
Since #34821 we are annotating virtual dependencies on
DAG edges, and reconstructing virtuals in memory when
we read a concrete spec from previous formats.

Therefore, we can remove a TODO in asp.py, and rely on
"virtual_on_edge" facts to be imposed.
mpokorny pushed a commit to mpokorny/spack that referenced this pull request Sep 18, 2023
Since spack#34821 we are annotating virtual dependencies on
DAG edges, and reconstructing virtuals in memory when
we read a concrete spec from previous formats.

Therefore, we can remove a TODO in asp.py, and rely on
"virtual_on_edge" facts to be imposed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

build-systems commands core PR affects Spack core functionality environments tests General test capability(ies)

Projects

Development

Successfully merging this pull request may close these issues.

Annotate virtual dependencies on DAG edges

5 participants