concretizer: use only `attr()` for Spec attributes by tgamblin · Pull Request #31202 · spack/spack

tgamblin · 2022-06-21T00:18:18Z

concretizer: use only attr() for Spec attributes

All Spec attributes are now represented as attr(attribute_name, ... args ...), e.g.
attr("node", "hdf5") instead of node("hdf5"). We have to maintain the attr() form
anyway, and keeping a single form of the Spec information simplifies the encoding.

Background

In #20644, we unified the way conditionals are done in the concretizer, but this
introduced a nasty aspect to the encoding: we have to maintain everything we want in
general conditions in two forms: predicate(...) and attr("predicate", ...). For
example, here's the start of the table of spec attributes we had to maintain:

node(Package)                      :- attr("node", Package).
virtual_node(Virtual)              :- attr("virtual_node", Virtual).
hash(Package, Hash)                :- attr("hash", Package, Hash).
version(Package, Version)          :- attr("version", Package, Version).
...

attr("node", Package)              :- node(Package).
attr("virtual_node", Virtual)      :- virtual_node(Virtual).
attr("hash", Package, Hash)        :- hash(Package, Hash).
attr("version", Package, Version)  :- version(Package, Version).
...

This adds cognitive load to understanding how the concretizer works, as you have to
understand the equivalence between the two forms of spec attributes. It also makes the
general condition logic in #20644 hard to explain, and it's easy to forget to add a new
equivalence to this list when adding new spec attributes (at least two people have been
bitten by this).

Solution

- Remove the equivalence list from concretize.lp.
- Simplify spec_clauses(), condition(), and other functions in asp.py that need
  to deal with Spec attributes.
- Convert all old-form spec attributes in concretize.lp to the attr() form.
- Simplify display.lp, where we also had to maintain a list of spec attributes. Now
  we only need to show attr/2, attr/3, and attr/4.
- Simplify model extraction logic in asp.py.
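
To illustrate why a single attr() form can simplify model extraction, here is a minimal sketch. It assumes the model is available as plain tuples such as ("attr", "node", "hdf5"); `ToySpecBuilder` and `build` are hypothetical names for illustration, not the actual code in asp.py.

```python
# Hypothetical sketch: ToySpecBuilder and build() are illustrative names,
# not Spack's real SpecBuilder API.
from typing import Dict, Iterable, Tuple

Atom = Tuple[str, ...]  # e.g. ("attr", "node", "hdf5")


class ToySpecBuilder:
    """Collects per-package attributes from attr(...) atoms."""

    def __init__(self) -> None:
        self.specs: Dict[str, dict] = {}

    def node(self, pkg: str) -> None:
        self.specs.setdefault(pkg, {})

    def version(self, pkg: str, version: str) -> None:
        self.specs.setdefault(pkg, {})["version"] = version

    def variant_value(self, pkg: str, name: str, value: str) -> None:
        variants = self.specs.setdefault(pkg, {}).setdefault("variants", {})
        variants[name] = value


def build(atoms: Iterable[Atom]) -> Dict[str, dict]:
    """Dispatch each attr atom on its first argument (the attribute name)."""
    builder = ToySpecBuilder()
    for pred, name, *args in atoms:
        if pred != "attr":
            continue  # only attr/N atoms describe the solution
        handler = getattr(builder, name, None)
        if callable(handler):
            handler(*args)  # attributes without a handler are skipped
    return builder.specs
```

With only one predicate name to look at, the dispatch key is always the first argument, so no per-predicate table is needed in this sketch.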

Performance

This seems to result in a smaller grounded problem (as there are no longer duplicated
attr("foo", ...) / foo(...) predicates in the program), but it also adds a slight
performance overhead vs. develop. Ultimately, simplifying the encoding will be a win,
particularly for improving error messages.

Notes

This will simplify future node refactors in concretize.lp (e.g., not identifying nodes
by package name, which we need for separate build dependencies).

I'm still not entirely used to reading attr() notation, but I think it's ultimately
clearer than what we did before. We need more uniform naming, and it's now clear what is
part of a solution. We should probably continue making the encoding of concretize.lp
simpler and more self-explanatory. It may make sense to rename attr to something like
node_attr and to simplify the names of node attributes. It also might make sense to do
something similar for other types of predicates in concretize.lp.

lib/spack/spack/solver/concretize.lp

trws · 2022-06-22T18:13:35Z

This looks really good to me, I took a relatively close pass over it and found no issues at all past what @becker33 already mentioned.

lib/spack/spack/solver/asp.py

lib/spack/spack/solver/concretize.lp

alalazo

It seems there are tests that need their assertion slightly changed:

        # ensure that variant diffs are in here the result
>       assert ['variant_value', 'mpileaks debug False'] in c['a_not_b']
E       AssertionError: assert ['variant_value', 'mpileaks debug False'] in [['attr', 'hash mpileaks qek6lz5qlm7tgckn5j3pcnlxxlj57nqe'], ['attr', 'variant_value mpileaks debug False']]
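
The failure message already shows the shape the assertion would need. A minimal sketch, using only the data printed in the error above (the actual change made in the test suite is not shown here, so treat the new form as an assumption):

```python
# Data copied from the AssertionError above; "c" stands for the diff result.
c = {
    "a_not_b": [
        ["attr", "hash mpileaks qek6lz5qlm7tgckn5j3pcnlxxlj57nqe"],
        ["attr", "variant_value mpileaks debug False"],
    ]
}

# Old expectation: the predicate name is the first element.
old_form = ["variant_value", "mpileaks debug False"]
# Assumed new expectation: everything is reported under "attr".
new_form = ["attr", "variant_value mpileaks debug False"]

old_form_present = old_form in c["a_not_b"]
new_form_present = new_form in c["a_not_b"]
```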

becker33 · 2022-11-28T15:13:21Z

lib/spack/spack/solver/asp.py

-        if name == "error":
-            priority = function_tuple[1][0]
-            return (-5, priority)
-        elif name == "hash":


This needs to be restored for error messages to work properly.

This was intentional. See https://github.com/spack/spack/pull/31202/files#diff-50f1229db6f3f9305546f946f508c28f829ab7e778c0a6be4e3a9f4dd6c9b1b7R763-R767

becker33 · 2022-11-28T15:14:20Z

lib/spack/spack/solver/asp.py

-    def error(self, priority, msg, *args):
-        msg = msg.format(*args)
-
-        # For variant formatting, we sometimes have to construct specs
-        # to format values properly. Find/replace all occurances of
-        # Spec(...) with the string representation of the spec mentioned
-        specs_to_construct = re.findall(r"Spec\(([^)]*)\)", msg)
-        for spec_str in specs_to_construct:
-            msg = msg.replace("Spec(%s)" % spec_str, str(spack.spec.Spec(spec_str)))
-        raise UnsatisfiableSpecError(msg)


This needs to be restored, looks like a rebase issue that removed it

Nope -- this was intentional. It's moved to here: https://github.com/spack/spack/pull/31202/files#diff-50f1229db6f3f9305546f946f508c28f829ab7e778c0a6be4e3a9f4dd6c9b1b7R649-R662

becker33 · 2022-11-28T15:54:59Z

lib/spack/spack/solver/concretize.lp

-#defined virtual_node/1.
-#defined virtual_root/1.
 #defined virtual_condition_holds/2.
 #defined external/1.


Shouldn't external be an attr, since it's something we use in the reconstruction of the spec?

I don't see where we have a handler for it. We have a handler for external_spec_selected, and attr("external_spec_selected", Package, LocalIndex) is used to get the information back to the solver. That was previously in display.lp and is used by SpecBuilder, but external/1 is just input metadata.

alalazo · 2022-11-29T09:32:40Z

lib/spack/spack/solver/display.lp

+#show attr/2.
+#show attr/3.
+#show attr/4.


Is there any way to print only the attr/N atoms that match a certain first argument? Before, in complex debugging situations, we could comment out all but a few atoms. I wonder whether we can do something similar now that they are all aggregated under the same name.

Unfortunately, there isn't.
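
For debugging, one possible workaround is to filter the printed atoms outside the solver. A minimal sketch, assuming each atom is available as a string of the form attr("name",...); `filter_attrs` is a hypothetical helper, not part of Spack or clingo.

```python
import re
from typing import Iterable, List

# Matches the attribute name, i.e. the first (quoted) argument of attr(...).
_ATTR_NAME = re.compile(r'^attr\("([^"]+)"')


def filter_attrs(atoms: Iterable[str], names: Iterable[str]) -> List[str]:
    """Keep only attr atoms whose first argument is one of `names`."""
    wanted = set(names)
    kept = []
    for atom in atoms:
        match = _ATTR_NAME.match(atom)
        if match and match.group(1) in wanted:
            kept.append(atom)
    return kept
```

Calling `filter_attrs(atoms, ["node", "hash"])` would then play the role that commenting out individual show lines played before.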

alalazo · 2022-11-29T14:26:05Z

I timed the solve on this list of specs:

radiuss.txt

and obtained the following results:

Plotting a histogram of the time, in seconds, spent in the various phases, on this branch:

while on develop:

alalazo

Only minor comments to the code. I'll try to check clingo internal stats next, to see if I can make any sense of them 🤯

lib/spack/spack/solver/asp.py

alalazo · 2022-11-29T15:01:54Z

lib/spack/spack/solver/asp.py

-            node_flag = fn.node_flag
-            node_flag_propagate = fn.node_flag_propagate
-            variant_propagate = fn.variant_propagate
+            node = fn.attr("node")


Wondering if using enum to collect all attr is a good idea or not 🤔

Not all attrs need to be here in spec_clauses. There are internal solver attributes that are also passed through our general conditions and need to be attrs, but we don't want them to be listed here.

Sure. I meant if we should enumerate all the literal strings we use in attr and use the enum throughout this entire file (instead of repeating the string everywhere).
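
A rough sketch of what such an enum could look like. `AttrName` and the `attr` helper below are hypothetical and only cover a few attribute names mentioned in this PR; they are not the actual `fn.attr` machinery in asp.py.

```python
import enum


class AttrName(str, enum.Enum):
    """Hypothetical enumeration of attribute name literals."""

    NODE = "node"
    VIRTUAL_NODE = "virtual_node"
    HASH = "hash"
    VERSION = "version"
    VARIANT_VALUE = "variant_value"


def attr(name: AttrName, *args: str) -> tuple:
    """Build an attr tuple from an enum member instead of a repeated string."""
    return ("attr", name.value) + args
```

A misspelled member such as `AttrName.VERSOIN` would fail with an AttributeError at the call site, whereas a misspelled string literal would silently produce an atom nothing matches.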

lib/spack/spack/solver/asp.py

tgamblin · 2022-11-30T00:36:08Z

@alalazo the performance numbers you reported are between 15 - 32% overhead for grounding vs. develop which is weird for an encoding that's logically equivalent to what we had before. I'll be interested to see what you come up with -- might also dig into the numbers myself. Let's not merge until we decide whether this is acceptable or whether it can be mitigated.

alalazo · 2022-11-30T15:53:13Z

I'm attaching a tarball radiuss.tar.gz with some analysis I'm trying to do to get insight on the timing. All the numbers (atoms, choices, etc.) seem slightly lower in this new encoding, as we expect. The only one that is higher is "Problems".

Using:

clingo --stats=2 --configuration=tweety --opt-strategy=usc,one concretize.lp os_compatibility.lp display.lp lbann.lp

I obtain the following statistics on develop:

Optimization: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 32 27 0 0 0
OPTIMUM FOUND

Models       : 10
  Optimum    : yes
Optimization : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 32 27 0 0 0
Calls        : 1
Time         : 17.686s (Solving: 6.79s 1st Model: 3.51s Unsat: 0.00s)
CPU Time     : 17.663s

Choices      : 112092  
Conflicts    : 658      (Analyzed: 652)
Restarts     : 5        (Average: 130.40 Last: 1)
Model-Level  : 22617.0 
Problems     : 7        (Average Length: 0.00 Splits: 0)
Lemmas       : 8839     (Deleted: 0)
  Binary     : 3750     (Ratio:  42.43%)
  Ternary    : 1064     (Ratio:  12.04%)
  Conflict   : 652      (Average Length:   12.1 Ratio:   7.38%) 
  Loop       : 8187     (Average Length:   21.4 Ratio:  92.62%) 
  Other      : 0        (Average Length:    0.0 Ratio:   0.00%) 
Backjumps    : 652      (Average: 3510.83 Max: 192904 Sum: 2289062)
  Executed   : 606      (Average: 138.83 Max: 192904 Sum:  90520 Ratio:   3.95%)
  Bounded    : 46       (Average: 47794.39 Max: 190267 Sum: 2198542 Ratio:  96.05%)

Rules        : 2239567  (Original: 2179418)
  Choice     : 60586   
  Minimize   : 39      
  Heuristic  : 5876     (Original: 5878)
Atoms        : 953741  
Bodies       : 1697374  (Original: 1703260)
  Count      : 0        (Original: 5970)
Equivalences : 2117871  (Atom=Atom: 206153 Body=Body: 363106 Other: 1548612)
Tight        : No       (SCCs: 1810 Non-Hcfs: 0 Nodes: 237176 Gammas: 0)
Variables    : 1200500  (Eliminated:    0 Frozen: 454445)
Constraints  : 4951653  (Binary:  73.4% Ternary:  17.3% Other:   9.3%)

and the following from this PR:

Optimization: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 32 27 0 0 0
OPTIMUM FOUND

Models       : 9
  Optimum    : yes
Optimization : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 32 27 0 0 0
Calls        : 1
Time         : 18.986s (Solving: 6.63s 1st Model: 3.69s Unsat: 0.02s)
CPU Time     : 18.904s

Choices      : 107824  
Conflicts    : 591      (Analyzed: 582)
Restarts     : 4        (Average: 145.50 Last: 0)
Model-Level  : 25502.3 
Problems     : 10       (Average Length: 0.00 Splits: 0)
Lemmas       : 8512     (Deleted: 0)
  Binary     : 3517     (Ratio:  41.32%)
  Ternary    : 985      (Ratio:  11.57%)
  Conflict   : 582      (Average Length:   16.0 Ratio:   6.84%) 
  Loop       : 7930     (Average Length:   21.3 Ratio:  93.16%) 
  Other      : 0        (Average Length:    0.0 Ratio:   0.00%) 
Backjumps    : 582      (Average: 4496.39 Max: 193116 Sum: 2616898)
  Executed   : 528      (Average: 142.67 Max: 193116 Sum:  83034 Ratio:   3.17%)
  Bounded    : 54       (Average: 46923.41 Max: 189627 Sum: 2533864 Ratio:  96.83%)

Rules        : 2137286  (Original: 2077137)
  Choice     : 60586   
  Minimize   : 39      
  Heuristic  : 5876     (Original: 5878)
Atoms        : 906889  
Bodies       : 1639341  (Original: 1645223)
  Count      : 0        (Original: 5970)
Equivalences : 1973868  (Atom=Atom: 155585 Body=Body: 361218 Other: 1457065)
Tight        : No       (SCCs: 1695 Non-Hcfs: 0 Nodes: 234103 Gammas: 0)
Variables    : 1198930  (Eliminated:    0 Frozen: 452653)
Constraints  : 4942471  (Binary:  73.4% Ternary:  17.4% Other:   9.3%)

alalazo · 2022-11-30T16:09:59Z

I think this line might be interesting:

# develop
Conflict   : 652      (Average Length:   12.1 Ratio:   7.38%) 

# This PR
Conflict   : 582      (Average Length:   16.0 Ratio:   6.84%)

which suggests to me that it takes more chaining of rules in this PR to find a conflict and backtrack.
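
To put the two runs side by side, here is a small sketch that computes relative changes for a few of the statistics quoted above. The numbers are copied from the two clingo outputs; the choice of which fields to compare is mine.

```python
# Values copied from the clingo --stats=2 outputs above.
develop = {"time_s": 17.686, "atoms": 953741, "rules": 2239567, "conflict_avg_len": 12.1}
pr = {"time_s": 18.986, "atoms": 906889, "rules": 2137286, "conflict_avg_len": 16.0}


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


changes = {key: percent_change(develop[key], pr[key]) for key in develop}

for key, value in changes.items():
    print(f"{key:>18}: {value:+.2f}%")
```

For this single lbann run, atoms and rules go down while total time and average conflict length go up.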


tgamblin · 2022-12-02T08:58:24Z

I'm a little confused because on my M1 I see basically no difference in timing (for hdf5 and ascent), and if I output the ASP and run it through clingo on the CLI, the attr-encoding is actually somewhat faster (at least with the timings shown for --outf=1).

One thought: are you running with a bootstrapped clingo or a more recent one that you've installed? Curious if you see the same performance degradation with a more recent clingo. I'm using 5.5.2, and it's a spack-built 5.5.2. We should maybe think about bootstrapping a more recent one for develop now that we are no longer supporting Python 2.7.

alalazo · 2022-12-02T09:02:30Z

I got the stats in #31202 (comment) using clingo 5.5.2, and they show a degradation of less than 10%. The plots were made with the binaries we provide, i.e. the last version of clingo supporting Python 2.7.

alalazo · 2022-12-02T09:04:01Z

Also, fwiw the details on my setup are:

Spack: 0.20.0.dev0 (65a5369)
Python: 3.8.10
Platform: linux-ubuntu20.04-icelake
Concretizer: clingo

so icelake instead of m1.

Review addressed

alalazo

The average slowdown from the plots in #31202 (comment) is 5.85% - so I think it's acceptable even though I couldn't figure out the exact cause of it 😟

tgamblin · 2022-12-02T17:40:04Z

Ok looking at this again -- the total overhead vs. develop with older, bootstrapped clingo is ~9%, which is not a whole lot. I think this is not a huge concern given the simplification in the encoding. Considering that new clingo seems to be faster with 5.85% overhead I think it's even less of a worry.

I really wish I knew where the differences are coming from.

We should probably start updating the bootstrap binaries to newer clingo now that we require Python 3.6


alalazo · 2023-03-27T19:39:49Z

develop (10d10b6, blue) vs. PR (8756204, orange)

radiuss_develop.csv
radiuss_31202.csv
radiuss.txt

There seems to be a small slowdown in solve and ground, comparable to what was reported in #31202 (review)

tgamblin requested review from alalazo, becker33 and trws June 21, 2022 00:18

spackbot-app bot added the dependencies label Jun 21, 2022

tgamblin force-pushed the attr-encoding branch from 35f1120 to f476acf Compare June 21, 2022 00:24

becker33 reviewed Jun 21, 2022

tgamblin force-pushed the attr-encoding branch from f476acf to 1208bba Compare June 22, 2022 08:19

tgamblin force-pushed the attr-encoding branch 4 times, most recently from 784b292 to 9f1ec6b Compare June 24, 2022 07:10

tldahlgren reviewed Jun 28, 2022

alalazo previously requested changes Jun 30, 2022

alalazo self-assigned this Jun 30, 2022

tgamblin force-pushed the attr-encoding branch from 9f1ec6b to 6beaeec Compare November 25, 2022 09:13

spackbot-app bot added the core PR affects Spack core functionality label Nov 25, 2022

tgamblin force-pushed the attr-encoding branch 3 times, most recently from b4de145 to 419c050 Compare November 27, 2022 19:25

spackbot-app bot added the commands label Nov 27, 2022

tgamblin force-pushed the attr-encoding branch 2 times, most recently from b719fe2 to 1030f8d Compare November 28, 2022 01:08

tgamblin requested review from alalazo and becker33 November 28, 2022 01:17

tgamblin force-pushed the attr-encoding branch 2 times, most recently from 34feb96 to 61c6b74 Compare November 28, 2022 06:53

tgamblin added this to the v0.20.0 milestone Nov 28, 2022

becker33 requested changes Nov 28, 2022

alalazo reviewed Nov 29, 2022

tgamblin force-pushed the attr-encoding branch from 61c6b74 to 4cd054a Compare November 30, 2022 00:17

becker33 previously approved these changes Nov 30, 2022

tgamblin dismissed becker33’s stale review via e114263 December 2, 2022 08:36

tgamblin force-pushed the attr-encoding branch from 4cd054a to e114263 Compare December 2, 2022 08:36

alalazo approved these changes Dec 2, 2022

alalazo merged commit 8756204 into develop Dec 2, 2022

alalazo deleted the attr-encoding branch December 2, 2022 17:56

alalazo mentioned this pull request Mar 27, 2023

Avoid verifying variants in default package requirements #35037

Merged
