Skip to content

Patches: the underlying MV variant now uses the property setter#5596

Merged
scheibelp merged 2 commits intospack:developfrom
epfl-scitas:fixes/patch_in_dependencies
Oct 4, 2017
Merged

Patches: the underlying MV variant now uses the property setter#5596
scheibelp merged 2 commits intospack:developfrom
epfl-scitas:fixes/patch_in_dependencies

Conversation

@alalazo
Copy link
Copy Markdown
Member

@alalazo alalazo commented Oct 4, 2017

refers #5587
closes #5593

The implementation introduced in #5476 used dedupe to set a private member of a MultiValuedVariant object. dedupe is meant to give a stable deduplication of the items in a sequence, and returns an object whose value depends on the order of the items in its argument.

Now the private member is set implicitly using the associated property setter. The logic changes from stable deduplication to a totally ordered deduplication (i.e. instead of dedupe(t) it uses sorted(set(t))).

This should also grant that packages that shuffle the same set of patch directives maintain the same hash. I didn't try that out, but looking at the implementation I expect:

class A(Package):
    
    patch('foo.patch')
    patch('bar.patch')
    patch('baz.patch')

to have a different hash in develop with respect to:

class A(Package):
    
    # The order of directives changes here
    patch('bar.patch')
    patch('baz.patch')
    patch('foo.patch')

while it should give the same hash now.

@tgamblin Was the previous implementation done on purpose out of concerns of the order of application of patches?

EDIT: after reading the code in #5476 I can say for sure that the answer to the question above is yes. This PR is currently a work in progress because I need to figure out the best way to ensure the right order of application for patches.

Probably there are also workarounds that could maintain the same logic as in develop, even though I argue that setting private attributes to by-pass a setter is not a best-practice (and fires back sometimes 😄).

refers spack#5587
closes spack#5593

The implementation introduced in spack#5476 used `dedupe` to set a private
member of a `MultiValuedVariant` object. `dedupe` is meant to give a
stable deduplication of the items in a sequence, and returns an object
whose value depends on the order of items in its argument.

Now the private member is set implicitly using the associated property
setter. The logic changes from stable deduplication to a totally ordered
deduplication (i.e. instead of `dedupe(t)` it uses `sorted(set(t))`).
This should also grant that packages that shuffle the same set of
patch directives maintain the same hash.
@alalazo alalazo added the bug Something isn't working label Oct 4, 2017
@alalazo alalazo requested review from scheibelp and tgamblin October 4, 2017 07:53
@dlukes
Copy link
Copy Markdown
Contributor

dlukes commented Oct 4, 2017

This seems to break mpfr in yet a different way (see #5564 for previous problems) -- applying patches fails. The patches have to be applied in a particular order, as listed in package.py:

    patch('vasprintf.patch', when='@3.1.5')
    patch('strtofr.patch',   when='@3.1.5')

What I think is happening instead is that the patches are reordered based on their hashes (dfd... and 88d... respectively) and applied in this order, which fails, because both patches edit the version number, which means the second one relies on the output of the first one.

@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Oct 4, 2017

Yeah, that's what I wrote in the edit above:

This PR is currently a work in progress because I need to figure out the best way to ensure the right order of application for patches.

You might want to check also this related comment. I would really prefer that we just keep commutative patches, but I see why it could be better sometimes to enforce their application order.

@dlukes
Copy link
Copy Markdown
Contributor

dlukes commented Oct 4, 2017

Yes, sorry, I just meant to provide a specific example, but I should've guessed you would already be on to it (cf. your comment) :)

What do you mean by idempotence in the context of patches? That applying the same patch more than once is a no-op? That would have to be supported on the level of the patch program, which I don't think it is, and even if it were, it doesn't help with the ordering issue, I'm afraid :( What would help would be to check the commutativeness of the patches, but that means parsing the patches, checking the files they affect, the line ranges, perhaps even the specific changes (depending on how hamfisted the commutativeness check is allowed to be) -- and that means additional non-trivial and brittle code to maintain.

A simple fix would be to require that there always be at most one patch per package, but that would probably break existing packages, and the part about injecting patches in your comment sounds like there are other places besides package.py where patches can be defined (?), so that rules it out completely.

I may very well be wrong (obviously, I know much less about Spack in general and this issue in particular), but it seems to me that having a predictable ordering of patches is the easier solution :)

closes spack#5587

This commit restores the original weak ordering of patches application.
The ordering DOES NOT enter the hashing mechanism. It's supposed to be
an hotfix, while we think of a cleaner and more permanent solution.
@alalazo alalazo force-pushed the fixes/patch_in_dependencies branch from e6c8333 to 8ae7be4 Compare October 4, 2017 14:15
@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Oct 4, 2017

What do you mean by idempotence in the context of patches? That applying the same patch more than once is a no-op?

I meant commutative, but I just went out of a very long discussion on C++, hence the lapsus. I fixed it now in the comment, thanks.

Anyhow, I think having commutative patches is feasible in principle, but would require some extra efforts on our side to rework them - for instance aggregating them if they need to be applied in order under the same constraints. Not sure it's worth it.

A simple fix would be to require that there always be at most one patch per package, but that would probably break existing packages, and the part about injecting patches in your comment sounds like there are other places besides package.py where patches can be defined (?), so that rules it out completely.

Well, some places besides the same package.py that is going to get patched. The full story is that in #5476 @tgamblin took the effort of reworking part of the directives so that a package could require to patch its dependencies for its own purposes. It's nasty, but that was to adapt to a pattern we have seen a fair amount of times (e.g. qmcpack, openspeedshop). The idea is that patches belong where they are needed and maintained, not where they are applied.


@dlukes Can you check if the latest version of the PR fixes the issue for you?

@dlukes
Copy link
Copy Markdown
Contributor

dlukes commented Oct 4, 2017

The patch currently doesn't cleanly apply for me, even though git diff develop upstream/develop -- lib/spack/spack/spec.py returns no differences. I'm trying to figure out what's wrong.

Thanks for the additional background on patches!

@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Oct 4, 2017

Thanks for the additional background on patches

No worries. Just in case you don't know it already, you can check-out a PR locally:

$ git fetch upstream pull/5596/head:PR/5596
$ git checkout PR/5596

more info here.

@dlukes
Copy link
Copy Markdown
Contributor

dlukes commented Oct 4, 2017

you can check-out a PR locally

I did not know that, thanks a lot! That's really handy :) I had already ended up adding your repo as a remote, but this is definitely simpler.

And happy to report that 8ae7be4 fixes the issue for me!

@hartzell
Copy link
Copy Markdown
Contributor

hartzell commented Oct 4, 2017

I'm following along, fascinated, but have nothing other than encouragement to add. Thanks for working all this out.

@hartzell
Copy link
Copy Markdown
Contributor

hartzell commented Oct 4, 2017

I was able to build [email protected] with this branch. LGTM (says the guy with tunnel vision about his particular needs...).

mvar = s.variants.setdefault(
'patches', MultiValuedVariant('patches', ())
)
mvar.value = patches
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this no longer needs dedupe?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The statement:

m.value = patches

activates this setter that internally does:

m._value = tuple(sorted(set(patches)))

(note the underscore above) making dedupe unnecessary here.

If I interpret correctly @tgamblin intents, setting the "private" value using dedupe was meant to preserve the order of patches, i.e. the correct application order. What we didn't take into account is that the ordering of patches right now is weak (enforced among common constraints) and patches is populated iterating over a dictionary (that doesn't preserve order).

The nastiness in this PR is in the last commit, where I set an attribute on the instance of MultiValuedVariant used to store patches. That attribute stores the order in which good patches appeared. Anyhow, it's contained in 3 statements and a single module (spec.py)

@scheibelp scheibelp merged commit 5fa1191 into spack:develop Oct 4, 2017
@alalazo alalazo deleted the fixes/patch_in_dependencies branch October 4, 2017 20:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working don't-merge-yet WIP

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants