MNT metadata routing: remove `MethodMapping.from_str()` and sort `caller`, `callee` in `MethodPair()` #28422

StefanieSenger · 2024-02-14T12:38:12Z

What does this implement/fix? Explain your changes.

This PR aims to simplify some things about the development side of metadata routing: how routing and consuming methods are mapped together.

It does not change any functionality, but rather helps to make the code more readable.

MethodMapping.from_str() and its usage are removed from the entire codebase. The alternative, MethodMapping.add(), which already existed alongside it, is now used instead. This ensures consistency and clarity.
caller and callee in MethodPair() are now consistently ordered so that caller comes first. Both were already keyword arguments, but always putting caller first improves code readability.
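
To make the two changes concrete, here is a minimal sketch of the pattern being described. The names `ToyMethodPair` and `ToyMethodMapping` are hypothetical stand-ins written for this illustration; they are not scikit-learn's classes and only mimic the idea of recording caller/callee pairs through a keyword-only `add()`.

```python
from collections import namedtuple

# Hypothetical stand-in: a pair whose fields are ordered caller first.
ToyMethodPair = namedtuple("ToyMethodPair", ["caller", "callee"])


class ToyMethodMapping:
    """Collects (caller, callee) pairs; an illustration only."""

    def __init__(self):
        self._routes = []

    def add(self, *, caller, callee):
        # Keyword-only arguments: each call site names both roles explicitly.
        self._routes.append(ToyMethodPair(caller=caller, callee=callee))
        return self  # returning self allows chained .add() calls

    def __iter__(self):
        return iter(self._routes)


mapping = (
    ToyMethodMapping()
    .add(caller="fit", callee="fit")
    .add(caller="predict", callee="predict_proba")
)
pairs = list(mapping)
```

Because both arguments are keyword-only, writing caller first changes nothing at runtime; it is purely a reading aid, which matches the claim that the PR does not change functionality.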

These should have been two separate PRs really, I recognise. Merging these two things together happened by accident.

github-actions · 2024-02-14T12:39:26Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 35ecc4c. Link to the linter CI: here

adrinjalali · 2024-02-14T18:08:03Z

Looking at the diff, I'm not sure if this is simplifying things. The mapping="score" was quite concise and neat, and now there's no way to do that w/o importing another class.

The CI fails and the prints are still there though.

StefanieSenger · 2024-02-15T10:02:55Z

I have now removed the print statements (oh my) and added the set_config(enable_metadata_routing=False) at the end of the plot_metadata_routing.py file, because that issue popped up here again. I think this should calm the doctest failures in the CI mostly. There might be a remaining one in grid_search.rst, that has this diff:

     GridSearchCV(cv=5,
    -             estimator=CalibratedClassifierCV(...),
    +             estimator=CalibratedClassifierCV(estimator=RandomForestClassifier(n_estimators=10)),

I am not yet sure why the shorter output was expected or how to handle it.

Edit: grid_search.rst did not raise again.

Concerning removing from_str as a whole: yes, mapping="score" is not possible anymore. People need to use MethodMapping in these cases now, too.

I think this is more explicit and easier to understand, since there are not several parallel ways of mapping methods together for the routing, but only one way. It's rather seldom that mapping="score" etc. were used before and applies mainly to testing with mocking classes. Also, meta-estimators that have routing implemented, do import MethodMapping for other parts of the code already.

I believe that having only this one way to do the routing, that is readable even if you don't look into what from_str does, is the simpler way.
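
For readers who have not looked into what the string shortcut does, the following sketch contrasts it with the explicit form. It is based only on how the shortcut is described in this thread (a method name such as "score" maps that method to itself, and "one-to-one" does so for every method). The function names and the `TOY_METHODS` list are assumptions made for this illustration, not scikit-learn code.

```python
# Assumed subset of routable method names, for this sketch only.
TOY_METHODS = ("fit", "predict", "score")


def pairs_from_str(route):
    """Expand a string shortcut into (caller, callee) pairs."""
    if route == "one-to-one":
        # Every known method is routed to the method of the same name.
        return [(name, name) for name in TOY_METHODS]
    if route in TOY_METHODS:
        # A single method name is routed to itself.
        return [(route, route)]
    raise ValueError(f"unknown route: {route!r}")


def pairs_explicit(*pairs):
    """Build the same list by naming caller and callee for each pair."""
    return [(pair["caller"], pair["callee"]) for pair in pairs]


shortcut = pairs_from_str("score")
explicit = pairs_explicit({"caller": "score", "callee": "score"})
```

Both spellings yield the same pair. The explicit one is longer, but the routed methods can be read directly from the call site.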

adrinjalali · 2024-02-15T12:38:16Z

I think this is more explicit and easier to understand, since there are not several parallel ways of mapping methods together for the routing, but only one way. It's rather seldom that mapping="score" etc. were used before and applies mainly to testing with mocking classes. Also, meta-estimators that have routing implemented, do import MethodMapping for other parts of the code already.

I think this is similar to the way people can pass a string as a scorer, or a make_scorer object result. That's normal to me, but I'd need to see what others feel about it.

I think even allowing to pass a list instead of just a string would be okay, and the list results in a one to one mapping between caller and callee.

WDYT @thomasjpfan @glemaitre @OmarManzoor

thomasjpfan

I think even allowing to pass a list instead of just a string would be okay, and the list results in a one to one mapping between caller and callee.

Are you thinking of this API for a list of strings?

.add(method_mapping=["fit", "predict", "score"])

It's not obvious to me that the mapping is "one-to-one" from fit to fit.

Maybe:

.add(method_mapping=MethodMapping().add_one_to_one(["fit", "predict", "score"]))

Looking at the method_mapping="one-to-one" API with fresher eyes, I agree that "one-to-one" is a bit implicit. It's hard to tell from the code itself which methods are being routed.

StefanieSenger · 2024-02-18T07:33:43Z

Thanks for your insight, @thomasjpfan. 🔸
How do you feel about removing or keeping the method_mapping="score" syntax? @adrinjalali and I both agree on removing method_mapping="one-to-one", but not on the former question.

thomasjpfan · 2024-02-18T16:29:06Z

How do you feel about removing or keeping the method_mapping="score" syntax?

I think it's simple enough to infer "one-to-one" when there is only one method. But then there is a significant complexity jump for adding a second mapping. Concretely:

.add(method_mapping="score")

# Adding the second mapping requires learning another object and syntax:
.add(method_mapping=MethodMapping()
            .add(caller="score", callee="score")
            .add(caller="predict", callee="predict"))

For all one-to-one mappings, I am considering:

# Single string for a single mapping:
.add(method_mapping=MethodMapping().add_one_to_one("score"))

# Add a second mapping by using a list:
.add(method_mapping=MethodMapping().add_one_to_one(["score", "predict"]))
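
A rough sketch of how such a helper could behave is shown below. `add_one_to_one` is only a proposal at this point in the thread, so everything here is hypothetical: the class `ToyMethodMapping`, its `routes` attribute, and the choice to accept either a string or a list are assumptions for illustration, not an existing scikit-learn API.

```python
class ToyMethodMapping:
    """Hypothetical mapping with a one-to-one convenience method."""

    def __init__(self):
        self.routes = []

    def add(self, *, caller, callee):
        # The general form: caller and callee may differ.
        self.routes.append((caller, callee))
        return self

    def add_one_to_one(self, methods):
        # Accept a single method name or a list of names.
        if isinstance(methods, str):
            methods = [methods]
        for name in methods:
            # Each method is routed to the method of the same name.
            self.add(caller=name, callee=name)
        return self


single = ToyMethodMapping().add_one_to_one("score").routes
several = ToyMethodMapping().add_one_to_one(["score", "predict"]).routes
```

With this shape, going from one mapping to two only means turning the string into a list, which avoids the complexity jump described above.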

adrinjalali · 2024-02-19T08:27:59Z

I agree we can remove one-to-one from MetadataRouter.add, and have a MethodMapping().add_one_to_one or MethodMapping().one_to_one method.

StefanieSenger · 2024-02-19T09:06:08Z

I'm a bit frustrated with how this discussion goes. It's a PR, not an issue. There is a concrete proposal to discuss and new ideas need to be compared with the proposal.

The goal of my proposal is to simplify the implementation of metadata routing by dropping two of three options for building the mapping for the routing. I had your verbal agreement for removing one-to-one before, @adrinjalali. Now I propose to also remove the other option from the from_str classmethod (passing the name of the one-to-one mapping methods as a string, like method_mapping="score").

This is hardly used (8 times throughout the whole repo) and mainly in test files.
See the number of occurrences in the diff:

sklearn/metrics/_scorer.py, l. 186
sklearn/metrics/tests/test_score_objects.py, l. 1249,1260
sklearn/tests/metadata_routing_common.py, l. 433
sklearn/tests/test_metadata_routing.py, l. 145, 652, 765, 799

In the diff you can also see that I have replaced it with another, already existing method.

Inventing yet another method instead is not in accordance with the intention of this PR (which aims to simplify).

adrinjalali · 2024-02-19T09:16:15Z

We can only have a final agreement on a concrete implementation @StefanieSenger . It's normal that we see the implementation and we realize it doesn't look as good as we thought.

In this particular case, I thought removing one-to-one would be nice, but then we started removing other cases like method_mapping="score". This leads to an alternative proposal to add an explicit method for when all mappings are one-to-one, but only for a subset of the methods.

If you feel frustrated, it's okay to move to another issue / PR and let this rest a bit and you can come back to it with fresh eyes when you feel like it. Or close it and leave it for a later date. But the discussion here is a very normal and expected part of the development cycle.

StefanieSenger · 2024-02-23T12:46:24Z

I'm sorry @adrinjalali and @thomasjpfan.

Now I understand that you were thinking about and discussing how to facilitate third-party development, rather than trying to keep MethodMapping.from_str() in our test files because removing it might entail an additional import. I didn't think about third parties at all and couldn't read this out of the discussion.

I now don't know what I could do or provide to move this forward. Let's see which direction the discussion on #28467 will take.

StefanieSenger · 2024-03-01T11:45:26Z

I'd need some guidance in translating the consensus from issue #28467 into this PR.

Should I incorporate the creation of the new method here and replace from_str with it? Or rather entirely revert what I tried with from_str here?

Or keep the status quo, in which from_str is removed and not replaced by anything? In that case I would not do anything else here and would wait for reviews.

thomasjpfan · 2024-03-01T17:43:39Z

I think we can keep this PR as is (removing from_str). Adding add_one_to_one can be a follow up PR.

glemaitre · 2024-03-05T14:29:58Z

I merged RidgeCV and RidgeClassifierCV. We should merge main in the branch and check that we don't need any change.

Otherwise the PR LGTM.

adrinjalali · 2024-04-10T12:12:21Z

Got a conflict here. Could you fix please @StefanieSenger ? We can merge then.

adrinjalali · 2024-04-30T17:20:36Z

more conflicts here @StefanieSenger

remove MethodMapping.from_str() and sort caller, callee

bf02a9e

StefanieSenger and others added 2 commits February 14, 2024 15:10

Merge branch 'main' into remove_one-to-one

a2ba3dc

undo merge conflict solition

0a05507

repair doctest failurs

3bc0c8c

thomasjpfan reviewed Feb 17, 2024

thomasjpfan mentioned this pull request Feb 19, 2024

RFC Revisiting meta-routing developer API for defining method mappings #28467

Closed

Merge branch 'main' into remove_one-to-one

572fd61

adrinjalali reviewed Mar 4, 2024

sklearn/utils/_metadata_requests.py Outdated

sklearn/utils/_metadata_requests.py Outdated

glemaitre self-requested a review March 4, 2024 21:18

StefanieSenger and others added 2 commits March 5, 2024 13:57

change after review

a01a6e2

Update sklearn/utils/_metadata_requests.py

282e370

Co-authored-by: Adrin Jalali <[email protected]>

glemaitre added the No Changelog Needed label Mar 5, 2024

glemaitre reviewed Mar 5, 2024

sklearn/tests/test_metadata_routing.py Outdated

glemaitre and others added 4 commits March 5, 2024 19:27

Merge branch 'main' into remove_one-to-one

d31c0ec

remove comment

ed54388

Merge branch 'remove_one-to-one' of github.com:StefanieSenger/scikit-…

5d657ea

…learn into remove_one-to-one

Merge branch 'main' into remove_one-to-one

d7aaf8c

Merge branch 'main' into remove_one-to-one

cf03225

Merge branch 'main' into remove_one-to-one

35ecc4c

adrinjalali approved these changes May 2, 2024

adrinjalali merged commit 2bafd7b into scikit-learn:main May 2, 2024

StefanieSenger deleted the remove_one-to-one branch May 3, 2024 09:39
