ENH move estimator type to tags #30122

adrinjalali · 2024-10-21T07:42:21Z

Closes #16469

Moving _estimator_type to tags.

Right now this doesn't work.

github-actions · 2024-10-21T07:43:34Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit 216f442.

adrinjalali · 2024-10-21T14:54:10Z

sklearn/model_selection/tests/test_search.py

    # Test that the best estimator contains the right value for foo_param
    clf = MockClassifier()
-    grid_search = GridSearchCV(clf, {"foo_param": [1, 2, 3]}, cv=3, verbose=3)
+    grid_search = GridSearchCV(clf, {"foo_param": [1, 2, 3]}, cv=2, verbose=3)


The data has only two samples per class, and for classifiers we use stratified CV, which requires cv <= n_samples per class. This wasn't an issue so far because our MockClassifier wasn't declaring that it's a classifier.
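The constraint described above can be reproduced directly with StratifiedKFold. This is a minimal sketch on illustrative toy data (two classes, two samples each), not the data used in test_search.py; the helper name can_split is made up for the example.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Illustrative data (an assumption for this sketch): two classes, two samples each.
X = np.arange(8).reshape(4, 2)
y = np.array([0, 0, 1, 1])


def can_split(n_splits):
    """Return True if stratified splitting into n_splits folds succeeds."""
    try:
        list(StratifiedKFold(n_splits=n_splits).split(X, y))
    except ValueError:
        # Raised when n_splits exceeds the number of members in every class.
        return False
    return True


three_folds_ok = can_split(3)  # more folds than samples per class
two_folds_ok = can_split(2)    # folds == samples per class
```

With two samples per class, only the two-fold split is expected to succeed, which matches the change from cv=3 to cv=2 in the test.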

adrinjalali · 2024-10-21T14:54:48Z

sklearn/model_selection/tests/test_validation.py

 # The number of samples per class needs to be > n_splits,
 # for StratifiedKFold(n_splits=3)
-y2 = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3, 3])
+y2 = np.array([1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3])


Same issue: we now need 5 samples for each class, since the default cv is 5.
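Why the old target only breaks once the estimator is recognised as a classifier can be sketched with check_cv, which picks the splitter from its classifier flag. The two arrays are copied from the diff above; the helper n_folds is hypothetical and exists only for this illustration.

```python
import numpy as np
from sklearn.model_selection import check_cv

# Old and new targets, copied from the diff above.
y_old = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3, 3])
y_new = np.array([1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3])

# With classifier=True and a multiclass target, check_cv returns a
# stratified splitter; with classifier=False it returns a plain KFold.
stratified = check_cv(5, y_old, classifier=True)
plain = check_cv(5, y_old, classifier=False)
splitter_names = (type(stratified).__name__, type(plain).__name__)


def n_folds(cv, y):
    """Return the number of folds produced, or None if splitting fails."""
    X = np.zeros((len(y), 1))
    try:
        return len(list(cv.split(X, y)))
    except ValueError:
        return None


old_result = n_folds(stratified, y_old)
new_result = n_folds(check_cv(5, y_new, classifier=True), y_new)
```

Here old_result is None because no class in y_old has five members, while y_new yields five folds.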

adrinjalali · 2024-10-21T14:56:02Z

sklearn/tests/test_metaestimators.py

+        ) not in sig:
            continue

+        for meta_estimator in _construct_instances(Estimator):


A meta-estimator's type sometimes depends on its sub-estimator, so we should run the test for all of its instances.
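As an illustration of a type that depends on the sub-estimator, here is a small sketch using Pipeline, assuming a scikit-learn version in which a pipeline reports the type of its final step. The step names and the choice of estimators are arbitrary.

```python
from sklearn.base import is_classifier, is_regressor
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Same meta-estimator class, two different final steps.
clf_pipe = Pipeline([("scale", StandardScaler()), ("model", LogisticRegression())])
reg_pipe = Pipeline([("scale", StandardScaler()), ("model", LinearRegression())])

# Each entry is (is_classifier, is_regressor) for that instance.
pipeline_types = {
    "clf_pipe": (is_classifier(clf_pipe), is_regressor(clf_pipe)),
    "reg_pipe": (is_classifier(reg_pipe), is_regressor(reg_pipe)),
}
```

Both objects are instances of the same class, yet they report different types, which is why a test that inspects only one instance per class can miss cases.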

adrinjalali · 2024-10-21T14:58:33Z

cc @Charlie-XIAO @adam2392

adam2392

Great to see this getting consolidated!

sklearn/base.py

sklearn/utils/_tags.py

Co-authored-by: Adam Li <[email protected]>

…into estimator_type

doc/whats_new/upcoming_changes/sklearn.utils/30122.api.rst

sklearn/base.py

glemaitre

It looks good. Once merged, I'll test it in imbalanced-learn. But I think that having the logic in the tag sometimes makes things easier when we want to extend some part.

sklearn/tests/test_metaestimators.py

glemaitre · 2024-10-30T10:53:28Z

sklearn/tests/test_metaestimators.py

            )


+def _get_instance_with_pipeline(meta_estimator, init_params):


Hmm, codecov is complaining about an uncovered line here. I would not expect that, since this should be code that we already had before.

It isn't complaining anymore, I think.

sklearn/tests/test_metaestimators.py

adrinjalali

@adam2392 wanna have another look here? Comments should be resolved now.

adrinjalali · 2024-10-31T12:51:52Z

doc/sphinxext/allow_nan_estimators.py

-                exists = True
-                item += para
-                lst += item
+                if est.__sklearn_tags__().input_tags.allow_nan:


noticed that this section wasn't in the scope of the suppress(SkipTest), which it should be
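The scoping point can be sketched with the standard library alone. Everything below is a hypothetical stand-in (construct, collect, and the dictionary "estimators" are not from the PR); it only shows why code that uses the constructed object belongs inside the suppress(SkipTest) block.

```python
from contextlib import suppress
from unittest import SkipTest


def construct(name):
    """Hypothetical stand-in for building an estimator; may raise SkipTest."""
    if name == "needs_optional_dep":
        raise SkipTest(f"cannot construct {name}")
    return {"name": name, "allow_nan": True}


def collect(names):
    found = []
    for name in names:
        with suppress(SkipTest):
            est = construct(name)
            # Keep this check inside the block: if construction was skipped,
            # `est` would be unbound or left over from a previous iteration.
            if est["allow_nan"]:
                found.append(est["name"])
    return found


collected = collect(["a", "needs_optional_dep", "b"])
```

The skipped entry is silently dropped and the loop continues with the next name.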

adrinjalali · 2024-10-31T12:52:38Z

sklearn/tests/test_docstring_parameters.py

+    elif Estimator.__name__ == "FrozenEstimator":
+        X, y = make_classification(n_samples=20, n_features=5, random_state=0)
+        est = Estimator(LogisticRegression().fit(X, y))


noticed we didn't have this while checking our usages of _construct_instance

adam2392

Sorry, a few more questions after reading through the code (I wanna do a thorough job :p). These are mostly small comments. Once they're addressed (assuming they aren't massive changes), I can approve and merge.

adam2392 · 2024-11-01T00:18:28Z

sklearn/base.py

+        return Tags(
+            estimator_type=None,
+            target_tags=TargetTags(required=False),
+            transformer_tags=None,
+            regressor_tags=None,
+            classifier_tags=None,
+        )


Why aren't these the default_tags anymore? Since it's BaseEstimator, I assumed that would be "default".

Because we need to remove default_tags once we're done with the deprecation period. default_tags depends on detecting the type of estimator from the class rather than from the instance, which is what we're removing in this PR.
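The difference between class-level and instance-level detection can be sketched in plain Python. All names here (SketchTags, SketchWrapper, the tags method, and the two lookup helpers) are hypothetical simplifications, not scikit-learn's actual classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchTags:
    """Hypothetical, stripped-down stand-in for a tags object."""
    estimator_type: Optional[str] = None


class SketchClassifier:
    def tags(self):
        return SketchTags(estimator_type="classifier")


class SketchRegressor:
    def tags(self):
        return SketchTags(estimator_type="regressor")


class SketchWrapper:
    """Hypothetical meta-estimator whose type follows the wrapped instance."""

    def __init__(self, estimator):
        self.estimator = estimator

    def tags(self):
        return SketchTags(estimator_type=self.estimator.tags().estimator_type)


def type_from_class(cls):
    # Class-level lookup: sees only attributes defined on the class itself.
    return getattr(cls, "_estimator_type", None)


def type_from_instance(est):
    # Instance-level lookup: can take the wrapped estimator into account.
    return est.tags().estimator_type


class_level = type_from_class(SketchWrapper)
instance_level = [
    type_from_instance(SketchWrapper(SketchClassifier())),
    type_from_instance(SketchWrapper(SketchRegressor())),
]
```

The class-level lookup has nothing to report for the wrapper, while the instance-level lookup distinguishes the two wrapped estimators.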

adam2392 · 2024-11-01T00:54:39Z

sklearn/utils/_tags.py

+    est_is_classifier = getattr(estimator, "_estimator_type", None) == "classifier"
+    est_is_regressor = getattr(estimator, "_estimator_type", None) == "regressor"
+    target_required = est_is_classifier or est_is_regressor


Once _estimator_type is removed, what will we do here?

We'll be removing default_tags altogether.

adrinjalali · 2024-11-05T08:44:37Z

sklearn/base.py

+        return Tags(
+            estimator_type=None,
+            target_tags=TargetTags(required=False),
+            transformer_tags=None,
+            regressor_tags=None,
+            classifier_tags=None,
+        )


Because we need to remove default_tags once we're done with the deprecation period. default_tags depends on detecting the type of estimator from the class rather than from the instance, which is what we're removing in this PR.

adrinjalali · 2024-11-05T08:44:56Z

sklearn/utils/_tags.py

+    est_is_classifier = getattr(estimator, "_estimator_type", None) == "classifier"
+    est_is_regressor = getattr(estimator, "_estimator_type", None) == "regressor"
+    target_required = est_is_classifier or est_is_regressor


We'll be removing default_tags altogether.

adrinjalali · 2024-11-05T08:45:09Z

sklearn/utils/_tags.py

    input_tags: InputTags = field(default_factory=InputTags)


+# TODO(1.8): Remove this function


@adam2392 note this comment 😁

Ah I see 😅

adam2392

LGTM. thanks @adrinjalali

glemaitre · 2024-11-05T19:17:22Z

I opened #30227 to fix a bug that we did not catch because we did not build the full documentation.

I added an entry in the changelog because things could have gone wrong even before with the wrong ordering of the mixin.
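One way mixin ordering can matter when tags are built through cooperative super() calls is sketched below. This is an illustration under assumptions, with hypothetical classes (SketchBase, SketchClassifierMixin); it is not a description of what #30227 changes.

```python
class SketchBase:
    """Hypothetical base class: end of the chain, does not call super()."""

    def tags(self):
        return {"estimator_type": None}


class SketchClassifierMixin:
    """Hypothetical mixin that refines the tags returned further up the MRO."""

    def tags(self):
        tags = super().tags()
        tags["estimator_type"] = "classifier"
        return tags


class MixinFirst(SketchClassifierMixin, SketchBase):
    # MRO: MixinFirst -> SketchClassifierMixin -> SketchBase
    pass


class BaseFirst(SketchBase, SketchClassifierMixin):
    # MRO: BaseFirst -> SketchBase -> SketchClassifierMixin
    # SketchBase.tags() answers first and never reaches the mixin.
    pass


mixin_first_type = MixinFirst().tags()["estimator_type"]
base_first_type = BaseFirst().tags()["estimator_type"]
```

With the mixin listed first the type is set; with the base class listed first the mixin's override is never reached and the type stays unset.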

ENH move estimator type to tags

d3cc2a5

github-actions bot added module:feature_selection module:utils labels Oct 21, 2024

adrinjalali added 3 commits October 21, 2024 16:43

tests pass

b3109a1

Merge remote-tracking branch 'upstream/main' into estimator_type

02c0aa5

changelog

c0e519e

adrinjalali commented Oct 21, 2024

...

4d8a860

adrinjalali marked this pull request as ready for review October 21, 2024 14:57

adrinjalali added the Developer API Third party developer API related label Oct 21, 2024

adrinjalali added this to the 1.6 milestone Oct 21, 2024

adam2392 reviewed Oct 21, 2024

adrinjalali and others added 5 commits October 21, 2024 17:25

Apply suggestions from code review

63fd3bb

Co-authored-by: Adam Li <[email protected]>

test deprecation

fc2fd3d

Merge branch 'estimator_type' of github.com:adrinjalali/scikit-learn …

d04de03

…into estimator_type

fix condition

1386e72

fix set

4d4201d

glemaitre reviewed Oct 30, 2024

doc/whats_new/upcoming_changes/sklearn.utils/30122.api.rst (outdated, resolved)

glemaitre reviewed Oct 30, 2024

sklearn/base.py (resolved)

adrinjalali added 2 commits October 30, 2024 09:42

Merge remote-tracking branch 'upstream/main' into estimator_type

983d7a9

Guillaume's comments

a31e9fc

glemaitre approved these changes Oct 30, 2024

sklearn/tests/test_metaestimators.py (outdated, resolved)

use method instead of operator

0e41007

glemaitre reviewed Oct 30, 2024

sklearn/tests/test_metaestimators.py (outdated, resolved)

adrinjalali added 2 commits October 30, 2024 13:02

Guillaume doesn't like set operations

ab1cd7a

fix skiptest issue

7060fd2

adrinjalali commented Oct 31, 2024

adam2392 self-requested a review October 31, 2024 22:03

adam2392 reviewed Nov 1, 2024

adrinjalali commented Nov 5, 2024

adrinjalali added 2 commits November 5, 2024 09:45

Merge remote-tracking branch 'upstream/main' into estimator_type

8a23544

improve docs accordintly

216f442

adam2392 approved these changes Nov 5, 2024

adam2392 merged commit 613cff9 into scikit-learn:main Nov 5, 2024
30 checks passed

adrinjalali deleted the estimator_type branch November 5, 2024 12:01

This was referenced Nov 5, 2024

MNT Deprecates _estimator_type and replaces by a estimator tag #17806

Closed

FIX handle empty steps in Pipeline #30203

Merged

larsoner mentioned this pull request Nov 7, 2024

BUG: Test collection for Transformer fails #30237

Closed

This was referenced Nov 12, 2024

[ci] [python-package] more errors from scikit-learn rearranging estimator tags microsoft/LightGBM#6717

Closed

[python-package][R-package] adapt to scikit-learn 1.6 testing changes, pin more packages in R 3.6 CI jobs microsoft/LightGBM#6718

Merged

This was referenced Dec 11, 2024

[Bug]: Regression and classification learner checks for breaking changes in scikit-learn DoubleML/doubleml-for-py#278

Closed

Incompability between scikit-learn and xgboost dmlc/xgboost#11093

Closed

This was referenced Oct 6, 2025

Attribute error from sklearn.base.is_regressor due to missing __sklearn_tags__ on nightly builds #32394

Open

Align docstring of is_classifier and is_clusterer #32415

Merged
