FIX Fix error when using Calibrated with Voting #20087
bc43de1 to
502b8a7
Compare
thomasjpfan
left a comment
Thank you for working on this issue @Clement-F !
sklearn/calibration.py
Outdated
elif method_name == 'predict_proba' or method_name == '_predict_proba':
    # The `_predict_proba` option is needed for `VotingClassifier`
I do not think special casing VotingClassifier is great. I think it would be better to get _get_prediction_method to return a tuple (callable, method_name) and then use the method_name here.
(_get_prediction_method is always called before _compute_predictions)
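To make the suggestion concrete, here is a minimal sketch of the idea using only toy code. `ToySoftVoter`, `get_prediction_method` and `compute_predictions` are hypothetical stand-ins, not the actual scikit-learn helpers, and keeping only the positive-class column is an assumption made purely for illustration. The point is that the name is decided where the method is looked up, so the consumer never has to inspect `__name__`.

```python
class ToySoftVoter:
    """Toy stand-in for an estimator that exposes predict_proba via a property."""

    def _predict_proba(self, X):
        # Placeholder probabilities for a binary problem.
        return [[0.25, 0.75] for _ in X]

    @property
    def predict_proba(self):
        # The attribute returns another bound method, whose __name__ is
        # "_predict_proba" rather than "predict_proba".
        return self._predict_proba


def get_prediction_method(clf):
    """Return a (callable, method_name) pair for a classifier-like object."""
    if hasattr(clf, "decision_function"):
        return clf.decision_function, "decision_function"
    if hasattr(clf, "predict_proba"):
        return clf.predict_proba, "predict_proba"
    raise RuntimeError("clf has neither decision_function nor predict_proba")


def compute_predictions(pred_method, method_name, X):
    """Dispatch on the explicit method_name instead of pred_method.__name__."""
    predictions = pred_method(X)
    if method_name == "predict_proba":
        # Illustrative assumption: keep the positive-class column only.
        return [row[1] for row in predictions]
    return predictions


method, name = get_prediction_method(ToySoftVoter())
result = compute_predictions(method, name, [0, 1, 2])
```

With this shape, no branch needs to know about `_predict_proba` or about any particular estimator class.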
I agree. I have updated the code.
502b8a7 to
34322ee
Compare
ogrisel
left a comment
LGTM, thanks for the fix. Just a few minor suggestions:
34322ee to
8eef25f
Compare
thomasjpfan
left a comment
LGTM
Reference Issues/PRs
Fixes #20053
What does this implement/fix? Explain your changes.
PR #17856 changed how `CalibratedClassifierCV` works internally. Because of the unusual implementation of `VotingClassifier.predict_proba`, that change broke compatibility when a `VotingClassifier` is used as the `base_estimator` of `CalibratedClassifierCV`.
This is a simple fix to restore compatibility.
Any other comments?
The real issue is the way `VotingClassifier.predict_proba` is implemented. However, it seems to me that it can't be resolved without breaking changes.
The issue is that `predict_proba` is not a method but an attribute which holds another method. The goal of this trick was to get polymorphism and expose `predict_proba` only when `voting="soft"`, using a getter.
I think `VotingClassifier` should be an abstract class and we should implement `SoftVotingClassifier` and `HardVotingClassifier`, since they don't implement the same methods. It is however a big API change.
It would be simpler to only raise an error if `voting="hard"`, but if I understand correctly it is assumed that if `predict_proba` exists then it must work. It would therefore lead to other incompatibilities.
I am not familiar enough with the code base to know how this is usually dealt with. Anyway, I think it needs further discussion.
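For readers unfamiliar with the getter trick described above, here is a rough illustration. `ToyVoting` is a hypothetical toy class written for this comment, not scikit-learn's `VotingClassifier`, and the returned probabilities are placeholders. It shows the two consequences discussed: `hasattr` reports `predict_proba` only for soft voting, and the callable obtained through the attribute carries the name `_predict_proba`.

```python
class ToyVoting:
    """Toy stand-in showing a predict_proba attribute guarded by a getter."""

    def __init__(self, voting="soft"):
        self.voting = voting

    def _predict_proba(self, X):
        # Placeholder: uniform probabilities over two classes.
        return [[0.5, 0.5] for _ in X]

    @property
    def predict_proba(self):
        # Raising AttributeError inside the getter makes hasattr() return
        # False, so the method appears to be absent for hard voting.
        if self.voting == "hard":
            raise AttributeError("predict_proba is not available when voting='hard'")
        return self._predict_proba


soft = ToyVoting("soft")
hard = ToyVoting("hard")

has_soft = hasattr(soft, "predict_proba")
has_hard = hasattr(hard, "predict_proba")

# The bound method returned by the property keeps the private name.
method_name = soft.predict_proba.__name__
```

Any code that branches on `method.__name__ == "predict_proba"` would take the wrong path for such an object, which is the kind of incompatibility this PR addresses.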