[MRG+2] Adds helpful messages in all error assertions in estimator_checks#9588
[MRG+2] Adds helpful messages in all error assertions in estimator_checks#9588lesteve merged 18 commits intoscikit-learn:masterfrom
Conversation
2b05c74 to
7046d97
Compare
|
Kindly review this Pull Request. Need reviews on how the nice error messages should look. |
jnothman
left a comment
There was a problem hiding this comment.
Your error messages should be telling the person who wrote some new estimator what the estimator should be doing but did not.
sklearn/utils/estimator_checks.py
Outdated
| transformer = clone(transformer) | ||
| assert_raises((AttributeError, ValueError), transformer.transform, X) | ||
| with assert_raises((AttributeError, ValueError), | ||
| msg="Transformers unfitted"): |
There was a problem hiding this comment.
Message should be more like: "The unfitted transformer {name} does not raise an error when transform is called. Perhaps use check_is_fitted"
|
Why do we have two |
|
I am not sure about why same statement is used twice ( |
|
The duplicated assertion is probably an error. Is it possible one of those
should have been a method other than decision_function?
On 27 Aug 2017 10:30 pm, "Kumar Ashutosh" <[email protected]> wrote:
I am not sure about why same statement is used twice (assert_raises(ValueError,
classifier.decision_function, X.T) is stated twice). I guess the rest is
fine. WDYT?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#9588 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz6xyYoRIgiAS-6Mbwl4Rc68urDaqlks5scWFygaJpZM4O8vdR>
.
|
jnothman
left a comment
There was a problem hiding this comment.
This is a great start, thanks, but I think we can afford to be more specific and hold the hands of developers.
sklearn/utils/estimator_checks.py
Outdated
| if hasattr(X, 'T'): | ||
| # If it's not an array, it does not have a 'T' property | ||
| assert_raises(ValueError, transformer.transform, X.T) | ||
| with assert_raises(ValueError, msg="The transformer {name} does " |
There was a problem hiding this comment.
By malformed here do we mean "has a different number of features to when fitting"?
sklearn/utils/estimator_checks.py
Outdated
| # The precise message can change depending on whether X or y is | ||
| # validated first. Let us test the type of exception only: | ||
| assert_raises(ValueError, e.fit, X_zero_samples, []) | ||
| with assert_raises(ValueError, msg="The estimators {name} does not" |
sklearn/utils/estimator_checks.py
Outdated
|
|
||
| assert_raises(ValueError, estimator.partial_fit, X[:, :-1], y) | ||
| with assert_raises(ValueError, | ||
| msg="The estimators {name} does not raise an" |
| # raises error on malformed input for fit | ||
| assert_raises(ValueError, classifier.fit, X, y[:-1]) | ||
| with assert_raises(ValueError, msg="The classifers {name} does not" | ||
| " raise an error when incorrect/malformed input " |
There was a problem hiding this comment.
Be more specific about the malformation
sklearn/utils/estimator_checks.py
Outdated
| set_random_state(classifier) | ||
| # raises error on malformed input for fit | ||
| assert_raises(ValueError, classifier.fit, X, y[:-1]) | ||
| with assert_raises(ValueError, msg="The classifers {name} does not" |
sklearn/utils/estimator_checks.py
Outdated
| # raises error on malformed input for predict | ||
| assert_raises(ValueError, classifier.predict, X.T) | ||
| with assert_raises(ValueError, msg="The classifers {name} does not" | ||
| " raise an error when incorrect/malformed input " |
There was a problem hiding this comment.
Be more specific about the malformation
sklearn/utils/estimator_checks.py
Outdated
| # raises error on malformed input | ||
| assert_raises(ValueError, | ||
| classifier.decision_function, X.T) | ||
| with assert_raises(ValueError, msg="Malformed inputs"): |
|
Addded the suggested changes. |
|
Changes done. @jnothman Thanks a lot!! :) |
sklearn/utils/estimator_checks.py
Outdated
| if hasattr(X, 'T'): | ||
| # If it's not an array, it does not have a 'T' property | ||
| assert_raises(ValueError, transformer.transform, X.T) | ||
| with assert_raises(ValueError, msg="The classifer {name} does not " |
There was a problem hiding this comment.
But should be "transformer" right? "The transformer {} does not raise an error when the number of features in transform is different from the number of features in fit"?
sklearn/utils/estimator_checks.py
Outdated
| msg="The estimator {name} does not raise an" | ||
| " error when number of features changes " | ||
| "between calls to partial_fit. Perhaps" | ||
| " use check_X_y"): |
There was a problem hiding this comment.
How does check_X_y relate to this?
sklearn/utils/estimator_checks.py
Outdated
| " raise an error when incorrect/malformed input " | ||
| "data for fit is passed. Number of training exam" | ||
| "ples is not the same as the number of " | ||
| "labels. Perhapse use check_array"): |
sklearn/utils/estimator_checks.py
Outdated
| " raise an error when incorrect/malformed input " | ||
| " for predict is passed. Number of features in p" | ||
| "redict dataset does not match the number of fea" | ||
| "tures in fit dataset. Perhaps use check_array"): |
There was a problem hiding this comment.
I don't think check_array helps here.
sklearn/utils/estimator_checks.py
Outdated
| " for predict is passed. Number of features in " | ||
| "predict dataset does not match the number of f" | ||
| "eatures in fit dataset." | ||
| " Perhaps use check_array"): |
There was a problem hiding this comment.
I don't think check_array helps here.
There was a problem hiding this comment.
Would it be fine if I only replace this statement with the one you mentioned "The transformer {} does not raise an error when the number of features in transform is different from the number of features in fit" ?
There was a problem hiding this comment.
sure (though you want to keep the name).
sklearn/utils/estimator_checks.py
Outdated
| classifier.decision_function, X.T) | ||
| with assert_raises(ValueError, msg="The classifer {name} does " | ||
| "not raise an error when incorrect/malforme" | ||
| "d input for decision_function is passed. N" |
There was a problem hiding this comment.
I would really try to make this a bit shorter and maybe make it one sentence. The context of the second sentence is slightly confusing to me. Again, I don't think check_array helps here.
There was a problem hiding this comment.
" The classifier {} does not raise an error when number of features is inconsistent "
How about this? I am not much familiar with error messages, so sorry for this :(
sklearn/utils/estimator_checks.py
Outdated
| " raise an error when incorrect/malformed inpu" | ||
| "t for predict_proba is passed. Number of fea" | ||
| "tures in predict dataset does not match the n" | ||
| "umber of features in fit dataset. " |
sklearn/utils/estimator_checks.py
Outdated
| " raise an error when incorrect/malformed input " | ||
| "data for fit is passed. Number of training exam" | ||
| "ples is not the same as the number of " | ||
| "labels. Perhapse use check_array"): |
|
@amueller Changes added. Kindly suggest if any more change is required. |
sklearn/utils/estimator_checks.py
Outdated
| with assert_raises(ValueError, | ||
| msg="The estimator {name} does not raise an" | ||
| " error when number of features changes " | ||
| "between calls to partial_fit. Perhaps"): |
sklearn/utils/estimator_checks.py
Outdated
| with assert_raises(ValueError, msg="The classifier {name} does not" | ||
| " raise an error when the number of features " | ||
| "in predict is different from the number of" | ||
| " features in fit"): |
There was a problem hiding this comment.
all sentences should end with full stop
|
@amueller Also, we have redundant assert_raises statement like: Should I remove the extra same statement? I don't think they serve any special purpose. |
|
yes. they were not redundant in master, right? |
|
Yes, they were. Here's the code copied from master |
|
@amueller Hope this works!! |
|
looks good :) |
|
+1 from me but maybe @jnothman wants to have another look. |
sklearn/utils/estimator_checks.py
Outdated
| transformer = clone(transformer) | ||
| assert_raises((AttributeError, ValueError), transformer.transform, X) | ||
| with assert_raises((AttributeError, ValueError), msg="The unfitted " | ||
| "transformer {name} does not raise an error when " |
There was a problem hiding this comment.
Perhaps I missed it. Does this {name} thing work??? Surely we need to format the string...
|
@jnothman Changes added. Kindly review. Hope this works!! |
sklearn/utils/estimator_checks.py
Outdated
| transformer = clone(transformer) | ||
| assert_raises((AttributeError, ValueError), transformer.transform, X) | ||
| with assert_raises((AttributeError, ValueError), msg="The unfitted " | ||
| "transformer {} does not raise an error when tra" |
There was a problem hiding this comment.
Please please please, why would you break the line in the middle of words 🙃 ? Can you fix that everywhere?
There was a problem hiding this comment.
In order to minimise the number of lines.. sorry :( .. will be done in 5 minutes.
There was a problem hiding this comment.
@lesteve Added the changes. Kindly review.
There was a problem hiding this comment.
Well that was rather creative but there is nothing to be sorry about ;-).
lesteve
left a comment
There was a problem hiding this comment.
Some minor comments with improvements to the error messages.
sklearn/utils/estimator_checks.py
Outdated
| with assert_raises((AttributeError, ValueError), msg="The unfitted " | ||
| "transformer {} does not raise an error when " | ||
| "transform is called. Perhaps use " | ||
| "check_is_fitted.".format(name)): |
There was a problem hiding this comment.
I would be more explicit: "use check_is_fitted in transform"
sklearn/utils/estimator_checks.py
Outdated
| with assert_raises(ValueError, msg="The estimator {} does not" | ||
| " raise an error when an empty data is used " | ||
| "to train. Perhaps use " | ||
| "check_array.".format(name)): |
sklearn/utils/estimator_checks.py
Outdated
| assert_raises(ValueError, estimator.partial_fit, X[:, :-1], y) | ||
| with assert_raises(ValueError, | ||
| msg="The estimator {} does not raise an" | ||
| " error when number of features changes " |
sklearn/utils/estimator_checks.py
Outdated
| assert_raises(ValueError, classifier.fit, X, y[:-1]) | ||
| with assert_raises(ValueError, msg="The classifer {} does not" | ||
| " raise an error when incorrect/malformed input " | ||
| "data for fit is passed. Number of training " |
There was a problem hiding this comment.
The number of training examples
sklearn/utils/estimator_checks.py
Outdated
| " raise an error when incorrect/malformed input " | ||
| "data for fit is passed. Number of training " | ||
| "examples is not the same as the number of " | ||
| "labels. Perhapse use check_X_y.".format(name)): |
There was a problem hiding this comment.
Perhaps (without the e at the end) use check_X_y in fit.
sklearn/utils/estimator_checks.py
Outdated
| assert_raises(ValueError, regressor.fit, X, y[:-1]) | ||
| with assert_raises(ValueError, msg="The classifer {} does not" | ||
| " raise an error when incorrect/malformed input " | ||
| "data for fit is passed. Number of training " |
sklearn/utils/estimator_checks.py
Outdated
| " raise an error when incorrect/malformed input " | ||
| "data for fit is passed. Number of training " | ||
| "examples is not the same as the number of " | ||
| "labels. Perhapse use check_X_y.".format(name)): |
There was a problem hiding this comment.
Typo in Perhaps
use check_X_y in fit
|
@lesteve changes added. |
|
Thanks a lot, merging! |
|
thanks! |
|
You wanted me to have a look between 2:30am and 6:30am?? :P
All good! Thanks, @thechargedneutron
…On 30 August 2017 at 07:04, Kumar Ashutosh ***@***.***> wrote:
Thanks a lot @lesteve <https://github.com/lesteve> @amueller
<https://github.com/amueller> . Feels good to contribute with such a
great mentor support. :)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#9588 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz69wD6EBUC53v3DaP-uCOStIKXFRpks5sdHzugaJpZM4O8vdR>
.
|
Reference Issue
Fixes #9572
What does this implement/fix? Explain your changes.
Added a context manager for each occurrence of
assert_raises.The
msgarguments have not been changed which returns nice error messages.