[MRG] Add few more tests + Documentation for re-entrant cross-validation estimators #7823
Conversation
Maybe it's easier just to document under …
jnothman left a comment
Could you explain the issue with assert_equal a little more? Perhaps it needs a comment.
sklearn/model_selection/_search.py
Outdated
+------------+-----------+------------+-----------------+---+---------+
|param_kernel|param_gamma|param_degree|split0_test_score|...|rank_....|
|param_kernel|param_gamma|param_degree|split0_test_score|...| rank... |
Not better as rank_t...?
sklearn/model_selection/_split.py
Outdated
Note
----
I'd prefer no blank line here...
sklearn/model_selection/_split.py
Outdated
----
Multiple calls to the ``split`` method will not return identical
training or testing sets unless ``random_state`` is set to an integer
No such param
TimeSeriesSplit has a random_state...
sklearn/model_selection/_split.py
Outdated
----
Multiple calls to the ``split`` method will not return identical
training or testing sets unless ``random_state`` is set to an integer
only if shuffle=True
@jnothman I have addressed your comments... Another look, please?
    np.testing.assert_equal(
        np.array(list(kf_iter_wrapped.split(X, y))),
        np.array(list(kf_randomized_iter_wrapped.split(X, y))))
except AssertionError:
What's the point?
Ah sorry you asked this before and I failed to give a response!
So the nested lists comparison raises a deprecation warning...
>>> assert_true(np.array(list(kfold.split(X, y))) != np.array(list(kfold.split(X, y))))
DeprecationWarning: elementwise != comparison failed; this will raise an error in the future.

np.testing.assert_equal, on the other hand, handles lists of np.ndarray gracefully...
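For reference, a minimal sketch of the graceful path (assuming a recent NumPy and scikit-learn; the names are illustrative, and the deprecation warning above came from converting the ragged nested lists with np.array, which newer NumPy rejects outright):

```python
import numpy as np
from sklearn.model_selection import KFold

X, y = np.ones((10, 2)), np.arange(10) % 2
kf = KFold(n_splits=5, shuffle=True, random_state=0)

splits_a = list(kf.split(X, y))  # list of (train_idx, test_idx) tuples
splits_b = list(kf.split(X, y))  # identical, since random_state is an integer

# np.testing.assert_equal recurses into nested lists/tuples of arrays,
# so no ragged np.array conversion (and no elementwise comparison) is needed.
np.testing.assert_equal(splits_a, splits_b)
```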
Argh wait I forgot the else clause. Sorry for that...
                  cv=KFold(n_splits=n_splits))
gs2.fit(X, y)

# Give generator as a cv parameter
To be sure, we've not got anything that ensures this will remain a generator. Either use a generator expression or test for its type.
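A sketch of the two options described here (GeneratorType comes from the stdlib types module; the data is illustrative):

```python
from types import GeneratorType

import numpy as np
from sklearn.model_selection import KFold

X, y = np.ones((10, 2)), np.arange(10) % 2

# Option 1: assert that KFold.split currently returns a generator.
assert isinstance(KFold(n_splits=5).split(X, y), GeneratorType)

# Option 2: wrap the splits in a generator expression, which stays a
# generator regardless of how KFold.split is implemented internally.
cv_gen = (split for split in KFold(n_splits=5).split(X, y))
assert isinstance(cv_gen, GeneratorType)
```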
Done!
     list(kf_randomized_iter_wrapped.split(X, y)))
assert_true(np.any(np.array(list(kf_iter_wrapped.split(X, y))) !=
                   np.array(list(kf_randomized_iter_wrapped.split(X, y)))))
np.testing.assert_array_equal(
Still don't understand why you've resorted to np.testing.assert_array_equal.
np.testing.assert_array_equal handles nested lists, unlike sklearn.testing.assert_array_equal...
Well, I think this should at least be commented on somewhere in the file, if not imported into sklearn.utils.testing and given a different name, or the tests rewritten to do this comparison of nested lists explicitly.
     list(kf_randomized_iter_wrapped.split(X, y)))
assert_true(np.any(np.array(list(kf_iter_wrapped.split(X, y))) !=
                   np.array(list(kf_randomized_iter_wrapped.split(X, y)))))
# numpy's assert_array_equal properly compares nested lists
@jnothman Would this suffice? Or would you prefer having it imported into sklearn.utils.testing instead?
I suppose this comment is okay. Ideally I think we want all our asserts to come from one place and have clear naming for where they should be applied.
amueller left a comment
We say that the splits are different with random_state=None, but that's not tested anywhere, is it?
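A sketch of what such a test could look like (strictly speaking it is probabilistic, since two independent shuffles could in principle coincide):

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.ones((100, 2))
kf = KFold(n_splits=5, shuffle=True, random_state=None)

# Flatten the test indices of all folds, in fold order, for two calls.
order_a = np.concatenate([test for _, test in kf.split(X)])
order_b = np.concatenate([test for _, test in kf.split(X)])

# With random_state=None each call reshuffles, so the orderings should
# differ (with overwhelming probability for 100 samples).
assert np.any(order_a != order_b)
```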
sklearn/model_selection/_split.py
Outdated
Note
----
Multiple calls to the ``split`` method will not return identical
training or testing sets if ``random_state`` parameter exists and is
Maybe "if splitting is randomized and the random_state parameter is not set to an integer"? I had to check where this was and what it means for a parameter to exist. Also, is that true? This class can not know how the subclasses treat the random_state.
So maybe rather "this might not be deterministic" which is not very explicit. Maybe describe this in the class docstring for each class and link there?
Maybe it's my lack of coffee, but why does split use _iter_test_masks? Currently we create indices, transform them to booleans, and then transform them back to indices. Is it for making the negation easy?
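For context, a toy paraphrase of the indices -> masks -> indices round trip (LeaveOneOutSketch is an illustrative stand-in, not the real class); the boolean mask makes complementing for the training set a one-liner, which is presumably the motivation:

```python
import numpy as np

class LeaveOneOutSketch:
    """Toy splitter paraphrasing the base class: indices -> masks -> indices."""

    def _iter_test_masks(self, X):
        for i in range(len(X)):
            mask = np.zeros(len(X), dtype=bool)
            mask[i] = True
            yield mask

    def split(self, X):
        indices = np.arange(len(X))
        for test_mask in self._iter_test_masks(X):
            # Negating a boolean mask is trivial; negating an index array is not.
            yield indices[np.logical_not(test_mask)], indices[test_mask]

for train, test in LeaveOneOutSketch().split(np.zeros(3)):
    print(train, test)  # [1 2] [0], then [0 2] [1], then [0 1] [2]
```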
Sorry for coming to this very late. What do you suggest as the right thing to do?
"This *may not* be deterministic if `random_state`, if available, is not explicitly set while initializing the class". Sounds okay to you?
How about "Randomized CV splitters may return different results for each call of split. This can be avoided (and identical results returned for each split) by setting random_state to an integer."
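Concretely, something like the following (data and choice of splitter are illustrative):

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit

X = np.ones((10, 2))

# An integer random_state makes every call of split return the same folds.
ss = ShuffleSplit(n_splits=3, test_size=0.3, random_state=42)
np.testing.assert_equal(list(ss.split(X)), list(ss.split(X)))

# With random_state=None (the default), consecutive calls to split
# will almost surely disagree.
```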
I think this belongs in the narrative docs too.
sklearn/model_selection/_split.py
Outdated
Note
----
Multiple calls to the ``split`` method will not return identical
training or testing sets unless ``random_state`` is set to an integer
I'm not sure what "exists" means here. This class has a random_state parameter.
Btw, the if shuffle check in the _make_test_fold method seems unnecessary and makes the code harder to follow, imho.
Same comment as above
I am really unable to recollect why I did that :@ :@ :@ Arghh. I guess it was done so that different split calls would produce the same split? But I guess that was vetoed (see #7935).
I'll revert for now. I can reintroduce it when someone complains.
Well, personally, I think having multiple split calls produce the same split is a better design, but not one that we currently implement.
    return cv_results


# Check if generators as supported as cv and that the splits are consistent
np.testing.assert_equal(_pop_time_keys(gs3.cv_results_),
Wouldn't it be better to somehow test that the same samples have been used? We could have a model that stores the training data and a scorer that produces a hash of the training data in the model and the test data passed to score. Or is that too hacky for testing?
Yeah, that can be done. Maybe I'm being lazy, but I feel this is sufficient? I can't see a case where this would pass if different samples were used. The score is quite sensitive to sample order, no?
(Especially given that the dataset is make_classification and the estimator is LinearSVC, not one of our toy estimators which would return 1 / 0 as the score.)
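For the record, a rough sketch of the mock described above (HashingEstimator and hash_scorer are hypothetical names, not scikit-learn API):

```python
import hashlib

import numpy as np
from sklearn.base import BaseEstimator

class HashingEstimator(BaseEstimator):
    """Hypothetical mock that remembers a fingerprint of its training data."""

    def fit(self, X, y=None):
        self.train_hash_ = hashlib.md5(np.ascontiguousarray(X)).hexdigest()
        return self

def hash_scorer(estimator, X, y=None):
    """Hypothetical scorer: an int identifying the (train, test) sample pair.

    Two CV runs get the same 'score' for a split only if the same samples
    were used for both training and testing in that split.
    """
    test_hash = hashlib.md5(np.ascontiguousarray(X)).hexdigest()
    combined = (estimator.train_hash_ + test_hash).encode()
    return int(hashlib.md5(combined).hexdigest()[:8], 16)
```

Comparing the per-split scores of two searches built this way would assert sample identity directly, rather than relying on score sensitivity.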
        cv_results.pop(key)
    return cv_results


# Check if generators as supported as cv and that the splits are consistent
are supported
89593cd to 91963f4
jnothman left a comment
Thanks
                    cv_results['mean_train_score'][1])
try:
    assert_almost_equal(cv_results['mean_test_score'][1],
                        cv_results['mean_test_score'][2])
pytest: assert cv_results['mean_test_score'][1] != approx(cv_results['mean_test_score'][2])
:|
How about just: assert_false(np.allclose(cv_results['mean_test_score'][1], cv_results['mean_test_score'][2]))
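The two suggested spellings side by side, with placeholder scores:

```python
import numpy as np
from pytest import approx

score_1, score_2 = 0.83, 0.79  # placeholder values

# pytest style: approx gives a tolerant (in)equality comparison.
assert score_1 != approx(score_2)

# plain numpy style, no pytest dependency:
assert not np.allclose(score_1, score_2)
```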
                              shuffle=True, random_state=0).split(X, y),
                GeneratorType))


# Give generator as a cv parameter
If you really want to test that it's a generator, you should confirm (or ensure by using a generator expression) that it is indeed a generator. Otherwise the implementation in KFold may change and this test is no longer doing the right thing.
You haven't addressed this.
There is a test a few lines above (at L1431) which confirms that KFold indeed returns a GeneratorType.
Okay. But then the comment needs to appear before it.
Thanks Joel, I have addressed your comments.
doc/modules/cross_validation.rst
Outdated
* To ensure results are repeatable (*on the same platform*), use a fixed value
  for ``random_state``.

The randomized CV splitters may return different results for each call of
This is described in less clear terms in the preceding bullet point. Please merge them.
sklearn/model_selection/_split.py
Outdated
Note
----
Randomized CV splitters may return different results for each call of
split. This can be avoided (and identical results returned for each
This seems a bit redundant. Maybe "You can make the results identical by setting random_state to an integer"?
This looks good, but I haven't double-checked whether it addresses all of @jnothman's comments from the original PR.
Done. If you guys are happy, this can be merged now. Unless Travis decides to give me a headache.
Thanks
…ion estimators (scikit-learn#7823)

* DOC Add NOTE that unless random_state is set, split will not be identical
* TST use np.testing.assert_equal for nested lists/arrays
* TST Make sure cv param can be a generator
* DOC rank_ becomes a link when rendered
* Use test_...
* Remove blank line; Add if shuffle is True
* Fix tests
* Explicitly test for GeneratorType
* TST Add the else clause
* TST Add comment on usage of np.testing.assert_array_equal
* TYPO
* MNT Remove if ;
* Address Joel's comments
* merge the identical points in doc
* DOC address Andy's comments
* Move comment to before the check for generator type
TODO
* …`random_state` is not set. `np.testing.assert_equal` to test for equality of nested lists fixes that.
* The word rank in the mock table view of `cv_results_` is assumed as a link. Fixed in [MRG + 2] ENH Allow `cross_val_score`, `GridSearchCV` et al. to evaluate on multiple metrics #7388.

@jnothman @amueller Pl. review :)