API Rename force_all_finite into ensure_all_finite#29404
API Rename force_all_finite into ensure_all_finite#29404adrinjalali merged 14 commits intoscikit-learn:mainfrom
Conversation
dacdf67 to
4432e70
Compare
sklearn/utils/validation.py
Outdated
| f"{estimator_name} requires y to be passed, but the target y is None" | ||
| ) | ||
|
|
||
| if force_all_finite != "deprecated": |
There was a problem hiding this comment.
What should the behavior be when both are set?
check_array(force_all_finite=True, ensure_all_finite=False)There was a problem hiding this comment.
The way I implemented it for now is that if both are set, force_all_finite takes over.
I can see that we'd like to have an error if both are different though. Let me change that
There was a problem hiding this comment.
Actually I'm not so sure. It makes the logic quite complicated.
What about the following ?
check_array(force_all_finite=False, ensure_all_finite=True)
ensure_all_finite=True is the default. The user already gets a warning to use ensure_all_finite instead of force_all_finite. I don't want to raise an additional error, while he just left the other to its default value. Otherwise we need a sentinel to be able to check if ensure_all_finite is set to a non-default value, but I don't think the added complexity is worth.
There was a problem hiding this comment.
Instead, I added a comment in warning saying that ensure_all_finite is ignored when force_all_finite. I know that since ensure_all_finite is the new name it feels a bit weird and one would expect the opposite, but this way keeps things simple and avoids adding a complex logic.
What do you think ?
There was a problem hiding this comment.
I'm +0.5 on just improving the docs/warning message.
The alternative is to have ensure_all_finite=None as the default that behaves as True. Then if force_all_finite!="deprecated" and ensure_all_finite is not None, raise an error that says that are not be set at the same time?
In two versions, we update ensure_all_finite=True and it'll be backward compatible because None behaved the same way.
There was a problem hiding this comment.
The alternative is to have
ensure_all_finite=None
Yes that's what I was not enthusiastic about. I would be in favor of doing that only if we agree to remove None as a valid option in 1.8 without deprecation 😄. Because otherwise I find that a 4 releases cycle is too long for a renaming.
There was a problem hiding this comment.
Since this is a developer tools, I think I'm fine avoiding to have the default sentinel None and later change the default.
There was a problem hiding this comment.
I'm not sure what @glemaitre is suggesting here. But I would rather use None, not document it, and later remove it after deprecation of force_...
There was a problem hiding this comment.
I changed it so that now an error is raised when both are set. I added a comment to remove None in 1.8 without deprecation.
| @pytest.mark.filterwarnings( | ||
| "ignore:'force_all_finite' was renamed to 'ensure_all_finite':FutureWarning" | ||
| ) |
There was a problem hiding this comment.
Is this coming from lightgbm?
There was a problem hiding this comment.
Yes, technically from scikit-learn as a dependency of lightgbm.
sklearn/metrics/pairwise.py
Outdated
| .. versionadded:: 0.22 | ||
| .. versionadded:: 0.20 |
There was a problem hiding this comment.
No, just a copy paste mistake on my end. I fixed that.
sklearn/metrics/pairwise.py
Outdated
| .. versionadded:: 0.20 | ||
| Accepts the string ``'allow-nan'``. | ||
|
|
||
| .. versionchanged:: 0.23 | ||
| Accepts `pd.NA` and converts it into `np.nan` | ||
|
|
||
| .. versionadded:: 1.6 | ||
| `force_all_finite` was renamed to `ensure_all_finite`. |
There was a problem hiding this comment.
we only need a version added I guess?
There was a problem hiding this comment.
The thing, if we don't copy it here, is that in 2 versions when we remove force_all_finite, then this information will disappear. I find it weird to remove the history for a renaming of a parameter.
There was a problem hiding this comment.
I don't think this information is relevant anymore when we're renaming the parameter.
There was a problem hiding this comment.
I agree with @adrinjalali. Basically since we change the name of parameter, we cannot go back to version 0.23 because this parameter is not existing there.
We just need to make sure that the options added are mentioned in the docstring that is the case here.
sklearn/metrics/pairwise.py
Outdated
| if force_all_finite != "deprecated": | ||
| warnings.warn( | ||
| "'force_all_finite' was renamed to 'ensure_all_finite' in 1.6 and will be " | ||
| "removed in 1.8. Until then, ensure_all_finite is ignored when " | ||
| "force_all_finite is set.", | ||
| FutureWarning, | ||
| ) |
There was a problem hiding this comment.
usually in these cases we raise if both are set. Only the new one should be set. More like
if force_all_finite != "deprecated" and ensure_all_finate != "default":
raise ...
elif force_all_finite != "deprecated":
warn("use ensure...")
ensure_... = force_...
elif ensure_... == "default":
ensure_... = TrueThere was a problem hiding this comment.
See the discussion here #29404 (comment), and especially my last comment #29404 (comment). I agree with you, but I just want to make sure that we're on the same page regarding the removal of "default" or None (whichever we chose) in 1.8 without deprecation.
glemaitre
left a comment
There was a problem hiding this comment.
LGTM.
I considered to be slightly more lenient because we are touching to a developer tool. If it would be an estimator, I would go through the 4 deprecation cycles thingy.
| `force_all_finite` was renamed to `ensure_all_finite` and will be removed | ||
| in 1.8. | ||
|
|
||
| ensure_all_finite : bool or 'allow-nan', default=True |
There was a problem hiding this comment.
I'm happy with the approach of using None as the default and documenting that it's True.
Co-authored-by: Adrin Jalali <[email protected]>
Co-authored-by: Adrin Jalali <[email protected]>
closes #29262
renames
force_all_finiteintoensure_all_finitein the following public functions:check_arraycheck_X_yas_float_arraycheck_pairwise_arrayspairwise_distancesThe
ensure_xxxpattern already used by several parameters of these functions only means that some checks are activated or not but has no effect on the output array.force_all_finitefollows this behavior but does not follow this naming pattern. I think it should, especially since we now have a parameterforce_writeablethat has an effect on the output array.