-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Fix polyfit fail on deficient rank #4193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Looks fine to me.
That could be annoying as a user. This happens in with warnings.catch_warnings():
warnings.simplefilter("once", np.RankWarning)
results = da.apply_along_axis(
nputils._nanpolyfit_1d,
...
)I think this also warrants tests and a whats-new entry. |
|
So while adding tests, I realized there were more bugs concerning deficient rank matrices and |
|
Pity that the dask warnings are not caught but I guess there's not much we can do about that. Does this now work for the originally posted issue? Could you add it as a test? |
Apply suggestion from review. Co-authored-by: Mathias Hauser <[email protected]>
|
@mathause Fixed the tests to catch the warning and added a case for the specific of the issue : too many nans causing a rank deficiency. |
mathause
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I wait a few more days before merging to see if someone else has comments.
Co-authored-by: keewis <[email protected]>
|
thanks @aulemahal! |
isort -rc . && black . && mypy . && flake8whats-new.rstapi.rstFixes #4190. In cases where the input matrix had a deficient rank (matrix rank != order) because of the number of NaN values, polyfit would fail, simply because numpy's lstsq returned an empty array for the residuals (instead of a size 1 array). This fixes the problem by catching the case and returning
np.naninstead.The other point in the issue was that
RankWarningis also not raised in that case. That was due to the fact thatda.polyfitwas computing the rank from the coordinate (Vandermonde) matrix, instead of the masked data. Thus, is a given line has too many NaN values, its deficient rank was not detected. I added a test and warning at all places where a rank is computed (5 different lines). Also, to match np.polyfit behaviour of no warning whenfull=True, I changed the warning filters using a context manager, ignoring theRankWarningin that case. Overall, it feels a bi ugly because of the duplicated code and it will print the warning for every line of an array that has a deficient rank, which can be a lot...