-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Linear interp with NaNs in nd indexer #4233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When interpolating with an nd indexer that contains NaN's, the code previously threw a KeyError from the missing._localize function. This commit fixes this by swapping `np.min` and `np.max` with `np.nanmin` and `np.nanmax`, ignoring any NaN values.
|
Thanks, @jenssss for sending a PR. |
xarray/core/missing.py
Outdated
| imin = index.get_loc(np.min(new_x.values), method="nearest") | ||
| imax = index.get_loc(np.max(new_x.values), method="nearest") | ||
| imin = index.get_loc(np.nanmin(new_x.values), method="nearest") | ||
| imax = index.get_loc(np.nanmax(new_x.values), method="nearest") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks that np.nanmin (and nanmax) supports np.datetime-dtype only with numpy>=1.18.
We can copy np.nanmin to our core/npcompat.py and call this function here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think they added some low level loops to fix this (numpy/numpy#14841) so guess we cannot copy nanmin over and have to use if LooseVersion(np. __version__) < LooseVersion("1.18") (or similar).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some LooseVersion checks for numpy>=1.18 in the d044142 commit to the missing._localize function and to the test. Would this do?
|
@jenssss, there's something wrong with the commit history (which makes reviewing pretty hard). Did you by change do |
|
@keewis, sorry I'm still kinda new to Github. |
|
not your fault, the contributing guide did tell you to do that, and I only updated it a few days ago (not in |
|
I want to try to fix it. |
…terp This merges an updated upstream into this topic branch
d044142 to
2152fb8
Compare
|
Okay, I think that should do it. How does it look now? |
|
perfect, thanks for the fix. |
|
Looks good - if you want to go a step further you can probably do something along the lines of (untested): if new_x.dtype in "mM" and LooseVersion(np.__version__) < LooseVersion("1.18") and new_x.isnull().any():
raise ValueError("numpy 1.18 or newer required to use interp with datetime/ timedelta array containing missing values")
else:
imin = ... np.nanmin(...)
imax = ... |
This means the PR now also works for numpy < 1.18, as long as index is not with datetime
|
I misremembered the behavior of the old Sorry for sending you down the wrong path. I can have another look tomorrow. |
It seems that np.min/max works in place of nanmin/nanmax for datetime types for numpy < 1.18, see https://github.com/pydata/xarray/pull/3924/files
896f699 to
8313c3e
Compare
|
Okay, I removed the I thought it better to use |
|
Thanks! I have two more suggestions: Can you add an additional line with (["2000-01-01T12:00", "2000-01-02T12:00", "NaT"], [0.5, 1.5]), here: xarray/xarray/tests/test_interp.py Line 575 in 8313c3e
to have a test with a missing datetime. Do you also want to add a note to the docstring, e.g. adding xarray/xarray/core/dataarray.py Line 1425 in a36d0a1
here: xarray/xarray/core/dataarray.py Line 1495 in a36d0a1
Does it also work for DataSets? Then maybe add to the docstring here: Line 2603 in a36d0a1
and here: Line 2731 in a36d0a1
|
Also added a test for `Dataset` to `test_interpolate_nd_with_nan`, and "Missing values are skipped." to the dosctring of `interp` and `interp_like` methods of `DataArray` and `Dataset`.
|
LGTM. I'll merge in a day or two unless someone else has a comment. (#3924) |
|
thanks @jenssss I see this is your first PR - welcome to xarray! |
When doing linear interpolatiion with an nd indexer that contains NaN's,
xarraypreviously threw aKeyErrorfrom themissing._localizefunction. This PR fixes this by swappingnp.minandnp.maxwithnp.nanminandnp.nanmaxin that function, ignoring any NaN values.