Description
MCVE Code Sample
import numpy as np
import xarray as xr

arr = xr.DataArray(np.arange(6).reshape(2, 3),
                   coords=[('x', ['a', 'b']), ('y', [0, 1, 2])])
arr
stacked = arr.stack(z=('x', 'y'))
stacked[:4].unstack().dtype
Expected Output
>>> arr = xr.DataArray(np.arange(6).reshape(2, 3),
...                    coords=[('x', ['a', 'b']), ('y', [0, 1, 2])])
>>> arr
<xarray.DataArray (x: 2, y: 3)>
array([[0, 1, 2],
       [3, 4, 5]])
Coordinates:
  * x        (x) <U1 'a' 'b'
  * y        (y) int64 0 1 2
>>> stacked = arr.stack(z=('x', 'y'))
>>> stacked[:4].unstack().dtype
dtype('float64')
Problem Description
Unstacking a partial selection casts the data to float, because the missing entries are filled with NaN.
Are there thoughts on alternative options, e.g. fill_value=0 or return_boolean_mask, in order to retain the original data type?
Currently, I obtain a boolean mask of the missing entries by checking isnan.
Then I call fillna(0) and convert the data type back to integer.
However, this is quite inefficient.
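For reference, the workaround described above looks roughly like the sketch below. The variable names `unstacked`, `missing` and `restored` are only illustrative, and the placeholder value 0 is an arbitrary choice; any value that fits the original dtype would do.

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(
    np.arange(6).reshape(2, 3),
    coords=[("x", ["a", "b"]), ("y", [0, 1, 2])],
)
stacked = arr.stack(z=("x", "y"))

# Unstacking a partial selection leaves gaps; they are filled with NaN,
# which forces the integer data to be cast to float.
unstacked = stacked[:4].unstack()

# Boolean mask of the entries that are missing after unstacking.
missing = unstacked.isnull()

# Replace NaN with a placeholder (0 here, an arbitrary choice) and cast
# back to the original dtype. This makes extra passes over the data.
restored = unstacked.fillna(0).astype(arr.dtype)
```

This needs an intermediate float array plus two more passes over it, which is the overhead a fill_value-style option would avoid.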
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-957.10.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1
xarray: 0.14.0
pandas: 0.25.1
numpy: 1.17.2
scipy: 1.3.1
netCDF4: 1.4.2
pydap: None
h5netcdf: 0.7.4
h5py: 2.9.0
Nio: None
zarr: 2.3.2
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.5.2
distributed: 2.5.2
matplotlib: 3.1.1
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 41.4.0
pip: 19.2.3
conda: None
pytest: 5.0.1
IPython: 7.8.0
sphinx: None