Description
Is your feature request related to a problem?
Based on the documentation of xarray.save_mfdataset, I would expect that arguments that can be passed to xarray.Dataset.to_netcdf() can also be passed to xarray.save_mfdataset:
When not using dask, it is no different than calling to_netcdf repeatedly.
But it appears that the unlimited_dims and encoding arguments available in to_netcdf are not also available in save_mfdataset:
test_save_mfdataset_encoding_opt.py:
import xarray as xr
# create a timeseries to store in a netCDF file
times = list(range(0,3652))
time = xr.DataArray(times, dims = ("time",))
# create a simple dataset to write using save_mfdataset
test_ds = xr.Dataset()
test_ds['time'] = time
# tell netCDF to write the times as doubles
encoding = dict(time = dict(dtype = "double"))
# set the output file name
output_path = "test.nc"
# the test fails when encoding is added as an argument to save_mfdataset
# but it works if instead the dataset is saved using
# test_ds.to_netcdf(output_path, encoding = encoding)
xr.save_mfdataset([test_ds], [output_path], encoding = encoding)
$ python3 test_save_mfdataset_encoding_opt.py
Traceback (most recent call last):
File "test_save_mfdataset_encoding_opt.py", line 21, in <module>
xr.save_mfdataset([test_ds], [output_path], encoding = encoding)
TypeError: save_mfdataset() got an unexpected keyword argument 'encoding'
This appears to be because save_mfdataset does not accept the encoding argument, nor does it accept and pass along **kwargs.
This means that datasets written with save_mfdataset are less flexible than those written with to_netcdf.
Describe the solution you'd like
A simple fix, which I have verified, is to modify save_mfdataset to accept and pass along **kwargs:
diff --git a/xarray/backends/api.py b/xarray/backends/api.py
index d1166624..8baca58c 100644
--- a/xarray/backends/api.py
+++ b/xarray/backends/api.py
@@ -1258,7 +1258,7 @@ def dump_to_store(
def save_mfdataset(
- datasets, paths, mode="w", format=None, groups=None, engine=None, compute=True
+ datasets, paths, mode="w", format=None, groups=None, engine=None, compute=True, **kwargs
):
"""Write multiple datasets to disk as netCDF files simultaneously.
@@ -1280,6 +1280,7 @@ def save_mfdataset(
these locations will be overwritten.
format : {"NETCDF4", "NETCDF4_CLASSIC", "NETCDF3_64BIT", \
"NETCDF3_CLASSIC"}, optional
+ **kwargs : additional arguments are passed along to ``to_netcdf``
File format for the resulting netCDF file:
@@ -1358,7 +1359,7 @@ def save_mfdataset(
writers, stores = zip(
*[
to_netcdf(
- ds, path, mode, format, group, engine, compute=compute, multifile=True
+ ds, path, mode, format, group, engine, compute=compute, multifile=True, **kwargs
)
for ds, path, group in zip(datasets, paths, groups)
]
With xarray/backends/api.py patched as above, the test file shown earlier runs as expected, with the encoding passed along:
$ python3 test_save_mfdataset_encoding_opt.py
$ ncdump -h test.nc
netcdf test {
dimensions:
time = 3652 ;
variables:
double time(time) ;
time:_FillValue = NaN ;
}
Describe alternatives you've considered
I attempted to set the encoding dictionary directly on the dataset prior to calling save_mfdataset, but that didn't seem to have an effect.
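Another stopgap, for the non-dask case only, follows from the documentation sentence quoted above: call to_netcdf once per dataset and pass the extra arguments there. The following is a minimal sketch under that assumption, not a tested replacement for save_mfdataset; the helper name save_each_dataset is hypothetical, and it applies the same keyword arguments to every dataset.

```python
def save_each_dataset(datasets, paths, **to_netcdf_kwargs):
    """Hypothetical helper: write each dataset with its own to_netcdf call.

    Assumes every item in `datasets` provides a `to_netcdf(path, **kwargs)`
    method, as xarray.Dataset does. Writes happen one after another, so this
    does not reproduce the dask-based behaviour of save_mfdataset.
    """
    datasets = list(datasets)
    paths = list(paths)
    if len(datasets) != len(paths):
        raise ValueError("datasets and paths must have the same length")
    for ds, path in zip(datasets, paths):
        # Keywords such as encoding or unlimited_dims are forwarded unchanged.
        ds.to_netcdf(path, **to_netcdf_kwargs)
```

In the test script above, this would be called as save_each_dataset([test_ds], [output_path], encoding = encoding).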
Additional context
Here is version information, in case it is relevant:
$ python3 -c 'import xarray; print(xarray.show_versions())'
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.4 (default, Aug 13 2019, 15:17:50)
[Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 21.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1
xarray: 0.15.0
pandas: 0.25.1
numpy: 1.17.2
scipy: 1.6.3
netCDF4: 1.4.2
pydap: installed
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.1.1.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.5.2
distributed: 2.5.2
matplotlib: 3.1.3
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 41.4.0
pip: 19.2.3
conda: 4.8.3
pytest: 5.2.1
IPython: 7.8.0
sphinx: 2.2.0
None