Description
Both https://github.com/akleeman/xray/pull/20 and https://github.com/akleeman/xray/pull/21 deal with the same conceptual problem: sometimes the user wants fine control over how a dataset is stored (integer packing, time units, calendars, ...). Taking time as an example, the current model interprets the units and calendar in order to create a DatetimeIndex, but then throws those attributes away, so if the dataset is re-serialized the original units may not be preserved.
One proposed solution is to add a distinct set of encoding attributes that would hold things like 'scale_factor' and 'add_offset', allowing something like this:
ds['time'] = ('time', pd.date_range('1999-01-05', periods=10))
ds['time'].encoding['units'] = 'days since 1989-08-19'
ds.dump('netcdf.nc')
> ncdump -h netcdf.nc
...
int time(time) ;
time:units = "days since 1989-08-19" ;
...
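To make the round trip concrete, here is a minimal sketch in plain Python of what keeping 'units' in the encoding would allow. The helpers `parse_units`, `encode_times` and `decode_times` are hypothetical names made up for illustration, not part of xray, and the sketch assumes the standard calendar and a date-only reference.

```python
from datetime import datetime, timedelta

# Hypothetical helpers for illustration only; not xray API.
_UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def parse_units(units):
    """Split a CF-style string such as 'days since 1989-08-19'."""
    unit, _, reference = units.partition(" since ")
    return _UNIT_SECONDS[unit.strip()], datetime.fromisoformat(reference.strip())


def encode_times(times, units):
    """Convert datetimes to numeric offsets in the requested units."""
    step, reference = parse_units(units)
    return [(t - reference).total_seconds() / step for t in times]


def decode_times(values, units):
    """Convert numeric offsets back to datetimes."""
    step, reference = parse_units(units)
    return [reference + timedelta(seconds=v * step) for v in values]


# Ten daily timestamps, mirroring the date_range in the example above.
times = [datetime(1999, 1, 5) + timedelta(days=i) for i in range(10)]
encoding = {"units": "days since 1989-08-19"}

stored = encode_times(times, encoding["units"])
restored = decode_times(stored, encoding["units"])
```

Because the units string travels with the variable, decoding and re-encoding use the same reference date, so the values written to disk do not change between reads and writes.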
The encoding attributes could also handle masking, scaling, compression and similar storage details.
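As a rough illustration of the scaling and masking case, the sketch below packs floats into integers using the usual netCDF relation `unpacked = packed * scale_factor + add_offset`. The functions `pack` and `unpack`, the `fill_value` default, and the use of `None` for missing data are assumptions made for this example, not an existing xray interface.

```python
# Hypothetical helpers for illustration only; not xray API.
def pack(values, scale_factor, add_offset, fill_value=-32768):
    """Pack floats into integers; None marks a missing value."""
    return [
        fill_value if v is None else int(round((v - add_offset) / scale_factor))
        for v in values
    ]


def unpack(packed, scale_factor, add_offset, fill_value=-32768):
    """Invert pack(); fill_value is turned back into None."""
    return [
        None if p == fill_value else p * scale_factor + add_offset
        for p in packed
    ]


# The encoding dict holds everything needed to reverse the transformation.
encoding = {"scale_factor": 0.01, "add_offset": 273.15}
temperatures = [273.15, 280.0, None]

packed = pack(temperatures, **encoding)
unpacked = unpack(packed, **encoding)
```

Packing is lossy up to half of `scale_factor`, so preserving these attributes across a read and write cycle matters: without them a re-serialized file would either store full-width floats or choose a different packing.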