Allow the ability to add/persist details of how a dataset is stored. #26

@akleeman

Pull requests https://github.com/akleeman/xray/pull/20 and https://github.com/akleeman/xray/pull/21 both deal with the same conceptual issue: sometimes the user wants fine control over how a dataset is stored (integer packing, time units and calendars, ...). Taking time as an example, the current model interprets the units and calendar in order to create a DatetimeIndex, but then throws those attributes away, so if the dataset is re-serialized the original units may not be preserved.

One proposed solution is to add a distinct set of encoding attributes, kept separate from the regular attributes, that would hold things like 'units', 'scale_factor' and 'add_offset', allowing something like this:

ds['time'] = ('time', pd.date_range('1999-01-05', periods=10))
ds['time'].encoding['units'] = 'days since 1989-08-19'
ds.dump('netcdf.nc')

> ncdump -h netcdf.nc
...
    int time(time) ;
        time:units = "days since 1989-08-19" ;
...
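
To make the round trip concrete, here is a minimal sketch of what honoring `encoding['units']` could involve, using only pandas and numpy. The helpers `encode_time` and `decode_time` and the `_UNIT_CODES` table are hypothetical names invented for illustration, not part of xray, and the sketch assumes a standard calendar and only 'days' or 'hours' units.

```python
import numpy as np
import pandas as pd

# Hypothetical mapping from CF-style unit names to pandas unit codes.
_UNIT_CODES = {'days': 'D', 'hours': 'h'}


def encode_time(index, units):
    """Convert a DatetimeIndex to integer offsets, e.g. 'days since 1989-08-19'."""
    unit, _, reference = units.partition(' since ')
    step = pd.Timedelta(1, unit=_UNIT_CODES[unit])
    offsets = (index - pd.Timestamp(reference)) / step
    # Fractional offsets would be truncated here; the sketch assumes whole steps.
    return np.asarray(offsets).astype('int64')


def decode_time(values, units):
    """Invert encode_time: integer offsets back to a DatetimeIndex."""
    unit, _, reference = units.partition(' since ')
    deltas = pd.to_timedelta(values, unit=_UNIT_CODES[unit])
    return pd.Timestamp(reference) + deltas


times = pd.date_range('1999-01-05', periods=10)
encoding = {'units': 'days since 1989-08-19'}

stored = encode_time(times, encoding['units'])      # what would be written to disk
restored = decode_time(stored, encoding['units'])   # what the user sees after loading
```

Because the units live in `encoding` rather than being discarded after decoding, the same string is available when the dataset is written out again.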

The encoding attributes could also handle masking, scaling, compression, etc.
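
As a rough illustration of the scaling case, the sketch below packs floats into small integers with 'scale_factor' and 'add_offset', following the usual netCDF convention that unpacked = packed * scale_factor + add_offset. The `pack` and `unpack` functions are hypothetical helpers written for this example, not an existing xray API, and the sample values are made up.

```python
import numpy as np


def pack(values, scale_factor, add_offset, dtype='int16'):
    """Store floats as small integers: packed = (value - add_offset) / scale_factor."""
    packed = np.round((np.asarray(values) - add_offset) / scale_factor)
    return packed.astype(dtype)


def unpack(packed, scale_factor, add_offset):
    """Recover floats: value = packed * scale_factor + add_offset."""
    return packed * scale_factor + add_offset


temps = np.array([10.0, 10.5, 11.0, 12.5])
encoding = {'scale_factor': 0.5, 'add_offset': 10.0}

stored = pack(temps, **encoding)        # int16 values written to disk
restored = unpack(stored, **encoding)   # floats seen by the user
```

Keeping both parameters in the encoding means the same packing can be reapplied on the next write instead of silently storing full-width floats.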
