v2: Standardizing .zmetadata #113

I want to begin a discussion about standardizing
the .zmetadata format for consolidated metadata.

Suppose we have this Zarr container.
````
.zgroup -- of the root group
var1
    .zarray -- for var1
subgroup1
    .zgroup
    var2
        .zarray -- for var2
        .zattrs  -- for var2    
````
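As a rough illustration of what has to be consolidated, here is a minimal sketch that gathers the metadata objects from a container stored as a local directory tree. The function name `collect_metadata` and the assumption of a plain directory store are hypothetical, chosen only for illustration; this is not part of any Zarr API.

```python
import json
import os

# Names of the Zarr v2 metadata objects that appear in the container above.
META_NAMES = (".zgroup", ".zarray", ".zattrs")


def collect_metadata(root):
    """Map store-relative keys such as "var1/.zarray" to parsed JSON contents.

    Hypothetical helper: assumes the container is a directory tree on disk.
    """
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name in META_NAMES:
                full = os.path.join(dirpath, name)
                # Keys are relative paths with "/" separators and no leading "/".
                key = os.path.relpath(full, root).replace(os.sep, "/")
                with open(full, encoding="utf-8") as f:
                    found[key] = json.load(f)
    return found
```

Chunk files and any other non-metadata objects are skipped, so the result contains one entry per metadata object.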
This structure needs to be encoded as JSON in the .zmetadata object.
I can see two obvious encodings:

1. nested encoding
````
{
".zgroup": {<contents of the .zgroup>},
"var1": {
    ".zarray": {<contents of .zarray>}
    },
"subgroup1": {
    ".zgroup": {<contents of the .zgroup>},
    "var2": {
        ".zarray": {<contents of .zarray>},
        ".zattrs": {<contents of .zattrs>}
        }
    }
}
````

2. flat-key encoding
````
{
"/.zgroup": {<contents of the .zgroup>},
"/var1/.zarray": {<contents of .zarray>},
"/subgroup1/.zgroup": {<contents of the .zgroup>},
"/subgroup1/var2/.zarray": {<contents of .zarray>},
"/subgroup1/var2/.zattrs": {<contents of .zattrs>}
}
````
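The two encodings carry the same information, which a short sketch can make concrete. The helper names `nested_to_flat` and `flat_to_nested` are hypothetical, the metadata contents are abbreviated placeholders rather than complete `.zarray`/`.zgroup` documents, and the leading "/" on flat keys is just the convention used in the example above.

```python
# Metadata object names in Zarr v2; any other key is treated as a child name.
META_KEYS = {".zgroup", ".zarray", ".zattrs"}


def nested_to_flat(node, prefix=""):
    """Flatten a nested encoding into {"/path/.zarray": contents, ...}."""
    flat = {}
    for name, value in node.items():
        path = f"{prefix}/{name}"
        if name in META_KEYS:
            flat[path] = value  # metadata contents are copied unchanged
        else:
            flat.update(nested_to_flat(value, path))  # child group or array
    return flat


def flat_to_nested(flat):
    """Rebuild the nested encoding from flat keys."""
    nested = {}
    for path, value in flat.items():
        *parents, leaf = path.strip("/").split("/")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Placeholder contents, for illustration only.
nested = {
    ".zgroup": {"zarr_format": 2},
    "var1": {".zarray": {"shape": [10]}},
    "subgroup1": {
        ".zgroup": {"zarr_format": 2},
        "var2": {".zarray": {"shape": [5]}, ".zattrs": {"units": "m"}},
    },
}

flat = nested_to_flat(nested)
assert flat_to_nested(flat) == nested  # the round trip loses nothing
```

Note that the nested form has to know which keys are metadata objects so that it does not descend into the contents of `.zattrs`; the flat form avoids that ambiguity because the metadata name is always the last path component.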

My observations:
* The flat-key encoding should, as a rule, be slightly smaller than the
nested encoding.
* The nested encoding would be easier to process into internal data structures,
but that would depend on the implementation. It would be faster for netcdf-c,
but might not be for zarr-python.
* Note that I have prefixed each key with "/", but that is just my choice; a decision is needed about that.
* The one example I have seen in the wild uses flat-key encoding.
* The flat-key encoding has no entries for non-content-bearing objects. So, for example, there is no "/subgroup1" key and no "/subgroup1/var2" key. This seems reasonable, since such keys would not add any useful information.
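To illustrate the last point, here is a minimal sketch showing that the group and array paths can be recovered from the flat keys alone, so separate entries for them would be redundant. The helper names `implied_nodes` and `node_kind` are hypothetical, the metadata contents are placeholders, and the leading "/" convention is assumed.

```python
from posixpath import dirname

# Placeholder contents, for illustration only.
flat = {
    "/.zgroup": {"zarr_format": 2},
    "/var1/.zarray": {"shape": [10]},
    "/subgroup1/.zgroup": {"zarr_format": 2},
    "/subgroup1/var2/.zarray": {"shape": [5]},
    "/subgroup1/var2/.zattrs": {"units": "m"},
}


def implied_nodes(flat):
    """Return every group or array path implied by the flat keys."""
    nodes = set()
    for key in flat:
        path = dirname(key)  # "/subgroup1/var2/.zarray" -> "/subgroup1/var2"
        while True:
            nodes.add(path)
            if path == "/":
                break
            path = dirname(path)  # walk up to the parent
    return nodes


def node_kind(flat, node):
    """Classify a node by which metadata object exists for it."""
    prefix = "" if node == "/" else node
    if f"{prefix}/.zarray" in flat:
        return "array"
    if f"{prefix}/.zgroup" in flat:
        return "group"
    return "unknown"


for node in sorted(implied_nodes(flat)):
    print(node, node_kind(flat, node))
```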

