Add support for saving the index produced with a full GRIB scan at open. #20

@alexamici

At the moment, every time a GRIB file is opened, cfgrib needs to scan all the messages in the file to build the index, which is then used to compute the coordinate values and build the hypercube representation of the variables.

Worse, when a GRIB file is opened with the convenience function open_datasets, the index is discarded every time the recursive call fails, and the expensive file scan is repeated.

Proposed implementation requirements for the feature are:

  • save the index to disk at path + .idx immediately after it is computed
    • a pickle of the in-memory structure is the simplest implementation
    • do not fail if the index cannot be written (the file may be on a read-only filesystem)
  • when opening a file, look for the path + .idx index file, check that it is in sync with the GRIB file, and load it
    • timestamp ordering is enough for now
    • do not fail if the index is corrupt
  • use locking to avoid concurrent writes, or a read concurrent with a write
    • concurrent reads must be ok
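
The requirements above could be sketched roughly as follows. This is only an illustration, not cfgrib code: `save_index` and `load_index` are hypothetical helper names, the index is assumed to be any picklable object, and locking is assumed to be advisory POSIX locking via `fcntl.flock`, so it would not work on Windows as written.

```python
import fcntl  # POSIX only; an assumption of this sketch
import os
import pickle


def save_index(grib_path, index):
    """Try to write the index to grib_path + '.idx'; return True on success."""
    try:
        # "ab" creates the file without truncating it before the lock is held.
        with open(grib_path + ".idx", "ab") as f:
            fcntl.flock(f, fcntl.LOCK_EX)  # exclusive: no other reader or writer
            f.truncate(0)
            pickle.dump(index, f)
        return True  # closing the file releases the lock
    except (OSError, pickle.PicklingError):
        # e.g. read-only filesystem: silently skip saving the index
        return False


def load_index(grib_path):
    """Return the saved index, or None if it is missing, stale or corrupt."""
    try:
        with open(grib_path + ".idx", "rb") as f:
            fcntl.flock(f, fcntl.LOCK_SH)  # shared: concurrent reads are fine
            # Timestamp ordering: the index must not be older than the GRIB file.
            if os.fstat(f.fileno()).st_mtime < os.path.getmtime(grib_path):
                return None
            return pickle.load(f)
    except Exception:
        # Missing or corrupt index: the caller falls back to a full scan.
        return None
```

A caller would try `load_index(path)` first and, on `None`, do the full scan and then call `save_index(path, index)`, ignoring its return value. Note that unpickling is only safe for files from a trusted source, which may matter for `.idx` files stored next to shared data.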

Labels: enhancement (new feature or request)

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions