-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Closed
Description
Sometimes we need a cache that spills to disk.
>>> from chest import Chest
>>> cache = Chest()
>>> dsk = ...
>>> dask.threaded.get(dsk, keys, cache=cache)My current solution to this is Chest an object that satisfies the MutableMapping interface that spills large contents to disk as pickle files. When using this in production it worked, but not great. We need something better.
One solution would be to improve chest. A first step there would be to build a dask-like workflow to test chest and see what's falling down. Is it concurrency, the lack of an LRU mechanism, the O(n) things we do on each spill to disk?
Another solution would be to look around and find something better, possibly shove or something else in the ecosystem (or something new altogether.)
On disk caching will probably be necessary.
Metadata
Metadata
Assignees
Labels
No labels