Spill to constrained disk space #5364

@crusaderky

Use case

This has been raised offline by a power user.
Their workers have limited disk space, frequently less than the amount of RAM. At the moment, the user has completely disabled spilling, because the spill file would otherwise grow until it occupies all available disk space. Once that happens, OSErrors start being raised and data is lost.

Proposed behaviour

  • If the spill-to-disk limit is hit, stop spilling to disk and keep data in memory.
  • The resulting buildup of memory will then pause the worker.
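
The behaviour above can be illustrated with a toy model. This is a minimal sketch, not distributed's Worker class: `ToyWorker`, `max_spill`, and the threshold fields are hypothetical names, and sizes are plain integers rather than measured bytes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWorker:
    """Toy model of the proposed behaviour (hypothetical, for illustration only)."""

    memory_limit: int        # bytes of RAM available to the worker
    spill_threshold: float   # fraction of memory_limit above which spilling starts
    pause_threshold: float   # fraction of memory_limit at which the worker pauses
    max_spill: int           # hypothetical cap on bytes spilled to disk
    fast: dict = field(default_factory=dict)  # key -> size held in memory
    slow: dict = field(default_factory=dict)  # key -> size held on disk

    @property
    def paused(self) -> bool:
        return sum(self.fast.values()) >= self.pause_threshold * self.memory_limit

    def add(self, key: str, nbytes: int) -> None:
        self.fast[key] = nbytes
        # Spill the oldest keys while above the spill threshold.
        while sum(self.fast.values()) > self.spill_threshold * self.memory_limit:
            oldest = next(iter(self.fast))
            if sum(self.slow.values()) + self.fast[oldest] > self.max_spill:
                break  # limit hit: stop spilling and keep the data in memory
            self.slow[oldest] = self.fast.pop(oldest)


worker = ToyWorker(memory_limit=100, spill_threshold=0.6,
                   pause_threshold=0.8, max_spill=30)
for i in range(5):
    worker.add(f"x{i}", 20)
```

In this run only the first key fits under the spill cap. The remaining keys stay in memory, and the accumulated memory use reaches the pause threshold.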

Proposed design

  1. Add a line to the dask config to put an upper limit on the size of the spill file.
  2. Keep track of the current size of the spill files on disk. This may not be exactly the same as the size of the keys, due to discrepancies between sizeof() and pickle output. Note that this measurement can be done without I/O by intercepting the calls to zict.File.__setitem__.
  3. Ahead of spilling, add the sizeof() of the key to be spilled to the current size of the spill files on disk. If the maximum size would be exceeded, log a warning and don't spill. If memory pressure keeps building up, this will in turn cause the worker to eventually reach the pause threshold.
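
Steps 2 and 3 could be sketched roughly as follows. This is an illustration under stated assumptions, not the eventual implementation: `SizeTrackingStore`, `try_spill`, and `max_bytes` are hypothetical names, a plain dict stands in for zict.File, and `sys.getsizeof` stands in for dask's sizeof().

```python
import logging
import pickle
import sys
from collections.abc import MutableMapping

logger = logging.getLogger("spill-sketch")


class SizeTrackingStore(MutableMapping):
    """Wrap a bytes store and track the bytes written, without extra I/O."""

    def __init__(self, store, max_bytes):
        self.store = store          # stand-in for a zict.File-like mapping
        self.max_bytes = max_bytes  # hypothetical value read from the config
        self.sizes = {}             # key -> bytes actually written
        self.total_bytes = 0

    def __setitem__(self, key, payload):
        self.store[key] = payload
        # Account for overwrites by subtracting any previous size for this key.
        self.total_bytes += len(payload) - self.sizes.get(key, 0)
        self.sizes[key] = len(payload)

    def __delitem__(self, key):
        del self.store[key]
        self.total_bytes -= self.sizes.pop(key)

    def __getitem__(self, key):
        return self.store[key]

    def __iter__(self):
        return iter(self.store)

    def __len__(self):
        return len(self.store)


def try_spill(slow, key, value):
    """Spill value if its estimated size fits; return whether it was spilled."""
    estimate = sys.getsizeof(value)  # estimate only; pickled size may differ
    if slow.total_bytes + estimate > slow.max_bytes:
        logger.warning("Spill limit reached; keeping %r in memory", key)
        return False
    slow[key] = pickle.dumps(value)
    return True


slow = SizeTrackingStore({}, max_bytes=1_000)
spilled_a = try_spill(slow, "a", b"x" * 500)  # fits under the limit
spilled_b = try_spill(slow, "b", b"y" * 600)  # would exceed it, so it is refused
```

The check uses the in-memory estimate while the running total uses the serialized length, which mirrors the discrepancy noted in step 2: the limit is enforced approximately, not exactly.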

Labels

enhancement: Improve existing functionality or make things work better
