A colleague and I were struggling to rechunk a zarr dataset using this workflow: https://nbviewer.jupyter.org/gist/rsignell-usgs/89b61c3dc53d5107e70cf5574fc3c833
After much trial and error, we discovered that we needed to increase the worker memory to 8GB and decrease max_mem to 3GB to keep the workers from running out of memory and the cluster from dying with KilledWorker errors.
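For context, here is a minimal sketch of the kind of call we were running (the store paths, chunk shape, and cluster size below are placeholders, not our exact setup):

```python
import zarr
from rechunker import rechunk
from dask.distributed import Client

# Hypothetical local cluster; in our runs the workers needed 8GB each.
client = Client(n_workers=4, memory_limit="8GB")

source = zarr.open("source.zarr")      # placeholder store path
plan = rechunk(
    source,
    target_chunks=(1, 256, 256),       # illustrative chunk shape
    max_mem="3GB",                     # had to sit well below the worker limit
    target_store="target.zarr",
    temp_store="temp.zarr",
)
plan.execute()                         # runs the copy tasks on the cluster
```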
Watching the dask dashboard, we saw a number of the workers spike over 5GB, despite max_mem being set to 3GB.

When we looked at the worker logs, we saw tons of these warnings:
distributed.worker - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker.html#memtrim for more information. -- Unmanaged memory: 5.66 GiB -- Worker memory limit: 8.00 GiB
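For what it's worth, the memtrim docs linked in that warning describe manually asking glibc to return freed memory to the OS on Linux workers; a minimal sketch, assuming an existing client:

```python
import ctypes

def trim_memory() -> int:
    """Ask glibc to release freed memory back to the OS (Linux only)."""
    libc = ctypes.CDLL("libc.so.6")
    return libc.malloc_trim(0)

# Run the trim on every worker via the existing dask client.
client.run(trim_memory)
```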
Is this expected behavior?