A colleague and I were struggling to rechunk a zarr dataset using this workflow: https://nbviewer.jupyter.org/gist/rsignell-usgs/89b61c3dc53d5107e70cf5574fc3c833
After much trial and error, we discovered that we needed to increase the worker memory to 8GB and decrease max_mem to 3GB to keep the workers from running out of memory and the cluster from dying with KilledWorker errors.
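For context, here is a minimal sketch of the kind of call we were running (the store paths, chunk shape, and cluster size below are placeholders, not our exact setup):

```python
import zarr
from rechunker import rechunk
from dask.distributed import Client

# Hypothetical local cluster; in our runs the workers needed 8GB each.
client = Client(n_workers=4, memory_limit="8GB")

source = zarr.open("source.zarr")      # placeholder store path
plan = rechunk(
    source,
    target_chunks=(1, 256, 256),       # illustrative chunk shape
    max_mem="3GB",                     # had to sit well below the worker limit
    target_store="target.zarr",
    temp_store="temp.zarr",
)
plan.execute()                         # runs the copy tasks on the cluster
```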
Watching the dask dashboard, we saw a number of the workers spike over 5GB, despite max_mem being set to 3GB.

When we looked at the worker logs, we saw tons of these warnings:
distributed.worker - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker.html#memtrim for more information. -- Unmanaged memory: 5.66 GiB -- Worker memory limit: 8.00 GiB
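For what it's worth, the memtrim docs linked in that warning describe manually asking glibc to return freed memory to the OS on Linux workers; a minimal sketch, assuming an existing client:

```python
import ctypes

def trim_memory() -> int:
    """Ask glibc to release freed memory back to the OS (Linux only)."""
    libc = ctypes.CDLL("libc.so.6")
    return libc.malloc_trim(0)

# Run the trim on every worker via the existing dask client.
client.run(trim_memory)
```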
Is this expected behavior?