RuntimeError: dictionary changed size during iteration #7581

@bsesar

Description

What happened: Dask reported the following RuntimeError

RuntimeError: dictionary changed size during iteration

tornado.application - ERROR - Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <tornado.platform.asyncio.AsyncIOLoop object at 0x000002C4D8032A00>>, <Task finished name='Task-889722' coro=<Worker.heartbeat() done, defined at C:\Users\xxx\Anaconda3\envs\work2\lib\site-packages\distributed\worker.py:941> exception=RuntimeError('dictionary changed size during iteration')>)
Traceback (most recent call last):
  File "C:\Users\xxx\Anaconda3\envs\work2\lib\site-packages\tornado\ioloop.py", line 741, in _run_callback
    ret = callback()
  File "C:\Users\xxx\Anaconda3\envs\work2\lib\site-packages\tornado\ioloop.py", line 765, in _discard_future_result
    future.result()
  File "C:\Users\xxx\Anaconda3\envs\work2\lib\site-packages\distributed\worker.py", line 955, in heartbeat
    executing={
  File "C:\Users\xxx\Anaconda3\envs\work2\lib\site-packages\distributed\worker.py", line 955, in <dictcomp>
    executing={
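The traceback points at a dict comprehension in `Worker.heartbeat` iterating over a dictionary that another thread mutates concurrently. A minimal sketch of the failure mode and a common workaround (iterating over a snapshot of the items instead of the live dict) — this is not the actual distributed code, just an illustration:

```python
# Sketch: CPython raises RuntimeError if a dict's size changes while
# an iterator over it is still active. Here we mutate mid-loop to
# reproduce the error deterministically (in the real issue, another
# thread does the mutating).
tasks = {"task-1": 1, "task-2": 2}

error = None
try:
    for key in tasks:
        tasks["task-3"] = 3  # size changes under the iterator
except RuntimeError as exc:
    error = str(exc)

print(error)  # dictionary changed size during iteration

# Common workaround: take a snapshot first, so concurrent mutation of
# the original dict cannot invalidate the iterator.
snapshot = {k: v for k, v in list(tasks.items())}
```

Note the snapshot is not atomic with respect to the original dict, but `list(d.items())` completes the iteration in one step, which avoids the RuntimeError even if the dict is mutated immediately afterwards.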

I am also getting a fair number of these messages:

distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
distributed.worker - WARNING - gc.collect() took 2.438s. This is usually a sign that some tasks handle too many Python objects at the same time. Rechunking the work into smaller tasks might help.

I do have a lot of tasks (> half a million), and reducing the number of tasks by re-partitioning the data did help. However, I think the above RuntimeError should not happen.

What you expected to happen: I certainly did not expect a RuntimeError :-)

Minimal Complete Verifiable Example: I'm sorry, I don't have one. I am processing a lot of data (I'm using dask, so duh :-)), and I think the error may be related to that fact. However, I cannot share the data.

Anything else we need to know?: Perhaps this Stack Overflow post suggests a possible fix?

Environment:

  • Dask version: 2021.04.0
  • Python version: 3.9.2
  • Operating System: Windows 10
  • Install method (conda, pip, source): conda
