
daemon: clean up dead containers on start#51692

Merged
vvoland merged 1 commit into moby:master from akerouanton:remove-dead-ctrs-on-startup
Dec 11, 2025

Conversation

@akerouanton
Member

@akerouanton akerouanton commented Dec 11, 2025

- What I did

Stopping the Engine while a container with autoremove set is running may leave behind dead containers on disk. These containers aren't reclaimed on the next start, appear as "dead" in `docker ps -a`, and can't be inspected or removed by the user.

This bug has existed for a long time but became user visible with 9f5f4f5. Prior to that commit, containers with no rwlayer weren't added to the in-memory viewdb, so they weren't visible in `docker ps -a`. However, some dangling files would still live on disk (e.g. folders in /var/lib/docker/containers, mount points, etc.).

The underlying issue is that when the daemon stops, it tries to stop all running containers and then closes the containerd client. This leaves a small window of time where the Engine might receive 'task stop' events from containerd and trigger autoremove. If the containerd client is closed in parallel, the Engine is unable to complete the removal, leaving the container in the 'dead' state. In such cases, the Engine logs the following error:

cannot remove container "bcbc98b4f5c2b072eb3c4ca673fa1c222d2a8af00bf58eae0f37085b9724ea46": Canceled: grpc: the client connection is closing: context canceled

Solving the underlying issue would require complex changes to the shutdown sequence. Moreover, the same issue could also happen if the daemon crashes while it deletes a container. Thus, add a cleanup step on daemon startup to remove these dead containers.
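A minimal sketch of what such a startup cleanup could look like (hypothetical `Container` type and `cleanupDeadContainers` helper for illustration only, not the daemon's actual code):

```go
package main

import "fmt"

// Container is a trimmed-down, hypothetical stand-in for the daemon's
// container state; only the fields relevant to the cleanup are shown.
type Container struct {
	ID         string
	Dead       bool
	AutoRemove bool
}

// cleanupDeadContainers drops containers left in the 'dead' state with
// autoremove set, returning the survivors. The real daemon would also
// have to delete on-disk state (the container's folder under
// /var/lib/docker/containers, mount points, etc.).
func cleanupDeadContainers(ctrs []*Container) []*Container {
	kept := ctrs[:0] // filter in place over the same backing array
	for _, c := range ctrs {
		if c.Dead && c.AutoRemove {
			fmt.Printf("removing dead container %s\n", c.ID)
			continue
		}
		kept = append(kept, c)
	}
	return kept
}

func main() {
	ctrs := []*Container{
		{ID: "dead-autoremove", Dead: true, AutoRemove: true},
		{ID: "still-running", Dead: false, AutoRemove: true},
	}
	ctrs = cleanupDeadContainers(ctrs)
	fmt.Println(len(ctrs)) // prints 1
}
```

Running this pass once during daemon startup is enough to reclaim containers stranded by the shutdown race, without touching the shutdown sequence itself.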

- How to verify it

A new integration test has been added.

- Human-readable description for the release notes

Fix a bug that could cause the Engine to leave containers with autoremove set in the 'dead' state on shutdown, and never reclaim them.

@akerouanton akerouanton added this to the 29.2.0 milestone Dec 11, 2025
@akerouanton akerouanton self-assigned this Dec 11, 2025
@akerouanton akerouanton force-pushed the remove-dead-ctrs-on-startup branch from 264a31f to fd96eaf on December 11, 2025 17:11
@akerouanton akerouanton marked this pull request as ready for review December 11, 2025 17:17
Contributor

@austinvazquez austinvazquez left a comment

LGTM.

@vvoland
Contributor

vvoland commented Dec 11, 2025

The CI fails on Windows:

=== FAIL: integration/container TestRemoveDeadContainersOnDaemonRestart (0.53s)
    daemon.go:333: [de1dbe7ab70dc] Status: unknown flag: --userland-proxy
    daemon.go:333: [de1dbe7ab70dc] See 'dockerd --help'., Code: 125
    remove_test.go:117: [de1dbe7ab70dc] failed to start daemon with arguments [--config-file /dev/null --data-root D:\a\moby\moby\go\src\github.com\docker\docker\bundles\tmp\TestRemoveDeadContainersOnDaemonRestart\de1dbe7ab70dc\root --exec-root C:\Users\RUNNER~1\AppData\Local\Temp\dxr\de1dbe7ab70dc --pidfile D:\a\moby\moby\go\src\github.com\docker\docker\bundles\tmp\TestRemoveDeadContainersOnDaemonRestart\de1dbe7ab70dc\docker.pid --userland-proxy=true --containerd-namespace de1dbe7ab70dc --containerd-plugins-namespace de1dbe7ab70dcp --containerd /var/run/docker/containerd/containerd.sock --host unix://C:\Users\RUNNER~1\AppData\Local\Temp\docker-integration\de1dbe7ab70dc.sock --debug] : [de1dbe7ab70dc] daemon exited during startup: exit status 1

Stopping the Engine while a container with autoremove set is running may
leave behind dead containers on disk. These containers aren't reclaimed
on next start, appear as "dead" in `docker ps -a` and can't be
inspected or removed by the user.

This bug has existed for a long time but became user visible with
9f5f4f5. Prior to that commit,
containers with no rwlayer weren't added to the in-memory viewdb, so
they weren't visible in `docker ps -a`. However, some dangling files
would still live on disk (e.g. folders in /var/lib/docker/containers,
mount points, etc).

The underlying issue is that when the daemon stops, it tries to stop all
running containers and then closes the containerd client. This leaves a
small window of time where the Engine might receive 'task stop' events
from containerd, and trigger autoremove. If the containerd client is
closed in parallel, the Engine is unable to complete the removal,
leaving the container in the 'dead' state. In such cases, the Engine logs the
following error:

    cannot remove container "bcbc98b4f5c2b072eb3c4ca673fa1c222d2a8af00bf58eae0f37085b9724ea46": Canceled: grpc: the client connection is closing: context canceled

Solving the underlying issue would require complex changes to the
shutdown sequence. Moreover, the same issue could also happen if the
daemon crashes while it deletes a container. Thus, add a cleanup step
on daemon startup to remove these dead containers.

Signed-off-by: Albin Kerouanton <[email protected]>
@austinvazquez austinvazquez force-pushed the remove-dead-ctrs-on-startup branch from fd96eaf to ec9315c on December 11, 2025 19:40
Contributor

@vvoland vvoland left a comment

LGTM (after CI green)

Member

@thaJeztah thaJeztah left a comment

LGTM on green, thanks!
