daemon: restore: register containers without rwlayer#51724

Merged
vvoland merged 1 commit into moby:master from akerouanton:fix-zombie-without-rwlayer
Mar 26, 2026

Conversation

@akerouanton
Member

@akerouanton commented Dec 15, 2025

- What I did

Under some undetermined circumstances, a container might end up with no rwlayer on daemon restart. The restore logic would add such containers to the containersReplica list, but not to the 'containers' list. Thus, users would see them in `docker ps -a`, but would be unable to remove them.

To fix that, add these containers to the 'containers' list as well.

- Human readable description for the release notes

Fix a rare bug that could cause containers to become unremovable.
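
For context, a minimal sketch of the shape of the fix (simplified; the GetLayerByID call and the register step are taken from the discussion below, not the exact diff):

rwLayer, err := daemon.imageService.GetLayerByID(c.ID)
if err != nil {
    // A container without a rwlayer is in a bad state, but we must still
    // register it so users can remove it: log the error and fall through.
    logger.WithError(err).Error("failed to load container mount")
} else {
    c.RWLayer = rwLayer
}
// Register unconditionally, so the container ends up in daemon.containers
// (consulted by docker rm) as well as daemon.containersReplica
// (consulted by docker ps -a), keeping the two views consistent.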

Comment thread: daemon/daemon.go
// A container without a rwlayer is in a bad state, but we must register that container to let users
// remove it. So, log the error but do not early-return.
logger.WithError(err).Error("failed to load container mount")
return
Contributor


Looking at the history, it looks like this logic has been there for quite a long time...

We should investigate why this is happening

Member Author

@akerouanton commented Dec 15, 2025


I didn't put as much detail in my commit message as I did in #51692.

Before 9f5f4f5, restore would scan the state directory to build the list of containers to restore (i.e. put them in the restore-local containers slice). If the rwlayer was missing, it'd log the error message, and the early return I'm removing in this PR would prevent the container from being added to daemon.containers and daemon.containersReplica.

Starting with 9f5f4f5, loadContainers just scans the state directory to build the list of containers, without checking rwlayers. That list is then passed to restore, which does the check. If the check fails, the container isn't register()'d (i.e. not added to daemon.containers). However, the whole containers list is used to rebuild daemon.containersReplica.

Hence, I believe that this is an old bug (probably going back to the introduction of containerd snapshotter support) that no one spotted beforehand because the only and very subtle symptom was a state dir with more containers than returned by docker ps -a.
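
To illustrate, a rough sketch of the post-9f5f4f5 flow (names simplified; rebuildReplica is a stand-in for however daemon.containersReplica is actually repopulated):

containers := loadContainers(stateDir) // no rwlayer check here anymore
for _, c := range containers {
    rwLayer, err := imageService.GetLayerByID(c.ID)
    if err != nil {
        logger.WithError(err).Error("failed to load container mount")
        continue // the bug: register() is skipped, so c never lands in daemon.containers
    }
    c.RWLayer = rwLayer
    register(c) // adds c to daemon.containers
}
// containersReplica is rebuilt from the full list, including the containers
// skipped above, hence `docker ps -a` shows entries that `docker rm` cannot
// find in daemon.containers.
rebuildReplica(containers)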

Member


It looks like we have a sentinel error for missing layers (which probably should be changed to implement errdefs.NotFound); should we handle "not found" errors differently from other errors here?

func (ls *layerStore) GetRWLayer(id string) (RWLayer, error) {
    ls.locker.Lock(id)
    defer ls.locker.Unlock(id)
    ls.mountL.Lock()
    mount := ls.mounts[id]
    ls.mountL.Unlock()
    if mount == nil {
        return nil, ErrMountDoesNotExist

Actually, it looks like the containerd snapshotter does have some special handling for not-found errors in some parts:

func (i *ImageService) GetLayerByID(cid string) (container.RWLayer, error) {
    ctx := context.TODO()
    sn := i.client.SnapshotService(i.StorageDriver())
    if _, err := sn.Stat(ctx, cid); err != nil {
        if !cerrdefs.IsNotFound(err) {
            return nil, fmt.Errorf("failed to stat snapshot %s: %w", cid, err)
        }
        return nil, errdefs.NotFound(fmt.Errorf("RW layer for container %s not found", cid))
    }
    ls := i.client.LeasesService()
    lss, err := ls.List(ctx, "id=="+cid)
    if err != nil {
        return nil, err
    }
    switch len(lss) {
    case 0:
        return nil, errdefs.NotFound(errors.New("rw layer lease not found for container " + cid))
    case 1:
        // found
    default:
        log.G(ctx).WithFields(log.Fields{"container": cid, "leases": lss}).Warn("multiple leases with the same id found, this should not happen")
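
For reference, one way the graphdriver sentinel could be made to satisfy the same check (a sketch using moby's errdefs helpers, not part of this PR):

// In layerStore.GetRWLayer, wrap the sentinel so that errdefs.IsNotFound
// recognizes it:
if mount == nil {
    return nil, errdefs.NotFound(ErrMountDoesNotExist)
}

// Callers can then branch on the error class instead of comparing against
// the sentinel value directly:
if errdefs.IsNotFound(err) {
    // the rwlayer is gone for good; retrying won't help
}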

Member Author


It looks like we have a sentinel error for missing layers (which probably should be changed to implement errdefs.NotFound); should we handle "not found" errors differently from other errors here?

No, we should handle all errors the same way to ensure that the container is added to both daemon.containers and daemon.containersReplica.

Comment thread: daemon/daemon.go (outdated)
    logger.WithError(err).Error("failed to load container mount")
    return
}
c.RWLayer = rwlayer
Member


Can we put this in an else or something similar? Currently we're using a value that was produced by a function that returned an error, which is relying on undocumented behavior; we should probably make that explicit, and only use the value for non-error returns.

It's risky to keep this as-is: either a linter will start complaining, or someone will stumble on this code and change it (e.g. re-introduce an early return).

Member Author


Can we put this in an else or something similar?

Done

someone will stumble on this code and change it (e.g. re-introduce an early return).

I wrote this comment specifically to ensure that no one is going to re-add an early-return here:

// A container without a rwlayer is in a bad state, but we must register that container to let users
// remove it. So, log the error but do not early-return.
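
The restructured shape looks roughly like this (a sketch, not the exact diff):

rwlayer, err := daemon.imageService.GetLayerByID(c.ID)
if err != nil {
    // A container without a rwlayer is in a bad state, but we must register
    // that container to let users remove it. So, log the error but do not
    // early-return.
    logger.WithError(err).Error("failed to load container mount")
} else {
    // Only consume the returned value on the non-error path.
    c.RWLayer = rwlayer
}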

Comment thread: daemon/daemon.go
Comment on lines +296 to 298
// A container without a rwlayer is in a bad state, but we must register that container to let users
// remove it. So, log the error but do not early-return.
logger.WithError(err).Error("failed to load container mount")
Member


Do we need to change the state to dead for these?

Member Author


I don't think so.

dead is set in a specific code path (i.e. daemon.cleanupContainer) and shows that the Engine stopped while progressing through it. This means either the container had AutoRemove set, or a ContainerRm API call was in progress.

This PR is addressing another edge case. From the Desktop report:

CONTAINER ID   IMAGE                      COMMAND                  CREATED        STATUS                      PORTS     NAMES
44b9cb711bf2   postgres:17.6-alpine3.22   "docker-entrypoint.s…"   3 weeks ago    Exited (0) 3 weeks ago                
ea85756a438f   postgres:17.6-alpine3.22   "docker-entrypoint.s…"   5 weeks ago    Exited (0) 5 weeks ago                
cbfecd6aedec   88e904502786               "docker-entrypoint.s…"   2 months ago   Exited (128) 2 months ago

I have a good explanation of how a container can end up in the Dead state (cf. my previous PR), but I have no explanation for these Exited containers without a rwlayer. Transitioning containers to Dead would remove an important clue that could be useful in future bug reports and investigations.

Moreover, the current expectation is for dead containers to be reclaimed automatically by daemon.restore(). However, that method only reclaims containers that have AutoRemove set:

moby/daemon/daemon.go

Lines 524 to 535 in ec136d7

if cfg.AutoRestart && c.ShouldRestart() && !c.NetworkSettings.HasSwarmEndpoint && c.HasBeenStartedBefore {
    mapLock.Lock()
    restartContainers[c] = make(chan struct{})
    mapLock.Unlock()
} else if c.HostConfig != nil && c.HostConfig.AutoRemove {
    // Remove the container if live-restore is disabled or if the container has already exited.
    if !cfg.LiveRestoreEnabled || !alive {
        mapLock.Lock()
        removeContainers[c.ID] = c
        mapLock.Unlock()
    }
}

You pointed out on Slack:

we’d be restoring a container without a writable layer, so we wouldn’t be able to start it (I think?)

Yes, and the restore logic might even try to restart a container without a rwlayer.

daemon.start() calls daemon.conditionalMountOnStart() which calls daemon.Mount(). That method returns an error if the container has no rwlayer:

moby/daemon/daemon.go

Lines 1510 to 1512 in ec136d7

if container.RWLayer == nil {
    return errors.New("RWLayer of container " + container.ID + " is unexpectedly nil")
}

So the (re)start flow will fail correctly.

I think we should rely on that instead of making too many changes in daemon.restore(). Users will be able to docker ps / docker inspect / docker rm their broken containers, but won't be able to start them.
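
In other words, the expected user-visible behavior after this PR is roughly (illustrative; exact error text aside):

$ docker start cbfecd6aedec    # fails: daemon.Mount() rejects the nil RWLayer
$ docker inspect cbfecd6aedec  # works: the container is registered
$ docker rm cbfecd6aedec       # works: the broken container can be reclaimed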

Member


but I have no explanation for these Exited containers without a rwlayer.

Curious; could these be containers from before/after switching to the containerd image store? Do we correctly detect the case where a container was created with graphdrivers, and we now try to find its writable layer in containerd (e.g.)?

However, that method only reclaims containers which have autoremove set:

That looks like a potential bug, or at least, to my knowledge we should always attempt to remove a "DEAD" container: it's no longer functional, so the only thing that can be done is to garbage-collect it 🤔

Users will be able to docker ps / docker inspect / docker rm their broken containers, but won't be able to start it.

Hm, right; but do we correctly handle things like docker commit etc?

@vvoland
Contributor

vvoland commented Jan 15, 2026

Moving to 29.3 to unblock 29.2. We can still backport it to 29.2.X later as it's a bugfix.

Under some undetermined circumstances, a container might end up with no
rwlayer on daemon restart. The restore logic would add them to the
containersReplica list but not to the 'containers' list. Thus, users
would see them on `docker ps -a`, but would be unable to remove them.

To fix that, add these containers to the 'containers' list as well.

Signed-off-by: Albin Kerouanton <[email protected]>
@thaJeztah force-pushed the fix-zombie-without-rwlayer branch from 283c686 to 968fb57 on March 24, 2026 22:50
Member

@thaJeztah left a comment


LGTM

@thaJeztah moved this from New to Complete in 🔦 Maintainer spotlight on Mar 26, 2026
@vvoland merged commit b6265d0 into moby:master on Mar 26, 2026
183 of 184 checks passed
@akerouanton deleted the fix-zombie-without-rwlayer branch on March 27, 2026 06:34
@thaJeztah modified the milestones: 29.4.0, 29.3.2 on Apr 1, 2026

Labels

area/daemon (Core Engine), impact/changelog, kind/bugfix (PR's that fix bugs), process/cherry-pick, release-blocker (PRs we want to block a release on)
