daemon: restore: register containers without rwlayer#51724
daemon: restore: register containers without rwlayer#51724vvoland merged 1 commit intomoby:masterfrom
Conversation
| // A container without a rwlayer is in a bad state, but we must register that container to let users | ||
| // remove it. So, log the error but do not early-return. | ||
| logger.WithError(err).Error("failed to load container mount") | ||
| return |
There was a problem hiding this comment.
Looking at the history, looks like this logic has been there for quite a long time...
We should investigate why this is happening
There was a problem hiding this comment.
I didn't put as much details in my commit message as I did in #51692.
Before 9f5f4f5, restore would scan the state directory to build the list of containers to restore (i.e. put in the restore-local containers slice). If the rwlayer was missing, it'd log the error message and the early-return I'm removing in this PR would prevent the container from being added to daemon.containers and daemon.containersReplica.
Starting with 9f5f4f5, loadContainers just scans the state directory to build the list of containers, without checking rwlayers. Then, that list is passed to restore, which does the check. If the check fails, the container isn't register()'d (i.e. not added to daemon.containers). However, the whole of containers is used to rebuild daemon.containersReplica.
Hence, I believe that this is an old bug (probably going back to the introduction of containerd snapshotter support) that no one spotted beforehand because the only and very subtle symptom was a state dir with more containers than returned by docker ps -a.
There was a problem hiding this comment.
It looks like we have a sentinel error for missing layers (which .. probably should be changed to implement errdefs.NotFound); should we handle "not found" errors different from other errors here?
moby/daemon/internal/layer/layer_store.go
Lines 544 to 552 in bdda339
Actually; looks like the containerd snapshotter does have some special handling for "not-found" errors in some parts;
moby/daemon/containerd/image_snapshot.go
Lines 173 to 196 in f74e5d4
There was a problem hiding this comment.
It looks like we have a sentinel error for missing layers (which .. probably should be changed to implement errdefs.NotFound); should we handle "not found" errors different from other errors here?
No, we should handle all errors the same way to ensure that the container is added to both daemon.containers and daemon.containersReplica.
| logger.WithError(err).Error("failed to load container mount") | ||
| return | ||
| } | ||
| c.RWLayer = rwlayer |
There was a problem hiding this comment.
Can we put this in an else or something similar? Currently, it means that we're using a value that was produced by a function that returned an error, which means undocumented behavior; we should probably make that explicit, and only use the value for non-error returns?
It's risky to keep this as-is, as either a linter will start complaining, or someone stumbles on this code and may change it (e.g. re-introduce an early return).
There was a problem hiding this comment.
Can we put this in an else or something similar?
Done
someone stumbles on this code and may change it (e.g. re-introduce an early return).
I wrote this comment specifically to ensure that no one is going to re-add an early-return here:
// A container without a rwlayer is in a bad state, but we must register that container to let users
// remove it. So, log the error but do not early-return.
| // A container without a rwlayer is in a bad state, but we must register that container to let users | ||
| // remove it. So, log the error but do not early-return. | ||
| logger.WithError(err).Error("failed to load container mount") | ||
| return |
There was a problem hiding this comment.
It looks like we have a sentinel error for missing layers (which .. probably should be changed to implement errdefs.NotFound); should we handle "not found" errors different from other errors here?
moby/daemon/internal/layer/layer_store.go
Lines 544 to 552 in bdda339
Actually; looks like the containerd snapshotter does have some special handling for "not-found" errors in some parts;
moby/daemon/containerd/image_snapshot.go
Lines 173 to 196 in f74e5d4
| // A container without a rwlayer is in a bad state, but we must register that container to let users | ||
| // remove it. So, log the error but do not early-return. | ||
| logger.WithError(err).Error("failed to load container mount") |
There was a problem hiding this comment.
Do we need to change the state to dead for these?
There was a problem hiding this comment.
I don't think so.
dead is set in a specific code path (i.e. daemon.cleanupContainer) and shows that the Engine stopped while progressing through it. This means, either the container was autoremove, or a ContainerRm API call was in progress.
This PR is addressing another edge case. From the Desktop report:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
44b9cb711bf2 postgres:17.6-alpine3.22 "docker-entrypoint.s…" 3 weeks ago Exited (0) 3 weeks ago
ea85756a438f postgres:17.6-alpine3.22 "docker-entrypoint.s…" 5 weeks ago Exited (0) 5 weeks ago
cbfecd6aedec 88e904502786 "docker-entrypoint.s…" 2 months ago Exited (128) 2 months ago
I have a good explanation of how a container can end up in the Dead state (cf. my previous PR), but I have no explanation for these Exited containers without a rwlayer. Transitioning containers to Dead would remove an important clue that could be useful in future bug reports and investigations.
Moreover, the current expectation is for dead containers to be reclaimed automatically by daemon.restore(). However, that method only reclaims containers which have autoremove set:
Lines 524 to 535 in ec136d7
You pointed out on Slack:
we’d be restoring a container without a writable layer, so we wouldn’t be able to start it (I think?)
Yes, and the restore logic might even try to restart a container without a rwlayer.
daemon.start() calls daemon.conditionalMountOnStart() which calls daemon.Mount(). That method returns an error if the container has no rwlayer:
Lines 1510 to 1512 in ec136d7
So the (re)start flow will fail correctly.
I think we should rely on that instead of doing too much changes in daemon.restore(). Users will be able to docker ps / docker inspect / docker rm their broken containers, but won't be able to start it.
There was a problem hiding this comment.
but I have no explanation for these Exited containers without a rwlayer.
Curious; could this be containers before/after switching to the containerd image store? Do we correctly detect if a container existed, but was created with graph-drivers, and now try to find its writable layer in containerd (e.g.)?
However, that method only reclaims containers which have autoremove set:
That looks like a potential bug, or at least to my knowledge, a "DEAD" container should always be attempted to be removed, as it's no longer functional, so the only thing that can be done is for it to be garbage collected 🤔
Users will be able to docker ps / docker inspect / docker rm their broken containers, but won't be able to start it.
Hm, right; but do we correctly handle things like docker commit etc?
7cfccb6 to
283c686
Compare
|
Moving to 29.3 to unblock 29.2. We can still backport it to 29.2.X later as it's a bugfix. |
Under some undetermined circumstances, a container might end up with no rwlayer on daemon restart. The restore logic would add them to the containersReplica list but not to the 'containers' list. Thus, users would see them on `docker ps -a`, but would be unable to remove them. To fix that, add these containers to the 'containers' list as well. Signed-off-by: Albin Kerouanton <[email protected]>
283c686 to
968fb57
Compare
docker compose updocker/desktop-linux#309- What I did
Under some undetermined circumstances, a container might end up with no rwlayer on daemon restart. The restore logic would add them to the containersReplica list but not to the 'containers' list. Thus, users would see them on
docker ps -a, but would be unable to remove them.To fix that, add these containers to the 'containers' list as well.
- Human readable description for the release notes