Commit da3dc1e

core/mount: Retry unmounting idmapped directories
When a lot of pods are created all at the same time, the umount sometimes fails with EBUSY. This causes some mounts to be leaked. A way to repro this issue is here: containerd#12139 (comment)

While using lsof/fuser it was not possible to catch the culprits in time (the mount is busy for only a few ms). @fuwei has found what is causing the mount to be busy: containerd#12139 (comment)

When we fork in `GetUsernsFD()`, if a call to prepareIDMappedOverlay() is ongoing and has an open fd to the path to idmap, then the unmount callback will fail, as the forked process still has an open fd to it.

Let's handle the idmap unmounts in the same way other temp mounts are handled, using the Unmount() helper, which retries the unmount. This retry fixes it 100% of the time on my system and on the systems of @halaney, who reported the issue too.

This was originally fixed by containerd#10721 by using a detached mount, but the mount was switched to non-detached again in containerd#10955 and we started to leak mounts. As @fuwei mentioned, a detached mount is quite invisible to the admin and, therefore, a retry is a better alternative.

It seems these umounts are a great candidate for containerd#11303, which will manage the life-cycle of mounts and can handle these retries whenever needed.

[1]: To do it, I just run as root "unshare -m", which creates a mntns with private propagation, and then run the containerd daemon.

Signed-off-by: Rodrigo Campos <[email protected]>
1 parent 27ba690 commit da3dc1e

1 file changed: 6 additions & 2 deletions

core/mount/mount_linux.go
@@ -266,8 +266,12 @@ func doPrepareIDMappedOverlay(tmpDir string, lowerDirs []string, usernsFd int) (
 		return nil, nil, err
 	}
 	cleanMount := func() {
-		if err := unix.Unmount(tempRemountsLocation, 0); err != nil {
-			log.L.WithError(err).Warnf("failed to unmount idmapped directory %s", tempRemountsLocation)
+		// Use the Unmount helper that does retries because there can be easily an open fd
+		// to the idmapped directory and when containerd forks to create a userns fd (maybe
+		// for another container), it will make the mount busy for a few ms.
+		err := Unmount(tempRemountsLocation, 0)
+		if err != nil {
+			log.L.WithError(err).Warnf("failed to unmount idmapped directory %s: %v", tempRemountsLocation, err)
 		}
 	}
 	defer func() {
