Commit da3dc1e
core/mount: Retry unmounting idmapped directories
When a lot of pods are created all at the same time, the umount fails
with EBUSY sometimes. This causes some mounts to be leaked. A way to
repro this issue is here:
containerd#12139 (comment)
Although lsof/fuser could not catch the culprits in time (the mount is
busy for only a few ms), @fuwei found what is keeping the mount busy:
containerd#12139 (comment)
When we fork in `GetUsernsFD()`, if a call to prepareIDMappedOverlay()
is ongoing and has an open fd to the path to idmap, then the unmount
callback will fail as the forked process still has an open fd to it.
Let's handle the idmap unmounts the same way other temp mounts are
handled: with the Unmount() helper, which retries the unmount.
This retry fixes it 100% of the time on my system and on the systems
of @halaney, who reported the issue too.
This was originally fixed by containerd#10721 by using a detached mount, but the
mount was switched to non-detached again in containerd#10955 and we started to
leak mounts. As @fuwei mentioned, using a detached mount is quite
invisible for the admin and, therefore, a retry is a better alternative.
It seems these umounts are a great candidate for containerd#11303, which will
manage the life-cycle of mounts and can handle these retries whenever
needed.
[1]: To do it, I just run "unshare -m" as root, which creates a mntns
with private propagation, and then run the containerd daemon.
Signed-off-by: Rodrigo Campos <[email protected]>
1 parent 27ba690 commit da3dc1e
1 file changed
Lines changed: 6 additions & 2 deletions
Diff (line content not preserved): original lines 266-273 become new lines 266-277. Original lines 269-270 are removed, new lines 269-274 are added, and the surrounding lines are unchanged context.
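The retry described in the commit message can be sketched roughly as follows. This is a minimal illustration, not containerd's actual Unmount() helper: the names retryUnmount, unmountFunc, and busyThenOK, as well as the attempt count and delay parameters, are hypothetical. The unmount call is injected so the logic can be tried without root.

```go
package main

import (
	"errors"
	"syscall"
	"time"
)

// unmountFunc abstracts the real unmount call so the retry logic can be
// exercised without privileges. Hypothetical type, not part of containerd.
type unmountFunc func(target string) error

// retryUnmount calls unmount up to attempts times. It stops early on success
// or on any error other than EBUSY, and otherwise returns the last EBUSY.
func retryUnmount(unmount unmountFunc, target string, attempts int, delay time.Duration) error {
	var err error
	for i := 0; i < attempts; i++ {
		err = unmount(target)
		if err == nil {
			return nil
		}
		if !errors.Is(err, syscall.EBUSY) {
			// Not a transient "busy" condition, so retrying will not help.
			return err
		}
		if i < attempts-1 {
			// Give the short-lived holder of the fd time to go away.
			time.Sleep(delay)
		}
	}
	return err
}

// busyThenOK returns a fake unmountFunc that fails with EBUSY for the first
// n calls and succeeds afterwards, plus a pointer to its call counter.
func busyThenOK(n int) (unmountFunc, *int) {
	calls := 0
	return func(string) error {
		calls++
		if calls <= n {
			return syscall.EBUSY
		}
		return nil
	}, &calls
}
```

On Linux, the injected function could wrap `syscall.Unmount(target, 0)`. Retrying only on EBUSY keeps genuine failures, such as an invalid target, visible right away instead of hiding them behind the retry loop.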