Description
containerd cleans up everything in tmpmounts when it starts up. Starting with
1e3d10d ("Make ovl idmap mounts read-only"), at least on my systems, many more
mounts end up being leaked during containerd's lifetime, to the point where it's a concern
in the periods between containerd restarts.
# On a system that's been taking quick pods all night and is now empty
$ ls /tmp/ | grep ovl-idmapped | wc -l
0
$ ls /mnt/containerd/tmpmounts/ | grep ovl-idmapped | wc -l
2204
$ mount | grep tmpmounts | wc -l
2204
$ mount | grep ovl-idmapped | wc -l
2204
These mounts are made as part of WithUserID() when running user-namespace-enabled pods, and they come from containerd proper. The shims also make these sorts of mounts, but theirs are under /tmp/ instead and don't end up leaking in my tests.
The mentioned commit has some background over here, where it's also reported that @mbaynton and @rata saw similar issues, whereas @fuweid did not.
For the life of me (i.e. an hour or two of poking around with bpf tools like opensnoop), I can't figure out why exactly those mounts report -EBUSY on my system, but they do on both 2.1 and main.
Leaking these mounts is bad: the mount table gets quite large, which can lead to significant systemd slowdowns, etc.
I'm wondering if we should go back to MNT_DETACH, or add a retry mechanism on -EBUSY similar to other umounts in containerd. Hiding issues with MNT_DETACH isn't ideal, so maybe the latter is the safer bet.
For now I'm setting up a cron job on my systems to clean up mounts in that folder that are considered "old enough"; otherwise this would impact us too much.
Steps to reproduce the issue
- Run some user namespace enabled pods on containerd versions containing 1e3d10d
- Watch mounts hang around in <root>/tmpmounts/ovl-idmapped*
Describe the results you received and expected
Expected: very few, if any, mounts are leaked. Received: thousands of leaked ovl-idmapped* mounts accumulating between restarts.
What version of containerd are you using?
v2.1.0-226-g43f9cdd3b 43f9cdd
Any other relevant information
No response
Show configuration if it is related to CRI plugin.
No response