nspawn: let's mount(/tmp) inside the user namespace#4340
nspawn: let's mount(/tmp) inside the user namespace#4340keszybz merged 1 commit intosystemd:masterfrom
Conversation
|
@tixxdz , please, have a look. Seems like you know everything about mount/user-namespace interaction :-) |
src/nspawn/nspawn-mount.c
Outdated
| { "tmpfs", "/dev/shm", "tmpfs", "mode=1777", MS_NOSUID|MS_NODEV|MS_STRICTATIME, true, false, false }, | ||
| { "tmpfs", "/run", "tmpfs", "mode=755", MS_NOSUID|MS_NODEV|MS_STRICTATIME, true, false, false }, | ||
| { "tmpfs", "/tmp", "tmpfs", "mode=1777", MS_STRICTATIME, true, false, false }, | ||
| { "tmpfs", "/tmp", "tmpfs", "mode=1777", MS_STRICTATIME, true, true, false }, |
There was a problem hiding this comment.
minor nitpick: please ensure to keep the columns aligned here!
Fixes: host# systemd-nspawn -D ... -U -b systemd.unit=multi-user.target ... $ grep /tmp /proc/self/mountinfo 154 145 0:41 / /tmp rw - tmpfs tmpfs rw,seclabel,uid=1036124160,gid=1036124160 $ umount /tmp umount: /root/tmp: not mounted $ systemctl poweroff ... [FAILED] Failed unmounting Temporary Directory.
56528f0 to
e9d0cc9
Compare
|
@evverx hey, the change seems fine, however I think I'm missing some bits about what's actually happening here why it was failing... I would say it depends if you mount it inside user namespace the user will own the filesystem which means it will also relate to the memory controller and if the user can use it to consume memory or maybe there are limits and in that case it may fail due to these strict limits ... ? Maybe it's better if you keep the original /tmp mount outside + add another /tmp inside userns, and maybe unmount the first tmpfs the one outside during nspawn cleaning as it is done with custom mounts ? or maybe just try to clean it when nspawn is finishing ? otherwise if your patch is the proper fix then lgtm ;-) Thanks! |
|
@tixxdz , thanks!
Well, the minimal reproducer: # nspawn
root# unshare -m
#outer child
root# mkdir -p tmp
root# mount -t tmpfs tmpfs ./tmp/
root# unshare -m -U -r
#inner child
root# umount ./tmp
umount: /root/tmp: not mounted
root# strace -e umount2 umount ./tmp
umount2("/root/tmp", 0) = -1 EINVAL (Invalid argument)
umount: /root/tmp: not mounted
+++ exited with 32 +++
There are no limits by default. But I guess people limit
Well, the systemd inside the container tries to [FAILED] Failed unmounting /run/systemd/nspawn/incoming.and so on. And I'm not sure what to do about it :) |
That seems like a good thing, no? We want those limits to apply to the container if possible. |
|
Yep, I can confirm that this fixes the error message during shutdown. |
|
@evverx yeh @keszybz it's not clear and probably the safest thing is always mount tmpfs with 'size' option or have memory controller active...
Yes but in most these case users have to explicitly set them, if you do mounts inside user namespace, then depending on the fs type, quotas, xattrs, probably security labels and others ... they will be translated into that new user namespace which means an unprivileged user will be privileged in that context, instead of the traditional easy way where you are in init user namespace and you can trust the real root to not fill up disk space, ram or other harmful operations... so you only care about other users, but in this case you have to set it explicitly. Now tmpfs is a bit limited, and also for this case the mount is world writable so probably setting a size or memory resource will do it, however for tmpfs mounts which are |
Fixes: host# systemd-nspawn -D ... -U -b systemd.unit=multi-user.target ... $ grep /tmp /proc/self/mountinfo 154 145 0:41 / /tmp rw - tmpfs tmpfs rw,seclabel,uid=1036124160,gid=1036124160 $ umount /tmp umount: /root/tmp: not mounted $ systemctl poweroff ... [FAILED] Failed unmounting Temporary Directory.
backport: "nspawn: let's mount(/tmp) inside the user namespace (systemd#4340)"
Fixes:
@keszybz , please, have a look. I know you have already seen