Skip to content

nspawn: let's mount(/tmp) inside the user namespace#4340

Merged
keszybz merged 1 commit intosystemd:masterfrom
evverx:fix-nspawn-U-failed-unmounting-temp-dir
Oct 11, 2016
Merged

nspawn: let's mount(/tmp) inside the user namespace#4340
keszybz merged 1 commit intosystemd:masterfrom
evverx:fix-nspawn-U-failed-unmounting-temp-dir

Conversation

@evverx
Copy link
Contributor

@evverx evverx commented Oct 11, 2016

Fixes:

host# systemd-nspawn -D ... -U -b systemd.unit=multi-user.target
...
$ grep /tmp /proc/self/mountinfo
154 145 0:41 / /tmp rw - tmpfs tmpfs rw,seclabel,uid=1036124160,gid=1036124160

$ umount /tmp
umount: /tmp: not mounted

$ systemctl poweroff
...
[FAILED] Failed unmounting Temporary Directory.

@keszybz , please, have a look. I know you have already seen

[FAILED] Failed unmounting Temporary Directory.

@evverx evverx added the nspawn label Oct 11, 2016
@evverx
Copy link
Contributor Author

evverx commented Oct 11, 2016

@tixxdz , please, have a look. Seems like you know everything about mount/user-namespace interaction :-)

{ "tmpfs", "/dev/shm", "tmpfs", "mode=1777", MS_NOSUID|MS_NODEV|MS_STRICTATIME, true, false, false },
{ "tmpfs", "/run", "tmpfs", "mode=755", MS_NOSUID|MS_NODEV|MS_STRICTATIME, true, false, false },
{ "tmpfs", "/tmp", "tmpfs", "mode=1777", MS_STRICTATIME, true, false, false },
{ "tmpfs", "/tmp", "tmpfs", "mode=1777", MS_STRICTATIME, true, true, false },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nitpick: please ensure to keep the columns aligned here!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@poettering , oh. Fixed. Thanks!

@poettering
Copy link
Member

lgtm, assuming it works fine. @tixxdz, @keszybz do you want to have a look, too?

Fixes:
host# systemd-nspawn -D ... -U -b systemd.unit=multi-user.target
...
$ grep /tmp /proc/self/mountinfo
154 145 0:41 / /tmp rw - tmpfs tmpfs rw,seclabel,uid=1036124160,gid=1036124160

$ umount /tmp
umount: /root/tmp: not mounted

$ systemctl poweroff
...
[FAILED] Failed unmounting Temporary Directory.
@evverx evverx force-pushed the fix-nspawn-U-failed-unmounting-temp-dir branch 2 times, most recently from 56528f0 to e9d0cc9 Compare October 11, 2016 15:25
@tixxdz
Copy link
Member

tixxdz commented Oct 11, 2016

@evverx hey, the change seems fine, however I think I'm missing some bits about what's actually happening here why it was failing... I would say it depends if you mount it inside user namespace the user will own the filesystem which means it will also relate to the memory controller and if the user can use it to consume memory or maybe there are limits and in that case it may fail due to these strict limits ... ?

Maybe it's better if you keep the original /tmp mount outside + add another /tmp inside userns, and maybe unmount the first tmpfs the one outside during nspawn cleaning as it is done with custom mounts ? or maybe just try to clean it when nspawn is finishing ? otherwise if your patch is the proper fix then lgtm ;-)

Thanks!

@evverx
Copy link
Contributor Author

evverx commented Oct 11, 2016

@tixxdz , thanks!

I think I'm missing some bits about what's actually happening here why it was failing

Well, the minimal reproducer:

# nspawn
root# unshare -m

#outer child
root# mkdir -p tmp
root# mount -t tmpfs tmpfs ./tmp/

root# unshare -m -U -r

#inner child
root# umount ./tmp
umount: /root/tmp: not mounted

root# strace -e umount2 umount ./tmp
umount2("/root/tmp", 0)                 = -1 EINVAL (Invalid argument)
umount: /root/tmp: not mounted
+++ exited with 32 +++

or maybe there are limits and in that case it may fail due to these strict limits ... ?

There are no limits by default. But I guess people limit [email protected], machine.slice or so.

the first tmpfs the one outside during nspawn cleaning as it is done with custom mounts ?

Well, the systemd inside the container tries to umount all mounts on shutdown. So, it gets -EINVAL and produces [FAILED] Failed unmounting .... And actually we always get:

[FAILED] Failed unmounting /run/systemd/nspawn/incoming.

and so on. And I'm not sure what to do about it :)

@keszybz
Copy link
Member

keszybz commented Oct 11, 2016

if you mount it inside user namespace the user will own the filesystem which means it will also relate to the memory controller and if the user can use it to consume memory or maybe there are limits and in that case it may fail due to these strict limits ... ?

That seems like a good thing, no? We want those limits to apply to the container if possible.

@keszybz
Copy link
Member

keszybz commented Oct 11, 2016

Yep, I can confirm that this fixes the error message during shutdown.

@keszybz keszybz merged commit 8492849 into systemd:master Oct 11, 2016
@evverx evverx deleted the fix-nspawn-U-failed-unmounting-temp-dir branch October 12, 2016 01:31
@tixxdz
Copy link
Member

tixxdz commented Oct 12, 2016

@evverx yeh @keszybz it's not clear and probably the safest thing is always mount tmpfs with 'size' option or have memory controller active...

That seems like a good thing, no? We want those limits to apply to the container if possible

Yes but in most these case users have to explicitly set them, if you do mounts inside user namespace, then depending on the fs type, quotas, xattrs, probably security labels and others ... they will be translated into that new user namespace which means an unprivileged user will be privileged in that context, instead of the traditional easy way where you are in init user namespace and you can trust the real root to not fill up disk space, ram or other harmful operations... so you only care about other users, but in this case you have to set it explicitly. Now tmpfs is a bit limited, and also for this case the mount is world writable so probably setting a size or memory resource will do it, however for tmpfs mounts which are mode=755,uid=0, mounting in init user namespace and other user namespaces is not the same, in init user namespace you can trust the real root to not DoS the machine where in other user namespaces you have to set explicitly resource limits for the mapped users... Anyway thanks for fixing ;-)

@evverx evverx restored the fix-nspawn-U-failed-unmounting-temp-dir branch October 12, 2016 20:00
@evverx evverx deleted the fix-nspawn-U-failed-unmounting-temp-dir branch October 12, 2016 20:13
s-urbaniak pushed a commit to s-urbaniak/systemd that referenced this pull request Jan 17, 2017
Fixes:
host# systemd-nspawn -D ... -U -b systemd.unit=multi-user.target
...
$ grep /tmp /proc/self/mountinfo
154 145 0:41 / /tmp rw - tmpfs tmpfs rw,seclabel,uid=1036124160,gid=1036124160

$ umount /tmp
umount: /root/tmp: not mounted

$ systemctl poweroff
...
[FAILED] Failed unmounting Temporary Directory.
crawford added a commit to coreos/systemd that referenced this pull request Jan 17, 2017
backport: "nspawn: let's mount(/tmp) inside the user namespace (systemd#4340)"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Development

Successfully merging this pull request may close these issues.

4 participants