Set daemon root to use shared propagation #36096

cpuguy83 · 2018-01-23T19:23:22Z

This change sets an explicit mount propagation for the daemon root.
This is useful for people who need to bind mount the docker daemon root
into a container.

Since bind mounting the daemon root should only ever happen with at
least rslave propagation (to prevent the container from holding
references to mounts, making it impossible for the daemon to clean up its
resources), we should make sure the user is actually able to do this.

Most modern systems have shared root (/) propagation by default
already; however, there are some cases where this may not be so
(e.g. potentially docker-in-docker scenarios, but also other cases).
So this just gives the daemon a little more control here and provides
a more uniform experience across different systems.
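
To illustrate the kind of check the description implies, here is a minimal sketch that decides whether a mount is shared or slave by reading the optional fields of a `/proc/self/mountinfo` line. It is not the PR's `ensureSharedOrSlave`; the helper name `sharedOrSlave` and the sample lines are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// sharedOrSlave reports whether one /proc/self/mountinfo line describes a
// mount with shared or slave propagation. Hypothetical helper, not taken
// from the PR.
func sharedOrSlave(line string) (bool, error) {
	fields := strings.Fields(line)
	// The first six fields are fixed (IDs, device, root, mount point,
	// mount options). Optional fields follow until a lone "-" separator.
	if len(fields) < 7 {
		return false, fmt.Errorf("malformed mountinfo line: %q", line)
	}
	for _, f := range fields[6:] {
		if f == "-" {
			// Separator reached without a shared:/master: tag, so the
			// mount is private (or unbindable).
			return false, nil
		}
		if strings.HasPrefix(f, "shared:") || strings.HasPrefix(f, "master:") {
			return true, nil
		}
	}
	return false, fmt.Errorf("mountinfo line has no separator: %q", line)
}

func main() {
	// Sample lines are made up for this sketch.
	samples := []string{
		"36 35 98:0 / /var/lib/docker rw,noatime shared:1 - ext4 /dev/sda1 rw",
		"37 35 98:0 / /var/lib/docker rw,noatime master:1 - ext4 /dev/sda1 rw",
		"38 35 98:0 / /var/lib/docker rw,noatime - ext4 /dev/sda1 rw",
	}
	for _, s := range samples {
		ok, err := sharedOrSlave(s)
		fmt.Println(ok, err)
	}
}
```

If the check came back false, a daemon could bind mount the root onto itself and mark it shared; that step needs privileges and is left out of the sketch.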

vdemeester

LGTM 🐻

trapier · 2018-01-23T20:24:01Z

daemon/daemon_unix.go

@@ -1169,6 +1170,12 @@ func setupDaemonRoot(config *config.Config, rootDir string, rootIDs idtools.IDPa
 			}
 		}
 	}
+
+	if err := ensureSharedOrSlave(config.Root); err != nil {


If someone has set MountFlags=Private on docker.service in systemd, won't this result in dockerd root being converted to shared, thereby negating the operator's preference to isolate docker mounts? If so, and that was their intent, we should recommend they switch to MountFlags=slave.

With MountFlags=Private, this would still be disconnected from the parent mount namespace even though we've set it to shared.
So new mounts would not leak in or out (to/from the parent).

works for me.

Signed-off-by: Brian Goff <[email protected]>

kolyshkin · 2018-01-24T01:13:30Z

LGTM

yongtang

LGTM

tonistiigi · 2018-01-24T22:38:44Z

Where is this mount cleaned up?

> This is useful for people who need to bind mount the docker daemon root
> into a container.

Why do we want to allow that? We are setting the parent directories of graph-dir and container dir to private/unbindable (in part) to avoid these accidental leaks and possible EBUSY. If something outside is tracking the graphdriver mounts, then the reference counter in Docker gets confused and will delete mounted data (or, at least in some versions with aufs, fail while doing that).
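
The reference-counting concern can be sketched as follows. This is a deliberately simplified, hypothetical counter (the type `refCounter`, the path, and the `externalHolders` map are all assumptions, not Docker's graphdriver code); it only shows how a counter that tracks the daemon's own references can reach zero while something outside still holds the mount.

```go
package main

import (
	"fmt"
	"sync"
)

// refCounter is a hypothetical, simplified per-path reference counter.
// It only knows about references taken through Increment/Decrement.
type refCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newRefCounter() *refCounter {
	return &refCounter{counts: make(map[string]int)}
}

// Increment records one more user of path and returns the new count.
func (c *refCounter) Increment(path string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[path]++
	return c.counts[path]
}

// Decrement drops one user of path (never below zero) and returns the
// new count. A result of 0 is what tells the caller it may unmount.
func (c *refCounter) Decrement(path string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.counts[path] > 0 {
		c.counts[path]--
	}
	return c.counts[path]
}

func main() {
	const layer = "/var/lib/docker/overlay2/abc/merged" // made-up path

	// externalHolders simulates copies of the mount held outside the
	// daemon, for example inside another container's mount namespace.
	externalHolders := map[string]int{layer: 1}

	rc := newRefCounter()
	rc.Increment(layer)
	if rc.Decrement(layer) == 0 {
		// The counter says "unused", yet an outside holder remains.
		fmt.Printf("daemon believes %s is unused; external holders: %d\n",
			layer, externalHolders[layer])
	}
}
```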

cpuguy83 · 2018-01-24T23:03:41Z

@tonistiigi

> Where is this mount cleaned up?

Nice catch, I'll follow up to clean that up on shutdown.

> Why do we want to allow that?

It's already allowed today; lots of things do it, including Kube, cAdvisor, and even docker-ee.
We could restrict this even further to say you can only mount with slave propagation (i.e. no changes in a container would propagate to the host), but I didn't feel it was necessary to make such a restriction: someone has to actually specify shared propagation, so there must be some reason they decided to.

> We are setting the parent directories of graph-dir and container dir to private/unbindable

We pretty much need to stop setting private propagation on these things because it is really causing a lot of these issues. See #36047 for this change and explanation.

We could do unbindable... and maybe we should still... but I worry about this breaking people. I definitely don't think we should set the graph dir itself to unbindable because this would prevent any sort of way to do analysis on the graph driver dir from a container. This would mean setting up a new sub-dir (like we did for IPC and secret mounts on containers) and adjusting the graphdriver implementation to use it, but also be backwards compatible for old containers (live restore at least).
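
For reference, the propagation modes discussed in this comment (shared, slave, private, unbindable, and their recursive variants) correspond to `mount(2)` flags. The sketch below maps the names to the Linux constants in Go's `syscall` package; the helper names `propagationFlags` and `okForDaemonRootBind` are hypothetical, and the second one simply encodes the "at least rslave" rule from the PR description as an assumption about how such a check might look.

```go
package main

import (
	"fmt"
	"strings"
	"syscall"
)

// propagationFlags maps a propagation name to mount(2) flags (Linux only).
// Names with an "r" prefix (rshared, rslave, ...) add MS_REC so the change
// applies to the whole subtree. Hypothetical helper for illustration.
func propagationFlags(name string) (uintptr, error) {
	base := map[string]uintptr{
		"shared":     syscall.MS_SHARED,
		"slave":      syscall.MS_SLAVE,
		"private":    syscall.MS_PRIVATE,
		"unbindable": syscall.MS_UNBINDABLE,
	}
	if f, ok := base[name]; ok {
		return f, nil
	}
	if f, ok := base[strings.TrimPrefix(name, "r")]; ok && strings.HasPrefix(name, "r") {
		return f | syscall.MS_REC, nil
	}
	return 0, fmt.Errorf("unknown propagation mode %q", name)
}

// okForDaemonRootBind encodes the rule from the PR description: the daemon
// root should only be bind mounted with at least rslave propagation.
// Treating rshared as acceptable too is this sketch's assumption.
func okForDaemonRootBind(name string) bool {
	return name == "rslave" || name == "rshared"
}

func main() {
	for _, n := range []string{"rshared", "rslave", "rprivate", "unbindable"} {
		flags, err := propagationFlags(n)
		fmt.Printf("%-10s flags=%#x err=%v ok=%v\n", n, flags, err, okForDaemonRootBind(n))
	}
}
```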

cpuguy83 · 2018-01-24T23:06:48Z

Also, you may be missing #36055 as context; it also fixes issues with people bind mounting the daemon root.

GordonTheTurtle added the status/0-triage label Jan 23, 2018

vdemeester added status/2-code-review and removed status/0-triage labels Jan 23, 2018

vdemeester requested review from tonistiigi, thaJeztah and mlaventure January 23, 2018 19:37

vdemeester approved these changes Jan 23, 2018

trapier reviewed Jan 23, 2018

cpuguy83 force-pushed the use_rshared_prop_for_daemon_root branch from 1954c67 to a510192 January 23, 2018 22:17

cpuguy83 added the rebuild/windowsRS1 label Jan 24, 2018

GordonTheTurtle removed the rebuild/windowsRS1 label Jan 24, 2018

yongtang added the rebuild/windowsRS1 label Jan 24, 2018

GordonTheTurtle removed the rebuild/windowsRS1 label Jan 24, 2018

yongtang approved these changes Jan 24, 2018

yongtang merged commit 3ca99ac into moby:master Jan 24, 2018

cpuguy83 deleted the use_rshared_prop_for_daemon_root branch January 24, 2018 23:03

cpuguy83 mentioned this pull request Jan 29, 2018

Unable to remove a stopped container: device or resource busy #22260

Closed

cpuguy83 mentioned this pull request Feb 7, 2018

[17.12] Set daemon root to use shared propagation docker-archive/docker-ce#416

Merged

tonistiigi mentioned this pull request Feb 15, 2018

Ensure daemon root is unmounted on shutdown #36107

Merged

gbarr01 mentioned this pull request Feb 20, 2018

[17.12] update changelog for 17.12.1-ce-rc2 docker-archive/docker-ce#431

Merged

This was referenced Mar 19, 2018

Container state "Removal In Progress" when ntpd/chronyd is active docker/for-linux#124

Closed

Cannot delete dead containers with overlay2 on RHEL 7.4 #34538

Closed

cpuguy83 mentioned this pull request May 28, 2019

aufs: retry umount on ebusy, ignore ENOENT in graphdriver.Mounted #39270

Merged

cpuguy83 mentioned this pull request Feb 14, 2020

Pod is stuck in terminating due to containerd-shim unmount error. containerd/containerd#4020

Closed

thaJeztah added the area/storage label Jun 22, 2024

thaJeztah added the area/daemon label Jun 22, 2024
