Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set daemon root to use shared propagation #36096

Merged
merged 1 commit into from
Jan 24, 2018

Conversation

cpuguy83
Copy link
Member

This change sets an explicit mount propagation for the daemon root.
This is useful for people who need to bind mount the docker daemon root
into a container.

Since bind mounting the daemon root should only ever happen with at
least rlsave propagation (to prevent the container from holding
references to mounts making it impossible for the daemon to clean up its
resources), we should make sure the user is actually able to this.

Most modern systems have shared root (/) propagation by default
already, however there are some cases where this may not be so
(e.g. potentially docker-in-docker scenarios, but also other cases).
So this just gives the daemon a little more control here and provides
a more uniform experience across different systems.

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🐻

@@ -1169,6 +1170,12 @@ func setupDaemonRoot(config *config.Config, rootDir string, rootIDs idtools.IDPa
}
}
}

if err := ensureSharedOrSlave(config.Root); err != nil {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone has set MountFlags=Private on docker.service in systemd, won't this result in dockerd root being converted to shared, thereby negating the operator's preference to isolate docker mounts? If so, and that was their intent, we should recommend they switch to MountFlags=slave.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With MountFlags=Private, this would till be disconnected from the parent mount namespace even though we've set it to shared.
So new mounts would not leak in or out (to/from the parent).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

works for me.

This change sets an explicit mount propagation for the daemon root.
This is useful for people who need to bind mount the docker daemon root
into a container.

Since bind mounting the daemon root should only ever happen with at
least `rlsave` propagation (to prevent the container from holding
references to mounts making it impossible for the daemon to clean up its
resources), we should make sure the user is actually able to this.

Most modern systems have shared root (`/`) propagation by default
already, however there are some cases where this may not be so
(e.g. potentially docker-in-docker scenarios, but also other cases).
So this just gives the daemon a little more control here and provides
a more uniform experience across different systems.

Signed-off-by: Brian Goff <[email protected]>
@cpuguy83 cpuguy83 force-pushed the use_rshared_prop_for_daemon_root branch from 1954c67 to a510192 Compare January 23, 2018 22:17
@kolyshkin
Copy link
Contributor

LGTM

Copy link
Member

@yongtang yongtang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yongtang yongtang merged commit 3ca99ac into moby:master Jan 24, 2018
@tonistiigi
Copy link
Member

Where is this mount cleaned up?

This is useful for people who need to bind mount the docker daemon root
into a container.

Why do we want to allow that? We are setting the parent directories of graph-dir and container dir to private/unbindable (in part) to avoid these accidental leaks and possible EBUSY. If something outside is tracking the graphdriver mounts then the reference counter in Docker gets confused and will delete mounted data (or at least in some versions in aufs fail doing that).

@cpuguy83
Copy link
Member Author

@tonistiigi

Where is this mount cleaned up?

Nice catch, I'll follow up to clean that up on shutdown.

Why do we want to allow that?

It's already alllowed today, lots of things do it including Kube, cadavisor, and even docker-ee.
We could restrict this even farther to say you can only mount with slave propagation (e.g. no changes in a container would propagate to the host), but I didn't feel like it's totally necessary to make such a restriction since someone has to actually specify shared propagation.... there must be some reason they decided to.

We are setting the parent directories of graph-dir and container dir to private/unbindable

We pretty much need to stop setting private propagation on these things because it is really causing a lot of these issues. See #36047 for this change and explanation.

We could do unbindable... and maybe we should still... but I worry about this breaking people. I definitely don't think we should set the graph dir itself to unbindable because this would prevent any sort of way to do analysis on the graph driver dir from a container. This would mean setting up a new sub-dir (like we did for IPC and secret mounts on containers) and adjusting the graphdriver implementation to use it, but also be backwards compatible for old containers (live restore at least).

@cpuguy83 cpuguy83 deleted the use_rshared_prop_for_daemon_root branch January 24, 2018 23:03
@cpuguy83
Copy link
Member Author

Also, maybe missing #36055 as context to also fix issues with people bind mounting the daemon root.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants