Allow running dockerd as a non-root user (Rootless mode)#38050

Merged
thaJeztah merged 1 commit into moby:master from AkihiroSuda:rootless on Feb 3, 2019

@AkihiroSuda
Member

@AkihiroSuda AkihiroSuda commented Oct 16, 2018

- What I did

Allow running dockerd in an unprivileged user namespace (rootless mode).
Close #37375

No SETUID/SETCAP binary is required, except newuidmap and newgidmap.

For Kubernetes integration, please refer to https://github.com/rootless-containers/usernetes .

This PR contains two commits, but the first one is the same as #38038 (overlayfs in userns for Ubuntu).
I'll rebase this PR when #38038 gets merged.
(Updated: #38083 is merged now)

- How I did it

By using user_namespaces(7), mount_namespaces(7), network_namespaces(7), and slirp4netns.

Please refer to docs/rootless.md for the details.
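
As a rough illustration of the namespace setup described above, here is a minimal Go sketch that starts a child process in new user, mount, and network namespaces. It is not the code from this PR: the helper name `newUsernsCmd` is hypothetical, and it maps only the caller's own UID/GID to root inside the namespace, which an unprivileged process can do without `newuidmap`/`newgidmap`. Mapping the full subordinate ID ranges, as rootless mode needs, requires those helpers. Linux only.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// newUsernsCmd (hypothetical helper) prepares a command that will run in
// fresh user, mount, and network namespaces. Only the caller's own UID and
// GID are mapped (to 0 inside the namespace).
func newUsernsCmd(name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUSER | syscall.CLONE_NEWNS | syscall.CLONE_NEWNET,
		UidMappings: []syscall.SysProcIDMap{
			{ContainerID: 0, HostID: os.Getuid(), Size: 1},
		},
		GidMappings: []syscall.SysProcIDMap{
			{ContainerID: 0, HostID: os.Getgid(), Size: 1},
		},
	}
	return cmd
}

func main() {
	// "id" should report uid=0 inside the namespace when user namespaces
	// are available to unprivileged users on this host.
	if err := newUsernsCmd("id").Run(); err != nil {
		fmt.Println("could not enter the namespaces:", err)
	}
}
```

The new network namespace starts with no connectivity beyond a loopback device, which is the gap that slirp4netns fills in the design above.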

- How to verify it

  • Make sure /etc/subuid and /etc/subgid contain an entry for your user
$ id -u
1001
$ whoami
penguin
$ grep ^$(whoami): /etc/subuid
penguin:231072:65536
$ grep ^$(whoami): /etc/subgid
penguin:231072:65536
  • Start daemon: dockerd-rootless.sh --experimental
  • Start client: docker -H unix://$XDG_RUNTIME_DIR/docker.sock run ...
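
The `grep` check in the first step can also be sketched in Go. This is an illustrative example only: `parseSubIDs` and `subIDRange` are hypothetical names, and the parser assumes the simple `name:start:count` layout shown above (entries keyed by numeric UID are not handled).

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// subIDRange is one "name:start:count" entry from /etc/subuid or /etc/subgid.
type subIDRange struct {
	Start, Count int
}

// parseSubIDs (hypothetical helper) returns the ranges in content that
// belong to user. Malformed lines are skipped.
func parseSubIDs(content, user string) []subIDRange {
	var ranges []subIDRange
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		fields := strings.Split(strings.TrimSpace(sc.Text()), ":")
		if len(fields) != 3 || fields[0] != user {
			continue
		}
		start, errStart := strconv.Atoi(fields[1])
		count, errCount := strconv.Atoi(fields[2])
		if errStart != nil || errCount != nil {
			continue
		}
		ranges = append(ranges, subIDRange{Start: start, Count: count})
	}
	return ranges
}

func main() {
	user := os.Getenv("USER")
	for _, path := range []string{"/etc/subuid", "/etc/subgid"} {
		data, err := os.ReadFile(path)
		if err != nil {
			fmt.Printf("%s: %v\n", path, err)
			continue
		}
		fmt.Printf("%s: %d range(s) for %q\n", path, len(parseSubIDs(string(data), user)), user)
	}
}
```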

Remarks:

  • Some distros such as Debian (excluding Ubuntu) and Arch Linux require sudo sh -c "echo 1 > /proc/sys/kernel/unprivileged_userns_clone".
  • Some distros require sudo modprobe ip_tables iptable_mangle iptable_nat iptable_filter.
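
A small Go sketch for checking the first remark before starting the daemon. The helper name `usernsCloneEnabled` is hypothetical. The sketch assumes that a missing file means the kernel does not carry this distro-specific switch, in which case the remark does not apply.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"strings"
)

const usernsSysctl = "/proc/sys/kernel/unprivileged_userns_clone"

// usernsCloneEnabled (hypothetical helper) interprets the sysctl file content.
func usernsCloneEnabled(content string) bool {
	return strings.TrimSpace(content) == "1"
}

func main() {
	data, err := os.ReadFile(usernsSysctl)
	switch {
	case errors.Is(err, fs.ErrNotExist):
		// Assumption: no such file means the kernel has no such switch.
		fmt.Println("sysctl not present; nothing to enable")
	case err != nil:
		fmt.Println("cannot read sysctl:", err)
	case usernsCloneEnabled(string(data)):
		fmt.Println("unprivileged user namespaces are enabled")
	default:
		fmt.Println("disabled; enable it with the sudo command from the remark above")
	}
}
```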

Restrictions:

  • Only the vfs graphdriver is supported. However, on Ubuntu and a few other distros, overlay2 and overlay are also supported. Starting with Linux 4.18, we will also be able to implement FUSE snapshotters.
  • Cgroups (including docker top) and AppArmor are disabled at the moment. In the future, Cgroups will be optionally available when delegation permission is configured on the host.
  • Checkpoint is not supported at the moment.
  • Running rootless dockerd in rootless/rootful dockerd is also possible, but not fully tested.

- Description for the changelog

Allow running dockerd in an unprivileged user namespace (rootless mode)

- A picture of a cute animal (not mandatory but encouraged)

penguin

https://en.wikipedia.org/wiki/Little_penguin#/media/File:Eudyptula_minor_Bruny_1.jpg

Screenshot:

penguin0@suda-ws01:~$ id
uid=1002(penguin0) gid=1006(penguin0) groups=1006(penguin0)
penguin0@suda-ws01:~$ ps u
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
penguin0 122952  0.0  0.0  21484  5156 pts/3    Ss   16:58   0:00 /bin/bash -l
penguin0 123093  0.0  0.0  21484  5200 pts/4    Ss   16:58   0:00 /bin/bash -l
penguin0 123094  0.0  0.0 134792  2860 pts/4    S    16:58   0:00 (sd-pam)
penguin0 123252  0.0  0.0   4628   784 pts/4    S+   16:58   0:00 /bin/sh /usr/local/bin/dockerd-rootless.sh --experimental
penguin0 123253  0.0  0.0 105772  3696 pts/4    Sl+  16:58   0:00 rootlesskit --net=slirp4netns --mtu=65520 --copy-up=/etc --copy-up=/run /usr/local/bin/dockerd-rootless.sh --experimental
penguin0 123257  0.0  0.0 105516  4024 pts/4    Sl+  16:58   0:00 /proc/self/exe --net=slirp4netns --mtu=65520 --copy-up=/etc --copy-up=/run /usr/local/bin/dockerd-rootless.sh --experimental
penguin0 123265  0.0  0.0   2980  1072 pts/4    S+   16:58   0:00 slirp4netns --mtu 65520 123257 tap0
penguin0 123281  0.0  0.0   4628   828 pts/4    S+   16:58   0:00 /bin/sh /usr/local/bin/dockerd-rootless.sh --experimental
penguin0 123283  0.6  0.8 583536 65728 pts/4    Sl+  16:58   0:00 dockerd --experimental
penguin0 125126  0.0  0.0  38372  3688 pts/3    R+   17:00   0:00 ps u
penguin0@suda-ws01:~$ docker -H unix:///run/user/1002/docker.sock run --rm hello-world

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/
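
The socket address used in the client invocation above follows the `unix://$XDG_RUNTIME_DIR/docker.sock` form from the verification steps. A minimal Go sketch of deriving it, with the hypothetical helper name `rootlessSocketURL`:

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// rootlessSocketURL (hypothetical helper) builds the -H argument for the
// docker client from the value of XDG_RUNTIME_DIR.
func rootlessSocketURL(runtimeDir string) (string, error) {
	if runtimeDir == "" {
		return "", errors.New("XDG_RUNTIME_DIR is not set")
	}
	return "unix://" + filepath.Join(runtimeDir, "docker.sock"), nil
}

func main() {
	url, err := rootlessSocketURL(os.Getenv("XDG_RUNTIME_DIR"))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("docker -H", url, "run --rm hello-world")
}
```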

@AkihiroSuda
Member Author

cc @tonistiigi @tiborvass

@AkihiroSuda AkihiroSuda force-pushed the rootless branch 2 times, most recently from 483ab2e to e183cfb Compare October 16, 2018 08:12
@AkihiroSuda AkihiroSuda changed the title Rootless mode Allow running dockerd as a non-root user (Rootless mode) Oct 16, 2018
@AkihiroSuda AkihiroSuda force-pushed the rootless branch 2 times, most recently from abb3322 to 79c8968 Compare October 16, 2018 17:35
@codecov

codecov Bot commented Oct 16, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@50e63ad).
The diff coverage is 19.23%.

@@            Coverage Diff            @@
##             master   #38050   +/-   ##
=========================================
  Coverage          ?   36.54%           
=========================================
  Files             ?      610           
  Lines             ?    45368           
  Branches          ?        0           
=========================================
  Hits              ?    16581           
  Misses            ?    26497           
  Partials          ?     2290

@sargun
Contributor

sargun commented Oct 17, 2018

How can you delegate cgroups? A piece of work prior to this might be supporting cgroup namespace?

@AkihiroSuda
Member Author

AkihiroSuda commented Oct 17, 2018

> How can you delegate cgroups? A piece of work prior to this might be supporting cgroup namespace?

Cgroups delegation is disabled in this PR and is likely to be a separate PR in the future.

Until we can get full cgroups v2 support in runc (blocked due to lack of freezer and device subsystems, see opencontainers/runc#654), we would need to use pam_cgfs, although it is unlikely to be available on Red Hat distros: containers/podman#1429

Member

@thaJeztah thaJeztah left a comment

Not too familiar with all the requirements to make this work, but had a quick glance over, and left some comments/suggestions 🤗

Comment thread Dockerfile Outdated
Comment thread cmd/dockerd/config_common_unix.go Outdated
Comment thread cmd/dockerd/daemon_unix.go Outdated
Comment thread cmd/dockerd/daemon_unix.go Outdated
Comment thread contrib/dockerd-rootless.sh Outdated
Comment thread pkg/archive/archive_linux.go Outdated
Comment thread pkg/archive/archive_linux.go Outdated
Comment thread pkg/archive/archive_linux.go Outdated
Comment thread pkg/archive/archive_linux.go Outdated
Comment thread pkg/archive/archive_linux.go Outdated
@AkihiroSuda
Member Author

addressed comments

@cpuguy83
Member

Ah I was looking at libproxy in vpnkit and assumed that was what was being used :(

Well, IF these are tcp conns being proxied there, io.Copy will use splice(2) on Linux instead of a user space copy.
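
To make the `io.Copy` remark concrete, here is a minimal Go sketch of a TCP relay. It is not the proxy code from vpnkit or rootlesskit, and every helper name is hypothetical. When both ends are `*net.TCPConn`, `io.Copy` dispatches to `(*net.TCPConn).ReadFrom`, which on Linux can move data with splice(2) instead of copying through a user-space buffer.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// forward copies src to dst, then half-closes dst so the peer sees EOF.
func forward(dst, src *net.TCPConn, done chan<- struct{}) {
	io.Copy(dst, src) // TCP to TCP: eligible for splice(2) on Linux
	dst.CloseWrite()
	done <- struct{}{}
}

// serveProxy accepts one connection on ln and relays it to upstream.
func serveProxy(ln net.Listener, upstream string) {
	client, err := ln.Accept()
	if err != nil {
		return
	}
	defer client.Close()
	backend, err := net.Dial("tcp", upstream)
	if err != nil {
		return
	}
	defer backend.Close()
	done := make(chan struct{}, 2)
	go forward(backend.(*net.TCPConn), client.(*net.TCPConn), done)
	go forward(client.(*net.TCPConn), backend.(*net.TCPConn), done)
	<-done
	<-done
}

// serveEcho accepts one connection and echoes everything back.
func serveEcho(ln net.Listener) {
	c, err := ln.Accept()
	if err != nil {
		return
	}
	defer c.Close()
	io.Copy(c, c)
}

// roundTrip sends msg through the relay to an echo server and returns the reply.
func roundTrip(msg string) (string, error) {
	echoLn, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer echoLn.Close()
	proxyLn, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer proxyLn.Close()
	go serveEcho(echoLn)
	go serveProxy(proxyLn, echoLn.Addr().String())

	conn, err := net.Dial("tcp", proxyLn.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := io.WriteString(conn, msg); err != nil {
		return "", err
	}
	conn.(*net.TCPConn).CloseWrite() // signal end of request
	reply, err := io.ReadAll(conn)
	return string(reply), err
}

func main() {
	reply, err := roundTrip("ping")
	fmt.Println(reply, err)
}
```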

@AkihiroSuda
Member Author

Updated to use prebuilt djs55/vpnkit binary.

Support for non-amd64 and slirp4netns can be discussed in a follow-up PR series.

Comment thread Dockerfile Outdated
Member

Wonder if we can build from source; there's a Dockerfile in the repo to build vpnkit https://github.com/moby/vpnkit/blob/master/Dockerfile, but not sure we should copy that (perhaps the steps from the Dockerfile could be moved into the Makefile?) @djs55 - think that would work?

Member Author

It requires more than 10 minutes...

Any chance to get non-amd64 prebuilt binaries?

Member Author

@djs55 Is it possible to cross-compile vpnkit for non-amd64 targets?

Contributor

I think cross-compilation is actively being worked on in OCaml, see https://discuss.ocaml.org/t/ocaml-cross-compiler/1494 . I don't think it works 100% yet :(

I think the only reliable way to build vpnkit for other targets would be to build on those targets. The current vpnkit CI builds binaries for macOS and Windows (for use in Docker Desktop). Which other targets do you need?

Member Author

Linux for armhf, arm64, s390x, and ppc64le

Member

@AkihiroSuda Btw, we don't have s390x and ppc64le releases for 18.09 anymore.

Contributor

@tonistiigi @thaJeztah does that mean that we can actually consider dropping s390x and ppc64le CI builds from Moby?

Contributor

I haven't been part of this discussion, but I don't see why moby would need to drop them from CI as moby is a source project to Docker 18.xx and future 19.xx products; whether Docker has official releases seems unrelated to whether those architectures are still validated in the moby upstream project during CI.

Member

Yes, no need to drop CI but if it becomes troublesome to support them for new features it is a data point to consider. For example, I think if we can't have these new binaries available on these platforms it shouldn't block this PR.

@AkihiroSuda
Member Author

Alternative plan: can we just remove vpnkit/slirp4netns from make install, and ask users to install it separately?

@thaJeztah
Member

thaJeztah commented Jan 26, 2019 via email

@AkihiroSuda
Member Author

Updated PR. Users now need to install either slirp4netns or vpnkit separately.

We should still include vpnkit in make install bundle, but let's discuss it separately. (And also slirp4netns in some "contrib" binary tgz maybe)

Member

@cpuguy83 cpuguy83 left a comment

LGTM

@cpuguy83
Member

cpuguy83 commented Feb 1, 2019

@thaJeztah You good?

Member

@thaJeztah thaJeztah left a comment

Reviewing from my phone, so just from looking over the changes; left some comments/questions

I'm good to move this forward if those were errors on my side (and this is really cool to see arrive 👌😍🥳)

Comment thread daemon/config/config_unix.go Outdated
Comment thread cmd/dockerd/daemon.go Outdated
Comment thread daemon/daemon_unix.go Outdated
@thaJeztah
Member

@AkihiroSuda are any packaging changes needed for this?

/cc @seemethere

…ode)

Please refer to `docs/rootless.md`.

TLDR:
 * Make sure `/etc/subuid` and `/etc/subgid` contain the entry for you
 * `dockerd-rootless.sh --experimental`
 * `docker -H unix://$XDG_RUNTIME_DIR/docker.sock run ...`

Signed-off-by: Akihiro Suda <[email protected]>
@AkihiroSuda
Member Author

Updated PR

cc @estesp @icecrime 😃

> @AkihiroSuda are any packaging changes needed for this?

I suggest adding dockerd-rootless.sh, rootlesskit and vpnkit to official RPM/DEB and binary tar archive, but not sure how to build VPNKit for non-amd64.
Also, I suggest providing slirp4netns binary as "contrib" pkg.

@thaJeztah
Member

> I suggest adding dockerd-rootless.sh, rootlesskit and vpnkit to official RPM/DEB and binary tar archive, but not sure how to build VPNKit for non-amd64.
> Also, I suggest providing slirp4netns binary as "contrib" pkg.

Makes sense; if you have time, could you try opening a pull request in the https://github.com/docker/docker-ce-packaging repository? Perhaps the packaging team can work on it, but if you can prepare a PR, that may help speed it up 🤗 (feel free to ping me if you need help/input on that one; I'll be on PTO for the next few days, but will try to catch up on notifications)

Member

@thaJeztah thaJeztah left a comment

LGTM! Thanks; this is really cool stuff 🥳

@cyphar
Contributor

cyphar commented Feb 4, 2019

@AkihiroSuda Great work. 🎉

> I suggest adding dockerd-rootless.sh, rootlesskit and vpnkit to official RPM/DEB and binary tar archive, but not sure how to build VPNKit for non-amd64.

Can we use slirp4netns instead of VPNKit? (Asking for openSUSE when we package this.)

@thaJeztah
Member

@cyphar I think the license for slirp4netns was the blocker for bundling it, but you can use it

@cyphar
Contributor

cyphar commented Feb 4, 2019

Right, because we use rootlesskit anyway. Thanks.

flags.Var(&conf.NetworkConfig.DefaultAddressPools, "default-address-pool", "Default address pools for node specific local networks")

// Mostly users don't need to set this flag explicitly.
flags.BoolVar(&conf.Rootless, "rootless", rootless.RunningWithNonRootUsername(), "Enable rootless mode (experimental)")
Member

For anyone looking to thread the needle, this line appears to be the cause of #39009. 👍 ❤️
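
For context on why a default derived from a username can surprise, here is a hedged Go sketch. It assumes, without confirmation from this thread, that a check like `rootless.RunningWithNonRootUsername()` looks at the `USER` environment variable; `nonRootUsername` is a hypothetical stand-in, not the actual implementation. A name-based check and a UID-based check can disagree, for example when a root-owned process inherits `USER` from an unprivileged login.

```go
package main

import (
	"fmt"
	"os"
)

// nonRootUsername is a hypothetical stand-in for a username-based check:
// it reports true when user is set and is not "root".
func nonRootUsername(user string) bool {
	return user != "" && user != "root"
}

func main() {
	// The two answers can differ, since $USER is only an inherited hint
	// while the effective UID reflects the process's real privileges.
	fmt.Println("non-root by $USER:       ", nonRootUsername(os.Getenv("USER")))
	fmt.Println("non-root by effective UID:", os.Geteuid() != 0)
}
```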
