Proposal: allow running dockerd as an unprivileged user (aka rootless mode) #37375

@AkihiroSuda

POC: https://github.com/AkihiroSuda/docker/commits/rootless

Rootless mode (Experimental)

The rootless mode allows running dockerd as an unprivileged user, using user_namespaces(7), mount_namespaces(7), network_namespaces(7), and VPNKit.

No SUID binary is required except newuidmap and newgidmap.

Requirements

  • newuidmap and newgidmap need to be installed on the host. These commands are provided by the uidmap package.

  • /etc/subuid and /etc/subgid should each contain at least 65536 sub-IDs for the user, e.g. penguin:231072:65536.

$ id -u
1001
$ grep ^$(whoami): /etc/subuid
penguin:231072:65536
$ grep ^$(whoami): /etc/subgid
penguin:231072:65536
  • Some distros, such as Debian and Arch Linux, require running echo 1 > /proc/sys/kernel/unprivileged_userns_clone as root.
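
The requirements above can be checked before starting the daemon. The following is a minimal sketch, not part of Docker or RootlessKit: the function names (subid_count, has_enough_subids, check_rootless_requirements) are hypothetical, and it assumes entries in /etc/subuid and /etc/subgid are keyed by user name rather than by numeric UID.

```shell
#!/bin/sh
# Sketch: preflight checks for the requirements listed above.
# All function names here are hypothetical helpers.

# Print the total number of subordinate IDs allocated to user $1 in file $2
# (file format: name:start:count, one range per line).
subid_count() {
    awk -F: -v user="$1" '$1 == user { total += $3 } END { print total + 0 }' "$2"
}

# Return 0 if file $2 is readable and grants user $1 at least 65536 sub-IDs.
has_enough_subids() {
    [ -r "$2" ] || return 1
    [ "$(subid_count "$1" "$2")" -ge 65536 ]
}

# Run all checks for the current user; report problems on stderr.
check_rootless_requirements() {
    user=$(id -un)
    status=0
    for bin in newuidmap newgidmap; do
        command -v "$bin" >/dev/null 2>&1 || { echo "missing: $bin" >&2; status=1; }
    done
    for f in /etc/subuid /etc/subgid; do
        has_enough_subids "$user" "$f" || { echo "fewer than 65536 sub-IDs for $user in $f" >&2; status=1; }
    done
    # This sysctl file exists only on some distros (e.g. Debian, Arch Linux).
    knob=/proc/sys/kernel/unprivileged_userns_clone
    if [ -r "$knob" ] && [ "$(cat "$knob")" != "1" ]; then
        echo "$knob is not set to 1" >&2
        status=1
    fi
    return "$status"
}
```

Calling check_rootless_requirements returns 0 only when every check passes, so it can gate a launcher script.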

Restrictions

  • Only the vfs graphdriver is supported. However, on Ubuntu and a few other distros, overlay2 and overlay are also supported. Starting with Linux 4.18, we will also be able to implement FUSE snapshotters.
  • Cgroups, AppArmor, and SELinux are disabled at the moment. (FIXME: we could enable Cgroups if configured on the host)
  • Checkpoint is not supported at the moment.
  • Running rootless dockerd inside rootless/rootful dockerd should also be possible, but has not been tested yet.

Usage

Daemon

Before running dockerd you need to unshare userns, mountns, and netns.

You may use RootlessKit for unsharing them and VPNKit for enabling usermode networking.

If your /etc/resolv.conf is managed by systemd or NetworkManager, you need to run RootlessKit with --copy-up=/etc so as to prevent /etc/resolv.conf in the namespace from being unexpectedly unmounted when /etc/resolv.conf is recreated on the host.

Also, currently you need to mount /run/docker as tmpfs before running dockerd, because "/run/docker/libnetwork" is still hard-coded in vendor/github.com/docker/libnetwork/sandbox_externalkey_unix.go.

$ docker-rootlesskit --net=vpnkit --vpnkit-binary=docker-vpnkit --copy-up=/etc \
  sh -ec "mount -t tmpfs none /run/docker; dockerd --experimental"

If the /run/docker mount point is not available on your host, you can create it by running RootlessKit with --copy-up=/run:

$ docker-rootlesskit --net=vpnkit --vpnkit-binary=docker-vpnkit --copy-up=/etc --copy-up=/run \
  sh -ec "mkdir -p /run/docker; mount -t tmpfs none /run/docker; dockerd --experimental"
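
The choice between the two invocations above could be automated with a small wrapper. This is only a sketch: needs_copy_up_run and start_rootless_dockerd are hypothetical names, and the commands are copied from the examples above without change.

```shell
#!/bin/sh
# Sketch: pick the RootlessKit invocation based on whether /run/docker exists.
# Both function names are hypothetical helpers.

# Return 0 when directory $1 (default: /run/docker) does not exist,
# i.e. when the --copy-up=/run variant is needed.
needs_copy_up_run() {
    [ ! -d "${1:-/run/docker}" ]
}

# Start rootless dockerd using whichever variant applies on this host.
start_rootless_dockerd() {
    if needs_copy_up_run; then
        docker-rootlesskit --net=vpnkit --vpnkit-binary=docker-vpnkit \
            --copy-up=/etc --copy-up=/run \
            sh -ec "mkdir -p /run/docker; mount -t tmpfs none /run/docker; dockerd --experimental"
    else
        docker-rootlesskit --net=vpnkit --vpnkit-binary=docker-vpnkit \
            --copy-up=/etc \
            sh -ec "mount -t tmpfs none /run/docker; dockerd --experimental"
    fi
}
```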

Remarks:

  • The socket path is set to /run/user/$UID/docker.sock by default.
  • The data dir is set to ~/.local/share/docker by default.
  • The exec dir is set to /run/user/$UID/docker by default.
  • The config dir is set to ~/.config/docker (not ~/.docker) by default.

Client

You can use the upstream Docker client as-is (without nsenter-ing into the dockerd namespaces), but you need to set the socket path.

$ docker -H unix:///run/user/1001/docker.sock run -d nginx
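
To avoid passing -H on every call, the socket path can be exported through the Docker client's DOCKER_HOST environment variable. The helper below is a sketch; rootless_docker_host is a hypothetical name, and it assumes the default socket location described in the remarks above.

```shell
#!/bin/sh
# Sketch: derive the default rootless socket URL for a user.
# rootless_docker_host is a hypothetical helper.

# Print the socket URL for numeric user ID $1 (default: the current user).
rootless_docker_host() {
    printf 'unix:///run/user/%s/docker.sock\n' "${1:-$(id -u)}"
}

# Usage:
#   export DOCKER_HOST="$(rootless_docker_host)"
#   docker run -d nginx
```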

Exposing ports

In addition to exposing container ports to the dockerd network namespace, you also need to expose the ports in the network namespace to the host network namespace.

e.g.

$ docker-rootlesskit --state=/tmp/foo ...
$ docker -H unix:///run/user/1001/docker.sock run -d -p 80:80 nginx
$ socat -t -- TCP-LISTEN:8080,reuseaddr,fork EXEC:"nsenter -U -n -t $(cat /tmp/foo/child_pid) socat -t -- STDIN TCP4\:127.0.0.1\:80"

In the future, we could integrate RootlessKit into dockerd so that ports in the namespace are exposed automatically.
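
Until then, the socat command from the example can be generated by a small helper. This is a sketch under assumptions: expose_port_cmd is a hypothetical name, the state directory is assumed to contain the child_pid file as in the example, and the function only prints the command so it can be inspected before it is run.

```shell
#!/bin/sh
# Sketch: build the socat forwarding command from the example above.
# expose_port_cmd is a hypothetical helper.

# Print the socat command that forwards host TCP port $2 to TCP port $3
# inside the namespaces of the RootlessKit child recorded in state dir $1.
expose_port_cmd() {
    state_dir=$1 host_port=$2 ns_port=$3
    child_pid=$(cat "$state_dir/child_pid") || return 1
    # '\\:' in the printf format emits a literal '\:', as in the original command.
    printf 'socat -t -- TCP-LISTEN:%s,reuseaddr,fork EXEC:"nsenter -U -n -t %s socat -t -- STDIN TCP4\\:127.0.0.1\\:%s"\n' \
        "$host_port" "$child_pid" "$ns_port"
}

# Usage (with RootlessKit started using --state=/tmp/foo):
#   expose_port_cmd /tmp/foo 8080 80            # inspect the command
#   eval "$(expose_port_cmd /tmp/foo 8080 80)"  # run it
```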
