decompose --rootless into $DOCKER_HONOR_XDG and --exec-opt native.* #39024

Closed
AkihiroSuda wants to merge 1 commit into moby:master from AkihiroSuda:decompose-rootless

@AkihiroSuda
Member

@AkihiroSuda AkihiroSuda commented Apr 8, 2019

- What I did

Fix #38702 #39009

- How I did it

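The --rootless monolith did not work well for several edge cases:

  • #38702: euid=0, $USER="root" but no access to cgroup ("rootful" Docker in rootless Docker)
  • #39009: euid=0 but $USER="docker" (rootful boot2docker)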

This commit decomposes the --rootless monolith into the following environment variable and flags:

  • $DOCKER_HONOR_XDG: honor $XDG_RUNTIME_DIR, $XDG_DATA_HOME, and $XDG_CONFIG_HOME for detecting the default dirs
  • --exec-opt native.cgroupdriver=none: disable cgroups (needs --experimental to be specified together)
  • --exec-opt native.restrict_oom_score_adj=1: restrict oom_score_adj value (needs --experimental to be specified together)

dockerd doesn't try to auto-detect these configurations, but dockerd-rootless.sh now launches dockerd with the "rootless flags" by default: --experimental --exec-opt native.cgroupdriver=none --exec-opt native.restrict_oom_score_adj=1 --userland-proxy --userland-proxy-path=$(which rootlesskit-docker-proxy)
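As a rough illustration of the $DOCKER_HONOR_XDG idea, here is a minimal Go sketch of how default directories might be chosen. It is not dockerd's actual code: `resolveDefaultDirs` and `defaultDirs` are hypothetical names, the `docker` subdirectory under the data and config homes is an assumption, and any non-empty value of the variable is assumed to enable the behaviour. Only the `$XDG_RUNTIME_DIR/docker.sock` socket path comes from the verification steps in this description.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// defaultDirs holds the default locations the daemon would use.
type defaultDirs struct {
	SocketPath string // API socket
	DataRoot   string // images, containers, volumes
	ConfigDir  string // daemon configuration
}

// resolveDefaultDirs is a hypothetical helper, not dockerd's real code.
// getenv is injected so the logic can be exercised without touching the
// process environment.
func resolveDefaultDirs(getenv func(string) string) defaultDirs {
	// Rootful defaults.
	dirs := defaultDirs{
		SocketPath: "/var/run/docker.sock",
		DataRoot:   "/var/lib/docker",
		ConfigDir:  "/etc/docker",
	}
	// Assumption: any non-empty value turns the XDG lookup on.
	if getenv("DOCKER_HONOR_XDG") == "" {
		return dirs
	}
	// Each XDG variable overrides its default only when it is set.
	if v := getenv("XDG_RUNTIME_DIR"); v != "" {
		dirs.SocketPath = filepath.Join(v, "docker.sock")
	}
	if v := getenv("XDG_DATA_HOME"); v != "" {
		dirs.DataRoot = filepath.Join(v, "docker") // assumed subdirectory
	}
	if v := getenv("XDG_CONFIG_HOME"); v != "" {
		dirs.ConfigDir = filepath.Join(v, "docker") // assumed subdirectory
	}
	return dirs
}

func main() {
	fmt.Printf("%+v\n", resolveDefaultDirs(os.Getenv))
}
```

With the variable unset, the sketch returns the rootful defaults unchanged, which is the behaviour the "rootful"-in-rootless case needs.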

- How to verify it

$ dockerd-rootless.sh
$ docker -H unix://$XDG_RUNTIME_DIR/docker.sock info
...
Cgroup Driver: none
...

(Note: Security Options no longer contains "rootless")

$ docker -H unix://$XDG_RUNTIME_DIR/docker.sock run ...

- Description for the changelog

decompose --rootless into $DOCKER_HONOR_XDG and --exec-opt native.*

- A picture of a cute animal (not mandatory but encouraged)
🐧

@codecov

codecov bot commented Apr 8, 2019

Codecov Report

Merging #39024 into master will decrease coverage by 0.03%.
The diff coverage is 19.23%.

@@            Coverage Diff             @@
##           master   #39024      +/-   ##
==========================================
- Coverage   37.01%   36.97%   -0.04%     
==========================================
  Files         612      612              
  Lines       45390    45441      +51     
==========================================
+ Hits        16799    16800       +1     
- Misses      26307    26352      +45     
- Partials     2284     2289       +5

@cpuguy83
Member

cpuguy83 commented Apr 8, 2019

I applaud the desire to split these up into more explicit options, however I don't like moving even more rootless logic to a helper script.

Having a mode whereby a user can just specify something like --rootless is extremely nice. I think the issue is just with the auto detection.

I guess this is fine to fix the bug, but would definitely like to see work around the UX for rootless.

@cpuguy83
Member

cpuguy83 commented Apr 8, 2019

Also, wondering if we can reproduce this issue in a test.

@tonistiigi
Member

I don't understand the issue precisely, or why we can't just ignore $USER when --rootless is not set, but adding DOCKER_HONOR_XDG seems to limit the cases where rootless works without any extra configuration. Also, I'm not sure about the --exec-opt flags. Normally we would not have options like these if we didn't have rootless mode. And if we can support these features with rootless in the future, the options would become meaningless and a security risk.

@AkihiroSuda
Member Author

A quick workaround for the boot2docker issue #39009 would be to just stop setting --rootless automatically depending on the $USER value, but it won't fix the rootful-docker-in-rootless-docker issue #38702, which only requires --exec-opt native.cgroupdriver=none --exec-opt native.restrict_oom_score_adj=1.
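To make the two failure modes concrete, here is a small Go sketch. `guessRootlessFromUser` is a hypothetical stand-in for a $USER-based heuristic (the check dockerd actually used may have differed), and the `needRestrictions` field is my reading of each issue: #38702 needs the cgroup and oom_score_adj restrictions despite looking like root, while #39009 is a genuinely rootful daemon.

```go
package main

import "fmt"

// guessRootlessFromUser is a hypothetical stand-in for a $USER-based
// heuristic; the check dockerd actually used may have differed.
func guessRootlessFromUser(user string) bool {
	return user != "root"
}

// scenario captures the facts listed for each issue, plus whether the daemon
// really needs its rootless restrictions there (an interpretation, not data
// taken from the issues themselves).
type scenario struct {
	issue            string
	euid             int
	user             string
	needRestrictions bool
}

func scenarios() []scenario {
	return []scenario{
		// Rootful Docker in rootless Docker: looks like root, but has no cgroup access.
		{issue: "#38702", euid: 0, user: "root", needRestrictions: true},
		// Rootful boot2docker: real root whose $USER happens to be "docker".
		{issue: "#39009", euid: 0, user: "docker", needRestrictions: false},
	}
}

// misdetected reports whether the heuristic disagrees with what the
// scenario needs.
func misdetected(s scenario) bool {
	return guessRootlessFromUser(s.user) != s.needRestrictions
}

func main() {
	for _, s := range scenarios() {
		fmt.Printf("%s: euid=%d USER=%q guessed rootless=%v misdetected=%v\n",
			s.issue, s.euid, s.user, guessRootlessFromUser(s.user), misdetected(s))
	}
}
```

Under these assumptions the heuristic is wrong in both directions, which is why dropping the $USER check alone only addresses #39009.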

@tianon
Member

tianon commented Apr 9, 2019 via email

@tianon
Member

tianon commented Apr 9, 2019 via email

@AkihiroSuda
Member Author

@cpuguy83 @tonistiigi WDYT on #39024 (comment) ?

@AkihiroSuda
Member Author

rebased

@tonistiigi
Member

> but it won't fix the rootful-docker-in-rootless-docker issue #38702 which only requires --exec-opt native.cgroupdriver=none --exec-opt native.restrict_oom_score_adj=1

When I tried this in #38817 I still had to set --rootless. I did not invoke dockerd through rootlesskit, so --rootless there meant "I know I'm already in a rootless container, so reuse the parent's rootless limits". So can't we just test for these cases and allow them when --rootless has been set, since --rootless already implies that the user is OK with these limitations?

@AkihiroSuda
Member Author

The --rootless monolith is not extensible for future potential support of --exec-opt native.cgroupdriver=cgroup2-rootless or --exec-opt native.cgroupdriver=pam_cgfs

@AkihiroSuda AkihiroSuda force-pushed the decompose-rootless branch 2 times, most recently from 0c2a26d to 23caa41 on April 13, 2019 13:36

@AkihiroSuda
Member Author

I'd like to either get this merged or find an alternative way soon, so that we can provide almost stable package before DockerCon.

@tonistiigi
Member

> The --rootless monolith is not extensible for future potential support of --exec-opt native.cgroupdriver=cgroup2-rootless or --exec-opt native.cgroupdriver=pam_cgfs

I think we can discuss it again when these become a reality. If there is a use case for actually choosing between different drivers, --exec-opt is probably the right flag. But at the moment we are just determining a reduced feature set so that rootless mode (dockerd inside a rootless container) can start.

@AkihiroSuda
Member Author

Note that --exec-opt native.cgroupdriver=<driver> is not new: https://docs.docker.com/engine/reference/commandline/dockerd/#docker-runtime-execution-options

> The native.cgroupdriver option specifies the management of the container’s cgroups. You can only specify cgroupfs or systemd. If you specify systemd and it is not available, the system errors out. If you omit the native.cgroupdriver option, cgroupfs is used.

Also, it has been very confusing that the cgroup driver value is effectively ignored when the daemon is running in "rootless" mode. So I feel --exec-opt native.cgroupdriver=none is better for making it clear that cgroups are disabled.

@tonistiigi
Member

> Note that --exec-opt native.cgroupdriver=<driver> is not new:

Yes, I had forgotten that. So it shouldn't be an issue if there is a need for a special driver for cgroup2 in the future (although ideally that can still be avoided).

My thinking is that if the user is running rootless containers, --rootless should be enough and --exec-opt native.cgroupdriver=none should not be needed. And if the user is running without any interest in rootless containers, there shouldn't be an option for --exec-opt native.cgroupdriver=none, unless there is another practical use case. Using none has a security impact that users might not realize, and is generally a bad idea if your system is cgroup capable.

The --rootless monolith did not work well for several edge cases:
* moby#38702: euid=0, $USER="root" but no access to cgroup ("rootful" Docker in rootless Docker)
* moby#39009: euid=0 but $USER="docker" (rootful boot2docker)

This commit decomposes the --rootless monolith into the following environment variable and flags:
* $DOCKER_HONOR_XDG: honor $XDG_RUNTIME_DIR, $XDG_DATA_HOME, and $XDG_CONFIG_HOME for detecting the default dirs
* --exec-opt native.cgroupdriver=none: disable cgroups (needs --experimental to be specified together)
* --exec-opt native.restrict_oom_score_adj=1: restrict oom_score_adj value (needs --experimental to be specified together)

dockerd doesn't try to auto-detect these configurations, but dockerd-rootless.sh now launches dockerd with the "rootless flags" by default:
--experimental --exec-opt native.cgroupdriver=none --exec-opt native.restrict_oom_score_adj=1 --userland-proxy --userland-proxy-path=$(which rootlesskit-docker-proxy)

Fix moby#38702 moby#39009

Signed-off-by: Akihiro Suda <[email protected]>

@AkihiroSuda
Member Author

> And if the user is running without any interest in rootless containers, there shouldn't be an option for --exec-opt native.cgroupdriver=none, unless there is another practical use case. Using none has a security impact that users might not realize, and is generally a bad idea if your system is cgroup capable.

Updated PR to disallow native.cgroupdriver=none when the daemon is not running in userns. (daemon/daemon_unix.go:VerifyCgroupDriver())

WDYT?
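The restriction described above could look roughly like the following Go sketch. It is not the code in daemon/daemon_unix.go: `verifyCgroupDriver` here is a hypothetical standalone function with a simplified signature, and `uidMapInUserNS` is an illustrative helper that inspects the contents of /proc/self/uid_map, relying on the fact documented in user_namespaces(7) that the initial namespace shows the full identity mapping `0 0 4294967295`. The --experimental requirement is taken from the PR description.

```go
package main

import (
	"errors"
	"fmt"
)

// uidMapInUserNS reports whether the given contents of /proc/self/uid_map
// describe a non-initial user namespace. The initial namespace maps the full
// range: "0 0 4294967295".
func uidMapInUserNS(uidMap string) bool {
	var inside, outside, length int64
	if _, err := fmt.Sscanf(uidMap, "%d %d %d", &inside, &outside, &length); err != nil {
		return false // unparsable: assume the initial namespace
	}
	return !(inside == 0 && outside == 0 && length == 4294967295)
}

// verifyCgroupDriver is a hypothetical, simplified validation of the
// native.cgroupdriver exec-opt value.
func verifyCgroupDriver(driver string, experimental, inUserNS bool) error {
	switch driver {
	case "", "cgroupfs", "systemd":
		return nil
	case "none":
		if !experimental {
			return errors.New("native.cgroupdriver=none requires --experimental")
		}
		if !inUserNS {
			return errors.New("native.cgroupdriver=none is only allowed inside a user namespace")
		}
		return nil
	default:
		return fmt.Errorf("unsupported native.cgroupdriver %q", driver)
	}
}

func main() {
	// Example uid_map contents for a namespace that maps only uid 1000 to root.
	inUserNS := uidMapInUserNS("0 1000 1\n")
	fmt.Println(verifyCgroupDriver("none", true, inUserNS))
	fmt.Println(verifyCgroupDriver("none", true, false))
}
```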

> My thinking is that if the user is running rootless containers, --rootless should be enough

As users still need to use dockerd-rootless.sh, this does not seem to improve UX.
In the future we may embed an equivalent of dockerd-rootless.sh into dockerd, though.

Also, the behavior of --rootless is likely to be inconsistent and confusing when running dockerd as a userns root in rootless docker (#38702), because XDG must be ignored and rootlesskit-docker-proxy must not be set in the #38702 use case.

@AkihiroSuda
Member Author

It might be good to reserve the --rootless flag for embedding an equivalent of dockerd-rootless.sh into dockerd in the future.

@cpuguy83
Member

I agree with @tonistiigi, this should just be encompassed by --rootless, which would hopefully just get better (require less from dockerd-rootless.sh) over time.

@AkihiroSuda
Member Author

> this should just be encompassed by --rootless, which would hopefully just get better (require less from dockerd-rootless.sh) over time.

Do you mean you want --rootless to do the equivalent of DOCKER_HONOR_XDG=1 (implementation might be hard) and --userland-proxy-path=$(which rootlesskit-docker-proxy) over time?
The problem is that these flags need to be ignored for the dind case: #39024 (comment)

@AkihiroSuda
Member Author

How about a combined approach: restore --rootless as shorthand for DOCKER_HONOR_XDG=1 --exec-opt native.cgroupdriver=none --exec-opt native.restrict_oom_score_adj=1 --userland-proxy --userland-proxy-path=$(which rootlesskit-docker-proxy)?

For the typical use case, the user only needs to set --rootless via dockerd-rootless.sh.
When running dockerd as root in rootless Docker, the user must not specify --rootless but must specify --exec-opt native.cgroupdriver=none --exec-opt native.restrict_oom_score_adj=1.
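The two cases in this proposal can be sketched as follows. This is only an illustration of the proposed expansion, not an implementation: `expandRootless` and `rootfulInRootlessArgs` are hypothetical names, and the proxy path is passed in rather than looked up so the example stays deterministic.

```go
package main

import (
	"fmt"
	"strings"
)

// cgroupAndOOMOpts returns the two exec-opts that both cases need.
func cgroupAndOOMOpts() []string {
	return []string{
		"--exec-opt", "native.cgroupdriver=none",
		"--exec-opt", "native.restrict_oom_score_adj=1",
	}
}

// expandRootless is a hypothetical expansion of the proposed --rootless
// shorthand into an environment setting plus daemon flags.
func expandRootless(proxyPath string) (env []string, args []string) {
	env = []string{"DOCKER_HONOR_XDG=1"}
	args = append(cgroupAndOOMOpts(),
		"--userland-proxy",
		"--userland-proxy-path="+proxyPath,
	)
	return env, args
}

// rootfulInRootlessArgs covers dockerd running as root inside rootless
// Docker: no XDG handling and no RootlessKit proxy, only the two exec-opts.
func rootfulInRootlessArgs() []string {
	return cgroupAndOOMOpts()
}

func main() {
	// The proxy path below is a placeholder, not a real installation path.
	env, args := expandRootless("/path/to/rootlesskit-docker-proxy")
	fmt.Println("typical:", strings.Join(env, " "), "dockerd", strings.Join(args, " "))
	fmt.Println("rootful-in-rootless: dockerd", strings.Join(rootfulInRootlessArgs(), " "))
}
```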

@cpuguy83
Member

Coming from a mostly naive place here... but I would say let's not set --userland-proxy-path with --rootless.

Why must the XDG dir be ignored (with --rootless) in the root-in-userns case?

AkihiroSuda added a commit to AkihiroSuda/docker that referenced this pull request Apr 25, 2019
The `--rootless` flag had a couple of issues:
* moby#38702: euid=0, $USER="root" but no access to cgroup ("rootful" Docker in rootless Docker)
* moby#39009: euid=0 but $USER="docker" (rootful boot2docker)

To fix moby#38702, XDG dirs are ignored as in rootful Docker, unless the
dockerd is directly running under RootlessKit namespaces.

RootlessKit detection is implemented by checking whether `$ROOTLESSKIT_STATE_DIR` is set.

To fix moby#39009, the non-robust `$USER` check is now completely removed.

The entire logic can be illustrated as follows:

```
withRootlessKit := getenv("ROOTLESSKIT_STATE_DIR") != ""
rootlessMode := withRootlessKit || cliFlag("--rootless")
honorXDG := withRootlessKit
useRootlessKitDockerProxy := withRootlessKit
removeCgroupSpec := rootlessMode
adjustOOMScoreAdj := rootlessMode
```
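For readers who want to experiment with the logic illustrated above, here is a self-contained Go sketch of the same decision table. `rootlessSettings` and `decide` are hypothetical names introduced for illustration, and "set" is assumed to mean a non-empty `$ROOTLESSKIT_STATE_DIR`.

```go
package main

import (
	"fmt"
	"os"
)

// rootlessSettings mirrors the variables in the illustration above.
type rootlessSettings struct {
	RootlessMode              bool
	HonorXDG                  bool
	UseRootlessKitDockerProxy bool
	RemoveCgroupSpec          bool
	AdjustOOMScoreAdj         bool
}

// decide is a hypothetical helper. getenv is injected so the table can be
// checked without modifying the process environment.
func decide(getenv func(string) string, rootlessFlag bool) rootlessSettings {
	// Assumption: a non-empty value means the daemon runs under RootlessKit.
	withRootlessKit := getenv("ROOTLESSKIT_STATE_DIR") != ""
	rootlessMode := withRootlessKit || rootlessFlag
	return rootlessSettings{
		RootlessMode:              rootlessMode,
		HonorXDG:                  withRootlessKit,
		UseRootlessKitDockerProxy: withRootlessKit,
		RemoveCgroupSpec:          rootlessMode,
		AdjustOOMScoreAdj:         rootlessMode,
	}
}

func main() {
	fmt.Printf("from environment, no flag: %+v\n", decide(os.Getenv, false))
	// "Rootful" dockerd inside rootless Docker: flag set, no RootlessKit.
	noKit := func(string) string { return "" }
	fmt.Printf("--rootless without RootlessKit: %+v\n", decide(noKit, true))
}
```

The second call shows the #38702 case: cgroups and oom_score_adj are restricted, while XDG dirs and the RootlessKit proxy stay off.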

Close moby#39024
Fix moby#38702 moby#39009

Signed-off-by: Akihiro Suda <[email protected]>

@AkihiroSuda
Member Author

Closing per #39138

> Why must the XDG dir be ignored (with --rootless) in the root-in-userns case?

Because "rootful"-in-rootless dockerd is expected to behave almost the same as rootful dockerd.
So, the API path should be /var/run/docker.sock, not $XDG_RUNTIME_DIR/docker.sock.

@cpuguy83
Member

But if the user set --rootless why would they expect it to behave the same as rootful?

@AkihiroSuda
Member Author

The same as rootful, but with cgroups disabled.

cpuguy83 added a commit that referenced this pull request Apr 26, 2019
…ative-to-39024

dockerd: fix rootless detection (alternative to #39024)
thaJeztah pushed a commit to thaJeztah/docker that referenced this pull request May 13, 2019
(cherry picked from commit 3518383)
Signed-off-by: Sebastiaan van Stijn <[email protected]>