Skip to content

Conversation

@crosbymichael
Copy link
Contributor

We need to add delegate yes to docker's service file so that it can
manage the cgroups of the processes that it launches without systemd
interfering with them and moving the processes after it is reloaded.

       Delegate=
           Turns on delegation of further resource control partitioning to
           processes of the unit. For unprivileged services (i.e. those
           using the User= setting), this allows processes to create a
           subhierarchy beneath its control group path. For privileged
           services and scopes, this ensures the processes will have all
           control group controllers enabled.

This is the proper fix for issue #20152

Signed-off-by: Michael Crosby [email protected]

@crosbymichael
Copy link
Contributor Author

@PierreF @LK4D4 Can you two check this out and see if it fixes your issue. You an do a build off of master because it does not include the libcontainer fix yet so that you can test this independently. It should work with all systemd versions properly.

@LK4D4
Copy link
Contributor

LK4D4 commented Feb 23, 2016

Works for me.

@philips
Copy link
Contributor

philips commented Feb 23, 2016

Nack. We should default to systemd on systemd: #20635

@crosbymichael
Copy link
Contributor Author

@philips why? systemd cgroups gives us a small subset of the features we need and then we have to do everything else via the filesystem api anyways so why depend on it if it provides us with so little?

We also fight with weird bugs all the time that are related to systemd cgroups across different version of os and systemd. It appears that Delegate=yes is exactly what we want from the docker daemon to manage the cgroups of the containers that it launches.

@philips
Copy link
Contributor

philips commented Feb 24, 2016

@crosbymichael Is there a documented chain of issues of where and why this is happening? If we are making changes to default behavior we should document why. The other way to look at this is #20152 was caused by #17704

@crosbymichael
Copy link
Contributor Author

@philips actually #20152 still applies when using systemd cgroup driver. The underlying cause is in systemd and how it handles the children of daemons that write any cgroup changes.

The previous change just made it more obvious to users; it is a long running issue that no one noticed until the change of defaults and there is no other way to properly fix this that i can see.

[Unit]
Description=docker
Documentation=https://docker.com
After=network.target

[Service]
MemoryLimt=3072M
CPUQuota=200%
ExecStartPre=/sbin/modprobe overlay
ExecStart=/usr/local/bin/docker daemon -D --userland-proxy=false -s overlay --exec-opt native.cgroupdriver=systemd
#Delegate=yes

[Unit]
Description=docker
Documentation=https://docker.com
After=network.target

michael|~/development/gocode/src/github.com/docker/docker > cat /proc/17630/cgroup 
10:memory:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
9:hugetlb:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
8:freezer:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
7:net_cls,net_prio:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
6:perf_event:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
5:devices:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
4:cpu,cpuacct:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
3:blkio:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
2:cpuset:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
1:name=systemd:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
michael|~/development/gocode/src/github.com/docker/docker > sudo systemctl daemon-reload 
michael|~/development/gocode/src/github.com/docker/docker > sudo systemctl restart cron
michael|~/development/gocode/src/github.com/docker/docker > cat /proc/17630/cgroup 
10:memory:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
9:hugetlb:/system.slice
8:freezer:/system.slice
7:net_cls,net_prio:/system.slice
6:perf_event:/system.slice
5:devices:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
4:cpu,cpuacct:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
3:blkio:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope
2:cpuset:/system.slice
1:name=systemd:/system.slice/docker-933fca94925897532d07c180bf73552abdba1b709c5f3e283cb2462cbee01f76.scope

@calavera
Copy link
Contributor

This change LGTM. We know as a fact that our implementation will always work for containers started by Docker, we cannot say the same about systemd's implementation unfortunately.

@PierreF
Copy link

PierreF commented Feb 24, 2016

Built crosbymichael/unit-file (839f86c):

$ docker version
Client:
 Version:      1.11.0-dev
 API version:  1.23
 Go version:   go1.5.3
 Git commit:   839f86c
 Built:        Wed Feb 24 09:30:04 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.11.0-dev
 API version:  1.23
 Go version:   go1.5.3
 Git commit:   839f86c
 Built:        Wed Feb 24 09:30:04 2016
 OS/Arch:      linux/amd64

Tested on Ubuntu wily, Ubuntu xenial:

  • first WITHOUT changing unit-file : it failed
  • the by creating /etc/systemd/system/docker.service.d/override.conf : it works
$ cat /etc/systemd/system/docker.service.d/override.conf                                                                                                            
[Service]
Delegate=yes

Also tested on Debian jessie : it does NOT works... but it's because Delegate doesn't exists in this Systemd version:

# journalctl -n1
Feb 24 10:52:37 jessie systemd[1]: [/etc/systemd/system/docker.service.d/override.conf:2] Unknown lvalue 'Delegate' in section 'Service'
# systemctl --version
systemd 215
+PAM +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ -SECCOMP -APPARMOR

@cyphar
Copy link
Contributor

cyphar commented Feb 24, 2016

LGTM. This solves several of the really hairy problems with systemd changing cgroups due to external factors (like the angle of the sun and how many pencils are on the left side of your desk, etc).

@philips
Copy link
Contributor

philips commented Feb 24, 2016

Given that we are trying to work with a thousand different versions of systemd all patched in different ways it seems like this is the best default. LGTM.

@philips
Copy link
Contributor

philips commented Feb 24, 2016

We should probably add a comment about --exec-opt native.cgroupdriver=systemd however to this service file.

@crosbymichael
Copy link
Contributor Author

I'll update with a comment soon

@tianon
Copy link
Member

tianon commented Feb 24, 2016

To be clear, on platforms where Delegate= isn't supported yet (like Debian Jessie, noted by @PierreF above in #20633 (comment)), does this create an error (and thus a daemon that doesn't start), or just a warning?

@crosbymichael
Copy link
Contributor Author

@tianon i don't know if it errors or not, if someone is running 215 can you try really quick?

@PierreF
Copy link

PierreF commented Feb 24, 2016

Just a warning, and Docker daemon is started.

@tianon
Copy link
Member

tianon commented Feb 24, 2016 via email

We need to add delegate yes to docker's service file so that it can
manage the cgroups of the processes that it launches without systemd
interfering with them and moving the processes after it is reloaded.

```
       Delegate=
           Turns on delegation of further resource control partitioning to
           processes of the unit. For unprivileged services (i.e. those
           using the User= setting), this allows processes to create a
           subhierarchy beneath its control group path. For privileged
           services and scopes, this ensures the processes will have all
           control group controllers enabled.
```

This is the proper fix for issue moby#20152

Signed-off-by: Michael Crosby <[email protected]>
@crosbymichael
Copy link
Contributor Author

Ok, I added comments for the cgroup default and the delegate yes section in the file

@philips
Copy link
Contributor

philips commented Feb 25, 2016

lgtm

@crosbymichael
Copy link
Contributor Author

Thanks for the reviews all.

@ngrilly
Copy link

ngrilly commented May 23, 2018

Is it safe to run Docker on Debian Jessie with systemd 215 that doesn't support Delegate=?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.