Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use fs cgroups by default #17704

Merged
merged 1 commit into from
Nov 19, 2015
Merged

Use fs cgroups by default #17704

merged 1 commit into from
Nov 19, 2015

Conversation

LK4D4
Copy link
Contributor

@LK4D4 LK4D4 commented Nov 4, 2015

Our implementation of systemd cgroups is mixture of systemd api and
plain filesystem api. It's hard to keep it up to date with systemd and
it already contains some nasty bugs with new versions. Ideally it should
be replaced with some daemon flag which will allow to set parent systemd
slice.

@LK4D4
Copy link
Contributor Author

LK4D4 commented Nov 4, 2015

ping @vishh @crosbymichael

@vishh
Copy link
Contributor

vishh commented Nov 4, 2015

LGTM 💯

@hqhq
Copy link
Contributor

hqhq commented Nov 6, 2015

With this, all container processes will in the same service or scope with daemon, so users can not control them by systemd commands, isn't that a regression we need to warn systemd users for now?

I agree we should allow to set parent systemd slice as I suggested in opencontainers/runc#336 (comment), but are we planning to achieve this with plain filesystem api and eventually remove systemd code for good? I think it'll be much easier to keep some systemd api and share others with cgroupfs. Anyway that's another story.

@LK4D4
Copy link
Contributor Author

LK4D4 commented Nov 6, 2015

@hqhq Yeah, actually I plan to introduce daemon-wide cgroup in next release. Also I think all containers will be in PID 1 cgroup, not daemon cgroup.

@hqhq
Copy link
Contributor

hqhq commented Nov 6, 2015

@LK4D4 Agreed, we had proposal for daemon-wide cgroups long time ago, and create containers in root cgroup makes more sense, constraints like cpu.share will act more like they expected, also I think that'll simplify hierarchy and unify behavior between cgroupfs and systemd. 👍

@crosbymichael
Copy link
Contributor

what is a deamon-wide cgroup?

@LK4D4
Copy link
Contributor Author

LK4D4 commented Nov 10, 2015

@crosbymichael like parent for all containers. There is nothing like this in this pr.

@crosbymichael
Copy link
Contributor

The parent is /docker/ right?

@LK4D4
Copy link
Contributor Author

LK4D4 commented Nov 10, 2015

@crosbymichael I'm not sure, for me at least device cgroup in /user.slice/docker, but yes, for other cgroups it's docker.

@hqhq
Copy link
Contributor

hqhq commented Nov 11, 2015

@LK4D4 @crosbymichael I think daemon-wide cgroup is the cgroup for docker daemon, use it like docker daemon -m 2G, currently we create all container processes in daemon's cgroup with cgroupfs. It's not always in the same path because it depend on which cgroup daemon was in.

For example, daemon was in /sys/fs/cgroup/memory/, the root cgroup of memory, so container will in /sys/fs/cgroup/memory/docker/ae65xfe..., but if for devices cgroup, daemon was in /sys/fs/cgroup/devices/child/, container will in /sys/fs/cgroup/devices/child/docker/fdsafw76..., but the parent is always docker.

But the problem is systemd cgroup does not work like that, it always put container into system.slice by default, not into the daemon's cgroup, so that's not consistent. And I think always make containers have a parent docker doesn't make much sense, so I agree with what @LK4D4 will do.

@vishh
Copy link
Contributor

vishh commented Nov 11, 2015

How about providing a daemon flag which defines the default cgroup root for the daemon? It can default to /docker, but on systemd deployments, it can be overridden to /system.slice or some other slice?

@LK4D4
Copy link
Contributor Author

LK4D4 commented Nov 11, 2015

@vishh that's what I had in mind

@crosbymichael
Copy link
Contributor

LGTM

@vishh
Copy link
Contributor

vishh commented Nov 16, 2015

@LK4D4: Can you update this PR to also include a daemon level flag for configuring the parent cgroup of all docker containers?

@LK4D4
Copy link
Contributor Author

LK4D4 commented Nov 16, 2015

@vishh I prefer to do it in separate PR, because it'll be mostly docs PR and needed to be reviewed by more people.

@vishh
Copy link
Contributor

vishh commented Nov 16, 2015

Ok. I just want the configuration to land in the same release.

On Mon, Nov 16, 2015 at 10:44 AM, Alexander Morozov <
[email protected]> wrote:

@vishh https://github.com/vishh I prefer to do it in separate PR,
because it'll be mostly docs PR and needed to be reviewed by more people.


Reply to this email directly or view it on GitHub
#17704 (comment).

@LK4D4
Copy link
Contributor Author

LK4D4 commented Nov 16, 2015

@vishh I'll create issue and add to milestone.

@icecrime
Copy link
Contributor

LGTM

@icecrime
Copy link
Contributor

I think this section needs updating.

@LK4D4
Copy link
Contributor Author

LK4D4 commented Nov 18, 2015

@icecrime Yup, thanks!
ping @thaJeztah @moxiegirl for docs

it is not available, the system uses `cgroupfs`. By default, if no option is
specified, the execdriver first tries `systemd` and falls back to `cgroupfs`.
This example sets the execdriver to `cgroupfs`:
it is not available, the system uses `cgroupfs`. By default `cgroupfs` is used.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you omit the `native.cgroupdriver` option,` cgroupfs` is used.

Minor nit of cours..I just think that "by default" is confusing as we have lots of instances where flags/options have defaults and we have what the system does when a user doesn't specify the flag/option.

Our implementation of systemd cgroups is mixture of systemd api and
plain filesystem api. It's hard to keep it up to date with systemd and
it already contains some nasty bugs with new versions. Ideally it should
be replaced with some daemon flag which will allow to set parent systemd
slice.

Signed-off-by: Alexander Morozov <[email protected]>
@LK4D4
Copy link
Contributor Author

LK4D4 commented Nov 19, 2015

@moxiegirl I changed your suggestion.

@anusha-ragunathan
Copy link
Contributor

LGTM

@moxiegirl
Copy link
Contributor

Thank you @LK4D4 LGTM

@LK4D4
Copy link
Contributor Author

LK4D4 commented Nov 20, 2015

@rhatdan it's not true for my systemd-226, because systemd doesn't put docker in docker.service at all. Could you show your docker.service? Is it in our repo?

@rhatdan
Copy link
Contributor

rhatdan commented Nov 20, 2015

How are you starting docker daemon? In the docker.service unit file?

@rhatdan
Copy link
Contributor

rhatdan commented Nov 20, 2015

Bottom line the container processes all end up being in the same scope as the docker daemon. This means if someone changes the cgroups of the docker daemon, all containers will be effected. If some one of the container processes registers with systemd-machine, and then terminates systemd will attempt to kill all processes in the same scope as the container, meaning all processes under the docker daemon. If you run the docker daemon by hand this means all container processes will be in the same scope as your user process.

@LK4D4
Copy link
Contributor Author

LK4D4 commented Nov 20, 2015

@rhatdan Yes, I'm starting with systemd:

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network.target docker.socket
Requires=docker.socket

[Service]
Type=notify
ExecStart=/usr/bin/docker daemon -H fd://
MountFlags=slave
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity

[Install]
WantedBy=multi-user.target

@rhatdan
Copy link
Contributor

rhatdan commented Nov 20, 2015

What does systemd-cgls show? Where is the docker daemon running? Where is the container process running?

@LK4D4
Copy link
Contributor Author

LK4D4 commented Nov 20, 2015

@rhatdan Sorry, seems like I misunderstood what we're talking here about. I thought we're talking about cgroups overall, but everything was about systemd cgroup. I'll fix this, no problem.

@rhatdan
Copy link
Contributor

rhatdan commented Nov 20, 2015

Right this is all about the "Scope" which in systemd terms is how it groups like processes together. Each user, service, vm gets its own "Scope". The problem with this change is the Docker container processes no longer get their own "Scope", they share the scope with the docker daemon.

@LK4D4
Copy link
Contributor Author

LK4D4 commented Nov 20, 2015

@rhatdan Yeah, it appeared not so easy, because you can't just remove subscope from systemd cgroup :/

@vishh
Copy link
Contributor

vishh commented Dec 1, 2015

I see two sub-issues here:

  1. Containers being grouped as cgroup children of docker daemon - This is undesirable. @LK4D4: I hope the daemon level flag you are planning on adding will land before this PR makes it into a release.
  2. Starting docker container's under a *.scope cgroup - The main goal of this PR was to eliminate explicit systemd support in docker. Assuming that the motivation behind that change is valid, will it be possible to use the --cgroup-parent option to place docker containers under specific scopes @rhatdan? For example, unit files can include the following option --cgroup-parent=/system.slice/%H.scope.

@vishh
Copy link
Contributor

vishh commented Dec 1, 2015

Related issue: #18022

@LK4D4
Copy link
Contributor Author

LK4D4 commented Dec 1, 2015

@vishh I will add flag tomorrow. However "name=systemd" cgroup seems to be usable only through systemd API. So, I'll appreciate any help on this. In worst case we will be forced to use systemd API even in fs cgroups(only for name=systemd).

@rhatdan
Copy link
Contributor

rhatdan commented Dec 1, 2015

I would hope --cgroup-parent would work. We plan on shipping the docker daemon with the systemd support turned on, since we don't agree with this patch, and feel that this is a breaking behaviour.

@tianon
Copy link
Member

tianon commented Dec 1, 2015 via email

@derekwaynecarr
Copy link

Does docker info report back the native.cgroupdriver that is configured?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.