Review / revisit systemd unit files #73

@thaJeztah

We currently have two different unit files: one for .deb based packages, and one for .rpm. The .rpm version currently assumes systemd 226 or older, which is correct for CentOS and RHEL (RHEL 7.4 uses systemd-219-42.el7.x86_64), but incorrect for (at least) Fedora.

Default install of Docker CE 17.07 on Fedora 26:

$ systemctl cat docker.service
# /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
# TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target

Version of systemd running:

$ systemctl --version
systemd 233
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN default-hierarchy=hybrid

Things to notice:

  • We set LimitNOFILE, LimitNPROC and LimitCORE to infinity to prevent overhead due to accounting
  • We don't set TasksMax as it's not supported on older versions of systemd (and those versions are not affected by systemd setting a low value)

Configuring TasksMax

On newer versions of systemd we should set TasksMax, because the default set by systemd is too low. All docker processes, including containers, are started as children of dockerd, so 4915 processes can easily be reached on bigger servers (see moby/moby#23332). It looks like the limit has been raised since the original limit of 512 (systemd/systemd#3211).

$ cat /sys/fs/cgroup/pids/system.slice/docker.service/pids.max
4915

In our .deb packages we automatically set this option based on systemd version; we should have a similar approach for our RPM packages.
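
A minimal sketch of what such a version-based approach could look like for the RPM side. This is not the mechanism the .deb packaging actually uses; the function names are hypothetical, and the threshold of 226 is taken from the comment in the unit file above, so adjust it if that comment turns out to be off.

```shell
# Hypothetical helper: print the systemd version number from the text that
# `systemctl --version` prints (first line looks like "systemd 233").
systemd_major_version() {
    printf '%s\n' "$1" | awk 'NR == 1 { v = $2; sub(/[^0-9].*$/, "", v); print v }'
}

# Hypothetical helper: print the unit file with "# TasksMax=..." uncommented
# when the systemd version meets the threshold; otherwise print it unchanged.
# Assumption: 226 as the default threshold, per the unit file comment.
enable_tasksmax() {
    unit_file=$1
    version=$2
    threshold=${3:-226}
    if [ "$version" -ge "$threshold" ]; then
        sed 's/^#[[:space:]]*TasksMax=/TasksMax=/' "$unit_file"
    else
        cat "$unit_file"
    fi
}

# Possible usage at package build or install time:
#   enable_tasksmax docker.service "$(systemd_major_version "$(systemctl --version)")"
```

The sketch writes to stdout instead of editing in place, so the packaging scripts can decide where the result goes.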

From the systemd man-page:

TasksMax=N
Specify the maximum number of tasks that may be created in the unit. This
ensures that the number of tasks accounted for the unit (see above) stays below
a specific limit. This either takes an absolute number of tasks or a percentage
value that is taken relative to the configured maximum number of tasks on the
system. If assigned the special value "infinity", no tasks limit is applied.
This controls the "pids.max" control group attribute. For details about this
control group attribute, see pids.txt.

Implies "TasksAccounting=true". The system default for this setting may be
controlled with DefaultTasksMax= in systemd-system.conf(5).
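
As a worked example of the percentage form: assuming the system default is 15% (my understanding of the DefaultTasksMax= default in recent systemd releases) and the kernel's usual pid_max of 32768, the arithmetic lands on the 4915 seen in pids.max above. Both numbers are assumptions, not something verified on this host, and the helper name is hypothetical.

```shell
# Hypothetical helper: resolve a percentage TasksMax against a system-wide
# task maximum. Integer division rounds down to a whole number of tasks.
tasks_max_from_percent() {
    percent=$1
    system_max=$2
    echo $(( system_max * percent / 100 ))
}

# Possible usage, reading the kernel setting on the host:
#   tasks_max_from_percent 15 "$(cat /proc/sys/kernel/pid_max)"
```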

Disable accounting (if possible)

Reading the blog post "Enable CPU and Memory accounting for docker (or any systemd service)", I found that systemd has options to disable accounting. We should consider using these options instead of setting the limits to infinity (which does have the same effect). I have not yet found which version of systemd introduced these options.

The following options are available (see systemd.resource-control):

  • MemoryAccounting=no
  • TasksAccounting=no (same result as our current TasksMax=infinity)
  • CPUAccounting=no
  • IOAccounting=no (replaces BlockIOAccounting)
  • BlockIOAccounting=no (deprecated, see IOAccounting)

The defaults on Fedora 26 look like this:

[root@fedora-2gb-ams3-01 ~]# systemctl show docker | grep Accounting
CPUAccounting=no
IOAccounting=no
BlockIOAccounting=no
MemoryAccounting=no
TasksAccounting=yes

Questions to answer

  • Which version of systemd introduced the xxAccounting options?
  • What's the right approach for the RPM packages to override the default based on systemd version? (custom crafted unit file, drop-in file, or modify the unit file like we do for the debs?)
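
To make the drop-in option concrete, here is a rough sketch of a helper that writes one. The function name and the file name tasksmax.conf are hypothetical, and whether a drop-in is the right choice for the RPMs is exactly the open question above.

```shell
# Hypothetical helper: write a drop-in that sets TasksMax for docker.service.
# The file name "tasksmax.conf" is an arbitrary choice for illustration.
write_tasksmax_dropin() {
    dropin_dir=$1    # e.g. /etc/systemd/system/docker.service.d
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/tasksmax.conf" <<'EOF'
[Service]
TasksMax=infinity
EOF
}

# Possible usage (systemd needs a reload to pick up the new drop-in):
#   write_tasksmax_dropin /etc/systemd/system/docker.service.d && systemctl daemon-reload
```

A drop-in leaves the packaged unit file untouched, so it could be shipped or generated only on hosts whose systemd supports TasksMax.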
