-
Notifications
You must be signed in to change notification settings - Fork 88
Description
- This is a bug report
- This is a feature request
- I searched existing issues before opening this one
Expected behavior
Spawning processes which use a pseudo TTY is consistently fast.
Actual behavior
Sometimes, spawning processes that use a pseudo TTY inside a container is many times slower than usual.
The first time I've encountered this problem (on a really slow machine), the time to spawn such a process in a container jumped from something reasonable (~0.5 seconds) to something much slower (8-10 seconds).
On faster machines, it's still a reproducible problem (see below). It's much less noticeable as we're talking about smaller numbers, but there's at least a factor of 4-5x slowdown (process spawning goes from taking 0.1 seconds to 0.5+ seconds).
Steps to reproduce the behavior
As described in Additional environment details (below), I've managed to reproduce it on various CentOS 7 machines (I've tried Hetzner Cloud servers and Digital Ocean droplets and it's reproducible on both).
Before describing the steps, let me first define a few files for a "benchmark" Docker image that would be used to illustrate the performance problem.
bench.py:
print('Starting')
import pexpect
import time
s = time.time()
print('Spawn..')
# Well, you would usually do this with an interactive program.
# This is just for illustrative purposes.
pexpect.spawn('whoami')
print(time.time() - s)bench.sh:
#!/bin/bash
# Strace is also available in the image, so running with it is possible
#time docker run -it --rm --security-opt seccomp:unconfined docker-benchmark strace -r python bench.py
time docker run -it --rm docker-benchmark python bench.pySteps to reproduce on a clean CentOS 7 install:
# Make sure we're up to date and have the latest Docker (18.09)
yum update -y
yum install -y wget
wget https://download.docker.com/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
yum install -y docker-ce
systemctl start docker # just start, do not enable (!) on startup
mkdir docker-benchmark
cd docker-benchmark
# Copy over the necessary files (bench.sh, bench.py, Dockerfile)
docker build -t docker-benchmark -f Dockerfile .
# Run this a couple of times. It should be fast (e.g. 0.1 sec.)
./bench.sh
# Restarting will likely cause performance to degrade
systemctl restart docker
# Try benchmarking again now. It should be consistently slower (e.g. 0.4+ sec)
./bench.sh
reboot -f
cd docker-benchmark
systemctl start docker
# Run this a couple of times. It's likely fast again.
./bench.sh
# Restarting will likely cause performance to degrade
systemctl restart docker
# Try benchmarking again now. It should be consistently slower.
./bench.sh
# Downgrade Docker (18.09 -> 18.06)
systemctl stop docker
yum remove -y docker-ce docker-ce-cli
yum install -y docker-ce-18.06.1.ce-3.el7
systemctl start docker
# Try benchmarking again now. It should be fast.
# Restarting Docker, rebooting or whatever, will not slow it down.
./bench.shOutput of docker version:
Client:
Version: 18.09.0
API version: 1.39
Go version: go1.10.4
Git commit: 4d60db4
Built: Wed Nov 7 00:48:22 2018
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 18.09.0
API version: 1.39 (minimum version 1.12)
Go version: go1.10.4
Git commit: 4d60db4
Built: Wed Nov 7 00:19:08 2018
OS/Arch: linux/amd64
Experimental: false
Output of docker info:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 5
Server Version: 18.09.0
Storage Driver: overlay2
Backing Filesystem: xfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: c4446665cb9c30056f4998ed953e6d4ff22c7c39
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 3.10.0-862.14.4.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 991.7MiB
Name: centos-s-1vcpu-1gb-fra1-01
ID: HJ5J:T65L:WR5U:W7GA:X3E7:NSEQ:HBIE:F6DF:B47K:C4OG:K33Q:I3XY
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine
Additional environment details (AWS, VirtualBox, physical, etc.)
I have first encountered this problem on a CentOS 7.5 KVM machine running on Proxmox.
I have since also managed to reproduce it using CentOS 7 on:
- Hetzner Cloud VM servers (€2.99 server in the Falkenstein region)
- Digital Ocean droplets ($5 droplet in the Frankfurt region)
As the flow for reproducing in Steps to reproduce on a clean CentOS 7 install (above) says, this is only an issue with Docker 18.09. Downgrading to 18.06 makes the issue go away.
I have tried to reproduce the problem on Ubuntu 18.04 (LTS), but I haven't managed to.
Or maybe I have.. Actually the performance I've managed to achieve on Ubuntu 18.04 matches the slow benchmarks from Centos 7 + Docker 18.09. Even if I downgrade Docker to 18.06 on Ubuntu 18.04, it's still just as slow.. In summary, it's always slow on Ubuntu, regardless of the Docker version.. So maybe there is a problem there too.
Additional debugging output (strace):
bench.sh (above) also contains a line to run the same benchmark with strace -r.
I've received the following results (last part is most interesting as it shows where the slowdown is).
Slowness at SIGCHLD:
... more stuff above ...
0.000026 stat("/dev/pts/1", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
0.000033 open("/dev/pts/1", O_RDWR|O_NOCTTY) = 6
0.000035 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f9d30ff89d0) = 8
0.000232 close(6) = 0
0.000276 ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
0.000066 ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
0.000117 fstat(5, {st_mode=S_IFCHR|0666, st_rdev=makedev(5, 2), ...}) = 0
0.000039 ioctl(5, TCGETS, {B38400 opost isig icanon echo ...}) = 0
0.000031 fstat(5, {st_mode=S_IFCHR|0666, st_rdev=makedev(5, 2), ...}) = 0
0.000033 ioctl(5, TCGETS, {B38400 opost isig icanon echo ...}) = 0
0.000062 lseek(5, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
0.000056 lseek(5, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
0.000114 close(4) = 0
0.000039 read(3, "", 4096) = 0
0.648181 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=8, si_uid=0, si_status=0, si_utime=27, si_stime=37} ---
0.000037 close(3) = 0
0.000102 ioctl(5, TIOCSWINSZ, {ws_row=24, ws_col=80, ws_xpixel=0, ws_ypixel=0}) = 0
0.000125 close(5) = 0
0.000050 select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=100000}) = 0 (Timeout)
0.100275 wait4(8, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG, NULL) = 8
0.000154 write(1, "0.7546336650848389\n", 190.7546336650848389
) = 19
0.000185 rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f9d3067e0c0}, {sa_handler=0x7f9d30a96af0, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f9d3067e0c0}, 8) = 0
0.006374 sigaltstack(NULL, {ss_sp=0x55aca4608c90, ss_flags=0, ss_size=8192}) = 0
0.000051 sigaltstack({ss_sp=NULL, ss_flags=SS_DISABLE, ss_size=0}, NULL) = 0
0.000242 exit_group(0) = ?
0.001359 +++ exited with 0 +++
or slowness at close():
... more stuff above ...
0.000027 stat("/dev/pts/1", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
0.000032 open("/dev/pts/1", O_RDWR|O_NOCTTY) = 6
0.000038 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f35d72f39d0) = 8
0.000235 close(6) = 0
0.000289 ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
0.000063 ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
0.000114 fstat(5, {st_mode=S_IFCHR|0666, st_rdev=makedev(5, 2), ...}) = 0
0.000040 ioctl(5, TCGETS, {B38400 opost isig icanon echo ...}) = 0
0.000031 fstat(5, {st_mode=S_IFCHR|0666, st_rdev=makedev(5, 2), ...}) = 0
0.000034 ioctl(5, TCGETS, {B38400 opost isig icanon echo ...}) = 0
0.000052 lseek(5, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
0.000057 lseek(5, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
0.000083 close(4) = 0
0.000035 read(3, "", 4096) = 0
0.647527 close(3) = 0
0.000104 ioctl(5, TIOCSWINSZ, {ws_row=24, ws_col=80, ws_xpixel=0, ws_ypixel=0}) = 0
0.000129 close(5) = 0
0.000047 select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=100000}) = ? ERESTARTNOHAND (To be restarted if no handler)
0.000495 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=8, si_uid=0, si_status=0, si_utime=24, si_stime=39} ---
0.000018 select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=99545}) = 0 (Timeout)
0.099814 wait4(8, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG, NULL) = 8
0.000204 write(1, "0.7540490627288818\n", 190.7540490627288818
) = 19
0.000292 rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f35d69790c0}, {sa_handler=0x7f35d6d91af0, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f35d69790c0}, 8) = 0
0.008504 sigaltstack(NULL, {ss_sp=0x5564cd06cc90, ss_flags=0, ss_size=8192}) = 0
0.000052 sigaltstack({ss_sp=NULL, ss_flags=SS_DISABLE, ss_size=0}, NULL) = 0
0.000216 exit_group(0) = ?
0.001180 +++ exited with 0 +++