[SWARM] Very poor performance for ingress network with lots of parallel requests #35082

@vide

Description

Executing a large number of parallel connections against plain Docker and Docker Swarm leads to two completely different performance results, with Swarm being slower by a factor of about 50!
The test is easily reproducible (at least on my VMs) with Siege and the official Nginx image, but I'm actually experiencing the problem in production with our custom Java-based HTTP microservice. I cannot see any obvious error message in the Docker logs or kernel logs.

Steps to reproduce the issue:
Run the nginx container:

[root@stresstest01 ~]# docker run -d --rm --net bridge -m 0b  -p 80:80  --name test nginx
35c231e361d7e5ca73fb1bcfbeeaf57a066da057b708055477855e6d16af575d

Siege the container: the results are good, over 13k trans/sec, and the CPU on stresstest01 is 100% used by the nginx process.

[root@siege01 ~]# siege -b -c 250 -t 20s -f test_vm_docker.txt >/dev/null
** SIEGE 4.0.2
** Preparing 250 concurrent users for battle.
The server is now under siege...

Lifting the server siege...
Transactions:		      260810 hits
Availability:		      100.00 %
Elapsed time:		       19.03 secs
Data transferred:	      140.03 MB
Response time:		        0.02 secs
Transaction rate:	    13705.20 trans/sec
Throughput:		        7.36 MB/sec
Concurrency:		      245.51
Successful transactions:      231942
Failed transactions:	           0
Longest transaction:	        7.03
Shortest transaction:	        0.00

Now, let's try with Docker Swarm (1-node swarm, 1-container stack):

[root@stresstest01 ~]# cat docker-compose.yml 
services:
  server:
    deploy:
      replicas: 1
    image: nginx:latest
    ports:
    - published: 80
      target: 80
version: '3.3'
[root@stresstest01 ~]# docker stack deploy test --compose-file docker-compose.yml 
Creating network test_default
Creating service test_server

After the first run, the results are already far worse than with plain Docker, but after the second it's just a disaster :( Moreover, host CPU is only slightly used, and only by the nginx process. No Docker-related processes (dockerd, containerd, etc.) seem to be consuming significant CPU.

[root@siege01 ~]# siege -b -c 250 -t 20s -f test_vm_docker.txt >/dev/null
** SIEGE 4.0.2
** Preparing 250 concurrent users for battle.
The server is now under siege...

Lifting the server siege...
Transactions:		       65647 hits
Availability:		      100.00 %
Elapsed time:		       19.44 secs
Data transferred:	       35.28 MB
Response time:		        0.07 secs
Transaction rate:	     3376.90 trans/sec
Throughput:		        1.81 MB/sec
Concurrency:		      246.66
Successful transactions:       58469
Failed transactions:	           0
Longest transaction:	        3.02
Shortest transaction:	        0.00
 
[root@siege01 ~]# siege -b -c 250 -t 20s -f test_vm_docker.txt >/dev/null
** SIEGE 4.0.2
** Preparing 250 concurrent users for battle.
The server is now under siege...

Lifting the server siege...
Transactions:		        4791 hits
Availability:		      100.00 %
Elapsed time:		       19.47 secs
Data transferred:	        2.59 MB
Response time:		        1.00 secs
Transaction rate:	      246.07 trans/sec
Throughput:		        0.13 MB/sec
Concurrency:		      245.61
Successful transactions:        4291
Failed transactions:	           0
Longest transaction:	        1.20
Shortest transaction:	        0.00
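
To put a number on the gap between the three runs above, here is a minimal shell sketch. It assumes the Siege summary is available as text on stdin; the function names siege_rate and slowdown are hypothetical helpers introduced only for illustration and are not part of Siege or Docker.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Siege or Docker).

# Read a Siege summary on stdin and print the "Transaction rate" value.
siege_rate() {
    awk -F: '/^Transaction rate:/ { split($2, f, " "); print f[1] }'
}

# Print how many times slower the second rate is than the first.
slowdown() {
    awk -v base="$1" -v cur="$2" 'BEGIN { printf "%.1f\n", base / cur }'
}

# Rates copied from the three runs reported above.
slowdown 13705.20 3376.90   # plain Docker vs. first Swarm run, prints 4.1
slowdown 13705.20 246.07    # plain Docker vs. second Swarm run, prints 55.7
```

With the figures reported here, the first Swarm run is roughly 4x slower than plain Docker and the second roughly 56x slower, which matches the "about 50x" estimate.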

Describe the results you received:
Good performance with plain Docker.
Very bad performance with Docker Swarm enabled.

Describe the results you expected:
Similar performance for the two Docker flavours on the same machine.

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

Client:
 Version:      17.09.0-ce
 API version:  1.32
 Go version:   go1.8.3
 Git commit:   afdb6d4
 Built:        Tue Sep 26 22:41:23 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.09.0-ce
 API version:  1.32 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   afdb6d4
 Built:        Tue Sep 26 22:42:49 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 2
Server Version: 17.09.0-ce
Storage Driver: overlay
 Backing Filesystem: xfs
 Supports d_type: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: s2ei2tx1nbf6lgn6d2yi9k782
 Is Manager: true
 ClusterID: s2dwwy929baleeoyk943wh2r9
 Managers: 1
 Nodes: 1
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 192.168.10.187
 Manager Addresses:
  192.168.10.187:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 06b9cb35161009dcb7123345749fef02f7cea8e0
runc version: 3f2f8b84a77f73d38244dd690525642a72156c64
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-693.2.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 7.609GiB
Name: stresstest01
ID: 4XPS:KBEY:W53L:YAK6:4MZL:4HDN:DMUR:DD4T:5RWA:IUK6:522E:TCAL
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):
It's a KVM virtual machine (under oVirt) but the same happens when using a physical machine.
