Docker daemon hanging #25321

@chbatey

I have previously seen the daemon hang under high load on 1.8.3 (#13885). However, this appears to be a different problem: I am now running 1.11.2, and the daemon hung under little to no load.

BUG REPORT INFORMATION

Output of docker version:

[centos@ip-10-50-185-106 ~]$ docker version
Client:
 Version:      1.11.2
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   b9f10c9
 Built:        Wed Jun  1 21:23:11 2016
 OS/Arch:      linux/amd64
Cannot connect to the Docker daemon. Is the docker daemon running on this host?
[centos@ip-10-50-185-106 ~]$ sudo docker version
Client:
 Version:      1.11.2
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   b9f10c9
 Built:        Wed Jun  1 21:23:11 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.11.2
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   b9f10c9
 Built:        Wed Jun  1 21:23:11 2016
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 13
 Running: 12
 Paused: 0
 Stopped: 1
Images: 11
Server Version: 1.11.2
Storage Driver: devicemapper
 Pool Name: direct_lvm-thin_pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 107.4 GB
 Backing Filesystem: xfs
 Data file: 
 Metadata file: 
 Data Space Used: 2.39 GB
 Data Space Total: 66.57 GB
 Data Space Available: 64.18 GB
 Metadata Space Used: 4.375 MB
 Metadata Space Total: 1.074 GB
 Metadata Space Available: 1.069 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Library Version: 1.02.107-RHEL7 (2016-06-09)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: null host bridge
Kernel Version: 3.10.0-327.22.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 6.897 GiB
Name: ip-10-50-185-106.internal
ID: DDCD:TC7W:6V5N:QDUA:YQF6:24EU:5SVR:WY3L:VZ7X:4BRW:NKM4:INSA
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Http Proxy: http://10.50.185.193:80
Https Proxy: http://10.50.185.193:80
No Proxy: 10.50.185.0/24,.internal,registry-k8.api.bskyb.com,master-test-k8.api.bskyb.com,localhost,127.0.0.0/8,::1,/var/run/docker.sock,169.254.169.254
Registry: https://index.docker.io/v1/

Additional environment details (AWS, VirtualBox, physical, etc.):
AWS, CentOS 7

Steps to reproduce the issue:
Unknown

Describe the results you received:
All docker client commands hang

Stracing docker client:

read(6, 0xc8203ea000, 4096)             = -1 EAGAIN (Resource temporarily unavailable)
write(6, "GET /v1.23/containers/json HTTP/"..., 89) = 89
futex(0x21ef2b0, FUTEX_WAIT, 0, NULL

So the client sends its request and then blocks: the daemon never sends a response.
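
For anyone who wants to check for the same symptom without the docker CLI, here is a minimal sketch that sends the request seen in the strace above directly over the daemon's Unix socket, with a timeout. The function name `probe_daemon`, the five-second default, and the three result labels are my own choices for illustration; the socket path is the usual default and may differ on other setups.

```python
import socket


def probe_daemon(sock_path="/var/run/docker.sock", timeout=5.0):
    """Classify the daemon behind sock_path as 'ok', 'hung', or 'down'.

    'hung' means the connection was accepted (or queued) but no response
    arrived within `timeout` seconds, which matches the symptom in this report.
    """
    # Same endpoint the docker client requested in the strace output.
    request = (
        b"GET /v1.23/containers/json HTTP/1.1\r\n"
        b"Host: docker\r\n"
        b"Connection: close\r\n\r\n"
    )
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(sock_path)
        s.sendall(request)
        data = s.recv(4096)
    except socket.timeout:
        # Nothing came back in time: the daemon is not answering.
        return "hung"
    except OSError:
        # Socket missing, permission denied, connection refused, and so on.
        return "down"
    finally:
        s.close()
    return "ok" if data.startswith(b"HTTP/") else "down"


if __name__ == "__main__":
    print(probe_daemon())
```

Run as root (or as a user with access to the socket), this should report `hung` on a node in the state described here, and `down` if the daemon is not running at all.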

Stracing docker daemon:

[root@ip-10-50-185-112 ~]# strace -p 1095
Process 1095 attached
read(46, 

So the daemon appears to be blocked reading from FD 46. lsof shows that, for the daemon (PID 1095), this is a container's init-stderr FIFO:

[centos@ip-10-50-185-112 ~]$ sudo lsof -d 46
COMMAND     PID USER   FD   TYPE             DEVICE SIZE/OFF      NODE NAME
kube-prox   991 root   46u  IPv6              24429      0t0       TCP *:31624 (LISTEN)
docker     1095 root   46r  FIFO               0,18      0t0   3836156 /run/docker/libcontainerd/110d033df2a6bf66ddffce6aeef574148f04b25b61a4f931978933cec4f51116/init-stderr
master     1540 root   46u  unix 0xffff8800e787c740      0t0     24830 public/flush
docker-co  1583 root   46r  FIFO               0,18      0t0     36330 /run/containerd/d6bd98f8998e1e5638a8d0122e44ae452a30ff89a7e6733ab962188762b3785e/init/exit
kubelet    2244 root   46u  sock                0,6      0t0   6102239 protocol: TCPv6
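
The lsof step above lists FD 46 for every process, so the daemon's entry has to be picked out by PID. As a rough sketch, the same lookup can be done for a single process through /proc, and the container ID can be pulled out of the FIFO path. The helper names `fd_target` and `container_for_fifo` are hypothetical, and the path layout is simply what the lsof output shows for 1.11.2; I am assuming it, not quoting a documented interface.

```python
import os
import re

# Layout copied from the lsof output above (Docker 1.11.2). This is an
# assumption based on that one observation, not a stable interface.
LIBCONTAINERD_FIFO = re.compile(
    r"^/run/docker/libcontainerd/([0-9a-f]{64})/([^/]+)$"
)


def fd_target(pid, fd):
    """Return what file descriptor `fd` of process `pid` points at (Linux only)."""
    return os.readlink("/proc/%d/fd/%d" % (pid, fd))


def container_for_fifo(path):
    """Return (container_id, fifo_name) if path matches the layout above, else None."""
    m = LIBCONTAINERD_FIFO.match(path)
    return (m.group(1), m.group(2)) if m else None


if __name__ == "__main__":
    # PID and FD taken from the strace output in this report; run as root.
    target = fd_target(1095, 46)
    print(target, container_for_fifo(target))
```

With the values from this report, the second element of the result would be `init-stderr`, and the first is the ID of the container whose stderr the daemon is waiting on.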

Describe the results you expected:
Docker to work

Additional information you deem important (e.g. issue happens only occasionally):
Happens occasionally

I can provide any additional information today. I have taken this node out of service and left it in the hung state. The only fix is to restart Docker, which I can hold off on for 24 hours.
