Description
When several containers are started concurrently, a container's netns fd may leak into the docker-proxy processes of other containers. As a result, the container's netns is not cleaned up after the container stops.
Some users insert a veth into the container netns manually. When the container netns is destroyed, the veth and its IP/MAC addresses are released automatically.
However, if the container's netns is not cleaned up, the veth (and its IP/MAC) is not released automatically, which affects those users.
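A leaked descriptor matters because a network namespace stays alive for as long as any process holds an open file descriptor that refers to it. A minimal sketch for finding such holders, assuming a Linux /proc layout; the function name `list_netns_fds` and its parameters are hypothetical helpers for illustration, not part of Docker:

```shell
#!/bin/sh
# list_netns_fds PROC_ROOT PREFIX [COMM]   (hypothetical helper)
# Prints "PID FD TARGET" for every open fd whose link target starts with PREFIX.
# If COMM is given, only processes whose comm file equals COMM are reported.
list_netns_fds() {
    proc_root=$1
    prefix=$2
    comm=${3:-}
    for fd in "$proc_root"/[0-9]*/fd/*; do
        # Skip unmatched globs and anything that is not a symlink.
        [ -L "$fd" ] || continue
        target=$(readlink "$fd") || continue
        case $target in
            "$prefix"*) ;;
            *) continue ;;
        esac
        pid_dir=${fd%/fd/*}
        if [ -n "$comm" ]; then
            [ "$(cat "$pid_dir/comm" 2>/dev/null)" = "$comm" ] || continue
        fi
        echo "${pid_dir##*/} ${fd##*/} $target"
    done
}
```

Run as root, for example `list_netns_fds /proc /run/docker/netns/ docker-proxy`; any output line means a docker-proxy process has a file under that path open. The match is on the link target as printed by `readlink`, so it is only expected to work while the netns file still exists at that path.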
Steps to reproduce the issue:
- Create containers concurrently
$ docker run -itd -p $port:80 nginx sh
- In a few minutes, the container netns fds leak into the docker-proxy process
$ for pid in $(pidof docker-proxy); do ll /proc/$pid/fd | grep netns; done
lr-x------ 1 root root 64 Jun 22 10:48 342 -> /run/docker/netns/063bf24fc1d9
lr-x------ 1 root root 64 Jun 22 10:48 347 -> /run/docker/netns/849ee8f704ce
lr-x------ 1 root root 64 Jun 22 10:48 349 -> /run/docker/netns/6e84024c3ee2
lr-x------ 1 root root 64 Jun 22 10:48 329 -> /run/docker/netns/0ca6fb1562d7
lr-x------ 1 root root 64 Jun 22 10:48 309 -> /run/docker/netns/2bcdf542173b
lr-x------ 1 root root 64 Jun 22 10:48 292 -> /run/docker/netns/c4a58dd54666
lr-x------ 1 root root 64 Jun 22 10:48 294 -> /run/docker/netns/b532d0718aa8
lr-x------ 1 root root 64 Jun 22 10:48 275 -> /run/docker/netns/b28eced1b395
lr-x------ 1 root root 64 Jun 22 10:48 135 -> /run/docker/netns/f206782a181b
lr-x------ 1 root root 64 Jun 22 10:48 19 -> /run/docker/netns/71abd2860e80
lr-x------ 1 root root 64 Jun 22 10:48 123 -> /run/docker/netns/978a2a0a1807
lr-x------ 1 root root 64 Jun 22 10:48 104 -> /run/docker/netns/b8aa0b193942
lr-x------ 1 root root 64 Jun 22 10:48 1663 -> /run/docker/netns/cfe1141baca3
lr-x------ 1 root root 64 Jun 22 10:48 1682 -> /run/docker/netns/43c33179c26b
lr-x------ 1 root root 64 Jun 22 10:48 1665 -> /run/docker/netns/775324d4cf1b
lr-x------ 1 root root 64 Jun 22 10:48 1652 -> /run/docker/netns/dd18614755d2
lr-x------ 1 root root 64 Jun 22 10:48 1586 -> /run/docker/netns/e9b67fa1ffb8
...
- The container netns is not cleaned up successfully after the container stops
(1) netns fd leaks into docker-proxy:
[root@host-21 test]# lsof -p 1105
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
docker-pr 1105 root cwd DIR 253,0 4096 64 /
docker-pr 1105 root rtd DIR 253,0 4096 64 /
docker-pr 1105 root txt REG 253,0 3591848 809072665 /usr/bin/docker-proxy
...
docker-pr 1105 root 482r REG 0,3 0 4026540057 /run/docker/netns/08f69b572ca0
(2) get the container id corresponding to /run/docker/netns/08f69b572ca0
[root@host-21 test]# docker inspect -f '{{.Id}} {{.NetworkSettings.SandboxKey}}' $(docker ps -qa) | grep 08f69b572ca0
4a060c4ace9924362d9ff75cc4054762f3834758a6a7da412ac566be1ab314ea /var/run/docker/netns/08f69b572ca0
(3) insert the veth into /run/docker/netns/08f69b572ca0
[root@host-21 test]# ip link add test0 type veth peer name test1
[root@host-21 test]# ip link set dev test0 netns /var/run/docker/netns/08f69b572ca0
[root@host-21 test]#
[root@host-21 test]# ip link | grep test
13852: test1@if13853: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000
[root@host-21 test]# docker exec 4a060c4ace99 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
13853: test0@if13852: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether ee:36:94:96:ba:3a brd ff:ff:ff:ff:ff:ff
13426: eth0@if13427: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:ac:11:00:4c brd ff:ff:ff:ff:ff:ff
inet 172.17.0.76/16 brd 172.17.255.255 scope global eth0
valid_lft forever preferred_lft forever
(4) stop container 4a060c4ace99
[root@host-21 test]# docker stop 4a060c4ace99
4a060c4ace99
(5) veth cannot be released automatically
[root@host-21 test]# ip link | grep test
13852: test1@if13853: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000
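The first reproduction step above ("create containers concurrently") can be scripted roughly as follows. This is only a sketch: the function name `start_concurrently`, the container count, the base port, and the `DOCKER` override variable are assumptions introduced for illustration; the `docker run` arguments are the ones from the report.

```shell
#!/bin/sh
# start_concurrently COUNT BASE_PORT   (hypothetical helper)
# Launches COUNT containers in parallel, publishing host ports
# BASE_PORT, BASE_PORT+1, ... to container port 80, then waits for
# all "docker run" client processes to return.
start_concurrently() {
    count=$1
    base_port=$2
    docker_cmd=${DOCKER:-docker}   # overridable so the loop can be dry-run
    i=0
    while [ "$i" -lt "$count" ]; do
        port=$((base_port + i))
        "$docker_cmd" run -itd -p "$port:80" nginx sh &
        i=$((i + 1))
    done
    wait
}
```

For example, `start_concurrently 50 8000` starts 50 containers on ports 8000 to 8049. How many containers are needed to trigger the leak is not stated in the report, so the count is a guess to be tuned.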
Describe the results you received:
The veth (and the corresponding IP/MAC) is not released automatically after the container stops.
Describe the results you expected:
The veth (and the corresponding IP/MAC) is released automatically after the container stops.
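The expected result can be checked from the host: destroying a netns destroys the veth end inside it, and the host-side peer goes away with it. A small sketch of that check, assuming iproute2's `ip` is installed; the function names and the `IP` override variable are hypothetical:

```shell
#!/bin/sh
# link_exists NAME: succeeds if the host still has a link called NAME.
link_exists() {
    "${IP:-ip}" link show dev "$1" >/dev/null 2>&1
}

# check_released NAME   (hypothetical helper)
# Reports whether the host-side veth peer disappeared after "docker stop".
check_released() {
    if link_exists "$1"; then
        echo "$1 still present: netns was probably not destroyed"
        return 1
    fi
    echo "$1 released"
}
```

With the interface names used in the steps above, `check_released test1` after `docker stop` would be expected to fail on an affected host.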
Additional information you deem important (e.g. issue happens only occasionally):
Output of docker version:
[root@host-21 test]# docker version
Client: Docker Engine - Community
Version: 19.03.8
API version: 1.40
Go version: go1.12.17
Git commit: afacb8b
Built: Wed Mar 11 01:27:04 2020
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.8
API version: 1.40 (minimum version 1.12)
Go version: go1.12.17
Git commit: afacb8b
Built: Wed Mar 11 01:25:42 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.2.8
GitCommit: a4bc1d432a2c33aa2eed37f338dceabb93641310
runc:
Version: 029124da7af7360afa781a0234d1b083550f797c
GitCommit: 029124da7af7360afa781a0234d1b083550f797c
docker-init:
Version: 0.18.0
GitCommit: fec3683
Output of docker info:
(paste your output here)
Additional environment details (AWS, VirtualBox, physical, etc.):