Leaked namespace and network interfaces if container startup (prestart hook) fails #31414

@krisztian-kovacs

Description

We've observed an issue where container startup occasionally fails with the following error:

docker[28335]: /usr/bin/docker: Error response from daemon: oci runtime error: process_linux.go:334: running prestart hook 0 caused "exit status 1: time=\"2017-01-27T11:04:33+01:00\" level=fatal msg=\"failed to add interface veth944fac0 to sandbox: failed to set link up: device or resource busy\" \n".

After the above failure, all subsequent docker run commands with the same IP address specified fail with the following error:

docker[28335]: /usr/bin/docker: Error response from daemon: oci runtime error: process_linux.go:334: running prestart hook 0 caused "exit status 1: time=\"2017-01-27T11:04:33+01:00\" level=fatal msg=\"failed to add interface veth944fac0 to sandbox: failed to set link up: device or resource busy\" \n".

The containers are run via docker run --net=ournetwork --ip=10.10.10.1 --ip6=fec0::a123:10:10:10:1 image-name. We're using a custom network with the macvlan driver:

docker network inspect ournetwork
[
    {
        "Name": "ournetwork",
        "Id": "55f89666fe515afb05927762eadad282f663ade59f34dd11fb0ffe2894428eb1",
        "Scope": "local",
        "Driver": "macvlan",
        "EnableIPv6": true,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "10.10.0.0/16",
                    "Gateway": "10.10.255.254"
                },
                {
                    "Subnet": "fec0:0:0:a123::/64"
                }
            ]
        },
        "Internal": false,
        "Containers": {},
        "Options": {
            "parent": "eth0"
        },
        "Labels": {}
    }
]

I've found some seemingly unused netns mounts in /var/run/docker/netns. Running ip address show in those network namespaces reveals that a namespace, together with a network interface in it holding IP address 10.10.10.1, is leaked each time the prestart hook fails.

For example, in netns fc371b062ac8, running ip address show shows the following:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
10006: eth0@if2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default 
    link/ether 02:42:0a:6e:1e:1c brd ff:ff:ff:ff:ff:ff
    inet 10.10.10.1/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fec0::a123:10:10:10:1/64 scope site nodad 
       valid_lft forever preferred_lft forever
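
To make the inspection above repeatable, here is a minimal sketch that walks a netns directory and runs ip address show inside each namespace found there. The function names list_netns and show_netns_addrs are made up for illustration. It assumes the namespaces are bind-mounted as files under /var/run/docker/netns (as described above), that nsenter from util-linux is installed, and that it runs as root on the Docker host.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of docker or runc.

# list_netns DIR: print the path of each entry in DIR, one per line.
list_netns() {
    for ns in "$1"/*; do
        [ -e "$ns" ] || continue   # skip the literal glob when DIR is empty
        printf '%s\n' "$ns"
    done
}

# show_netns_addrs DIR: run `ip address show` inside every namespace in DIR.
# Requires root, because nsenter has to join each network namespace.
show_netns_addrs() {
    list_netns "$1" | while IFS= read -r ns; do
        printf '== %s ==\n' "$ns"
        # stdin is redirected so the command cannot consume the path list
        nsenter --net="$ns" ip address show </dev/null
    done
}
```

Calling show_netns_addrs /var/run/docker/netns should print one block per namespace. A namespace that still shows 10.10.10.1 on eth0 while no container is running would be a candidate for one of the leaked namespaces.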

Steps to reproduce the issue:

The first prestart hook failure occurs at random, seemingly because of a race condition when adding the IPv6 gateway. After it, all subsequent docker run commands fail, and each of them leaks the network namespace and the interfaces created for it.

Describe the results you received:

The IP address configured in docker run --ip can be pinged even though the container is not running.

Describe the results you expected:

The namespace and all associated interfaces should be properly deleted if container startup fails.

Output of docker version:

Client:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   34a2ead
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   34a2ead
 Built:        
 OS/Arch:      linux/amd64

Additional environment details (AWS, VirtualBox, physical, etc.):

We're running CoreOS 1235.1.0 with the docker version supplied with that release.

Since the error messages above are printed by runc, I previously reported this as a runc issue (opencontainers/runc#1352). However, it seems that even though runc is the component that fails, it is the docker daemon that is supposed to clean up after such a failure.
