
[1.13.0-rc1][Intermittent] docker: Error response from daemon: subnet sandbox join failed for "10.0.0.0/16": error creating vxlan interface: file exists. #28559

@RRAlex

Description


Since we can't use most docker-compose.yml options in bundles right now, we're trying to migrate to swarm gradually. The goal is simply to set up a two-node swarm so we can start attaching the services we need to scale onto a second box, while the rest keeps running as before, except that it shares the global network.
Doing:

docker swarm init
docker network create -d overlay -o encrypted --attachable --subnet 10.10.0.0/16 --gateway 10.10.0.1 global

Most of the time it works, but roughly once every 10 to 100 attempts I get an error that cannot be recovered from without deleting and recreating the network.

... and then launching a container on that network with external: true (in docker-compose.yml) or, subsequently, directly with docker run --network global, sometimes results in the following error:

docker: Error response from daemon: subnet sandbox join failed for "10.0.0.0/16": error creating vxlan interface: file exists.

For example:

docker run -d  --network global consul
95e3b9968368fd69ce0d550d1df4a72b1555cc00c3db801d49e72105157cf174
docker: Error response from daemon: subnet sandbox join failed for "10.0.0.0/16": error creating vxlan interface: file exists.

Steps to reproduce the issue:

  1. Repeatedly run docker swarm {leave, init, join}
  2. Repeatedly create and destroy an overlay network with --attachable
  3. Launch containers on that attachable network outside of the service paradigm.
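
The cycle above can be sketched as follows (a minimal sketch only; the image and iteration count are arbitrary, and it must be repeated until docker run fails):

# on the manager, repeat until "docker run" reports the vxlan error:
docker swarm init
docker network create -d overlay -o encrypted --attachable --subnet 10.10.0.0/16 --gateway 10.10.0.1 global
docker run --rm --network global alpine true
docker network rm global
docker swarm leave --force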

Describe the results you received:
docker: Error response from daemon: subnet sandbox join failed for "10.0.0.0/16": error creating vxlan interface: file exists.

Describe the results you expected:
docker-compose or docker run should simply launch the containers on the local machine, using the attachable overlay network shared by the swarm cluster.

Additional information you deem important (e.g. issue happens only occasionally):

Most of the time it works, but roughly once every 10 to 100 attempts I get an error that cannot be recovered from without deleting and recreating the network.
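
Besides recreating the network, removing the leftover vxlan device by hand may also clear the state (an assumption, not verified here: the stale interface is visible in the host namespace; on some setups it sits inside the overlay network's own network namespace instead):

# look for a leftover vxlan device, then remove it (name is a placeholder)
ip -o link show type vxlan
sudo ip link delete <vxlan-interface-name>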

Output of docker version:

Client:
 Version:      1.13.0-rc1
 API version:  1.25
 Go version:   go1.7.3
 Git commit:   75fd88b
 Built:        Fri Nov 11 19:47:07 2016
 OS/Arch:      linux/amd64

Server:
 Version:             1.13.0-rc1
 API version:         1.25
 Minimum API version: 1.12
 Go version:          go1.7.3
 Git commit:          75fd88b
 Built:               Fri Nov 11 19:47:07 2016
 OS/Arch:             linux/amd64
 Experimental:        false

Output of docker info:

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 38
Server Version: 1.13.0-rc1
Storage Driver: devicemapper
 Pool Name: docker-252:1-24513995-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 11.37 GB
 Data Space Total: 107.4 GB
 Data Space Available: 96 GB
 Metadata Space Used: 21.6 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.126 GB
 Thin Pool Minimum Free Space: 10.74 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.110 (2015-10-30)
Logging Driver: syslog
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: active
 NodeID: w157uhjpskdezpqxa238z8a58
 Is Manager: true
 ClusterID: 76qelv6wztgbfr0wsgtaitnnu
 Managers: 1
 Nodes: 2
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: 192.168.1.8
 Manager Addresses:
  192.168.1.8:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 8517738ba4b82aff5662c97ca4627e7e4d03b531
runc version: ac031b5bf1cc92239461125f4c1ffb760522bbf2
init version: N/A (expected: v0.13.0)
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-49-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.42 GiB
Name: aramid
ID: YZNL:KJWN:VKZS:Z4YQ:5CIK:WEPN:BAPO:SUN3:Y5ZZ:KEE2:IDKH:CXKI
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):
bare metal
