Idle connections over overlay network end up in a broken state after 15 minutes

**Description**

In a swarm setup using overlay networks, idle connections between two services end up in a broken state after 15 minutes.

The issue is related to the way Docker overlay networking routes packets: iptables first marks them, and IPVS then forwards them to the right hosts. The default expiration for TCP connections in IPVS is 900 seconds (`ipvsadm -l --timeout`), after which IPVS stops forwarding packets even though the TCP connection still exists. Once that happens, any new packet on the connection is sent to the service's virtual IP, which has no valid resolution. The connection is then stuck in limbo while the kernel tries forever to resolve that virtual IP.
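To make the timing relationship concrete, here is a minimal Python sketch that compares the IPVS TCP timeout with the kernel's keepalive idle time. The helper names are hypothetical, and the parser assumes `ipvsadm -l --timeout` prints a line of the form `Timeout (tcp tcpfin udp): 900 120 300`; adjust it if your version prints something different.

```python
import re


def parse_ipvs_timeouts(text):
    """Parse a 'Timeout (tcp tcpfin udp): 900 120 300' line into a dict of seconds.

    The line format is an assumption about ipvsadm's output.
    """
    match = re.search(r"Timeout \(([^)]*)\):\s*([\d\s]+)", text)
    if match is None:
        raise ValueError("no IPVS timeout line found")
    names = match.group(1).split()
    values = [int(v) for v in match.group(2).split()]
    return dict(zip(names, values))


def read_keepalive_time(path="/proc/sys/net/ipv4/tcp_keepalive_time"):
    """Return net.ipv4.tcp_keepalive_time in seconds (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())


def idle_connection_at_risk(keepalive_seconds, ipvs_tcp_timeout):
    """True if the first keepalive probe would not be sent before the IPVS entry expires."""
    return keepalive_seconds >= ipvs_tcp_timeout
```

With the Linux default keepalive time of 7200 seconds and the 900 second IPVS timeout, `idle_connection_at_risk(7200, 900)` returns `True`, which matches the behavior described above.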

**Steps to reproduce the issue:**

1. Start two services on the same network (on different hosts, though it should probably be reproducible on a single host as well).
2. `docker exec` into both of them. In one, start `nc` in listen mode; in the other, connect to that `nc` server using the service name's DNS entry.
3. Send a packet from the client to the server; everything works.
4. Find your `netns` and locate the connection with `nsenter --net=2cc18e502f81 ipvsadm -lnc`.
5. Wait for the connection to expire and be removed from the list.
6. Send another packet. It never arrives and the connection doesn't time out; `tcpdump` shows lots of ARP packets going out.
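For anyone who would rather script the client side than type into `nc`, the following is a rough Python sketch of the same idea: send one message, stay idle, then send another on the same connection. The function name and payloads are hypothetical, and an idle period of around 960 seconds is only an assumption chosen to exceed the 900 second IPVS timeout.

```python
import socket
import time


def send_after_idle(host, port, idle_seconds):
    """Send one message, stay idle, then send a second one on the same TCP connection."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(b"before idle\n")
        # No traffic is sent during this period, so the connection is fully idle.
        time.sleep(idle_seconds)
        # This call can return successfully even if the data never reaches the peer,
        # because the kernel only queues it for sending.
        conn.sendall(b"after idle\n")
```

Run inside the client container, something like `send_after_idle("my-service", 5000, 960)` (service name and port are placeholders) should show only the first message arriving at the listener if the issue is present.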

**Describe the results you received:**

The packet never reaches the target; the kernel is stuck sending ARP requests over and over.

**Describe the results you expected:**

Either the connection should time out properly, or there should be a way to restore the routing in IPVS.

**Additional information you deem important (e.g. issue happens only occasionally):**

This can currently be worked around by setting `net.ipv4.tcp_keepalive_time` to less than 900 seconds, so that the connection doesn't expire in IPVS, but I'm not sure it's a valid way to deal with this. At the very least, this behavior should be documented.
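Note that the sysctl only affects sockets that have `SO_KEEPALIVE` enabled. As a sketch of an application-level variant of the same workaround, the snippet below enables keepalive on a single socket with an idle time under 900 seconds. The function name and the default values are hypothetical, the `TCP_KEEP*` options are Linux-specific, and whether per-socket probes refresh the IPVS entry the same way as the sysctl is an assumption I have not verified.

```python
import socket


def enable_keepalive(sock, idle_seconds=600, interval_seconds=60, probes=5):
    """Enable TCP keepalive on one socket, with the first probe sent after idle_seconds."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux-specific options: idle time before the first probe, interval
    # between probes, and number of unanswered probes before giving up.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_seconds)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
```

Calling `enable_keepalive(sock)` before or after connecting keeps the setting local to one application rather than changing it for the whole host.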

**Output of `docker version`:**

```
Client:
 Version:      1.13.1
 API version:  1.26
 Go version:   go1.7.5
 Git commit:   092cba3
 Built:        Wed Feb  8 06:38:28 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.13.1
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   092cba3
 Built:        Wed Feb  8 06:38:28 2017
 OS/Arch:      linux/amd64
 Experimental: false

```

**Output of `docker info`:**

```
Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 2
Server Version: 1.13.1
Storage Driver: overlay
 Backing Filesystem: xfs
 Supports d_type: true
Logging Driver: fluentd
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: active
 NodeID: l3e2evjei4cvcdgjqavtrztgo
 Is Manager: false
 Node Address: 172.24.0.100
 Manager Addresses:
  172.24.0.200:2377
  172.24.0.50:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1
runc version: 9df8b306d01f59d3a8029be411de015b7304dd8f
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-514.2.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 1.796 GiB
Name: worker-1
ID: DR4G:LZEQ:YSQ7:CYTR:FAXW:ZNVJ:E4AZ:BX5L:QYYG:ZDY5:SO7U:TFZW
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: bridge-nf-call-ip6tables is disabled
Labels:
 dawn.node.type=worker
 dawn.node.subtype=app
Experimental: false
Insecure Registries:
 172.24.0.50:5000
 127.0.0.0/8
Live Restore Enabled: false
```

**Additional environment details (AWS, VirtualBox, physical, etc.):**

My current test setup is 5 Vagrant boxes (2 managers + 3 workers), but it should happen in any environment.
