Hi,
I have a problem with Docker 1.12 swarm mode load balancing. The setup has 3 hosts running Docker 1.12 on CentOS 7 in Azure. There is nothing special about the hosts: a plain CentOS 7 setup, Docker 1.12 from the Docker yum repo, and a btrfs data disk for /var/lib/docker.
If I create 2 services, scale each to 3 replicas, and then try to access them from a client, access intermittently fails.
That is, when a service is accessed via the Docker host IP address(es) and published ports, some containers do not respond.
Output of docker version:
Client:
Version: 1.12.0
API version: 1.24
Go version: go1.6.3
Git commit: 8eab29e
Built:
OS/Arch: linux/amd64
Server:
Version: 1.12.0
API version: 1.24
Go version: go1.6.3
Git commit: 8eab29e
Built:
OS/Arch: linux/amd64
Output of docker info:
Containers: 2
Running: 2
Paused: 0
Stopped: 0
Images: 1
Server Version: 1.12.0
Storage Driver: btrfs
Build Version: Btrfs v3.19.1
Library Version: 101
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge overlay null host
Swarm: active
NodeID: d7oq3rjt5llc47hr9wt19tood
Is Manager: true
ClusterID: 51zzdq5p2xe8otuwmbalyfy2t
Managers: 3
Nodes: 3
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot interval: 10000
Heartbeat tick: 1
Election tick: 3
Dispatcher:
Heartbeat period: 5 seconds
CA configuration:
Expiry duration: 3 months
Node Address: 10.218.3.5
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 3.10.0-327.22.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 6.806 GiB
Name: azeausdockerapps301t.azr.omg.wpp
ID: LWMY:RHUH:JJ5O:OP6G:5LV5:7P7B:WI3W:2JMI:B7HY:EP6J:A7SW:DUX2
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: bridge-nf-call-ip6tables is disabled
Insecure Registries:
127.0.0.0/8
Additional environment details (AWS, VirtualBox, physical, etc.):
Current test environment is running on Microsoft Azure
Steps to reproduce the issue:
Create overlay network
docker network create --driver overlay whoami-net
docker network ls | grep whoami-net
7bmymhp028ov whoami-net overlay swarm
docker network inspect whoami-net
[
{
"Name": "whoami-net",
"Id": "7bmymhp028ov19ia47xpdao7r",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": []
},
"Internal": false,
"Containers": null,
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "257"
},
"Labels": null
}
]
Create services and scale them
docker service create --name service1 --network whoami-net -p 8000 jwilder/whoami
docker service scale service1=3
docker service create --name service2 --network whoami-net -p 8000 jwilder/whoami
docker service scale service2=3
docker service ls
ID NAME REPLICAS IMAGE COMMAND
0u2d76899t30 service2 3/3 jwilder/whoami
3ecardus67vd service1 3/3 jwilder/whoami
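Before running the access tests, it may help to confirm that every service has converged, so that a timeout cannot be blamed on a replica that is still starting. The following is only a sketch: `all_replicas_running` is a hypothetical helper name, and it assumes the column layout shown above, with REPLICAS as the third column.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): reads `docker service ls`
# output on stdin and succeeds only if every service line reports
# running == desired replicas (for example 3/3).
# Assumes REPLICAS is the third whitespace-separated column, as in the
# Docker 1.12 output shown in this report.
all_replicas_running() {
    awk 'NR > 1 {
             split($3, r, "/")
             if (r[1] != r[2]) bad = 1
         }
         END { exit (bad ? 1 : 0) }'
}

# Usage:
#   docker service ls | all_replicas_running && echo "all services converged"
```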
docker service ps service1
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
48kab5vtpwbiimn1ilsbakh0j service1.1 jwilder/whoami azeausdockerapps303t.marco.lan Running Running 3 minutes ago
800eov5dgg4hf1rgjwn2vb17d service1.2 jwilder/whoami azeausdockerapps302t.marco.lan Running Running 2 minutes ago
2klc639jzqhgy1ejyvqard46t service1.3 jwilder/whoami azeausdockerapps301t.marco.lan Running Running 2 minutes ago
docker service ps service2
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
1iyvqd2eskzdr78k86i4bjxc7 service2.1 jwilder/whoami azeausdockerapps302t.marco.lan Running Running 52 seconds ago
b4ntijm8lc99oqq2af5dyh6u9 service2.2 jwilder/whoami azeausdockerapps303t.marco.lan Running Running 48 seconds ago
e3i956f4fxgq847jwsqsstcbq service2.3 jwilder/whoami azeausdockerapps301t.marco.lan Running Running 48 seconds ago
docker service inspect service1
[
{
"ID": "3ecardus67vdjb552xf01hn3f",
"Version": {
"Index": 275
},
"CreatedAt": "2016-08-02T09:55:30.35862447Z",
"UpdatedAt": "2016-08-02T09:56:53.477137303Z",
"Spec": {
"Name": "service1",
"TaskTemplate": {
"ContainerSpec": {
"Image": "jwilder/whoami"
},
"Resources": {
"Limits": {},
"Reservations": {}
},
"RestartPolicy": {
"Condition": "any",
"MaxAttempts": 0
},
"Placement": {}
},
"Mode": {
"Replicated": {
"Replicas": 3
}
},
"UpdateConfig": {
"Parallelism": 1,
"FailureAction": "pause"
},
"Networks": [
{
"Target": "7bmymhp028ov19ia47xpdao7r"
}
],
"EndpointSpec": {
"Mode": "vip",
"Ports": [
{
"Protocol": "tcp",
"TargetPort": 8000
}
]
}
},
"Endpoint": {
"Spec": {
"Mode": "vip",
"Ports": [
{
"Protocol": "tcp",
"TargetPort": 8000
}
]
},
"Ports": [
{
"Protocol": "tcp",
"TargetPort": 8000,
"PublishedPort": 30000
}
],
"VirtualIPs": [
{
"NetworkID": "dpac4u1zv98g9eayoql72jvhq",
"Addr": "10.255.0.6/16"
},
{
"NetworkID": "7bmymhp028ov19ia47xpdao7r",
"Addr": "10.0.0.2/24"
}
]
},
"UpdateStatus": {
"StartedAt": "0001-01-01T00:00:00Z",
"CompletedAt": "0001-01-01T00:00:00Z"
}
}
]
Access service1 from a client against docker host 1
➜ ~ time curl http://10.218.3.5:30000
I'm 272dd0310a95
curl http://10.218.3.5:30000 0.01s user 0.01s system 6% cpu 0.217 total
➜ ~ time curl http://10.218.3.5:30000
curl: (7) Failed to connect to 10.218.3.5 port 30000: Operation timed out
curl http://10.218.3.5:30000 0.01s user 0.01s system 0% cpu 1:15.71 total
➜ ~ time curl http://10.218.3.5:30000
curl: (7) Failed to connect to 10.218.3.5 port 30000: Operation timed out
curl http://10.218.3.5:30000 0.01s user 0.01s system 0% cpu 1:16.82 total
➜ ~
Access service2 from a client against docker host 1
➜ ~ time curl http://10.218.3.5:30001
curl: (7) Failed to connect to 10.218.3.5 port 30001: Operation timed out
curl http://10.218.3.5:30001 0.01s user 0.01s system 0% cpu 1:17.69 total
➜ ~ time curl http://10.218.3.5:30001
I'm 8519ed607de5
curl http://10.218.3.5:30001 0.01s user 0.01s system 6% cpu 0.227 total
➜ ~ time curl http://10.218.3.5:30001
curl: (7) Failed to connect to 10.218.3.5 port 30001: Operation timed out
curl http://10.218.3.5:30001 0.01s user 0.01s system 0% cpu 1:15.79 total
➜ ~
Access service1 from a client against docker host 2
➜ ~ time curl http://10.218.3.6:30000
I'm 272dd0310a95
curl http://10.218.3.6:30000 0.01s user 0.01s system 5% cpu 0.232 total
➜ ~ time curl http://10.218.3.6:30000
curl: (7) Failed to connect to 10.218.3.6 port 30000: Operation timed out
curl http://10.218.3.6:30000 0.01s user 0.01s system 0% cpu 1:12.34 total
➜ ~ time curl http://10.218.3.6:30000
I'm 71f6aa01fad4
curl http://10.218.3.6:30000 0.01s user 0.01s system 7% cpu 0.267 total
➜ ~
Access service2 from a client against docker host 2
➜ ~ time curl http://10.218.3.6:30001
I'm 8519ed607de5
curl http://10.218.3.6:30001 0.01s user 0.01s system 6% cpu 0.241 total
➜ ~ time curl http://10.218.3.6:30001
I'm 24dbf906923a
curl http://10.218.3.6:30001 0.01s user 0.01s system 7% cpu 0.246 total
➜ ~ time curl http://10.218.3.6:30001
curl: (7) Failed to connect to 10.218.3.6 port 30001: Operation timed out
curl http://10.218.3.6:30001 0.01s user 0.01s system 0% cpu 1:15.87 total
➜ ~
Access service1 from a client against docker host 3
➜ ~ time curl http://10.218.3.7:30000
I'm 272dd0310a95
curl http://10.218.3.7:30000 0.01s user 0.01s system 4% cpu 0.353 total
➜ ~ time curl http://10.218.3.7:30000
I'm e6289ebe82da
curl http://10.218.3.7:30000 0.01s user 0.01s system 2% cpu 0.513 total
➜ ~ time curl http://10.218.3.7:30000
curl: (7) Failed to connect to 10.218.3.7 port 30000: Operation timed out
curl http://10.218.3.7:30000 0.01s user 0.01s system 0% cpu 1:16.79 total
➜ ~
Access service2 from a client against docker host 3
➜ ~ time curl http://10.218.3.7:30001
I'm 24dbf906923a
curl http://10.218.3.7:30001 0.01s user 0.01s system 7% cpu 0.234 total
➜ ~ time curl http://10.218.3.7:30001
I'm 8519ed607de5
curl http://10.218.3.7:30001 0.01s user 0.01s system 6% cpu 0.216 total
➜ ~ time curl http://10.218.3.7:30001
I'm da18d8e4b307
curl http://10.218.3.7:30001 0.01s user 0.01s system 6% cpu 0.214 total
➜ ~
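The manual curl runs above can be repeated with a small loop so that the pattern of responding and non-responding containers is easier to see. This is a minimal sketch, not the exact procedure used for the output above: `probe` is a hypothetical helper name, and the 5-second `--max-time` cap is an arbitrary choice to avoid waiting roughly 75 seconds for each connect timeout.

```shell
#!/bin/sh
# Hypothetical helper: request a URL N times and print one line per
# attempt, either the response body or the word TIMEOUT.
probe() {
    url=$1
    count=${2:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        # --max-time 5 is an assumed cap, chosen so that an unreachable
        # backend fails within seconds instead of after the long
        # connect timeout seen in the runs above.
        curl --silent --max-time 5 "$url" || echo "TIMEOUT"
        i=$((i + 1))
    done
}

# Usage (hosts and ports are the ones from this report):
#   for host in 10.218.3.5 10.218.3.6 10.218.3.7; do
#       for port in 30000 30001; do
#           echo "== $host:$port"
#           probe "http://$host:$port" 10 | sort | uniq -c
#       done
#   done
```

Piping the result through `sort | uniq -c` gives a count per container ID plus a TIMEOUT count, which shows which backends are never reached from a given host.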
Describe the results you received:
Not all containers respond when the service is accessed via the Docker host IP addresses and published ports.
Describe the results you expected:
All containers from a service should respond no matter via which docker host the service is accessed.
Additional information you deem important (e.g. issue happens only occasionally):
The issue is intermittent across service creations: if you delete and re-create the service, all containers may respond, or containers on a different host may stop responding.
It is at least consistent once a service is created. Let's say containers on host 2 and host 3 do not respond when accessed via docker host 1; then it stays that way for the lifetime of that service.