Description
I'm currently facing a DNS resolution problem on our 2-node DEV Docker swarm mode cluster.
Our system uses netflix/eureka for service discovery. To find all running eureka nodes in the cluster, the services do a DNS lookup for "tasks.eureka-server" and use the returned addresses as eureka-servers for service discovery. After a random amount of time, the embedded DNS server responds with only one of our eureka-servers (the one residing on the same node), listed twice:
;; QUESTION SECTION:
;tasks.eureka-server. IN A
;; ANSWER SECTION:
tasks.eureka-server. 600 IN A 10.9.4.20
tasks.eureka-server. 600 IN A 10.9.4.20
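The faulty answer above can be detected from inside a container with a few lines of Python. This is only a minimal sketch: the helper names `resolve_a_records` and `duplicated_addresses` are made up for illustration, and it assumes the libc resolver passes repeated A records through unchanged, which may not hold on every platform (the raw DNS answer shown above remains the authoritative view).

```python
import socket
from collections import Counter


def resolve_a_records(name):
    """Return the IPv4 addresses the resolver hands back for `name`, in order."""
    # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist).
    _, _, addresses = socket.gethostbyname_ex(name)
    return addresses


def duplicated_addresses(addresses):
    """Return the addresses that appear more than once in a single answer."""
    counts = Counter(addresses)
    return sorted(ip for ip, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Only meaningful when run inside a task attached to the service's network.
    answer = resolve_a_records("tasks.eureka-server")
    dupes = duplicated_addresses(answer)
    print("answer:", answer)
    print("duplicated:", dupes if dupes else "none")
```

With the answer section above, `duplicated_addresses` would report `10.9.4.20`, while a healthy answer containing both task IPs yields an empty list.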
Each service queries the embedded DNS server twice every 20 seconds (one A, one AAAA). With about 18 microservice tasks up on the node, there is a decent number of queries to the embedded DNS server.
Steps to reproduce the issue:
I could not figure out a way to reproduce this - it happens randomly
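One way to try to provoke the problem is to issue many lookups in parallel, since the daemon logs further down show several lookups for the same name landing within the same millisecond. The following is a rough sketch, not a confirmed reproducer: the function names, the `rounds` value, and the idea that concurrency is the trigger are all assumptions. The worker count of 18 simply mirrors the number of tasks on the node.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def lookup(name):
    """Resolve `name` once and return the list of IPv4 addresses."""
    return socket.gethostbyname_ex(name)[2]


def hammer(name, workers=18, rounds=100, resolver=lookup):
    """Fire `workers` parallel lookups per round; collect answers with repeated IPs."""
    suspicious = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(rounds):
            # pool.map runs the lookups concurrently and yields answers in order.
            for answer in pool.map(resolver, [name] * workers):
                if len(set(answer)) < len(answer):
                    suspicious.append(answer)
    return suspicious


if __name__ == "__main__":
    # Run inside a task attached to the same overlay network as eureka-server.
    hits = hammer("tasks.eureka-server")
    print(f"{len(hits)} answers contained a repeated address")
```

The `resolver` parameter exists only so the loop can be exercised without a live cluster. A run that reports zero hits does not prove the bug is absent.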
Describe the results you received:
Only one of the running tasks' IPs is given in the DNS response (2 lines with the same IP).
Describe the results you expected:
Both tasks' IPs are given in the DNS response (2 lines with different IPs).
Output of docker version:
Client:
Version: 1.13.0-rc2
API version: 1.25
Go version: go1.7.3
Git commit: 1f9b3ef
Built: Wed Nov 23 06:17:45 2016
OS/Arch: linux/amd64
Server:
Version: 1.13.0-rc2
API version: 1.25
Minimum API version: 1.12
Go version: go1.7.3
Git commit: 1f9b3ef
Built: Wed Nov 23 06:17:45 2016
OS/Arch: linux/amd64
Experimental: false
Output of docker info:
Containers: 17
Running: 16
Paused: 0
Stopped: 1
Images: 17
Server Version: 1.13.0-rc2
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 273
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: active
NodeID: 4c20wnqlz18k7df5gf239jl63
Is Manager: true
ClusterID: 49qkbt5bcp0jau6rffzcvzjc8
Managers: 1
Nodes: 2
Orchestration:
Task History Retention Limit: 1
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 3
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Node Address: 10.9.1.238
Manager Addresses:
10.9.1.238:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 51371867a01c467f08af739783b8beafc154c4d7
init version: 949e6fa
Kernel Version: 3.16.0-4-amd64
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.71 GiB
Name: gtg-dev-doc03
ID: VUE7:2SEX:5MI2:XLWI:2K4E:GJ4E:4ISQ:A5FX:E47A:ONPA:DD7K:ECCC
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
File Descriptors: 181
Goroutines: 294
System Time: 2016-11-25T12:40:40.179934678+01:00
EventsListeners: 16
Registry: https://index.docker.io/v1/
WARNING: No kernel memory limit support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
Experimental: false
Insecure Registries:
10.9.1.238:5000
127.0.0.0/8
Live Restore Enabled: false
Additional environment details (AWS, VirtualBox, physical, etc.):
Docker Swarm Nodes running on VMs in a XEN-Cluster.
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
4c20wnqlz18k7df5gf239jl63 * gtg-dev-doc03 Ready Active Leader
c0ywywgo0to8ya492gna8aksa gtg-dev-doc04 Ready Active
Log from gtg-dev-doc04
Nov 25 10:25:54 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:54.209083760+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:54 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:54.209098299+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.36 10.9.4.20]"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.423963948+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424030177+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424054102+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424077630+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424109696+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.36 10.9.4.20]"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424110039+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.36 10.9.4.20]"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424084264+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424060118+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.423963948+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424179676+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424193043+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424132933+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.20 10.9.4.20]"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.440334762+01:00" level=debug msg="2016/11/25 10:25:55 [DEBUG] memberlist: TCP connection from=10.9.1.238:40486\n"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.490083414+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.490107906+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.490149363+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.490152907+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.20 10.9.4.20]"
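The bad results in the log above (`IP [10.9.4.20 10.9.4.20]`) are easy to miss among the healthy ones. A small filter like the following can pick them out of a daemon debug log. It is a sketch that assumes the `Lookup for <name>: IP [...]` message format seen in these excerpts; the function name and the stdin-based usage are illustrative only.

```python
import re
import sys

# Matches the daemon's debug message, e.g.
#   Lookup for tasks.eureka-server.: IP [10.9.4.20 10.9.4.20]
LOOKUP_RE = re.compile(r"Lookup for (?P<name>\S+): IP \[(?P<ips>[^\]]*)\]")


def find_duplicate_lookups(lines):
    """Yield (line_number, name, ips) for lookup results with a repeated IP."""
    for number, line in enumerate(lines, start=1):
        match = LOOKUP_RE.search(line)
        if not match:
            continue
        ips = match.group("ips").split()
        if len(set(ips)) < len(ips):
            yield number, match.group("name"), ips


if __name__ == "__main__":
    # Feed the daemon log on stdin.
    for number, name, ips in find_duplicate_lookups(sys.stdin):
        print(f"line {number}: {name} -> {ips}")
```

Applied to the two excerpts in this report, only the gtg-dev-doc04 log contains such entries; gtg-dev-doc03 always returns both addresses.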
Log from gtg-dev-doc03
Nov 25 10:25:47 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:47.928916383+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:47 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:47.928994311+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.36 10.9.4.20]"
Nov 25 10:25:50 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:50.624930023+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:50 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:50.624945515+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:50 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:50.625033685+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.20 10.9.4.36]"
Nov 25 10:25:50 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:50.625104964+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:52 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:52.318900068+01:00" level=debug msg="Calling GET /_ping"
Nov 25 10:25:52 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:52.319552756+01:00" level=debug msg="Calling GET /v1.25/services"
Nov 25 10:25:52 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:52.325920809+01:00" level=debug msg="Calling GET /v1.25/tasks?filters=%7B%22service%22%3A%7B%2221zwlnp0wh5lopjjvyk1lsg4c%22%3Atrue%2C%222gmfd9312m8u2bxttcxtw06rl%22%3Atrue%2C%2238b9geyigv2lampyxbgipav6d%22%3Atrue%2C%223putfvrcrd4gy2hk3095jibvy%22%3Atrue%2C%22bp1d2cs97bapdc6u6w5ru69nd%22%3Atrue%2C%22c4iq45fl5ah4scz4w2gg4yz4z%22%3Atrue%2C%22coqc5xhqfc4td2n4w791qteyw%22%3Atrue%2C%22d0hqogxbtqsd9csgiw9jcp13x%22%3Atrue%2C%22dg052c4f9728l636dwv47nqbq%22%3Atrue%2C%22dnv9huceskuf1s79ako3e6x7q%22%3Atrue%2C%22eslfovlxxfsatbfxwoae7rlca%22%3Atrue%2C%22k6ipnhi6wbs5xeuzqwde8m78z%22%3Atrue%2C%22lqe10blr7y0p94kusy1tbjzzh%22%3Atrue%2C%22m6b5gbtisa3uysbudxnp3zqhs%22%3Atrue%2C%22vhgxj158zddivpn3el5emiji0%22%3Atrue%2C%22yvpzqtuq9fx5hqdiwe5duscdz%22%3Atrue%2C%22zj1o6esjvqmts4rtl4b517afj%22%3Atrue%7D%7D"
Nov 25 10:25:52 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:52.338293083+01:00" level=debug msg="Calling GET /v1.25/nodes"
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.364932781+01:00" level=debug msg="Calling GET /_ping"
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.365432915+01:00" level=debug msg="Calling GET /v1.25/services"
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.369456046+01:00" level=debug msg="Calling GET /v1.25/tasks?filters=%7B%22service%22%3A%7B%2221zwlnp0wh5lopjjvyk1lsg4c%22%3Atrue%2C%222gmfd9312m8u2bxttcxtw06rl%22%3Atrue%2C%2238b9geyigv2lampyxbgipav6d%22%3Atrue%2C%223putfvrcrd4gy2hk3095jibvy%22%3Atrue%2C%22bp1d2cs97bapdc6u6w5ru69nd%22%3Atrue%2C%22c4iq45fl5ah4scz4w2gg4yz4z%22%3Atrue%2C%22coqc5xhqfc4td2n4w791qteyw%22%3Atrue%2C%22d0hqogxbtqsd9csgiw9jcp13x%22%3Atrue%2C%22dg052c4f9728l636dwv47nqbq%22%3Atrue%2C%22dnv9huceskuf1s79ako3e6x7q%22%3Atrue%2C%22eslfovlxxfsatbfxwoae7rlca%22%3Atrue%2C%22k6ipnhi6wbs5xeuzqwde8m78z%22%3Atrue%2C%22lqe10blr7y0p94kusy1tbjzzh%22%3Atrue%2C%22m6b5gbtisa3uysbudxnp3zqhs%22%3Atrue%2C%22vhgxj158zddivpn3el5emiji0%22%3Atrue%2C%22yvpzqtuq9fx5hqdiwe5duscdz%22%3Atrue%2C%22zj1o6esjvqmts4rtl4b517afj%22%3Atrue%7D%7D"
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.381617852+01:00" level=debug msg="Calling GET /v1.25/nodes"
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.482110113+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.482134682+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.482204317+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.482216295+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.20 10.9.4.36]"
Nov 25 10:25:55 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:55.191523211+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:55.191581012+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:55.191631557+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:55 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:55.191667511+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.20 10.9.4.36]"
Nov 25 10:25:55 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:55.444128802+01:00" level=debug msg="2016/11/25 10:25:55 [DEBUG] memberlist: Initiating push/pull sync with: 10.9.1.239:7946\n"
docker service inspect eureka-server
[
{
"ID": "vhgxj158zddivpn3el5emiji0",
"Version": {
"Index": 84415
},
"CreatedAt": "2016-11-18T13:54:14.010802643Z",
"UpdatedAt": "2016-11-25T05:31:43.521722108Z",
"Spec": {
"Name": "eureka-server",
"TaskTemplate": {
"ContainerSpec": {
"Image": "10.9.1.238:5000/goodstag/eureka-server:dev@sha256:aa07d9169d545ac14690e5b325c7cd134f64f61cfa688f248f299377423c732e",
"Env": [
"UPDATE_TIME=1479976196"
],
"StopGracePeriod": 120000000000
},
"Resources": {
"Limits": {
"MemoryBytes": 878706688
},
"Reservations": {
"MemoryBytes": 878706688
}
},
"RestartPolicy": {
"Condition": "any",
"Delay": 60000000000,
"MaxAttempts": 9,
"Window": 600000000000
},
"Placement": {},
"ForceUpdate": 0
},
"Mode": {
"Global": {}
},
"UpdateConfig": {
"Parallelism": 1,
"Delay": 200000000000,
"FailureAction": "pause",
"MaxFailureRatio": 0
},
"Networks": [
{
"Target": "csf4tlbspmynrow20s2606l8a"
}
],
"EndpointSpec": {
"Mode": "vip",
"Ports": [
{
"Protocol": "tcp",
"TargetPort": 8761,
"PublishedPort": 8761,
"PublishMode": "ingress"
}
]
}
},
"PreviousSpec": {
"Name": "eureka-server",
"TaskTemplate": {
"ContainerSpec": {
"Image": "10.9.1.238:5000/goodstag/eureka-server:dev@sha256:96c25de3a26d888ed3f19371e45cee7120148393b2457a28bf7b666d95938e88",
"Env": [
"UPDATE_TIME=1479479270"
],
"StopGracePeriod": 120000000000
},
"Resources": {
"Limits": {
"MemoryBytes": 878706688
},
"Reservations": {
"MemoryBytes": 878706688
}
},
"RestartPolicy": {
"Condition": "any",
"Delay": 60000000000,
"MaxAttempts": 9,
"Window": 600000000000
},
"Placement": {},
"ForceUpdate": 0
},
"Mode": {
"Global": {}
},
"UpdateConfig": {
"Parallelism": 1,
"Delay": 200000000000,
"FailureAction": "pause",
"MaxFailureRatio": 0
},
"Networks": [
{
"Target": "csf4tlbspmynrow20s2606l8a"
}
],
"EndpointSpec": {
"Mode": "vip",
"Ports": [
{
"Protocol": "tcp",
"TargetPort": 8761,
"PublishedPort": 8761,
"PublishMode": "ingress"
}
]
}
},
"Endpoint": {
"Spec": {
"Mode": "vip",
"Ports": [
{
"Protocol": "tcp",
"TargetPort": 8761,
"PublishedPort": 8761,
"PublishMode": "ingress"
}
]
},
"Ports": [
{
"Protocol": "tcp",
"TargetPort": 8761,
"PublishedPort": 8761,
"PublishMode": "ingress"
}
],
"VirtualIPs": [
{
"NetworkID": "b4a0oqwx0rsu1y0i20pcqkhyh",
"Addr": "10.255.0.6/16"
},
{
"NetworkID": "csf4tlbspmynrow20s2606l8a",
"Addr": "10.9.4.5/24"
}
]
},
"UpdateStatus": {
"State": "completed",
"StartedAt": "2016-11-24T08:29:57.014069251Z",
"CompletedAt": "2016-11-25T05:31:43.52170886Z",
"Message": "update completed"
}
}
]