Swarm Mode: strange/incorrect DNS response "tasks.[service-name]" #28836

@sign2k

Description

I'm currently facing a DNS resolution problem on our two-node DEV Docker swarm mode cluster.
Our system uses Netflix Eureka for service discovery. To find all running Eureka nodes in the cluster, the services do a DNS lookup for "tasks.eureka-server" and use the returned addresses as Eureka servers for service discovery. After a random amount of time, the embedded DNS server returns only one of our Eureka servers (the one residing on the same node).

;; QUESTION SECTION:
;tasks.eureka-server.           IN      A
;; ANSWER SECTION:
tasks.eureka-server.    600     IN      A       10.9.4.20
tasks.eureka-server.    600     IN      A       10.9.4.20
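
For anyone who wants to check for this from inside a container, here is a minimal sketch in Python. The function names are hypothetical, port 8761 is taken from the service definition further down, and I am assuming the client resolver passes the answer through without deduplicating it, which I have not verified.

```python
import socket
from collections import Counter


def resolve_a_records(name, port=8761):
    """Return the IPv4 addresses the resolver gives for `name`.

    Restricting to SOCK_STREAM avoids getting the same address once per
    socket type, which would look like a false duplicate.
    """
    infos = socket.getaddrinfo(
        name, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return [info[4][0] for info in infos]


def duplicated_addresses(addresses):
    """Return the sorted list of addresses that occur more than once."""
    counts = Counter(addresses)
    return sorted(ip for ip, n in counts.items() if n > 1)


# The answer section shown above, fed in as sample data.
answer = ["10.9.4.20", "10.9.4.20"]
print(duplicated_addresses(answer))  # ['10.9.4.20']
```

In a container on the overlay network, `duplicated_addresses(resolve_a_records("tasks.eureka-server"))` should return an empty list while the response is healthy.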

Each service queries the embedded DNS server twice every 20 seconds (one A query, one AAAA query). With about 18 microservice tasks running on the node, that adds up to a fair number of queries to the embedded DNS server.
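
As a rough estimate of that load, assuming the queries are spread evenly over time (the logs below suggest they actually arrive in bursts), the average rate per node works out as follows. The helper name is only for illustration.

```python
def queries_per_second(tasks, queries_per_interval, interval_seconds):
    """Average DNS query rate for `tasks` clients polling at a fixed interval."""
    return tasks * queries_per_interval / interval_seconds


# 18 tasks, 2 queries (A + AAAA) every 20 seconds
print(queries_per_second(18, 2, 20))  # 1.8
```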

Steps to reproduce the issue:

I could not find a way to reproduce this; it happens randomly.

Describe the results you received:

Only one of the running tasks' IPs is returned in the DNS response (two lines with the same IP).

Describe the results you expected:

Both tasks' IPs are returned in the DNS response (two lines with different IPs).

Output of docker version:

Client:
 Version:      1.13.0-rc2
 API version:  1.25
 Go version:   go1.7.3
 Git commit:   1f9b3ef
 Built:        Wed Nov 23 06:17:45 2016
 OS/Arch:      linux/amd64

Server:
 Version:             1.13.0-rc2
 API version:         1.25
 Minimum API version: 1.12
 Go version:          go1.7.3
 Git commit:          1f9b3ef
 Built:               Wed Nov 23 06:17:45 2016
 OS/Arch:             linux/amd64
 Experimental:        false

Output of docker info:

Containers: 17
 Running: 16
 Paused: 0
 Stopped: 1
Images: 17
Server Version: 1.13.0-rc2
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 273
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: active
 NodeID: 4c20wnqlz18k7df5gf239jl63
 Is Manager: true
 ClusterID: 49qkbt5bcp0jau6rffzcvzjc8
 Managers: 1
 Nodes: 2
 Orchestration:
  Task History Retention Limit: 1
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: 10.9.1.238
 Manager Addresses:
  10.9.1.238:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 51371867a01c467f08af739783b8beafc154c4d7
init version: 949e6fa
Kernel Version: 3.16.0-4-amd64
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.71 GiB
Name: gtg-dev-doc03
ID: VUE7:2SEX:5MI2:XLWI:2K4E:GJ4E:4ISQ:A5FX:E47A:ONPA:DD7K:ECCC
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 181
 Goroutines: 294
 System Time: 2016-11-25T12:40:40.179934678+01:00
 EventsListeners: 16
Registry: https://index.docker.io/v1/
WARNING: No kernel memory limit support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
Experimental: false
Insecure Registries:
 10.9.1.238:5000
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):
Docker Swarm nodes running on VMs in a Xen cluster.

ID                           HOSTNAME       STATUS  AVAILABILITY  MANAGER STATUS
4c20wnqlz18k7df5gf239jl63 *  gtg-dev-doc03  Ready   Active        Leader
c0ywywgo0to8ya492gna8aksa    gtg-dev-doc04  Ready   Active

Log from gtg-dev-doc04

Nov 25 10:25:54 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:54.209083760+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:54 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:54.209098299+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.36 10.9.4.20]"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.423963948+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424030177+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424054102+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424077630+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424109696+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.36 10.9.4.20]"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424110039+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.36 10.9.4.20]"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424084264+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424060118+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.423963948+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424179676+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424193043+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.424132933+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.20 10.9.4.20]"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.440334762+01:00" level=debug msg="2016/11/25 10:25:55 [DEBUG] memberlist: TCP connection from=10.9.1.238:40486\n"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.490083414+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.490107906+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.490149363+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:55 gtg-dev-doc04 dockerd[482]: time="2016-11-25T10:25:55.490152907+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.20 10.9.4.20]"
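
To pick the bad responses out of a longer debug log, something like the following sketch could be used. It assumes the "Lookup for <name>: IP [...]" message format seen in the lines above; the function name and the regular expression are my own and not part of Docker.

```python
import re

# Matches the daemon's debug message, e.g.
#   time="..." level=debug msg="Lookup for tasks.eureka-server.: IP [a b]"
LOOKUP_RE = re.compile(
    r'time="(?P<time>[^"]+)".*?'
    r'Lookup for (?P<name>\S+): IP \[(?P<ips>[^\]]*)\]'
)


def find_duplicate_lookups(lines):
    """Return (time, name, ips) for every lookup whose answer repeats an IP."""
    hits = []
    for line in lines:
        match = LOOKUP_RE.search(line)
        if match is None:
            continue  # not a lookup result line
        ips = match.group("ips").split()
        if len(set(ips)) < len(ips):
            hits.append((match.group("time"), match.group("name"), ips))
    return hits
```

Run over the gtg-dev-doc04 excerpt above, this should flag only the two lookups that returned [10.9.4.20 10.9.4.20].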

Log from gtg-dev-doc03

Nov 25 10:25:47 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:47.928916383+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:47 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:47.928994311+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.36 10.9.4.20]"
Nov 25 10:25:50 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:50.624930023+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:50 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:50.624945515+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:50 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:50.625033685+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.20 10.9.4.36]"
Nov 25 10:25:50 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:50.625104964+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:52 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:52.318900068+01:00" level=debug msg="Calling GET /_ping"
Nov 25 10:25:52 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:52.319552756+01:00" level=debug msg="Calling GET /v1.25/services"
Nov 25 10:25:52 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:52.325920809+01:00" level=debug msg="Calling GET /v1.25/tasks?filters=%7B%22service%22%3A%7B%2221zwlnp0wh5lopjjvyk1lsg4c%22%3Atrue%2C%222gmfd9312m8u2bxttcxtw06rl%22%3Atrue%2C%2238b9geyigv2lampyxbgipav6d%22%3Atrue%2C%223putfvrcrd4gy2hk3095jibvy%22%3Atrue%2C%22bp1d2cs97bapdc6u6w5ru69nd%22%3Atrue%2C%22c4iq45fl5ah4scz4w2gg4yz4z%22%3Atrue%2C%22coqc5xhqfc4td2n4w791qteyw%22%3Atrue%2C%22d0hqogxbtqsd9csgiw9jcp13x%22%3Atrue%2C%22dg052c4f9728l636dwv47nqbq%22%3Atrue%2C%22dnv9huceskuf1s79ako3e6x7q%22%3Atrue%2C%22eslfovlxxfsatbfxwoae7rlca%22%3Atrue%2C%22k6ipnhi6wbs5xeuzqwde8m78z%22%3Atrue%2C%22lqe10blr7y0p94kusy1tbjzzh%22%3Atrue%2C%22m6b5gbtisa3uysbudxnp3zqhs%22%3Atrue%2C%22vhgxj158zddivpn3el5emiji0%22%3Atrue%2C%22yvpzqtuq9fx5hqdiwe5duscdz%22%3Atrue%2C%22zj1o6esjvqmts4rtl4b517afj%22%3Atrue%7D%7D"
Nov 25 10:25:52 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:52.338293083+01:00" level=debug msg="Calling GET /v1.25/nodes"
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.364932781+01:00" level=debug msg="Calling GET /_ping"
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.365432915+01:00" level=debug msg="Calling GET /v1.25/services"
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.369456046+01:00" level=debug msg="Calling GET /v1.25/tasks?filters=%7B%22service%22%3A%7B%2221zwlnp0wh5lopjjvyk1lsg4c%22%3Atrue%2C%222gmfd9312m8u2bxttcxtw06rl%22%3Atrue%2C%2238b9geyigv2lampyxbgipav6d%22%3Atrue%2C%223putfvrcrd4gy2hk3095jibvy%22%3Atrue%2C%22bp1d2cs97bapdc6u6w5ru69nd%22%3Atrue%2C%22c4iq45fl5ah4scz4w2gg4yz4z%22%3Atrue%2C%22coqc5xhqfc4td2n4w791qteyw%22%3Atrue%2C%22d0hqogxbtqsd9csgiw9jcp13x%22%3Atrue%2C%22dg052c4f9728l636dwv47nqbq%22%3Atrue%2C%22dnv9huceskuf1s79ako3e6x7q%22%3Atrue%2C%22eslfovlxxfsatbfxwoae7rlca%22%3Atrue%2C%22k6ipnhi6wbs5xeuzqwde8m78z%22%3Atrue%2C%22lqe10blr7y0p94kusy1tbjzzh%22%3Atrue%2C%22m6b5gbtisa3uysbudxnp3zqhs%22%3Atrue%2C%22vhgxj158zddivpn3el5emiji0%22%3Atrue%2C%22yvpzqtuq9fx5hqdiwe5duscdz%22%3Atrue%2C%22zj1o6esjvqmts4rtl4b517afj%22%3Atrue%7D%7D"
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.381617852+01:00" level=debug msg="Calling GET /v1.25/nodes"
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.482110113+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.482134682+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.482204317+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:54 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:54.482216295+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.20 10.9.4.36]"
Nov 25 10:25:55 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:55.191523211+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:55.191581012+01:00" level=debug msg="Name To resolve: tasks.eureka-server."
Nov 25 10:25:55 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:55.191631557+01:00" level=debug msg="Lookup name tasks.eureka-server. present without IPv6 address"
Nov 25 10:25:55 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:55.191667511+01:00" level=debug msg="Lookup for tasks.eureka-server.: IP [10.9.4.20 10.9.4.36]"
Nov 25 10:25:55 gtg-dev-doc03 dockerd[480]: time="2016-11-25T10:25:55.444128802+01:00" level=debug msg="2016/11/25 10:25:55 [DEBUG] memberlist: Initiating push/pull sync with: 10.9.1.239:7946\n"

docker service inspect eureka-server

[
    {
        "ID": "vhgxj158zddivpn3el5emiji0",
        "Version": {
            "Index": 84415
        },
        "CreatedAt": "2016-11-18T13:54:14.010802643Z",
        "UpdatedAt": "2016-11-25T05:31:43.521722108Z",
        "Spec": {
            "Name": "eureka-server",
            "TaskTemplate": {
                "ContainerSpec": {
                    "Image": "10.9.1.238:5000/goodstag/eureka-server:dev@sha256:aa07d9169d545ac14690e5b325c7cd134f64f61cfa688f248f299377423c732e",
                    "Env": [
                        "UPDATE_TIME=1479976196"
                    ],
                    "StopGracePeriod": 120000000000
                },
                "Resources": {
                    "Limits": {
                        "MemoryBytes": 878706688
                    },
                    "Reservations": {
                        "MemoryBytes": 878706688
                    }
                },
                "RestartPolicy": {
                    "Condition": "any",
                    "Delay": 60000000000,
                    "MaxAttempts": 9,
                    "Window": 600000000000
                },
                "Placement": {},
                "ForceUpdate": 0
            },
            "Mode": {
                "Global": {}
            },
            "UpdateConfig": {
                "Parallelism": 1,
                "Delay": 200000000000,
                "FailureAction": "pause",
                "MaxFailureRatio": 0
            },
            "Networks": [
                {
                    "Target": "csf4tlbspmynrow20s2606l8a"
                }
            ],
            "EndpointSpec": {
                "Mode": "vip",
                "Ports": [
                    {
                        "Protocol": "tcp",
                        "TargetPort": 8761,
                        "PublishedPort": 8761,
                        "PublishMode": "ingress"
                    }
                ]
            }
        },
        "PreviousSpec": {
            "Name": "eureka-server",
            "TaskTemplate": {
                "ContainerSpec": {
                    "Image": "10.9.1.238:5000/goodstag/eureka-server:dev@sha256:96c25de3a26d888ed3f19371e45cee7120148393b2457a28bf7b666d95938e88",
                    "Env": [
                        "UPDATE_TIME=1479479270"
                    ],
                    "StopGracePeriod": 120000000000
                },
                "Resources": {
                    "Limits": {
                        "MemoryBytes": 878706688
                    },
                    "Reservations": {
                        "MemoryBytes": 878706688
                    }
                },
                "RestartPolicy": {
                    "Condition": "any",
                    "Delay": 60000000000,
                    "MaxAttempts": 9,
                    "Window": 600000000000
                },
                "Placement": {},
                "ForceUpdate": 0
            },
            "Mode": {
                "Global": {}
            },
            "UpdateConfig": {
                "Parallelism": 1,
                "Delay": 200000000000,
                "FailureAction": "pause",
                "MaxFailureRatio": 0
            },
            "Networks": [
                {
                    "Target": "csf4tlbspmynrow20s2606l8a"
                }
            ],
            "EndpointSpec": {
                "Mode": "vip",
                "Ports": [
                    {
                        "Protocol": "tcp",
                        "TargetPort": 8761,
                        "PublishedPort": 8761,
                        "PublishMode": "ingress"
                    }
                ]
            }
        },
        "Endpoint": {
            "Spec": {
                "Mode": "vip",
                "Ports": [
                    {
                        "Protocol": "tcp",
                        "TargetPort": 8761,
                        "PublishedPort": 8761,
                        "PublishMode": "ingress"
                    }
                ]
            },
            "Ports": [
                {
                    "Protocol": "tcp",
                    "TargetPort": 8761,
                    "PublishedPort": 8761,
                    "PublishMode": "ingress"
                }
            ],
            "VirtualIPs": [
                {
                    "NetworkID": "b4a0oqwx0rsu1y0i20pcqkhyh",
                    "Addr": "10.255.0.6/16"
                },
                {
                    "NetworkID": "csf4tlbspmynrow20s2606l8a",
                    "Addr": "10.9.4.5/24"
                }
            ]
        },
        "UpdateStatus": {
            "State": "completed",
            "StartedAt": "2016-11-24T08:29:57.014069251Z",
            "CompletedAt": "2016-11-25T05:31:43.52170886Z",
            "Message": "update completed"
        }
    }
]
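
To cross-check that both task addresses from the logs sit on the overlay network the service is attached to, the "VirtualIPs" section of the inspect output can be compared against them. This is only a sketch: the helper name is made up, and the sample below is trimmed down to the fields it reads.

```python
import ipaddress
import json


def tasks_on_service_networks(inspect_json, task_ips):
    """Map each NetworkID in Endpoint.VirtualIPs to the task IPs inside its subnet."""
    result = {}
    for service in json.loads(inspect_json):
        for vip in service["Endpoint"]["VirtualIPs"]:
            # "10.9.4.5/24" -> network 10.9.4.0/24
            network = ipaddress.ip_interface(vip["Addr"]).network
            result[vip["NetworkID"]] = [
                ip for ip in task_ips if ipaddress.ip_address(ip) in network
            ]
    return result


# Trimmed copy of the inspect output above (only the fields used here).
sample = """[{"Endpoint": {"VirtualIPs": [
    {"NetworkID": "b4a0oqwx0rsu1y0i20pcqkhyh", "Addr": "10.255.0.6/16"},
    {"NetworkID": "csf4tlbspmynrow20s2606l8a", "Addr": "10.9.4.5/24"}
]}}]"""

print(tasks_on_service_networks(sample, ["10.9.4.20", "10.9.4.36"]))
```

Both 10.9.4.20 and 10.9.4.36 fall inside 10.9.4.0/24, the subnet of network csf4tlbspmynrow20s2606l8a that the service spec targets, and neither is in the 10.255.0.0/16 subnet of the other network.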

Labels

area/networking, area/swarm, kind/bug, version/1.13
