
Add alias for hostname if hostname != container name #39204

Merged: yongtang merged 1 commit into moby:master from olljanat:fix-hostname-dns-resolution on Jun 2, 2019

Conversation

@olljanat
Contributor

@olljanat olljanat commented May 11, 2019

- What I did
While investigating #33408 I noticed that if a user defines a custom hostname, that hostname is not resolvable from other containers on the same network.

- How I did it
Added logic that adds the hostname as an alias on all user-defined networks if hostname != container name and the user has not already added that alias.

- How to verify it
Included a unit test which fails with this error:

Running /go/src/github.com/docker/docker/integration/container
INFO: Testing against a local daemon
=== RUN   TestHostnameDnsResolution
--- FAIL: TestHostnameDnsResolution (3.64s)
    run_linux_test.go:128: assertion failed: 
        --- ←
        +++ →
        @@ -1 +1,2 @@
        +ping: bad address 'foobar'
         
        
    run_linux_test.go:129: assertion failed: 0 (int) != 1 (res.ExitCode int)
FAIL

without this change.

You can also try deploying this stack:

version: '3.7'
networks:
  test:
services:
  busybox:
    image: busybox
    command: tail -f /dev/null
    deploy:
      replicas: 2
    hostname: "busybox-{{.Task.Slot}}"
    networks:
      - test

and ping the containers using each other's hostnames.

It is also worth mentioning that a user is able to set the same hostname on multiple containers with a stack like:

version: '3.7'
networks:
  test:
services:
  test:
    image: busybox
    command: tail -f /dev/null
    deploy:
      replicas: 2
    hostname: busybox
    networks:
      - test

but that does not matter, as it is a perfectly valid DNS configuration to have multiple records with the same name (DNS round robin), and a container uses its hosts file as the default name resolution source (so an application inside a container gets the container's own IP if it tries to resolve the container's hostname).

- A picture of a cute animal (not mandatory but encouraged)

@mikeytag

I really like this solution as well as the other branch you and I are working on. It makes sense to me that custom hostnames should be addressable on an attached network.

@olljanat
Contributor Author

olljanat commented May 11, 2019

(reserved for my derek commands)

@codecov

codecov bot commented May 11, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@0805242).
The diff coverage is 0%.

@@            Coverage Diff            @@
##             master   #39204   +/-   ##
=========================================
  Coverage          ?   37.04%           
=========================================
  Files             ?      612           
  Lines             ?    45495           
  Branches          ?        0           
=========================================
  Hits              ?    16853           
  Misses            ?    26352           
  Partials          ?     2290

@olljanat force-pushed the fix-hostname-dns-resolution branch from 5a486bf to c74ba11 on May 22, 2019 17:18

Commit message (excerpt): "…name which happens if user manually specify hostname"

Signed-off-by: Olli Janatuinen <[email protected]>
@olljanat
Contributor Author

@vdemeester looks like you were chosen to review this one, so PTAL.

Member

@vdemeester vdemeester left a comment


LGTM 👍

Member

@yongtang yongtang left a comment


LGTM

@thaJeztah
Member

I recall this was originally by design: the hostname being local to the container, not to the network (it might've been because container hostnames are not checked to be unique, so multiple containers with the same hostname would collide; sure, the same applies when using --network-alias, but in that case it's an explicit choice by the user).

@mavenugo perhaps you know more about that decision from the past?

@olljanat
Contributor Author

might've been because container hostnames are not checked to be unique, therefore multiple containers with the same hostname would collide

@thaJeztah this PR did not change that behavior, because --hostname also adds a row to /etc/hosts, which has higher priority during name resolution, so if multiple containers have the same hostname they will always get their own IP in response to name resolution.

The only place where behavior really changes is for containers whose hostnames do not collide.

@pvtpogopuschel

How do I figure out when this will be released in Docker? I have checked the changelogs of the past months, and it seems it has not made it in so far.

@olljanat
Contributor Author

olljanat commented Sep 5, 2019

@pvtpogopuschel Based on the version number. 19.03 means that its code freeze was in March this year (03/2019), so any code changes merged to Moby after that will go into the next major version. The only exception is critical bug fixes, which are listed in the release notes.

So, as this was merged in June, it will be released as part of a version > 19.06. I guess that will be 19.09, but I'm not fully sure.

@pvtpogopuschel

@olljanat Duh! I could have guessed that myself! Thank you very much! :)

@thaJeztah
Member

Fixes may be backported (but this is more a "feature/enhancement", so it likely won't qualify for that).

To know whether something made it into a Docker Engine release, the source of truth is the docker/engine repository: https://github.com/docker/engine/releases. You can pick a tag and check whether the commit/patch is in that branch; looking at the 19.03.2 tag, the change is not in there: https://github.com/docker/engine/blob/v19.03.2/daemon/container_operations.go#L677-L700

@thaJeztah
Member

This probably won't be backported as a bugfix, but in case it's useful for your use case, you can discover the IP addresses of all tasks of a service by doing a DNS lookup for tasks.<servicename>., e.g.

/ # dig tasks.mystack_minio.

; <<>> DiG 9.14.8 <<>> tasks.mystack_minio.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25763
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;tasks.mystack_minio.		IN	A

;; ANSWER SECTION:
tasks.mystack_minio.	600	IN	A	10.0.1.4
tasks.mystack_minio.	600	IN	A	10.0.1.3
tasks.mystack_minio.	600	IN	A	10.0.1.6
tasks.mystack_minio.	600	IN	A	10.0.1.5

;; Query time: 1 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Fri Dec 06 12:06:26 UTC 2019
;; MSG SIZE  rcvd: 167

And a reverse DNS lookup should give the associated DNS name:

/ # dig -x 10.0.1.6

; <<>> DiG 9.14.8 <<>> -x 10.0.1.6
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3463
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;6.1.0.10.in-addr.arpa.		IN	PTR

;; ANSWER SECTION:
6.1.0.10.in-addr.arpa.	600	IN	PTR	mystack_minio.1.aptd8oeiaacnwvsymr400wys5.mynetwork.

@treksler

treksler commented Dec 6, 2019

Thanks for the reply.

There are two problems with the reverse lookup approach for this use case:

1) minio expects the hostnames to be <somestring><slot>, e.g. minio1, minio2, ..., minio<n>.
There is no way to cluster with hostnames like mystack_minio.1.aptd8oeiaacnwvsymr400wys5.mynetwork.
(That works for rabbitmq, and possibly elasticsearch, but not minio.)

2) If the service is on multiple networks, the PTR record will be for the first network alphabetically.
If your service extends a base service, you can't be guaranteed which network will be in the PTR record.
For example,

  • if your stack name is mystack (using the default network) and the service is also on a network called proxy, the PTR record will be mystack_minio.1.aptd8oeiaacnwvsymr400wys5.mystack_default. (because mystack_default comes before proxy alphabetically)
  • BUT if your stack is called stack2, then it will be stack2_minio.1.aptd8oeiaacnwvsymr400wys5.proxy. (because proxy comes before stack2)

So if auto clustering works based on PTR records, and if hostnames have to match PTR records, as they do for rabbitmq DNS-based clustering, then you can't be guaranteed that it will work unless you can make the service live on only one network.

So, basically, for your suggestion to work reliably, I would need to be able to tell a network that it should be the default for PTR records, or to raise or lower a network's PTR priority somehow, so that I could ensure the default network is actually the default for PTR records as well. I'm not sure there is a way to do that (which works with docker-compose).
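The alphabetical selection described above can be illustrated with a small sketch (a simplified model of the observed behaviour, not the actual libnetwork code):

```go
package main

import (
	"fmt"
	"sort"
)

// ptrNetwork models the behaviour observed above: among the networks a
// task is attached to, the PTR record appears to use the alphabetically
// first network name.
func ptrNetwork(networks []string) string {
	sorted := append([]string{}, networks...)
	sort.Strings(sorted)
	return sorted[0]
}

func main() {
	fmt.Println(ptrNetwork([]string{"mystack_default", "proxy"})) // mystack_default
	fmt.Println(ptrNetwork([]string{"stack2_default", "proxy"}))  // proxy
}
```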

I briefly looked at how this PR is implemented. It seems to put the configured hostname into an array of aliases. This implies that a possible workaround would be to use aliases for now, but I can't get that to work:

version: '3.7'
services:
  minio:
    image: minio/minio
    ports:
      - "9000:9000"
    environment:
      MINIO_ACCESS_KEY: minio
      MINIO_SECRET_KEY: minio123
    deploy:
      replicas: 4
    volumes:
      - minio-data:/data
    command: server http://minio{1...4}/data
    networks:
      default:
        aliases:
          - 'minio{{.Task.Slot}}'  
volumes:
  minio-data:
    name: '{{index .Service.Labels "com.docker.stack.namespace"}}_minio-data-{{.Task.Slot}}'

but that does not work; minio1, minio2, minio3 and minio4 do not resolve.

Anyway, I forked docker-ce and applied this PR to the 19.03 branch.
My patched version will work until this PR lands in a stable release.

https://github.com/treksler/docker-ce/releases/tag/0.0.0-20191205212647-9ac962bc37
UPDATE: this patched version appears to break cadvisor; not sure why
UPDATE: the rebuilt version appears to work with cadvisor
https://github.com/treksler/docker-ce/releases/tag/v19.03.5.3

@belfo

belfo commented Feb 4, 2020

@thaJeztah
any news on the Docker release containing this?

@SvenDowideit
Contributor

Frustratingly, {{.Task.Slot}} doesn't resolve to a number when deploy mode is global, so this only works for replicated services (you get useless values like mt4f8lpj54c2n849fak338byv).

@olljanat
Contributor Author

@SvenDowideit AFAIU nothing prevents modifying SwarmKit so that it would use the same task slot numbering. Someone just needs to provide a PR for it.

PS. Thanks for your work on RancherOS. It is my favorite Linux for running Docker containers.

@SvenDowideit
Contributor

SvenDowideit commented Feb 17, 2020

RancherOS :)

I kinda think that task.slotid was once a number, and that it was changed - see moby/swarmkit#1554 and #34658 (implemented at https://github.com/docker/swarmkit/blob/master/manager/orchestrator/jobs/global/reconciler.go#L175, but with interesting hidden talents like https://github.com/docker/swarmkit/blob/master/api/naming/naming.go#L26)

^^^^^ isn't correct - it's largely @dperny's new (unreleased) swarm jobs code. What I should have been looking at is https://github.com/docker/swarmkit/blob/master/manager/orchestrator/update/updater.go#L341, where the SlotID seems to still be set, but the NodeID is also set - that gives me more hope...

There's not really enough info written down, @stevvooe @aaronlehmann @aluzzardi @dongluochen - and they're pretty dismissive about the user-facing side effects (by which I mean, I can't see any attempt to consider how not having an equivalence to non-global tasks could be a good or bad thing).

IDK - more information desired...

@SvenDowideit
Contributor

I keep having circular concerns - {{.Task.Slot}} is a bad proxy (but the only one) for what I'm trying to have, which is a way to pin a DNS name to a known sequence of names.

https://github.com/docker/swarmkit/blob/master/manager/orchestrator/jobs/replicated/reconciler.go#L194 shows that it works for replicated tasks,

but with global tasks there's this mental image that the DNS name and the magic zero/one-indexed number are also pinned to the node - which falls down extremely hard when a new node meets the constraints, or an old node stops meeting the constraints.

Reading around the code, it's really readable and well commented, but still, I don't feel I have enough information about why.

(@dperny omg, I'm obviously actually reading your new jobs code; updating the above for the non-jobs case)

@SvenDowideit
Contributor

@olljanat I'd love input; I'm still contemplating how I feel about the compromises - it's essentially creating an auto-generated {{.Node.Labels.NodeIdx}} (I build stuff for non-docker users to consume, so I want a minimum of "oh, and add this label here" for the default case).

The code doesn't really look like it'd be too hard to move from "Task.Slot==0 means global task" to using something else.

@olljanat
Contributor Author

@SvenDowideit that is a tricky one. If we add NodeIdx, how should those indexes be handled when more nodes are added, and especially if/when old ones are removed? Maybe it would help if you described a use case where you plan to use it?

One solution we have used is engine labels, which are automatically set by server provisioning, so we are able to re-create those nodes while the labels remain the same, and there is no need to update service constraints (a nice option if you prefer to re-provision servers instead of updating existing ones). But as I write this, I just noticed that service templates don't support {{.Engine.Labels}}, so support for that, and probably {{.Node.Labels}} too, should be added.

@JKJameson

JKJameson commented Mar 7, 2020

@thaJeztah Hours upon hours of testing wasted on something that has already been fixed. I am absolutely livid at Docker project management right now. This should have been in 19.09, but we're all still frozen on 19.03, so LGTM in 19.03. As many have stated above, I also agree this is a critical bug fix (erroneous functionality). Squash it.

@treksler You're a legend. Thank you for the release.

@JKJameson

JKJameson commented Mar 11, 2020

This doesn't seem to work with services deployed in "global" mode; it only works with services deployed in "replicated" mode.

@zopar

zopar commented Oct 29, 2020

;; ANSWER SECTION:
6.1.0.10.in-addr.arpa. 600 IN PTR mystack_minio.1.aptd8oeiaacnwvsymr400wys5.mynetwork.

@thaJeztah
A question please:
If you have 2 networks attached to the same container, which one is used by DNS? Based on which criteria?
Suppose you have mynetwork and mynetworkB in the stack: will the reverse DNS reply
mystack_minio.1.aptd8oeiaacnwvsymr400wys5.mynetwork.
or
mystack_minio.1.aptd8oeiaacnwvsymr400wys5.mynetworkB. ?

Is it predictable?

@markz0r

markz0r commented Jan 22, 2021

This is a bad change; it should be a flaggable option, not the default behaviour (in my opinion).
Also, thanks for being a maintainer.

@olljanat
Contributor Author

olljanat commented Jan 22, 2021

@markz0r can you describe a use case where this would be harmful? I'm not able to imagine one. After all, we are just adding CNAME records to DNS with this change.

Note this change is now included in the 20.10.x versions, so we cannot easily remove or change it anymore.

@markz0r

markz0r commented Jan 22, 2021

@markz0r can you describe a use case where this would be harmful? I'm not able to imagine one. After all, we are just adding CNAME records to DNS with this change.

Note this change is now included in the 20.10.x versions, so we cannot easily remove or change it anymore.

Hi @olljanat, thanks for the quick reply; the specific use case where this was harmful for me:
I run gitlab and gitlab runners on an ec2 instance behind an ALB, and the ALB offers a valid certificate.
Previously, the gitlab-runner container would connect via the ALB using the FQDN gitlab.int.blah.com, which is also the hostname of the gitlab container.
With this change, the gitlab-runner no longer connects via the ALB, because - as you stated - you are adding a CNAME record. I had a self-signed cert on the internal gitlab endpoint (it was not being used prior to this change), so the gitlab-runners were now unable to register with gitlab.

I don't think the change needs to be reverted - but it can definitely have harmful impacts; it is a change in the default behaviour. If this were a flaggable option, I would agree with it.

Docker is used in a lot of places (because it's great), so you not being able to imagine any harmful side effects for every conceivable use case is not a good yardstick for the risks involved in changing default behaviour.

@zephyia

zephyia commented Jan 22, 2021

Hi @markz0r , @olljanat ,

I would have to agree with Mark here: this change is problematic. I would also argue it was unnecessary, as the behaviour is now inconsistent with how hostnames work in most non-docker scenarios. When I set a hostname on my Linux machine, DNS records are not automatically created (in fact, the /etc/hosts file isn't even automatically updated); it is my responsibility/choice as the administrator of the system to decide whether I need my hostname to resolve or not. In fact, there may even be cases (e.g. NAT on a firewall) where I explicitly want the hostname to resolve to a different address than is present on the host.

That choice was afforded to me prior to this change in docker - if I needed the hostname to resolve, I could:

  • add it to the aliases of the container
  • delegate it in my own DNS or
  • indeed I could choose not to have it resolve at all

All the knobs and buttons needed to solve the problem of having the hostname resolve were present - perhaps all that was needed was a note in the documentation saying so. Instead, the behaviour was changed so that a field that previously had no impact on container DNS now affects DNS resolution. Granted, this issue was referenced from the release notes, but it is still an unnecessary change to a default behaviour that has existed for a long time and is consistent with how hostname functions outside of the docker environment (Windows AD being the exception).

@thaJeztah
Member

@markz0r @zephyia could one of you open a ticket with details about the issues you mentioned, for tracking?
