Add resolver for default bridge, remove default nameservers#47602
Add resolver for default bridge, remove default nameservers#47602akerouanton merged 2 commits intomoby:masterfrom
Conversation
59b678a to
6b8bdde
Compare
|
@robmry this is probably changelog-worthy? (and may need "impact/docs" as well?); if so, can you add those labels and a description for the change-log? |
Will do ... I haven't filled in any of the description yet, the PR's still draft - wanted the full test run before committing! |
eab8b25 to
38b34a7
Compare
| if actual != "nameserver 127.0.0.11 options ndots:3" { | ||
| c.Fatalf("expected 'nameserver 127.0.0.11 options ndots:3', but says: %q", actual) | ||
| } | ||
| if !strings.Contains(out, "ExtServers: [1.1.1.1]") { |
There was a problem hiding this comment.
Nothing urgent, but I saw the 1.1.1.1 and for a second thought we still set some default somewhere; but it looks like this is the IP we set as part of the test.
It made me thinking though; should we start practicing "best practice" and use designated example IP-addresses for these; similar to how we use designated example domains? See https://datatracker.ietf.org/doc/html/rfc5737
There was a problem hiding this comment.
Yes, a tidy-up of this and other old tests would be good. (I don't think it needs to be tied to this change though.)
There was a problem hiding this comment.
For sure; nothing urgent! My curiosity here was if we depended on those resolvers to make the test work (and in that case possibly having CI instability if those resolvers would be unreachable). That doesn't appear to be the case here (we only check the content of the file), but I guess having "designated" IP-addresses could possibly help taking away that confusion
LOL, that said; it's much easier to recognise "example.com" than one of those IP-addresses (we could have a const for them but that may be a bit too much.
| // ResolvConf is the path to the configuration of the host resolver | ||
| ResolvConf string `json:"resolv-conf,omitempty"` |
There was a problem hiding this comment.
For a minute, I thought that an undocumented feature was sneaked in through e353e7e, because all the code seems to indicate that this option is configurable through daemon.json;
- it's an exported field, and it has a
jsonlabel ("resolv-conf") setupResolvConfchecks if the value is set before applyingresolvconf.PathLines 139 to 149 in 2ebf191
But it looks like we're saved here by the fact that there's no flag defined for the daemon;
mkdir -p /etc/docker
cp /etc/resolv.conf /etc/docker/resolv.conf
echo '# THIS IS CUSTOM' >> /etc/docker/resolv.conf
echo '{"debug": true, "resolv-conf":"/etc/docker/resolv.conf"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives don't match any configuration option: resolv-confThat said; there's one place this field is currently used, and that's when reloading the daemon config; in that case the daemon logs print the reloaded configuration; is that a problem? Should we still have it visible what the daemon used as resolv.conf ?
kill -s SIGHUP $(pidof dockerd)
INFO[2024-05-29T12:40:23.968890416Z] Reloaded configuration: {"pidfile":"/var/run/docker.pid","data-root":"/var/lib/docker","exec-root":"/var/run/docker","group":"docker","max-concurrent-downloads":3,"max-concurrent-uploads":5,"max-download-attempts":5,"shutdown-timeout":15,"debug":true,"hosts":["unix:///var/run/docker.sock"],"log-level":"info","log-format":"text","swarm-default-advertise-addr":"","swarm-raft-heartbeat-tick":0,"swarm-raft-election-tick":0,"metrics-addr":"","host-gateway-ip":"172.18.0.1","log-driver":"json-file","mtu":1500,"ip":"0.0.0.0","icc":true,"iptables":true,"ip-forward":true,"ip-masq":true,"userland-proxy":true,"userland-proxy-path":"/usr/local/bin/docker-proxy","default-address-pools":{"Values":null},"network-control-plane-mtu":1500,"experimental":false,"containerd":"/var/run/docker/containerd/containerd.sock","builder":{"GC":{},"Entitlements":{}},"containerd-namespace":"moby","containerd-plugin-namespace":"plugins.moby","default-runtime":"runc","seccomp-profile":"builtin","default-shm-size":67108864,"default-ipc-mode":"private","default-cgroupns-mode":"host","resolv-conf":"/etc/resolv.conf","proxies":{}}There was a problem hiding this comment.
Sorry - I missed this, or forgot to reply.
It's quite a while since I looked at this PR, so had to remind myself ... before this change, if the host's resolver was systemd's 127.0.0.53 we couldn't point at it from a container's resolv.conf on the default network. So, in that case, we based our config on systemd's alternative resolv.conf.
Now, even on the default network the internal resolver can use the systemd resolver directly, we don't need to know about its upstreams - and we only ever read /etc/resolv.conf (ignoring the undocumented env-var that's intended for tests to use).
So, I don't think continuing to log it would be very useful.
daemon/container_operations_unix.go
Outdated
| // as on the host itself. The container gets a copy of these files. | ||
| // In host-mode networking, the container does not have its own networking | ||
| // namespace, so `/etc/hosts` should be the same as on the host itself. Setting | ||
| // OptionOriginHostsPath means the container will get a copy of the host's file. |
There was a problem hiding this comment.
| // OptionOriginHostsPath means the container will get a copy of the host's file. | |
| // OptionOriginHostsPath means the container will get a copy of `/etc/hosts` from the host filesystem. |
| // Expect the external DNS server to respond to a request from the container. | ||
| res, err := container.Exec(ctx, c, ctrId, []string{"nslookup", "test.example"}) | ||
| assert.NilError(t, err) | ||
| assert.Check(t, is.Equal(res.ExitCode, 0)) | ||
| assert.Check(t, is.Contains(res.Stdout(), network.DNSRespAddr)) | ||
|
|
||
| // Expect the external DNS server to respond to a request from the container | ||
| // for the container's own name - it won't be recognised as a container name | ||
| // because there's no service resolution on the default bridge. | ||
| res, err = container.Exec(ctx, c, ctrId, []string{"nslookup", ctrName}) | ||
| assert.NilError(t, err) | ||
| assert.Check(t, is.Equal(res.ExitCode, 0)) | ||
| assert.Check(t, is.Contains(res.Stdout(), network.DNSRespAddr)) | ||
|
|
||
| // Check that inspect output has no DNSNames for the container. | ||
| inspect := container.Inspect(ctx, t, c, ctrId) | ||
| net, ok := inspect.NetworkSettings.Networks[networktypes.NetworkBridge] | ||
| assert.Check(t, ok, "expected to find bridge network in inspect output") | ||
| assert.Check(t, is.Nil(net.DNSNames)) |
There was a problem hiding this comment.
These smell like three distinct subtests. It would be nice if the pass/fail status was reported independently for each, to make it easier to narrow down the root cause.
38b34a7 to
6f00ac7
Compare
I think it should wait for us to do something about buidkit too ... I've added a comment. (Also, don't want to imply that this change is required for compliance with GDPR or PCI-DSS. Internal advice is that it isn't.) |
6f00ac7 to
d44269e
Compare
| } | ||
|
|
||
| // Called when an endpoint has joined the sandbox. | ||
| func (sb *Sandbox) updateDNS(ipv6Enabled bool) error { |
There was a problem hiding this comment.
- The windows implementation of this method can also be deleted. See
moby/libnetwork/sandbox_dns_windows.go
Lines 25 to 27 in c6aaabc
- I think the following TODO comment can be updated / removed:
moby/libnetwork/sandbox_dns_unix.go
Lines 366 to 371 in c6aaabc
d44269e to
630930b
Compare
Until now, containers on the default bridge network have been configured to talk directly to external DNS servers - their resolv.conf files have either been populated with nameservers from the host's resolv.conf, or with servers from '--dns' (or with Google's nameservers as a fallback). This change makes the internal bridge more like other networks by using the internal resolver. But, the internal resolver is not populated with container names or aliases - it's only for external DNS lookups. Containers on the default network, on a host that has a loopback resolver (like systemd's on 127.0.0.53) will now use that resolver via the internal resolver. So, the logic used to find systemd's current set of resolvers is no longer needed by the daemon. Legacy links work just as they did before, using '/etc/hosts' and magic. (Buildkit does not use libnetwork, so it can't use the internal resolver. But it does use libnetwork/resolvconf's logic to configure resolv.conf. So, code to set up resolv.conf for a legacy networking without an internal resolver can't be removed yet.) Signed-off-by: Rob Murray <[email protected]>
Don't fall-back to Google's DNS servers in a network that has an internal resolver. Now the default bridge uses the internal resolver, the only reason a network started by the daemon should end up without any upstream servers is if the host's resolv.conf doesn't list any. In this case, the '--dns' option can be used to explicitly configure nameservers for a container if necessary. (Note that buildkit's containers do not have an internal resolver, so they will still set up Google's nameservers if the host has no resolvers that can be used in the container's namespace.) Signed-off-by: Rob Murray <[email protected]>
630930b to
d365702
Compare
- What I did
Internal resolver for default bridge network
Until now, containers on the default bridge network have been configured to talk directly to external DNS servers - their resolv.conf files have either been populated with nameservers from the host's resolv.conf, or with servers from '--dns' (or with Google's nameservers as a fallback).
This change makes the internal bridge more like other networks by using the internal resolver. But, the internal resolver is not populated with container names or aliases - it's only for external DNS lookups.
Containers on the default network, on a host that has a loopback resolver (like systemd's on 127.0.0.53) will now use that resolver via the internal resolver. So, the logic used to find systemd's current set of resolvers is no longer needed by the daemon.
Legacy links work just as they did before, using '/etc/hosts' and magic.
No default nameservers for internal resolver
Don't fall-back to Google's DNS servers in a network that has an internal resolver.
Now the default bridge uses the internal resolver, the only reason a network started by the daemon should end up without any upstream servers is if the host's resolv.conf doesn't list any. In this case, the '--dns' option can be used to explicitly configure nameservers for a container if necessary.
(Note that buildkit's containers do not have an internal resolver, so they will still set up Google's nameservers if the host has no resolvers that can be used in the container's namespace.)
- How I did it
Start the internal resolver, unless the network is 'host', 'container' or 'none'.
Deleted a bunch of code - but not the 'resolvconf' code that configures the resolver for a no-internal-resolver network, because that's still needed by buildkit.
- How to verify it
Now the default bridge has an internal resolver, there's a danger we could accidentally populate DNS names for its containers - added a regression test to catch that.
- Description for the changelog