Skip to content

Comments

Revert "libnet: setupDNS: don't overwrite user-modified resolv.conf"#51615

Merged
thaJeztah merged 3 commits intomoby:masterfrom
akerouanton:revert-51507
Nov 28, 2025
Merged

Revert "libnet: setupDNS: don't overwrite user-modified resolv.conf"#51615
thaJeztah merged 3 commits intomoby:masterfrom
akerouanton:revert-51507

Conversation

@akerouanton
Copy link
Member

@akerouanton akerouanton commented Nov 28, 2025

- What I did

On upgrade to v29.1.0, the Engine would consider that containers' /etc/resolv.conf were user-modified because hash files were out-of-sync with file contents written by a prior Engine version. This was caused by sb.rebuildDNS() never updating the hash file along with the file content.

In #51507, we fixed that (see commit eb18b39), and we added an extra check in sb.setupDNS() to ensure that user-modified /etc/resolv.conf aren't overwritten.

But, when a new Sandbox is created (as when a container is restarted), sb.setupDNS() is called first to copy the host's /etc/resolv.conf into the sandbox, and then sb.rebuildDNS() is used to set nameserver 127.0.0.11 and extract the list of upstream nameservers that should be configured on the embedded DNS server.

Prior to v29.1.0, sb.setupDNS() was called unconditionally — ensuring the /etc/resolv.conf and its hash file were in sync. Then, sb.rebuildDNS() would be called and always proceed — extracting the list of upstream nameservers.

With v29.1.0, sb.setupDNS() is skipped because of the out-of-sync hash file, which causes sb.rebuildDNS() to be skipped too, and the list of upstream nameservers to never be extracted.

Revert the whole PR.

- How to verify it

Both dig should work properly:

$ docker network create testnet
$ docker run -d --name test --network testnet nicolaka/netshoot top
$ docker exec -it test dig A google.com

# Restart the daemon

$ docker start test
$ docker exec -it test dig A google.com

- Human readable description for the release notes

Revert a PR breaking external DNS resolution on all custom bridge networks

This reverts commit eb18b39.

Signed-off-by: Albin Kerouanton <[email protected]>
…solver"

This reverts commit 937246a.

Signed-off-by: Albin Kerouanton <[email protected]>
@akerouanton akerouanton self-assigned this Nov 28, 2025
@github-actions github-actions bot added the area/daemon Core Engine label Nov 28, 2025
@akerouanton akerouanton changed the title Revert 51507 Revert "libnet: setupDNS: don't overwrite user-modified resolv.conf" Nov 28, 2025
@vvoland vvoland added this to the 29.1.1 milestone Nov 28, 2025
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah thaJeztah merged commit 9a84135 into moby:master Nov 28, 2025
218 checks passed
@b-reich
Copy link

b-reich commented Nov 28, 2025

@akerouanton is it planed to add an regression test for this?

@akerouanton akerouanton deleted the revert-51507 branch November 28, 2025 11:11
@akerouanton
Copy link
Member Author

@b-reich We don't have a proper way to write regression tests for upgrades unfortunately. But yeah, we talked about that and it's not the first time we missed that capability. We'd like to add it eventually.

At least, with this PR we're going back to a known-good state.

@zhangguanzhang
Copy link
Contributor

$ docker network create testnet
$ docker run -d --name test --network testnet nicolaka/netshoot top
$ docker exec -it test dig A google.com

# Restart the daemon

$ docker start test
$ docker exec -it test dig +short A google.com
64.233.170.100
64.233.170.138
64.233.170.102
64.233.170.101
64.233.170.139
64.233.170.113

I rolled back the code and tested the above steps, and found that it works normally. Are there any other steps that can be reproduced?

@robmry
Copy link
Contributor

robmry commented Dec 1, 2025

I rolled back the code and tested the above steps, and found that it works normally. Are there any other steps that can be reproduced?

Hi @zhangguanzhang - I'm not sure whether you're asking about verifying the revert, or reproducing the issue?

But, to repro the issue (in 29.1.0, without the revert) ... it's enough to create a container on a user-defined network, modify its /etc/resolv.conf, restart the container, and check external DNS. Note that on upgrade, dockerd will think all restarted containers have a modified resolv.conf, because older daemons didn't write the hash file when creating it.

@zhangguanzhang
Copy link
Contributor

I rolled back the code and tested the above steps, and found that it works normally. Are there any other steps that can be reproduced?

Hi @zhangguanzhang - I'm not sure whether you're asking about verifying the revert, or reproducing the issue?

But, to repro the issue (in 29.1.0, without the revert) ... it's enough to create a container on a user-defined network, modify its /etc/resolv.conf, restart the container, and check external DNS. Note that on upgrade, dockerd will think all restarted containers have a modified resolv.conf, because older daemons didn't write the hash file when creating it.

Thanks for the answer. So, if you're only using version 29.1.0 to reproduce the issue, you can use the following steps:

$ docker network create testnet
$ docker run -d --name test --network testnet nicolaka/netshoot top
$ docker exec -it test dig A google.com
$ docker exec -ti test sh -c 'echo >> /etc/resolv.conf'

# Restart the daemon

$ docker start test
$ docker exec -it test dig +short A google.com
# will be empty

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

29.1.0 breaks the embedded DNS resolver DNS fails in docker 29.1.0-1

6 participants