Skip to content

cmd/docker-proxy: UDP: reply to clients with original daddr#48571

Merged
thaJeztah merged 1 commit intomoby:masterfrom
akerouanton:proxy-udp-with-correct-saddr
Oct 15, 2024
Merged

cmd/docker-proxy: UDP: reply to clients with original daddr#48571
thaJeztah merged 1 commit intomoby:masterfrom
akerouanton:proxy-udp-with-correct-saddr

Conversation

@akerouanton
Copy link
Copy Markdown
Member

@akerouanton akerouanton commented Oct 2, 2024

See:

- What I did

When a UDP server is running on a multihomed server, as is the case with pretty much all Docker hosts (eg. eth0 + docker0), the kernel has to choose which source address is used when replying to a UDP client. But that process is based on heuristics and is fallible.

If the address picked doesn't match the original destination address used by the client, it'll drop the datagram and return an ICMP Port Unreachable.

To prevent that, we need to:

  • setsockopt(IP_PKTINFO) on proxy's sockets.
  • Extract the original destination address from an ancillary message every time a new 'UDP connection' is 'established' (ie. every time we insert a new entry into the UDP conntrack table).
  • And finally, pass a control message containing the desired source address to the kernel, every time we send a response back to the client.

Also, update the inline comment on read errors in (*UDPProxy).Run(). This comment was misleadingly referencing ECONNREFUSED - Linux's UDP implementation never returns this error (see here). Instead, state why net.ErrClosed is perfectly fine and doesn't need to be logged (although, docker-proxy currently logs to nowhere).

- How to verify it

A new integration test TestPortMappedHairpinUDP has been added.

- Description for the changelog

- UDP ports published by a container are now reliably accessible by containers on other networks, via the host's public IP address.

@akerouanton akerouanton added this to the 28.0.0 milestone Oct 2, 2024
@akerouanton akerouanton self-assigned this Oct 2, 2024
@akerouanton akerouanton marked this pull request as draft October 2, 2024 19:48
@akerouanton akerouanton force-pushed the proxy-udp-with-correct-saddr branch 2 times, most recently from d8fe567 to 7d0ba4d Compare October 7, 2024 15:05
@akerouanton akerouanton marked this pull request as ready for review October 7, 2024 19:29
Copy link
Copy Markdown
Contributor

@robmry robmry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the changelog comment, maybe ...

- UDP ports published by a container are now reliably accessible by
  containers on other networks, via the host's public IP address.

func readDestFromCmsg(oob []byte, ipVer ipVersion) (_ net.IP, err error) {
defer func() {
// In case of partial upgrade / downgrade, docker-proxy could read
// control messages from a socket which doesn't have the sockopt
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't leave docker-proxy running when the daemon stops, it'll always be a new socket I think? (Maybe I missed the point?!)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is about mixed versions (of dockerd and docker-proxy). If there's a mismatch, the proxy might read control messages from a socket that doesn't have the IP_PKTINFO opt set (in which case, it'll have a length of 0).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, yes - got it! Thank you.

Comment thread cmd/docker-proxy/udp_proxy_linux.go Outdated
Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just some first set of comments; might want to look more in-depth after this

Comment on lines +80 to +83
if proxy.ipVer == ip4 {
cm := &ipv4.ControlMessage{Src: serverAddr}
oob = cm.Marshal()
} else {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps silly / overly cautious comment; I notice that for these we default to IPv6 (i.e., if not explicit ipv4, then assume ipv6); wondering;

  • if we could have any code-path where ipVer is accidentally not set (and we now default to IPv6)
  • Given that Ipv4 was previously our "only" supported option; would it make sense to reverse the default?
  • Or (for code-paths that allow it) have an error (or WARNING) "invalid ipVer ?

Copy link
Copy Markdown
Member Author

@akerouanton akerouanton Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we could have any code-path where ipVer is accidentally not set (and we now default to IPv6)

No, we can't. The only place where the struct UDPProxy is initialized is in NewUDPProxy(), and this function is only called from a single place (from newProxy()). There, ipVer is determined based on the -host-ip flag.

Given that Ipv4 was previously our "only" supported option; would it make sense to reverse the default?

They're equally safe -- ipv4.ControlMessage.Marshal() and ipv6.ControlMessage.Marshal() return an empty buffer if Src has a mismatching address family (see here for ipv4, and here for ipv6).

Comment on lines +84 to +85
cm := &ipv6.ControlMessage{Src: serverAddr}
oob = cm.Marshal()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started looking at the differences between &ipv6.ControlMessage -> Marshal() and ipv6.NewControlMessage below, but not familiar enough with this. Was mostly wondering if this code could be handled by ipv6.NewControlMessage alone, but maybe not.

Copy link
Copy Markdown
Member Author

@akerouanton akerouanton Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ipv4.NewControlMessage() / ipv6.NewControlMessage() both take a single argument specifying which types of ancillary data will be read and returns a []byte buffer of the appropriate size.

Right-sizing the buffer passed to recvmsg is somewhat important because the sysctl net.core.optmem_max defines the maximum amount of ancillary data each socket can hold.

So, nope, we can't use NewControlMessage to return a buffer containing options.

Comment thread cmd/docker-proxy/udp_proxy_linux.go Outdated
@akerouanton akerouanton force-pushed the proxy-udp-with-correct-saddr branch 3 times, most recently from 635f20d to 372bde8 Compare October 9, 2024 09:50
Comment on lines +219 to +223
res := container.RunAttach(clientCtx, t, c,
container.WithNetworkMode(clientNetName),
container.WithCmd("nslookup", "foobar.internal", net.JoinHostPort(hostAddr, hostPort)),
)
defer c.ContainerRemove(ctx, res.ContainerID, containertypes.RemoveOptions{Force: true})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Doesn't look like we need this container later, so maybe just use container.WithAutoRemove?

When a UDP server is running on a multihomed server, as is the case with
pretty much _all_ Docker hosts (eg. eth0 + docker0), the kernel has to
choose which source address is used when replying to a UDP client. But
that process is based on heuristics and is fallible.

If the address picked doesn't match the original destination address
used by the client, it'll drop the datagram and return an ICMP Port
Unreachable.

To prevent that, we need to:

- `setsockopt(IP_PKTINFO)` on proxy's sockets.
- Extract the original destination address from an ancillary message
  every time a new 'UDP connection' is 'established' (ie. every time we
  insert a new entry into the UDP conntrack table).
- And finally, pass a control message containing the desired source
  address to the kernel, every time we send a response back to the
  client.

Also, update the inline comment on read errors in `(*UDPProxy).Run()`.
This comment was misleadingly referencing ECONNREFUSED - Linux's UDP
implementation never returns this error (see [1]). Instead, state why
`net.ErrClosed` is perfectly fine and doesn't need to be logged
(although, docker-proxy currently logs to nowhere).

[1]: https://github.com/search?q=repo%3Atorvalds%2Flinux+ECONNREFUSED+path%3A%2F%5Enet%5C%2F%28ipv4%7Cipv6%29%5C%2F%28udp%7Ctcp%29%2F&type=code

Signed-off-by: Albin Kerouanton <[email protected]>
@akerouanton akerouanton force-pushed the proxy-udp-with-correct-saddr branch from 372bde8 to 6c6174b Compare October 15, 2024 10:42
Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants