cmd/docker-proxy: UDP: reply to clients with original daddr#48571
cmd/docker-proxy: UDP: reply to clients with original daddr#48571thaJeztah merged 1 commit intomoby:masterfrom
Conversation
d8fe567 to
7d0ba4d
Compare
robmry
left a comment
There was a problem hiding this comment.
For the changelog comment, maybe ...
- UDP ports published by a container are now reliably accessible by
containers on other networks, via the host's public IP address.
| func readDestFromCmsg(oob []byte, ipVer ipVersion) (_ net.IP, err error) { | ||
| defer func() { | ||
| // In case of partial upgrade / downgrade, docker-proxy could read | ||
| // control messages from a socket which doesn't have the sockopt |
There was a problem hiding this comment.
We don't leave docker-proxy running when the daemon stops, it'll always be a new socket I think? (Maybe I missed the point?!)
There was a problem hiding this comment.
This comment is about mixed versions (of dockerd and docker-proxy). If there's a mismatch, the proxy might read control messages from a socket that doesn't have the IP_PKTINFO opt set (in which case, it'll have a length of 0).
There was a problem hiding this comment.
Ahh, yes - got it! Thank you.
thaJeztah
left a comment
There was a problem hiding this comment.
just some first set of comments; might want to look more in-depth after this
| if proxy.ipVer == ip4 { | ||
| cm := &ipv4.ControlMessage{Src: serverAddr} | ||
| oob = cm.Marshal() | ||
| } else { |
There was a problem hiding this comment.
Perhaps silly / overly cautious comment; I notice that for these we default to IPv6 (i.e., if not explicit ipv4, then assume ipv6); wondering;
- if we could have any code-path where
ipVeris accidentally not set (and we now default to IPv6) - Given that Ipv4 was previously our "only" supported option; would it make sense to reverse the default?
- Or (for code-paths that allow it) have an error (or WARNING) "invalid
ipVer?
There was a problem hiding this comment.
if we could have any code-path where ipVer is accidentally not set (and we now default to IPv6)
No, we can't. The only place where the struct UDPProxy is initialized is in NewUDPProxy(), and this function is only called from a single place (from newProxy()). There, ipVer is determined based on the -host-ip flag.
Given that Ipv4 was previously our "only" supported option; would it make sense to reverse the default?
They're equally safe -- ipv4.ControlMessage.Marshal() and ipv6.ControlMessage.Marshal() return an empty buffer if Src has a mismatching address family (see here for ipv4, and here for ipv6).
| cm := &ipv6.ControlMessage{Src: serverAddr} | ||
| oob = cm.Marshal() |
There was a problem hiding this comment.
I started looking at the differences between &ipv6.ControlMessage -> Marshal() and ipv6.NewControlMessage below, but not familiar enough with this. Was mostly wondering if this code could be handled by ipv6.NewControlMessage alone, but maybe not.
There was a problem hiding this comment.
ipv4.NewControlMessage() / ipv6.NewControlMessage() both take a single argument specifying which types of ancillary data will be read and returns a []byte buffer of the appropriate size.
Right-sizing the buffer passed to recvmsg is somewhat important because the sysctl net.core.optmem_max defines the maximum amount of ancillary data each socket can hold.
So, nope, we can't use NewControlMessage to return a buffer containing options.
635f20d to
372bde8
Compare
| res := container.RunAttach(clientCtx, t, c, | ||
| container.WithNetworkMode(clientNetName), | ||
| container.WithCmd("nslookup", "foobar.internal", net.JoinHostPort(hostAddr, hostPort)), | ||
| ) | ||
| defer c.ContainerRemove(ctx, res.ContainerID, containertypes.RemoveOptions{Force: true}) |
There was a problem hiding this comment.
nit: Doesn't look like we need this container later, so maybe just use container.WithAutoRemove?
When a UDP server is running on a multihomed server, as is the case with pretty much _all_ Docker hosts (eg. eth0 + docker0), the kernel has to choose which source address is used when replying to a UDP client. But that process is based on heuristics and is fallible. If the address picked doesn't match the original destination address used by the client, it'll drop the datagram and return an ICMP Port Unreachable. To prevent that, we need to: - `setsockopt(IP_PKTINFO)` on proxy's sockets. - Extract the original destination address from an ancillary message every time a new 'UDP connection' is 'established' (ie. every time we insert a new entry into the UDP conntrack table). - And finally, pass a control message containing the desired source address to the kernel, every time we send a response back to the client. Also, update the inline comment on read errors in `(*UDPProxy).Run()`. This comment was misleadingly referencing ECONNREFUSED - Linux's UDP implementation never returns this error (see [1]). Instead, state why `net.ErrClosed` is perfectly fine and doesn't need to be logged (although, docker-proxy currently logs to nowhere). [1]: https://github.com/search?q=repo%3Atorvalds%2Flinux+ECONNREFUSED+path%3A%2F%5Enet%5C%2F%28ipv4%7Cipv6%29%5C%2F%28udp%7Ctcp%29%2F&type=code Signed-off-by: Albin Kerouanton <[email protected]>
372bde8 to
6c6174b
Compare
See:
- What I did
When a UDP server is running on a multihomed server, as is the case with pretty much all Docker hosts (eg. eth0 + docker0), the kernel has to choose which source address is used when replying to a UDP client. But that process is based on heuristics and is fallible.
If the address picked doesn't match the original destination address used by the client, it'll drop the datagram and return an ICMP Port Unreachable.
To prevent that, we need to:
setsockopt(IP_PKTINFO)on proxy's sockets.Also, update the inline comment on read errors in
(*UDPProxy).Run(). This comment was misleadingly referencing ECONNREFUSED - Linux's UDP implementation never returns this error (see here). Instead, state whynet.ErrClosedis perfectly fine and doesn't need to be logged (although, docker-proxy currently logs to nowhere).- How to verify it
A new integration test
TestPortMappedHairpinUDPhas been added.- Description for the changelog
- UDP ports published by a container are now reliably accessible by containers on other networks, via the host's public IP address.