Skip to content

libnet/d/bridge: port mapping: proxy LL connections #48570

Merged
akerouanton merged 1 commit intomoby:masterfrom
akerouanton:proxy-LL-connections
Oct 8, 2024
Merged

libnet/d/bridge: port mapping: proxy LL connections #48570
akerouanton merged 1 commit intomoby:masterfrom
akerouanton:proxy-LL-connections

Conversation

@akerouanton
Copy link
Copy Markdown
Member

@akerouanton akerouanton commented Oct 2, 2024

- What I did

Link-local connections were DNATed like other non-loopback connections, but the kernel would drop them even before their reach the container.

This PR changes the DNAT rule inserted in ip6tables to exclude link-local addresses. Instead, these connections will be proxied by docker-proxy, at least if --userland-proxy=true.

If dockerd is started with the userland-proxy disabled, link-local port-bindings won't be supported (ie. silently discarded).

- How I did it

Debugged the issue with the help of iptables-tracer and pwru:

  • Initial SYN packet is DNAT'd to the container
  • But no SYN-ACK reach iptables on the return path
  • The initial SYN packet doesn't reach iptables in the container's netns
  • And pwru shows that a link-local address is picked as source addr, and a ULA as destination addr.
$ iptables-tracer -skip-modprobe -family ipv6 -filter 'tcp port 80'
INFO[0000] Waiting for trace events...
IN= OUT=lo SRC=fe80::42:acff:fe11:2 DST=fe80::42:acff:fe11:2 LEN=40 HOP=64 PROTO=TCP SPT=43178 DPT=80 FLAGS=SYN SEQ=2461312321 CSUM=53db
	raw OUTPUT NFMARK=0x0
		DEFAULT POLICY
		=> ACCEPT
	nat OUTPUT NFMARK=0x0
		MATCH RULE (#1): ! -d ::1/128 -m addrtype --dst-type LOCAL -j DOCKER
		=> DOCKER
	nat DOCKER NFMARK=0x0
		MATCH RULE (#2): ! -i br-a364fa5118cd -p tcp -m tcp --dport 80 -j DNAT --to-destination [fdda:b577:4b05::2]:80
		=> DNAT: --to-destination [fdda:b577:4b05::2]:80

$ conntrack -f ipv6 -L
conntrack v1.4.7 (conntrack-tools): 0 flow entries have been shown.

$ iptables-tracer -skip-modprobe -family ipv6 \
    -netns=$(docker inspect --format='{{ .NetworkSettings.SandboxKey }}' great_ride) \
    -filter 'tcp port 80'

$ docker run --privileged --rm -t --pid=host -v /sys/kernel/debug/:/sys/kernel/debug/ albinkerouanton006/pwru     pwru --filter-func kfree_skb_reason --output-stack --output-tuple 'tcp port 80'
2024/10/02 16:14:15 Attaching kprobes (via kprobe)...
1 / 1 [------------------------------------------------------------------------------------------] 100.00% ? p/s
2024/10/02 16:14:15 Attached (ignored 0)
2024/10/02 16:14:15 Listening for events..
SKB                CPU PROCESS          TUPLE FUNC
0xffff0000c22bd8c0 9   <empty>:4865     [fe80::42:acff:fe11:2]:37582->[fdda:b577:4b05::2]:80(tcp) kfree_skb_reason(SKB_DROP_REASON_NETFILTER_DROP)
kfree_skb_reason
ip6_xmit
inet6_csk_xmit
__tcp_transmit_skb
tcp_connect
tcp_v6_connect
__inet_stream_connect
inet_stream_connect
__sys_connect_file
__sys_connect
__arm64_sys_connect

- How to verify it

The integration test TestAccessPublishedPortFromHost introduced in #48545 was skipping LL addrs. That's not the case anymore.

- Description for the changelog

- Fix connections to published ports using link-local addresses.

Link-local connections were DNATed like other non-loopback connections,
but the kernel would drop them even before their reach the container.

This commit changes the DNAT rule inserted in ip6tables to exclude
link-local addresses. Instead, these connections will be proxied by
docker-proxy, at least if --userland-proxy=true.

If dockerd is started with the userland-proxy disabled, link-local
port-bindings won't be supported (ie. silently discarded).

Signed-off-by: Albin Kerouanton <[email protected]>
@akerouanton akerouanton force-pushed the proxy-LL-connections branch from 90fe6d8 to 7ca9e9b Compare October 7, 2024 15:59
@akerouanton akerouanton marked this pull request as ready for review October 7, 2024 19:29
Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah
Copy link
Copy Markdown
Member

We removed the cherry-pick label; this bug/issue has been around for a long time, and there were some merge-conflicts when back porting, that would not be worth significant effort to resolve those for a long standing issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants