Commit 433b1f9

libnet/d/bridge: port mappings: filter by input iface
When a NAT-based port mapping is created with a HostIP specified, we insert a DNAT rule in nat-DOCKER to replace the dest addr with the container IP. Then, in filter chains, we allow access to the container port for any packet not coming from the container's network itself (if hairpinning is disabled), nor from another host bridge.

However we don't set any rule that prevents a rogue neighbor that shares a L2 segment with the host, but not the one where the port binding is expected to be published, from sending packets destined to that HostIP.

For instance, if a port binding is created with HostIP == '127.0.0.1', this port should not be accessible from anything but the lo interface. That's currently not the case and this provides a false sense of security.

Since nat-DOCKER mangles the dest addr, and the nat table rejects DROP rules, this change adds rules into raw-PREROUTING to filter ingress packets destined to mapped ports based on the input interface, the dest addr and the dest port.

Interfaces are dynamically resolved when packets hit the host, thanks to iptables' addrtype extension. This extension does a fib lookup of the dest addr and checks that it's associated with the interface reached.

Also, when a proxy-based port mapping is created, as is the case when an IPv6 HostIP is specified but the container is only IPv4-capable, we don't set any sort of filtering. So the same issue might happen. The reason is a bit different - in that case, that's just how the kernel works. But, in order to stay consistent with NAT-based mappings, these rules are also applied.

The env var `DOCKER_DISABLE_INPUT_IFACE_FILTERING` can be set to any true-ish value to globally disable this behavior.

Signed-off-by: Albin Kerouanton <[email protected]>
1 parent d7ad7a9 commit 433b1f9

12 files changed

Lines changed: 513 additions & 25 deletions

Dockerfile

Lines changed: 1 addition & 0 deletions
@@ -542,6 +542,7 @@ RUN --mount=type=cache,sharing=locked,id=moby-dev-aptlib,target=/var/lib/apt \
     libprotobuf-c1 \
     libyajl2 \
     net-tools \
+    netcat-openbsd \
     patch \
     pigz \
     sudo \

daemon/info_unix.go

Lines changed: 7 additions & 0 deletions
@@ -9,6 +9,7 @@ import (
 	"os"
 	"os/exec"
 	"path/filepath"
+	"strconv"
 	"strings"
 
 	runcoptions "github.com/containerd/containerd/api/types/runc/options"
@@ -159,6 +160,12 @@ func (daemon *Daemon) fillPlatformInfo(ctx context.Context, v *system.Info, sysI
 	if !v.IPv4Forwarding {
 		v.Warnings = append(v.Warnings, "WARNING: IPv4 forwarding is disabled")
 	}
+	if filtering, _ := strconv.ParseBool(os.Getenv("DOCKER_DISABLE_INPUT_IFACE_FILTERING")); filtering {
+		v.Warnings = append(v.Warnings,
+			"WARNING: input interface filtering is disabled on port mappings, this might be insecure",
+			"DEPRECATED: DOCKER_DISABLE_INPUT_IFACE_FILTERING is deprecated and will be removed in a future release",
+		)
+	}
 	return nil
 }
 
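The check in this hunk relies on `strconv.ParseBool`, so "true-ish" here means the values that function accepts as true. The sketch below isolates that behaviour; `filteringDisabled` is a name made up for this example. Because the parse error is discarded, values such as `yes` or `on` are treated the same as unset.

```go
package main

import (
	"fmt"
	"strconv"
)

// filteringDisabled is a hypothetical wrapper around the same check as the
// hunk above: values that strconv.ParseBool parses as true disable
// filtering; parse errors are ignored, so unrecognized values keep it on.
func filteringDisabled(envValue string) bool {
	disabled, _ := strconv.ParseBool(envValue)
	return disabled
}

func main() {
	for _, v := range []string{"1", "t", "TRUE", "true", "yes", "on", "", "0"} {
		fmt.Printf("%q -> disabled=%v\n", v, filteringDisabled(v))
	}
}
```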

integration/internal/network/ops.go

Lines changed: 8 additions & 0 deletions
@@ -27,6 +27,14 @@ func WithIPv6() func(*network.CreateOptions) {
 	}
 }
 
+// WithIPv6Disabled makes sure IPv6 is disabled on the network.
+func WithIPv6Disabled() func(*network.CreateOptions) {
+	return func(n *network.CreateOptions) {
+		enable := false
+		n.EnableIPv6 = &enable
+	}
+}
+
 // WithInternal enables Internal flag on the create network request
 func WithInternal() func(*network.CreateOptions) {
 	return func(n *network.CreateOptions) {
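The new option follows the functional-options pattern used throughout this file. The sketch below shows how such an option could be applied, using a stand-in `CreateOptions` struct reduced to one field; `applyOptions` and `describe` are hypothetical helpers, not part of the moby test utilities. The pointer field is what lets an explicit `false` be told apart from "not set".

```go
package main

import "fmt"

// CreateOptions is a stand-in for network.CreateOptions, reduced to the one
// field this example needs.
type CreateOptions struct {
	EnableIPv6 *bool
}

// WithIPv6Disabled has the same body as the option added in the hunk above,
// retargeted at the stand-in type.
func WithIPv6Disabled() func(*CreateOptions) {
	return func(n *CreateOptions) {
		enable := false
		n.EnableIPv6 = &enable
	}
}

// applyOptions is a hypothetical helper that applies options in order.
func applyOptions(opts ...func(*CreateOptions)) CreateOptions {
	var c CreateOptions
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

// describe reports the three states a *bool field can be in.
func describe(c CreateOptions) string {
	switch {
	case c.EnableIPv6 == nil:
		return "unset"
	case *c.EnableIPv6:
		return "enabled"
	default:
		return "disabled"
	}
}

func main() {
	fmt.Println(describe(applyOptions()))
	fmt.Println(describe(applyOptions(WithIPv6Disabled())))
}
```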
Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
+## Container on a user-defined network, with a port published on a specific HostIP
+
+Adding a network running a container with a mapped port, equivalent to:
+
+    docker network create \
+      -o com.docker.network.bridge.name=bridge1 \
+      --subnet 192.0.2.0/24 --gateway 192.0.2.1 bridge1
+    docker run --network bridge1 -p 127.0.0.1:8080:80 --name c1 busybox
+
+The filter and nat tables are the same as with no HostIP specified.
+
+<details>
+<summary>Filter table</summary>
+
+    Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
+    num pkts bytes target prot opt in out source destination
+
+    Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
+    num pkts bytes target prot opt in out source destination
+    1 0 0 DOCKER-USER 0 -- * * 0.0.0.0/0 0.0.0.0/0
+    2 0 0 ACCEPT 0 -- * * 0.0.0.0/0 0.0.0.0/0 match-set docker-ext-bridges-v4 dst ctstate RELATED,ESTABLISHED
+    3 0 0 DOCKER-ISOLATION-STAGE-1 0 -- * * 0.0.0.0/0 0.0.0.0/0
+    4 0 0 DOCKER 0 -- * * 0.0.0.0/0 0.0.0.0/0 match-set docker-ext-bridges-v4 dst
+    5 0 0 ACCEPT 0 -- docker0 * 0.0.0.0/0 0.0.0.0/0
+    6 0 0 ACCEPT 0 -- bridge1 * 0.0.0.0/0 0.0.0.0/0
+
+    Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
+    num pkts bytes target prot opt in out source destination
+
+    Chain DOCKER (1 references)
+    num pkts bytes target prot opt in out source destination
+    1 0 0 ACCEPT 6 -- !bridge1 bridge1 0.0.0.0/0 192.0.2.2 tcp dpt:80
+    2 0 0 DROP 0 -- !docker0 docker0 0.0.0.0/0 0.0.0.0/0
+    3 0 0 DROP 0 -- !bridge1 bridge1 0.0.0.0/0 0.0.0.0/0
+
+    Chain DOCKER-ISOLATION-STAGE-1 (1 references)
+    num pkts bytes target prot opt in out source destination
+    1 0 0 DOCKER-ISOLATION-STAGE-2 0 -- docker0 !docker0 0.0.0.0/0 0.0.0.0/0
+    2 0 0 DOCKER-ISOLATION-STAGE-2 0 -- bridge1 !bridge1 0.0.0.0/0 0.0.0.0/0
+
+    Chain DOCKER-ISOLATION-STAGE-2 (2 references)
+    num pkts bytes target prot opt in out source destination
+    1 0 0 DROP 0 -- * bridge1 0.0.0.0/0 0.0.0.0/0
+    2 0 0 DROP 0 -- * docker0 0.0.0.0/0 0.0.0.0/0
+
+    Chain DOCKER-USER (1 references)
+    num pkts bytes target prot opt in out source destination
+    1 0 0 RETURN 0 -- * * 0.0.0.0/0 0.0.0.0/0
+
+
+    -P INPUT ACCEPT
+    -P FORWARD ACCEPT
+    -P OUTPUT ACCEPT
+    -N DOCKER
+    -N DOCKER-ISOLATION-STAGE-1
+    -N DOCKER-ISOLATION-STAGE-2
+    -N DOCKER-USER
+    -A FORWARD -j DOCKER-USER
+    -A FORWARD -m set --match-set docker-ext-bridges-v4 dst -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
+    -A FORWARD -j DOCKER-ISOLATION-STAGE-1
+    -A FORWARD -m set --match-set docker-ext-bridges-v4 dst -j DOCKER
+    -A FORWARD -i docker0 -j ACCEPT
+    -A FORWARD -i bridge1 -j ACCEPT
+    -A DOCKER -d 192.0.2.2/32 ! -i bridge1 -o bridge1 -p tcp -m tcp --dport 80 -j ACCEPT
+    -A DOCKER ! -i docker0 -o docker0 -j DROP
+    -A DOCKER ! -i bridge1 -o bridge1 -j DROP
+    -A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
+    -A DOCKER-ISOLATION-STAGE-1 -i bridge1 ! -o bridge1 -j DOCKER-ISOLATION-STAGE-2
+    -A DOCKER-ISOLATION-STAGE-2 -o bridge1 -j DROP
+    -A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
+    -A DOCKER-USER -j RETURN
+
+
+</details>
+
+<details>
+<summary>NAT table</summary>
+
+    Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
+    num pkts bytes target prot opt in out source destination
+    1 0 0 DOCKER 0 -- * * 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL
+
+    Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
+    num pkts bytes target prot opt in out source destination
+
+    Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
+    num pkts bytes target prot opt in out source destination
+    1 0 0 DOCKER 0 -- * * 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type LOCAL
+
+    Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
+    num pkts bytes target prot opt in out source destination
+    1 0 0 MASQUERADE 0 -- * !bridge1 192.0.2.0/24 0.0.0.0/0
+    2 0 0 MASQUERADE 0 -- * !docker0 172.17.0.0/16 0.0.0.0/0
+
+    Chain DOCKER (2 references)
+    num pkts bytes target prot opt in out source destination
+    1 0 0 RETURN 0 -- bridge1 * 0.0.0.0/0 0.0.0.0/0
+    2 0 0 RETURN 0 -- docker0 * 0.0.0.0/0 0.0.0.0/0
+    3 0 0 DNAT 6 -- !bridge1 * 0.0.0.0/0 127.0.0.1 tcp dpt:8080 to:192.0.2.2:80
+
+
+    -P PREROUTING ACCEPT
+    -P INPUT ACCEPT
+    -P OUTPUT ACCEPT
+    -P POSTROUTING ACCEPT
+    -N DOCKER
+    -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
+    -A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
+    -A POSTROUTING -s 192.0.2.0/24 ! -o bridge1 -j MASQUERADE
+    -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
+    -A DOCKER -i bridge1 -j RETURN
+    -A DOCKER -i docker0 -j RETURN
+    -A DOCKER -d 127.0.0.1/32 ! -i bridge1 -p tcp -m tcp --dport 8080 -j DNAT --to-destination 192.0.2.2:80
+
+
+</details>
+
+The raw table is:
+
+    Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
+    num pkts bytes target prot opt in out source destination
+    1 0 0 ACCEPT 6 -- * * 0.0.0.0/0 127.0.0.1 tcp dpt:8080 ADDRTYPE match dst-type LOCAL limit-in
+    2 0 0 DROP 6 -- * * 0.0.0.0/0 127.0.0.1 tcp dpt:8080
+
+    Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
+    num pkts bytes target prot opt in out source destination
+
+
+<details>
+<summary>iptables commands</summary>
+
+    -P PREROUTING ACCEPT
+    -P OUTPUT ACCEPT
+    -A PREROUTING -d 127.0.0.1/32 -p tcp -m tcp --dport 8080 -m addrtype --dst-type LOCAL --limit-iface-in -j ACCEPT
+    -A PREROUTING -d 127.0.0.1/32 -p tcp -m tcp --dport 8080 -j DROP
+
+
+</details>
+
+The difference from [port mapping with no HostIP][0] is:
+
+- An ACCEPT rule is added to the raw table's PREROUTING chain to accept packets
+  targeting the mapped port and coming from the interface that has the HostIP assigned.
+- A DROP rule is added after it, to drop packets that target the mapped port
+  but didn't pass the previous check.
+
+[0]: usernet-portmap.md
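To make the effect of the two raw-table rules concrete, here is a simplified model of the decision they encode. It is a sketch under stated assumptions: the kernel's FIB lookup behind `--limit-iface-in` is replaced by a plain map from interface name to assigned addresses, the protocol is assumed to be TCP, and `hostAddrs`, `mapping` and `rawVerdict` are names invented for this example.

```go
package main

import "fmt"

// hostAddrs maps an interface name to the addresses assigned to it.
// This stands in for the kernel's FIB lookup and is a simplification.
type hostAddrs map[string][]string

// mapping is the HostIP and host port a container port is published on.
type mapping struct {
	hostIP string
	port   int
}

// rawVerdict models the ACCEPT/DROP rule pair for a single mapping.
func rawVerdict(addrs hostAddrs, m mapping, inIface, dstIP string, dstPort int) string {
	if dstIP != m.hostIP || dstPort != m.port {
		// Neither rule matches, so the chain policy (ACCEPT) applies.
		return "ACCEPT"
	}
	for _, a := range addrs[inIface] {
		if a == dstIP {
			// Rule 1: the destination is local to the input interface.
			return "ACCEPT"
		}
	}
	// Rule 2: right address and port, wrong interface.
	return "DROP"
}

func main() {
	addrs := hostAddrs{
		"lo":   {"127.0.0.1"},
		"eth0": {"192.168.1.10"},
	}
	m := mapping{hostIP: "127.0.0.1", port: 8080}

	fmt.Println(rawVerdict(addrs, m, "lo", "127.0.0.1", 8080))
	fmt.Println(rawVerdict(addrs, m, "eth0", "127.0.0.1", 8080))
	fmt.Println(rawVerdict(addrs, m, "eth0", "192.168.1.10", 8080))
}
```

The second call corresponds to the rogue-neighbor case from the commit message: a packet addressed to 127.0.0.1:8080 arriving on `eth0` is dropped before the nat table can rewrite its destination.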

integration/network/bridge/iptablesdoc/index.md

Lines changed: 1 addition & 0 deletions
@@ -46,3 +46,4 @@ Scenarios:
 - [Container on a routed-mode network, with a published port](generated/usernet-portmap-routed.md)
 - [Container on a nat-unprotected network, with a published port](generated/usernet-portmap-natunprot.md)
 - [Swarm service, with a published port](generated/swarm-portmap.md)
+- [Container on a user-defined network, with a port published on a specific HostIP](generated/usernet-portmap-hostip.md)

integration/network/bridge/iptablesdoc/iptablesdoc_linux_test.go

Lines changed: 16 additions & 0 deletions
@@ -174,6 +174,18 @@ var index = []section{
 			},
 		}},
 	},
+	{
+		name: "usernet-portmap-hostip.md",
+		networks: []networkDesc{{
+			name: "bridge1",
+			containers: []ctrDesc{
+				{
+					name:         "c1",
+					portMappings: nat.PortMap{"80/tcp": {{HostIP: "127.0.0.1", HostPort: "8080"}}},
+				},
+			},
+		}},
+	},
 }
 
 // iptCmdType is used to look up iptCmds in the markdown (can't use an int
@@ -188,6 +200,8 @@ const (
 	iptCmdSFilterDocker4 iptCmdType = "SFilterDocker4"
 	iptCmdLNat4          iptCmdType = "LNat4"
 	iptCmdSNat4          iptCmdType = "SNat4"
+	iptCmdLRaw4          iptCmdType = "LRaw4"
+	iptCmdSRaw4          iptCmdType = "SRaw4"
 )
 
 var iptCmds = map[iptCmdType][]string{
@@ -198,6 +212,8 @@ var iptCmds = map[iptCmdType][]string{
 	iptCmdSFilterDocker4: {"iptables", "-S", "DOCKER"},
 	iptCmdLNat4:          {"iptables", "-nvL", "--line-numbers", "-t", "nat"},
 	iptCmdSNat4:          {"iptables", "-S", "-t", "nat"},
+	iptCmdLRaw4:          {"iptables", "-nvL", "--line-numbers", "-t", "raw"},
+	iptCmdSRaw4:          {"iptables", "-S", "-t", "raw"},
 }
 
 func TestBridgeIptablesDoc(t *testing.T) {
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+## Container on a user-defined network, with a port published on a specific HostIP
+
+Adding a network running a container with a mapped port, equivalent to:
+
+    docker network create \
+      -o com.docker.network.bridge.name=bridge1 \
+      --subnet 192.0.2.0/24 --gateway 192.0.2.1 bridge1
+    docker run --network bridge1 -p 127.0.0.1:8080:80 --name c1 busybox
+
+The filter and nat tables are the same as with no HostIP specified.
+
+<details>
+<summary>Filter table</summary>
+
+{{index . "LFilter4"}}
+
+{{index . "SFilter4"}}
+
+</details>
+
+<details>
+<summary>NAT table</summary>
+
+{{index . "LNat4"}}
+
+{{index . "SNat4"}}
+
+</details>
+
+The raw table is:
+
+{{index . "LRaw4"}}
+
+<details>
+<summary>iptables commands</summary>
+
+{{index . "SRaw4"}}
+
+</details>
+
+The difference from [port mapping with no HostIP][0] is:
+
+- An ACCEPT rule is added to the raw table's PREROUTING chain to accept packets
+  targeting the mapped port and coming from the interface that has the HostIP assigned.
+- A DROP rule is added after it, to drop packets that target the mapped port
+  but didn't pass the previous check.
+
+[0]: usernet-portmap.md
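The `{{index . "LRaw4"}}` placeholders in this template are Go `text/template` actions that look up a key in the data passed at render time. The sketch below shows that mechanism in isolation. It assumes the data is a plain `map[string]string`; the data type and rendering code of the real test are not shown in this diff, and `render` is a hypothetical helper.

```go
package main

import (
	"os"
	"strings"
	"text/template"
)

// tmplText is a reduced version of the template above.
const tmplText = `The raw table is:

{{index . "LRaw4"}}
`

// render is a hypothetical helper: it executes the template with a map of
// captured command outputs keyed by names such as "LRaw4".
func render(outputs map[string]string) (string, error) {
	tmpl, err := template.New("doc").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, outputs); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render(map[string]string{
		"LRaw4": "    Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)",
	})
	if err != nil {
		panic(err)
	}
	os.Stdout.WriteString(out)
}
```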
