
Commit 801e8c8

pkg/cri/server: Fix net.ipv4.ping_group_range with userns
userns.RunningInUserNS() checks if the code calling that function is running inside a user namespace. But we need to check if the container we will create will use a user namespace, in that case we need to disable the sysctl too (or we would need to take the userns mapping into account to set the IDs).

This was added in PR: #6170

And the param documentation says it is not enabled when user namespaces are in use:
https://github.com/containerd/containerd/pull/6170/files#diff-91d0a4c61f6d3523b5a19717d1b40b5fffd7e392d8fe22aed7c905fe195b8902R118

I'm not sure if the intention was to disable this if containerd is running inside a userns (rootless, if that is even supported) or just when the pod has user namespaces.

Out of an abundance of caution, I'm keeping the userns.RunningInUserNS() so it is still not used if containerd runs inside a user namespace.

With this patch and "enable_unprivileged_icmp = true" in the config, running containerd as root on the host, pods with user namespaces start just fine. Without this patch they fail with:

    ... failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: w /proc/sys/net/ipv4/ping_group_range: invalid argument: unknown

Thanks a lot to Andy on the k8s slack for reporting the issue. He also mentions he hits this with k3s on a default installation (the param is off by default on containerd, but k3s turns that on by default it seems). He also debugged which part of the stack was setting that sysctl, found the PR that added this code in containerd and a workaround (to turn the bool off).

Signed-off-by: Rodrigo Campos <[email protected]>
(cherry picked from commit 9bf5aec)
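The commit message mentions an alternative that was not implemented: taking the userns mapping into account when setting the IDs. The sketch below illustrates what such a check could look like. It is not part of the patch. The `idMap` type and both helper functions are hypothetical names, and it assumes (this is not stated in the commit) that the write is rejected whenever either end of the range has no mapping inside the pod's user namespace.

```go
package main

import "fmt"

// idMap is a hypothetical stand-in for one GID mapping entry of a
// user namespace (container ID start, host ID start, length).
type idMap struct {
	ContainerID uint32
	HostID      uint32
	Size        uint32
}

// gidMapped reports whether gid, as seen inside the user namespace,
// is covered by at least one mapping entry.
func gidMapped(gid uint32, maps []idMap) bool {
	for _, m := range maps {
		// Widen to uint64 so ContainerID+Size cannot overflow.
		start := uint64(m.ContainerID)
		end := start + uint64(m.Size)
		if uint64(gid) >= start && uint64(gid) < end {
			return true
		}
	}
	return false
}

// pingRangeMapped reports whether both ends of a candidate
// ping_group_range are mapped. Assumption: an unmapped endpoint is
// what leads to the "invalid argument" error quoted above.
func pingRangeMapped(low, high uint32, maps []idMap) bool {
	return gidMapped(low, maps) && gidMapped(high, maps)
}

func main() {
	// A typical-looking mapping of 65536 IDs (illustrative values only).
	maps := []idMap{{ContainerID: 0, HostID: 100000, Size: 65536}}

	// The value containerd writes today: the upper end is unmapped.
	fmt.Println(pingRangeMapped(0, 2147483647, maps)) // false

	// A range clamped to the mapped IDs.
	fmt.Println(pingRangeMapped(0, 65535, maps)) // true
}
```

The patch itself takes the simpler route of skipping the sysctl whenever the pod uses a user namespace.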
1 parent 6d148fb commit 801e8c8

2 files changed

Lines changed: 6 additions & 1 deletion

pkg/cri/sbserver/podsandbox/sandbox_run_linux.go

Lines changed: 3 additions & 0 deletions
@@ -148,6 +148,9 @@ func (c *Controller) sandboxContainerSpec(id string, config *runtime.PodSandboxC
 		if c.config.EnableUnprivilegedPorts && !ipUnprivilegedPortStart {
 			sysctls["net.ipv4.ip_unprivileged_port_start"] = "0"
 		}
+		// TODO (rata): We need to set this only if the pod will
+		// **not** use user namespaces either.
+		// This will be done when user namespaces is ported to sbserver.
 		if c.config.EnableUnprivilegedICMP && !pingGroupRange && !userns.RunningInUserNS() {
 			sysctls["net.ipv4.ping_group_range"] = "0 2147483647"
 		}

pkg/cri/server/sandbox_run_linux.go

Lines changed: 3 additions & 1 deletion
@@ -97,6 +97,7 @@ func (c *criService) sandboxContainerSpec(id string, config *runtime.PodSandboxC
 
 	usernsOpts := nsOptions.GetUsernsOptions()
 	uids, gids, err := parseUsernsIDs(usernsOpts)
+	var usernsEnabled bool
 	if err != nil {
 		return nil, fmt.Errorf("user namespace configuration: %w", err)
 	}
@@ -107,6 +108,7 @@ func (c *criService) sandboxContainerSpec(id string, config *runtime.PodSandboxC
 			specOpts = append(specOpts, customopts.WithoutNamespace(runtimespec.UserNamespace))
 		case runtime.NamespaceMode_POD:
 			specOpts = append(specOpts, oci.WithUserNamespace(uids, gids))
+			usernsEnabled = true
 		default:
 			return nil, fmt.Errorf("unsupported user namespace mode: %q", mode)
 		}
@@ -166,7 +168,7 @@ func (c *criService) sandboxContainerSpec(id string, config *runtime.PodSandboxC
 		if c.config.EnableUnprivilegedPorts && !ipUnprivilegedPortStart {
 			sysctls["net.ipv4.ip_unprivileged_port_start"] = "0"
 		}
-		if c.config.EnableUnprivilegedICMP && !pingGroupRange && !userns.RunningInUserNS() {
+		if c.config.EnableUnprivilegedICMP && !pingGroupRange && !userns.RunningInUserNS() && !usernsEnabled {
 			sysctls["net.ipv4.ping_group_range"] = "0 2147483647"
 		}
 	}
}
