Skip to content

Conversation

@rata
Copy link
Contributor

@rata rata commented Jul 6, 2023

userns.RunningInUserNS() checks if the code calling that function is running inside a user namespace. But we need to check if the container we will create will use a user namespace, in that case we need to disable the sysctl too (or we would need to take the userns mapping into account to set the IDs).

This was added in PR:
#6170

And the param documentation says it is not enabled when user namespaces are in use:
https://github.com/containerd/containerd/pull/6170/files#diff-91d0a4c61f6d3523b5a19717d1b40b5fffd7e392d8fe22aed7c905fe195b8902R118

I'm not sure if the intention was to disable this if containerd is running inside a userns (rootless, if that is even supported) or just when the pod has user namespaces.

Out of an abundance of caution, I'm keeping the userns.RunningInUserNS() so it is still not used if containerd runs inside a user namespace.

With this patch and "enable_unprivileged_icmp = true" in the config, running containerd as root on the host, pods with user namespaces start just fine. Without this patch they fail with:

	... failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: write /proc/sys/net/ipv4/ping_group_range: invalid argument: unknown

Thanks a lot to Andy on the k8s slack for reporting the issue. He also mentions he hits this with k3s on a default installation (the param is off by default on containerd, but k3s turns that on by default it seems). He also debugged which part of the stack was setting that sysctl, found the PR that added this code in containerd and a workaround (to turn the bool off). It was very valuable and a great report :)

@k8s-ci-robot
Copy link

Hi @rata. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rata
Copy link
Contributor Author

rata commented Jul 6, 2023

I'll create a backport for 1.7 when this is merged, if you all agree :)

@rata rata force-pushed the rata/userns-fixes branch 2 times, most recently from d7ae6c5 to bc9b45f Compare July 6, 2023 12:10
rata added 2 commits July 6, 2023 14:20
userns.RunningInUserNS() checks if the code calling that function is
running inside a user namespace. But we need to check if the container
we will create will use a user namespace, in that case we need to
disable the sysctl too (or we would need to take the userns mapping into
account to set the IDs).

This was added in PR:
        containerd#6170

And the param documentation says it is not enabled when user namespaces
are in use:
        https://github.com/containerd/containerd/pull/6170/files#diff-91d0a4c61f6d3523b5a19717d1b40b5fffd7e392d8fe22aed7c905fe195b8902R118

I'm not sure if the intention was to disable this if containerd is
running inside a userns (rootless, if that is even supported) or just
when the pod has user namespaces.

Out of an abundance of caution, I'm keeping the userns.RunningInUserNS()
so it is still not used if containerd runs inside a user namespace.

With this patch and "enable_unprivileged_icmp = true" in the config,
running containerd as root on the host, pods with user namespaces start
just fine. Without this patch they fail with:
        ... failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: w
 /proc/sys/net/ipv4/ping_group_range: invalid argument: unknown

Thanks a lot to Andy on the k8s slack for reporting the issue. He also
mentions he hits this with k3s on a default installation (the param
is off by default on containerd, but k3s turns that on by default it
seems). He also debugged which part of the stack was setting that
sysctl, found the PR that added this code in containerd and a workaround
(to turn the bool off).

Signed-off-by: Rodrigo Campos <[email protected]>
@rata rata force-pushed the rata/userns-fixes branch 3 times, most recently from abfdbd9 to c17d3bd Compare July 6, 2023 13:53
@rata
Copy link
Contributor Author

rata commented Jul 6, 2023

CI seems like a flake/something broken in main and unrelated to this PR.

CI / Linux Integration (io.containerd.runc.v2, crun, sandboxed is failing but, besides sandboxed being untouched by this PR (only a comment added), reverting all the changes in this PR (so no diff) also makes the CI fails in that test (the error seems unrelated to this PR too, see it here: https://github.com/containerd/containerd/actions/runs/5476033300/jobs/9973114117?pr=8779)

@estesp estesp merged commit 3c250cb into containerd:main Jul 6, 2023
@rata rata deleted the rata/userns-fixes branch July 7, 2023 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants