This duplicates the issue on the old CRI project repo, #6643, since we are still experiencing the same problem.
Description
When containerd is configured with privileged_without_host_devices=true and runc 1.0.0-rc91 or later is in use, we cannot run any containers inside a privileged DIND container.
We saw the following error when running a normal container in DIND on a node running containerd 1.4.3 with runc 1.0.0-rc95:
/ # docker container run alpine:3.7
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: process_linux.go:508: setting cgroup config for procHooks process caused: failed to write "c 5:1 rwm": write /sys/fs/cgroup/devices/docker/d933b364769b1857434209b2576d82c40871670730ab190d983c4b0b50493045/devices.allow: operation not permitted: unknown.
This happens because runc 1.0.0-rc91 removed the whitelisting of the /dev/console rule c 5:1 rwm, and oci.WithAllDevicesAllowed is not enabled for privileged containers when privileged_without_host_devices=true is configured. The device cgroup list is then:
/ # cat /sys/fs/cgroup/devices/devices.list
b *:* m
c *:* m
c 1:3 rwm
c 1:5 rwm
c 1:7 rwm
c 1:8 rwm
c 1:9 rwm
c 5:0 rwm
c 5:2 rwm
c 10:200 rwm
c 136:* rwm
With privileged=true and privileged_without_host_devices=false this is not a problem, because all devices are allowed:
/ # cat /sys/fs/cgroup/devices/devices.list
a *:* rwm
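The two listings above can be compared mechanically. The following is a minimal sketch in Go of a simplified model of cgroup v1 device matching, assuming a request is accepted only when a single entry of the list covers its type, device numbers, and every requested access bit. The names rule, parseRule, covers, and allowed are hypothetical and not part of runc or containerd.

```go
package main

import (
	"fmt"
	"strings"
)

// rule is one devices.list entry, e.g. "c 5:1 rwm".
// Hypothetical type for illustration only.
type rule struct {
	typ, major, minor, access string
}

// parseRule splits a line such as "c 5:1 rwm" into its fields.
func parseRule(line string) (rule, error) {
	f := strings.Fields(line)
	if len(f) != 3 {
		return rule{}, fmt.Errorf("malformed rule %q", line)
	}
	mm := strings.SplitN(f[1], ":", 2)
	if len(mm) != 2 {
		return rule{}, fmt.Errorf("malformed device number in %q", line)
	}
	return rule{typ: f[0], major: mm[0], minor: mm[1], access: f[2]}, nil
}

// covers reports whether r alone grants everything req asks for.
// Assumption: "a" matches any type and "*" matches any number.
func (r rule) covers(req rule) bool {
	if r.typ != "a" && r.typ != req.typ {
		return false
	}
	if r.major != "*" && r.major != req.major {
		return false
	}
	if r.minor != "*" && r.minor != req.minor {
		return false
	}
	for _, p := range req.access {
		if !strings.ContainsRune(r.access, p) {
			return false
		}
	}
	return true
}

// allowed reports whether any entry in list covers the request.
func allowed(list []string, request string) (bool, error) {
	req, err := parseRule(request)
	if err != nil {
		return false, err
	}
	for _, line := range list {
		r, err := parseRule(line)
		if err != nil {
			return false, err
		}
		if r.covers(req) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// The list reported with privileged_without_host_devices=true.
	restricted := []string{
		"b *:* m", "c *:* m", "c 1:3 rwm", "c 1:5 rwm", "c 1:7 rwm",
		"c 1:8 rwm", "c 1:9 rwm", "c 5:0 rwm", "c 5:2 rwm",
		"c 10:200 rwm", "c 136:* rwm",
	}
	// The list reported with privileged_without_host_devices=false.
	open := []string{"a *:* rwm"}

	for name, list := range map[string][]string{"restricted": restricted, "open": open} {
		ok, err := allowed(list, "c 5:1 rwm")
		fmt.Println(name, ok, err)
	}
}
```

Under this model the restricted list has no entry covering c 5:1 rwm (the closest, c *:* m, grants mknod only), which is consistent with the "operation not permitted" error when the nested runc writes that rule to devices.allow.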
Steps to reproduce the issue:
Using the pod spec:
apiVersion: v1
kind: Pod
metadata:
  name: dind-test
  namespace: default
spec:
  containers:
    - name: client
      image: docker:stable-dind
      command:
        - sleep
        - "86400"
      env:
        - name: DOCKER_HOST
          value: tcp://127.0.0.1:2375
    - name: dind
      image: docker:stable-dind
      env:
        - name: DOCKER_TLS_CERTDIR
      securityContext:
        privileged: true
and then exec-ing into the client container:
$ kubectl exec -it dind-test -c client -- sh
/ # docker container run alpine:3.7
Unable to find image 'alpine:3.7' locally
3.7: Pulling from library/alpine
5d20c808ce19: Pull complete
Digest: sha256:8421d9a84432575381bfabd248f1eb56f3aa21d9d7cd2511583c68c9b7511d10
Status: Downloaded newer image for alpine:3.7
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: process_linux.go:508: setting cgroup config for procHooks process caused: failed to write "c 5:1 rwm": write /sys/fs/cgroup/devices/docker/4dcb197f000dbe16e9aad23ae2c5105006b4e3e9ed41d5455436ee68cd606e13/devices.allow: operation not permitted: unknown.
/ #
Workaround
The workaround of adding a lifecycle hook to the privileged Kubernetes container to update the device cgroup still works:
lifecycle:
  postStart:
    exec:
      command:
        - /bin/sh
        - -c
        - echo "a *:* rwm" > /sys/fs/cgroup/devices/devices.allow
However, as things stand, this hook would have to be added to every privileged DIND container spec for the daemon to run any container at all.
Proposed Solution
Considering the comment about keeping the behaviour of privileged_without_host_devices unchanged (containerd/cri#1567 (comment)), the proposed solution is to add a new flag option and enable oci.WithAllDevicesAllowed on top of privileged_without_host_devices when that flag is enabled.
diff --git a/pkg/cri/config/config.go b/pkg/cri/config/config.go
index b1c04add3..3814d099c 100644
--- a/pkg/cri/config/config.go
+++ b/pkg/cri/config/config.go
@@ -54,6 +54,10 @@ type Runtime struct {
 	// PrivilegedWithoutHostDevices overloads the default behaviour for adding host devices to the
 	// runtime spec when the container is privileged. Defaults to false.
 	PrivilegedWithoutHostDevices bool `toml:"privileged_without_host_devices" json:"privileged_without_host_devices"`
+	// PrivilegedWithoutHostDevicesAllDevicesAllowed overloads the default behaviour for mounting host devices and
+	// whitelisting devices to the runtime spec when the container is privileged. Requires
+	// PrivilegedWithoutHostDevices to be enabled. Defaults to false.
+	PrivilegedWithoutHostDevicesAllDevicesAllowed bool `toml:"privileged_without_host_devices_all_devices_allowed" json:"privileged_without_host_devices_all_devices_allowed"`
 	// BaseRuntimeSpec is a json file with OCI spec to use as base spec that all container's will be created from.
 	BaseRuntimeSpec string `toml:"base_runtime_spec" json:"baseRuntimeSpec"`
 }
diff --git a/pkg/cri/server/container_create_linux.go b/pkg/cri/server/container_create_linux.go
index 26386e991..12ba708aa 100644
--- a/pkg/cri/server/container_create_linux.go
+++ b/pkg/cri/server/container_create_linux.go
@@ -221,6 +221,10 @@ func (c *criService) containerSpec(
 		if !ociRuntime.PrivilegedWithoutHostDevices {
 			specOpts = append(specOpts, oci.WithHostDevices, oci.WithAllDevicesAllowed)
 		} else {
+			// allow rwm on all devices for the container
+			if ociRuntime.PrivilegedWithoutHostDevicesAllDevicesAllowed {
+				specOpts = append(specOpts, oci.WithAllDevicesAllowed)
+			}
 			// add requested devices by the config as host devices are not automatically added
 			specOpts = append(specOpts, customopts.WithDevices(c.os, config),
 				customopts.WithCapabilities(securityContext, c.allCaps))