cgroup2: how can we support nested containers with domain controllers? #2356

@AkihiroSuda

https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#no-internal-process-constraint

No Internal Process Constraint

Non-root cgroups can distribute domain resources to their children only when they don’t have any processes of their own. In other words, only domain cgroups which don’t contain any processes can have domain controllers enabled in their “cgroup.subtree_control” files.

http://man7.org/linux/man-pages/man7/cgroups.7.html

As at Linux 4.19, the following controllers are threaded: cpu, perf_event, and pids.

This constraint seems to be a blocker for supporting nested containers such as dind and kind.

$ sudo podman run -it --rm --runtime=runc --privileged --cgroupns=private alpine
/ # cd /sys/fs/cgroup/
/sys/fs/cgroup # cat cgroup.controllers 
cpu io memory pids
/sys/fs/cgroup # echo +cpu > cgroup.subtree_control 
/sys/fs/cgroup # echo +io > cgroup.subtree_control 
sh: write error: Not supported
/sys/fs/cgroup # echo +memory > cgroup.subtree_control 
sh: write error: Not supported
/sys/fs/cgroup # echo +pids > cgroup.subtree_control 

The situation is the same with crun.

@giuseppe @kolyshkin @vbatts Thoughts?
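A workaround is to specify an entrypoint script that moves the processes in the namespaced root cgroup into another (child) cgroup, so that the root no longer contains any processes when the domain controllers are enabled.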
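A minimal sketch of such an entrypoint, assuming the container sees a writable cgroup2 mount at /sys/fs/cgroup. The function names and the child cgroup name `init` are hypothetical and chosen only for illustration; this is not taken from runc, crun, or podman.

```shell
#!/bin/sh
# Hypothetical entrypoint helpers (names are illustrative assumptions).

# Move every process listed in <root>/cgroup.procs into the child cgroup <root>/<leaf>.
evacuate_root_cgroup() {
    root="$1"
    leaf="$2"
    mkdir -p "$root/$leaf" || return 1
    # cgroup.procs accepts one PID per write, so migrate the processes one by one.
    while read -r pid; do
        # A process may exit before it is moved; ignore that failure.
        echo "$pid" > "$root/$leaf/cgroup.procs" 2>/dev/null || true
    done < "$root/cgroup.procs"
}

# Enable every controller listed in <root>/cgroup.controllers for the children of <root>.
enable_controllers() {
    root="$1"
    for c in $(cat "$root/cgroup.controllers"); do
        echo "+$c" > "$root/cgroup.subtree_control" ||
            echo "warning: could not enable controller $c" >&2
    done
}

# Intended use at the top of an entrypoint (left commented out in this sketch):
#   evacuate_root_cgroup /sys/fs/cgroup init
#   enable_controllers /sys/fs/cgroup
#   exec "$@"
```

The order matters: the root of the cgroup namespace has to be emptied first, otherwise the writes of `+io` and `+memory` are expected to fail as in the transcript above.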
