core: cgroup2 support #2903

Merged
keszybz merged 3 commits into systemd:master from keszybz:cgroup2-v3
Mar 30, 2016

keszybz commented Mar 26, 2016

This replaces #2271 and #2902.

First commit is taken from #2271, but "cgroup" is used as the dummy device name instead of "cgroup2". Commits two and three are from #2902.

alban and others added 3 commits March 26, 2016 12:05
Since Linux v4.4-rc1, __DEVEL__sane_behavior does not exist anymore and
is replaced by a new fstype "cgroup2".

With this patch, systemd no longer supports the old (unstable) way of
doing unified hierarchy with __DEVEL__sane_behavior and systemd now
requires Linux v4.4 for unified hierarchy.

Non-unified hierarchy is still the default and is unchanged by this
patch.

torvalds/linux@67e9c74
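
For illustration, one way a program could tell whether a mount point is a "cgroup2" filesystem is to compare the filesystem magic reported by statfs(). This is a minimal sketch and not the code from this pull request: `path_is_cgroup2()` is a hypothetical helper, and the magic value is copied from the kernel's `<linux/magic.h>` so the example builds against older headers.

```c
#include <errno.h>
#include <sys/vfs.h>   /* statfs(), Linux-specific */

/* Magic number of the cgroup2 filesystem, as defined in <linux/magic.h>.
 * Defined locally so the sketch builds against older kernel headers. */
#ifndef CGROUP2_SUPER_MAGIC
#define CGROUP2_SUPER_MAGIC 0x63677270
#endif

/* Hypothetical helper: returns 1 if `path` lives on a cgroup2 filesystem,
 * 0 if it does not, and a negative errno value if statfs() fails. */
static int path_is_cgroup2(const char *path) {
        struct statfs s;

        if (statfs(path, &s) < 0)
                return -errno;

        return (unsigned long) s.f_type == (unsigned long) CGROUP2_SUPER_MAGIC;
}
```

A caller would typically pass the cgroup mount point, conventionally /sys/fs/cgroup (an assumption here, the path is not named in the commit message).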

Earlier during the development of unified hierarchy, the populated event was
reported through the dedicated "cgroup.populated" file; however, the
interface was updated so that it's reported through the "populated" field of
"cgroup.events" file.  Update populated event handling logic accordingly.
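
As a rough sketch of what reading the "populated" field involves: "cgroup.events" holds one "key value" pair per line, so the parser has to find the line starting with "populated" rather than read a single number. The function name `events_parse_populated()` is hypothetical and this is not the implementation from the patch.

```c
#include <string.h>

/* Hypothetical helper: parses the contents of a "cgroup.events" file,
 * which holds one "key value" pair per line. Returns 1 or 0 for
 * "populated 1" / "populated 0", and -1 if the field is missing or
 * its value is anything else. */
static int events_parse_populated(const char *contents) {
        const char *p = contents;

        while (p && *p) {
                if (strncmp(p, "populated ", 10) == 0) {
                        char v = p[10];
                        char end = v ? p[11] : '\0';

                        if ((v == '0' || v == '1') && (end == '\n' || end == '\0'))
                                return v - '0';
                        return -1;
                }

                /* advance to the start of the next line */
                p = strchr(p, '\n');
                if (p)
                        p++;
        }

        return -1;
}
```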

After receiving SIGCHLD, one of the ways manager_dispatch_sigchld() maps the
now zombie $PID to its unit is through manager_get_unit_by_pid_cgroup() which
reads /proc/$PID/cgroup and looks up the unit associated with the cgroup path.
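
To make the lookup concrete: on the unified hierarchy, /proc/$PID/cgroup contains an entry of the form "0::<path>" (hierarchy ID 0, empty controller list). The sketch below extracts that path from the file's text. It is an illustration only; `cgroup_text_get_unified_path()` is a hypothetical name, not a systemd function.

```c
#include <string.h>

/* Hypothetical helper: scans the text of /proc/$PID/cgroup for the
 * unified-hierarchy entry ("0::<path>") and copies <path> into buf.
 * Returns 0 on success, -1 if no such entry exists or buf is too small. */
static int cgroup_text_get_unified_path(const char *text, char *buf, size_t size) {
        const char *p = text;

        while (p && *p) {
                if (strncmp(p, "0::", 3) == 0) {
                        const char *path = p + 3;
                        size_t n = strcspn(path, "\n");   /* length up to end of line */

                        if (n + 1 > size)
                                return -1;

                        memcpy(buf, path, n);
                        buf[n] = '\0';
                        return 0;
                }

                /* advance to the start of the next line */
                p = strchr(p, '\n');
                if (p)
                        p++;
        }

        return -1;
}
```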

On non-unified cgroup hierarchies, a process is immediately migrated to the
root cgroup on death and the cgroup lookup would always have returned the unit
associated with it, making it rather pointless but safe.  On unified hierarchy,
a zombie remains associated with the cgroup that it was associated with at the
time of death and thus manager_get_unit_by_pid_cgroup() will look up the unit
properly.

However, by the time manager_dispatch_sigchld() is running, the original cgroup
may have become empty and it and its associated unit might already have been
removed.  If the cgroup path doesn't yield a match, manager_dispatch_sigchld()
keeps pruning the leaf component.  This means that the function may return a
slice unit for a pid, and as a slice doesn't have a ->sigchld_event() handler,
calling invoke_sigchld_event() on it causes a segfault.
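
The pruning behaviour described above can be sketched as follows. This is a simplified model under stated assumptions: the `unit_entry` table stands in for the manager's cgroup-path-to-unit map, all names are hypothetical, and the loop stops before the root path rather than mapping it to a unit.

```c
#include <string.h>

/* Hypothetical table standing in for the manager's cgroup-path-to-unit map. */
struct unit_entry {
        const char *cgroup_path;
        const char *unit_name;
};

static const char *lookup_exact(const struct unit_entry *table, size_t n, const char *path) {
        for (size_t i = 0; i < n; i++)
                if (strcmp(table[i].cgroup_path, path) == 0)
                        return table[i].unit_name;
        return NULL;
}

/* Looks up `path`; on a miss, strips the last path component and retries.
 * Returns NULL if no ancestor matches or the path is too long. */
static const char *lookup_with_pruning(const struct unit_entry *table, size_t n, const char *path) {
        char buf[256];
        size_t len = strlen(path);

        if (len >= sizeof(buf))
                return NULL;
        memcpy(buf, path, len + 1);

        for (;;) {
                const char *u = lookup_exact(table, n, buf);
                char *slash;

                if (u)
                        return u;

                slash = strrchr(buf, '/');
                if (!slash || slash == buf)
                        return NULL;   /* nothing left to prune */
                *slash = '\0';
        }
}
```

With a table containing only "/system.slice", a lookup for the already removed "/system.slice/gone.service" falls back to the slice, which is the situation the commit message describes.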

This patch updates invoke_sigchld_event() so that it skips the call if the
handler is not set.
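
The shape of such a guard, in a stripped-down form: the structs and the `_sketch` function below are hypothetical stand-ins rather than systemd's actual definitions, and they only show the pattern of checking an optional function pointer before calling through it.

```c
#include <stddef.h>

struct unit;

/* Hypothetical per-unit-type vtable; only some unit types set sigchld_event. */
struct unit_vtable {
        void (*sigchld_event)(struct unit *u, int pid, int code, int status);
};

struct unit {
        const struct unit_vtable *vtable;
        int sigchld_count;   /* bookkeeping for this example only */
};

static void service_sigchld_event(struct unit *u, int pid, int code, int status) {
        (void) pid; (void) code; (void) status;
        u->sigchld_count++;
}

static const struct unit_vtable service_vtable = { .sigchld_event = service_sigchld_event };
static const struct unit_vtable slice_vtable   = { .sigchld_event = NULL };

/* Returns 1 if the handler was invoked, 0 if the unit type has none.
 * Without the NULL check, a slice unit would cause a call through a
 * NULL function pointer. */
static int invoke_sigchld_event_sketch(struct unit *u, int pid, int code, int status) {
        if (!u->vtable->sigchld_event)
                return 0;

        u->vtable->sigchld_event(u, pid, code, status);
        return 1;
}
```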
keszybz changed the title from "Cgroup2 v3" to "core: cgroup2 support" Mar 26, 2016

keszybz commented Mar 26, 2016

Hm, doesn't boot with systemd.unified_cgroup_hierarchy=1. /me needs to investigate.

davide125 commented

@keszybz how is it failing for you? FWIW, we have #2902 running fine on a bunch of machines here. One thing that bit me while testing: you need to make sure the initramfs is also updated after updating systemd, as the cgroup setup happens very early during startup.


keszybz commented Mar 30, 2016

Yeah, the initramfs needed updating, and the SELinux policy needs updating too (https://bugzilla.redhat.com/show_bug.cgi?id=1322184). This is good to merge.

keszybz merged commit 1b81db7 into systemd:master Mar 30, 2016
keszybz deleted the cgroup2-v3 branch March 30, 2016 00:25
pebenito pushed a commit to OwlCyberDefense/refpolicy that referenced this pull request Mar 31, 2016
With the new "cgroup2" system added in kernel 4.5, systemd is getting
selinux denials when manipulating the cgroup hierarchy.

Pull request in systemd with cgroup2 support:
systemd/systemd#2903

AVC when writing process numbers to move them to the right cgroup:
Mar 29 19:58:30 rawhide kernel: audit: type=1400
audit(1459295910.257:68): avc:  denied  { write } for  pid=1
comm="systemd" name="cgroup.procs" dev="cgroup2" ino=6
scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:unlabeled_t:s0 tclass=file permissive=1

In this case the new filesystem "cgroup2" needs to be labeled as cgroup_t.

Signed-off-by: Lukas Vrabec <[email protected]>
perfinion pushed a commit to perfinion/hardened-refpolicy that referenced this pull request May 13, 2016