Just saw: related to #2112. |
src/basic/cgroup-util.c
Outdated
|
|
||
| bool cg_ns_supported(void) | ||
| { | ||
| return access("/proc/self/ns/cgroup", F_OK) == 0; |
There was a problem hiding this comment.
nitpick: please follow the usual coding style, and place the opening bracket on the same line as the function name. i.e.:
bool cg_ns_supported(void) {
…There was a problem hiding this comment.
Sorry, I was thrown off by the function definition directly above.
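For reference, a minimal standalone sketch of the check under review, written in the brace style requested above. `path_exists()` is a hypothetical helper added only to make the probe testable; it is not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper (not from the patch): true if the path exists. */
bool path_exists(const char *path) {
        assert(path);
        return access(path, F_OK) == 0;
}

/* Same probe as in the patch: the kernel exposes /proc/self/ns/cgroup
 * only when cgroup namespaces are available. */
bool cg_ns_supported(void) {
        return path_exists("/proc/self/ns/cgroup");
}
```

The result depends on the running kernel, so callers should treat `false` as "fall back to the old behaviour" rather than as an error.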
|
looks pretty good. mostly minor issues. (oh, one more thing: we don't use Signed-off-by in systemd, that's a kernel thing) |
a3d0aae to
3c1f06e
Compare
|
This also seems to break nspawn, see the autopkgtest log for the failed "build-and-services" and the first "upstream" nspawn test: |
Yeah, this is on On root# uname -r
4.6.2-1-ARCH
root# env UNIFIED_CGROUP_HIERARCHY=no ../../systemd-nspawn --register=no --kill-signal=SIGKILL --directory=/var/tmp/systemd-test.6Rcdc5/nspawn-root /usr/lib/systemd/systemd systemd.unit=multi-user.target
Spawning container nspawn-root on /var/tmp/systemd-test.6Rcdc5/nspawn-root.
Press ^] three times within 1s to kill container.
Child died too early.
Failed to read link /sys/fs/cgroup/cpu: No such file or directory |
|
I'm on this. Sorry. |
|
So, yeah, |
|
Note that the Ubuntu 4.4 kernels have the cgroup namespace feature backported, as we use it for LXD. If that's somehow incomplete, I can move the testing to a newer kernel (with some additional overhead). However, AFAIR current systemd policy is that things should generally work with kernels ≤ 2 years old, so things should at least have a reasonable fallback. |
Oh, indeed: Good to know, thanks! I've checked this patch on |
|
So I misconstrued how |
|
Oh, and thanks for the feedback! |
b4f3457 to
dd8e1b4
Compare
|
So here is how I implemented it so far: When cgroup namespaces are enabled we unshare the cgroup namespace after all limits and so on have been applied but we do not mount cgroups since that is unnecessary with cgroup namespaces and only causes information leak. We should then be correctly placed in the right cgroups when we do |
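A rough sketch of the unshare step described above, assuming a fallback to the existing mount logic when the kernel lacks cgroup namespaces. The helper name `maybe_unshare_cgroup_ns()` and its `ns_path` parameter are hypothetical and exist only for illustration; the patch itself calls `cg_ns_supported()` and `unshare(CLONE_NEWCGROUP)` directly.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for unshare() in <sched.h> */
#endif
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <unistd.h>

#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000 /* from the kernel UAPI headers */
#endif

/* Hypothetical helper: returns 1 if a new cgroup namespace was entered,
 * 0 if the probe path is missing (no cgroup namespace support, so the
 * caller keeps the old mount logic), or a negative errno on failure. */
int maybe_unshare_cgroup_ns(const char *ns_path) {
        assert(ns_path);

        if (access(ns_path, F_OK) < 0)
                return 0;

        if (unshare(CLONE_NEWCGROUP) < 0)
                return -errno;

        return 1;
}
```

A caller would pass `"/proc/self/ns/cgroup"` after all limits have been applied. The call needs CAP_SYS_ADMIN, so an unprivileged run is expected to return `-EPERM`.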
src/nspawn/nspawn.c
Outdated
| return r; | ||
| if (cg_ns_supported()) { | ||
| r = unshare(CLONE_NEWCGROUP); | ||
| if (r < 0) |
There was a problem hiding this comment.
Well, systemd-nspawn doesn't fail on startup. But this breaks UNIFIED_CGROUP_HIERARCHY:
nspawn understands the $UNIFIED_CGROUP_HIERARCHY
environment variable to individually select the hierarchy to
use for executed containers. By default, nspawn will use the
unified hierarchy for the containers if the host uses the
unified hierarchy, and the legacy hierarchy otherwise.
-bash-4.3# grep cgroup /proc/self/mounts
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
-bash-4.3# UNIFIED_CGROUP_HIERARCHY=yes systemd-nspawn -D /nspawn-root/ -b 3
...
container# grep cgroup /proc/self/mounts
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
There was a problem hiding this comment.
Also, this behaves strangely with the unified hierarchy:
-bash-4.3# grep cgroup /proc/self/mounts
cgroup /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
-bash-4.3# unshare -C cat /proc/self/cgroup
0::/
-bash-4.3# systemd-nspawn -D /nspawn-root -b 3
...
container# grep cgroup /proc/self/mounts
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
container# cat /proc/1/cgroup
9:cpuset:/
8:devices:/init.scope
7:cpu,cpuacct:/init.scope
6:net_cls:/
5:freezer:/
1:name=systemd:/init.scope
0::/
There was a problem hiding this comment.
Thanks for testing on unified, @evverx. Starting unified with systemd-nspawn didn't work for me with v228, independent of the patch. So maybe I need to test that from master again.
I'm not entirely clear what you're getting at with your second point. In the case of cgroup namespaces the container will be able to mount a cgroup filesystem by itself just as on normal system bootup. So we don't need to bind-mount, I think. If you're getting at some subsystems missing in the container, I think this is explained by how cgroup v1 and v2 interact: as you have mounted cgroup2 on the host, you have likely mounted the available subsystems (memory, pids, etc.) into the v2 hierarchy, which means that they are not mounted into the v1 hierarchy. This is why they do not appear in the container which checks the available controllers in the v1 hierarchy.
@poettering would you prefer a different approach?
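To make the `/proc/1/cgroup` listing above easier to read: each line has the form `hierarchy-ID:controller-list:path`, and the unified (v2) hierarchy shows up as ID 0 with an empty controller list (`0::/`). A minimal sketch of parsing such a line follows; the struct and function names are hypothetical, not systemd API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* One line of /proc/<pid>/cgroup: "hierarchy-ID:controller-list:path". */
struct cgroup_entry {
        int hierarchy_id;
        char controllers[128];
        char path[256];
};

/* Hypothetical helper: returns 0 on success, -1 on malformed input. */
int parse_cgroup_line(const char *line, struct cgroup_entry *e) {
        const char *c1, *c2;
        size_t n;

        assert(line);
        assert(e);

        c1 = strchr(line, ':');
        if (!c1)
                return -1;
        c2 = strchr(c1 + 1, ':');
        if (!c2)
                return -1;

        if (sscanf(line, "%d", &e->hierarchy_id) != 1)
                return -1;

        /* Controller list sits between the two colons and may be empty. */
        n = (size_t) (c2 - c1 - 1);
        if (n >= sizeof(e->controllers))
                return -1;
        memcpy(e->controllers, c1 + 1, n);
        e->controllers[n] = '\0';

        /* Path runs to the end of the line, without the trailing newline. */
        n = strcspn(c2 + 1, "\n");
        if (n >= sizeof(e->path))
                return -1;
        memcpy(e->path, c2 + 1, n);
        e->path[n] = '\0';

        return 0;
}

/* The v2 entry is the one with ID 0 and no controllers listed. */
bool cgroup_entry_is_unified(const struct cgroup_entry *e) {
        assert(e);
        return e->hierarchy_id == 0 && e->controllers[0] == '\0';
}
```

Read this way, the container in the listing above sits in several v1 hierarchies (IDs 1 and 5 to 9) and also has a v2 entry.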
There was a problem hiding this comment.
This is why they do not appear in the container which checks the available controllers in the v1 hierarchy.
But why do we need to check the v1-controllers on the v2-hierarchy?
In the case of cgroup namespaces the container will be able to mount a cgroup filesystem by itself just as on normal system bootup.
Yeah. But we shouldn't mount v1 on v2 (and vice versa)
master:
-bash-4.3# systemd-nspawn -D /nspawn-root -b 3
...
container# grep cgroup /proc/self/mounts
cgroup /sys/fs/cgroup cgroup2 ro,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/machine.slice/machine-nspawn\134x2droot.scope cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
container# cat /proc/1/cgroup
0::/machine.slice/machine-nspawn\x2droot.scope/init.scope
There was a problem hiding this comment.
So, something went wrong here:
int mount_cgroup_controllers(char ***join_controllers) {
        _cleanup_set_free_free_ Set *controllers = NULL;
        int r;
        if (!cg_is_legacy_wanted())
                return 0;
        /* Mount all available cgroup controllers that are built into the kernel. */
        controllers = set_new(&string_hash_ops);
        if (!controllers)
                return log_oom();
cg_is_legacy_wanted should return 0
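The mount listings in this thread differ in what sits on `/sys/fs/cgroup`: a `cgroup2` filesystem on a unified host, a `tmpfs` on a legacy one. One way to test for that is `statfs()`. The sketch below is an illustration only, with a hypothetical helper name; it is not the implementation of `cg_is_legacy_wanted()`.

```c
#include <assert.h>
#include <errno.h>
#include <sys/vfs.h>

#ifndef CGROUP2_SUPER_MAGIC
#define CGROUP2_SUPER_MAGIC 0x63677270 /* value from linux/magic.h */
#endif

/* Hypothetical helper: 1 if path is on a cgroup2 filesystem, 0 if it is
 * on some other filesystem, negative errno if statfs() fails. */
int path_is_cgroup2(const char *path) {
        struct statfs fs;

        assert(path);

        if (statfs(path, &fs) < 0)
                return -errno;

        return (unsigned long) fs.f_type == (unsigned long) CGROUP2_SUPER_MAGIC;
}
```

Calling `path_is_cgroup2("/sys/fs/cgroup")` inside the container would be expected to return 1 for the master listing above and 0 for the listing produced with the patch.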
There was a problem hiding this comment.
On Sat, Jun 25, 2016 at 12:24:00AM -0700, Evgeny Vereshchagin wrote:
@@ -2594,9 +2594,15 @@ static int inner_child(
                return -ESRCH;
        }
        r = mount_systemd_cgroup_writable("", arg_unified_cgroup_hierarchy);
        if (r < 0)
                return r;
        if (cg_ns_supported()) {
                r = unshare(CLONE_NEWCGROUP);
                if (r < 0)
Well, systemd-nspawn doesn't fail on startup. But this breaks UNIFIED_CGROUP_HIERARCHY:
This is an indirect consequence of cgroup namespaces. With cgroup namespaces the
container will mount the cgroupfs itself. Hence, mounting the cgroupfs is the
task of systemd inside the container as opposed to bind-mount magic when
cgroup namespaces are not available. If we want systemd inside the container to
mount the unified cgroup hierarchy the simplest solution is to pass
systemd.unified_cgroup_hierarchy=1 as argument to systemd-nspawn:
systemd-nspawn -D /some/rootfs -b 'systemd.unified_cgroup_hierarchy=1'
To be backwards compatible with prior systemd-nspawn versions that allow
setting the UNIFIED_CGROUP_HIERARCHY env variable we can simply append
systemd.unified_cgroup_hierarchy=1. However, when the user simply wants a
shell inside the container things get more complicated since there is no
systemd/init process that sets up the cgroupfs.
Minor point: Note also, that the systemd v230 release notes state that booting
unified cgroups with kernels >= 4.5 requires systemd v230. This is why I
had trouble using unified cgroups:
"WARNING: it is not possible to use previous systemd versions with
systemd.unified_cgroup_hierarchy=1 and the new kernel. Therefore it is
necessary to also update systemd in the initramfs if using the unified
hierarchy. An updated SELinux policy is also required."
(https://lists.freedesktop.org/archives/systemd-devel/2016-May/036583.html)
Since the cgroup namespaces patch here requires that systemd inside the
container mounts the cgroup it means that systemd v230 is required inside the
container with a kernel >=4.5.
There was a problem hiding this comment.
On Sat, Jun 25, 2016 at 03:08:20AM -0700, Evgeny Vereshchagin wrote:
@@ -2594,9 +2594,15 @@ static int inner_child(
                return -ESRCH;
        }
        r = mount_systemd_cgroup_writable("", arg_unified_cgroup_hierarchy);
        if (r < 0)
                return r;
        if (cg_ns_supported()) {
                r = unshare(CLONE_NEWCGROUP);
                if (r < 0)
This is why they do not appear in the container which checks the available controllers in the v1 hierarchy.
But why do we need to check the v1-controllers on the v2-hierarchy?
In the case of cgroup namespaces the container will be able to mount a cgroup filesystem by itself just as on normal system bootup.
Yeah. But we shouldn't mount v1 on v2 (and vice versa)
master:
-bash-4.3# systemd-nspawn -D /nspawn-root -b 3
...
container# grep cgroup /proc/self/mounts
cgroup /sys/fs/cgroup cgroup2 ro,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/machine.slice/machine-nspawn\134x2droot.scope cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
container# cat /proc/1/cgroup
0::/machine.slice/machine-nspawn\x2droot.scope/init.scope
I can reproduce this behavior with systemd master independent of this patch.
Sorry, I'm a little confused as to what you're getting at here.
There was a problem hiding this comment.
Sorry, I'm a little confused as to what you're getting at here.
@brauner , sorry.
I mean:
By default, nspawn will use the unified hierarchy for the containers if the host uses the
unified hierarchy, and the legacy hierarchy otherwise.
Your patch doesn't work as expected: #3589 (comment)
container# grep cgroup /proc/self/mounts
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
master works fine:
container# grep cgroup /proc/self/mounts
cgroup /sys/fs/cgroup cgroup2 ro,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/machine.slice/machine-nspawn\134x2droot.scope cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
Yeah, systemd-nspawn -D /some/rootfs -b 'systemd.unified_cgroup_hierarchy=1' mounts the v2-hierarchy. But we should do this by default (i.e. without systemd.unified_cgroup_hierarchy=1)
There was a problem hiding this comment.
On Sun, Jun 26, 2016 at 12:52:33AM -0700, Evgeny Vereshchagin wrote:
@@ -2594,9 +2594,15 @@ static int inner_child(
                return -ESRCH;
        }
        r = mount_systemd_cgroup_writable("", arg_unified_cgroup_hierarchy);
        if (r < 0)
                return r;
        if (cg_ns_supported()) {
                r = unshare(CLONE_NEWCGROUP);
                if (r < 0)
Sorry, I'm a little confused as to what you're getting at here.
@brauner , sorry.
I mean:
By default, nspawn will use the unified hierarchy for the containers if the host uses the
unified hierarchy, and the legacy hierarchy otherwise.
Your patch doesn't work as expected: #3589 (comment)
container# grep cgroup /proc/self/mounts
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
master works fine:
container# grep cgroup /proc/self/mounts
cgroup /sys/fs/cgroup cgroup2 ro,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/machine.slice/machine-nspawn\134x2droot.scope cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
Yeah, systemd-nspawn -D /some/rootfs -b 'systemd.unified_cgroup_hierarchy=1' mounts the v2-hierarchy. But we should do this by default (i.e. without systemd.unified_cgroup_hierarchy=1)
Thanks for the clarification, @evverx. Yes, I can think of a way to do this.
When we detect that unified is requested or used on the host, we append
"systemd.unified_cgroup_hierarchy=1" to the arguments passed to the container's
init.
65d1fce to
06c87a2
Compare
src/nspawn/nspawn.c
Outdated
| // legacy cgroup. | ||
| if (arg_unified_cgroup_hierarchy && cg_ns_supported() && arg_start_mode == START_BOOT) { | ||
| if (strv_extend(&arg_parameters, "systemd.unified_cgroup_hierarchy=1") < 0) | ||
| return log_oom(); |
There was a problem hiding this comment.
@brauner , thanks!
systemd-nspawn -D /nspawn-root/ -b 3
works fine.
But
-bash-4.3# systemd-nspawn -D /nspawn-root/ /usr/lib/systemd/systemd 3
...
container# grep cgroup /proc/self/mounts
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
This is a regression.
Another issue: we overwrite the user's setting
-bash-4.3# systemd-nspawn -D /nspawn-root -b 3 systemd.unified_cgroup_hierarchy=0
...
container# grep cgroup /proc/self/mounts
cgroup /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
(actually, systemd.unified_cgroup_hierarchy=... never really works. So this is not a regression. Maybe, we should document this)
922ef9e to
d4bb3a8
Compare
|
The clean way to handle cgroup namespaces would be to delegate mounting of |
d4bb3a8 to
2812b69
Compare
src/nspawn/nspawn.c
Outdated
| arg_uid_range, | ||
| arg_selinux_apifs_context); | ||
| if (r < 0) | ||
| return r; |
There was a problem hiding this comment.
Jun 28 13:12:02 adt systemd[1]: Starting Container c1...
Jun 28 13:12:02 adt systemd-nspawn[1485]: Selected user namespace base 84410368 and range 65536.
Jun 28 13:12:02 adt systemd-nspawn[1485]: mount(/var/lib/machines/c1/sys/fs/selinux) failed, ignoring: No such file or directory
Jun 28 13:12:02 adt systemd-nspawn[1485]: mount(/var/lib/machines/c1/sys/fs/selinux) failed, ignoring: Invalid argument
Jun 28 13:12:02 adt systemd-nspawn[1485]: Timezone Etc/UTC does not exist in container, not updating container timezone.
Jun 28 13:12:02 adt systemd-nspawn[1485]: Failed to determine if /sys/fs/cgroup is already mounted: No such file or directory
Jun 28 13:12:02 adt systemd-nspawn[1485]: Child died too early.
Jun 28 13:12:02 adt systemd[1]: [email protected]: Main process exited, code=exited, status=1/FAILURE
Jun 28 13:12:02 adt systemd[1]: Failed to start Container c1.
Jun 28 13:12:02 adt systemd[1]: [email protected]: Unit entered failed state.
Jun 28 13:12:02 adt systemd[1]: [email protected]: Failed with result 'exit-code'.
I think
r = path_is_mount_point(cgroup_root, AT_SYMLINK_FOLLOW);
if (r < 0)
        return log_error_errno(r, "Failed to determine if /sys/fs/cgroup is already mounted: %m");
doesn't work in the inner child (after mount_move_root)
Seems like we should check /sys/fs/cgroup in the outer_child and pass the result of the check to the inner_child.
There was a problem hiding this comment.
No, this is not the real cause. The real cause is that /sys is mounted read-only when --private-veth is used. So we are not allowed to create /sys/fs/cgroup, and that mkdir fails prior to the call you're pointing to.
There was a problem hiding this comment.
oh, right
[pid 8274] 1467144470.419389 mount(NULL, "/sys", NULL, MS_RDONLY|MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_REMOUNT|MS_BIND, NULL) = 0
[...]
[pid 8274] 1467144470.419942 stat("/sys/fs", {st_dev=makedev(0, 42), st_ino=3, st_mode=S_IFDIR|0755, st_nlink=4, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=80, st_atime=2016/06/28-20:07:50.346550852, st_mtime=2016/06/28-20:07:50.418552403, st_ctime=2016/06/28-20:07:50.418552403}) = 0
[pid 8274] 1467144470.420194 mkdir("/sys/fs/cgroup", 0755) = -1 EROFS (Read-only file system)
[pid 8274] 1467144470.420240 lstat("/sys", {st_dev=makedev(0, 42), st_ino=2, st_mode=S_IFDIR|0755, st_nlink=9, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=180, st_atime=2016/06/28-20:07:50.342550765, st_mtime=2016/06/28-20:07:50.418552403, st_ctime=2016/06/28-20:07:50.418552403}) = 0
[pid 8274] 1467144470.420292 lstat("/sys/fs", {st_dev=makedev(0, 42), st_ino=3, st_mode=S_IFDIR|0755, st_nlink=4, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=80, st_atime=2016/06/28-20:07:50.346550852, st_mtime=2016/06/28-20:07:50.418552403, st_ctime=2016/06/28-20:07:50.418552403}) = 0
[pid 8274] 1467144470.420338 lstat("/sys/fs/cgroup", 0x7ffcdafb3760) = -1 ENOENT (No such file or directory)
[pid 8274] 1467144470.420385 writev(2, [{"Failed to determine if /sys/fs/cgroup is already mounted: No such file or directory", 83}, {"\n", 1}], 2) = 84
sorry
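The strace above shows `mkdir()` failing with EROFS while the log only reports the later ENOENT from `lstat()`. A minimal sketch of checking the `mkdir()` result first, so that the real cause reaches the log; `mkdir_tolerant()` is a hypothetical helper, not the code in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: creates a directory, tolerating one that already
 * exists. Returns 0 on success or a negative errno (e.g. -EROFS). */
int mkdir_tolerant(const char *path, mode_t mode) {
        assert(path);

        if (mkdir(path, mode) < 0 && errno != EEXIST)
                return -errno;

        return 0;
}
```

If the caller returns this error before probing the mount point, the message would name "Read-only file system" instead of "No such file or directory".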
On the v2 hierarchy, "cgroup.subtree_control" rejects controller
enables if the cgroup has processes in it. The enforcement of this
logic assumes that the cgroup wouldn't have any css_sets associated
with it if there are no tasks in the cgroup, which is no longer true
since a79a908fd2b0 ("cgroup: introduce cgroup namespaces").
When a cgroup namespace is created, it pins the css_set of the
creating task to use it as the root css_set of the namespace. This
extra reference stays as long as the namespace is around and makes
"cgroup.subtree_control" think that the namespace root cgroup is not
empty even when it is and thus reject controller enables.
Fix it by making cgroup_subtree_control() walk and test emptiness of
each css_set instead of testing whether the list_head is empty.
While at it, update the comment of cgroup_task_count() to indicate
that the returned value may be higher than the number of tasks, which
has always been true due to temporary references and doesn't break
anything.
Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Evgeny Vereshchagin <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Cc: Aditya Kali <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: [email protected] # v4.6+
Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces")
Link: systemd/systemd#3589 (comment)
Signed-off-by: Chatur27 <[email protected]>
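The difference between the two emptiness tests in the commit message can be shown with a toy model. The structs below are simplified stand-ins invented for illustration and do not match the kernel's `css_set` or `cgroup` definitions: a namespace pins a `css_set` with zero tasks, so the list is non-empty even though the cgroup holds no processes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a css_set: only the task count matters here. */
struct toy_css_set {
        int nr_tasks;
        struct toy_css_set *next;
};

/* Toy stand-in for a cgroup holding a list of associated css_sets. */
struct toy_cgroup {
        struct toy_css_set *css_sets;
};

/* Old assumption: no css_sets linked means no tasks. */
bool cgroup_empty_by_list(const struct toy_cgroup *cg) {
        assert(cg);
        return cg->css_sets == NULL;
}

/* Approach described in the fix: walk the list and test each css_set. */
bool cgroup_empty_by_walk(const struct toy_cgroup *cg) {
        assert(cg);

        for (const struct toy_css_set *s = cg->css_sets; s; s = s->next)
                if (s->nr_tasks > 0)
                        return false;

        return true;
}
```

For a cgroup whose only `css_set` is the one pinned by a namespace and has no tasks, the list test reports "not empty" while the walk reports "empty", which is the case that made `cgroup.subtree_control` reject controller enables.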
On the v2 hierarchy, "cgroup.subtree_control" rejects controller
enables if the cgroup has processes in it. The enforcement of this
logic assumes that the cgroup wouldn't have any css_sets associated
with it if there are no tasks in the cgroup, which is no longer true
since a79a908fd2b0 ("cgroup: introduce cgroup namespaces").
When a cgroup namespace is created, it pins the css_set of the
creating task to use it as the root css_set of the namespace. This
extra reference stays as long as the namespace is around and makes
"cgroup.subtree_control" think that the namespace root cgroup is not
empty even when it is and thus reject controller enables.
Fix it by making cgroup_subtree_control() walk and test emptiness of
each css_set instead of testing whether the list_head is empty.
While at it, update the comment of cgroup_task_count() to indicate
that the returned value may be higher than the number of tasks, which
has always been true due to temporary references and doesn't break
anything.
Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Evgeny Vereshchagin <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Cc: Aditya Kali <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: [email protected] # v4.6+
Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces")
Link: systemd/systemd#3589 (comment)
Signed-off-by: Chatur27 <[email protected]>
On the v2 hierarchy, "cgroup.subtree_control" rejects controller
enables if the cgroup has processes in it. The enforcement of this
logic assumes that the cgroup wouldn't have any css_sets associated
with it if there are no tasks in the cgroup, which is no longer true
since a79a908fd2b0 ("cgroup: introduce cgroup namespaces").
When a cgroup namespace is created, it pins the css_set of the
creating task to use it as the root css_set of the namespace. This
extra reference stays as long as the namespace is around and makes
"cgroup.subtree_control" think that the namespace root cgroup is not
empty even when it is and thus reject controller enables.
Fix it by making cgroup_subtree_control() walk and test emptiness of
each css_set instead of testing whether the list_head is empty.
While at it, update the comment of cgroup_task_count() to indicate
that the returned value may be higher than the number of tasks, which
has always been true due to temporary references and doesn't break
anything.
Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Evgeny Vereshchagin <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Cc: Aditya Kali <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: [email protected] # v4.6+
Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces")
Link: systemd/systemd#3589 (comment)
Signed-off-by: Chatur27 <[email protected]>
On the v2 hierarchy, "cgroup.subtree_control" rejects controller
enables if the cgroup has processes in it. The enforcement of this
logic assumes that the cgroup wouldn't have any css_sets associated
with it if there are no tasks in the cgroup, which is no longer true
since a79a908fd2b0 ("cgroup: introduce cgroup namespaces").
When a cgroup namespace is created, it pins the css_set of the
creating task to use it as the root css_set of the namespace. This
extra reference stays as long as the namespace is around and makes
"cgroup.subtree_control" think that the namespace root cgroup is not
empty even when it is and thus reject controller enables.
Fix it by making cgroup_subtree_control() walk and test emptiness of
each css_set instead of testing whether the list_head is empty.
While at it, update the comment of cgroup_task_count() to indicate
that the returned value may be higher than the number of tasks, which
has always been true due to temporary references and doesn't break
anything.
Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Evgeny Vereshchagin <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Cc: Aditya Kali <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: [email protected] # v4.6+
Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces")
Link: systemd/systemd#3589 (comment)
Signed-off-by: Chatur27 <[email protected]>
On the v2 hierarchy, "cgroup.subtree_control" rejects controller
enables if the cgroup has processes in it. The enforcement of this
logic assumes that the cgroup wouldn't have any css_sets associated
with it if there are no tasks in the cgroup, which is no longer true
since a79a908fd2b0 ("cgroup: introduce cgroup namespaces").
When a cgroup namespace is created, it pins the css_set of the
creating task to use it as the root css_set of the namespace. This
extra reference stays as long as the namespace is around and makes
"cgroup.subtree_control" think that the namespace root cgroup is not
empty even when it is and thus reject controller enables.
Fix it by making cgroup_subtree_control() walk and test emptiness of
each css_set instead of testing whether the list_head is empty.
While at it, update the comment of cgroup_task_count() to indicate
that the returned value may be higher than the number of tasks, which
has always been true due to temporary references and doesn't break
anything.
Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Evgeny Vereshchagin <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Cc: Aditya Kali <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: [email protected] # v4.6+
Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces")
Link: systemd/systemd#3589 (comment)
Signed-off-by: Chatur27 <[email protected]>
On the v2 hierarchy, "cgroup.subtree_control" rejects controller
enables if the cgroup has processes in it. The enforcement of this
logic assumes that the cgroup wouldn't have any css_sets associated
with it if there are no tasks in the cgroup, which is no longer true
since a79a908fd2b0 ("cgroup: introduce cgroup namespaces").
When a cgroup namespace is created, it pins the css_set of the
creating task to use it as the root css_set of the namespace. This
extra reference stays as long as the namespace is around and makes
"cgroup.subtree_control" think that the namespace root cgroup is not
empty even when it is and thus reject controller enables.
Fix it by making cgroup_subtree_control() walk and test emptiness of
each css_set instead of testing whether the list_head is empty.
While at it, update the comment of cgroup_task_count() to indicate
that the returned value may be higher than the number of tasks, which
has always been true due to temporary references and doesn't break
anything.
Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Evgeny Vereshchagin <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Cc: Aditya Kali <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: [email protected] # v4.6+
Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces")
Link: systemd/systemd#3589 (comment)
Signed-off-by: Chatur27 <[email protected]>
Signed-off-by: Tiktodz <[email protected]>
Signed-off-by: Kneba <[email protected]>
Signed-off-by: dotkit <[email protected]>
On the v2 hierarchy, "cgroup.subtree_control" rejects controller enables if the cgroup has processes in it. The enforcement of this logic assumes that the cgroup wouldn't have any css_sets associated with it if there are no tasks in the cgroup, which is no longer true since a79a908 ("cgroup: introduce cgroup namespaces"). When a cgroup namespace is created, it pins the css_set of the creating task to use it as the root css_set of the namespace. This extra reference stays as long as the namespace is around and makes "cgroup.subtree_control" think that the namespace root cgroup is not empty even when it is and thus reject controller enables. Fix it by making cgroup_subtree_control() walk and test emptiness of each css_set instead of testing whether the list_head is empty. While at it, update the comment of cgroup_task_count() to indicate that the returned value may be higher than the number of tasks, which has always been true due to temporary references and doesn't break anything. Signed-off-by: Tejun Heo <[email protected]> Reported-by: Evgeny Vereshchagin <[email protected]> Cc: Serge E. Hallyn <[email protected]> Cc: Aditya Kali <[email protected]> Cc: Eric W. Biederman <[email protected]> Cc: [email protected] # v4.6+ Fixes: a79a908 ("cgroup: introduce cgroup namespaces") Link: systemd/systemd#3589 (comment) Signed-off-by: Chatur27 <[email protected]>
This adds support for cgroup namespaces, which are available since kernel 4.6. Cgroup namespaces work with both the legacy and the unified cgroup hierarchy. For legacy:
Inside new cgroup namespace:
Parent cgroup namespace: