runc exec: use manager.AddPid#4822
Conversation
|
@kolyshkin I think so, let me review this tomorrow (I mean to review the cgroups PR but didn't have time, sorry about that). |
This fixes the following warning (seen on Fedora 42 and Ubuntu 24.04): + sudo chown -R rootless.rootless /home/rootless chown: warning: '.' should be ':': ‘rootless.rootless’ Signed-off-by: Kir Kolyshkin <[email protected]>
Signed-off-by: Kir Kolyshkin <[email protected]>
The main idea is to maintain the code separately (and eventually kill V1 implementation). Signed-off-by: Kir Kolyshkin <[email protected]>
Remove cgroupPaths field from struct setnsProcess, because: - we can get base cgroup paths from p.manager.GetPaths(); - we can get sub-cgroup paths from p.process.SubCgroupPaths. But mostly because we are going to need separate cgroup paths when adopting cgroups.AddPid. Signed-off-by: Kir Kolyshkin <[email protected]>
The main benefit here is when we are using a systemd cgroup driver, we actually ask systemd to add a PID, rather than doing it ourselves. This way, we can add rootless exec PID to a cgroup. This requires newer opencontainers/cgroups and coreos/go-systemd. Signed-off-by: Kir Kolyshkin <[email protected]>
@cyphar please |
| // On cgroup v2 + nesting + domain controllers, adding to initial cgroup may fail with EBUSY. | ||
| // https://github.com/opencontainers/runc/issues/2356#issuecomment-621277643 | ||
| // Try to join the cgroup of InitProcessPid, unless sub-cgroup is explicitly set. | ||
| if p.initProcessPid != 0 && sub == "" { | ||
| initProcCgroupFile := fmt.Sprintf("/proc/%d/cgroup", p.initProcessPid) | ||
| initCg, initCgErr := cgroups.ParseCgroupFile(initProcCgroupFile) | ||
| if initCgErr == nil { | ||
| if initCgPath, ok := initCg[""]; ok { | ||
| initCgDirpath := filepath.Join(fs2.UnifiedMountpoint, initCgPath) | ||
| logrus.Debugf("adding pid %d to cgroup failed (%v), attempting to join %s", | ||
| p.pid(), err, initCgDirpath) | ||
| // NOTE: initCgDirPath is not guaranteed to exist because we didn't pause the container. | ||
| err = cgroups.WriteCgroupProc(initCgDirpath, p.pid()) | ||
| } | ||
| } |
There was a problem hiding this comment.
I am guessing you don't want to remove this despite your comment in #2416 (comment) ?
There was a problem hiding this comment.
Ideally, I wish this to be removed, but not in this PR, as it will break the test case added in PR #2416 (and may also break some funny users' workloads). I would like to hear from @AkihiroSuda first.
There was a problem hiding this comment.
Let's keep it compatible with the existing releases of runc.
I wish we could simplify the implementation though.
|
I had one question, but feel free to merge anyway. |
It makes sense to make runc exec benefit from clone2(CLONE_INTO_CGROUP), if it is available. Since it requires a recent kernel and might not work, implement a fallback to older way of joining the cgroup. Based on work done in - https://go-review.googlesource.com/c/go/+/417695 - coreos/go-systemd#458 - opencontainers/cgroups#26 - opencontainers#4822 Signed-off-by: Kir Kolyshkin <[email protected]>
It makes sense to make runc exec benefit from clone2(CLONE_INTO_CGROUP), if it is available. Since it requires a recent kernel and might not work, implement a fallback to older way of joining the cgroup. Based on work done in - https://go-review.googlesource.com/c/go/+/417695 - coreos/go-systemd#458 - opencontainers/cgroups#26 - opencontainers#4822 Signed-off-by: Kir Kolyshkin <[email protected]>
It makes sense to make runc exec benefit from clone2(CLONE_INTO_CGROUP), if it is available. Since it requires a recent kernel and might not work, implement a fallback to older way of joining the cgroup. Based on work done in - https://go-review.googlesource.com/c/go/+/417695 - coreos/go-systemd#458 - opencontainers/cgroups#26 - opencontainers#4822 Signed-off-by: Kir Kolyshkin <[email protected]>
It makes sense to make runc exec benefit from clone2(CLONE_INTO_CGROUP), if it is available. Since it requires a recent kernel and might not work, implement a fallback to older way of joining the cgroup. Based on work done in - https://go-review.googlesource.com/c/go/+/417695 - coreos/go-systemd#458 - opencontainers/cgroups#26 - opencontainers#4822 Regarding E2BIG check in shouldRetryWithoutCgroupFD. The clone3 syscall first appeared in kernel v5.3 via commit [1], which added a check that if the size of clone_args structure passed from the userspace is larger than known to kernel, and the "unknown" part contains non-zero values, E2BIG is returned. A similar check was already used in other similar scenarios at the time, and later in kernel v5.4, this was generalized by patch series [2]. [1]: torvalds/linux@7f192e3 [2]: https://lore.kernel.org/all/[email protected]/#r Signed-off-by: Kir Kolyshkin <[email protected]>
It makes sense to make runc exec benefit from clone2(CLONE_INTO_CGROUP), if it is available. Since it requires a recent kernel and might not work, implement a fallback to older way of joining the cgroup. Based on work done in - https://go-review.googlesource.com/c/go/+/417695 - coreos/go-systemd#458 - opencontainers/cgroups#26 - opencontainers#4822 Regarding E2BIG check in shouldRetryWithoutCgroupFD. The clone3 syscall first appeared in kernel v5.3 via commit [1], which added a check that if the size of clone_args structure passed from the userspace is larger than known to kernel, and the "unknown" part contains non-zero values, E2BIG is returned. A similar check was already used in other similar scenarios at the time, and later in kernel v5.4, this was generalized by patch series [2]. [1]: torvalds/linux@7f192e3 [2]: https://lore.kernel.org/all/[email protected]/#r Signed-off-by: Kir Kolyshkin <[email protected]>
It makes sense to make runc exec benefit from clone2(CLONE_INTO_CGROUP), if it is available. Since it requires a recent kernel and might not work, implement a fallback to older way of joining the cgroup. Based on: - https://go-review.googlesource.com/c/go/+/417695 - coreos/go-systemd#458 - opencontainers/cgroups#26 - opencontainers#4822 Signed-off-by: Kir Kolyshkin <[email protected]>
It makes sense to make runc exec benefit from clone2(CLONE_INTO_CGROUP), if it is available. Since it requires a recent kernel and might not work, implement a fallback to older way of joining the cgroup. Based on: - https://go-review.googlesource.com/c/go/+/417695 - coreos/go-systemd#458 - opencontainers/cgroups#26 - opencontainers#4822 Signed-off-by: Kir Kolyshkin <[email protected]> (cherry picked from commit 5af4dd4) Signed-off-by: Aleksa Sarai <[email protected]>
It makes sense to make runc exec benefit from clone2(CLONE_INTO_CGROUP), if it is available. Since it requires a recent kernel and might not work, implement a fallback to older way of joining the cgroup. Based on: - https://go-review.googlesource.com/c/go/+/417695 - coreos/go-systemd#458 - opencontainers/cgroups#26 - opencontainers#4822 Signed-off-by: Kir Kolyshkin <[email protected]> (cherry picked from commit 5af4dd4) Signed-off-by: Aleksa Sarai <[email protected]>
It makes sense to make runc exec benefit from clone2(CLONE_INTO_CGROUP), if it is available. Since it requires a recent kernel and might not work, implement a fallback to older way of joining the cgroup. Based on: - https://go-review.googlesource.com/c/go/+/417695 - coreos/go-systemd#458 - opencontainers/cgroups#26 - opencontainers#4822 Signed-off-by: Kir Kolyshkin <[email protected]> (cherry picked from commit 5af4dd4) Signed-off-by: Aleksa Sarai <[email protected]>
|
1.4 backport: #4920 |
The main benefit here is when we are using a systemd cgroup driver,
we actually ask systemd to add a PID, rather than doing it ourselves.
This way, we can add exec PID to a cgroup even when cgroup itself is
is not writable to us (rootless).
The implementation requires opencontainers/cgroups#26 (in oc/[email protected])
(which requires coreos/go-systemd#458 (in coreos/[email protected])).
This PR is a prerequisite for #4812 ("runc exec: use CLONE_INTO_CGROUP")