kubelet/cm: bump runc/libcontainer to 1.1.0 #107149
kolyshkin wants to merge 1 commit into kubernetes:master from
|
/retest |
|
OK, it works with these 4 commits. Let's add more. |
Force-pushed: 769a312 to c359c7f
|
/test pull-kubernetes-integration |
|
The /test pull-kubernetes-integration |
Force-pushed: c359c7f to 4f974eb
|
/retest |
|
/test pull-kubernetes-node-crio-cgrpv2-e2e |
|
I'm not sure what to make of test failures -- are all of them currently broken? |
I think it is related to kubernetes/test-infra#24798 |
odinuge left a comment
Took an initial look, and it looks ok at first glance. The runc changes are so big this time that I haven't done a deep dive into them, but I think @kolyshkin has more control on that front than me. I am leaning towards splitting the other non-bump-related changes into separate PR(s), but I think a lot of them are valuable. Feedback from other reviewers would be appreciated. I do have a few questions tho.
The CI situation is kinda sad, and I would prefer if it was sorted before merging this. The failing suites are the ones testing cgroup v2 and systemd support, and having them working is kinda important. Do you know if someone is actively working on fixing it @pacoxu?
+1 on this change. Can we keep it as a separate PR tho? In case we have to revert the bump, or we start seeing problems, it is easier if they are separated.
Again, this is a separate commit.
I'm not very familiar with kubernetes processes. Isn't it possible to revert separate commits, rather than the whole PR?
Same applies here. I am fairly sure this is totally fine, but I think I would personally prefer if we kept it separate from the bump if possible.
This is a separate commit (I guess you are not reviewing this commit-by-commit, since this is the first commit).
I can certainly move it out to a separate PR and make this one a draft waiting for that other PR to be merged. Same for the other 5 preparatory commits.
> This is a separate commit (I guess you are not reviewing this commit-by-commit, since this is the first commit).

Yeah, thanks for doing that! I definitely prefer your style of doing it with separate commits with well-defined commit messages! I am a huge fan of that, thanks!

> I can certainly move it out to a separate PR and make this one a draft waiting for that other PR to be merged. Same for the other 5 preparatory commits.

Thanks. The only worry is doing a lot of these cleanups in the same PR as a big dependency change; mostly for being (better) able to bisect the CI running on the master branch, as well as the "work" of reverting if we have to do that. Other than for dependency changes or verrry big PRs, I do prefer your style waay over the alternative. Again, thanks for doing this!
This change looks ok as well. I would tho prefer to keep it separate from the bump, since it looks like a no-op change to how things work. If others disagree with me, it's fine, and we can keep it.
+1 on this change, especially since it reduces the runtime of the destroy code path. As with the other ones, I don't think this is related to the runc bump, is it?
|
This PR is 6 commits, of which 5 are preparatory/cleanup, and the last one is the actual runc bump. I guess it makes more sense if you can review it commit by commit. I can move these preparatory commits out of this PR (and make this one a draft until those commits are merged, and then do a rebase and mark it as ready for review). @odinuge Would you prefer a separate PR for each of the 5 prep commits, or a single PR for all of them? |
|
Rebased. |
|
> Rebased.

Thanks @kolyshkin! Reviewed that one separately. Feel free to hide (or whatever it is called on GitHub) my earlier PR reviews, since they no longer apply to this PR. I am still a bit worried about kubernetes/test-infra#24798 being broken, meaning we have no CI signal for cgroup v2 (I still don't think we have cgroup v2 containerd tests, but I might be wrong..), the systemd cgroup driver or cri-o... |
This updates vendored runc/libcontainer to 1.1.0, and google/cadvisor to a version updated to runc 1.1.0-rc1 (google/cadvisor#3031).

Changes in vendor are generated by (roughly):

        ./hack/pin-dependency.sh github.com/google/cadvisor v0.44.0
        ./hack/pin-dependency.sh github.com/opencontainers/runc v1.1.0
        ./hack/update-vendor.sh
        ./hack/lint-dependencies.sh # And follow all its recommendations.
        ./hack/update-vendor.sh
        ./hack/update-internal-modules.sh
        ./hack/lint-dependencies.sh # Re-check everything again.

The changes (mostly in pkg/kubelet/cm) are there to adopt the changed runc 1.1 API, and simplify things a bit. In particular:

1. simplify cgroup manager instantiation, using a new, easier way of libcontainer/cgroups/manager.New;
2. replace libcontainerAdapter with a boolean variable (all it did was passing on whether systemd manager should be used);
3. trivial change due to removed cgroupfs.HugePageSizes and added cgroups.HugePageSizes();
4. do not calculate cgroup paths in update / destroy, since libcontainer cgroup managers now calculate the paths upon creation (previously, they were doing that only in Apply, so using e.g. Set or Destroy right after creation was impossible without specifying paths).

We currently still calculate cgroup paths in Exists -- this is to be addressed separately.

Signed-off-by: Kir Kolyshkin <[email protected]>
|
Rebased on top of just-merged #108597; no longer a draft |
odinuge left a comment
Changes look good to me now! I have looked through the code with the previous regressions we have seen, and I haven't found any red flags this time. Thanks for pushing @kolyshkin
/lgtm
But let's hold for some time to let the CI run for the other changes (e.g. this run https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-cos-cgroupv2-containerd-node-e2e-serial/1506700583120146432 of ci-cos-cgroupv2-containerd-node-e2e-serial).
Feel free to unhold when we have CI confidence about the other change.
/hold
|
/ok-to-test |
|
And let's test this one that failed:
/test pull-kubernetes-node-swap-ubuntu-serial
As well as this one that should pass:
/test pull-kubernetes-node-kubelet-serial-containerd |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: giuseppe, kolyshkin, odinuge
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@kolyshkin: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
/milestone v1.24
@kolyshkin can you please remove draft status on the PR and s/Switch/Update/ in the change note as well as noting the cadvisor version bump? |
|
/retest |
|
I think this needs rebase... |
|
@kolyshkin: PR needs rebase. |
|
This has been replaced with #109029 |
|
/close |
|
@bobbypage: Closed this PR. |
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
TL;DR: simplify/improve cgroup management code in kubelet/cm. Bump to runc 1.1.0.
This updates vendored runc/libcontainer to 1.1.0
(release notes at https://github.com/opencontainers/runc/releases)
and google/cadvisor to a version with PR google/cadvisor#3031 merged.
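Changes in vendor are generated by (roughly):

    ./hack/pin-dependency.sh github.com/google/cadvisor v0.44.0
    ./hack/pin-dependency.sh github.com/opencontainers/runc v1.1.0
    ./hack/update-vendor.sh
    ./hack/lint-dependencies.sh # And follow all its recommendations.
    ./hack/update-vendor.sh
    ./hack/update-internal-modules.sh
    ./hack/lint-dependencies.sh # Re-check everything again.
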
Changes in pkg/kubelet/cm are to adopt new runc 1.1 libcontainer/cgroups
APIs and build on its improvements. In particular:
1. simplify cgroup manager instantiation, using a new, easier way of
   libcontainer/cgroups/manager.New;
2. replace libcontainerAdapter with a boolean variable (all it did
   was passing on whether systemd manager should be used);
3. trivial change due to removed cgroupfs.HugePageSizes and added
   cgroups.HugePageSizes();
4. do not calculate cgroup paths in update / destroy, since libcontainer
   cgroup managers now calculate the paths upon creation (previously,
   they were doing that only in Apply, so using e.g. Set or Destroy right
   after creation was impossible without specifying paths).

🔔 We currently still calculate cgroup paths in Exists -- this is to be
addressed separately.
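
For readers unfamiliar with the path-calculation change in the last item, here is a rough, self-contained sketch of the idea. It does not use the runc API: toyManager, newLazyManager, newEagerManager, computePaths and the subsystem list are all hypothetical names made up for illustration. The only point is that a manager which knows its paths at creation can serve Destroy right away, while one that fills them in only in Apply cannot.

```go
package main

import (
	"errors"
	"fmt"
	"path"
)

// toyManager is a hypothetical stand-in for a libcontainer cgroup manager.
// It is NOT the runc API; it only models when the paths become known.
type toyManager struct {
	paths map[string]string // subsystem -> cgroup directory
}

// subsystems is an illustrative subset of cgroup v1 controllers.
var subsystems = []string{"cpu", "memory", "pids"}

// computePaths derives one directory per subsystem from the cgroup name.
func computePaths(root, name string) map[string]string {
	paths := make(map[string]string, len(subsystems))
	for _, s := range subsystems {
		paths[s] = path.Join(root, s, name)
	}
	return paths
}

// newLazyManager models the older behavior described above: the paths
// stay empty until some Apply-like call fills them in.
func newLazyManager() *toyManager { return &toyManager{} }

// newEagerManager models the newer behavior: paths are known at creation.
func newEagerManager(root, name string) *toyManager {
	return &toyManager{paths: computePaths(root, name)}
}

// Destroy only reports which directories it would remove. It fails when
// the manager has no paths, which is why callers had to supply them.
func (m *toyManager) Destroy() ([]string, error) {
	if len(m.paths) == 0 {
		return nil, errors.New("no cgroup paths known")
	}
	dirs := make([]string, 0, len(m.paths))
	for _, s := range subsystems {
		dirs = append(dirs, m.paths[s])
	}
	return dirs, nil
}

func main() {
	if _, err := newLazyManager().Destroy(); err != nil {
		fmt.Println("lazy manager:", err)
	}
	dirs, _ := newEagerManager("/sys/fs/cgroup", "kubepods/pod1234").Destroy()
	fmt.Println("eager manager would remove:", dirs)
}
```

In this toy model the caller of the eager manager no longer has to compute and pass paths before Destroy, which is the simplification the update / destroy code paths get from the bump.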
Which issue(s) this PR fixes:
none
Special notes for your reviewer:
Please review commit by commit, and see individual commit messages for more details.
This is currently a draft, as the code is on top of #108597 ("kubelet/cm: refactor, prepare for runc 1.1 bump"). Once that PR is merged, this will be rebased.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
none