Add blog article about cgroup v2 #47342
[APPROVALNOTIFIER] This PR is NOT APPROVED. |
|
/hold As drafted this mentions changes from v1.31; we shouldn't publish this before v1.31 is released. |
|
/cc @harche @haircommander @mrunalp |
Thanks @pacoxu |
|
/retitle Add blog article about cgroup v2 |
|
/hold cancel |
|
No need to hold for v1.31 but does need a publication date. How about the 1st of October - does that work? /hold |
…tatus on sub controllers
66efcbd to b8bddc4
OK. I will update the publication date. |
|
The current blog can be previewed at https://deploy-preview-47342--kubernetes-io-main-staging.netlify.app/blog/2024/10/01/kubernetes-cgroup-v2-shift/. For migration, we may add more details to https://kubernetes.io/docs/concepts/architecture/cgroups/#migrating-cgroupv2 if needed. And I am looking forward to adding a session for |
sftim left a comment
Thanks! I recommend some suggestions.
| In v1.31, [KEP-4033](https://github.com/kubernetes/enhancements/issues/4033) is beta to extend CRI API for the kubelet | ||
| to discover the cgroup driver from the container runtime. This will help installer and kubelet to |
Yes. I still want to add more information here and some more paragraphs.
| layout: blog | ||
| title: 'The Shift to cgroup v2 in Kubernetes: What You Need to Know' | ||
| date: 2024-10-01 | ||
| slug: kubernetes-cgroup-v2-shift |
it should be cgroups plural, not a singular cgroup
https://en.wikipedia.org/wiki/Cgroups
applies to a number of places in the doc.
Does this mean that all cgroup should be cgroups?
It should be cgroup according to kernel docs https://docs.kernel.org/admin-guide/cgroup-v2.html
| `cgroups` (control groups) are a Linux kernel feature used for managing system resources. | ||
| Kubernetes uses cgroups to allocate resources like CPU and memory to containers, | ||
| ensuring that applications run smoothly without interfering with each other. | ||
| With the release of Kubernetes v1.31, cgroup v1 has been moved into maintenance mode. |
it's not very clear in the blog post what "maintenance mode" means.
maybe add a sentence or two after this one that explains it. try in a new paragraph.
The details are in another blog.
https://github.com/pacoxu/website/blob/b8bddc4aff4e9f67a35957dd07acce7b3d0ebdea/content/en/blog/_posts/2024-08-14-moving-cgroup-v1-support-maintenance-mode-kubernetes-1-31.md#L69-L76
Could we just add a link here?
|
/cc @sohankunkerkar |
|
|
||
| - `stat -fc %T /sys/fs/cgroup/`: Check if cgroups v2 is enabled which will return `cgroup2fs` | ||
| - `systemctl list-units kube* --type=slice` or `--type=scope`: List kube related units that systemd currently has in memory. | ||
| - `bpftool cgroup list /sys/fs/cgroup/*`: List all programs attached to the cgroup CGROUP. |
Does `bpftool cgroup list /sys/fs/cgroup/*` actually show all programs?
In my environment, it did not list them.
I think `bpftool cgroup tree` is better suited to listing all attached programs.
BTW, I have recently fixed the bug of bpftool cgroup as below:
https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/
So if you want to try `bpftool cgroup tree` in your environment, please keep that in mind.
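For anyone who wants to script the first check in the list above, here is a minimal Python sketch. It mirrors the idea behind `stat -fc %T /sys/fs/cgroup/` by looking at the filesystem type recorded for that mount point in `/proc/mounts`-style text. The function name and the return labels are hypothetical, chosen only for illustration. Note that `/proc/mounts` reports the type as `cgroup2`, where `stat` prints `cgroup2fs`.

```python
def cgroup_mode_from_mounts(mounts_text: str, mountpoint: str = "/sys/fs/cgroup") -> str:
    """Classify the cgroup setup from /proc/mounts-style text.

    Returns "v2" when `mountpoint` itself is a cgroup2 mount, "v1" when it is a
    tmpfs (the usual parent of per-controller v1 mounts), else "unknown".
    The labels are illustrative, not an official API.
    """
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[1] == mountpoint:
            fstype = fields[2]  # a later entry for the same path wins
    if fstype == "cgroup2":
        return "v2"
    if fstype == "tmpfs":
        return "v1"
    return "unknown"
```

On a Linux node this could be called as `cgroup_mode_from_mounts(open("/proc/mounts").read())`. A `tmpfs` result also covers hybrid setups, since the unified hierarchy is then mounted below `/sys/fs/cgroup` rather than on it.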
|
From kubernetes/system-validators#41 (comment) |
| With the release of Kubernetes v1.31, cgroups v1 has been moved into [maintenance mode]/blog/2024/08/14/kubernetes-1-31-moving-cgroup-v1-support-maintenance-mode/). | ||
| For cgroups v2, it graduated in v1.25 2 years ago. | ||
|
|
||
| Top FAQs are why we should migrate, what's the benifits and lost, |
| Top FAQs are why we should migrate, what's the benifits and lost, | |
| The top FAQs cover three main areas: why to migrate, the benefits and drawbacks, and key points to keep in mind when using cgroup v2. |
| Top FAQs are why we should migrate, what's the benifits and lost, | ||
| and what needs to be noticed when using cgroups v2. | ||
|
|
||
| ## cgroups v1 problem, and solutions in cgroups v2 |
| ## cgroups v1 problem, and solutions in cgroups v2 | |
| ## Limitations of cgroup v1 and Improvements with cgroup v2 |
I think I would restructure this blog in the following manner:
1. Introduction
2. Why Migrate to cgroup v2?
   a. Benefits of cgroup v2
   b. Limitations of cgroup v1 addressed by v2
3. Adopting cgroup v2
   a. Requirements
   b. Troubleshooting and Monitoring Tools
4. Conclusion
5. Further reading
| @@ -0,0 +1,162 @@ | |||
| --- | |||
| layout: blog | |||
| title: 'The Shift to cgroups v2 in Kubernetes: What You Need to Know' | |||
I think you can switch to cgroup wherever you are using cgroups
| Kubernetes uses cgroups to allocate resources like CPU and memory to containers, | ||
| ensuring that applications run smoothly without interfering with each other. | ||
| With the release of Kubernetes v1.31, cgroups v1 has been moved into [maintenance mode]/blog/2024/08/14/kubernetes-1-31-moving-cgroup-v1-support-maintenance-mode/). | ||
| For cgroups v2, it graduated in v1.25 2 years ago. |
| User Namespace minimal kernel version is 6.5, according to | ||
| [KEP-127](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/127-user-namespaces/README.md). | ||
|
|
||
| ### What's more? |
| ### What's more? | |
| ### What else? |
| - In cgroups v1, the device access control are defined in the static configuration/. | ||
| - cgroups v2 device controller has no interface files and is implemented on top of cgroup BPF. |
| - In cgroups v1, the device access control are defined in the static configuration/. | |
| - cgroups v2 device controller has no interface files and is implemented on top of cgroup BPF. | |
| - In cgroup v1, device access controls are defined within static configuration. | |
| - The cgroup v2 device controller has no interface files, and is implemented on top of cgroup BPF. |
| by default at the path /run/cilium/cgroupv2 . | ||
| 2. PSI is planned in a future release [KEP-4205](https://github.com/kubernetes/enhancements/issues/4205), | ||
| but pending due to runc 1.2.0 release delay. |
| by default at the path /run/cilium/cgroupv2 . | |
| 2. PSI is planned in a future release [KEP-4205](https://github.com/kubernetes/enhancements/issues/4205), | |
| but pending due to runc 1.2.0 release delay. | |
| by default at the path `/run/cilium/cgroupv2`. | |
| 2. Support for pressure stall reporting (PSI) is planned and may arrive in a future release - see [KEP-4205](https://github.com/kubernetes/enhancements/issues/4205). |
Do we need to explain what Cilium is?
|
|
||
| ### Use systemd as cgroup driver | ||
|
|
||
| [Configure the kubelet's cgroup driver to match the container runtime cgroup driver](https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/). |
| [Configure the kubelet's cgroup driver to match the container runtime cgroup driver](https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/). | |
| [Configure the kubelet's cgroup driver to match the container runtime cgroup driver](/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/). |
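To make the "match the container runtime" requirement concrete, here is a small Python sketch of a mismatch check. It assumes the kubelet configuration has already been loaded into a dict and that the runtime's driver is known by some other means; both function names are hypothetical. The `cgroupDriver` field and its `cgroupfs` default come from the kubelet configuration API as I understand it, so please verify them against the KubeletConfiguration reference.

```python
# Assumption: the kubelet falls back to "cgroupfs" when cgroupDriver is unset.
KUBELET_DEFAULT_DRIVER = "cgroupfs"


def effective_kubelet_driver(kubelet_config: dict) -> str:
    """Return the cgroup driver the kubelet would use for this config dict."""
    return kubelet_config.get("cgroupDriver") or KUBELET_DEFAULT_DRIVER


def check_cgroup_drivers(kubelet_config: dict, runtime_driver: str) -> list:
    """Return a list of human-readable problems; an empty list means no mismatch."""
    problems = []
    kubelet_driver = effective_kubelet_driver(kubelet_config)
    if kubelet_driver != runtime_driver:
        problems.append(
            "kubelet uses %r but the container runtime uses %r"
            % (kubelet_driver, runtime_driver)
        )
    return problems
```

For example, `check_cgroup_drivers({}, "systemd")` reports one problem, because an unset `cgroupDriver` is treated as `cgroupfs` here. Where the kubelet can discover the driver from the runtime (the KEP-4033 work mentioned in this post), a manual check like this should matter less.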
| In v1.31, [KEP-4033](https://github.com/kubernetes/enhancements/issues/4033) is beta to extend CRI API for the kubelet | ||
| to discover the cgroup driver from the container runtime. This will help installer and kubelet to autodetect | ||
|
|
||
| - TODO |
This doesn't look ready for review.
| `cgroups` (control groups) are a Linux kernel feature used for managing system resources. | ||
| Kubernetes uses cgroups to allocate resources like CPU and memory to containers, | ||
| ensuring that applications run smoothly without interfering with each other. | ||
| With the release of Kubernetes v1.31, cgroups v1 has been moved into [maintenance mode]/blog/2024/08/14/kubernetes-1-31-moving-cgroup-v1-support-maintenance-mode/). |
| With the release of Kubernetes v1.31, cgroups v1 has been moved into [maintenance mode]/blog/2024/08/14/kubernetes-1-31-moving-cgroup-v1-support-maintenance-mode/). | |
| With the release of Kubernetes v1.31, support for v1 cgroup management has been moved into [maintenance mode](/blog/2024/08/14/kubernetes-1-31-moving-cgroup-v1-support-maintenance-mode/). |
|
|
||
| ### active_file memory is not considered as available memory | ||
|
|
||
| There is [a known issue](/docs/concepts/scheduling-eviction/node-pressure-eviction/#active-file-memory-is-not-considered-as-available-memory) of page cache: [#43916](https://github.com/kubernetes/kubernetes/issues/43916). |
(nit)
| There is [a known issue](/docs/concepts/scheduling-eviction/node-pressure-eviction/#active-file-memory-is-not-considered-as-available-memory) of page cache: [#43916](https://github.com/kubernetes/kubernetes/issues/43916). | |
| There is a [known issue](/docs/concepts/scheduling-eviction/node-pressure-eviction/#active-file-memory-is-not-considered-as-available-memory) around page cache: [#43916](https://github.com/kubernetes/kubernetes/issues/43916). |
|
|
||
| Support for Memory QoS was initially added in Kubernetes v1.22, | ||
| and later some limitations around the formula for calculating `memory.high` were identified. | ||
| These limitations are addressed in Kubernetes v1.27. |
| These limitations are addressed in Kubernetes v1.27. | |
| These limitations were addressed in Kubernetes v1.27. |
| However, until v1.31, the feature gate is still alpha due to another known issue | ||
| that application pod may be hanging forever due to heavy memory reclaiming. |
| However, until v1.31, the feature gate is still alpha due to another known issue | |
| that application pod may be hanging forever due to heavy memory reclaiming. | |
| However, until v1.31, the feature gate for memory QoS was still alpha due to another known issue | |
| (that an application pod may be hanging forever due to heavy memory reclaiming). |
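Since the paragraph refers to the formula for `memory.high` without showing it, a worked sketch may help readers. The Python below follows the formula as I recall it from the v1.27 Memory QoS announcement: the request plus a throttling factor times the gap up to the limit (or node allocatable memory when no limit is set), rounded down to a page boundary. The formula, the 0.9 default factor, and the 4096-byte page size are assumptions not stated in this PR, and the function and parameter names are hypothetical, so please check them against the upstream docs.

```python
import math


def memory_high(request_bytes: int,
                limit_or_allocatable_bytes: int,
                throttling_factor: float = 0.9,  # assumed default
                page_size: int = 4096) -> int:   # assumed page size
    """Sketch of the memory.high calculation, rounded down to a page boundary."""
    raw = request_bytes + throttling_factor * (
        limit_or_allocatable_bytes - request_bytes
    )
    return math.floor(raw / page_size) * page_size
```

With a 1 GiB request and a 2 GiB limit, `memory_high(1 << 30, 2 << 30)` gives 2040107008 bytes, which is slightly below the limit. That gap is where throttling and reclaim would kick in before the hard limit is reached.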
|
|
||
| #### kernel updates around cgroups v2 | ||
|
|
||
| cgroups v2 first appeared in Linux Kernel 4.5 in 2016. |
How about:
| cgroups v2 first appeared in Linux Kernel 4.5 in 2016. | |
| When Kubernetes was first announced, in 2014, only v1 cgroups existed. | |
| Version 2 cgroup management first appeared in Linux kernel 4.5, released in 2016. |
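The kernel versions quoted in this post (4.5 for the first appearance of cgroup v2, 6.5 as the user namespaces minimum) can be compared against a node's kernel release with a sketch like the one below. The function and constant names are hypothetical. Versions are compared numerically as tuples, because a plain string comparison would wrongly rank `4.18` below `4.5`.

```python
import re

# Minimums taken from the text of this post.
CGROUP_V2_FIRST_KERNEL = (4, 5)
USERNS_MIN_KERNEL = (6, 5)


def parse_kernel_release(release: str) -> tuple:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release: str, minimum: tuple) -> bool:
    """Return True when the release's (major, minor) is >= the minimum tuple."""
    return parse_kernel_release(release) >= minimum
```

On a node, `platform.release()` from the standard library returns the release string to pass in. Distribution kernels often backport features, so a version comparison like this is only a rough signal.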
|
@pacoxu, what's your intention with this PR? There's feedback pending. |
|
My current plan is to finish this in 2025 Q2. |
|
Thanks for this @pacoxu, and thanks to all the reviewers. I'm closing this PR as stale since some change requests were made but not responded to in over 2 weeks. @pacoxu, I see you plan on picking it back up in a couple months, so please feel free to reopen it when you are ready to continue the work. /close |
|
@nate-double-u: Closed this PR. |
Description
In v1.31, we announce that cgroup v1 is now in maintenance mode.
Top FAQs are
The current items include
Issue
Closes: None