Add blog article about cgroup v2#47342

Closed
pacoxu wants to merge 3 commits intokubernetes:mainfrom
pacoxu:cgroupv2-blog

Conversation

@pacoxu
Member

@pacoxu pacoxu commented Aug 2, 2024

Description

In v1.31, we announced that cgroup v1 is now in maintenance mode.
The top FAQs are:

  1. How to migrate?
  2. What are the benefits?
  3. Can cgroup v2 cover everything users do with cgroup v1, including resource limits and monitoring?

The current items include

Issue

Closes: None

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 2, 2024
@k8s-ci-robot k8s-ci-robot added the area/blog Issues or PRs related to the Kubernetes Blog subproject label Aug 2, 2024
@k8s-ci-robot k8s-ci-robot requested a review from sftim August 2, 2024 09:06
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign sftim for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/en Issues or PRs related to English language size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 2, 2024
@pacoxu pacoxu changed the title [WIP][Not Urget or for v1.31] add cgroup v2 blog which focuses on v1/v1 comparision and community update [WIP][Not urgent or applicable to v1.31] Add cgroup v2 blog, focusing on v1/v2 comparison and community updates #47342 Aug 2, 2024
@netlify

netlify Bot commented Aug 2, 2024

Pull request preview available for checking

Built without sensitive environment variables

Name Link
🔨 Latest commit b128440
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-io-main-staging/deploys/6705f5a989c3440008836474
😎 Deploy Preview https://deploy-preview-47342--kubernetes-io-main-staging.netlify.app

Comment thread content/en/blog/_posts/2024-11-10-migrate-cgroup-v2.md Outdated
@sftim
Contributor

sftim commented Aug 7, 2024

/hold

As drafted this mentions changes from v1.31; we shouldn't publish this before v1.31 is released.
It's possible we'll agree an exception to the usual policy.

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 7, 2024
@pacoxu pacoxu marked this pull request as ready for review August 13, 2024 09:34
@pacoxu pacoxu changed the title [WIP][Not urgent or applicable to v1.31] Add cgroup v2 blog, focusing on v1/v2 comparison and community updates #47342 [Not urgent or applicable to v1.31] Add cgroup v2 blog, focusing on v1/v2 comparison and community updates #47342 Aug 13, 2024
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 13, 2024
@pacoxu
Member Author

pacoxu commented Aug 13, 2024

/cc @harche @haircommander @mrunalp
I tried to include some comparisons and community updates here, but this is still somewhat of a draft.
I would like to publish it in late 2024.

@harche
Contributor

harche commented Aug 13, 2024

/cc @harche @haircommander @mrunalp I tried to include some comparisons and community updates here, but this is still somewhat of a draft. I would like to publish it in late 2024.

Thanks @pacoxu

@sftim
Contributor

sftim commented Sep 8, 2024

/retitle Add blog article about cgroup v2

@k8s-ci-robot k8s-ci-robot changed the title [Not urgent or applicable to v1.31] Add cgroup v2 blog, focusing on v1/v2 comparison and community updates #47342 Add blog article about cgroup v2 Sep 8, 2024
@sftim
Contributor

sftim commented Sep 8, 2024

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 8, 2024
@sftim
Contributor

sftim commented Sep 8, 2024

No need to hold for v1.31 but does need a publication date. How about the 1st of October - does that work?

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 8, 2024
@pacoxu
Member Author

pacoxu commented Sep 9, 2024

No need to hold for v1.31 but does need a publication date. How about the 1st of October - does that work?

/hold

OK. I will update the publication date.

@pacoxu
Member Author

pacoxu commented Sep 23, 2024

The current blog can be previewed at https://deploy-preview-47342--kubernetes-io-main-staging.netlify.app/blog/2024/10/01/kubernetes-cgroup-v2-shift/.

For migration, we may add more details in https://kubernetes.io/docs/concepts/architecture/cgroups/#migrating-cgroupv2 if needed.

I am also looking forward to adding a section on problems that cgroup v2, like cgroup v1, does not solve (CPU throttling, for example, IIUC).

Contributor

@sftim sftim left a comment


Thanks! I have some suggestions.

Comment thread content/en/blog/_posts/2024-10-01-migrate-cgroup-v2.md Outdated
Comment on lines +129 to +130
In v1.31, [KEP-4033](https://github.com/kubernetes/enhancements/issues/4033) is beta to extend CRI API for the kubelet
to discover the cgroup driver from the container runtime. This will help installer and kubelet to
Contributor

Looks unfinished.

Member Author

Yes. I still want to add more information here and some more paragraphs.

Comment thread content/en/blog/_posts/2024-10-01-migrate-cgroup-v2.md Outdated
Member

@neolit123 neolit123 left a comment


overall LGTM, thanks. requires some proofreading.

also needs a SIG Node reviewer.
maybe you can try pinging the people who submitted the maintenance mode KEP.

layout: blog
title: 'The Shift to cgroup v2 in Kubernetes: What You Need to Know'
date: 2024-10-01
slug: kubernetes-cgroup-v2-shift
Member

it should be cgroups plural, not a singular cgroup
https://en.wikipedia.org/wiki/Cgroups

applies to a number of places in the doc.

Member Author

Does this mean that all cgroup should be cgroups?

Member

Comment thread content/en/blog/_posts/2024-10-01-migrate-cgroup-v2.md Outdated
`cgroups` (control groups) are a Linux kernel feature used for managing system resources.
Kubernetes uses cgroups to allocate resources like CPU and memory to containers,
ensuring that applications run smoothly without interfering with each other.
With the release of Kubernetes v1.31, cgroup v1 has been moved into maintenance mode.
Member

@neolit123 neolit123 Oct 8, 2024


it's not very clear in the blog post what "maintenance mode" means.
maybe add a sentence or two after this one that explains it. try in a new paragraph.

Member Author

@pacoxu
Member Author

pacoxu commented Oct 14, 2024

/cc @sohankunkerkar


- `stat -fc %T /sys/fs/cgroup/`: Check if cgroups v2 is enabled which will return `cgroup2fs`
- `systemctl list-units kube* --type=slice` or `--type=scope`: List kube related units that systemd currently has in memory.
- `bpftool cgroup list /sys/fs/cgroup/*`: List all programs attached to the cgroup CGROUP.

Does `bpftool cgroup list /sys/fs/cgroup/*` actually show all programs?
In my environment, I could not list them.
I think `bpftool cgroup tree` is better suited to listing all attached programs.

BTW, I recently fixed a bug in `bpftool cgroup`:
https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/
So if you want to try `bpftool cgroup tree` in your environment, please keep that in mind.
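
For the first check in the quoted list, a small wrapper can turn the `stat` output into an explicit answer. This is a minimal sketch and not part of the PR: the function name `cgroup_version_from_fstype` is hypothetical, and it assumes that `stat -fc %T /sys/fs/cgroup/` prints `cgroup2fs` on a cgroup v2 host and `tmpfs` on a cgroup v1 host, which is what the Kubernetes cgroup documentation describes.

```shell
# Map the filesystem type reported for /sys/fs/cgroup to a cgroup version.
# cgroup_version_from_fstype is a hypothetical helper name, not an existing tool.
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo "v2" ;;       # unified hierarchy
    tmpfs)     echo "v1" ;;       # legacy layout: a tmpfs holding per-controller mounts
    *)         echo "unknown" ;;  # anything else: do not guess
  esac
}

# Usage on a node (the result depends on the host):
#   cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup/)"
```

Keeping the mapping separate from the `stat` call makes it easy to check the logic without access to a real node.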

@KentaTada

From kubernetes/system-validators#41 (comment)
I reviewed this blog and left a comment #47342 (comment)
Thank you for letting me participate in the review.

With the release of Kubernetes v1.31, cgroups v1 has been moved into [maintenance mode]/blog/2024/08/14/kubernetes-1-31-moving-cgroup-v1-support-maintenance-mode/).
For cgroups v2, it graduated in v1.25 2 years ago.

Top FAQs are why we should migrate, what's the benifits and lost,
Member

Suggested change
- Top FAQs are why we should migrate, what's the benifits and lost,
+ The top FAQs cover three main areas: why to migrate, the benefits and drawbacks, and key points to keep in mind when using cgroup v2.

Top FAQs are why we should migrate, what's the benifits and lost,
and what needs to be noticed when using cgroups v2.

## cgroups v1 problem, and solutions in cgroups v2
Member

Suggested change
- ## cgroups v1 problem, and solutions in cgroups v2
+ ## Limitations of cgroup v1 and Improvements with cgroup v2

Member

I think I would restructure this blog in the following manner:

1. Introduction
2. Why Migrate to cgroup v2?
  a.  Benefits of cgroup v2
  b.  Limitations of cgroup v1 addressed by v2
3. Adopting cgroup v2
  a. Requirements
  b. Troubleshooting and Monitoring Tools
4. Conclusion
5. Further reading 

@@ -0,0 +1,162 @@
---
layout: blog
title: 'The Shift to cgroups v2 in Kubernetes: What You Need to Know'
Member

s/cgroups/cgroup/g

Member

I think you can switch to cgroup wherever you are using cgroups

Kubernetes uses cgroups to allocate resources like CPU and memory to containers,
ensuring that applications run smoothly without interfering with each other.
With the release of Kubernetes v1.31, cgroups v1 has been moved into [maintenance mode]/blog/2024/08/14/kubernetes-1-31-moving-cgroup-v1-support-maintenance-mode/).
For cgroups v2, it graduated in v1.25 2 years ago.
Member

s/cgroups/cgroup/g

Contributor

(original is also fine)

Contributor

@sftim sftim left a comment


Some more feedback

User Namespace minimal kernel version is 6.5, according to
[KEP-127](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/127-user-namespaces/README.md).
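
The kernel minimums quoted in the draft (6.5 for user namespaces, 4.5 for the first appearance of cgroup v2) can be checked against a node's running kernel. The following is a rough sketch: `kernel_at_least` is a hypothetical helper name, and it assumes a `sort` that supports version ordering with `-V` (GNU coreutils does).

```shell
# kernel_at_least MIN CURRENT
# Succeeds when CURRENT is the same as or newer than MIN.
# kernel_at_least is a hypothetical helper; it relies on `sort -V` (version sort).
kernel_at_least() {
  min="$1"
  cur="$2"
  # After a version sort, the first line is the older of the two versions.
  oldest="$(printf '%s\n%s\n' "$min" "$cur" | sort -V | head -n 1)"
  [ "$oldest" = "$min" ]
}

# Usage on a node:
#   kernel_at_least 6.5 "$(uname -r)" && echo "kernel meets the 6.5 minimum cited from KEP-127"
```

Distribution kernels often carry backports, so a version comparison like this is only a first approximation of feature availability.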

### What's more?
Contributor

Suggested change
- ### What's more?
+ ### What else?

Comment on lines +74 to +75
- In cgroups v1, the device access control are defined in the static configuration/.
- cgroups v2 device controller has no interface files and is implemented on top of cgroup BPF.
Contributor

Suggested change
- - In cgroups v1, the device access control are defined in the static configuration/.
- - cgroups v2 device controller has no interface files and is implemented on top of cgroup BPF.
+ - In cgroup v1, device access controls are defined within static configuration.
+ - The cgroup v2 device controller has no interface files, and is implemented on top of cgroup BPF.

Comment on lines +77 to +79
by default at the path /run/cilium/cgroupv2 .
2. PSI is planned in a future release [KEP-4205](https://github.com/kubernetes/enhancements/issues/4205),
but pending due to runc 1.2.0 release delay.
Contributor

Suggested change
- by default at the path /run/cilium/cgroupv2 .
- 2. PSI is planned in a future release [KEP-4205](https://github.com/kubernetes/enhancements/issues/4205),
- but pending due to runc 1.2.0 release delay.
+ by default at the path `/run/cilium/cgroupv2`.
+ 2. Support for pressure stall reporting (PSI) is planned and may arrive in a future release - see [KEP-4205](https://github.com/kubernetes/enhancements/issues/4205).

Do we need to explain what Cilium is?
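
On the PSI item in the suggestion above: independently of how Kubernetes ends up exposing it under KEP-4205, the Linux kernel already publishes pressure stall information as text in `/proc/pressure/cpu`, `/proc/pressure/memory`, and `/proc/pressure/io`, and per cgroup in the cgroup v2 files `cpu.pressure`, `memory.pressure`, and `io.pressure`. A minimal sketch for pulling one value out of that text follows; `psi_avg10` is a hypothetical helper name, and the snippet assumes PSI is enabled on the host.

```shell
# psi_avg10 KIND
# Reads PSI text on stdin and prints the avg10 value from the "some" or "full" line.
# psi_avg10 is a hypothetical helper name, not an existing tool.
psi_avg10() {
  awk -v kind="$1" '$1 == kind { sub(/^avg10=/, "", $2); print $2 }'
}

# Usage on a host with PSI enabled:
#   psi_avg10 some < /proc/pressure/memory
```

The `some` line reports time in which at least one task was stalled on the resource, while `full` reports time in which all non-idle tasks were stalled at once.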


### Use systemd as cgroup driver

[Configure the kubelet's cgroup driver to match the container runtime cgroup driver](https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/).
Contributor

Suggested change
- [Configure the kubelet's cgroup driver to match the container runtime cgroup driver](https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/).
+ [Configure the kubelet's cgroup driver to match the container runtime cgroup driver](/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/).
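
To see which driver a kubelet is currently configured with, one option is to read the `cgroupDriver` field from its configuration file. This is a sketch under stated assumptions: `kubelet_cgroup_driver` is a hypothetical helper, the parsing is line-based and expects a top-level `cgroupDriver: <value>` line, and `/var/lib/kubelet/config.yaml` is the kubeadm default location, which may differ on other installers.

```shell
# kubelet_cgroup_driver FILE
# Prints the cgroupDriver value found in a kubelet configuration file.
# Hypothetical helper: simple line-based parsing, not a full YAML parser.
kubelet_cgroup_driver() {
  sed -n 's/^cgroupDriver:[[:space:]]*//p' "$1" | tr -d "\"'" | head -n 1
}

# Usage on a kubeadm node (adjust the path if your installer differs):
#   kubelet_cgroup_driver /var/lib/kubelet/config.yaml
```

If the field is absent the function prints nothing, and the kubelet then uses its built-in default or, where supported, the driver reported by the container runtime.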

Comment on lines +127 to +130
In v1.31, [KEP-4033](https://github.com/kubernetes/enhancements/issues/4033) is beta to extend CRI API for the kubelet
to discover the cgroup driver from the container runtime. This will help installer and kubelet to autodetect

- TODO
Contributor

This doesn't look ready for review.

`cgroups` (control groups) are a Linux kernel feature used for managing system resources.
Kubernetes uses cgroups to allocate resources like CPU and memory to containers,
ensuring that applications run smoothly without interfering with each other.
With the release of Kubernetes v1.31, cgroups v1 has been moved into [maintenance mode]/blog/2024/08/14/kubernetes-1-31-moving-cgroup-v1-support-maintenance-mode/).
Contributor

Suggested change
- With the release of Kubernetes v1.31, cgroups v1 has been moved into [maintenance mode]/blog/2024/08/14/kubernetes-1-31-moving-cgroup-v1-support-maintenance-mode/).
+ With the release of Kubernetes v1.31, support for v1 cgroup management has been moved into [maintenance mode](/blog/2024/08/14/kubernetes-1-31-moving-cgroup-v1-support-maintenance-mode/).


### active_file memory is not considered as available memory

There is [a known issue](/docs/concepts/scheduling-eviction/node-pressure-eviction/#active-file-memory-is-not-considered-as-available-memory) of page cache: [#43916](https://github.com/kubernetes/kubernetes/issues/43916).
Contributor

(nit)

Suggested change
- There is [a known issue](/docs/concepts/scheduling-eviction/node-pressure-eviction/#active-file-memory-is-not-considered-as-available-memory) of page cache: [#43916](https://github.com/kubernetes/kubernetes/issues/43916).
+ There is a [known issue](/docs/concepts/scheduling-eviction/node-pressure-eviction/#active-file-memory-is-not-considered-as-available-memory) around page cache: [#43916](https://github.com/kubernetes/kubernetes/issues/43916).


Support for Memory QoS was initially added in Kubernetes v1.22,
and later some limitations around the formula for calculating `memory.high` were identified.
These limitations are addressed in Kubernetes v1.27.
Contributor

Suggested change
- These limitations are addressed in Kubernetes v1.27.
+ These limitations were addressed in Kubernetes v1.27.

Comment on lines +41 to +42
However, until v1.31, the feature gate is still alpha due to another known issue
that application pod may be hanging forever due to heavy memory reclaiming.
Contributor

Suggested change
- However, until v1.31, the feature gate is still alpha due to another known issue
- that application pod may be hanging forever due to heavy memory reclaiming.
+ However, until v1.31, the feature gate for memory QoS was still alpha due to another known issue
+ (that an application pod may be hanging forever due to heavy memory reclaiming).
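
The draft mentions the `memory.high` formula without stating it. As an illustration only, and as an assumption not confirmed anywhere in this thread, the revised calculation described in the Kubernetes v1.27 Memory QoS announcement is, to my understanding, `floor((request + factor * (limit - request)) / pageSize) * pageSize` with a default throttling factor of 0.9. The sketch below implements that assumed formula with integer arithmetic; `memory_high` is a hypothetical helper name.

```shell
# memory_high REQUEST LIMIT [FACTOR_PERCENT] [PAGE_SIZE]
# All sizes are in bytes. FACTOR_PERCENT is the throttling factor as an integer
# percentage (assumed default: 90, meaning 0.9). PAGE_SIZE defaults to 4096.
# For a container without a memory limit, pass node allocatable memory as LIMIT.
# Assumption: this mirrors the v1.27 formula as recalled above; verify it
# against the Kubernetes documentation before relying on it.
memory_high() {
  request="$1"
  limit="$2"
  factor="${3:-90}"
  page="${4:-4096}"
  # request + factor * (limit - request), in integer arithmetic
  raw=$(( request + factor * (limit - request) / 100 ))
  # round down to a whole number of pages
  echo $(( raw / page * page ))
}

# Usage: a 1 GiB request with a 2 GiB limit
#   memory_high 1073741824 2147483648
```

The value lands between the request and the limit, so throttling and reclaim begin before the container reaches its hard limit.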


#### kernel updates around cgroups v2

cgroups v2 first appeared in Linux Kernel 4.5 in 2016.
Contributor

How about:

Suggested change
- cgroups v2 first appeared in Linux Kernel 4.5 in 2016.
+ When Kubernetes was first announced, in 2014, only v1 cgroups existed.
+ Version 2 cgroup management first appeared in Linux kernel 4.5, released in 2016.

@sftim
Contributor

sftim commented Jan 31, 2025

@pacoxu, what's your intention with this PR? There's feedback pending.

@pacoxu
Member Author

pacoxu commented Feb 5, 2025

My current plan is to finish this in 2025 Q2.

@nate-double-u
Contributor

Thanks for this @pacoxu, and thanks to all the reviewers.

I'm closing this PR as stale since some change requests were made but not responded to in over 2 weeks. @pacoxu, I see you plan on picking it back up in a couple months, so please feel free to reopen it when you are ready to continue the work.

/close

@k8s-ci-robot
Contributor

@nate-double-u: Closed this PR.



Labels

area/blog Issues or PRs related to the Kubernetes Blog subproject cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. language/en Issues or PRs related to English language size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

Status: Planned


8 participants