Kubelet regularly freeze control groups causing issues further down #104280

@odinuge

What happened:

On cgroup v1 with the systemd cgroup driver, the kubelet regularly freezes and thaws its control groups, which causes issues further down.

What you expected to happen:

The control groups should never be frozen.

How to reproduce it (as minimally and precisely as possible):

Running the command below should never print anything, since freezer.state should always be THAWED.

# zsh globs
while true ; do cat /sys/fs/cgroup/freezer/kubepods.slice/{*/{*/,},}freezer.state; done | grep FROZEN

However, on master and v1.22.0, the loop regularly prints FROZEN.
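The one-liner above depends on zsh brace and glob expansion. As a rough, shell-agnostic sketch of the same check, the function below walks the freezer hierarchy with `find` and prints a timestamp for each FROZEN observation. The name `watch_freezer` and its optional iteration count are hypothetical, introduced only for illustration. Because it spawns `find` on every pass, it polls more slowly than the original loop and may miss very short freezes.

```shell
# Sketch only: watch_freezer is a hypothetical helper, not part of kubelet or runc.
# Usage: watch_freezer ROOT [ITERATIONS]
# ITERATIONS=0 (the default) loops until interrupted.
watch_freezer() (
  root="$1"
  iterations="${2:-0}"
  i=0
  while [ "$iterations" -eq 0 ] || [ "$i" -lt "$iterations" ]; do
    find "$root" -name freezer.state 2>/dev/null | while IFS= read -r f; do
      # A cgroup can disappear between find and cat, so skip unreadable files.
      state="$(cat "$f" 2>/dev/null)" || continue
      if [ "$state" = "FROZEN" ]; then
        printf '%s %s FROZEN\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$f"
      fi
    done
    i=$((i + 1))
  done
)
```

For example, `watch_freezer /sys/fs/cgroup/freezer/kubepods.slice` runs until interrupted and should stay silent on a healthy node.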

With runc v1.0.0 or earlier, this can leave containers in a permanently frozen state. That state is tricky to debug because nothing is logged. To check whether an environment is affected, one can use:

# zsh syntax
$ grep -R "FROZEN"  /sys/fs/cgroup/freezer/kubepods.slice/**/freezer.state
# If there is no output, there is no issue. If there is output, one or more containers are frozen.
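The recursive `**` glob above is also zsh-specific. A minimal sketch of an equivalent one-shot check that should work in any POSIX shell is shown below. The function name `list_frozen` is hypothetical, and the default path is simply the one used in this report.

```shell
# Sketch only: list_frozen is a hypothetical helper.
# Prints the path of every freezer.state under ROOT whose content is FROZEN.
# Exit status follows grep: 0 if something is frozen, 1 if nothing is.
list_frozen() (
  root="${1:-/sys/fs/cgroup/freezer/kubepods.slice}"
  # grep -l prints matching file names; -x requires the whole line to be FROZEN.
  frozen="$(find "$root" -name freezer.state -exec grep -l -x FROZEN {} + 2>/dev/null)"
  [ -n "$frozen" ] || exit 1
  printf '%s\n' "$frozen"
)
```

Running `list_frozen || echo "no frozen cgroups"` gives an explicit message in the healthy case instead of empty output.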

Anything else we need to know?:

More info in #102676.

Environment:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:

Metadata

Assignees

No one assigned

    Labels

    kind/bug: Categorizes issue or PR as related to a bug.
    priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.
    sig/node: Categorizes an issue or PR as relevant to SIG Node.
    triage/accepted: Indicates an issue or PR is ready to be actively worked on.

    Projects

    Status

    Done
