
Containerd restart failed with message "failed to recover state: failed to reserve container name xxx: name xxx is reserved for xxx" #7247

@payall4u

Description

We run kubelet with containerd. Restarting containerd failed because the CRI plugin found two containers with the same name.

Aug 02 22:56:05 VM-0-29-centos containerd[36948]: time="2022-08-02T22:56:05.989055410+08:00" level=fatal msg="Failed to run CRI service" error="failed to recover state: failed to reserve container name "kube-proxy_kube-proxy-m28fw_kube-system_0df69e5f-4355-4f99-bfd0-d1c6b2f935aa_0": name "kube-proxy_kube-proxy-m28fw_kube-system_0df69e5f-4355-4f99-bfd0-d1c6b2f935aa_0" is reserved for "73cc6d80cd6602e5ff53fd62db85cf09ecc8fe12b9effe753c404bf45750842a""
Aug 02 22:56:05 VM-0-29-centos systemd[1]: containerd.service: Main process exited, code=exited, status=1/FAILURE

It's easy to reproduce, as long as you fill up the disk.

When containerd restarts, the CRI plugin loads all containers, and any container that fails to load is skipped. Kubelet then creates a new container/sandbox with the same name and restartCount, because kubelet derives restartCount from the annotations of the existing containers, and the skipped container is not among them. If the previously skipped container loads successfully on the next restart, CRI finds two containers with the same name and exits with a fatal error.
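The sequence described above can be sketched as a small, self-contained simulation. This is not containerd's actual code: the `nameRegistry`, `container`, and `recoverState` names are hypothetical stand-ins, and the `loadable` callback is an assumption used to model "loading fails while the disk is full".

```go
package main

import (
	"fmt"
	"sync"
)

// nameRegistry is a hypothetical, simplified stand-in for a table that
// maps each container name to the single container ID allowed to hold it.
type nameRegistry struct {
	mu    sync.Mutex
	owner map[string]string // name -> container ID
}

func newNameRegistry() *nameRegistry {
	return &nameRegistry{owner: make(map[string]string)}
}

// Reserve binds name to id. It fails if the name is already held by a
// different ID, which is the condition reported in the fatal log line.
func (r *nameRegistry) Reserve(name, id string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if held, ok := r.owner[name]; ok && held != id {
		return fmt.Errorf("name %q is reserved for %q", name, held)
	}
	r.owner[name] = id
	return nil
}

// container holds only the fields needed for this illustration.
type container struct {
	ID, Name string
}

// recoverState models recovery on restart: containers for which loadable
// returns false are skipped, and every loaded container reserves its name
// in a fresh registry. The first conflict aborts recovery.
func recoverState(stored []container, loadable func(container) bool) error {
	reg := newNameRegistry()
	for _, c := range stored {
		if !loadable(c) {
			continue // assumed behaviour: a failed load is skipped, not fatal
		}
		if err := reg.Reserve(c.Name, c.ID); err != nil {
			return fmt.Errorf("failed to recover state: failed to reserve container name %q: %w", c.Name, err)
		}
	}
	return nil
}

func main() {
	const name = "kube-proxy_example_0" // illustrative name, not from the logs
	old := container{ID: "old-id", Name: name}
	stored := []container{old}

	// Restart 1: the disk is full, so the old container cannot be loaded.
	diskFull := func(c container) bool { return c.ID != old.ID }
	fmt.Println("restart 1:", recoverState(stored, diskFull))

	// Kubelet does not see the old container and creates a replacement
	// with the same name and restart count.
	stored = append(stored, container{ID: "new-id", Name: name})

	// Restart 2: space was freed, both containers load, the names collide.
	allLoad := func(container) bool { return true }
	fmt.Println("restart 2:", recoverState(stored, allLoad))
}
```

In this sketch the first recovery succeeds because the unloadable container never reserves its name, while the second recovery returns an error because both the old and the replacement container claim the same name. Whether a duplicate name should abort the whole recovery or only skip the conflicting container is a design choice that the sketch does not settle.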

Steps to reproduce the issue

  1. Find an ordinary node that uses containerd as its runtime
  2. Fill the disk with a command such as dd if=/dev/zero of=file bs=1M count=1024
  3. Restart containerd; it logs a message like "Failed to load container xxx error=failed to checkpoint status to xxx/.tmp-status106398678: no space left on device"
  4. Kubelet creates new containers with the same names
  5. rm file
  6. systemctl restart containerd

Describe the results you received and expected

Noop

What version of containerd are you using?

1.4.3

Any other relevant information

Not applicable

Show configuration if it is related to CRI plugin.

Not applicable
