New containerd process tries to connect to a non-existing socket after old containerd and shim processes got killed. #7394

@syurevich

Description

If the containerd and containerd-shim-runc-v2 processes are killed, the new containerd process keeps trying to connect to a socket that does not exist.
containerd cannot recover from this state: it does not reach the active (running) state, at least not within a reasonable time.

Steps to reproduce the issue

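  1. Run pkill containerd (the pattern also matches the containerd-shim-runc-v2 processes, so both are killed).
  2. Observe that the new containerd process keeps trying to connect to a non-existent socket.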

Describe the results you received and expected

After the containerd and containerd-shim-runc-v2 processes are killed, the new containerd process keeps trying to connect to a socket that does not exist. This is visible in the strace output:

connect(7<socket:[116055]>, {sa_family=AF_UNIX, sun_path="/run/containerd/s/9c8032c19e0e87011d31b278b24a2529a0ff01efde9f6fbbf5b9c1a9cdd709ec"}, 85) = -1 ENOENT (No such file or directory)

We expect the new containerd process to clean up shims that were disconnected while containerd was down.

What version of containerd are you using?

1.5.11

Any other relevant information

No response

Show configuration if it is related to CRI plugin.

No response
