containerd 1.6.4 breaks weave on Kubernetes #6921

@jpetazzo

Description

Hi!

I noticed the following problem on my new Kubernetes clusters: all pods (except the ones using hostNetwork) remain in ContainerCreating status, and events (as shown by e.g. kubectl describe pod) indicate:

  Warning  FailedCreatePodSandBox  4m24s (x83 over 22m)  kubelet            (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "9db599b21b2e13e75342e5707d1617f29c7a52ad09770bc9a2c38b1ecc35896a": failed to find network info for sandbox "9db599b21b2e13e75342e5707d1617f29c7a52ad09770bc9a2c38b1ecc35896a"

The kubelet logs (as shown by journalctl -u kubelet -f) show the same message (nothing more), and the containerd logs (as shown by journalctl -u containerd -f) show:

May 10 16:18:47 node4 containerd[10270]: time="2022-05-10T16:18:47.113889212Z" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:metrics-server-765bc4bc75-lqrls,Uid:71af0496-cd6b-430c-8e97-d7dc5b5159d9,Namespace:kube-system,Attempt:0,}"
May 10 16:18:47 node4 containerd[10270]: time="2022-05-10T16:18:47.516615215Z" level=error msg="RunPodSandbox for &PodSandboxMetadata{Name:metrics-server-765bc4bc75-lqrls,Uid:71af0496-cd6b-430c-8e97-d7dc5b5159d9,Namespace:kube-system,Attempt:0,} failed, error" error="failed to setup network for sandbox \"38c5bae03e8a1d51bff5e47aa57f2102c0e0db1dc015440dad832f7f2fee9b25\": failed to find network info for sandbox \"38c5bae03e8a1d51bff5e47aa57f2102c0e0db1dc015440dad832f7f2fee9b25\""

I looked at what had changed between my "new" Kubernetes clusters and the "old" ones (which were deployed just last week, using the same deployment mechanism, and which worked fine 🤔), and it seems to be containerd. To give a bit more context: unless instructed otherwise, my deployment scripts spawn Ubuntu 18.04 VMs, then install the latest Kubernetes packages from the official Kubernetes deb repositories (http://apt.kubernetes.io/) and the latest docker-ce package from the Docker repository (https://download.docker.com/linux/ubuntu), which in turn seems to pull in the latest containerd.io. According to https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/, it looks like containerd.io 1.6.4-1 was released May 5th, just 5 days ago.

To cross-check the issue, I've:

  • reproduced the issue with various versions of Kubernetes (1.24 down to 1.19)
  • fixed the issue by downgrading containerd.io to 1.5.11-1
  • fixed the issue by replacing Weave with Calico

So something seems to be off with containerd 1.6, although it could also be something odd with Weave. However, according to https://github.com/weaveworks/weave/releases, Weave hasn't changed since early 2021.

Is there a way to get containerd to tell me more about what it's doing, or what it's expecting?
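One thing that may surface more detail is raising containerd's log level to debug through the `[debug]` section of its config file. The snippet below is only a sketch: `write_debug_config` is a hypothetical helper name, and it assumes a config format version 2 file with no other settings, so the CRI plugin stays enabled because nothing is listed under `disabled_plugins`.

```shell
#!/bin/sh
# Hypothetical helper (not part of containerd): writes a minimal containerd
# config that only raises the log level. Nothing is listed under
# disabled_plugins, so the CRI plugin remains enabled.
write_debug_config() {
  # $1: path of the config file to write
  cat > "$1" <<'EOF'
version = 2

[debug]
  level = "debug"
EOF
}
```

A possible way to use it: run `write_debug_config /tmp/config.toml`, copy the result to /etc/containerd/config.toml, restart containerd with `sudo systemctl restart containerd`, and then follow `journalctl -u containerd -f` while a pod sandbox is being created.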

Steps to reproduce the issue

I think the following should work (but my deployment scripts are a little bit hairy, to account for various cloud provider discrepancies 😬):

  1. Obtain an Ubuntu 18.04 machine
  2. Install docker-ce
  3. Wipe out /etc/containerd/config.toml to re-enable the CRI API (which is disabled by default)
  4. Install the Kubernetes packages
  5. Run kubeadm init with the config file below
  6. Run kubectl apply -f https://cloud.weave.works/k8s/net
  7. Observe with kubectl get pods --all-namespaces that pods (e.g. coredns pods) fail to start

kubeadm config file:

kind: InitConfiguration
apiVersion: kubeadm.k8s.io/v1beta2
nodeRegistration:
  criSocket: /run/containerd/containerd.sock
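The steps above can be sketched roughly as the script below. This is not my actual deployment script, and several details are assumptions: the Docker and Kubernetes apt repositories are already configured, "Kubernetes packages" means kubelet, kubeadm and kubectl, "wipe out" means truncating the file, containerd needs a restart afterwards, the kubeadm config above is saved as `kubeadm.yaml` (a hypothetical file name), and kubectl already has access to the admin kubeconfig. The `run` wrapper is also hypothetical: it only prints each command unless `RUN=1` is set.

```shell
#!/bin/sh
# Dry-run wrapper: prints the command unless RUN=1 is set in the environment.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

reproduce() {
  # Step 2: install Docker, which pulls in containerd.io
  run sudo apt-get install -y docker-ce
  # Step 3: empty the packaged config so the CRI plugin is no longer disabled
  run sudo truncate -s 0 /etc/containerd/config.toml
  run sudo systemctl restart containerd
  # Step 4: Kubernetes packages (package names are an assumption)
  run sudo apt-get install -y kubelet kubeadm kubectl
  # Step 5: kubeadm.yaml is a hypothetical name for the config file above
  run sudo kubeadm init --config kubeadm.yaml
  # Step 6: install Weave
  run kubectl apply -f https://cloud.weave.works/k8s/net
  # Step 7: pods such as coredns should show up stuck in ContainerCreating
  run kubectl get pods --all-namespaces
}
```

Calling `reproduce` prints the commands for review; `RUN=1 reproduce` would execute them on a fresh Ubuntu 18.04 machine.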

Describe the results you received and expected

Expected results: pods start

Observed results: pods fail to start, stay stuck in ContainerCreating, with errors mentioning failure to create pod sandbox

What version of containerd are you using?

containerd containerd.io 1.6.4 212e8b6

Any other relevant information

$ runc --version
runc version 1.1.1
commit: v1.1.1-0-g52de29d
spec: 1.0.2-dev
go: go1.17.9
libseccomp: 2.5.1
$ sudo crictl info
{
  "status": {
    "conditions": [
      {
        "type": "RuntimeReady",
        "status": true,
        "reason": "",
        "message": ""
      },
      {
        "type": "NetworkReady",
        "status": true,
        "reason": "",
        "message": ""
      }
    ]
  },
  "cniconfig": {
    "PluginDirs": [
      "/opt/cni/bin"
    ],
    "PluginConfDir": "/etc/cni/net.d",
    "PluginMaxConfNum": 1,
    "Prefix": "eth",
    "Networks": [
      {
        "Config": {
          "Name": "cni-loopback",
          "CNIVersion": "0.3.1",
          "Plugins": [
            {
              "Network": {
                "type": "loopback",
                "ipam": {},
                "dns": {}
              },
              "Source": "{\"type\":\"loopback\"}"
            }
          ],
          "Source": "{\n\"cniVersion\": \"0.3.1\",\n\"name\": \"cni-loopback\",\n\"plugins\": [{\n  \"type\": \"loopback\"\n}]\n}"
        },
        "IFName": "lo"
      },
      {
        "Config": {
          "Name": "weave",
          "CNIVersion": "0.3.0",
          "Plugins": [
            {
              "Network": {
                "name": "weave",
                "type": "weave-net",
                "ipam": {},
                "dns": {}
              },
              "Source": "{\"hairpinMode\":true,\"name\":\"weave\",\"type\":\"weave-net\"}"
            },
            {
              "Network": {
                "type": "portmap",
                "capabilities": {
                  "portMappings": true
                },
                "ipam": {},
                "dns": {}
              },
              "Source": "{\"capabilities\":{\"portMappings\":true},\"snat\":true,\"type\":\"portmap\"}"
            }
          ],
          "Source": "{\n    \"cniVersion\": \"0.3.0\",\n    \"name\": \"weave\",\n    \"plugins\": [\n        {\n            \"name\": \"weave\",\n            \"type\": \"weave-net\",\n            \"hairpinMode\": true\n        },\n        {\n            \"type\": \"portmap\",\n            \"capabilities\": {\"portMappings\": true},\n            \"snat\": true\n        }\n    ]\n}\n"
        },
        "IFName": "eth0"
      }
    ]
  },
  "config": {
    "containerd": {
      "snapshotter": "overlayfs",
      "defaultRuntimeName": "runc",
      "defaultRuntime": {
        "runtimeType": "",
        "runtimePath": "",
        "runtimeEngine": "",
        "PodAnnotations": null,
        "ContainerAnnotations": null,
        "runtimeRoot": "",
        "options": null,
        "privileged_without_host_devices": false,
        "baseRuntimeSpec": "",
        "cniConfDir": "",
        "cniMaxConfNum": 0
      },
      "untrustedWorkloadRuntime": {
        "runtimeType": "",
        "runtimePath": "",
        "runtimeEngine": "",
        "PodAnnotations": null,
        "ContainerAnnotations": null,
        "runtimeRoot": "",
        "options": null,
        "privileged_without_host_devices": false,
        "baseRuntimeSpec": "",
        "cniConfDir": "",
        "cniMaxConfNum": 0
      },
      "runtimes": {
        "runc": {
          "runtimeType": "io.containerd.runc.v2",
          "runtimePath": "",
          "runtimeEngine": "",
          "PodAnnotations": null,
          "ContainerAnnotations": null,
          "runtimeRoot": "",
          "options": {
            "BinaryName": "",
            "CriuImagePath": "",
            "CriuPath": "",
            "CriuWorkPath": "",
            "IoGid": 0,
            "IoUid": 0,
            "NoNewKeyring": false,
            "NoPivotRoot": false,
            "Root": "",
            "ShimCgroup": "",
            "SystemdCgroup": false
          },
          "privileged_without_host_devices": false,
          "baseRuntimeSpec": "",
          "cniConfDir": "",
          "cniMaxConfNum": 0
        }
      },
      "noPivot": false,
      "disableSnapshotAnnotations": true,
      "discardUnpackedLayers": false,
      "ignoreRdtNotEnabledErrors": false
    },
    "cni": {
      "binDir": "/opt/cni/bin",
      "confDir": "/etc/cni/net.d",
      "maxConfNum": 1,
      "confTemplate": "",
      "ipPref": ""
    },
    "registry": {
      "configPath": "",
      "mirrors": null,
      "configs": null,
      "auths": null,
      "headers": null
    },
    "imageDecryption": {
      "keyModel": "node"
    },
    "disableTCPService": true,
    "streamServerAddress": "127.0.0.1",
    "streamServerPort": "0",
    "streamIdleTimeout": "4h0m0s",
    "enableSelinux": false,
    "selinuxCategoryRange": 1024,
    "sandboxImage": "k8s.gcr.io/pause:3.6",
    "statsCollectPeriod": 10,
    "systemdCgroup": false,
    "enableTLSStreaming": false,
    "x509KeyPairStreaming": {
      "tlsCertFile": "",
      "tlsKeyFile": ""
    },
    "maxContainerLogSize": 16384,
    "disableCgroup": false,
    "disableApparmor": false,
    "restrictOOMScoreAdj": false,
    "maxConcurrentDownloads": 3,
    "disableProcMount": false,
    "unsetSeccompProfile": "",
    "tolerateMissingHugetlbController": true,
    "disableHugetlbController": true,
    "device_ownership_from_security_context": false,
    "ignoreImageDefinedVolumes": false,
    "netnsMountsUnderStateDir": false,
    "enableUnprivilegedPorts": false,
    "enableUnprivilegedICMP": false,
    "containerdRootDir": "/var/lib/containerd",
    "containerdEndpoint": "/run/containerd/containerd.sock",
    "rootDir": "/var/lib/containerd/io.containerd.grpc.v1.cri",
    "stateDir": "/run/containerd/io.containerd.grpc.v1.cri"
  },
  "golang": "go1.17.9",
  "lastCNILoadStatus": "OK",
  "lastCNILoadStatus.default": "OK"
}
$ uname -a
Linux node1 4.15.0-176-generic #185-Ubuntu SMP Tue Mar 29 17:40:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Also reproduced on:

Linux node1 5.4.0-96-generic #109~18.04.1-Ubuntu SMP Thu Jan 13 15:06:26 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Show configuration if it is related to CRI plugin.

Configuration is empty.
