Description
Hey All,
gVisor project member here.
We're seeing issues on containerd 2.0 with our existing shim. The main method for our shim is here.
I observed the issue on a k8s cluster on GKE. `kubectl delete` operations hang indefinitely because containerd fails to delete the pause container. `kubectl delete --force` will delete the pod from k8s, but on the node the pause container (via a runsc sandbox) and the shim stay around. I'll post the debug logs here in a subsequent post.
Our shim does not implement the `Manager` interface, in case that is the problem.
Note that our shim is currently linked against containerd v1.6.36.
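For context on the version skew, the pin in our shim's `go.mod` looks roughly like this (illustrative fragment, not the exact file):

```
require github.com/containerd/containerd v1.6.36
```

Note that the 1.6 line uses the `github.com/containerd/containerd` module path, while containerd 2.0 is the `github.com/containerd/containerd/v2` module, so moving the shim forward is a module-path change, not just a version bump.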
Steps to reproduce the issue
- Start a cluster with containerd 2.0.
- Add gVisor as a runtime class (follow the instructions here).
- Add a RuntimeClass object to the cluster.
- Update the containerd `config.toml` file as shown below.
- Run a hello-world pod on the cluster (`kubectl apply -f hello.yaml`).
- Delete the pod (`kubectl delete pod/hello`).
Describe the results you received and expected
Expected results: `kubectl delete` deletes the pod.
Actual results: the delete command hangs indefinitely. `kubectl delete pod/hello --force` will terminate, and the pod will no longer be registered in the cluster.
On the node, the runsc instance and the runsc shim remain in both cases (`ps -ef | grep runsc`).
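For anyone else reproducing this, here is a small node-side check for the leftover processes (my own helper, not part of gVisor or containerd; the `[r]unsc` bracket pattern just keeps grep from matching itself):

```shell
#!/bin/sh
# Check a node for leftover runsc sandbox/shim processes after pod deletion.
# A healthy node (after the pod is gone) should report none.
leftover="$(ps -ef | grep '[r]unsc' || true)"
if [ -z "$leftover" ]; then
  echo "no leftover runsc processes"
else
  echo "leftover runsc processes:"
  echo "$leftover"
fi
```

On an affected node this keeps listing both the sandbox process and the shim even after `kubectl delete --force` returns.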
What version of containerd are you using?
containerd github.com/containerd/containerd/v2 2.0.0 207ad71
Any other relevant information
Simple hello-world pod that I used to debug:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello
spec:
  runtimeClassName: gvisor
  restartPolicy: Never
  containers:
  - name: hello
    image: alpine
    args: ["echo", "hello"]
  - name: pause
    image: registry.k8s.io/pause:latest
```
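The repro steps can be scripted end to end. This is my own convenience wrapper (file name is arbitrary); it writes the manifest above to disk and only invokes `kubectl` when it is actually on the PATH:

```shell
#!/bin/sh
# Materialize the hello-world manifest from the report, then reproduce:
# apply the pod, then issue the delete that hangs on containerd 2.0 + runsc.
cat > hello.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hello
spec:
  runtimeClassName: gvisor
  restartPolicy: Never
  containers:
  - name: hello
    image: alpine
    args: ["echo", "hello"]
  - name: pause
    image: registry.k8s.io/pause:latest
EOF

if command -v kubectl >/dev/null 2>&1; then
  kubectl apply -f hello.yaml || true
  kubectl delete pod/hello || true  # this is the command that hangs on an affected cluster
else
  echo "kubectl not found; run the commands above against a containerd 2.0 cluster"
fi
```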
Show configuration if it is related to CRI plugin.
```toml
version = 2
required_plugins = ["io.containerd.grpc.v1.cri"]
# Kubernetes doesn't use containerd restart manager.
disabled_plugins = ["io.containerd.internal.v1.restart"]
oom_score = -999

[debug]
  level = "info"

[grpc]
  gid = 412

[plugins."io.containerd.grpc.v1.cri"]
  stream_server_address = "127.0.0.1"
  max_container_log_line_size = 262144
  sandbox_image = "us-central1-artifactregistry.gcr.io/gke-release/gke-release/pause:3.8@sha256:880e63f94b145e46f1b1082bb71b85e21f16b99b180b9996407d61240ceb9830"
  image_pull_progress_timeout = "5m"

[plugins."io.containerd.grpc.v1.cri".cni]
  bin_dir = "/home/kubernetes/bin"
  conf_dir = "/etc/cni/net.d"
  conf_template = "/home/containerd/cni.template"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true

[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
  endpoint = ["https://mirror.gcr.io", "https://registry-1.docker.io"]

[metrics]
  address = "127.0.0.1:1338"

[plugins."io.containerd.grpc.v1.cri".containerd]
  default_runtime_name = "runc"
  discard_unpacked_layers = true

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.gvisor]
  runtime_type = "io.containerd.runsc.v1"
  pod_annotations = ["dev.gvisor.*"]

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.gvisor.options]
  TypeUrl = "io.containerd.runsc.v1.options"
  ConfigPath = "/run/containerd/runsc/config.toml"

[plugins."io.containerd.internal.v1.opt"]
  path = "/home/containerd/opt/containerd"
```