Fix shim deadlock #3529
keloyang wants to merge 1 commit into containerd:release/1.2 from keloyang:sigchld-lost-1.2
shim.Reap and shim.Default.Wait may deadlock; use Monitor.Notify to fix this issue. Signed-off-by: Shukui Yang <[email protected]>
I think this is a very serious problem, possibly a CVE: it can cause a k8s node to become NotReady.
How to make a k8s node NotReady: it may take a few attempts at the following two steps before the problem appears.
If you find that kubectl exec fork ... hangs, check the docker container too.
After the docker container hangs, wait and then check the kubelet log. kubelet will log something like this:
We can then watch the state of the k8s cluster with the following script. The centos2 node becomes NotReady every 3 minutes, for about 1 minute each time.
For Kubernetes + containerd, this shouldn't cause kubelet to become not ready, but it is still good to fix.
@Random-Liu As you mentioned, this happens with kubelet + docker (dockershim) and not with kubelet + containerd (CRI). To be more precise, pods on both the docker and containerd runtimes hang, but docker nodes flap between Ready and NotReady due to
shim.Reap and shim.Default.Wait may deadlock, and #2748 doesn't fix this. Putting those two pieces of code together makes it easier to see the problem.
If the first piece of code reaches line 44 while the second happens to reach line 85, the two will deadlock.
The following shows how to reproduce it.
Finally, execute the shell script, and you will see the hang.
The other branches may have this problem too.
Signed-off-by: Shukui Yang [email protected]