cri: close leak container's io when restart containerd #11389
ningmingxiao wants to merge 1 commit into containerd:main
Conversation
Hi @ningmingxiao. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with the appropriate command. Once the patch is verified, the new status will be reflected by the label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
ningmingxiao force-pushed the branch: 5d73e8e to ccab1c4, 76b89f0 to 31244a1, 31244a1 to 46464c2, 46464c2 to fba1cb6, fba1cb6 to d7dc513
Can this PR be merged? @fuweid
The rule for merging is to have at least two approvals. :) Please wait for the reviewers. Thanks.
cc @mikebrow |
We need to handle rollback of the reservation first.
ningmingxiao force-pushed the branch: d7dc513 to d6fda53, d6fda53 to 89549c4, dbe79a8 to 27a6956
Two questions:
No, we don't. Old, exited containers may still exist, and they are valid.
If I understand correctly, you're saying that ListContainers can return all containers, including those with the same name. Yes, that's correct. With my suggestion, ListContainers will not return leaky containers that have duplicate names, and then we can't close their IO. I don't know how to reproduce containers with the same attempt in the k8s case.
/hold
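The concern above — that duplicate-named containers left behind after a restart are "leaked" and their IO needs to be found and closed — can be sketched roughly as follows. This is a simplified, hypothetical illustration: the `Container` type, the `splitLeaked` helper, and the keep-first policy are all assumptions for the sake of the example, not containerd's actual CRI code.

```go
package main

import "fmt"

// Container is a hypothetical, simplified stand-in for CRI container
// metadata (not containerd's real type).
type Container struct {
	ID      string
	Name    string // pod-scoped container name
	Attempt uint32 // kubelet restart attempt counter
}

// splitLeaked groups containers by (Name, Attempt). When more than one
// container claims the same key, all but one are leaked duplicates whose
// IO should be closed on restart. Which duplicate to keep is a policy
// decision; here we arbitrarily keep the first one seen.
func splitLeaked(all []Container) (kept, leaked []Container) {
	seen := map[string]bool{}
	for _, c := range all {
		key := fmt.Sprintf("%s/%d", c.Name, c.Attempt)
		if seen[key] {
			leaked = append(leaked, c)
			continue
		}
		seen[key] = true
		kept = append(kept, c)
	}
	return kept, leaked
}

func main() {
	all := []Container{
		{ID: "a", Name: "app_pod_ns", Attempt: 0},
		{ID: "b", Name: "app_pod_ns", Attempt: 0}, // duplicate name+attempt
		{ID: "c", Name: "app_pod_ns", Attempt: 1},
	}
	kept, leaked := splitLeaked(all)
	fmt.Printf("kept=%d leaked=%d\n", len(kept), len(leaked))
}
```

The open question in the thread is exactly the hard part this sketch glosses over: how two containers with the same attempt come to exist at all in the k8s case.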
Signed-off-by: ningmingxiao <[email protected]>
ningmingxiao force-pushed the branch: 27a6956 to 5145f88
I see many "failed to reserve container name" log entries after containerd booted successfully. The panic happened (20:34:09) after containerd was restarted (20:03:44). The reason for the panic is that containerd didn't receive the delete request, so many containers are waiting to be started. `lsof -p 10124` shows stdout and stderr are held by containerd.
Is it caused by the official kubelet release? I mean that it's not your internal version.
We use an internal k8s maintained by another team: kubernetes/kubernetes#130331
PR needs rebase.
This PR is stale because it has been open 90 days with no activity. This PR will be closed in 7 days unless new comments are made or the stale label is removed. |
This PR was closed because it has been stalled for 7 days with no activity. |
Can this be reopened? |
If the system is busy and the shim is busy, container status becomes unknown, and k8s will create many containers in one pod, which causes containerd to panic when containerd is restarted (we raised the default of 10000 via debug.SetMaxThreads(20000) and it still panicked).
[[email protected]@LIN-FB738BFD367 op-containers-containerd]$ cat 1.log |grep openFifo|wc -l
819
Many openFifo panics.
This may fix #9113 and #10515, but I can't find a good way to understand how the "... is reserved for ..." error happened.
The panic happened (20:34:09) after containerd was restarted (20:03:44).