Add check for CNI plugins before tearing down pod network #10744
Signed-off-by: Sameer <[email protected]>
Hi @sameersaeed. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/ok-to-test
LGTM
/test pull-containerd-k8s-e2e-ec2
Additionally, the following needs to be fixed in a follow-up PR, as the deferred teardown will also fail and "block" the netns removal: https://github.com/containerd/containerd/blob/release/1.6/pkg/cri/server/sandbox_run.go#L312. @fuweid FYI: we probably should not defer teardown before the network bring-up, or, if we do, we should ignore a teardown failure when setup fails. Same for 1.6/7.
Currently, locking behavior occurs if a user creates a sandbox pod and then tries to stop or remove it without having any CNI plugins initialized on their system.
Sample flow
pod-config.json, per the crictl docs:
Running sandbox pod - receive CNI plugin error:
Pod shows up as NotReady:
Locking behavior - cannot stop / remove pod due to CNI plugin error:
Changes made
Current code:
containerd/internal/cri/server/sandbox_stop.go
Lines 114 to 116 in db97449
Added a minor change to the above sandbox_stop.go code so that the pod network teardown is only performed if CNIResult holds a valid value:
This avoids the locking behavior, so the pods can be stopped and/or removed as expected: