*: properly shutdown non-groupable shims to prevent resource leaks#11916
*: properly shutdown non-groupable shims to prevent resource leaks#11916fuweid merged 1 commit intocontainerd:mainfrom
Conversation
|
ping @smira if possible, would you please verify this patch? thanks updated: tested with gvisor in my local. it can fix that issue. |
Previously, to address issue containerd#11708, PR containerd#11793 changed containerd to always invoke the shim binary to establish shim connections, rather than reusing the sandbox shim. However, this change did not ensure that the Shutdown API was called to stop the shim process. Starting with containerd v2.0.0, the Shutdown API is only invoked for sandbox containers (when container.SandboxID is empty). This approach works for groupable shims, where multiple containers share a single socket address and only require a single Shutdown call. However, for non-groupable shims, each container requires its own Shutdown call during cleanup to avoid leaking shim processes. Additionally, PR containerd#11793 introduced a corner case during upgrades: - T1: An old container-shim-runc-v2 (<=v1.7.X) is running for pod A. - T2: containerd is upgraded to v2.X.Y. - T3: A new container A-C1 is created in pod A using the new shim-runc-v2 binary. - T4: bootstrap.json indicates version:3 protocol, but it is downgraded to version:2 in memory. - T5: containerd is restarted. - T6: containerd fails to connect to A-C1. - T7: The A-C1 container is left in EXITED status in the CRI plugin. To address this, ensure that loadShimTask downgrades to version:2 if necessary, and always invoke the Shutdown API for each non-groupable shim during cleanup to prevent resource leaks and handle upgrade scenarios correctly. (Introduced by containerd#11793) Signed-off-by: Wei Fu <[email protected]>
|
/retest |
I verified by adding https://patch-diff.githubusercontent.com/raw/containerd/containerd/pull/11916.patch and removing my previous hacky patch https://github.com/siderolabs/pkgs/blob/main/containerd/patches/11741-not-set-sandbox-id-when-use-podsandbox-type.patch - all seems to be good in the tests, thank you! |
|
@smira thanks for the update! |
| removeTask(ctx, s.ID()) | ||
| } | ||
|
|
||
| const supportSandboxAPIVersion = 3 |
There was a problem hiding this comment.
nit: should we move this to a package-level const and with comments about if/ when to bump up it? (also do we already have an existing const for the current API version?)
There was a problem hiding this comment.
it is fixed value. I think we only need to change sandbox value for old shim. I Will add comment in follow up. Thanks for the suggestion.
|
/cherry-pick release/2.1 |
|
@fuweid: new pull request created: #11971 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Drop the hacky patch which was a workaround for gvisor issues. I tested the fix in 2.1.2: containerd/containerd#11916 (comment) Signed-off-by: Andrey Smirnov <[email protected]>
Drop the hacky patch which was a workaround for gvisor issues. I tested the fix in 2.1.2: containerd/containerd#11916 (comment) Signed-off-by: Andrey Smirnov <[email protected]>
Previously, to address issue #11708, PR #11793 changed containerd to always
invoke the shim binary to establish shim connections, rather than reusing the
sandbox shim. However, this change did not ensure that the Shutdown API was
called to stop the shim process.
Starting with containerd v2.0.0, the Shutdown API is only invoked for sandbox
containers (when container.SandboxID is empty). This approach works for
groupable shims, where multiple containers share a single socket address and
only require a single Shutdown call. However, for non-groupable shims, each
container requires its own Shutdown call during cleanup to avoid leaking shim
processes.
Additionally, PR #11793 introduced a corner case during upgrades:
To address this, ensure that loadShimTask downgrades to version:2 if necessary,
and always invoke the Shutdown API for each non-groupable shim during cleanup to
prevent resource leaks and handle upgrade scenarios correctly.
(Introduced by #11793)
Signed-off-by: Wei Fu [email protected]
Fixes: #11871