[carry-11761] core/runtime: should invoke shim binary if it doesn't support Sandbox API#11793
[carry-11761] core/runtime: should invoke shim binary if it doesn't support Sandbox API#11793fuweid merged 3 commits intocontainerd:mainfrom
Conversation
Signed-off-by: Wei Fu <[email protected]> Co-authored-by: yang yang <[email protected]>
This reverts commit 4b4e6f7. Signed-off-by: Wei Fu <[email protected]>
| params = shimbinary.BootstrapParams{ | ||
| Version: int(opts.Version), | ||
| Protocol: protocol, | ||
| Address: address, |
There was a problem hiding this comment.
I wonder if we should introduce a concept of "supported features" on the shim side. The bootstrap parameters could return a list of supported features as strings (e.g., ["sandboxes", "streaming_io", ...]). This would allow containerd to explicitly check for feature support, rather than relying on assumptions.
There was a problem hiding this comment.
it sounds good to me. we can re-design this part. I think it's more easy to decide which client version we should use. but right now, version number is good enough to fix this compatible issue.
Signed-off-by: yang yang <[email protected]> Signed-off-by: Wei Fu <[email protected]>
| // It's rolled out together with the sandbox API and can be used | ||
| // to determine whether we should invoke the shim binary. | ||
| const supportSandboxAPIVersion = 3 | ||
| if params.Version < supportSandboxAPIVersion { |
There was a problem hiding this comment.
It seems that this should not be here, only when params is valid should it be judged
There was a problem hiding this comment.
If that param is empty, there are several cases we should invoke shim:
- sandbox(pause) container (there is no sandbox ID yet)
- created container without SandboxID field because it could be created by v1.7.
- with SandboxID, but the sandbox is not created by podsandbox plugin
- ...
So if param is empty, we should invoke shim.
| return newPodTCtxWithRuntimeHandler(t, rSvc, name, ns, "", opts...) | ||
| } | ||
|
|
||
| func newPodTCtxWithRuntimeHandler(t *testing.T, rSvc cri.RuntimeService, |
There was a problem hiding this comment.
If there is a structure similar to the one below, I think this part of the code is easier to read and maintain.
type ctrdConfig struct {
t *testing.T
workdir string
version int
}
type runtimeConfig struct {
name string
runtimeType string
runtimePath string
}
There was a problem hiding this comment.
This new method is to allow us to run pod with different runtime. I think it's not related to runtime config you mentioned. Yeah, I think we should use function options here. I just don't get that why runtime handler is not that part of pod config. We can improve that in follow up
|
hm... there should have to be manually cherry picked to 2.0? Have learned a lot, and before seems that there were too many hard coded parts, and some of them were difficult to understand |
|
@yylt what part is hard code, would you please share that code? Or reference? Thanks |
i mean the #11761 which there are too many hard coded |
When updating for containerd 2.1, I assumed the patch is no longer needed, as it was included into containerd, but it turns out it got reverted upstream: containerd/containerd#11793 So bring back the patch in updated form, as otherwise `gvisor` leaves running processes behind on container stop. Signed-off-by: Andrey Smirnov <[email protected]>
Previously, to address issue containerd#11708, PR containerd#11793 changed containerd to always invoke the shim binary to establish shim connections, rather than reusing the sandbox shim. However, this change did not ensure that the Shutdown API was called to stop the shim process. Starting with containerd v2.0.0, the Shutdown API is only invoked for sandbox containers (when container.SandboxID is empty). This approach works for groupable shims, where multiple containers share a single socket address and only require a single Shutdown call. However, for non-groupable shims, each container requires its own Shutdown call during cleanup to avoid leaking shim processes. Additionally, PR containerd#11793 introduced a corner case during upgrades: - T1: An old container-shim-runc-v2 (<=v1.7.X) is running for pod A. - T2: containerd is upgraded to v2.X.Y. - T3: A new container A-C1 is created in pod A using the new shim-runc-v2 binary. - T4: bootstrap.json indicates version:3 protocol, but it is downgraded to version:2 in memory. - T5: containerd is restarted. - T6: containerd fails to connect to A-C1. - T7: The A-C1 container is left in EXITED status in the CRI plugin. To address this, ensure that loadShimTask downgrades to version:2 if necessary, and always invoke the Shutdown API for each non-groupable shim during cleanup to prevent resource leaks and handle upgrade scenarios correctly. (Introduced by containerd#11793) Signed-off-by: Wei Fu <[email protected]>
Previously, to address issue containerd#11708, PR containerd#11793 changed containerd to always invoke the shim binary to establish shim connections, rather than reusing the sandbox shim. However, this change did not ensure that the Shutdown API was called to stop the shim process. Starting with containerd v2.0.0, the Shutdown API is only invoked for sandbox containers (when container.SandboxID is empty). This approach works for groupable shims, where multiple containers share a single socket address and only require a single Shutdown call. However, for non-groupable shims, each container requires its own Shutdown call during cleanup to avoid leaking shim processes. Additionally, PR containerd#11793 introduced a corner case during upgrades: - T1: An old container-shim-runc-v2 (<=v1.7.X) is running for pod A. - T2: containerd is upgraded to v2.X.Y. - T3: A new container A-C1 is created in pod A using the new shim-runc-v2 binary. - T4: bootstrap.json indicates version:3 protocol, but it is downgraded to version:2 in memory. - T5: containerd is restarted. - T6: containerd fails to connect to A-C1. - T7: The A-C1 container is left in EXITED status in the CRI plugin. To address this, ensure that loadShimTask downgrades to version:2 if necessary, and always invoke the Shutdown API for each non-groupable shim during cleanup to prevent resource leaks and handle upgrade scenarios correctly. (Introduced by containerd#11793) Signed-off-by: Wei Fu <[email protected]>
Previously, to address issue containerd#11708, PR containerd#11793 changed containerd to always invoke the shim binary to establish shim connections, rather than reusing the sandbox shim. However, this change did not ensure that the Shutdown API was called to stop the shim process. Starting with containerd v2.0.0, the Shutdown API is only invoked for sandbox containers (when container.SandboxID is empty). This approach works for groupable shims, where multiple containers share a single socket address and only require a single Shutdown call. However, for non-groupable shims, each container requires its own Shutdown call during cleanup to avoid leaking shim processes. Additionally, PR containerd#11793 introduced a corner case during upgrades: - T1: An old container-shim-runc-v2 (<=v1.7.X) is running for pod A. - T2: containerd is upgraded to v2.X.Y. - T3: A new container A-C1 is created in pod A using the new shim-runc-v2 binary. - T4: bootstrap.json indicates version:3 protocol, but it is downgraded to version:2 in memory. - T5: containerd is restarted. - T6: containerd fails to connect to A-C1. - T7: The A-C1 container is left in EXITED status in the CRI plugin. To address this, ensure that loadShimTask downgrades to version:2 if necessary, and always invoke the Shutdown API for each non-groupable shim during cleanup to prevent resource leaks and handle upgrade scenarios correctly. (Introduced by containerd#11793) Signed-off-by: Wei Fu <[email protected]>
Previously, to address issue containerd#11708, PR containerd#11793 changed containerd to always invoke the shim binary to establish shim connections, rather than reusing the sandbox shim. However, this change did not ensure that the Shutdown API was called to stop the shim process. Starting with containerd v2.0.0, the Shutdown API is only invoked for sandbox containers (when container.SandboxID is empty). This approach works for groupable shims, where multiple containers share a single socket address and only require a single Shutdown call. However, for non-groupable shims, each container requires its own Shutdown call during cleanup to avoid leaking shim processes. Additionally, PR containerd#11793 introduced a corner case during upgrades: - T1: An old container-shim-runc-v2 (<=v1.7.X) is running for pod A. - T2: containerd is upgraded to v2.X.Y. - T3: A new container A-C1 is created in pod A using the new shim-runc-v2 binary. - T4: bootstrap.json indicates version:3 protocol, but it is downgraded to version:2 in memory. - T5: containerd is restarted. - T6: containerd fails to connect to A-C1. - T7: The A-C1 container is left in EXITED status in the CRI plugin. To address this, ensure that loadShimTask downgrades to version:2 if necessary, and always invoke the Shutdown API for each non-groupable shim during cleanup to prevent resource leaks and handle upgrade scenarios correctly. (Introduced by containerd#11793) Signed-off-by: Wei Fu <[email protected]> (cherry picked from commit 1ac97c2) Signed-off-by: Wei Fu <[email protected]>
Previously, to address issue containerd#11708, PR containerd#11793 changed containerd to always invoke the shim binary to establish shim connections, rather than reusing the sandbox shim. However, this change did not ensure that the Shutdown API was called to stop the shim process. Starting with containerd v2.0.0, the Shutdown API is only invoked for sandbox containers (when container.SandboxID is empty). This approach works for groupable shims, where multiple containers share a single socket address and only require a single Shutdown call. However, for non-groupable shims, each container requires its own Shutdown call during cleanup to avoid leaking shim processes. Additionally, PR containerd#11793 introduced a corner case during upgrades: - T1: An old container-shim-runc-v2 (<=v1.7.X) is running for pod A. - T2: containerd is upgraded to v2.X.Y. - T3: A new container A-C1 is created in pod A using the new shim-runc-v2 binary. - T4: bootstrap.json indicates version:3 protocol, but it is downgraded to version:2 in memory. - T5: containerd is restarted. - T6: containerd fails to connect to A-C1. - T7: The A-C1 container is left in EXITED status in the CRI plugin. To address this, ensure that loadShimTask downgrades to version:2 if necessary, and always invoke the Shutdown API for each non-groupable shim during cleanup to prevent resource leaks and handle upgrade scenarios correctly. (Introduced by containerd#11793) Signed-off-by: Wei Fu <[email protected]>
Previously, to address issue containerd#11708, PR containerd#11793 changed containerd to always invoke the shim binary to establish shim connections, rather than reusing the sandbox shim. However, this change did not ensure that the Shutdown API was called to stop the shim process. Starting with containerd v2.0.0, the Shutdown API is only invoked for sandbox containers (when container.SandboxID is empty). This approach works for groupable shims, where multiple containers share a single socket address and only require a single Shutdown call. However, for non-groupable shims, each container requires its own Shutdown call during cleanup to avoid leaking shim processes. Additionally, PR containerd#11793 introduced a corner case during upgrades: - T1: An old container-shim-runc-v2 (<=v1.7.X) is running for pod A. - T2: containerd is upgraded to v2.X.Y. - T3: A new container A-C1 is created in pod A using the new shim-runc-v2 binary. - T4: bootstrap.json indicates version:3 protocol, but it is downgraded to version:2 in memory. - T5: containerd is restarted. - T6: containerd fails to connect to A-C1. - T7: The A-C1 container is left in EXITED status in the CRI plugin. To address this, ensure that loadShimTask downgrades to version:2 if necessary, and always invoke the Shutdown API for each non-groupable shim during cleanup to prevent resource leaks and handle upgrade scenarios correctly. (Introduced by containerd#11793) Signed-off-by: Wei Fu <[email protected]>
Previously, to address issue containerd#11708, PR containerd#11793 changed containerd to always invoke the shim binary to establish shim connections, rather than reusing the sandbox shim. However, this change did not ensure that the Shutdown API was called to stop the shim process. Starting with containerd v2.0.0, the Shutdown API is only invoked for sandbox containers (when container.SandboxID is empty). This approach works for groupable shims, where multiple containers share a single socket address and only require a single Shutdown call. However, for non-groupable shims, each container requires its own Shutdown call during cleanup to avoid leaking shim processes. Additionally, PR containerd#11793 introduced a corner case during upgrades: - T1: An old container-shim-runc-v2 (<=v1.7.X) is running for pod A. - T2: containerd is upgraded to v2.X.Y. - T3: A new container A-C1 is created in pod A using the new shim-runc-v2 binary. - T4: bootstrap.json indicates version:3 protocol, but it is downgraded to version:2 in memory. - T5: containerd is restarted. - T6: containerd fails to connect to A-C1. - T7: The A-C1 container is left in EXITED status in the CRI plugin. To address this, ensure that loadShimTask downgrades to version:2 if necessary, and always invoke the Shutdown API for each non-groupable shim during cleanup to prevent resource leaks and handle upgrade scenarios correctly. (Introduced by containerd#11793) Signed-off-by: Wei Fu <[email protected]>
Previously, to address issue containerd#11708, PR containerd#11793 changed containerd to always invoke the shim binary to establish shim connections, rather than reusing the sandbox shim. However, this change did not ensure that the Shutdown API was called to stop the shim process. Starting with containerd v2.0.0, the Shutdown API is only invoked for sandbox containers (when container.SandboxID is empty). This approach works for groupable shims, where multiple containers share a single socket address and only require a single Shutdown call. However, for non-groupable shims, each container requires its own Shutdown call during cleanup to avoid leaking shim processes. Additionally, PR containerd#11793 introduced a corner case during upgrades: - T1: An old container-shim-runc-v2 (<=v1.7.X) is running for pod A. - T2: containerd is upgraded to v2.X.Y. - T3: A new container A-C1 is created in pod A using the new shim-runc-v2 binary. - T4: bootstrap.json indicates version:3 protocol, but it is downgraded to version:2 in memory. - T5: containerd is restarted. - T6: containerd fails to connect to A-C1. - T7: The A-C1 container is left in EXITED status in the CRI plugin. To address this, ensure that loadShimTask downgrades to version:2 if necessary, and always invoke the Shutdown API for each non-groupable shim during cleanup to prevent resource leaks and handle upgrade scenarios correctly. (Introduced by containerd#11793) Signed-off-by: Wei Fu <[email protected]>
When updating for containerd 2.1, I assumed the patch is no longer needed, as it was included into containerd, but it turns out it got reverted upstream: containerd/containerd#11793 So bring back the patch in updated form, as otherwise `gvisor` leaves running processes behind on container stop. Signed-off-by: Andrey Smirnov <[email protected]>
Since the v2.0.0 release, the shim manager has attempted to use a single ttrpc connection to manage multiple containers through one shim server. However, this approach may introduce compatibility issues with older shims. According to the current documentation, container resources are cleaned up via the
shim deletebinary call, which is triggered as a callback when the connection is closed. To ensure compatibility, containerd should always invoke the shim start binary to establish a new connection—regardless of the shim type—if the shim version is lower than3.This patch enables containerd to invoke the shim start binary call, but doing so reveals another issue:
{ "version": 3, "protocol": "ttrpc", "address": "EXISTING-SOCKET" }.To address this, this patch also introduces a
clientVersionDowngraderon shim, allowing the shim manager to retry with a lower client version for compatibility with older shims.Closes: #11761
Closes: #9137
Closes: #11535
We should revert this change: #11741 from release/2.0
@dmcgowan @mikebrow @mxpv @samuelkarp
cc @yylt @chrishenzie