[release/2.1] Disable event subscriber during task cleanup #12410
Merged
Each sandbox container has its own goroutine. If the handler returns an
error, that goroutine puts the event into the backoff queue, so podsandbox
does not need the event subscriber. With the subscriber enabled, two
goroutines clean up the same sandbox container.
```
>>>> From EventMonitor
time="2025-10-23T19:30:59.626254404Z" level=debug msg="Received containerd event timestamp - 2025-10-23 19:30:59.624494674 +0000 UTC, namespace - \"k8s.io\", topic - \"/tasks/exit\""
time="2025-10-23T19:30:59.626301912Z" level=debug msg="TaskExit event in podsandbox handler container_id:\"22e15114133e4d461ab380654fb76f3e73d3e0323989c422fa17882762979ccf\" id:\"22e15114133e4d461ab380654fb76f3e73d3e0323989c422fa17882762979ccf\" pid:203121 exit_status:137 exited_at:{seconds:1761247859 nanos:624467824}"
>>> If EventMonitor handles task exit well, it will close ttrpc
connection and then waitSandboxExit could encounter ttrpc-closed error
time="2025-10-23T19:30:59.688031150Z" level=error msg="failed to delete task" error="ttrpc: closed" id=22e15114133e4d461ab380654fb76f3e73d3e0323989c422fa17882762979ccf
```
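The per-sandbox goroutine described above can be sketched roughly as follows. This is a minimal illustration, not the containerd implementation: `backoffQueue`, `errDelete`, and the signature of `waitSandboxExit` are assumptions made for this example. It only shows that a single goroutine can both run the cleanup handler and requeue the event on failure, which is why a second subscriber is redundant.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errDelete is a hypothetical cleanup failure used by this sketch.
var errDelete = errors.New("failed to delete task")

// backoffQueue is a hypothetical stand-in for the EventMonitor's backoff queue.
type backoffQueue struct {
	mu     sync.Mutex
	events []string
}

func (q *backoffQueue) enqueue(id string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.events = append(q.events, id)
}

func (q *backoffQueue) len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.events)
}

// waitSandboxExit models the per-sandbox goroutine: it blocks until the
// sandbox task exits, runs the cleanup handler once, and on failure hands
// the event to the backoff queue for a later retry.
func waitSandboxExit(id string, exited <-chan struct{}, handle func(string) error, q *backoffQueue) {
	<-exited
	if err := handle(id); err != nil {
		q.enqueue(id)
	}
}

func main() {
	q := &backoffQueue{}
	exited := make(chan struct{})

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		// The handler fails here, so the event must end up in the queue.
		waitSandboxExit("sandbox-1", exited, func(string) error { return errDelete }, q)
	}()

	close(exited) // simulate the sandbox task exiting
	wg.Wait()
	fmt.Println("events queued for retry:", q.len())
}
```

Because the goroutine is the only caller of the handler in this sketch, there is exactly one cleanup attempt per exit and at most one queued retry.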
If both task.Delete calls fail but the shim has already been shut down,
cleanupAfterDeadShim could send a new task.Exit event. That would leave three
events in the EventMonitor's backoff queue, which is unnecessary and
confusing because the events are duplicates.
The worst-case scenario caused by two concurrent task.Delete calls is a shim
leak. The timeline for this scenario is as follows:
| Timestamp | Component | Action | Result |
| ------ | ----------- | -------- | -------- |
| T1 | EventMonitor | Sends `task.Delete` | Marked as Req-1 |
| T2 | waitSandboxExit | Sends `task.Delete` | Marked as Req-2 |
| T3 | containerd-shim | Handles Req-2 | Container transitions from stopped to deleted |
| T4 | containerd-shim | Handles Req-1 | Fails - container already deleted<br>Returns error: `cannot delete a deleted process: not found` |
| T5 | EventMonitor | Receives `not found` error | - |
| T6 | EventMonitor | Sends `shim.Shutdown` request | No-op (active container record still exists) |
| T7 | EventMonitor | Closes ttrpc connection | Cleans up container state dir |
| T8 | containerd-shim | Finishes handling Req-2 | Removes container record from memory |
| T9 | waitSandboxExit | Receives error | Error: `ttrpc: closed` |
| T10 | waitSandboxExit | Sends `shim.Shutdown` request | Fails (connection already closed) |
| T11 | waitSandboxExit | Closes ttrpc connection | No-op (already closed) |
The containerd-shim is still running because shim.Shutdown was sent at T6,
before the container record was removed at T8. Because the container's state
dir is deleted at T7, containerd is unable to clean up the shim after a
restart.
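To make the interleaving easier to follow, the timeline can be replayed deterministically against a toy model. Everything in this sketch is an assumption made for illustration: `toyShim`, `conn`, and `replayTimeline` are hypothetical names, the two-phase split of `task.Delete` mirrors rows T3 and T8 of the table, and none of it is the real shim or ttrpc code.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errNotFound = errors.New("cannot delete a deleted process: not found")
	errClosed   = errors.New("ttrpc: closed")
)

// toyShim is a hypothetical, heavily simplified model of the shim's state.
type toyShim struct {
	state   string // "stopped" or "deleted"
	record  bool   // in-memory container record still present
	running bool
}

// markDeleted models the first half of handling task.Delete (T3, T4).
func (s *toyShim) markDeleted() error {
	if s.state == "deleted" {
		return errNotFound
	}
	s.state = "deleted"
	return nil
}

// dropRecord models the second half of handling task.Delete (T8).
func (s *toyShim) dropRecord() { s.record = false }

// shutdown exits the shim only when no container record remains.
func (s *toyShim) shutdown() {
	if !s.record {
		s.running = false
	}
}

// conn models the single ttrpc connection shared by both callers.
type conn struct{ closed bool }

func (c *conn) call(fn func() error) error {
	if c.closed {
		return errClosed
	}
	return fn()
}

// replayTimeline replays T1..T11 in the fixed order from the table.
func replayTimeline() (shimRunning, stateDirExists bool, req1Err, req2Err error) {
	s := &toyShim{state: "stopped", record: true, running: true}
	c := &conn{}
	stateDir := true

	_ = s.markDeleted()                                    // T3: Req-2, stopped -> deleted
	req1Err = c.call(s.markDeleted)                        // T4/T5: Req-1 gets "not found"
	_ = c.call(func() error { s.shutdown(); return nil })  // T6: no-op, record still exists
	c.closed, stateDir = true, false                       // T7: close connection, remove state dir
	s.dropRecord()                                         // T8: Req-2 finishes in the shim
	req2Err = c.call(func() error { return nil })          // T9: reply cannot be delivered
	_ = c.call(func() error { s.shutdown(); return nil })  // T10: fails, connection closed
	// T11: closing the connection again is a no-op.
	return s.running, stateDir, req1Err, req2Err
}

func main() {
	running, dir, e1, e2 := replayTimeline()
	fmt.Println("Req-1 error:", e1)
	fmt.Println("Req-2 error:", e2)
	fmt.Println("shim still running:", running)
	fmt.Println("state dir exists:", dir)
}
```

In this model the replay ends with the shim still running and the state dir gone, which is the leak described above. With a single caller issuing `task.Delete`, the shutdown request would arrive only after the record is dropped.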
We should avoid concurrent task.Delete calls here.
I also added a `shutdown` subcommand to `ctr shim` for debugging.
Fixed: containerd#12344
Signed-off-by: Wei Fu <[email protected]>
(cherry picked from commit 2042e80)
Signed-off-by: Wei Fu <[email protected]>
henry118
approved these changes
Oct 27, 2025
estesp
approved these changes
Oct 28, 2025
Cherry-picked: #12400