Skip to content

Conversation

@fuweid
Copy link
Member

@fuweid fuweid commented Sep 7, 2020

ttrpc provides WithOnClose option for user and ttrpc will call the
callback function when connection is closed unexpectedly or the ttrpc
client's Close() method is called. containerd runtime plugin uses it
to handle cleanup the resources created by containerd shim.

But the ttrpc client's Close() is only trigger and the shim's cleanup
resource callback is called asynchronously, which might make part of
resources leaky. There is an example from containerd-runtime-v2 for
runc:

[Task.Delete goroutine]         [cleanupCallback goroutine]

call ttrpc client.Close() -->

                                  read bundle and call runc delete

delete bundle

If the cleanupCallback is called after deleting bundle, the callback
will fail to call runc delete. If there is any running processes, the
resource becomes leaky.

[Task.Delete goroutine]         [cleanupCallback goroutine]

call ttrpc client.Close() -->

delete bundle

                                  failed to read bundle and call runc delete

In order to avoid this, introduces the UserOnCloseWait to make sure that
the cleanupCallback has been called synchronously, like:

[Task.Delete goroutine]         [cleanupCallback goroutine]

call ttrpc client.Close() -->

wait for callback

                               read bundle and call runc delete

                          <--  finish sync

delete bundle

Signed-off-by: Wei Fu [email protected]

ttrpc provides WithOnClose option for user and ttrpc will call the
callback function when connection is closed unexpectedly or the ttrpc
client's Close() method is called. containerd runtime plugin uses it
to handle cleanup the resources created by containerd shim.

But the ttrpc client's Close() is only trigger and the shim's cleanup
resource callback is called asynchronously, which might make part of
resources leaky. There is an example from containerd-runtime-v2 for
runc:

```happy
[Task.Delete goroutine]         [cleanupCallback goroutine]

call ttrpc client.Close() -->

                                  read bundle and call runc delete

delete bundle
```

If the cleanupCallback is called after deleting bundle, the callback
will fail to call runc delete. If there is any running processes, the
resource becomes leaky.

```unhappy
[Task.Delete goroutine]         [cleanupCallback goroutine]

call ttrpc client.Close() -->

delete bundle

                                  failed to read bundle and call runc delete
```

In order to avoid this, introduces the UserOnCloseWait to make sure that
the cleanupCallback has been called synchronously, like:

```
[Task.Delete goroutine]         [cleanupCallback goroutine]

call ttrpc client.Close() -->

wait for callback

                               read bundle and call runc delete

                          <--  finish sync

delete bundle
```

Signed-off-by: Wei Fu <[email protected]>
Copy link
Member

@estesp estesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@estesp estesp merged commit 079556c into containerd:master Sep 8, 2020
@fuweid fuweid deleted the add-user-on-close-wait branch August 19, 2021 16:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants