Skip to content

runtime/v2/runc: fix leaking socket path#5195

Merged
estesp merged 1 commit intocontainerd:masterfrom
fuweid:fix-5173
Mar 17, 2021
Merged

runtime/v2/runc: fix leaking socket path#5195
estesp merged 1 commit intocontainerd:masterfrom
fuweid:fix-5173

Conversation

@fuweid
Copy link
Copy Markdown
Member

@fuweid fuweid commented Mar 15, 2021

When runC shimv2 starts, the StartShim interface will re-exec itself as
long-running process, which will read the address during initializing.

Process

containerd-shim-runc-v1/v2 start             containerd-shim-runc-v1/v2

	initializing socket

	reexec containerd-shim-runc-v1/v2

	write address into file

						initializing

							read address

	write back to containerd daemon

						serving

						...

						remove address in Shutdown call

However, there is no synchronization after reexec. Then the data race is
like:

Process

containerd-shim-runc-v1/v2 start             containerd-shim-runc-v1/v2

	initializing socket

	reexec containerd-shim-runc-v1/v2

						initializing

							read address

	write address into file

	write back to containerd daemon

						serving

						...

						fail to remove address
						because of empty address

The address should be writen into file first before reexec.

And if shutdown the whole service before cleanup temporary
resource (like socket file), the Shutdown caller will receive ttrpc: closed
sometime, which depends on go runtime scheduler. Then it also causes leaking
socket files.

Since the shimV2-Delete binary API must be called to cleanup shim temporary
resource and shimV2-runC-v1 doesn't support grouping multi containers in one,
it is safe to remove the socket file in the binary call for shimV2-runC-v1.
But for the shimV2-runC-v2 shim, we still cleanup socket in Shutdown.
Hopefully we can find a way to cleanup socket in shimV2-Delete binary
call.

Fix: #5173

Signed-off-by: Wei Fu [email protected]

@fuweid fuweid added this to the 1.5 milestone Mar 15, 2021
@fuweid
Copy link
Copy Markdown
Member Author

fuweid commented Mar 15, 2021

/ok-to-test

@theopenlab-ci

This comment has been minimized.

@fuweid fuweid requested a review from samuelkarp March 15, 2021 04:56
@fuweid fuweid force-pushed the fix-5173 branch 4 times, most recently from 9bf1f16 to e37483b Compare March 15, 2021 10:01
@theopenlab-ci

This comment has been minimized.

@fuweid fuweid changed the title runtime/v2/runc: fix leaking socket path [wip] runtime/v2/runc: fix leaking socket path Mar 15, 2021
When runC shimv2 starts, the StartShim interface will re-exec itself as
long-running process, which will read the `address` during initializing.

```happycase
Process

containerd-shim-runc-v1/v2 start             containerd-shim-runc-v1/v2

	initializing socket

	reexec containerd-shim-runc-v1/v2

	write address into file

						initializing

							read address

	write back to containerd daemon

						serving

						...

						remove address in Shutdown call
```

However, there is no synchronization after reexec. Then the data race is
like:

```leaking-case
Process

containerd-shim-runc-v1/v2 start             containerd-shim-runc-v1/v2

	initializing socket

	reexec containerd-shim-runc-v1/v2

						initializing

							read address

	write address into file

	write back to containerd daemon

						serving

						...

						fail to remove address
						because of empty address
```

The `address` should be writen into file first before reexec.

And if shutdown the whole service before cleanup temporary
resource (like socket file), the Shutdown caller will receive `ttrpc: closed`
sometime, which depends on go runtime scheduler. Then it also causes leaking
socket files.

Since the shimV2-Delete binary API must be called to cleanup shim temporary
resource and shimV2-runC-v1 doesn't support grouping multi containers in one,
it is safe to remove the socket file in the binary call for shimV2-runC-v1.
But for the shimV2-runC-v2 shim, we still cleanup socket in Shutdown.
Hopefully we can find a way to cleanup socket in shimV2-Delete binary
call.

Fix: containerd#5173

Signed-off-by: Wei Fu <[email protected]>
@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Mar 15, 2021

Build succeeded.

@fuweid fuweid changed the title [wip] runtime/v2/runc: fix leaking socket path runtime/v2/runc: fix leaking socket path Mar 15, 2021
Copy link
Copy Markdown
Member

@crosbymichael crosbymichael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Member

@estesp estesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@estesp estesp merged commit a0cc9b4 into containerd:master Mar 17, 2021
@fuweid fuweid deleted the fix-5173 branch March 18, 2021 02:46
@fuweid fuweid added cherry-picked/1.4.x PR commits are cherry picked into the release/1.4 branch and removed cherry-pick/1.4.x labels May 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cherry-picked/1.4.x PR commits are cherry picked into the release/1.4 branch ok-to-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

runtime: leaking shim socket files

4 participants