*: fix leaked shim caused by high IO pressure #8954

fuweid · 2023-08-11T09:10:45Z

integration: add case to reproduce #7496

When the shim unmounts the overlayfs rootfs, the kernel forces a syncfs unless the volatile mount option is set. To reproduce the high IO pressure case, this patch uses strace to delay the umount2 syscall.

NOTE: I didn't squash the three commits into one, so that they are easier to backport to v1.6.

Fixes: #7496 #8931

integration: add ShouldRetryShutdown case based on #7496

In the current design, if the shim is killed before the task-service.Delete API call, the callback on connection close sends a 137 exit code, because the callback doesn't have any context about the container's exit code.

containerd/runtime/v2/shim.go

Lines 170 to 184 in 70a2c95

    
    if response != nil {
    	pid = response.Pid
    	exitStatus = response.Status
    	exitedAt = response.Timestamp
    } else {
    	exitStatus = 255
    	exitedAt = time.Now()
    }
    events.Publish(ctx, runtime.TaskExitEventTopic, &eventstypes.TaskExit{
    	ContainerID: id,
    	ID:          id,
    	Pid:         pid,
    	ExitStatus:  exitStatus,
    	ExitedAt:    protobuf.ToTimestamp(exitedAt),
    })

moby/moby also can't handle duplicate exit events well. For example, if moby first receives exit code 0 and then a duplicate exit event with a different code, 137, the later event can override the 0, as described in #4769.
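To make the override problem concrete, here is a minimal sketch of a consumer that keeps only the first exit status it sees for a task and ignores later duplicates. This is not moby or containerd code; the `exitRecorder` type and its methods are hypothetical and only illustrate one way a "first exit wins" rule could look.

```go
package main

import (
	"fmt"
	"sync"
)

// exitRecorder keeps the first exit status reported for each task ID and
// ignores later reports. Hypothetical type, for illustration only.
type exitRecorder struct {
	mu     sync.Mutex
	status map[string]uint32
}

func newExitRecorder() *exitRecorder {
	return &exitRecorder{status: make(map[string]uint32)}
}

// Record stores status for id unless a status is already stored. It returns
// the status in effect and whether this call was the one that set it.
func (r *exitRecorder) Record(id string, status uint32) (uint32, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if prev, ok := r.status[id]; ok {
		return prev, false
	}
	r.status[id] = status
	return status, true
}

func main() {
	r := newExitRecorder()

	// The real exit event arrives first.
	code, stored := r.Record("c1", 0)
	fmt.Println(code, stored)

	// A duplicate event with a different code does not replace it.
	code, stored = r.Record("c1", 137)
	fmt.Println(code, stored)
}
```

With a rule like this, the order in which events arrive still matters: if the fallback event were delivered first, the real exit code would be the one discarded.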

I think the best way to avoid the leak is to redesign the task-service.Delete API: remove the async callback and let the caller retry. Since a task in the shim can't be restarted, we could cache the exit code in the container bundle so that the shim.Delete binary call can read it and return the exit code correctly. However, that doesn't work with a running shim server.

To prevent a regression like #4769, I added a skipped integration case as a TODO item; we should rethink how to handle the task/shim lifecycle.

@mikebrow @dmcgowan @mxpv @AkihiroSuda @thaJeztah @laurazard

Signed-off-by: Wei Fu <[email protected]>

Fixes: containerd#7496 containerd#8931

Signed-off-by: Wei Fu <[email protected]>

Since moby/moby can't handle duplicate exit events well, it's hard for containerd to retry shutdown when there is an error such as context canceled. To prevent a regression like containerd#4769, I added a skipped integration case as a TODO item; we should rethink how to handle the task/shim lifecycle.

Signed-off-by: Wei Fu <[email protected]>

cpuguy83

LGTM

I have a feeling moby needs a similar patch.

fuweid · 2023-08-16T02:54:02Z

> I have a feeling moby needs a similar patch.

Basically, yes. By the way, the leaked shim can be cleaned up after containerd restarts.

Thanks for the review.

fuweid added the ok-to-test label Aug 11, 2023

fuweid force-pushed the fix-shim-leak branch from dc0588b to c362a3b Compare August 11, 2023 09:36

fuweid added 4 commits August 11, 2023 17:41

integration: add case to reproduce containerd#7496

5bdd9ca

Signed-off-by: Wei Fu <[email protected]>

pkg/cri/server: fix leaked shim issue

72bc63d

Fixes: containerd#7496 containerd#8931

Signed-off-by: Wei Fu <[email protected]>

pkg/cri/sbserver: fix leaked shim issue for podsandbox mode

8dcb2a6

Fixes: containerd#7496 containerd#8931

Signed-off-by: Wei Fu <[email protected]>

fuweid force-pushed the fix-shim-leak branch from c362a3b to 601699a Compare August 11, 2023 09:44

Vagrantfile: add strace tool

00ef8ba

Signed-off-by: Wei Fu <[email protected]>

fuweid mentioned this pull request Aug 13, 2023

libcontainerd: consider to use task.Wait to update the container's exit code instead of task.Event moby/moby#46212

Open

cpuguy83 approved these changes Aug 15, 2023

dmcgowan approved these changes Aug 16, 2023

fuweid merged commit ba852fa into containerd:main Aug 17, 2023

fuweid added the cherry-pick/1.6.x, cherry-pick/1.7.x, and area/cri labels Aug 17, 2023

fuweid deleted the fix-shim-leak branch August 17, 2023 00:17

fuweid added the cherry-picked/1.6.x and cherry-picked/1.7.x labels and removed the cherry-pick/1.6.x and cherry-pick/1.7.x labels Oct 12, 2023

johannesfrey mentioned this pull request Oct 30, 2023

shim process leaked #9309

Closed

mikebrow mentioned this pull request Nov 2, 2023

[release/1.7] Update hcsshim tag to v0.11.4 #9326

Merged

marcodalcin mentioned this pull request Feb 6, 2024

OCI runtime create failed #9222

Closed
