Skip to content

fix the restart desired to running when task not found#6833

Merged
estesp merged 1 commit intocontainerd:mainfrom
junnplus:monitor-restart
Apr 20, 2022
Merged

fix the restart desired to running when task not found#6833
estesp merged 1 commit intocontainerd:mainfrom
junnplus:monitor-restart

Conversation

@junnplus
Copy link
Copy Markdown
Member

@junnplus junnplus commented Apr 20, 2022

Signed-off-by: Ye Sijun [email protected]

Report of @AkihiroSuda #6830 (comment)

I'm sorry for misunderstanding 9503d72#diff-a9d90e82d0fee932f0437ab217a8ef14e4a56b4be02321de5193de0ad98e20f9L239-L249

For the case when Task returns not found and the restart desired to running, we need to continue to restart the container.

@junnplus
Copy link
Copy Markdown
Member Author

junnplus commented Apr 20, 2022

I tested TestRunRestart of nerdctl locally, and it works.

containerd --version
containerd github.com/containerd/containerd v1.6.0-281-g4fd32eeb5 4fd32ee

$ go test -exec sudo -v ./cmd/nerdctl/... -run TestRunRestart -args -test.kill-daemon
test target: "nerdctl"
=== RUN   TestRunRestart
    run_restart_linux_test.go:44: NOTE: this test may take a while
    run_restart_linux_test.go:71: killing "containerd.service"
    run_restart_linux_test.go:72: checking activity of "containerd.service"
    run_restart_linux_test.go:72: (retry=0) active
    run_restart_linux_test.go:72: daemon "containerd.service" is now running, checking whether the daemon can handle requests
    run_restart_linux_test.go:72: (retry=0) time="2022-04-20T10:00:42Z" level=fatal msg="failed to dial \"/run/containerd/containerd.sock\": connection error: desc = \"transport: error while dialing: dial unix /run/containerd/containerd.sock: connect: connection refused\""
    run_restart_linux_test.go:72: (retry=1) activating
    run_restart_linux_test.go:72: (retry=2) active
    run_restart_linux_test.go:72: daemon "containerd.service" is now running, checking whether the daemon can handle requests
    run_restart_linux_test.go:72: daemon "containerd.service" can now handle requests
    run_restart_linux_test.go:79: (retry 0) ps -a: "CONTAINER ID    IMAGE                                              COMMAND                   CREATED           STATUS     PORTS                     NAMES\n367bda156295    ghcr.io/stargz-containers/nginx:1.19-alpine-org    \"/docker-entrypoint.…\"    15 minutes ago    Created                              c1-in-n0                  \n3843c3975c6f    ghcr.io/stargz-containers/nginx:1.19-alpine-org    \"/docker-entrypoint.…\"    15 minutes ago    Created                              c2-in-n1                  \n6465b0b125f6    ghcr.io/stargz-containers/nginx:1.19-alpine-org    \"/docker-entrypoint.…\"    6 seconds ago     Up         127.0.0.1:8080->80/tcp    nerdctl-testrunrestart    \nd941b0eec3a5    ghcr.io/stargz-containers/nginx:1.19-alpine-org    \"/docker-entrypoint.…\"    15 minutes ago    Created                              c0-in-n0                  \nf0460d99d436    ghcr.io/stargz-containers/nginx:1.19-alpine-org    \"/docker-entrypoint.…\"    15 minutes ago    Created                              c3-in-bridge              \n"
    run_restart_linux_test.go:82: test is passing, after 0 retries
--- PASS: TestRunRestart (6.89s)
PASS
ok      github.com/containerd/nerdctl/cmd/nerdctl       6.906s

@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Apr 20, 2022

Build succeeded.

if err != nil {
logrus.WithError(err).Error("monitor")
if desiredStatus != containerd.Stopped {
if desiredStatus == containerd.Stopped {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if desiredStatus is running, the "on-failure" case will get empty status in the Line 245.
I am not sure that there is a way to store the status after delete. Anyway, it is safe to use with nerdctl.
Maybe we can add the comment about known issue.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An empty status has no effect on on-failure, which expects a status whose ExitStatus is not 0.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get it, I added the comment about known issue in 243-245 .

@junnplus junnplus force-pushed the monitor-restart branch 2 times, most recently from 52ff683 to 8232c44 Compare April 20, 2022 13:42
Signed-off-by: Ye Sijun <[email protected]>
@junnplus junnplus requested a review from fuweid April 20, 2022 13:48
@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Apr 20, 2022

Build succeeded.

@fuweid
Copy link
Copy Markdown
Member

fuweid commented Apr 20, 2022

I think we can watch event to handle the reconcile if necessary ~

Copy link
Copy Markdown
Member

@estesp estesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@estesp estesp merged commit 1a23678 into containerd:main Apr 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants