
bugfix: handle duplicate exit events via task status #52156

Merged
cpuguy83 merged 1 commit into moby:master from fuweid:fix-52153 on Mar 10, 2026

@fuweid fuweid commented Mar 8, 2026

- What I did

Fixes: #52153

- How I did it

Replace timestamp-based duplicate exit detection for running containers with a live containerd task status check, and ignore exits only when the task is still Running.

Also simplify restarting-state handling by always treating exit events as duplicates while restart processing is in progress, and add warning logs for task status lookups.
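As a rough sketch of how the restarting shortcut might sit next to the running-state check: `ContainerState`, `ignoreExit`, and the `taskRunning` callback are hypothetical names, and the handling of any other state is an assumption, since the description only covers the running and restarting cases.

```go
package main

import "fmt"

// ContainerState is a hypothetical, simplified view of the daemon's
// bookkeeping for a container.
type ContainerState int

const (
	StateStopped ContainerState = iota
	StateRunning
	StateRestarting
)

// ignoreExit decides whether an incoming exit event should be dropped.
// taskRunning is a hypothetical callback that performs the live task status
// check; it is consulted only when the daemon believes the container is running.
func ignoreExit(state ContainerState, taskRunning func() bool) bool {
	switch state {
	case StateRestarting:
		// Restart processing is in progress: always treat the exit as a duplicate.
		return true
	case StateRunning:
		// Ignore the exit only if the task is still running.
		return taskRunning()
	default:
		// Assumption: other states are processed normally.
		return false
	}
}

func main() {
	stillRunning := func() bool { return true }
	gone := func() bool { return false }

	fmt.Println(ignoreExit(StateRestarting, gone))      // true
	fmt.Println(ignoreExit(StateRunning, stillRunning)) // true
	fmt.Println(ignoreExit(StateRunning, gone))         // false
}
```

Note that in this sketch the restarting branch never performs a status lookup, which is one way to read "always treating exit events as duplicates".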

This avoids relying on wall-clock timestamps that can move backward (e.g., NTP), which could misclassify duplicate exit events.

- How to verify it

Manually run a container and force containerd-shim to publish duplicate exit events.

- Human readable description for the release notes

Improved duplicate container-exit handling by using live containerd task state (not timestamps)

- A picture of a cute animal (not mandatory but encouraged)

@github-actions github-actions Bot added the area/daemon Core Engine label Mar 8, 2026

fuweid commented Mar 8, 2026

@anatolebeuzon

Thanks for the fix! It looks correct to me.

Comment thread daemon/monitor.go Outdated
return false
}

ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
Member

context.TODO() ?

Contributor Author

Updated

Member

@cpuguy83 cpuguy83 left a comment


LGTM, just a minor comment.

Replace timestamp-based duplicate exit detection for running containers with a
live containerd task status check, and ignore exits only when the task is still
`Running`.

Also simplify restarting-state handling by always treating exit events as
duplicates while restart processing is in progress, and add warning logs for
task status lookups.

This avoids relying on wall-clock timestamps that can move backward (e.g., NTP),
which could misclassify duplicate exit events.

Signed-off-by: Wei Fu <[email protected]>

fuweid commented Mar 9, 2026

The test case failed, but it seems unrelated to my change.


fuweid commented Mar 9, 2026

Is this ready to be merged? :)

@cpuguy83 cpuguy83 merged commit 3ad6ab4 into moby:master Mar 10, 2026
252 of 253 checks passed

Development

Successfully merging this pull request may close these issues.

Container stuck in Running state when system clock steps backwards

6 participants