Update OOMKilled event handling #12714

Merged
fuweid merged 6 commits into containerd:main from fuweid:fix-oom-issue
Jan 7, 2026

fuweid commented Dec 19, 2025

cmd/containerd-shim-runc-v2: add experimental OOM package

The OOM handling code is intended to live under pkg/oom/v2. However, the
cgroupv2 package still needs further refinement, such as exporting the
cgroup path and allowing callers to query specific stats instead of
returning all of them.

Until that work is complete, introduce the OOM package as experimental
and place it under containerd-shim-runc-v2.

cmd/containerd-shim-runc-v2: use experimental OOM package

The shim should always send the OOM event before the exit event.
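
The ordering requirement can be sketched as follows. This is an assumption-laden illustration, not the shim's implementation: the function `eventsOnExit`, the idea of comparing two `oom_kill` counter samples, and the topic strings are all chosen here for demonstration.

```go
package main

import "fmt"

// Topic strings used for illustration only.
const (
	topicOOM  = "/tasks/oom"
	topicExit = "/tasks/exit"
)

// eventsOnExit returns the topics to publish, in order, when a container
// exits. before and after are oom_kill counter values sampled at start and at
// exit (hypothetical inputs). The OOM topic, if any, always precedes the exit
// topic so that a consumer has seen the OOM before it handles the exit.
func eventsOnExit(before, after uint64) []string {
	var topics []string
	if after > before {
		topics = append(topics, topicOOM)
	}
	return append(topics, topicExit)
}

func main() {
	fmt.Println(eventsOnExit(0, 1)) // OOM-killed container
	fmt.Println(eventsOnExit(0, 0)) // normal exit
}
```

Building the ordered list in one place makes it hard for a later code path to publish the exit event first by accident.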

internal/cri/server: check if OOM event occurred before update status

cri-integration: add stress test for TestOOMEventMonitor

The test was validated locally by running 100 pods for 100 rounds without
observing any failures. Due to limited resources in the CI environment,
the test parameters were reduced to 8 pods and 10 rounds.

FOCUS=TestOOMEventMonitor CGROUP_DRIVER=cgroupfs taskset -c 0,1 make cri-integration | tee /tmp/log

*: skip critest OOMKilled testcase for systemd cgroup

With the systemd cgroup driver, the container runtime uses a scope unit to
manage the cgroup path. According to the scope unit documentation:

> Unlike service units, scope units have no “main” process: all processes in
> the scope are equivalent. The lifecycle of a scope unit is therefore not
> bound to a specific process, but to the existence of at least one process in
> the scope. As a result, individual process exit statuses are not relevant to
> the scope unit’s failure state.

Because of this, we cannot rely on CollectMode=inactive-or-failed to preserve
the cgroup path, which leaves a race between containerd and systemd garbage
collection. If systemd GC removes the scope unit’s cgroup before containerd
reads it, containerd can no longer inspect the cgroup to determine the OOM status.

We therefore skip the critest OOMKilled test case when the systemd cgroup driver is used.

In theory, this could be mitigated by inspecting the unit logs (e.g.
journalctl -u XXX.scope) and searching for the "OOMKilled" keyword.
However, this approach depends on journalctl and systemd logging behavior,
so it should be avoided.

Example journal output:

Dec 22 01:24:58 devbox systemd[1]: Started /usr/bin/bash -c dd if=/dev/zero of=/dev/null bs=20M.
Dec 22 01:24:58 devbox systemd[1]: XXX.service: A process of this unit has been killed by the OOM killer.
Dec 22 01:24:58 devbox systemd[1]: XXX.service: Main process exited, code=killed, status=9/KILL
Dec 22 01:24:58 devbox systemd[1]: XXX.service: Failed with result 'oom-kill'.

Ref: https://www.freedesktop.org/software/systemd/man/latest/systemd.scope.html

github-project-automation bot moved this to Needs Triage in Pull Request Review Dec 19, 2025
dosubot bot added the area/cri (Container Runtime Interface) and area/runtime (Runtime) labels Dec 19, 2025
fuweid requested review from dmcgowan and mikebrow December 19, 2025 19:36

mikebrow left a comment

couple comments on the events.go changes

Comment thread internal/cri/server/events.go
Comment thread internal/cri/server/events.go
fuweid force-pushed the fix-oom-issue branch 2 times, most recently from b00a96e to fafcfc8 December 20, 2025 01:05
fuweid marked this pull request as draft December 20, 2025 01:41

mikebrow left a comment

Your draft looks good to me.
nit: the test output is a bit spammy, but I suppose that is not the fault of the new test, just of what is being tested.

Comment thread internal/cri/server/events.go
fuweid marked this pull request as ready for review December 22, 2025 22:16

fuweid commented Dec 22, 2025

Hi @mikebrow, the output comes from the integration/remote package. Because the test creates many pods and containers, it produces a lot of output. Let me see if I can reduce it in a follow-up. I also updated the critest OOMKilled test case for the systemd cgroup driver. Please take a look when you have a chance. Thanks.

mikebrow left a comment

LGTM

ningmingxiao commented

How about using the old package containerd/cgroups? @fuweid @mikebrow

fuweid commented Dec 24, 2025

> How about using the old package containerd/cgroups?

I will update the package later and then update it in containerd.

fuweid commented Dec 30, 2025

May I have a review on this? Thanks!

fuweid added 6 commits January 6, 2026 20:44
The OOM handling code is intended to live under pkg/oom/v2. However, the
cgroupv2 package still needs further refinement, such as exporting the
cgroup path and allowing callers to query specific stats instead of
returning all of them.

Until that work is complete, introduce the OOM package as experimental
and place it under containerd-shim-runc-v2.

Signed-off-by: Wei Fu <[email protected]>

We should always send oom event before exit event.

Signed-off-by: Wei Fu <[email protected]>

The test was validated locally by running 100 pods for 100 rounds without
observing any failures. Due to limited resources in the CI environment,
the test parameters were reduced to 8 pods and 10 rounds.

```bash
FOCUS=TestOOMEventMonitor CGROUP_DRIVER=cgroupfs taskset -c 0,1 make cri-integration | tee /tmp/log
```

Signed-off-by: Wei Fu <[email protected]>
github-project-automation bot moved this from Needs Triage to Review In Progress in Pull Request Review Jan 7, 2026
fuweid added this pull request to the merge queue Jan 7, 2026
github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 7, 2026
fuweid added this pull request to the merge queue Jan 7, 2026
Merged via the queue into containerd:main with commit a800acb Jan 7, 2026
52 checks passed
github-project-automation bot moved this from Review In Progress to Done in Pull Request Review Jan 7, 2026
dmcgowan added the impact/changelog label and removed the area/cri (Container Runtime Interface) label Mar 17, 2026
dmcgowan changed the title from "*: update OOMKilled event handling" to "Update OOMKilled event handling" Mar 17, 2026

5 participants