Dealing with optimizer-server zombie processes #622

Merged
imeoer merged 2 commits into containerd:main from LunaWhispers:deal_zombie_processes on Dec 4, 2024

@LunaWhispers

Problem Description

Following the process outlined in optimize_nydus_image, we built an optimizer to generate the list of accessed files, and deployed optimizer-server and optimizer-nri-plugin on our servers. After some time, we noticed that some optimizer-server processes had become defunct (zombie) processes:
[image: screenshot of the defunct optimizer-server processes]
The parent process of every one of these zombies is optimizer-nri-plugin. We also verified that the issue occurs on every server where we deployed it.
We believe that if the servers keep retaining these zombie processes, they will accumulate and eventually affect the normal operation of the business.

Modified Sections

We investigated and found that the issue lies primarily in two places:

  1. When optimizer-nri-plugin performs the fanotifyServer operation, it invokes the optimizer-server executable, which creates a child process. The plugin must wait for this child process to exit so that it is reaped and its resources are cleaned up.
  2. While optimizer-server is running, it also creates a child process for the fanotify operation. Likewise, optimizer-server must wait for that process to exit and clean up its resources.
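
The first point can be illustrated with a minimal Go sketch. It is not the code changed in this PR: the helper name startAndReap and the commands it runs are hypothetical, and it only assumes the standard os/exec package. The idea is that a child started with Start stays in the process table as a zombie after it exits until the parent calls Wait, so Wait is called in a goroutine as soon as the child is started.

```go
package main

import (
	"fmt"
	"os/exec"
)

// startAndReap (hypothetical helper) starts a child process and reaps it in
// the background. Without the call to Wait, an exited child would remain a
// zombie (<defunct>) for as long as the parent process keeps running.
func startAndReap(name string, args ...string) (<-chan error, error) {
	cmd := exec.Command(name, args...)
	if err := cmd.Start(); err != nil {
		return nil, err
	}

	// Buffered so the goroutine can finish even if nobody reads the result.
	done := make(chan error, 1)
	go func() {
		// Wait blocks until the child exits, collects its exit status,
		// and releases the resources associated with the command.
		done <- cmd.Wait()
	}()
	return done, nil
}

func main() {
	// "sleep 1" stands in for the real child binary.
	done, err := startAndReap("sleep", "1")
	if err != nil {
		fmt.Println("start failed:", err)
		return
	}
	if err := <-done; err != nil {
		fmt.Println("child exited with error:", err)
		return
	}
	fmt.Println("child exited and was reaped")
}
```

Calling Wait in a goroutine keeps the caller non-blocking, which matters when the parent is a long-running plugin that starts many children over its lifetime.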

Validation Results

We deployed the modified binaries on four servers for observation. After a full day of monitoring, no zombie processes had appeared.

Testing Method

The modified code was recompiled, and the newly generated optimizer-server and optimizer-nri-plugin binaries replaced the existing ones.
The process status was checked using the command ps -ef | grep optimizer to verify that no zombie processes had appeared.
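
As a complement to the manual ps check, the same verification could be automated on Linux by reading the process state from /proc. The sketch below is an assumption-laden illustration, not part of this PR: the names procInfo, parseStat and findZombies are hypothetical, and it relies only on the documented layout of /proc/<pid>/stat, where the command name is in parentheses and is followed by the state (Z for zombie) and the parent PID. Note that the kernel truncates the command name to 15 characters, so optimizer-server appears as optimizer-serve.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// procInfo holds the fields of /proc/<pid>/stat that matter here.
type procInfo struct {
	PID   string
	Comm  string
	State string
	PPID  string
}

// parseStat parses the contents of one /proc/<pid>/stat file.
// The command name is wrapped in parentheses and may itself contain
// spaces or parentheses, so the split is done at the last ')'.
func parseStat(stat string) (procInfo, bool) {
	open := strings.IndexByte(stat, '(')
	end := strings.LastIndexByte(stat, ')')
	if open < 0 || end < open {
		return procInfo{}, false
	}
	rest := strings.Fields(stat[end+1:])
	if len(rest) < 2 {
		return procInfo{}, false
	}
	return procInfo{
		PID:   strings.TrimSpace(stat[:open]),
		Comm:  stat[open+1 : end],
		State: rest[0],
		PPID:  rest[1],
	}, true
}

// findZombies lists processes in state Z whose command name contains substr.
func findZombies(substr string) []procInfo {
	var zombies []procInfo
	paths, _ := filepath.Glob("/proc/[0-9]*/stat")
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			continue // the process may have exited in the meantime
		}
		info, ok := parseStat(string(data))
		if ok && info.State == "Z" && strings.Contains(info.Comm, substr) {
			zombies = append(zombies, info)
		}
	}
	return zombies
}

func main() {
	for _, z := range findZombies("optimizer") {
		fmt.Printf("zombie pid=%s ppid=%s comm=%s\n", z.PID, z.PPID, z.Comm)
	}
}
```

Printing the parent PID makes it easy to confirm which process is failing to reap its children.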

@codecov

codecov Bot commented Nov 14, 2024

Codecov Report

Attention: Patch coverage is 0% with 4 lines in your changes missing coverage. Please review.

Project coverage is 21.29%. Comparing base (29243e3) to head (917375e).
Report is 20 commits behind head on main.

Files with missing lines    Patch %   Lines
pkg/fanotify/fanotify.go    0.00%     4 Missing ⚠️
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #622      +/-   ##
==========================================
- Coverage   21.93%   21.29%   -0.64%     
==========================================
  Files         122      122              
  Lines       10839    13682    +2843     
==========================================
+ Hits         2377     2913     +536     
- Misses       8140    10447    +2307     
  Partials      322      322              
Files with missing lines    Coverage Δ
pkg/fanotify/fanotify.go    0.00% <0.00%> (ø)

... and 112 files with indirect coverage changes

@LunaWhispers
Author

@imeoer PTAL

Collaborator

@imeoer imeoer left a comment

Sorry for the delay, LGTM, thanks!

@imeoer imeoer merged commit 021c505 into containerd:main Dec 4, 2024