analyzer: add pre-container monitor to capture early file access#2129

Merged
ktock merged 1 commit into containerd:main from wswsmao:monitor on Oct 16, 2025

Contributor

@wswsmao wswsmao commented Sep 12, 2025

Fixes: #2128

@wswsmao wswsmao closed this Sep 15, 2025
@wswsmao wswsmao reopened this Sep 15, 2025
@wswsmao wswsmao closed this Sep 29, 2025
@wswsmao wswsmao reopened this Sep 29, 2025
@wswsmao wswsmao closed this Oct 9, 2025
@wswsmao wswsmao reopened this Oct 9, 2025
@wswsmao wswsmao force-pushed the monitor branch 2 times, most recently from e363684 to 71e7abd Compare October 14, 2025 06:40
prePaths := preMonitor.GetPaths()
for _, path := range prePaths {
	cleanPath := path
	if strings.HasPrefix(path, target) {
Member

Is it possible that this becomes false? If so, that path should not be recorded and just be skipped.

Contributor Author

@wswsmao wswsmao Oct 15, 2025

I investigated this on nvcr.io/nvidia/tritonserver:25.07-py3 and found that during the pre-monitor phase fanotify can report the same file in two forms: with and without the target mount prefix. For example:

  • Without prefix:
    • /usr/lib/x86_64-linux-gnu/libubsan.so.1.0.0
    • /usr/lib/x86_64-linux-gnu/liblber.so.2.0.200
    • /opt/hpcx/ompi/lib/libpmix.so.2.2.35
  • With prefix:
    • /tmp/target296403442/usr/bin/nvidia-persistenced
    • /tmp/target296403442/usr/lib/firmware/nvidia/535.261.03/gsp_ga10x.bin

After the container starts, these files are present in the container filesystem as expected:

# inside the running container
ls /usr/lib/x86_64-linux-gnu/libubsan.so.1.0.0 \
   /usr/lib/x86_64-linux-gnu/liblber.so.2.0.200 \
   /opt/hpcx/ompi/lib/libpmix.so.2.2.35
# outputs: all three paths exist

ls /usr/bin/nvidia-persistenced \
   /usr/lib/firmware/nvidia/535.261.03/gsp_ga10x.bin
# outputs: both paths exist

So strings.HasPrefix(path, target) can legitimately be false for valid accesses under the rootfs. If we skip those, we will miss real file accesses.
The safer approach is to normalize: when the prefix is present, trim it; otherwise record the path as-is. Even if a recorded path does not exist, recorder.Record will skip it.
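
A minimal sketch of that normalization, for illustration only. The helper names `normalizePrePath` and `normalizeAll` are hypothetical and not taken from the PR. The directory-boundary check (matching `target + "/"` rather than the bare prefix) and the de-duplication step are assumptions added here, since the same file may be reported both with and without the prefix.

```go
package main

import (
	"path/filepath"
	"strings"
)

// normalizePrePath is a hypothetical helper (not from the PR). It maps a path
// reported during the pre-container phase to a path relative to the rootfs:
// the target mount prefix is trimmed when present, otherwise the path is
// returned unchanged.
func normalizePrePath(path, target string) string {
	target = filepath.Clean(target)
	if path == target {
		return "/"
	}
	// Assumption: match on a directory boundary so that a sibling such as
	// "/tmp/target2964034420/x" is not mistaken for a path under the target.
	if strings.HasPrefix(path, target+"/") {
		return "/" + strings.TrimPrefix(path, target+"/")
	}
	return path
}

// normalizeAll normalizes every path and drops duplicates, keeping the
// first-seen order. De-duplication is an assumption of this sketch.
func normalizeAll(paths []string, target string) []string {
	seen := make(map[string]struct{}, len(paths))
	out := make([]string, 0, len(paths))
	for _, p := range paths {
		n := normalizePrePath(p, target)
		if _, dup := seen[n]; dup {
			continue
		}
		seen[n] = struct{}{}
		out = append(out, n)
	}
	return out
}
```

With the target `/tmp/target296403442`, both `/tmp/target296403442/usr/bin/nvidia-persistenced` and `/usr/bin/nvidia-persistenced` normalize to the same rootfs path, so neither form is skipped and the file is recorded once.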

@wswsmao wswsmao requested a review from ktock October 16, 2025 01:55
Member

@ktock ktock left a comment

Thanks

@ktock ktock merged commit 8f3983e into containerd:main Oct 16, 2025
130 of 132 checks passed

Development

Successfully merging this pull request may close these issues.

Prefetch miss with GPU containers due to missing pre-container file access tracking

2 participants