core/mount.test: should not call removeLoop when set autoclear #12561

Merged
fuweid merged 1 commit into containerd:main from fuweid:fix-loopback-testcase
Nov 25, 2025

Conversation

Member

@fuweid fuweid commented Nov 24, 2025

In CI we run make root-test via gotestsum, which executes multiple
package tests concurrently. TestAutoclearTrueLoop invokes LOOP_CLR_FD
by device name after it has closed the loop device's fd, which
introduces a race condition.

Example race:

Process P1 represents mount.test which runs TestAutoclearTrueLoop
Process P2 represents manager.test which runs TestLoopbackMount

T1: P1 closes fd of loop-device (loop3) (kernel unsets backing-file on close)
T2: P2 gets loop3 from /dev/loop-control
T3: P2 configures loop3 with backing file successfully
T4: P1 invokes removeLoop to clear backing file for loop3
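
The T1..T4 interleaving above can be replayed deterministically with a toy model. This is only a sketch: `loopTable` and its methods are hypothetical stand-ins for kernel behavior, not containerd code, and no real loop devices or ioctls are involved.

```go
package main

import "fmt"

// loopTable is a hypothetical stand-in for the kernel's loop devices:
// it maps a device name to its backing file ("" means the device is free).
type loopTable map[string]string

// closeWithAutoclear models closing the last fd of an autoclear device:
// the backing file is detached and the device becomes free again.
func (t loopTable) closeWithAutoclear(dev string) { t[dev] = "" }

// attachFree models asking for a free device and configuring it with a
// backing file. It returns the chosen device, or "" if none is free.
func (t loopTable) attachFree(candidates []string, backing string) string {
	for _, dev := range candidates {
		if t[dev] == "" {
			t[dev] = backing
			return dev
		}
	}
	return ""
}

// clearByName models a clear request addressed by device name. It cannot
// tell whether the device still belongs to the caller.
func (t loopTable) clearByName(dev string) { t[dev] = "" }

// replayRace runs T1..T4 in order and returns the device P2 obtained
// together with the backing file that device has at the end.
func replayRace() (string, string) {
	devs := []string{"loop3"}
	table := loopTable{"loop3": "p1-backing"}

	table.closeWithAutoclear("loop3")           // T1: P1 closes its fd
	dev := table.attachFree(devs, "p2-backing") // T2+T3: P2 reuses loop3
	table.clearByName("loop3")                  // T4: P1 clears by name

	return dev, table[dev]
}

func main() {
	dev, backing := replayRace()
	// Prints: P2 holds loop3, backing file is now ""
	fmt.Printf("P2 holds %s, backing file is now %q\n", dev, backing)
}
```

In this model P2 ends up holding a device with no backing file, which matches the shape of the failure reported by TestLoopbackMount.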

The resulting failure looks like this:

```
=== FAIL: core/mount/manager TestLoopbackMount (0.05s)
    log_hook.go:47: time="2025-10-23T21:49:22.532811960Z" level=debug msg="activating mount" func="manager.(*mountManager).Activate" file="/home/runner/work/containerd/containerd/core/mount/manager/manager.go:134" mounts="[{loop /tmp/TestLoopbackMount989607109/001/fs-1621892597  []} {format/ext4 {{ mount 0 }}  []}]" name=id1 testcase=TestLoopbackMount
    helpers.go:100: unmount /tmp/TestLoopbackMount989607109/001/test-mount-3030342351
    manager_linux_test.go:80:
        	Error Trace:	/home/runner/work/containerd/containerd/core/mount/manager/manager_linux_test.go:80
        	            				/home/runner/work/containerd/containerd/core/mount/manager/manager_linux_test.go:105
        	Error:      	Received unexpected error:
        	            	failed to get loop device info: no such device or address
        	Test:       	TestLoopbackMount
```

To fix this, the test now compares the backing file's inode directly and
no longer calls removeLoop when autoclear is set.
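
A minimal sketch of what comparing the backing file's inode might look like, assuming a Unix-like system. `fileInode`, `backedBy`, and `demo` are hypothetical helpers written for illustration, not the functions used in the actual test, and the inode that a real test would read from the loop device while its fd is still open is simulated here with a value recorded from stat(2).

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// fileInode returns the inode number of path as reported by stat(2).
func fileInode(path string) (uint64, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, fmt.Errorf("no stat_t available for %s", path)
	}
	return uint64(st.Ino), nil
}

// backedBy reports whether loopInode matches the inode of backingPath.
// loopInode is assumed to come from the loop device's status, read while
// the caller still holds the device open.
func backedBy(loopInode uint64, backingPath string) (bool, error) {
	ino, err := fileInode(backingPath)
	if err != nil {
		return false, err
	}
	return ino == loopInode, nil
}

// demo creates two files and checks a recorded inode against each of them.
func demo() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "inode-demo-")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	a, b := dir+"/a.img", dir+"/b.img"
	for _, p := range []string{a, b} {
		if werr := os.WriteFile(p, nil, 0o600); werr != nil {
			return false, false, werr
		}
	}

	recorded, err := fileInode(a) // stands in for the loop device's inode
	if err != nil {
		return false, false, err
	}
	same, err := backedBy(recorded, a)
	if err != nil {
		return false, false, err
	}
	other, err := backedBy(recorded, b)
	return same, other, err
}

func main() {
	same, other, err := demo()
	fmt.Println(same, other, err)
}
```

A check of this kind only reads state that belongs to the caller, so it does not need a follow-up clear request addressed by device name.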

Signed-off-by: Wei Fu <[email protected]>
@github-project-automation github-project-automation Bot moved this to Needs Triage in Pull Request Review Nov 24, 2025
@fuweid fuweid requested review from dmcgowan and hsiangkao and removed request for dmcgowan November 24, 2025 02:35
@fuweid fuweid requested a review from dmcgowan November 24, 2025 02:35
@fuweid fuweid changed the title core/mount: should not call removeLoop when set autoclear core/mount.test: should not call removeLoop when set autoclear Nov 24, 2025
		return dev
	}()
	for range 10 {
		if err := removeLoop(dev); err != nil {
Member

@hsiangkao hsiangkao Nov 24, 2025

Yeah, that can introduce a race, since another loop user could reuse the device after the loop fd is closed with autoclear on.

Member Author

fuweid commented Nov 25, 2025

Ping @containerd/reviewers, could you please approve this one? It's trying to deflake that test case. Thanks.

@github-project-automation github-project-automation Bot moved this from Needs Triage to Review In Progress in Pull Request Review Nov 25, 2025
@fuweid fuweid added this pull request to the merge queue Nov 25, 2025
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Nov 25, 2025
@fuweid fuweid added this pull request to the merge queue Nov 25, 2025
Merged via the queue into containerd:main with commit 576b52a Nov 25, 2025
90 of 92 checks passed
@github-project-automation github-project-automation Bot moved this from Review In Progress to Done in Pull Request Review Nov 25, 2025
@fuweid fuweid deleted the fix-loopback-testcase branch November 25, 2025 21:52
Member Author

fuweid commented Nov 25, 2025

/cherry-pick release/2.2

@k8s-infra-cherrypick-robot

@fuweid: new pull request created: #12577

@fuweid fuweid added the cherry-pick/2.2.x Change to be cherry picked to release/2.2 branch label Nov 25, 2025
@fuweid fuweid added cherry-picked/2.2.x PR commits are cherry-picked into release/2.2 branch and removed cherry-pick/2.2.x Change to be cherry picked to release/2.2 branch labels Dec 1, 2025
Labels: cherry-picked/2.2.x (PR commits are cherry-picked into release/2.2 branch), kind/test, size/M
