Skip to content

cri: Fix image volumes with user namespaces#12816

Merged
estesp merged 1 commit intocontainerd:mainfrom
AutuSnow:fix-image-volume-userns
Feb 10, 2026
Merged

cri: Fix image volumes with user namespaces#12816
estesp merged 1 commit intocontainerd:mainfrom
AutuSnow:fix-image-volume-userns

Conversation

@AutuSnow
Copy link
Copy Markdown
Contributor

/kind bug
Fixes image volumes failing with "invalid argument" error when used with user namespaces.

Fixes #12812

This PR fixes a compatibility issue between image volumes and user namespaces by:

  1. Retrieving the sandbox's user namespace configuration
  2. Creating idmap labels using the existing snapshotterRemapOpts() function
  3. Passing these labels to the snapshotter's Prepare() call
  4. Allowing the overlay snapshotter to apply uidmap/gidmap options to lower layers
    
    It only affects pods with hostUsers: false and image volumes.

@AutuSnow
Copy link
Copy Markdown
Contributor Author

/release-note-none

@dosubot dosubot Bot added the area/cri Container Runtime Interface (CRI) label Jan 24, 2026
@AutuSnow
Copy link
Copy Markdown
Contributor Author

AutuSnow commented Jan 26, 2026

@rata May have a look?

Copy link
Copy Markdown
Contributor

@rata rata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AutuSnow awesome, thanks a lot for the PR!

It seems good, left a small comment. Can we have an end to end test? Is this tested in critools too? In the future we should add tests there too

Comment on lines +106 to +118
// Check if the sandbox has Linux security context configured
if sandbox.Config.GetLinux() == nil {
return nil, nil
}

// Get namespace options and create idmap labels if user namespace is enabled
nsOpts := sandbox.Config.GetLinux().GetSecurityContext().GetNamespaceOptions()
snapshotOpts, err := snapshotterRemapOpts(nsOpts)
if err != nil {
return nil, fmt.Errorf("failed to get snapshotter remap options: %w", err)
}

return snapshotOpts, nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should call snapshotterOpts() that already calls snapshotterRemapOpts() and gets the usrens config things. Once we have the config from the sandbox, just call snapshotterOpts() and return that?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ye, Sorry I ignored this method

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should call snapshotterOpts() that already calls snapshotterRemapOpts() and gets the usrens config things. Once we have the config from the sandbox, just call snapshotterOpts() and return that?

Please ignore my reply above. I carefully examined the snapshotterOpts method and found that there are differences in parameter types. In addition, at the call point of mutateImageMount, only the sandbox ID is available, no container configuration, and the semantics look different. Image volume should use the sandbox configuration, and snapshotterOpts seems to be designed for container level rootfs snapshots

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh, okay, I see the sandbox config vs container config thing. Next time please make explicit the difference if you don't mind.

But the mutateMounts() thing is called from createContainerRequest() where we do have all that info, so we can easily share the container config with those functions, if we want, right? I think we should pass what the function needs, instead of going from the sandbox via accessors to get the config.

I'm unsure if we should use the podconfig for this or not. Long ago I've added a commit to check podconfig and containerconfig are consistent: #12816 (comment).

I'm unsure what we should do. What do you think of passing the container config and just use that for the mappings?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh, okay, I see the sandbox config vs container config thing. Next time please make explicit the difference if you don't mind.

But the mutateMounts() thing is called from createContainerRequest() where we do have all that info, so we can easily share the container config with those functions, if we want, right? I think we should pass what the function needs, instead of going from the sandbox via accessors to get the config.

I'm unsure if we should use the podconfig for this or not. Long ago I've added a commit to check podconfig and containerconfig are consistent: #12816 (comment).

I'm unsure what we should do. What do you think of passing the container config and just use that for the mappings?

It seems that the commit address you posted is not correct
In my opinion, there are several considerations

  1. Image volume is a pod level resource, and sandbox config should be used
  2. If the two are consistent, container config can also work
    These two are the trade-offs between the readability of the reader's code and the reusability of the code. In my opinion, resources at the pod level should be configured at the pod level, which will benefit the later scalability

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, the mappings are part of the mount message from cri! : https://github.com/kubernetes/cri-api/blob/ccff438b5dae5751e028196c6d4e167b0f745122/pkg/apis/runtime/v1/api.proto#L251-L253.

We should just use the mappings from there.

We should already have the mappings in mutateImageMount():

. That is the cri runtime mount struct, that has the field.

So we should just get it from there. Sorry I didn't realize yesterday, it was a crazy day.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, the mappings are part of the mount message from cri! : https://github.com/kubernetes/cri-api/blob/ccff438b5dae5751e028196c6d4e167b0f745122/pkg/apis/runtime/v1/api.proto#L251-L253.

We should just use the mappings from there.

We should already have the mappings in mutateImageMount():

. That is the cri runtime mount struct, that has the field.
So we should just get it from there. Sorry I didn't realize yesterday, it was a crazy day.

ye, I understand now. Maybe I was too sleepy yesterday, so I didn't find it, or I was confused

@AutuSnow AutuSnow requested a review from rata January 27, 2026 10:27
Copy link
Copy Markdown
Contributor

@rata rata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AutuSnow I reviewed a few minutes before you requested already :)

@AutuSnow
Copy link
Copy Markdown
Contributor Author

@rata I have modified it

@rata
Copy link
Copy Markdown
Contributor

rata commented Jan 28, 2026

@AutuSnow did you forget to push? I don't see any changes regarding this

@AutuSnow AutuSnow force-pushed the fix-image-volume-userns branch 2 times, most recently from ffce9b5 to 8b70ec4 Compare January 28, 2026 11:57
@AutuSnow
Copy link
Copy Markdown
Contributor Author

@AutuSnow did you forget to push? I don't see any changes regarding this

It seems that I forgot to push it just now. Now I can look at it again. It's really a forgetful brain

@AutuSnow
Copy link
Copy Markdown
Contributor Author

@rata I think the CI reports an error because of insufficient memory

Copy link
Copy Markdown
Contributor

@rata rata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, delete the .idea folder, I guess it was added by mistake.

Otherwise it looks fine, but there is a bug. CI is failing the test you added, not OOM. Search the logs (using the github search, not the one on the browser) for "fail:". You will see the test is failing: https://github.com/containerd/containerd/actions/runs/21437274115/job/61731279761?pr=12816#step:20:2230

Can you please fix it (not removing the assert, of course :D)

Comment thread .idea/workspace.xml Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, delete this file

Comment thread integration/image_volume_linux_test.go Outdated
stdout, stderr, err = runtimeService.ExecSync(cnID, []string{"stat", "-c", "%u:%g", "/image-mount/pause"}, 0)
require.NoError(t, err, "failed to stat file in image volume")
require.Len(t, stderr, 0)
require.Contains(t, string(stdout), "0:0", "files in image volume should appear as owned by root in container's user namespace")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will solve it immediately

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. Just ping me when this is solved, so I take another look :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. Just ping me when this is solved, so I take another look :)

I will give priority to solving this problem when I get up tomorrow, and you will see the perfect CI results tomorrow

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is nice. But please do have a good rest :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is nice. But please do have a good rest :)

Thank you for your concern. Unit testing seems to be ready now

@github-project-automation github-project-automation Bot moved this from Needs Triage to Needs Update in Pull Request Review Jan 28, 2026
@AutuSnow AutuSnow force-pushed the fix-image-volume-userns branch from 073b6a1 to a84b55c Compare January 28, 2026 15:32
@AutuSnow AutuSnow force-pushed the fix-image-volume-userns branch from 557d717 to 65028d7 Compare January 29, 2026 02:20
@AutuSnow AutuSnow force-pushed the fix-image-volume-userns branch 2 times, most recently from 0509732 to 308e7a9 Compare January 29, 2026 03:02
Copy link
Copy Markdown
Contributor

@rata rata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AutuSnow thanks a lot! It looks quite well, left some small comments on the tests.

Do you happen to have a k8s pod yaml handy, so I can test this locally too?

Comment thread integration/image_volume_linux_test.go Outdated
Comment on lines +390 to +393
stdout, stderr, err = runtimeService.ExecSync(cnID, []string{"stat", "-c", "%u:%g", "/image-mount/pause"}, 0)
require.NoError(t, err, "failed to stat file in image volume")
require.Len(t, stderr, 0)
require.Contains(t, string(stdout), "0:0", "files in image volume should appear as owned by root in container's user namespace")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be true if it returns 1000:05. In other tests (I think in the k/k repo) we are using stat -c '=%u=%g='. This will print something like =0=0=. This way you can match for that string exactly and we can be sure the UID is 0 and not some bigger number that ends with 0 (like 1000).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be true if it returns 1000:05. In other tests (I think in the k/k repo) we are using stat -c '=%u=%g='. This will print something like =0=0=. This way you can match for that string exactly and we can be sure the UID is 0 and not some bigger number that ends with 0 (like 1000).

Oh, my god, why didn't I take this into consideration, but I have finished the modification now

Comment on lines +41 to +47
UidMappings: []*runtime.IDMapping{
{
ContainerId: 0,
HostId: 65536,
Length: 65536,
},
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can move this to a var just before the for, like mappings := []*runtime.IDMapiing{ ... . And the assign it here for uidMappings and GidMappings, and all the other test cases.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
}

func TestGetImageVolumeSnapshotOpts_InvalidMappings(t *testing.T) {
Copy link
Copy Markdown
Contributor

@rata rata Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move these cases to the previous function and set the expectError bool? Am I missing something?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@AutuSnow
Copy link
Copy Markdown
Contributor Author

@AutuSnow thanks a lot! It looks quite well, left some small comments on the tests.

Do you happen to have a k8s pod yaml handy, so I can test this locally too?

  apiVersion: v1
  kind: Pod
  metadata:
    name: test-image-volume-userns
  spec:
    hostUsers: false
    containers:
    - name: test
      image: registry.k8s.io/pause:3.9
      volumeMounts:
      - name: image-vol
        mountPath: /image-mountvolumes:
    - name: image-vol
      image:
        reference: registry.k8s.io/pause:3.9
        pullPolicy: IfNotPresent

Does this suit your idea?

@AutuSnow AutuSnow force-pushed the fix-image-volume-userns branch from 99046ff to 9c0c878 Compare January 29, 2026 12:56
@AutuSnow AutuSnow force-pushed the fix-image-volume-userns branch 2 times, most recently from 673e987 to bcd3720 Compare February 4, 2026 13:22
@AutuSnow
Copy link
Copy Markdown
Contributor Author

AutuSnow commented Feb 6, 2026

@rata Can you help me take another look

Copy link
Copy Markdown
Contributor

@rata rata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The integration test is broken, it always skips and never runs. See for example the run on CI: https://github.com/containerd/containerd/actions/runs/21673112539/job/62486266065?pr=12816#step:20:2197. See my comment for more details.

Left some other comments. This is getting close. But please, can you verify things work in a k8s pod and all tests run fine before asking for review next time? :)

If you need help with something, don't hesitate to ask. Of course.

You can run the integration test locally with this:

PATH=/sbin:$PATH FOCUS=TestImageVolumeWithUserNamespace BUILDTAGS=no_btrfs  make cri-integration

Comment thread internal/cri/server/container_image_mount_linux.go Outdated
Comment thread internal/cri/server/container_image_mount_linux.go
Comment thread internal/cri/server/container_image_mount_linux_test.go Outdated
Comment thread internal/cri/server/container_image_mount_linux_test.go Outdated
Comment thread integration/image_volume_linux_test.go Outdated
@AutuSnow AutuSnow force-pushed the fix-image-volume-userns branch from bcd3720 to db9546b Compare February 6, 2026 15:41
Copy link
Copy Markdown
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fuweid fuweid added the cherry-pick/2.2.x Change to be cherry picked to release/2.2 branch label Feb 7, 2026
Copy link
Copy Markdown
Contributor

@rata rata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! @AutuSnow Thanks a lot for the patience, all the iterations and the quick responses!

@AutuSnow
Copy link
Copy Markdown
Contributor Author

AutuSnow commented Feb 9, 2026

LGTM! @AutuSnow Thanks a lot for the patience, all the iterations and the quick responses!

I feel that your approval is what I deserve, and I am happy that containerd has become better

@AutuSnow AutuSnow requested a review from rata February 9, 2026 09:35
Copy link
Copy Markdown
Contributor

@rata rata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-project-automation github-project-automation Bot moved this from Needs Update to Review In Progress in Pull Request Review Feb 10, 2026
@estesp estesp added this pull request to the merge queue Feb 10, 2026
Merged via the queue into containerd:main with commit 81ac9a8 Feb 10, 2026
54 checks passed
@github-project-automation github-project-automation Bot moved this from Review In Progress to Done in Pull Request Review Feb 10, 2026
@estesp
Copy link
Copy Markdown
Member

estesp commented Feb 10, 2026

/cherry-pick release/2.2

@k8s-infra-cherrypick-robot
Copy link
Copy Markdown

@estesp: new pull request created: #12885

Details

In response to this:

/cherry-pick release/2.2

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@rata
Copy link
Copy Markdown
Contributor

rata commented Feb 11, 2026

@estesp do we backport this to 2.1 too? Support for OCI images as source volumes was added in this PR: #10579. That commit is part of 2.1.0 too (git tag --contains 1ec10d9ae7535ddd7b18e3c21b6cd8ff12a2f90d shows it).

But the patch doesn't apply cleanly there.

@estesp
Copy link
Copy Markdown
Member

estesp commented Feb 11, 2026

@estesp do we backport this to 2.1 too? Support for OCI images as source volumes was added in this PR: #10579. That commit is part of 2.1.0 too (git tag --contains 1ec10d9ae7535ddd7b18e3c21b6cd8ff12a2f90d shows it).

But the patch doesn't apply cleanly there.

If we had support in 2.1 then the fix should go there as well. @fuweid had set the label on the original PR to just 2.2 so I hadn't personally looked if it was applicable there.

If it won't apply cleanly, then someone will need to do a manual backport and create the PR for release/2.1 branch.

@rata
Copy link
Copy Markdown
Contributor

rata commented Feb 13, 2026

@estesp verified it is affected with a real pod, and opened the backport that fixes it: #12894. Thanks!

@chrishenzie chrishenzie added cherry-picked/2.1.x PR commits are cherry picked into the release/2.1 branch cherry-picked/2.2.x PR commits are cherry-picked into release/2.2 branch and removed cherry-pick/2.2.x Change to be cherry picked to release/2.2 branch labels Apr 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/cri Container Runtime Interface (CRI) cherry-picked/2.1.x PR commits are cherry picked into the release/2.1 branch cherry-picked/2.2.x PR commits are cherry-picked into release/2.2 branch kind/bug size/L

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

OCI/Image Volume Source doesn't work with user namespaces

7 participants