cri: Fix image volumes with user namespaces#12816
Conversation
|
/release-note-none |
|
@rata May have a look? |
| // Check if the sandbox has Linux security context configured | ||
| if sandbox.Config.GetLinux() == nil { | ||
| return nil, nil | ||
| } | ||
|
|
||
| // Get namespace options and create idmap labels if user namespace is enabled | ||
| nsOpts := sandbox.Config.GetLinux().GetSecurityContext().GetNamespaceOptions() | ||
| snapshotOpts, err := snapshotterRemapOpts(nsOpts) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to get snapshotter remap options: %w", err) | ||
| } | ||
|
|
||
| return snapshotOpts, nil |
There was a problem hiding this comment.
I think this should call snapshotterOpts() that already calls snapshotterRemapOpts() and gets the usrens config things. Once we have the config from the sandbox, just call snapshotterOpts() and return that?
There was a problem hiding this comment.
ye, Sorry I ignored this method
There was a problem hiding this comment.
I think this should call
snapshotterOpts()that already callssnapshotterRemapOpts()and gets the usrens config things. Once we have the config from the sandbox, just callsnapshotterOpts()and return that?
Please ignore my reply above. I carefully examined the snapshotterOpts method and found that there are differences in parameter types. In addition, at the call point of mutateImageMount, only the sandbox ID is available, no container configuration, and the semantics look different. Image volume should use the sandbox configuration, and snapshotterOpts seems to be designed for container level rootfs snapshots
There was a problem hiding this comment.
Ohh, okay, I see the sandbox config vs container config thing. Next time please make explicit the difference if you don't mind.
But the mutateMounts() thing is called from createContainerRequest() where we do have all that info, so we can easily share the container config with those functions, if we want, right? I think we should pass what the function needs, instead of going from the sandbox via accessors to get the config.
I'm unsure if we should use the podconfig for this or not. Long ago I've added a commit to check podconfig and containerconfig are consistent: #12816 (comment).
I'm unsure what we should do. What do you think of passing the container config and just use that for the mappings?
There was a problem hiding this comment.
Ohh, okay, I see the sandbox config vs container config thing. Next time please make explicit the difference if you don't mind.
But the
mutateMounts()thing is called fromcreateContainerRequest()where we do have all that info, so we can easily share the container config with those functions, if we want, right? I think we should pass what the function needs, instead of going from the sandbox via accessors to get the config.I'm unsure if we should use the podconfig for this or not. Long ago I've added a commit to check podconfig and containerconfig are consistent: #12816 (comment).
I'm unsure what we should do. What do you think of passing the container config and just use that for the mappings?
It seems that the commit address you posted is not correct
In my opinion, there are several considerations
- Image volume is a pod level resource, and sandbox config should be used
- If the two are consistent, container config can also work
These two are the trade-offs between the readability of the reader's code and the reusability of the code. In my opinion, resources at the pod level should be configured at the pod level, which will benefit the later scalability
There was a problem hiding this comment.
Wait, the mappings are part of the mount message from cri! : https://github.com/kubernetes/cri-api/blob/ccff438b5dae5751e028196c6d4e167b0f745122/pkg/apis/runtime/v1/api.proto#L251-L253.
We should just use the mappings from there.
We should already have the mappings in mutateImageMount():
So we should just get it from there. Sorry I didn't realize yesterday, it was a crazy day.
There was a problem hiding this comment.
Wait, the mappings are part of the mount message from cri! : https://github.com/kubernetes/cri-api/blob/ccff438b5dae5751e028196c6d4e167b0f745122/pkg/apis/runtime/v1/api.proto#L251-L253.
We should just use the mappings from there.
We should already have the mappings in
mutateImageMount():. That is the cri runtime mount struct, that has the field.
So we should just get it from there. Sorry I didn't realize yesterday, it was a crazy day.
ye, I understand now. Maybe I was too sleepy yesterday, so I didn't find it, or I was confused
d998ed2 to
9032f0d
Compare
|
@rata I have modified it |
ffce9b5 to
8b70ec4
Compare
|
@rata I think the CI reports an error because of insufficient memory |
rata
left a comment
There was a problem hiding this comment.
Please, delete the .idea folder, I guess it was added by mistake.
Otherwise it looks fine, but there is a bug. CI is failing the test you added, not OOM. Search the logs (using the github search, not the one on the browser) for "fail:". You will see the test is failing: https://github.com/containerd/containerd/actions/runs/21437274115/job/61731279761?pr=12816#step:20:2230
Can you please fix it (not removing the assert, of course :D)
| stdout, stderr, err = runtimeService.ExecSync(cnID, []string{"stat", "-c", "%u:%g", "/image-mount/pause"}, 0) | ||
| require.NoError(t, err, "failed to stat file in image volume") | ||
| require.Len(t, stderr, 0) | ||
| require.Contains(t, string(stdout), "0:0", "files in image volume should appear as owned by root in container's user namespace") |
There was a problem hiding this comment.
There was a problem hiding this comment.
I will solve it immediately
There was a problem hiding this comment.
No problem. Just ping me when this is solved, so I take another look :)
There was a problem hiding this comment.
No problem. Just ping me when this is solved, so I take another look :)
I will give priority to solving this problem when I get up tomorrow, and you will see the perfect CI results tomorrow
There was a problem hiding this comment.
That is nice. But please do have a good rest :)
There was a problem hiding this comment.
That is nice. But please do have a good rest :)
Thank you for your concern. Unit testing seems to be ready now
073b6a1 to
a84b55c
Compare
557d717 to
65028d7
Compare
0509732 to
308e7a9
Compare
| stdout, stderr, err = runtimeService.ExecSync(cnID, []string{"stat", "-c", "%u:%g", "/image-mount/pause"}, 0) | ||
| require.NoError(t, err, "failed to stat file in image volume") | ||
| require.Len(t, stderr, 0) | ||
| require.Contains(t, string(stdout), "0:0", "files in image volume should appear as owned by root in container's user namespace") |
There was a problem hiding this comment.
This will be true if it returns 1000:05. In other tests (I think in the k/k repo) we are using stat -c '=%u=%g='. This will print something like =0=0=. This way you can match for that string exactly and we can be sure the UID is 0 and not some bigger number that ends with 0 (like 1000).
There was a problem hiding this comment.
This will be true if it returns
1000:05. In other tests (I think in the k/k repo) we are usingstat -c '=%u=%g='. This will print something like=0=0=. This way you can match for that string exactly and we can be sure the UID is 0 and not some bigger number that ends with 0 (like 1000).
Oh, my god, why didn't I take this into consideration, but I have finished the modification now
| UidMappings: []*runtime.IDMapping{ | ||
| { | ||
| ContainerId: 0, | ||
| HostId: 65536, | ||
| Length: 65536, | ||
| }, | ||
| }, |
There was a problem hiding this comment.
You can move this to a var just before the for, like mappings := []*runtime.IDMapiing{ ... . And the assign it here for uidMappings and GidMappings, and all the other test cases.
| } | ||
| } | ||
|
|
||
| func TestGetImageVolumeSnapshotOpts_InvalidMappings(t *testing.T) { |
There was a problem hiding this comment.
Can we move these cases to the previous function and set the expectError bool? Am I missing something?
apiVersion: v1
kind: Pod
metadata:
name: test-image-volume-userns
spec:
hostUsers: false
containers:
- name: test
image: registry.k8s.io/pause:3.9
volumeMounts:
- name: image-vol
mountPath: /image-mountvolumes:
- name: image-vol
image:
reference: registry.k8s.io/pause:3.9
pullPolicy: IfNotPresentDoes this suit your idea? |
99046ff to
9c0c878
Compare
673e987 to
bcd3720
Compare
|
@rata Can you help me take another look |
rata
left a comment
There was a problem hiding this comment.
The integration test is broken, it always skips and never runs. See for example the run on CI: https://github.com/containerd/containerd/actions/runs/21673112539/job/62486266065?pr=12816#step:20:2197. See my comment for more details.
Left some other comments. This is getting close. But please, can you verify things work in a k8s pod and all tests run fine before asking for review next time? :)
If you need help with something, don't hesitate to ask. Of course.
You can run the integration test locally with this:
PATH=/sbin:$PATH FOCUS=TestImageVolumeWithUserNamespace BUILDTAGS=no_btrfs make cri-integration
Signed-off-by: qiuxue <[email protected]>
bcd3720 to
db9546b
Compare
I feel that your approval is what I deserve, and I am happy that containerd has become better |
|
/cherry-pick release/2.2 |
|
@estesp: new pull request created: #12885 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
If we had support in 2.1 then the fix should go there as well. @fuweid had set the label on the original PR to just 2.2 so I hadn't personally looked if it was applicable there. If it won't apply cleanly, then someone will need to do a manual backport and create the PR for |
/kind bug
Fixes image volumes failing with "invalid argument" error when used with user namespaces.
Fixes #12812
This PR fixes a compatibility issue between image volumes and user namespaces by:
snapshotterRemapOpts()functionPrepare()call
It only affects pods with
hostUsers: falseand image volumes.