Privileged docker container tries to assume more capabilities than available #42906

@smira

Description

tl;dr: docker run --privileged tries to assign all known caps to the container, while dockerd itself might not hold all of those caps.

Background: Talos, starting with version 0.13, drops two capabilities (kexec + module loading) from all processes except PID 1. Talos itself doesn't use dockerd, but if I launch a privileged pod on Kubernetes with the docker:20.10-dind image, I can't run any privileged container inside it:

/ # docker run -it --rm --privileged alpine
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: apply caps: operation not permitted: unknown.

The problem starts in

p.Capabilities.Bounding = caps.GetAllCapabilities()

Which essentially uses the list of capabilities built here:

moby/oci/caps/utils.go

Lines 23 to 37 in 306fa44

func init() {
	last := capability.CAP_LAST_CAP
	rawCaps := capability.List()
	allCaps = make([]string, min(int(last+1), len(rawCaps)))
	capabilityList = make(map[string]*capability.Cap, len(rawCaps))
	for i, c := range rawCaps {
		capName := "CAP_" + strings.ToUpper(c.String())
		if c > last {
			capabilityList[capName] = nil
			continue
		}
		allCaps[i] = capName
		capabilityList[capName] = &c
	}
}

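This list is treated as the set of capabilities available on the host, which might not be true: it contains every capability the host kernel knows about, and some of them might have already been dropped from the dockerd process.

containerd's OCI code does the proper thing: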

https://github.com/containerd/containerd/blob/d193dc2b8afb1467255cea5326e9807514f94c0f/pkg/cap/cap_linux.go#L123-L136

I'm happy to send a PR, but what is the best way to solve this in the docker codebase?
