Proposal: Entitlements in Moby #32801

@n4ss

Entitlements in Moby

This issue captures a draft design proposal for an entitlement mechanism that can be leveraged by Moby and other container management platforms to describe what additional permissions a specific service should be allowed to have when executing.

An entitlement is a single right granted to a particular service or container that gives it permissions beyond those it would ordinarily have. It is a piece of configuration included in the service spec that tells the container engine executing the service to allow access to certain resources or to permit certain operations. In effect, an entitlement extends the sandbox and capabilities of your service to allow a particular operation to occur.

The Docker CLI currently supports over 100 command-line flags. By implementing an entitlement mechanism, we plan to allow downstream consumers of Moby, such as Docker, to unify all of the security-related flags into a single mechanism that is granular enough to be useful, platform independent, and understandable by a non-expert user. This mechanism can be seen as Moby's equivalent of Apple's App Store permission model, where apps are granted capabilities to operate beyond the normal privileges of an application, such as access to the keychain or the ability to enable push notifications.

Goal

The goal is to simplify the way downstream users ask for permissions for their containers/services.

In order for a service to use a specific entitlement, access to that entitlement has to be granted. The objective is to have a grant mechanism that would look like this:

Specifying one or more entitlements on the command line:

docker run --entitlements=[entitlement1] --entitlements=[entitlement2] alpine
docker service create --entitlements=[entitlement] alpine
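The repeated-flag form above can be modeled in a few lines. The following is only a sketch of how a front end might collect several `--entitlements` occurrences into one list; `parse_run_args` is a hypothetical helper, not part of the Docker CLI, and it handles nothing beyond the flag and the image name shown in the proposal.

```python
import argparse


def parse_run_args(argv):
    """Parse a simplified `run`-style argument list.

    Hypothetical helper: only the repeatable --entitlements flag and the
    image name from the proposal are modeled here.
    """
    parser = argparse.ArgumentParser(prog="run")
    # action="append" collects every occurrence of the flag into a list.
    parser.add_argument("--entitlements", action="append", default=[])
    parser.add_argument("image")
    args = parser.parse_args(argv)

    # Drop duplicates while preserving the order given on the command line.
    entitlements = list(dict.fromkeys(args.entitlements))
    return args.image, entitlements


if __name__ == "__main__":
    image, granted = parse_run_args(
        ["--entitlements=network.admin", "--entitlements=security.admin", "docker:dind"]
    )
    print(image, granted)
```

Keeping the command-line order makes it possible for a later stage to apply entitlements in sequence, should ordering turn out to matter.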

Current proposal for entitlements

| Entitlement | Privileges | Capabilities | Blocked syscalls | On Windows |
| --- | --- | --- | --- | --- |
| api.access | API access management, defaults provided for the Engine and Swarm APIs | | | |
| host.processes.none | Do not share host's PID namespace | | | N/A |
| host.processes.admin | Shares host's PID namespace | | | N/A |
| host.devices.none | RO for sysfs; no additional non-default mounts; no RW on /proc/kcore | | | N/A |
| host.devices.view | RO on non-default mounts | | | N/A |
| host.devices.mount | Add SYS_ADMIN and allow a device to be mounted in | | | N/A |
| network.none | No access to /proc/pid/net, /proc/sys/net; no access to /sys/class/net | No NET_ADMIN, NET_BIND_SERVICE, NET_RAW, NET_BROADCAST | socket, socketpair, setsockopt, getsockopt, getsockname, getpeername, bind, listen, accept, accept4, connect, shutdown, recvfrom, recvmsg, sendto, sendmsg, sendmmsg, sethostname, setdomainname, bpf | |
| network.user | | CAP_NET_RAW, CAP_NET_BIND_SERVICE?, CAP_NET_BROADCAST? | sethostname, setdomainname, bpf, setsockopt(SO_DEBUG) | |
| network.proxy | | Add: CAP_NET_RAW, CAP_NET_BROADCAST, CAP_NET_BIND_SERVICE | | |
| network.admin | | CAP_NET_ADMIN, CAP_NET_BROADCAST, CAP_NET_BIND_SERVICE, CAP_NET_RAW | | |
| security.confined | Block access to sensitive paths: /sys/kernel/security, /sys/kernel/debug (ftrace), /sys/kernel/livepatch, /sys/fs/selinux, /sys/fs/cgroup, debugfs, securityfs, selinuxfs, /proc/sys/kernel/, /proc/config.gz, /boot, /proc/{mem,cpu,kcore,kmem,sysrq-trigger,bus}; no MAC/DAC policy read/write or configuration/state change; NoNewPrivileges activated | Drop: CAP_MAC_*, CAP_DAC_*, CAP_SETPCAP, SYS_PTRACE, CAP_SET_*, CAP_FSETID, CAP_SYS_ADMIN | bpf, ptrace, seccomp, arch_prctl, personality, setuid/setgid?, madvise, prctl(PR_CAPBSET_DROP, PR_SET_*, ..) | |
| security.view | Read-only rights on sensitive filesystems / fs directories and MAC/DAC policies | Add: CAP_MAC_*, CAP_DAC_*, CAP_SETPCAP; Drop: CAP_LINUX_IMMUTABLE?, CAP_SET_*, CAP_FSETID, SYS_PTRACE, CAP_SYS_ADMIN | | |
| security.admin | | Add: CAP_MAC_*, CAP_DAC_*, CAP_LINUX_IMMUTABLE, CAP_SYS_MODULE, CAP_SYS_PTRACE, CAP_SYSLOG, CAP_FSETID, CAP_SYS_BOOT | | |
| security.read-only | Mounts the container's filesystem as read-only | | | |
| resources.limit=value | value is a percentage of available resources in the container at launch context for: Pids, perf_event, blkio, hugetlb, freezer, net_cls, net_prio, cpuset, memory, systemd. Set ulimits properly | | | |
| debug security.unconfined | | Add: CAP_SYS_ADMIN, CAP_SYS_PTRACE, CAP_SYSLOG | | |
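To make the capability column concrete, here is a minimal sketch of how the `network.*` rows could translate into add and drop sets applied to a baseline. The baseline set, the dictionary layout, and the rule that later entitlements win over earlier ones are all assumptions made for illustration; the proposal does not specify them. `network.user` is left out because its capabilities are still marked with question marks.

```python
# Capability names taken from the network.* rows of the proposal's table.
NETWORK_CAPS = frozenset(
    {"CAP_NET_ADMIN", "CAP_NET_BIND_SERVICE", "CAP_NET_RAW", "CAP_NET_BROADCAST"}
)

# Hypothetical encoding: each entitlement lists capabilities to add and to drop.
ENTITLEMENT_CAPS = {
    "network.none": {"add": frozenset(), "drop": NETWORK_CAPS},
    "network.proxy": {
        "add": frozenset({"CAP_NET_RAW", "CAP_NET_BROADCAST", "CAP_NET_BIND_SERVICE"}),
        "drop": frozenset(),
    },
    "network.admin": {"add": NETWORK_CAPS, "drop": frozenset()},
}


def effective_capabilities(baseline, entitlements):
    """Apply each entitlement's add/drop sets to a baseline, in order.

    Assumption: entitlements are applied sequentially, so a later one can
    override an earlier one. Unknown names raise KeyError.
    """
    caps = set(baseline)
    for name in entitlements:
        rule = ENTITLEMENT_CAPS[name]
        caps |= rule["add"]
        caps -= rule["drop"]
    return caps


if __name__ == "__main__":
    # Hypothetical baseline, not the engine's real default capability set.
    baseline = {"CAP_CHOWN", "CAP_NET_RAW"}
    print(sorted(effective_capabilities(baseline, ["network.none"])))
    print(sorted(effective_capabilities(baseline, ["network.admin"])))
```

Expressing each entitlement as data rather than code is one way to check the "non-overlapping" property mechanically, for example by comparing the add sets of two entitlements.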

Examples of use cases

The following examples show how entitlements could be used with the most downloaded images on Docker Hub. They will likely be revised as entitlements are added or adjusted.

Users commonly run the following command for a variety of use cases:

docker run --privileged imagename:label

Entitlements narrow this by granting only the rights a workload needs. Some examples:

People who need to use raw sockets to control link discovery and aggregation:
Before:

docker run --privileged image:label

After:

docker run --entitlements=network.proxy image:label

Docker in Docker would probably look like:
Before:

docker run --privileged docker:dind

After:

docker run --entitlements=network.admin --entitlements=host.devices.admin --entitlements=security.admin docker:dind
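Note that the Docker-in-Docker command requests `host.devices.admin`, while the current entitlement table lists only `host.devices.none`, `host.devices.view`, and `host.devices.mount`. A simple name check would surface this kind of mismatch early. The sketch below is illustrative only: `unknown_entitlements` is a hypothetical function, and the name set is copied from the table, with the last row read as `security.unconfined` as an assumption.

```python
# Entitlement names as they appear in the proposal's table.
# Assumption: the row "debug security.unconfined" is read as security.unconfined.
KNOWN_ENTITLEMENTS = frozenset({
    "api.access",
    "host.processes.none", "host.processes.admin",
    "host.devices.none", "host.devices.view", "host.devices.mount",
    "network.none", "network.user", "network.proxy", "network.admin",
    "security.confined", "security.view", "security.admin",
    "security.read-only", "security.unconfined",
    "resources.limit",
})


def unknown_entitlements(requested):
    """Return the requested entitlements that are not in the known set.

    A parameterized entitlement such as "resources.limit=50" is matched on
    the part before the "=" sign.
    """
    unknown = []
    for name in requested:
        base = name.split("=", 1)[0]
        if base not in KNOWN_ENTITLEMENTS:
            unknown.append(name)
    return unknown


if __name__ == "__main__":
    dind = ["network.admin", "host.devices.admin", "security.admin"]
    print(unknown_entitlements(dind))
```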

Long term, a publisher should be able to tie a set of entitlements to an image, so the more examples we collect, the better.

Open Questions

Can the same privilege name mean different things from one Docker version to the next? We would be treating entitlements the way we treat default profiles.

Do all these entitlements also make sense on Windows?

Should we provide a way to create custom entitlements?

What we need from the community

  • Validation that these entitlements are: high-level enough, non-overlapping and correctly implemented using the lower level primitives available from Linux
  • Examples of use cases where the entitlements fit or don't fit, and whether they grant more privilege than users typically need
