
Commit 6993e89

Author: Liron Levin
Run privileged containers when userns are specified
Following moby#19995 and moby#17409, this PR enables skipping userns remapping when creating a container (or when executing a command), thus allowing privileged containers to run side by side with userns-remapped containers.

The feature is enabled by specifying `--userns=host`, which skips remapping the user when userns are applied. If this flag is not specified, the existing behavior (which blocks specific privileged operations) remains.

Signed-off-by: Liron Levin <[email protected]>
1 parent b9361f0 commit 6993e89

12 files changed

Lines changed: 87 additions & 6 deletions


daemon/container_operations_unix.go

Lines changed: 7 additions & 4 deletions
@@ -218,11 +218,14 @@ func (daemon *Daemon) populateCommand(c *container.Container, env []string) erro
 	processConfig.Env = env

 	remappedRoot := &execdriver.User{}
-	rootUID, rootGID := daemon.GetRemappedUIDGID()
-	if rootUID != 0 {
-		remappedRoot.UID = rootUID
-		remappedRoot.GID = rootGID
+	if c.HostConfig.UsernsMode.IsPrivate() {
+		rootUID, rootGID := daemon.GetRemappedUIDGID()
+		if rootUID != 0 {
+			remappedRoot.UID = rootUID
+			remappedRoot.GID = rootGID
+		}
 	}
+
 	uidMap, gidMap := daemon.GetUIDGIDMaps()

 	if !daemon.seccompEnabled {
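The hunk above only applies the daemon's remapped root when the container's user namespace mode is private. A minimal, self-contained sketch of that decision follows. The `UsernsMode` and `User` types and the `remappedRootFor` helper are hypothetical stand-ins written for illustration (the real definitions are not part of the hunks shown here), and the two integer arguments stand in for the result of `daemon.GetRemappedUIDGID()`.

```go
package main

import "fmt"

// UsernsMode is a hypothetical stand-in for the container's user namespace
// mode. Only "" (use the daemon's remapping) and "host" are considered.
type UsernsMode string

// IsHost reports whether the container asked for the host user namespace.
func (m UsernsMode) IsHost() bool { return m == "host" }

// IsPrivate reports whether the container keeps the daemon's remapping.
func (m UsernsMode) IsPrivate() bool { return !m.IsHost() }

// User mirrors the UID/GID pair that the diff stores in remappedRoot.
type User struct {
	UID int
	GID int
}

// remappedRootFor returns the root identity for a container. daemonUID and
// daemonGID are assumed to come from the daemon's remapping configuration.
func remappedRootFor(mode UsernsMode, daemonUID, daemonGID int) User {
	root := User{}
	if mode.IsPrivate() && daemonUID != 0 {
		root.UID = daemonUID
		root.GID = daemonGID
	}
	return root
}

func main() {
	// Default mode: the remapped root is applied.
	fmt.Println(remappedRootFor("", 100000, 100000))
	// --userns=host: remapping is skipped and root stays 0:0.
	fmt.Println(remappedRootFor("host", 100000, 100000))
}
```

With `--userns=host` the zero-valued `User` is returned, which corresponds to leaving `remappedRoot` untouched in the patched `populateCommand`.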

daemon/daemon_unix.go

Lines changed: 1 addition & 1 deletion
@@ -429,7 +429,7 @@ func verifyPlatformContainerSettings(daemon *Daemon, hostConfig *containertypes.
 		logrus.Warnf("IPv4 forwarding is disabled. Networking will not work")
 	}
 	// check for various conflicting options with user namespaces
-	if daemon.configStore.RemappedRoot != "" {
+	if daemon.configStore.RemappedRoot != "" && hostConfig.UsernsMode.IsPrivate() {
 		if hostConfig.Privileged {
 			return warnings, fmt.Errorf("Privileged mode is incompatible with user namespaces")
 		}
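The effect of the added `IsPrivate()` condition can be sketched as a small standalone check. This is an illustration under assumptions, not the daemon's code: `checkUserns` is a hypothetical helper, its `remappedRoot` argument stands in for `daemon.configStore.RemappedRoot`, and `usernsMode` stands in for the container's `--userns` value. Only the privileged check visible in the hunk is modelled.

```go
package main

import (
	"errors"
	"fmt"
)

// checkUserns is a hypothetical reduction of the guarded block in the diff.
// It rejects privileged mode only when the daemon remaps users AND the
// container has not opted out with the "host" user namespace mode.
func checkUserns(remappedRoot, usernsMode string, privileged bool) error {
	usernsPrivate := usernsMode != "host"
	if remappedRoot != "" && usernsPrivate {
		if privileged {
			return errors.New("Privileged mode is incompatible with user namespaces")
		}
	}
	return nil
}

func main() {
	// Remapping enabled, default mode, privileged: rejected.
	fmt.Println(checkUserns("default", "", true))
	// Remapping enabled, --userns=host, privileged: allowed.
	fmt.Println(checkUserns("default", "host", true))
	// Remapping disabled on the daemon: allowed.
	fmt.Println(checkUserns("", "", true))
}
```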

docs/reference/api/docker_remote_api.md

Lines changed: 1 addition & 0 deletions
@@ -125,6 +125,7 @@ This section lists each version from latest to oldest. Each listing includes a
 * `GET /info` now returns `KernelMemory` field, showing if "kernel memory limit" is supported.
 * `POST /containers/create` now takes `PidsLimit` field, if the kernel is >= 4.3 and the pids cgroup is supported.
 * `GET /containers/(id or name)/stats` now returns `pids_stats`, if the kernel is >= 4.3 and the pids cgroup is supported.
+* `POST /containers/create` now allows you to override usernamespaces remapping and use privileged options for the container.
 * `POST /auth` now returns an `IdentityToken` when supported by a registry.

 ### v1.22 API changes

docs/reference/api/docker_remote_api_v1.23.md

Lines changed: 2 additions & 0 deletions
@@ -431,6 +431,8 @@ Json Parameters:
       The default is not to restart. (optional)
       An ever increasing delay (double the previous delay, starting at 100mS)
       is added before each restart to prevent flooding the server.
+- **UsernsMode** - Sets the usernamespace mode for the container when usernamespace remapping option is enabled.
+      supported values are: `host`.
 - **NetworkMode** - Sets the networking mode for the container. Supported
       standard values are: `bridge`, `host`, `none`, and `container:<name|id>`. Any other value is taken
       as a custom network's name to which this container should connect to.
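As a rough illustration of the new `UsernsMode` parameter, the sketch below builds a JSON body for `POST /containers/create`. The struct types and the `buildCreateBody` helper are hypothetical and cover only a few fields. Nesting `UsernsMode` under `HostConfig` is an assumption inferred from the `c.HostConfig.UsernsMode` access in the daemon code of this commit, so check the full API reference before relying on it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hostConfig is a hypothetical, heavily trimmed view of the HostConfig object.
type hostConfig struct {
	Privileged bool
	UsernsMode string // "host" is the only value documented in this commit
}

// createRequest is a hypothetical, trimmed body for POST /containers/create.
type createRequest struct {
	Image      string
	Cmd        []string
	HostConfig hostConfig
}

// buildCreateBody serializes a request for a privileged container that
// opts out of the daemon's user namespace remapping.
func buildCreateBody(image string, cmd []string) ([]byte, error) {
	req := createRequest{
		Image:      image,
		Cmd:        cmd,
		HostConfig: hostConfig{Privileged: true, UsernsMode: "host"},
	}
	return json.Marshal(req)
}

func main() {
	body, err := buildCreateBody("busybox", []string{"top"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```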

docs/reference/commandline/create.md

Lines changed: 3 additions & 0 deletions
@@ -83,6 +83,9 @@ Creates a new container.
       --shm-size=[]                 Size of `/dev/shm`. The format is `<number><unit>`. `number` must be greater than `0`. Unit is optional and can be `b` (bytes), `k` (kilobytes), `m` (megabytes), or `g` (gigabytes). If you omit the unit, the system uses bytes. If you omit the size entirely, the system uses `64m`.
   -t, --tty                         Allocate a pseudo-TTY
   -u, --user=""                     Username or UID
+      --userns=""                   Container user namespace
+                                    'host': Use the Docker host user namespace
+                                    '': Use the Docker daemon user namespace specified by `--userns-remap` option.
       --ulimit=[]                   Ulimit options
       --uts=""                      UTS namespace to use
   -v, --volume=[host-src:]container-dest[:<options>]

docs/reference/commandline/daemon.md

Lines changed: 10 additions & 0 deletions
@@ -750,6 +750,16 @@ following algorithm to create the mapping ranges:
 2. Map segments will be created from each range in increasing value with a length matching the length of each segment. Therefore the range segment with the lowest numeric starting value will be equal to the remapped root, and continue up through host uid/gid equal to the range segment length. As an example, if the lowest segment starts at ID 1000 and has a length of 100, then a map of 1000 -> 0 (the remapped root) up through 1100 -> 100 will be created from this segment. If the next segment starts at ID 10000, then the next map will start with mapping 10000 -> 101 up to the length of this second segment. This will continue until no more segments are found in the subordinate files for this user.
 3. If more than five range segments exist for a single user, only the first five will be utilized, matching the kernel's limitation of only five entries in `/proc/self/uid_map` and `proc/self/gid_map`.

+### Disable user namespace for a container
+
+If you enable user namespaces on the daemon, all containers are started
+with user namespaces enabled. In some situations you might want to disable
+this feature for a container, for example, to start a privileged container (see
+[user namespace known restrictions](#user-namespace-known-restrictions)).
+To enable those advanced features for a specific container use `--userns=host`
+in the `run/exec/create` command.
+This option will completely disable user namespace mapping for the container's user.
+
 ### User namespace known restrictions

 The following standard Docker features are currently incompatible when

docs/reference/commandline/run.md

Lines changed: 3 additions & 0 deletions
@@ -85,6 +85,9 @@ parent = "smn_cli"
       --stop-signal="SIGTERM"       Signal to stop a container
   -t, --tty                         Allocate a pseudo-TTY
   -u, --user=""                     Username or UID (format: <name|uid>[:<group|gid>])
+      --userns=""                   Container user namespace
+                                    'host': Use the Docker host user namespace
+                                    '': Use the Docker daemon user namespace specified by `--userns-remap` option.
       --ulimit=[]                   Ulimit options
       --uts=""                      UTS namespace to use
   -v, --volume=[host-src:]container-dest[:<options>]

integration-cli/docker_cli_userns_test.go

Lines changed: 22 additions & 1 deletion
@@ -37,11 +37,13 @@ func (s *DockerDaemonSuite) TestDaemonUserNamespaceRootSetting(c *check.C) {
 	gid, err := strconv.Atoi(uidgid[1])
 	c.Assert(err, checker.IsNil, check.Commentf("Can't parse gid"))

-	//writeable by the remapped root UID/GID pair
+	// writable by the remapped root UID/GID pair
 	c.Assert(os.Chown(tmpDir, uid, gid), checker.IsNil)

 	out, err := s.d.Cmd("run", "-d", "--name", "userns", "-v", tmpDir+":/goofy", "busybox", "sh", "-c", "touch /goofy/testfile; top")
 	c.Assert(err, checker.IsNil, check.Commentf("Output: %s", out))
+	user := s.findUser(c, "userns")
+	c.Assert(uidgid[0], checker.Equals, user)

 	pid, err := s.d.Cmd("inspect", "--format='{{.State.Pid}}'", "userns")
 	c.Assert(err, checker.IsNil, check.Commentf("Could not inspect running container: out: %q", pid))
@@ -62,4 +64,23 @@ func (s *DockerDaemonSuite) TestDaemonUserNamespaceRootSetting(c *check.C) {
 	c.Assert(err, checker.IsNil)
 	c.Assert(stat.UID(), checker.Equals, uint32(uid), check.Commentf("Touched file not owned by remapped root UID"))
 	c.Assert(stat.GID(), checker.Equals, uint32(gid), check.Commentf("Touched file not owned by remapped root GID"))
+
+	// use host usernamespace
+	out, err = s.d.Cmd("run", "-d", "--name", "userns_skip", "--userns", "host", "busybox", "sh", "-c", "touch /goofy/testfile; top")
+	c.Assert(err, checker.IsNil, check.Commentf("Output: %s", out))
+	user = s.findUser(c, "userns_skip")
+	// userns are skipped, user is root
+	c.Assert(user, checker.Equals, "root")
+}
+
+// findUser finds the uid or name of the user of the first process that runs in a container
+func (s *DockerDaemonSuite) findUser(c *check.C, container string) string {
+	out, err := s.d.Cmd("top", container)
+	c.Assert(err, checker.IsNil, check.Commentf("Output: %s", out))
+	rows := strings.Split(out, "\n")
+	if len(rows) < 2 {
+		// No process rows founds
+		c.FailNow()
+	}
+	return strings.Fields(rows[1])[0]
 }
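The parsing step inside `findUser` can be tried in isolation. The sketch below is a hypothetical standalone variant, `firstProcessUser`, that takes the text of `docker top` as a string instead of invoking the daemon. The sample header and row in `main` are assumed output, not captured from a real run. Unlike the helper in the diff, it also guards against an empty second row, which would otherwise panic on the `[0]` index when the output is just a header followed by a trailing newline.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// firstProcessUser returns the first column of the first process row,
// that is, the line right after the header of `docker top`-style output.
func firstProcessUser(topOutput string) (string, error) {
	rows := strings.Split(topOutput, "\n")
	if len(rows) < 2 {
		return "", errors.New("no process rows found")
	}
	fields := strings.Fields(rows[1])
	if len(fields) == 0 {
		return "", errors.New("first process row is empty")
	}
	return fields[0], nil
}

func main() {
	// Assumed sample output: one header line and one process row.
	sample := "UID PID PPID C STIME TTY TIME CMD\n" +
		"root 4321 4300 0 10:00 ? 00:00:00 top\n"
	user, err := firstProcessUser(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(user)
}
```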

man/docker-create.1.md

Lines changed: 5 additions & 0 deletions
@@ -58,6 +58,7 @@ docker-create - Create a new container
 [**-P**|**--publish-all**]
 [**-p**|**--publish**[=*[]*]]
 [**--pid**[=*[]*]]
+[**--userns**[=*[]*]]
 [**--pids-limit**[=*PIDS_LIMIT*]]
 [**--privileged**]
 [**--read-only**]
@@ -291,6 +292,10 @@ unit, `b` is used. Set LIMIT to `-1` to enable unlimited swap.
      **host**: use the host's PID namespace inside the container.
      Note: the host mode gives the container full access to local PID and is therefore considered insecure.

+**--userns**=""
+   Set the usernamespace mode for the container when `userns-remap` option is enabled.
+     **host**: use the host usernamespace and enable all privileged options (e.g., `pid=host` or `--privileged`).
+
 **--pids-limit**=""
    Tune the container's pids limit. Set `-1` to have unlimited pids for the container.

man/docker-run.1.md

Lines changed: 5 additions & 0 deletions
@@ -60,6 +60,7 @@ docker-run - Run a command in a new container
 [**-P**|**--publish-all**]
 [**-p**|**--publish**[=*[]*]]
 [**--pid**[=*[]*]]
+[**--userns**[=*[]*]]
 [**--pids-limit**[=*PIDS_LIMIT*]]
 [**--privileged**]
 [**--read-only**]
@@ -421,6 +422,10 @@ Use `docker port` to see the actual mapping: `docker port CONTAINER $CONTAINERPO
      **host**: use the host's PID namespace inside the container.
      Note: the host mode gives the container full access to local PID and is therefore considered insecure.

+**--userns**=""
+   Set the usernamespace mode for the container when `userns-remap` option is enabled.
+     **host**: use the host usernamespace and enable all privileged options (e.g., `pid=host` or `--privileged`).
+
 **--pids-limit**=""
    Tune the container's pids limit. Set `-1` to have unlimited pids for the container.
