--net=host fails for Docker-in-Docker with sysbox #45681

@spikecurtis

Description

When Docker runs inside a container that uses the sysbox container runtime (Docker-in-Docker), starting an inner container with --net=host fails like this:

$ docker run -it --rm --net=host alpine:latest whoami
Unable to find image 'alpine:latest' locally
latest: Pulling from library/alpine
8a49fdb3b6a5: Pull complete 
Digest: sha256:02bb6f428431fbc2809c5d1b41eab5a68350194fb508869a33cb1af4444c9b11
Status: Downloaded newer image for alpine:latest
docker: Error response from daemon: failed to create default sandbox: permission denied.

Using git bisect, I've determined that commit 0246332 introduced the bug.

I think the relevant code is

func createNetworkNamespace(path string, osCreate bool) error {
	if err := createNamespaceFile(path); err != nil {
		return err
	}

	do := func() error {
		return mountNetworkNamespace(fmt.Sprintf("/proc/self/task/%d/ns/net", unix.Gettid()), path)
	}
	if osCreate {
		return unshare.Go(unix.CLONE_NEWNET, do, nil)
	}
	return do()
}

When you use --net=host, osCreate is false, so do() is called directly rather than through unshare.Go. I think there is a subtle bug here: unix.Gettid() is called to fill in the /proc/ path, but without unshare the actual mount system call happens on a different thread. On a "regular" Docker install, with the daemon running in the root namespace, Linux lets this mount system call succeed. Inside sysbox, however, mounting the network namespace of one thread from a different thread is forbidden.

When I change the mount call to use /proc/thread-self/ns/net instead, things work fine. PR coming shortly.

Reproduce

  1. Install Docker and sysbox
  2. Run a generic Linux inner container, e.g. Ubuntu, with the sysbox container runtime
  3. In the inner container, install Docker v24
  4. In the inner container, run docker run -it --rm --net=host alpine:latest whoami

Expected behavior

$ docker run -it --rm --net=host alpine:latest whoami
root

docker version

Client: Docker Engine - Community
 Version:           24.0.2
 API version:       1.43
 Go version:        go1.20.4
 Git commit:        cb74dfc
 Built:             Thu May 25 21:52:22 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.2
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.4
  Git commit:       659604f
  Built:            Thu May 25 21:52:22 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.21
  GitCommit:        3dce8eb055cbb6872793272b4f20ed16117344f8
 runc:
  Version:          1.1.7
  GitCommit:        v1.1.7-0-g860f061
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client: Docker Engine - Community
 Version:    24.0.2
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.10.5
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx

Server:
 Containers: 1
  Running: 0
  Paused: 0
  Stopped: 1
 Images: 31
 Server Version: 24.0.2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
 runc version: v1.1.7-0-g860f061
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 5.4.0-1095-gke
 Operating System: Ubuntu 20.04.6 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 62.81GiB
 Name: dogfood
 ID: b35b4709-efd9-43f2-804f-5c95ddc10fb9
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  https://mirror.gcr.io/
 Live Restore Enabled: false

WARNING: No swap limit support

Additional Info

The docker version and docker info output above is from the inner Docker daemon.

Metadata

Labels: area/networking, kind/bug, status/0-triage, version/24.0
