Description
When the docker daemon is configured with "--bridge none" and containers are launched via "docker run" without a --network option, stale network sandboxes accumulate. On the next restart of the docker daemon, docker removes these stale network sandboxes, which can substantially slow docker startup when there are many of them or when docker is running with limited resources.
I started investigating this issue because at some point I had a machine where docker took quite a few minutes to start and I was wondering what was going on.
Granted, it might be considered a bit bogus to combine "--bridge none" on the docker daemon with running (some) containers without specifying the --network option, which is something I've since adjusted (to explicitly pass --network none). But the docker behavior seems a bit misleading, and I wonder whether it hides another issue: network sandboxes should not become stale in the first place.
Reproduce
Given I have the following in /etc/docker/daemon.json
And I restart docker
And I launch a few containers using something like:
for i in $(seq 50); do docker run --rm busybox echo $i; done
When I restart docker
Then I see in the docker logs 50 occurrences of Removing stale sandbox <sandbox ID> (<container ID>) (e.g. "Removing stale sandbox d7548f4571ba16a4b925a11b75088f36148164993e43085c680075257dbb1c9f (9a64aa6aeadc4817088db55c6ac520b69e275e248045d895ee6fac283e762823)")
And ultimately docker takes more time to start (this is particularly visible on low-power machines and when there are many (>1000) stale sandboxes)
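To check how many stale sandboxes were cleaned up on a given start, the log message quoted above can be counted. This is only a minimal sketch: the helper name count_stale_sandboxes is hypothetical, and the usage line assumes a systemd host where the daemon logs to the journal under a unit named docker.service, which may not match every setup.

```shell
# Count "Removing stale sandbox" messages in daemon log text read from stdin.
# Hypothetical helper; it only matches the message text quoted in this report.
count_stale_sandboxes() {
  # grep -c prints the number of matching lines; "|| true" keeps the exit
  # status at 0 when there are no matches (grep itself exits 1 in that case).
  grep -c 'Removing stale sandbox' || true
}

# Example usage (assumes systemd and a unit named docker.service):
#   journalctl -u docker.service -b --no-pager | count_stale_sandboxes
```

With the reproduce steps above, the count after the second restart would be expected to match the number of containers launched.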
Expected behavior
No stale sandbox removed when docker is restarted.
docker version
Client: Docker Engine - Community
Version: 24.0.6
API version: 1.43
Go version: go1.20.7
Git commit: ed223bc
Built: Mon Sep 4 12:32:10 2023
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 24.0.6
API version: 1.43 (minimum version 1.12)
Go version: go1.20.7
Git commit: 1a79695
Built: Mon Sep 4 12:32:10 2023
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.24
GitCommit: 61f9fd88f79f081d64d6fa3bb1a0dc71ec870523
runc:
Version: 1.1.9
GitCommit: v1.1.9-0-gccaecfc
docker-init:
Version: 0.19.0
GitCommit: de40ad0
docker info
Client: Docker Engine - Community
Version: 24.0.6
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.11.2
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.21.0
Path: /usr/libexec/docker/cli-plugins/docker-compose
Server:
Containers: 1
Running: 0
Paused: 0
Stopped: 1
Images: 5
Server Version: 24.0.6
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 61f9fd88f79f081d64d6fa3bb1a0dc71ec870523
runc version: v1.1.9-0-gccaecfc
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.1.0-12-amd64
Operating System: Debian GNU/Linux 12 (bookworm)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 724.1MiB
Name: debian
ID: 175bab96-368d-4b39-8d68-57ed259ab284
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Additional Info
No response