[Bug]: Ryuk reaper container not properly re-initialized after it got terminated #764
Description
Testcontainers version
0.15.0
Using the latest Testcontainers version?
Yes
Host OS
Linux
Host arch
x86
Go version
1.19
Docker version
$ docker version
Client: Docker Engine - Community
Version: 23.0.0-rc.1
API version: 1.42
Go version: go1.19.4
Git commit: 139e924
Built: Thu Dec 22 23:37:13 2022
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 23.0.0-rc.1
API version: 1.42 (minimum version 1.12)
Go version: go1.19.4
Git commit: cba986b
Built: Thu Dec 22 23:37:13 2022
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.14
GitCommit: 9ba4b250366a5ddde94bb7c9d1def331423aa323
runc:
Version: 1.1.4
GitCommit: v1.1.4-0-g5fd4c4d
docker-init:
Version: 0.19.0
GitCommit: de40ad0
Docker info
$ docker info
Client:
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.9.1
Path: /home/miel/.docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.14.2
Path: /usr/libexec/docker/cli-plugins/docker-compose
Server:
Containers: 2
Running: 0
Paused: 0
Stopped: 2
Images: 25
Server Version: 23.0.0-rc.1
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9ba4b250366a5ddde94bb7c9d1def331423aa323
runc version: v1.1.4-0-g5fd4c4d
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.0.0-6-amd64
Operating System: Debian GNU/Linux bookworm/sid
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 31.09GiB
Name: housepaper
ID: TKCM:VE5M:466R:AMPV:XPP3:CZ45:BZRN:N6WX:YR4E:ZBR5:VYDD:5SWD
Docker Root Dir: /var/lib/docker
Debug Mode: false
Username: miel
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
What happened?
Tests fail with: `failed to create container: connecting to reaper failed: Connecting to Ryuk on localhost:32787 failed: dial tcp [::1]:32787: connect: connection refused`
I noticed the failing test(s) always ran directly after one other specific test (which itself succeeded). The only difference was that this test wasn't using a testcontainer.
Diving deeper, I noticed that the reaper container, once initialized, is normally reused:
```go
// If reaper already exists re-use it
if reaper != nil {
	return reaper, nil
}
```
(reaper.go, line 52 at commit 126aeb9)
This happened for my failing test as well, but the subsequent attempt to connect to the reused reaper failed here:
```go
conn, err := net.DialTimeout("tcp", r.Endpoint, 10*time.Second)
```
(reaper.go, line 127 at commit 126aeb9)
When moby-ryuk runs, it stays active only for as long as it has at least one open connection. Once all connections are gone, it shuts down after a fixed timeout of 10 seconds, which cannot be configured:
https://github.com/testcontainers/moby-ryuk/blob/8f512d37699cb6e5a3b28f2433c2b3b894a78e5d/main.go#L152
So what happens in my situation:
- Several integration tests run, each creating its own container but reusing the Ryuk reaper container (they run in close succession, within the 10s timeout)
- The integration test without a testcontainer runs. It takes about 20-30 seconds, so the reaper container shuts down
- The next integration test tries to create a container again, but aborts because it fails to connect to the 'existing' reaper container
The code should detect whether the Ryuk container is actually still running and, if it is not, create a new one instead of trying to reuse the now-gone instance.
Relevant log output
No response
Additional information
No response