[Bug]: StartLogProducer does not drain channel when EOF is received #1407

@jrfeenst

Description

Testcontainers version

0.21.0

Using the latest Testcontainers version?

Yes

Host OS

Linux

Host arch

x86

Go version

1.20.6

Docker version

Client:
 Version:           20.10.5+dfsg1
 API version:       1.41
 Go version:        go1.15.15
 Git commit:        55c4c88
 Built:             Mon May 30 18:34:49 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.5+dfsg1
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.15.15
  Git commit:       363e9a8
  Built:            Mon May 30 18:34:49 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.13~ds1
  GitCommit:        1.4.13~ds1-1~deb11u4
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d1
 docker-init:
  Version:          0.19.0
  GitCommit:

Docker info

Client:
 Context:    default
 Debug Mode: false

Server:
 Containers: 38
  Running: 1
  Paused: 0
  Stopped: 37
 Images: 682
 Server Version: 20.10.5+dfsg1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux nvidia runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 1.4.13~ds1-1~deb11u4
 runc version: v1.1.4-0-g5fd4c4d1
 init version: 
 Security Options:
  apparmor
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.10.0-23-amd64
 Operating System: Debian GNU/Linux 11 (bullseye)
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 62.4GiB
 Name: ah-jfeenstr2-l
 ID: UZUR:45Z4:3MFL:6P5B:HGFF:7IQC:FJD2:XJ2J:K72Q:MCP2:4DI2:IS7S
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: jrfeenst
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Default Address Pools:
   Base: 192.168.0.0/16, Size: 24

WARNING: Support for cgroup v2 is experimental

What happened?

I have a container which exits on its own. I use a log consumer/producer to copy its output to stdout. When the container finishes, the producer code fails to handle the EOF properly. This then hangs the test -- I think because the Docker daemon still has an active log reader? If I don't start the log producer, the test runs and passes.
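
For context, the hang can be reproduced with a sketch along these lines (the hello-world image, the fixed sleep, and the stdoutConsumer type are placeholders I chose for illustration; any container that exits on its own before StopLogProducer is called should trigger it):

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/testcontainers/testcontainers-go"
)

// stdoutConsumer writes every log line it receives to stdout.
type stdoutConsumer struct{}

func (stdoutConsumer) Accept(l testcontainers.Log) {
	fmt.Print(string(l.Content))
}

func main() {
	ctx := context.Background()

	ctr, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
		ContainerRequest: testcontainers.ContainerRequest{
			Image: "hello-world", // placeholder: any image that exits on its own
		},
		Started: true,
	})
	if err != nil {
		panic(err)
	}

	ctr.FollowOutput(stdoutConsumer{})
	if err := ctr.StartLogProducer(ctx); err != nil {
		panic(err)
	}

	// Give the container time to finish; the producer goroutine hits EOF
	// on the log stream and returns without draining the stop channel.
	time.Sleep(5 * time.Second)

	// With the bug present, this call blocks forever: it sends on an
	// unbuffered channel that no goroutine is receiving from anymore.
	_ = ctr.StopLogProducer()
}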

This is the EOF-handling code in the library's log producer goroutine; it returns without anything ever draining the stop channel:

_, _ = fmt.Fprintf(os.Stderr, "container log error: %+v. %s", err, logStoppedForOutOfSyncMessage)
// if we would continue here, the next header-read will result into random data...
return

I think the issue is that an unbuffered Go channel is used to stop the producer, but in this error case the call to StopLogProducer hangs indefinitely because the producer goroutine has already exited and nothing receives from the channel. Making it a buffered channel fixes this, but it makes StopLogProducer non-blocking, which may break some log consumers. You could instead spawn a secondary goroutine to drain the stopProducer channel when this error is encountered. I've tested this out and it resolves the issue for me:

_, _ = fmt.Fprintf(os.Stderr, "container log error: %+v. %s", err, logStoppedForOutOfSyncMessage)
// if we would continue here, the next header-read will result into random data...
go func() {
	<-stop // drain the channel to unblock StopLogProducer
}()
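
To see the mechanism in isolation, here is a self-contained sketch of the channel behaviour (not the library's actual code): a send on an unbuffered channel blocks until another goroutine receives, and once the producer goroutine has returned, that receive never happens. The timeout exists only so the demo terminates and prints what happened.

package main

import (
	"fmt"
	"time"
)

func main() {
	// stop stands in for the producer's unbuffered stopProducer channel.
	stop := make(chan bool)

	// The "producer" goroutine hits EOF and returns early, so it never
	// reaches the receive from stop.
	go func() {
		// ... read loop encounters io.EOF, logs the error, returns ...
	}()

	// StopLogProducer sends on stop; with no receiver left, an unbuffered
	// send blocks forever.
	select {
	case stop <- true:
		fmt.Println("producer acknowledged stop")
	case <-time.After(2 * time.Second):
		fmt.Println("send blocked: this is the indefinite hang in StopLogProducer")
	}
}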

Relevant log output

container log error: EOF. Stopping log consumer: Headers out of sync
2023/08/01 11:58:44 ✋ Container stopped: 2da69cdc0555
2023/08/01 11:58:44 🐳 Terminating container: 4e9083e61ffd

Additional information

No response
