
Evented Process Monitor #11529

@crosbymichael

Description


TLDR:

Pros:

  • Evented, lock-free code
  • One goroutine for all containers
  • Generic zombie-fighting abilities for free

Cons:

  • none

Currently Docker uses exec.Cmd and cmd.Wait() inside a goroutine to block on a container's process until it finishes. After a container dies, one of two things happens. One, the container's process is restarted, if the user requested that the container always restart. Two, we tear down the container, update its status with the exit code, and release any resources it held.

Writing a process monitor this way is not very efficient, and Docker is unable to handle SIGCHLD events to reap zombies that were not direct children of the daemon's process.

It also means one goroutine per container: if we have 1000 containers, we have 1000 blocking goroutines in the daemon. Booooo.

We can do better. The proper way is to move to an evented system for monitoring when child processes change state. This can be handled via SIGCHLD: a process can set up a signal handler for SIGCHLD, and when the status of a child process changes, this signal is delivered to the handler. We can use it to extract the pid and exit status and make decisions on how to handle the event.

Using an evented system like this, we can reduce the number of goroutines to 1 for N containers and also reduce the number of locks required to handle the previous level of concurrency. Running one container requires 1 goroutine; running 10k containers requires 1 goroutine. Win. This model also allows us to reap zombies (because zombies are bad, m'kay) in the daemon process that are not its direct children, i.e. processes other than a container's PID 1.

Sample code of what the process monitor would look like is as follows:

```go
package main

import (
    "os"
    "os/exec"
    "os/signal"
    "sync"
    "syscall"
    "time"

    "github.com/Sirupsen/logrus"
)

var (
    pidsICareAbout map[int]string
    // pidsMu guards pidsICareAbout: main registers pids in runSomething
    // while the handler goroutine deletes them. In the real design,
    // registration would happen on the event loop itself, making the
    // lock unnecessary.
    pidsMu sync.Mutex
)

func handleEvents(signals chan os.Signal, group *sync.WaitGroup) {
    defer group.Done()
    for sig := range signals {
        switch sig {
        case syscall.SIGTERM, syscall.SIGINT:
            // return cuz user said so
            return
        case syscall.SIGCHLD:
            // SIGCHLD is not queued: several children can exit behind a
            // single signal, so drain with WNOHANG until no exited
            // children remain.
            for {
                var (
                    status syscall.WaitStatus
                    usage  syscall.Rusage
                )
                pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, &usage)
                if err != nil {
                    // ECHILD just means there are no children left to reap.
                    if err != syscall.ECHILD {
                        logrus.Error(err)
                    }
                    break
                }
                if pid <= 0 {
                    // no more exited children right now
                    break
                }
                logrus.WithField("pid", pid).Info("process status changed")
                pidsMu.Lock()
                _, ok := pidsICareAbout[pid]
                if ok {
                    delete(pidsICareAbout, pid)
                }
                remaining := len(pidsICareAbout)
                pidsMu.Unlock()
                if !ok {
                    logrus.Infof("---> i don't care about %d", pid)
                    continue
                }
                logrus.Infof("i care about %d", pid)
                if remaining == 0 {
                    // return after everything is dead
                    return
                }
            }
        }
    }
}

func runSomething() error {
    // just add some delay for demo
    time.Sleep(1 * time.Second)
    cmd := exec.Command("sh", "-c", "sleep 5")
    // cmd.Start() is non blocking
    if err := cmd.Start(); err != nil {
        return err
    }
    // record the pid because I care about this one.
    logrus.WithField("pid", cmd.Process.Pid).Info("spawned new process")
    pidsMu.Lock()
    pidsICareAbout[cmd.Process.Pid] = "sleep 5"
    pidsMu.Unlock()
    return nil
}

func randomFork() error {
    // NOTE: raw fork(2) is not generally safe in a Go program (the child
    // inherits only the forking thread); this is demo-only code.
    syscall.ForkLock.Lock()
    pid, _, errno := syscall.RawSyscall(syscall.SYS_FORK, 0, 0, 0)
    syscall.ForkLock.Unlock()
    if errno != 0 {
        return errno
    }
    if pid == 0 {
        // child
        logrus.Info("i'm on a boat")
        os.Exit(0)
    }
    logrus.Infof("forked off %d", pid)
    return nil
}

func main() {
    signals := make(chan os.Signal, 1024)
    signal.Notify(signals, syscall.SIGCHLD, syscall.SIGTERM, syscall.SIGINT)
    pidsICareAbout = make(map[int]string)
    group := &sync.WaitGroup{}
    group.Add(1)
    go handleEvents(signals, group)
    for i := 0; i < 5; i++ {
        if err := runSomething(); err != nil {
            logrus.Fatal(err)
        }
    }
    // fork off a random process that we don't care about but make sure the
    // signal handler reaps it when it dies.
    if err := randomFork(); err != nil {
        logrus.Error(err)
    }
    logrus.Info("waiting on processes to finish")
    group.Wait()
    logrus.Info("all processes are done, exiting...")
}
```

We should build this in a generic way, so that this monitor, with restart capabilities, is available to any type of process the daemon can spawn.

Labels: exp/expert, kind/enhancement, roadmap