Skip to content

Lock-free container proposal #30225

@cpuguy83

Description

@cpuguy83

Problem

The container object is used all over the code base as a place to store container configuration (both user config and config resolved by the daemon), runtime data.
Likewise the container's mutex (which is really the State object's mutex) is also exposed all over the place.

The container mutex is locked when both reading and writing to the container (as expected). The mutex is also locked during certain container operations. Because of this, reading and writing to the container object is very much tied to typically very slow operations, and as we've found operations that tend to be buggy (see netlink failures). This has led to a number of cases causing deadlocks.

Goal

Issues with a container should never prevent a user from querying the engine in any way, including the problem container itself.

Proposal

Make the container a lock-free object.
Swamkit already has a really nice model for this for storing each of it's object types using github.com/hashicorp/go-memdb. This same model should work for containers... see https://github.com/docker/swarmkit/tree/master/manager/state/store
This provides for indexing, filtering, partial matching (can drop usage of truncindex), and change notifications.

Essentially this means instead of passing around a single pointer object that each function reads from/writes to, we make (deep) copies. The object database contains the single source of truth, anything that needs to modify the container will need to modify it by committing changes back to the object store. The container object itself should not require any mutex since each object would only have 1 reader or writer at a time.

Because we'll be passing around copies, we need to make sure there is no "live" state being copied... for instance https://github.com/docker/docker/blob/f19a293dd741583c66001799435f784f2af455e0/container/state.go#L32... or for that matter https://github.com/docker/docker/blob/f19a293dd741583c66001799435f784f2af455e0/container/state.go#L18

Work to be done:

  1. (Split container runtime state from config #28483) Move "live" state off the container object (or at least contain it)
  2. Implement deep copy support for the container object and each of the objects it uses.
  3. Replace MemoryStore with go-memdb object store
    • Also requires replacing exec store
    • Needs to implement versioning on the container data (does not need to persist)
    • Implement notifications for changes to container state
  4. Add locking (per ID) at the container manager level (currently Daemon) to ensure that stateful actions against the container (start/stop/pause etc) are synchronized.

You can see a POC form of this here: master...cpuguy83:container_memdb

For step 1, this is implemented in the mentioned PR.

Step 2 is a little tricky, I do have a package to handle this (https://github.com/cpuguy83/go-generate/tree/master/deepcopy), but would prefer to use swarmkit's deep copy (generated from protobuf)... this is tricky because container uses so many different types and it may be difficult to force them to use protobufs.

Metadata

Metadata

Assignees

No one assigned

    Labels

    area/daemonCore Enginekind/enhancementEnhancements are not bugs or new features but can improve usability or performance.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions