Proposal: Make Pods (collections of containers) a first order container object. #8781

@brendandburns

Pods

This is a proposal to change the first order container object within the Docker API from a single container to a pod of containers.

A pod (as in a pod of whales or pea pod) models an application-specific "logical host" in a containerized environment. It may contain one or more containers which are relatively tightly coupled -- in a pre-container world, they would have executed on the same physical or virtual host.

This is somewhat related to #8637, but that proposal is more concerned with placing containers into a single namespace than with grouping containers into logical hosts for the purposes of scheduling, resource tracking, isolation and sharing.

In this proposal there are two sub-proposals:

  • The first is that a new API object, representing a Pod, be added to the Docker API.
  • The second is that this new API object replace the existing singleton container object in future versions of the Docker API.

Since these topics are somewhat orthogonal, I will address them each in separate sections.

Pods as an API object

Inherently, a Pod represents an atomic unit of an application. It is the smallest piece of the application that it makes sense to consider as a unit. Primarily this atomicity is in terms of running the Pod. A pod may be made up of many containers, but the state of those containers must be treated as atomic. Either they are all running under the same Docker daemon, or they are not. It does not make sense to have a partially running pod. Nor does it make sense to run different containers from a single pod in different Docker daemons.

There are numerous examples of such multi-container atomic units, for example:

  • A user-facing web server, and a side-car container that periodically syncs the server's content from version control.
  • A primary database container, and a periodic backup container that copies the database out to network storage.
  • Multiple containers synchronizing work through IPC or shared memory.
  • Side-car containers that provide thick libraries and simplified APIs that other containers can consume (e.g. master election).

In all of these cases, the important characteristic is that the containers involved are symbiotic: it doesn't make sense to place the containers in these example pods onto different hosts.
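To make the all-or-nothing requirement concrete, here is a minimal Go sketch of starting a pod atomically. It is not taken from Docker or Kubernetes: the `Runtime` interface, `FakeRuntime`, and `StartPod` are hypothetical names invented for illustration, and the rollback strategy (stop whatever already started, in reverse order) is only one possible design.

```go
package main

import (
	"errors"
	"fmt"
)

// Runtime is a hypothetical abstraction over whatever starts and stops
// individual containers on a single daemon.
type Runtime interface {
	Start(name string) error
	Stop(name string) error
}

// StartPod starts every container in the pod, or none of them.
// If any container fails to start, the ones already started are stopped
// again so that no partially running pod is left behind.
func StartPod(rt Runtime, containers []string) error {
	started := make([]string, 0, len(containers))
	for _, name := range containers {
		if err := rt.Start(name); err != nil {
			// Roll back in reverse order; stop errors are ignored (best effort).
			for i := len(started) - 1; i >= 0; i-- {
				_ = rt.Stop(started[i])
			}
			return fmt.Errorf("starting pod: container %q: %w", name, err)
		}
		started = append(started, name)
	}
	return nil
}

// FakeRuntime is an in-memory stand-in used only to exercise StartPod.
// It fails to start the container whose name equals FailOn.
type FakeRuntime struct {
	FailOn  string
	Running map[string]bool
}

func (f *FakeRuntime) Start(name string) error {
	if name == f.FailOn {
		return errors.New("simulated start failure")
	}
	f.Running[name] = true
	return nil
}

func (f *FakeRuntime) Stop(name string) error {
	delete(f.Running, name)
	return nil
}
```

With a `FakeRuntime` whose `FailOn` is set to the second container's name, `StartPod` returns an error and `Running` ends up empty, which is the "no partially running pod" property described above.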

What does it mean to be a Pod?

The containers in a pod share a network namespace (and consequently an IP address), so members of a pod can address each other via localhost. A pod also has a shared set of data volumes, which its containers can use to exchange data. Importantly, the containers in a pod do not share a chroot, so data volumes are the only way to share storage. A pod also has a single resource hierarchy: individual containers within a pod may have their own resource limits, but these limits are subdivisions of the resources allocated to the entire pod.
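The properties above could be modeled roughly as follows. This is a sketch under stated assumptions, not the Docker or Kubernetes schema: all type and field names are hypothetical, and "subdivision" is interpreted here as "no container limit may exceed the pod's limit", which is only one possible reading.

```go
package main

import (
	"errors"
	"fmt"
)

// Resources is a hypothetical resource limit. A zero field means
// "no container-specific limit; the pod's limit applies".
type Resources struct {
	CPUMillis   int64
	MemoryBytes int64
}

// Container is a hypothetical member of a pod. It has its own image
// (its own root filesystem) and sees pod storage only through Mounts.
type Container struct {
	Name   string
	Image  string
	Mounts []string // names of pod-level volumes this container uses
	Limits Resources
}

// Pod groups containers that share one IP address, one set of volumes
// and one resource hierarchy.
type Pod struct {
	Name       string
	IP         string   // single address shared by every container
	Volumes    []string // volumes available to the containers
	Limits     Resources
	Containers []Container
}

// Validate checks that every mount refers to a pod-level volume and that
// no container asks for more than the pod as a whole was given.
func (p Pod) Validate() error {
	if len(p.Containers) == 0 {
		return errors.New("pod has no containers")
	}
	volumes := make(map[string]bool, len(p.Volumes))
	for _, v := range p.Volumes {
		volumes[v] = true
	}
	for _, c := range p.Containers {
		for _, m := range c.Mounts {
			if !volumes[m] {
				return fmt.Errorf("container %q mounts unknown volume %q", c.Name, m)
			}
		}
		if c.Limits.CPUMillis > p.Limits.CPUMillis ||
			c.Limits.MemoryBytes > p.Limits.MemoryBytes {
			return fmt.Errorf("container %q exceeds the pod's resource limits", c.Name)
		}
	}
	return nil
}
```

In this model a web server and its content-sync side-car would be two `Container` values mounting the same named volume, with a single `IP` on the enclosing `Pod`.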

Why pods instead of scheduled co-location?

Co-location via a scheduling system achieves some of the goals of a pod, but it has significant downsides. The containers don't share a network namespace, and thus have to rely on additional discovery mechanisms. They also don't share a cgroup, so you cannot express a parasitic container that steals resources, when feasible, from a co-container in its pod; instead, that parasitic container steals from all containers on the host. Finally, because the linkages between the containers are expressed as scheduling constraints rather than as an explicit grouping, it is harder to reason about the container group, and the scheduler becomes more complicated.

Pods as the only way to run containers

It would be a somewhat significant revision to the Docker API to transform the current singleton containers into Pods of containers, but it is a worthwhile endeavor, because it will retain the simplicity of the API.

Put concretely, there is no reason to introduce two different API objects (singleton containers and pods), when a Pod of a single container can effectively represent the singleton case. Sticking to a single API object will limit complexity both in the code and in the documentation, and will also give users a seamless evolution from single-container Pods to more sophisticated multi-container Pods.
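As an illustration of why one object can cover both cases, the following Go sketch wraps a single container in a pod and later grows that pod. The `Container` and `Pod` types and the `PodFromContainer` and `WithContainer` helpers are hypothetical names for this example only; nothing here is part of the existing Docker API.

```go
package main

// Container is a hypothetical, minimal container description.
type Container struct {
	Name  string
	Image string
}

// Pod is a hypothetical, minimal pod: a named group of containers.
type Pod struct {
	Name       string
	Containers []Container
}

// PodFromContainer represents the singleton case as a pod of one,
// reusing the container's name as the pod's name.
func PodFromContainer(c Container) Pod {
	return Pod{Name: c.Name, Containers: []Container{c}}
}

// WithContainer returns a copy of the pod with one more container.
// The receiver is left unchanged, so the original single-container
// pod remains valid.
func (p Pod) WithContainer(c Container) Pod {
	containers := make([]Container, 0, len(p.Containers)+1)
	containers = append(containers, p.Containers...)
	containers = append(containers, c)
	return Pod{Name: p.Name, Containers: containers}
}
```

A caller that today runs one container would call `PodFromContainer` and get a pod of length one; adding a backup side-car later is a `WithContainer` call on the same kind of object rather than a migration to a different API type.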

Implementation and further details

Pods are a foundational part of the Kubernetes API. The Kubernetes API spec for a Pod can be found in the v1 API.

A fully functional implementation of pods in terms of Docker containers can be found inside of kubelet.go.
