Add garbage collection #1398

@dmcgowan

Resource garbage collection is used to remove unreferenced data from containerd.

This should be considered a proposal with active work on the POC based on this description and incorporated feedback.

GC Roles

The garbage collection process operates on a directed graph of nodes and edges. A set of root pointers seeds the set of referenced nodes, and each referenced node is walked to determine the full set of reachable nodes. At the end of the process, any node which is not marked as referenced is removed.
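The walk described above can be sketched in Go as follows. This is a minimal illustration, not the POC: `Graph`, `Mark`, and `Sweep` are hypothetical names, and the graph is a plain in-memory map standing in for the real stores.

```go
package main

import (
	"fmt"
	"sort"
)

// Graph is a hypothetical in-memory stand-in for containerd's resources:
// each node ID maps to the IDs of the nodes it references.
type Graph map[string][]string

// Mark returns the set of nodes reachable from the given roots.
func Mark(g Graph, roots []string) map[string]bool {
	marked := make(map[string]bool)
	stack := append([]string(nil), roots...)
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if marked[n] {
			continue // already visited; also guards against cycles
		}
		marked[n] = true
		stack = append(stack, g[n]...)
	}
	return marked
}

// Sweep deletes every node that is not marked and returns the removed
// IDs in sorted order.
func Sweep(g Graph, marked map[string]bool) []string {
	var removed []string
	for n := range g {
		if !marked[n] {
			removed = append(removed, n)
		}
	}
	for _, n := range removed {
		delete(g, n)
	}
	sort.Strings(removed)
	return removed
}

func main() {
	g := Graph{
		"image/a":          {"content/manifest"},
		"content/manifest": {"content/layer1"},
		"content/layer1":   nil,
		"content/orphan":   {"content/layer1"},
		"snapshot/unused":  nil,
	}
	removed := Sweep(g, Mark(g, []string{"image/a"}))
	fmt.Println(removed) // prints [content/orphan snapshot/unused]
}
```

Note that `content/orphan` is removed even though it references a live node: only incoming references from the root set keep a node alive.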

In the containerd case, these nodes, directed edges, and root pointers are heterogeneous and require further enumeration.

Nodes (Resources eligible for cleanup)

  • Content in the content store (including on disk blob data)
  • Snapshots (including on disk data)
  • Tasks (process cleaned up) ?

Roots

  • Images
  • Containers
  • Content with label containerd.io/gc.root
  • Snapshot with label containerd.io/gc.root

Edges

  • Content -> Content (manifest references, manifest list references, content label containerd.io/gc.ref.content)
  • Image -> Content (source descriptor)
  • Container -> Snapshot (rootfs/snapshotter, container label containerd.io/gc.ref.snapshot)
  • Content -> Snapshot (image label containerd.io/gc.ref.snapshot)
  • Container -> Task ?
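
To make the label-driven roots and edges concrete, here is a rough Go sketch. The label names come from the lists above; everything else is an assumption made for illustration: the `Resource` and `Node` types are hypothetical, the presence of the root label (whatever its value) is treated as marking a root, and the value of a `gc.ref` label is treated as the ID of the referenced resource. The proposal does not specify the label value format.

```go
package main

import "fmt"

// Label names taken from the proposal text.
const (
	labelRoot        = "containerd.io/gc.root"
	labelRefContent  = "containerd.io/gc.ref.content"
	labelRefSnapshot = "containerd.io/gc.ref.snapshot"
)

// Kind distinguishes the heterogeneous node types.
type Kind string

const (
	KindContent  Kind = "content"
	KindSnapshot Kind = "snapshot"
)

// Node identifies one collectable resource.
type Node struct {
	Kind Kind
	ID   string
}

// Resource is a hypothetical record: a node plus its labels.
type Resource struct {
	Node   Node
	Labels map[string]string
}

// IsRoot reports whether the resource carries the root label.
// Assumption: presence of the label is enough, the value is ignored.
func IsRoot(r Resource) bool {
	_, ok := r.Labels[labelRoot]
	return ok
}

// LabelEdges returns the nodes referenced through gc.ref labels.
// Assumption: the label value is the ID of the referenced resource.
func LabelEdges(r Resource) []Node {
	var edges []Node
	if id, ok := r.Labels[labelRefContent]; ok {
		edges = append(edges, Node{Kind: KindContent, ID: id})
	}
	if id, ok := r.Labels[labelRefSnapshot]; ok {
		edges = append(edges, Node{Kind: KindSnapshot, ID: id})
	}
	return edges
}

func main() {
	r := Resource{
		Node: Node{Kind: KindContent, ID: "sha256:aaa"},
		Labels: map[string]string{
			labelRoot:        "",
			labelRefContent:  "sha256:bbb",
			labelRefSnapshot: "layer-1",
		},
	}
	fmt.Printf("root=%v\n", IsRoot(r))
	for _, e := range LabelEdges(r) {
		fmt.Printf("%s -> %s/%s\n", r.Node.ID, e.Kind, e.ID)
	}
}
```

Edges derived from manifests, image source descriptors, and container rootfs fields would need their own extraction logic and are left out of this sketch.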

Timeline

The garbage collection process is going to take time to become efficient. The initial requirements are accuracy and consistency; we will optimize from there.

For alpha (work in progress)

Stop-the-world mark and sweep. We will acquire a global lock, read all the databases to perform the mark, then delete all the unreferenced resources. During this collection no read or write actions can be performed on the databases or underlying structures.
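
A minimal Go sketch of the stop-the-world approach, assuming a single `sync.Mutex` that every operation must take. The `Store` type and its methods are hypothetical and use in-memory maps in place of the real databases.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical metadata store guarded by one global lock.
type Store struct {
	mu    sync.Mutex
	refs  map[string][]string // node ID -> referenced node IDs
	roots map[string]bool
}

func NewStore() *Store {
	return &Store{refs: map[string][]string{}, roots: map[string]bool{}}
}

// Put records a node and its references. It blocks while a collection runs.
func (s *Store) Put(id string, root bool, refs ...string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.refs[id] = refs
	if root {
		s.roots[id] = true
	}
}

// Has is a read, but it still takes the global lock in this model.
func (s *Store) Has(id string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.refs[id]
	return ok
}

// Collect holds the global lock for the whole mark and sweep, so no
// reads or writes can interleave. It returns the number of removed nodes.
func (s *Store) Collect() int {
	s.mu.Lock()
	defer s.mu.Unlock()

	marked := map[string]bool{}
	var walk func(id string)
	walk = func(id string) {
		if marked[id] {
			return
		}
		marked[id] = true
		for _, ref := range s.refs[id] {
			walk(ref)
		}
	}
	for id := range s.roots {
		walk(id)
	}

	removed := 0
	for id := range s.refs {
		if !marked[id] {
			delete(s.refs, id)
			removed++
		}
	}
	return removed
}

func main() {
	s := NewStore()
	s.Put("image", true, "manifest")
	s.Put("manifest", false, "layer")
	s.Put("layer", false)
	s.Put("orphan", false)
	fmt.Println("removed:", s.Collect())
}
```

The cost is visible in `Has`: even a pure read waits for the entire collection to finish.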

For beta

In addition to the global lock, create a read/write lock whose read side is taken by operations which do not mutate data. Only the garbage collector may acquire the write lock, which it does to perform resource cleanup. The implementation of this dual-layer lock is undecided.
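
Since the implementation is undecided, the following Go sketch shows only one possible reading of the dual-layer lock, under these assumptions: mutations and the collector take a global `sync.Mutex`; read-only operations take the read side of a `sync.RWMutex`; the collector takes the write side only for the sweep. The `Store` type is hypothetical, and the extra `data` mutex merely stands in for the consistency a real database would provide on its own.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical metadata store with a two-layer lock.
type Store struct {
	global  sync.Mutex   // taken by mutations and by the collector
	cleanup sync.RWMutex // read side: non-mutating ops; write side: collector only
	data    sync.Mutex   // stand-in for the database's own consistency
	refs    map[string][]string
	roots   map[string]bool
}

func NewStore() *Store {
	return &Store{refs: map[string][]string{}, roots: map[string]bool{}}
}

// Put is a mutation: it is excluded for the whole collection.
func (s *Store) Put(id string, root bool, refs ...string) {
	s.global.Lock()
	defer s.global.Unlock()
	s.data.Lock()
	defer s.data.Unlock()
	s.refs[id] = refs
	if root {
		s.roots[id] = true
	}
}

// Has is read-only: it is excluded only while the sweep runs.
func (s *Store) Has(id string) bool {
	s.cleanup.RLock()
	defer s.cleanup.RUnlock()
	s.data.Lock()
	defer s.data.Unlock()
	_, ok := s.refs[id]
	return ok
}

// Collect blocks mutations throughout, but takes the write lock
// (blocking readers) only for the sweep.
func (s *Store) Collect() int {
	s.global.Lock()
	defer s.global.Unlock()

	// Mark phase: the cleanup lock is not held here.
	s.data.Lock()
	marked := map[string]bool{}
	var walk func(id string)
	walk = func(id string) {
		if marked[id] {
			return
		}
		marked[id] = true
		for _, ref := range s.refs[id] {
			walk(ref)
		}
	}
	for id := range s.roots {
		walk(id)
	}
	s.data.Unlock()

	// Sweep phase: exclusive access.
	s.cleanup.Lock()
	defer s.cleanup.Unlock()
	s.data.Lock()
	defer s.data.Unlock()
	removed := 0
	for id := range s.refs {
		if !marked[id] {
			delete(s.refs, id)
			removed++
		}
	}
	return removed
}

func main() {
	s := NewStore()
	s.Put("container", true, "snapshot")
	s.Put("snapshot", false)
	s.Put("stale", false)
	fmt.Println("removed:", s.Collect(), "stale present:", s.Has("stale"))
}
```

Locks are always acquired in the order global, cleanup, data, which avoids deadlock between the three kinds of callers in this sketch.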

Future optimizations

Since containerd runs as a single daemon, we can optimize the garbage collector to run concurrently with other operations and only lock mutations for a brief period to perform resource cleanup.

We can further reduce the lock time during resource cleanup by deferring file system removals until after the cleanup phase has completed. The cleanup phase could perform a much faster rename operation while the lock is held, and perform removal after.
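
The rename-then-remove idea might look like the following Go sketch. `BlobStore`, its `blobs` and `trash` directory layout, and all method names are hypothetical. The sketch assumes the trash directory is on the same file system as the blobs, so that `os.Rename` is a cheap metadata operation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// BlobStore is a hypothetical on-disk store: live data under root/blobs,
// renamed-away data under root/trash (same file system).
type BlobStore struct {
	mu   sync.Mutex
	root string
}

// OpenTemp creates a store in a fresh temporary directory, for demos.
func OpenTemp() (*BlobStore, error) {
	root, err := os.MkdirTemp("", "gc-sketch")
	if err != nil {
		return nil, err
	}
	for _, d := range []string{"blobs", "trash"} {
		if err := os.MkdirAll(filepath.Join(root, d), 0755); err != nil {
			return nil, err
		}
	}
	return &BlobStore{root: root}, nil
}

// Add writes a blob under the lock.
func (b *BlobStore) Add(name string, data []byte) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	return os.WriteFile(filepath.Join(b.root, "blobs", name), data, 0644)
}

// Exists reports whether a blob is still live.
func (b *BlobStore) Exists(name string) bool {
	_, err := os.Stat(filepath.Join(b.root, "blobs", name))
	return err == nil
}

// Destroy removes the whole store directory.
func (b *BlobStore) Destroy() error { return os.RemoveAll(b.root) }

// Cleanup renames unreferenced blobs into trash while holding the lock,
// then performs the slower removal after the lock is released.
func (b *BlobStore) Cleanup(unreferenced []string) error {
	b.mu.Lock()
	var trashed []string
	for _, name := range unreferenced {
		dst := filepath.Join(b.root, "trash", name)
		if err := os.Rename(filepath.Join(b.root, "blobs", name), dst); err != nil {
			b.mu.Unlock()
			return err
		}
		trashed = append(trashed, dst)
	}
	b.mu.Unlock()

	// Deferred removal: other operations are no longer blocked.
	for _, p := range trashed {
		if err := os.RemoveAll(p); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	b, err := OpenTemp()
	if err != nil {
		fmt.Println("open:", err)
		return
	}
	defer b.Destroy()
	_ = b.Add("keep", []byte("referenced"))
	_ = b.Add("drop", []byte("unreferenced"))
	if err := b.Cleanup([]string{"drop"}); err != nil {
		fmt.Println("cleanup:", err)
		return
	}
	fmt.Println("keep:", b.Exists("keep"), "drop:", b.Exists("drop"))
}
```

A real implementation would also need to clear leftover trash entries on startup, since a crash between the rename and the removal leaves them behind.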

With the way containerd manages namespaces, garbage collection can be performed at different levels, allowing for individual and less intrusive garbage collection operations to be run on the underlying snapshotters and content stores based on their referenced namespace metadata. This could be possible with limited access to the underlying snapshotters and content store outside of the metadata interface. The directed graphs for these smaller garbage collections would be much smaller and more homogeneous.

Review process

Feedback appreciated. We can temporarily move this to a Google Doc if there is a desire for inline commenting.
