build history: orphan records when switching between graphdriver and containerd store #48562

@crazy-max

Description

When switching to another store (graphdriver or containerd), builds recorded under the previous store can no longer be accessed or removed, and vice versa.

Reproduce

Start a dind container with graphdriver store:

docker run -d --restart always --privileged --name docker27 -v docker27dt:/var/lib/docker -p 12375:2375 docker:27-dind --debug --host=tcp://0.0.0.0:2375 --tlsverify=false

Create a docker context and switch to it so buildx can build on it:

docker context create docker27 --docker "host=tcp://localhost:12375,skip-tls-verify=true"
docker context use docker27

Do a build with this Dockerfile:

FROM busybox
RUN echo "Hello, World!"

docker buildx build .

You should see the build in Docker Desktop and be able to open it:

[Screenshots: the build as listed and opened in Docker Desktop]

Now restart the container with containerd store enabled:

docker context use default
docker rm -f docker27
docker run -d --restart always --privileged --name docker27 -v docker27dt:/var/lib/docker -p 12375:2375 --env TEST_INTEGRATION_USE_SNAPSHOTTER=1 docker:27-dind --debug --host=tcp://0.0.0.0:2375 --tlsverify=false

You should see containerd store enabled in the logs:

time="2024-09-30T19:43:01.560448969Z" level=info msg="Starting daemon with containerd snapshotter integration enabled"

We still see the build in Docker Desktop, but opening it yields this error:

[Screenshot: the error shown by Docker Desktop]

Expected behavior

As the blobs for this record are in another store, I would expect it not to be listed by ControlClient().ListenBuildHistory(ctx, &controlapi.BuildHistoryRequest{}).
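To make the expectation concrete, here is a minimal sketch of the filtering I have in mind. Record, BlobStore, MapStore, Reachable and FilterReachable are all hypothetical names made up for illustration; they are not BuildKit or moby types, and the real history code would work against the daemon's actual content store rather than an in-memory map.

```go
package main

// Record is a hypothetical, simplified stand-in for a build history record.
type Record struct {
	Ref   string   // build reference
	Blobs []string // digests of the blobs the record points to
}

// BlobStore is a hypothetical view of the currently active content store.
type BlobStore interface {
	Has(digest string) bool
}

// MapStore is an in-memory BlobStore used only for illustration.
type MapStore map[string]bool

// Has reports whether the digest is present in the map.
func (m MapStore) Has(digest string) bool { return m[digest] }

// Reachable reports whether every blob referenced by r exists in store.
func Reachable(r Record, store BlobStore) bool {
	for _, d := range r.Blobs {
		if !store.Has(d) {
			return false
		}
	}
	return true
}

// FilterReachable keeps only the records whose blobs are all present in
// store, so records written under another store are never listed.
func FilterReachable(records []Record, store BlobStore) []Record {
	var out []Record
	for _, r := range records {
		if Reachable(r, store) {
			out = append(out, r)
		}
	}
	return out
}
```

With this shape, a record whose blobs live in the other store is simply skipped instead of being listed and then failing when opened.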

docker version

Server: Docker Engine - Community
 Engine:
  Version:          27.3.1
  API version:      1.47 (minimum version 1.24)
  Go version:       go1.22.7
  Git commit:       41ca978
  Built:            Fri Sep 20 11:41:02 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.7.22
  GitCommit:        7f7fdf5fed64eb6a7caf99b3e12efcf9d60e311c
 runc:
  Version:          1.1.14
  GitCommit:        v1.1.14-0-g2c9f560
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 27.3.1
 Storage Driver: overlayfs
  driver-type: io.containerd.snapshotter.v1
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7f7fdf5fed64eb6a7caf99b3e12efcf9d60e311c
 runc version: v1.1.14-0-g2c9f560
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 6.10.9-linuxkit
 Operating System: Alpine Linux v3.20 (containerized)
 OSType: linux
 Architecture: x86_64
 CPUs: 32
 Total Memory: 31.3GiB
 Name: 0d137b8046d0
 ID: af60703c-6a88-4bf5-8dd7-53880e95d58b
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 34
  Goroutines: 69
  System Time: 2024-09-30T19:54:49.198771943Z
  EventsListeners: 0
 Username: crazymax
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

Additional Info

I think we have several ways to mitigate this issue:

  • Even if the record is listed in the history db, check if it is part of the current content store before sending the event.
  • Create a new "store" field in the history db bucket and run a migration that sets it, for each existing record, to the store in use (either "graph" or "containerd").
    • Or create a separate bbolt history db for each store and do the migration?
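A rough sketch of the "store" field idea follows. HistoryRecord, MigrateStore and VisibleIn are hypothetical names, and the sketch uses a plain slice instead of the bbolt bucket. It also assumes the migration runs before the daemon has ever switched stores, so that every untagged record really belongs to the store passed in; if that does not hold, untagged records would need a different treatment.

```go
package main

// HistoryRecord is a hypothetical history entry carrying the proposed field.
type HistoryRecord struct {
	Ref   string
	Store string // "graph" or "containerd"; empty for records predating the field
}

// MigrateStore tags every untagged record with legacyStore and returns how
// many records were updated. Records that already have a store are untouched.
func MigrateStore(records []HistoryRecord, legacyStore string) int {
	updated := 0
	for i := range records {
		if records[i].Store == "" {
			records[i].Store = legacyStore
			updated++
		}
	}
	return updated
}

// VisibleIn returns only the records that belong to the active store.
func VisibleIn(records []HistoryRecord, active string) []HistoryRecord {
	var out []HistoryRecord
	for _, r := range records {
		if r.Store == active {
			out = append(out, r)
		}
	}
	return out
}
```

Listing would then go through something like VisibleIn with the active store, so records from the other store stay in the db but are not sent to clients until the daemon switches back.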

cc @tonistiigi

Metadata

Assignees

No one assigned

    Labels

    area/builder/buildkit, kind/bug, status/0-triage
