Disable writing freelist to make the file robust against data corruptions #6761

Merged: dmcgowan merged 1 commit into containerd:main from kzys:bbolt-freelist on Apr 6, 2022
Conversation

@kzys
Member

@kzys kzys commented Apr 1, 2022

A bbolt database keeps a freelist that tracks all pages available
for allocation. However, writing the list to disk takes some time, and
reading it back sometimes panics.

This commit sets NoFreelistSync to true so that the freelist is not
written to disk at all, following what etcd does.

https://github.com/etcd-io/etcd/blob/v3.5.2/server/mvcc/backend/config_linux.go#L31

Fixes #4838.

Signed-off-by: Kazuyoshi Kato [email protected]
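
For readers unfamiliar with the option, here is a minimal sketch of opening a bbolt database with NoFreelistSync enabled. It is not the code from this commit: the temporary path, the bucket name, the key/value pair, and the one-second timeout are all made up for illustration. Only `bolt.Open`, `bolt.Options`, and the `NoFreelistSync` field come from bbolt itself.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
	"time"

	bolt "go.etcd.io/bbolt"
)

// run opens a throwaway database with NoFreelistSync enabled and writes one key.
func run() error {
	// Hypothetical location; a real caller would pass its own metadata path.
	dir, err := os.MkdirTemp("", "bbolt-example")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	opts := &bolt.Options{
		// Illustrative timeout for acquiring the file lock (an assumption).
		Timeout: time.Second,
		// Do not persist the freelist on commit. bbolt then has no
		// on-disk freelist to read back and rebuilds it by scanning
		// the database when the file is opened.
		NoFreelistSync: true,
	}

	db, err := bolt.Open(filepath.Join(dir, "meta.db"), 0o600, opts)
	if err != nil {
		return err
	}
	defer db.Close()

	// An ordinary write transaction; nothing else changes for callers.
	err = db.Update(func(tx *bolt.Tx) error {
		b, err := tx.CreateBucketIfNotExists([]byte("example"))
		if err != nil {
			return err
		}
		return b.Put([]byte("key"), []byte("value"))
	})
	if err != nil {
		return err
	}

	fmt.Println("NoFreelistSync:", db.NoFreelistSync)
	return nil
}

func main() {
	if err := run(); err != nil {
		log.Fatal(err)
	}
}
```

The trade-off, as described in the bbolt documentation, is that skipping the freelist sync speeds up normal writes but requires a full scan of the database to rebuild the freelist when the file is opened.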

@kzys kzys force-pushed the bbolt-freelist branch from ddf71b8 to 741ac64 Compare April 1, 2022 21:01
@kzys kzys changed the title from "Disable writing freelist to make the file more robust against a data corruption" to "Disable writing freelist to make the file robust against data corruptions" Apr 1, 2022
@kzys kzys force-pushed the bbolt-freelist branch from 741ac64 to 6da3183 Compare April 1, 2022 21:16
@theopenlab-ci

theopenlab-ci Bot commented Apr 1, 2022

Build succeeded.

@deitch
Contributor

deitch commented Dec 2, 2022

Is this commit in the 1.6.x series? Or did it only make it into the 1.7.0 series (currently at beta.0)?

@kzys
Member Author

kzys commented Dec 2, 2022

@deitch Currently in 1.7.x but I can backport that to 1.6.x.

@deitch
Contributor

deitch commented Dec 3, 2022

That would be great. Thank you.

@pret-lekhraj

> @deitch Currently in 1.7.x but I can backport that to 1.6.x.

@kzys The same issue is observed with containerd (containerd.io 1.7.22). Could you kindly check and confirm in which specific version of containerd it is fixed?

Log snippet attached for your reference:

Oct 28 06:24:30 raspberrypi containerd[2423528]: time="2024-10-28T06:24:30.993497139+03:00" level=info msg="Connect containerd service"
Oct 28 06:24:30 raspberrypi containerd[2423528]: time="2024-10-28T06:24:30.993598671+03:00" level=info msg="using legacy CRI server"
Oct 28 06:24:30 raspberrypi containerd[2423528]: time="2024-10-28T06:24:30.993623299+03:00" level=info msg="using experimental NRI integration - disable nri plugin to prevent this"
Oct 28 06:24:30 raspberrypi containerd[2423528]: time="2024-10-28T06:24:30.993900098+03:00" level=info msg="Get image filesystem path "/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs""
Oct 28 06:24:30 raspberrypi containerd[2423528]: time="2024-10-28T06:24:30.995520897+03:00" level=info msg="Start subscribing containerd event"
Oct 28 06:24:30 raspberrypi containerd[2423528]: time="2024-10-28T06:24:30.995736052+03:00" level=info msg="Start recovering state"
Oct 28 06:24:30 raspberrypi containerd[2423528]: time="2024-10-28T06:24:30.996225062+03:00" level=info msg=serving... address=/run/containerd/containerd.sock.ttrpc
Oct 28 06:24:30 raspberrypi containerd[2423528]: time="2024-10-28T06:24:30.997444439+03:00" level=info msg=serving... address=/run/containerd/containerd.sock
Oct 28 06:24:31 raspberrypi containerd[2423528]: panic: invalid freelist page: 0, page type is unknown<00>
Oct 28 06:24:31 raspberrypi containerd[2423528]: goroutine 112 [running]:
Oct 28 06:24:31 raspberrypi containerd[2423528]: go.etcd.io/bbolt.(*freelist).read(0x558972e1d7?, 0x7f400e9000)
Oct 28 06:24:31 raspberrypi containerd[2423528]: /go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/freelist.go:267 +0x1e8
Oct 28 06:24:31 raspberrypi containerd[2423528]: go.etcd.io/bbolt.(*DB).loadFreelist.func1()
Oct 28 06:24:31 raspberrypi containerd[2423528]: /go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/db.go:420 +0xa8
Oct 28 06:24:31 raspberrypi containerd[2423528]: sync.(*Once).doSlow(0x5588e231f0?, 0x400066f850?)
Oct 28 06:24:31 raspberrypi containerd[2423528]: /usr/local/go/src/sync/once.go:74 +0x100
Oct 28 06:24:31 raspberrypi containerd[2423528]: sync.(*Once).Do(...)
Oct 28 06:24:31 raspberrypi containerd[2423528]: /usr/local/go/src/sync/once.go:65
Oct 28 06:24:31 raspberrypi containerd[2423528]: go.etcd.io/bbolt.(*DB).loadFreelist(0x400066f688?)
Oct 28 06:24:31 raspberrypi containerd[2423528]: /go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/db.go:413 +0x48
Oct 28 06:24:31 raspberrypi containerd[2423528]: go.etcd.io/bbolt.Open({0x4000052550, 0x46}, 0x180, 0x0)

Oct 28 06:24:47 raspberrypi containerd[2423693]: /usr/local/go/src/sync/once.go:65
Oct 28 06:24:47 raspberrypi containerd[2423693]: go.etcd.io/bbolt.(*DB).loadFreelist(0x400043b448?)
Oct 28 06:24:47 raspberrypi containerd[2423693]: /go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/db.go:413 +0x48
Oct 28 06:24:47 raspberrypi containerd[2423693]: go.etcd.io/bbolt.Open({0x40001d85f0, 0x46}, 0x180, 0x0)
Oct 28 06:24:47 raspberrypi containerd[2423693]: /go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/db.go:295 +0x340
Oct 28 06:24:47 raspberrypi containerd[2423693]: github.com/containerd/containerd/snapshots/storage.(*MetaStore).TransactionContext(0x40004c0420, {0x55775d3510, 0x400044bc20}, 0x0)
Oct 28 06:24:47 raspberrypi containerd[2423693]: /go/src/github.com/containerd/containerd/snapshots/storage/metastore.go:90 +0xd0
Oct 28 06:24:47 raspberrypi containerd[2423693]: github.com/containerd/containerd/snapshots/storage.(*MetaStore).WithTransaction(0x40008c5578?, {0x55775d3510?, 0x400044bc20?}, 0x0, 0x4000c95cb0)
Oct 28 06:24:47 raspberrypi containerd[2423693]: /go/src/github.com/containerd/containerd/snapshots/storage/metastore.go:115 +0x30
Oct 28 06:24:47 raspberrypi containerd[2423693]: github.com/containerd/containerd/snapshots/overlay.(*snapshotter).Stat(0x40004cf6c0, {0x55775d3510, 0x400044bc20}, {0x40004d2240, 0x52})
Oct 28 06:24:47 raspberrypi containerd[2423693]: /go/src/github.com/containerd/containerd/snapshots/overlay/overlay.go:180 +0x100
Oct 28 06:24:47 raspberrypi containerd[2423693]: github.com/containerd/containerd/metadata.(*snapshotter).Stat(0x40002d9940, {0x55775d3510, 0x400044bc20}, {0x4000cc4870, 0x47})
Oct 28 06:24:47 raspberrypi containerd[2423693]: /go/src/github.com/containerd/containerd/metadata/snapshot.go:137 +0x1f0
Oct 28 06:24:47 raspberrypi containerd[2423693]: github.com/containerd/containerd.(*image).IsUnpacked(0x4000c6ac40, {0x55775d3510, 0x400044bc20}, {0x40000e3240?, 0x487?})
Oct 28 06:24:47 raspberrypi containerd[2423693]: /go/src/github.com/containerd/containerd/image.go:275 +0xdc
Oct 28 06:24:47 raspberrypi containerd[2423693]: github.com/containerd/containerd/pkg/cri/server.(*criService).loadImages.func1()
Oct 28 06:24:47 raspberrypi containerd[2423693]: /go/src/github.com/containerd/containerd/pkg/cri/server/restart.go:460 +0x198
Oct 28 06:24:47 raspberrypi containerd[2423693]: created by github.com/containerd/containerd/pkg/cri/server.(*criService).loadImages in goroutine 76
Oct 28 06:24:47 raspberrypi containerd[2423693]: /go/src/github.com/containerd/containerd/pkg/cri/server/restart.go:448 +0x88
Oct 28 06:24:47 raspberrypi systemd[1]: containerd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Oct 28 06:24:47 raspberrypi systemd[1]: containerd.service: Failed with result 'exit-code'.
Oct 28 06:24:47 raspberrypi systemd[1]: Failed to start containerd.service - containerd container runtime.

containerd --version
containerd containerd.io 1.7.22 7f7fdf5
runc --version
runc version 1.1.14
commit: v1.1.14-0-g2c9f560
spec: 1.0.2-dev
go: go1.22.7
libseccomp: 2.5.4
uname -a
Linux raspberrypi 6.6.35-v8+ #1 SMP PREEMPT Thu Jun 27 14:02:50 IST 2024 aarch64 GNU/Linux
lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 12 (bookworm)
Release: 12
Codename: bookworm

Development

Successfully merging this pull request may close these issues.

No option to disable freelist synchronization in order to recover corrupted bolt meta.db

5 participants