Add no_sync option to boost boltDB performance on ephemeral environments #10745
|
Can we put the boltdb file in a custom directory? boltdb's stability and performance differ from one file system to another. |
Not at the moment, at least not at per-plugin granularity. Though you can use these to tell containerd where to store its data:
// Root is the path to a directory where containerd will store persistent data
Root string `toml:"root"`
// State is the path to a directory where containerd will store transient data
State string `toml:"state"` |
|
@mxpv and I had a bit of a chat offline in slack. I think I have a couple concerns with the approach in this PR:
Thinking about number 5 above, what if we instead expose a slightly higher abstraction like a "data consistency profile" that can be set to "ephemeral", and internally set options to BoltDB like the ones in the PR description? This would retain our flexibility for dependency upgrades, option additions/removals, and migration to a different backend (if we ever decided we wanted to do so). Our challenge is: (a) are "default" and "ephemeral" really the only two profiles, and (b) if there is a suggestion for some new option, do we tweak the existing "ephemeral" profile or decide to add a new one? |
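To make the profile idea concrete, here is a minimal Go sketch of how a profile name could be translated into options internally. Everything in it is hypothetical: `boltOptions` is a local stand-in rather than the real bbolt options type, the `Profile` type and both helper functions are made up for illustration, and the exact set of flags behind "ephemeral" is an assumption, not what the PR implements.

```go
package main

import (
	"fmt"
	"strings"
)

// boltOptions is a local stand-in for the few BoltDB options discussed in
// this thread. It is NOT the real bbolt options type.
type boltOptions struct {
	NoSync         bool // skip fsync after each commit
	NoGrowSync     bool // skip the sync when the database file grows
	NoFreelistSync bool // do not write the freelist to disk
}

// Profile is a hypothetical "data consistency profile" exposed in config.
type Profile string

const (
	ProfileDefault   Profile = "default"
	ProfileEphemeral Profile = "ephemeral"
)

// parseProfile validates a user-supplied profile name.
// An empty value falls back to the default profile.
func parseProfile(s string) (Profile, error) {
	switch p := Profile(strings.ToLower(strings.TrimSpace(s))); p {
	case "", ProfileDefault:
		return ProfileDefault, nil
	case ProfileEphemeral:
		return ProfileEphemeral, nil
	default:
		return "", fmt.Errorf("unknown data consistency profile %q", s)
	}
}

// optionsFor maps a profile to concrete options. The flags chosen for
// "ephemeral" are an assumption made for this sketch.
func optionsFor(p Profile) boltOptions {
	if p == ProfileEphemeral {
		return boltOptions{NoSync: true, NoGrowSync: true, NoFreelistSync: true}
	}
	return boltOptions{} // default: keep every sync enabled
}

func main() {
	p, err := parseProfile("ephemeral")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s -> %+v\n", p, optionsFor(p))
}
```

With this shape, adding or retuning a profile only touches `optionsFor`, so the TOML surface stays stable even if the underlying options change.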
|
To sum up, possible ways to go:
1. Mirror current bolt's options in TOML (currently implemented in this PR). These change rarely, but we can add a warning banner that these are subject to change with newer bolt versions and out of support scope, so users take responsibility for breaking changes.
1.1. A variation of 1 is something like:
[plugins.'io.containerd.metadata.v1.bolt']
extra_boltdb_raw_options = '{uninterpreted json blob here}'
But we have to mirror
2. Introduce "default" and "ephemeral" profiles to cover the current use cases we're aware of. This remains a concern:
|
|
As discussed in the containerd community meeting: |
|
/test pull-containerd-node-e2e |
A boolean no_sync flag is fine, but I do think a profile (effectively an enum) is a better option here. Both a no_sync flag and an enum capture the limited, specific use-case concern, but a boolean flag paints us into a bit of a corner if we ever want to have some other specific use-case beyond "go as fast as possible with no concern for data safety" and the default "keep the data safe" modes. |
|
/test pull-containerd-node-e2e |
dmcgowan
left a comment
LGTM but maybe rebase to run the tests again
Signed-off-by: Maksym Pavlenko <[email protected]>
dims
left a comment
LGTM
containerd 2.1.0

Welcome to the v2.1.0 release of containerd!

The first minor release of containerd 2.x focuses on continued stability alongside new features and improvements. This is the first time-based release for containerd. Most of the feature set and core functionality has long been stable and hardened in production environments, so now we transition to a balance of timely delivery of new functionality with the same high confidence in stability and performance.

* Add no_sync option to boost boltDB performance on ephemeral environments ([containerd#10745](containerd#10745))
* Add content create event ([containerd#11006](containerd#11006))
* Erofs snapshotter and differ ([containerd#10705](containerd#10705))
* Update CRI to use transfer service for image pull by default ([containerd#8515](containerd#8515))
* Support multiple cni plugin bin dirs ([containerd#11311](containerd#11311))
* Support container restore through CRI/Kubernetes ([containerd#10365](containerd#10365))
* Add OCI/Image Volume Source support ([containerd#10579](containerd#10579))
* Enable Writable cgroups for unprivileged containers ([containerd#11131](containerd#11131))
* Fix recursive RLock() mutex acquisition ([containerd/go-cni#126](containerd/go-cni#126))
* Support CNI STATUS Verb ([containerd/go-cni#123](containerd/go-cni#123))
* Retry last registry host on 50x responses ([containerd#11484](containerd#11484))
* Multipart layer fetch ([containerd#10177](containerd#10177))
* Enable HTTP debug and trace for transfer based puller ([containerd#10762](containerd#10762))
* Add support for unpacking custom media types ([containerd#11744](containerd#11744))
* Add dial timeout field to hosts toml configuration ([containerd#11106](containerd#11106))
* Expose Pod assigned IPs to NRI plugins ([containerd#10921](containerd#10921))
* Support multiple uid/gid mappings ([containerd#10722](containerd#10722))
* Fix race between serve and immediate shutdown on the server ([containerd/ttrpc#175](containerd/ttrpc#175))
* Update FreeBSD defaults and re-organize platform defaults ([containerd#11017](containerd#11017))
* Postpone cri config deprecations to v2.2 ([containerd#11684](containerd#11684))
* Remove deprecated dynamic library plugins ([containerd#11683](containerd#11683))
* Remove the support for Schema 1 images ([containerd#11681](containerd#11681))

Please try out the release binaries and report any issues at https://github.com/containerd/containerd/issues.

* Derek McGowan
* Phil Estes
* Akihiro Suda
* Maksym Pavlenko
* Jin Dong
* Wei Fu
* Sebastiaan van Stijn
* Samuel Karp
* Mike Brown
* Adrien Delorme
* Austin Vazquez
* Akhil Mohan
* Kazuyoshi Kato
* Henry Wang
* Gao Xiang
* ningmingxiao
* Krisztian Litkey
* Yang Yang
* Archit Kulkarni
* Chris Henzie
* Iceber Gu
* Alexey Lunev
* Antonio Ojea
* Davanum Srinivas
* Marat Radchenko
* Michael Zappa
* Paweł Gronowski
* Rodrigo Campos
* Alberto Garcia Hierro
* Amit Barve
* Andrey Smirnov
* Divya
* Etienne Champetier
* Kirtana Ashok
* Philip Laine
* QiPing Wan
* fengwei0328
* zounengren
* Adrian Reber
* Alfred Wingate
* Amal Thundiyil
* Athos Ribeiro
* Brian Goff
* Cesar Talledo
* ChengyuZhu6
* Chongyi Zheng
* Craig Ingram
* Danny Canter
* David Son
* Fupan Li
* HirazawaUi
* Jing Xu
* Jonathan A. Sternberg
* Jose Fernandez
* Kaita Nakamura
* Kohei Tokunaga
* Lei Liu
* Marco Visin
* Mike Baynton
* Qiyuan Liang
* Sameer
* Shiming Zhang
* Swagat Bora
* Teresaliu
* Tony Fang
* Tõnis Tiigi
* Vered Rosen
* Vinayak Goyal
* bo.jiang
* chriskery
* luchenhan
* mahmut
* zhaixiaojuan

* **github.com/Microsoft/hcsshim** v0.12.9 -> v0.13.0-rc.3
* **github.com/cilium/ebpf** v0.11.0 -> v0.16.0
* **github.com/containerd/cgroups/v3** v3.0.3 -> v3.0.5
* **github.com/containerd/containerd/api** v1.8.0 -> v1.9.0
* **github.com/containerd/continuity** v0.4.4 -> v0.4.5
* **github.com/containerd/go-cni** v1.1.10 -> v1.1.12
* **github.com/containerd/imgcrypt/v2** v2.0.0-rc.1 -> v2.0.1
* **github.com/containerd/otelttrpc** ea5083fda723 -> v0.1.0
* **github.com/containerd/platforms** v1.0.0-rc.0 -> v1.0.0-rc.1
* **github.com/containerd/ttrpc** v1.2.6 -> v1.2.7
* **github.com/containerd/typeurl/v2** v2.2.2 -> v2.2.3
* **github.com/containernetworking/cni** v1.2.3 -> v1.3.0
* **github.com/containernetworking/plugins** v1.5.1 -> v1.7.1
* **github.com/containers/ocicrypt** v1.2.0 -> v1.2.1
* **github.com/davecgh/go-spew** d8f796af33cc -> v1.1.1
* **github.com/fsnotify/fsnotify** v1.7.0 -> v1.9.0
* **github.com/go-jose/go-jose/v4** v4.0.4 -> v4.0.5
* **github.com/google/go-cmp** v0.6.0 -> v0.7.0
* **github.com/grpc-ecosystem/grpc-gateway/v2** v2.22.0 -> v2.26.1
* **github.com/klauspost/compress** v1.17.11 -> v1.18.0
* **github.com/mdlayher/socket** v0.4.1 -> v0.5.1
* **github.com/moby/spdystream** v0.4.0 -> v0.5.0
* **github.com/moby/sys/user** v0.3.0 -> v0.4.0
* **github.com/opencontainers/image-spec** v1.1.0 -> v1.1.1
* **github.com/opencontainers/runtime-spec** v1.2.0 -> v1.2.1
* **github.com/opencontainers/selinux** v1.11.1 -> v1.12.0
* **github.com/pelletier/go-toml/v2** v2.2.3 -> v2.2.4
* **github.com/petermattis/goid** 4fcff4a6cae7 **_new_**
* **github.com/pmezard/go-difflib** 5d4384ee4fb2 -> v1.0.0
* **github.com/prometheus/client_golang** v1.20.5 -> v1.22.0
* **github.com/prometheus/common** v0.55.0 -> v0.62.0
* **github.com/sasha-s/go-deadlock** v0.3.5 **_new_**
* **github.com/smallstep/pkcs7** v0.1.1 **_new_**
* **github.com/stretchr/testify** v1.9.0 -> v1.10.0
* **github.com/tchap/go-patricia/v2** v2.3.1 -> v2.3.2
* **github.com/urfave/cli/v2** v2.27.5 -> v2.27.6
* **github.com/vishvananda/netlink** v1.3.0 -> 0e7078ed04c8
* **github.com/vishvananda/netns** v0.0.4 -> v0.0.5
* **go.etcd.io/bbolt** v1.3.11 -> v1.4.0
* **go.opentelemetry.io/auto/sdk** v1.1.0 **_new_**
* **go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc** v0.56.0 -> v0.60.0
* **go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp** v0.56.0 -> v0.60.0
* **go.opentelemetry.io/otel** v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/exporters/otlp/otlptrace** v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc** v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp** v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/metric** v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/sdk** v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/trace** v1.31.0 -> v1.35.0
* **go.opentelemetry.io/proto/otlp** v1.3.1 -> v1.5.0
* **golang.org/x/crypto** v0.28.0 -> v0.36.0
* **golang.org/x/exp** aacd6d4b4611 -> 2d47ceb2692f
* **golang.org/x/mod** v0.21.0 -> v0.24.0
* **golang.org/x/net** v0.30.0 -> v0.38.0
* **golang.org/x/oauth2** v0.22.0 -> v0.27.0
* **golang.org/x/sync** v0.8.0 -> v0.14.0
* **golang.org/x/sys** v0.26.0 -> v0.33.0
* **golang.org/x/term** v0.25.0 -> v0.30.0
* **golang.org/x/text** v0.19.0 -> v0.23.0
* **golang.org/x/time** v0.3.0 -> v0.7.0
* **google.golang.org/genproto/googleapis/api** 5fefd90f89a9 -> 56aae31c358a
* **google.golang.org/genproto/googleapis/rpc** 324edc3d5d38 -> 56aae31c358a
* **google.golang.org/grpc** v1.67.1 -> v1.72.0
* **google.golang.org/protobuf** v1.35.1 -> v1.36.6
* **k8s.io/api** v0.31.2 -> v0.32.3
* **k8s.io/apimachinery** v0.31.2 -> v0.32.3
* **k8s.io/apiserver** v0.31.2 -> v0.32.3
* **k8s.io/client-go** v0.31.2 -> v0.32.3
* **k8s.io/cri-api** v0.31.2 -> v0.32.3
* **k8s.io/kubelet** v0.31.2 -> v0.32.3
* **k8s.io/utils** 18e509b52bc8 -> 3ea5e8cea738
* **sigs.k8s.io/json** bc3834ca7abd -> 9aa6b5e7a4b3
* **sigs.k8s.io/structured-merge-diff/v4** v4.4.1 -> v4.4.2
* **tags.cncf.io/container-device-interface** v0.8.0 -> v1.0.1
* **tags.cncf.io/container-device-interface/specs-go** v0.8.0 -> v1.0.0

Previous release can be found at [v2.0.0](https://github.com/containerd/containerd/releases/tag/v2.0.0)

* `containerd-<VERSION>-<OS>-<ARCH>.tar.gz`: ✅Recommended. Dynamically linked with glibc 2.35 (Ubuntu 22.04).
* `containerd-static-<VERSION>-<OS>-<ARCH>.tar.gz`: Statically linked. Expected to be used on Linux distributions that do not use glibc >= 2.35. Not position-independent.

In addition to containerd, typically you will have to install [runc](https://github.com/opencontainers/runc/releases) and [CNI plugins](https://github.com/containernetworking/plugins/releases) from their official sites too. See also the [Getting Started](https://github.com/containerd/containerd/blob/main/docs/getting-started.md) documentation.
This PR exposes BoltDB options via TOML configuration.
In certain ephemeral environments it's possible to squeeze out significant performance gains through more precise boltDB configuration, namely by going async.
In our case we were able to reduce pull time under certain (heavy) conditions from 30s to just a few seconds.
Additionally, this PR introduces optional args in MetaStore to allow (external) snapshotters to configure boltDB options for the same reasons.
Troubleshooting and research credits belong to @xinyangge-db
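
As a rough illustration of the "optional args" idea, the functional-options pattern in Go lets a constructor take extra settings without breaking existing callers. The sketch below is not the actual MetaStore API: `metaStore`, `storeOpt`, `withNoSync`, and `newMetaStore` are hypothetical names, and `dbOptions` is a local stand-in for the boltDB options.

```go
package main

import "fmt"

// dbOptions is a local stand-in for the boltDB options a store would pass
// along when opening its database file (hypothetical, for illustration).
type dbOptions struct {
	NoSync bool // skip fsync on commit; only sensible for ephemeral data
}

// metaStore is a simplified, hypothetical metadata store.
type metaStore struct {
	dbfile string
	opts   dbOptions
}

// storeOpt is an optional argument applied while constructing the store.
type storeOpt func(*metaStore) error

// withNoSync returns an option that toggles the NoSync setting.
func withNoSync(noSync bool) storeOpt {
	return func(ms *metaStore) error {
		ms.opts.NoSync = noSync
		return nil
	}
}

// newMetaStore builds a store and applies the options in order.
// Callers that pass no options get the safe defaults.
func newMetaStore(dbfile string, opts ...storeOpt) (*metaStore, error) {
	if dbfile == "" {
		return nil, fmt.Errorf("dbfile must not be empty")
	}
	ms := &metaStore{dbfile: dbfile}
	for _, apply := range opts {
		if err := apply(ms); err != nil {
			return nil, err
		}
	}
	return ms, nil
}

func main() {
	// An external snapshotter opting in to async writes.
	ms, err := newMetaStore("metadata.db", withNoSync(true))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s NoSync=%v\n", ms.dbfile, ms.opts.NoSync)
}
```

Because the options are variadic, existing call sites that pass only the file path keep compiling and keep the default, fully synced behavior.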