Add overlay specific applier#13053
Conversation
1569370 to
3bcc5ad
Compare
If we want to deny the walking differ, we should deprecate it and add a deprecation warning in 2.3 for any manual configuration of the walking differ. Some of my colleagues have been testing a proxy snapshotter with parallel unpack using the walking differ. If we don't add this, we should delay the denial of walking differ until 2.7. |
34e7e4e to
ef1537c
Compare
|
@henry118 Can you rebase? |
There was a problem hiding this comment.
Pull request overview
Adds an overlay-specific applier/differ path to support correct whiteout handling during parallel unpack, and plumbs a new parallel signal through the diff service API so appliers can opt out when parallel mode is requested.
Changes:
- Introduces a Linux-only
overlaydiff plugin that uses a custom apply function with explicit parallel support. - Adds
parallelto the DiffServiceApplyRequestproto and wires it through local/proxy diff clients plusdiff.WithParallel. - Updates unpack logic to request parallel apply when supported and to avoid using the
walkingapplier for parallel unpack.
Reviewed changes
Copilot reviewed 23 out of 28 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| vendor/modules.txt | Updates vendoring metadata for the local ./api replacement. |
| vendor/github.com/containerd/containerd/api/services/diff/v1/diff.proto | Vendored proto update for parallel field. |
| vendor/github.com/containerd/containerd/api/services/diff/v1/diff.pb.go | Vendored generated Go update for parallel field. |
| vendor/github.com/containerd/containerd/api/LICENSE | Removes vendored license file for the api module. |
| plugins/transfer/plugin.go | Tracks the selected applier ID and adds WarningPlugin dependency. |
| plugins/services/diff/service_unix.go | Appends overlay to the default differ order on Unix. |
| plugins/services/diff/local.go | Passes parallel through diff apply options. |
| plugins/diff/walking/plugin/plugin.go | Switches walking plugin to the new filesystem applier constructor. |
| plugins/diff/overlay/plugin/plugin_other.go | Non-Linux stub package for overlay plugin. |
| plugins/diff/overlay/plugin/plugin_linux.go | Registers overlay diff plugin (custom apply func + parallel support). |
| plugins/diff/overlay/applier_other.go | Non-Linux Apply returns ErrNotImplemented. |
| plugins/diff/overlay/applier_linux_test.go | Adds unit tests for overlay option parsing and invalid parameters. |
| plugins/diff/overlay/applier_linux.go | Implements overlay-specific apply behavior (whiteout conversion on bind/overlay mounts). |
| integration/client/migration_test.go | Updates migration expectations for old defaults to include default=['walking']. |
| integration/build_local_containerd_helper_test.go | Ensures overlay diff plugin is linked for integration tests. |
| go.sum | Removes sums for the replaced containerd/api module. |
| go.mod | Adds replace github.com/containerd/containerd/api => ./api. |
| core/unpack/unpacker_test.go | Removes tests for the prior bind→overlay workaround. |
| core/unpack/unpacker.go | Propagates parallel into apply opts; deny-lists walking applier for parallel; removes workaround. |
| core/diff/proxy/differ.go | Plumbs parallel into remote DiffService ApplyRequest. |
| core/diff/diff.go | Adds ApplyConfig.Parallel and diff.WithParallel. |
| core/diff/apply/apply.go | Refactors filesystem applier to support custom apply funcs and parallel capability gating. |
| contrib/diffservice/service.go | Plumbs parallel into the contrib diffservice Apply path. |
| cmd/containerd/builtins/builtins_linux.go | Registers overlay diff plugin in Linux builtins. |
| api/services/diff/v1/diff.proto | Adds parallel field to ApplyRequest. |
| api/services/diff/v1/diff.pb.go | Regenerates Go bindings for the updated proto. |
| api/next.txtpb | Updates API descriptor textproto for the new field. |
| RELEASES.md | Documents walking differ behavior vs. parallel unpack and the new overlay differ. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for _, o := range opts { | ||
| if err := o(&config); err != nil { | ||
| return nil |
There was a problem hiding this comment.
NewFileSystemApplier silently returns nil when an option returns an error. Returning a nil diff.Applier will typically lead to a panic later and also drops the original error context, making failures hard to debug. Consider changing the API to return (diff.Applier, error), or (if keeping the current signature) logging/panicking immediately so the root cause is visible.
| for _, o := range opts { | |
| if err := o(&config); err != nil { | |
| return nil | |
| for i, o := range opts { | |
| if err := o(&config); err != nil { | |
| panic(fmt.Errorf("NewFileSystemApplier option %d failed: %w", i, err)) |
There was a problem hiding this comment.
An error log message was added to aim the troubleshooting. which should be enough to address this concern.
There was a problem hiding this comment.
Might be better to return an error here rather than nil still.
There was a problem hiding this comment.
Adding an error to the return values would be a breaking change, which would require a new function. Do we want to do that?
@samuelkarp Sorry for the delay as I was out for the past few days. I've rebased this PR and resolved the conflicts. PTAL, thanks! |
| for _, o := range opts { | ||
| if err := o(&config); err != nil { | ||
| return nil |
There was a problem hiding this comment.
Might be better to return an error here rather than nil still.
| // The ApplierID can be empty if the unpack is performed on the client side, | ||
| // where the applier is just a proxy. In that case the diff service will try each | ||
| // applier in a configured order. | ||
| var denylist = []string{"walking"} |
There was a problem hiding this comment.
Thinking about this more: now that we have config version 4, it would be nice to add a migration for overlay + explicit walking + parallel to the new overlay applier. That way existing configs will continue to work on upgrade, and the migrate command should show the new correct value.
There was a problem hiding this comment.
Yep good point. I've added the migration logic
Signed-off-by: Henry Wang <[email protected]>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 25 out of 30 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 25 out of 30 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Henry Wang <[email protected]>
Signed-off-by: Henry Wang <[email protected]>
Signed-off-by: Henry Wang <[email protected]>
This reverts commit 3382fb7. Signed-off-by: Henry Wang <[email protected]>
Signed-off-by: Henry Wang <[email protected]>
| SyncFs: false, | ||
| } | ||
|
|
||
| func configMigration(ctx context.Context, configVersion int, pluginConfigs map[string]any) error { |
| } | ||
|
|
||
| func configMigration(ctx context.Context, configVersion int, pluginConfigs map[string]any) error { | ||
| if configVersion >= version.ConfigVersion { |
There was a problem hiding this comment.
Probably not just version.ConfigVersion, but the specific versions we're looking at for the migration. This migration is for moving up to version 4.
| var new []any | ||
| for _, v := range old { | ||
| new = append(new, v) | ||
| if v == "walking" { |
There was a problem hiding this comment.
It's currently (i.e., before this PR) valid to configure "walking" with "rebase". So it's not just the order that should be modified, but also the capabilities.
There was a problem hiding this comment.
rebase is a build-in feature of a snapshotter, which will automatically advertise the cap when it's initialized. currently there is no tight coupling between a differ and a snapshotter (but maybe it makes sense if there was?).
In terms of configuration, there are two cases:
- Image pull with transfer service: where users have the option to configure a combination of "snapshotter", "differ" and "unpack parallelism". In this case, parallel unpack is only enabled if a/ snapshotter supports "rebase"; b/ differ is
overlay/erofs; c/ configured concurrent unpack > 1; - Local image pull from client side: where client can only specify "snapshotter" and "unpack parallelism". The differ is invoked in the order configured in the
diffservice. With the change in this PR, image pull would fail if there is an explicitwalkingin the list. So i think a migration would help to mitigate this risk.
So i have the following questions:
- Could you help me to understand what's the actual migration use case you were referring to?
- Also given this PR is not part of 2.3, can we make the migration as a separate patch based on the possibility of a new config version number?
Let me know. thanks!
|
I'm removing this from the 2.3 milestone since we're currently working on getting the release candidate out: #13302. I think at this point we should aim for inclusion in 2.4. |
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
This is a series of patches that aims for a long term solution for #13030. It will eventually replace the short term workaround implemented by #13044.
The patch implements an overlay specific differ, which is also appended to the default differ list in the diff service, and the walking differ is blacklisted for parallel unpacking in v2.3.
In order to support local pull, I had to add a new
parallelfield toApplyRequestproto, so that differs under diff service can reject the request if it does not support parallel mode.This PR can be rebased on top of #12842 if that lands first.