
Fix lease management during pull and export#48910

Merged
thaJeztah merged 3 commits into moby:master from dmcgowan:containerd-lease-cleanup
Nov 26, 2024

Conversation

@dmcgowan
Member

@dmcgowan dmcgowan commented Nov 20, 2024

Ensure that leases have a reasonable expiration and are cleaned up during prune

Currently, it seems some leases could be leaked: they were created with no expiration, and their removal could fail due to using a canceled context.

- Description for the changelog

containerd image-store: fix partially pulled images not being garbage-collected [moby#48910](https://github.com/moby/moby/pull/48910)

Contributor

@vvoland vvoland left a comment


LGTM, thanks!

Comment on lines +64 to +78
// Prune leases
leaseManager := i.client.LeasesService()
pullLeases, err := leaseManager.List(ctx, `labels."moby/action.pull"`)
if err != nil {
	return nil, err
}
for i, lease := range pullLeases {
	opts := []leases.DeleteOpt{}
	if i == len(pullLeases)-1 {
		opts = append(opts, leases.SynchronousDelete)
	}
	if err := leaseManager.Delete(ctx, lease, opts...); err != nil {
		return nil, err
	}
}
Contributor


For a follow up - not sure how feasible that is, but it would be nice if we'd be able to determine how much disk space was freed.

@vvoland vvoland added kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. kind/bugfix PR's that fix bugs containerd-integration Issues and PRs related to containerd integration impact/changelog labels Nov 21, 2024
@vvoland vvoland added this to the 28.0.0 milestone Nov 21, 2024
@vvoland vvoland added the area/images Image Distribution label Nov 21, 2024
Member

@thaJeztah thaJeztah left a comment


overall LGTM

Left some comments; nothing really blocking (I think), but possibly some for follow-ups. The timeout is the one I'm wondering whether we should shorten as part of this PR (but open to input).

}
defer func() {
-	if err := leasesManager.Delete(ctx, lease); err != nil {
+	if err := done(context.WithoutCancel(ctx)); err != nil {
Member


Silly question (I'd have to look at all the places it's used): are there done() functions returned by such functions that should always be called with WithoutCancel()?

I'm considering whether there are cases where the returned done function should handle the WithoutCancel parts itself, so that consumers don't have to deal with those.

Member Author


In the export case there really isn't a need to hold onto anything since even if the operation is canceled, it will need to restart from the beginning on retry. For the pull case, a cancel may want to keep progress so the context error could be respected. We could make this more explicit and check for a context error rather than let the function error from it.

Comment on lines +43 to +44
// How long downloaded content will be held after cancellation
expireAt := time.Now().Add(8 * time.Hour)
Member


Do we want a const for this?

I'm also considering whether this should be even shorter; I somewhat expect the "resume previous pull" to happen in a much shorter timeframe (1-2 hours?):

  • starting a pull and realising you're pulling the wrong image, or (e.g. in CI) a test failed, so the pull was cancelled
  • the next attempt to pull will come either immediately after, or after fixing whatever caused the pull to be cancelled

Are these partial pulls only completed layers (pull for a layer completed, but not for the full image), or can these also be "partial downloads"?

Member Author


It should cover partial downloads of the blobs. Any partially unpacked snapshots should be discarded, as I believe we always use a unique identifier for unpack.

I'm adding a const. We could also make this configurable in the future if we want a smaller value. While it's static, it really has to cover the upper bound of possible pull times.

ctx, done, err := i.client.WithLease(ctx,
	leases.WithRandomID(),
	leases.WithLabels(map[string]string{
		"moby/action.pull": baseRef.String(),
Member


Should we have a const for this one? It's currently only used in two places, but I'm not sure if we should (or will) be using the same label in other locations. Do we know if BuildKit also sets such labels, and if these are things we should align?

Member Author


Maybe later we can; the other spot we use the label is in a filter string, which requires extra formatting. I prefer to just see the value, and we can either add a utility later or ensure they remain consistent with a test.

	leases.WithRandomID(),
	leases.WithLabels(map[string]string{
		"moby/action.pull":        baseRef.String(),
		"containerd.io/gc.expire": expireAt.Format(time.RFC3339),
Member


Hm... I was surprised that these were not exported as either a const or as a utility in containerd; should they be?

labelGCRoot       = []byte("containerd.io/gc.root")
labelGCRef        = []byte("containerd.io/gc.ref.")
labelGCSnapRef    = []byte("containerd.io/gc.ref.snapshot.")
labelGCContentRef = []byte("containerd.io/gc.ref.content")
labelGCExpire     = []byte("containerd.io/gc.expire")
labelGCFlat       = []byte("containerd.io/gc.flat")

I see containerd v2 added some extra documentation to some of these to outline the required format for values set; https://github.com/containerd/containerd/blob/f0ebbd3c8a662656b458e7e1cca059a690d557a7/core/metadata/gc.go#L69-L76

	// labelGCExpire indicates that an object is collectible after the
	// provided time. For image objects, this makes them available to
	// garbage collect when expired, when not provided, image objects
	// are root objects that never expire. For non-root objects such
	// as content or snapshots, these objects will be treated like
	// root objects before their expiration.
	// Expected format is RFC 3339
	labelGCExpire = []byte("containerd.io/gc.expire")

So maybe a utility would make the most sense:

  • leases.WithGCExpireAt(time.Time) (or something) to make sure it's in the correct format
  • Some of the other ones may require a specific format as well, so perhaps those could use a validating utility too (instead of a plain const for the right label)?

Member Author


There is a utility; here it is just to be specific. There is no logic around it, though, and the format doesn't change; it seemed cleaner to clearly see the labels that are getting set.

Member


Oh, I see there's indeed a WithExpiration;

// WithExpiration sets an expiration on the lease
func WithExpiration(d time.Duration) Opt {
	return func(l *Lease) error {
		if l.Labels == nil {
			l.Labels = map[string]string{}
		}
		l.Labels["containerd.io/gc.expire"] = time.Now().Add(d).Format(time.RFC3339)
		return nil
	}
}

Yeah, mostly thinking that using a consistent approach may help in finding related things. In this case, I checked where we're setting these, and found these;

git grep '\.WithExpiration' -- ':!vendor'
builder/builder-next/adapters/containerimage/pull.go:   ctx, done, err := leaseutil.WithLease(ctx, p.is.LeaseManager, leases.WithExpiration(5*time.Minute), leaseutil.MakeTemporary)
daemon/containerd/image_builder.go:     ctx, _, err := i.client.WithLease(ctx, leases.WithRandomID(), leases.WithExpiration(1*time.Hour))
daemon/containerd/image_builder.go:             leases.WithExpiration(time.Hour*24),
daemon/containerd/image_builder.go:     ctx, release, err := i.client.WithLease(ctx, leases.WithRandomID(), leases.WithExpiration(1*time.Hour))
daemon/containerd/image_commit.go:      ctx, release, err := i.client.WithLease(ctx, leases.WithRandomID(), leases.WithExpiration(1*time.Hour))
daemon/images/image_pull.go:            leases.WithExpiration(24 * time.Hour),

Interestingly it looks like for non-containerd image-pulls, we still use 24 hours, but a different label;

// Use an expiration that ensures the lease is cleaned up at some point if there is a crash, SIGKILL, etc.
opts := []leases.Opt{
	leases.WithRandomID(),
	leases.WithExpiration(24 * time.Hour),
	leases.WithLabels(map[string]string{
		"moby.lease/temporary": time.Now().UTC().Format(time.RFC3339Nano),
	}),
}

And for classic builder with containerd as well, and again with a different label;

func createLease(ctx context.Context, lm leases.Manager) (context.Context, leases.Lease, error) {
	lease, err := lm.Create(ctx,
		leases.WithExpiration(time.Hour*24),
		leases.WithLabels(map[string]string{
			"org.mobyproject.lease.classicbuilder": "true",
		}),
	)

Perhaps those should be corrected as well?

Member Author


I'm not going to touch the ones outside of daemon/containerd for now; those might be worth looking at, but they're not related to the issues we are seeing with the containerd integration. I added a withLease method to our own image service that adds the labels and logic consistent with pruning.

@dmcgowan
Member Author

Looks unrelated; can someone kick off a re-run?

=== Failed
=== FAIL: github.com/docker/docker/integration/networking TestPortMappedHairpinWindows (12.48s)
    nat_windows_test.go:108: assertion failed: error is not nil: Post "http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.48/containers/7a294dab5fdc253f624b2e5202567ade53acdda4394217cb4fd71888e0994404/start": context deadline exceeded
    panic.go:629: assertion failed: error is not nil: Error response from daemon: error while removing network: network clientnet id 1afe2b3927c272b4aea4d92b6b95345f031e76ad126f8651a6af134f10da0483 has active endpoints

DONE 14 tests, 9 skipped, 1 failure in 33.137s


ctx = leases.WithLease(ctx, l.ID)
return ctx, func() {
	if ctx.Err() != nil {
Member


This seems racy. It seems like we should always use context.WithoutCancel, regardless of whether the context is cancelled.

Member Author


Updated

pruneLeaseFilter = `labels."moby/prune.images"`
)

func (i *ImageService) withLease(ctx context.Context, cancellable bool) (context.Context, func(), error) {
Member


Can you help me understand the reason one would set cancellable to true?

Member Author


Resumable pull

@dmcgowan dmcgowan force-pushed the containerd-lease-cleanup branch from d29fe20 to c57c9c6 Compare November 22, 2024 22:28
@vvoland
Contributor

vvoland commented Nov 25, 2024

    build_userns_linux_test.go:132: run produced invalid output: "", expected "/bin/sleep cap_net_bind_service=eip"
--- FAIL: TestBuildUserNamespaceValidateCapabilitiesAreV2 (15.97s)

Looks like this test became flaky recently 🤔

@thaJeztah
Member

@vvoland yeah, I think it's one of the tests that became flaky related to the runc 1.2 updates, and there were some remaining patches that didn't get merged (not sure if all of them were good though); #48160 (comment)

@thaJeztah
Member

var outNewImg containerd.Image

if oldImage.Target.Digest != "" {
	// Lease the old image content to prevent it from being garbage collected until we keep it as a dangling image.
Member


Is this comment still useful to preserve?

Member Author


I could add a comment to the withLease function, but that is the whole point of the lease; it seems redundant to describe what a lease is everywhere it is used.

Member


Yeah, this one felt like it added additional context to why we're holding on to the old image (which may not be immediately obvious).

Comment on lines -496 to -497
// Necessary to prevent the contents from being GC'd
// between writing them here and creating an image
Member


Same for this one?

Comment on lines +58 to 63
// TODO: Consider a better way to do this. It is better to have a container directly
// reference a snapshot, however, that is not done today because a container may
// removed and recreated with nothing holding the snapshot in between. Consider
// removing this lease and only temporarily holding a lease on re-create, using
// non-expiring leases introduces the possibility of leaking resources.
ls := i.client.LeasesService()
Member


Looks like the last commit now only adds this comment, but the commit message says;

Add label on snapshotter to warn about non-expiring leases

Ensure that leases have a reasonable expiration and are
cleaned up during prune

Signed-off-by: Derek McGowan <[email protected]>
Currently, when preparing a snapshot for a container, a lease is used to
hold that snapshot for the lifespan of the container. That is a workaround
to preserve the snapshot when a container is recreated; however, the
containerd object should be able to hold this reference itself.

Signed-off-by: Derek McGowan <[email protected]>
@dmcgowan dmcgowan force-pushed the containerd-lease-cleanup branch from c57c9c6 to 4becdac Compare November 25, 2024 14:44
Member

@thaJeztah thaJeztah left a comment


LGTM


Labels

area/images Image Distribution containerd-integration Issues and PRs related to containerd integration impact/changelog kind/bugfix PR's that fix bugs kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. process/cherry-picked status/4-merge

Projects

Development

Successfully merging this pull request may close these issues.

Containerd snapshotter saves layers that can't be cleared easily

4 participants