-
Notifications
You must be signed in to change notification settings - Fork 3.8k
Fix image pull and remove race #2331
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Derek McGowan <[email protected]>
Codecov Report
@@ Coverage Diff @@
## master #2331 +/- ##
=========================================
- Coverage 49.28% 44.98% -4.3%
=========================================
Files 84 92 +8
Lines 7424 9340 +1916
=========================================
+ Hits 3659 4202 +543
- Misses 3089 4459 +1370
- Partials 676 679 +3
Continue to review full report at Codecov.
|
rootfs/apply.go
Outdated
| if tries < 3 && errdefs.IsNotFound(err) { | ||
| if err := applyLayers(ctx, layers, chain, sn, a); err != nil { | ||
| if errdefs.IsUnavailable(err) { | ||
| time.Sleep(time.Second * time.Duration(tries)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to wait at a min 1 sec? Seems a little long but not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can simplify this and get rid of the unavailability checks by moving the key generation into the loop. Likely this scenario cannot be reconciled and is reliant on cleanup anyway.
rootfs/apply.go
Outdated
| mounts, err = sn.Prepare(ctx, key, parent.String(), opts...) | ||
| if err != nil { | ||
| tries++ | ||
| if tries < 3 && len(layers) > 1 && errdefs.IsNotFound(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is kinda complex, do we need the len(layers) part in this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That len(layers) check is to prevent recursion when called from rootfs.ApplyLayer. I tried to unify the functions, let me see if I can do this in a less complex well, else will document that.
client.go
Outdated
| } | ||
|
|
||
| if pullCtx.Unpack { | ||
| if err := img.Unpack(ctx, pullCtx.Snapshotter); err != nil { | ||
| errors.Wrapf(err, "failed to unpack image on snapshotter %s", pullCtx.Snapshotter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return nil, errors.Wrapf(err, "failed to unpack image on snapshotter %s", pullCtx.Snapshotter)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that isn't right #2011. I will make a seperate PR for that one.
|
Should we cherry-pick this? |
|
Offline discussed with @dmcgowan. Problem 1) and 3) are not a problem for We'll update containerd and use |
Update apply layers to recursive from the top layer. Update apply layer to check for exists and apply single layer. Signed-off-by: Derek McGowan <[email protected]>
Signed-off-by: Derek McGowan <[email protected]>
Signed-off-by: Derek McGowan <[email protected]>
8b711ae to
f0b3d5a
Compare
|
Is this ready to review again? |
|
LGTM |
estesp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
A few races exist today if deleting an image during pull that are addressed by this PR.
ctr image pulldoes not hold the same lease between fetch and unpackThis did expose a common problematic pattern of relying on stat object + use object in the client. There is no guarantee after a stat returns that the object will still exist, even if the stat returned the object existed. For mutation and creation operations we normally add the object to the lease to prevent cleanup, but nothing is done on read operations today.