Adjust "docker pull" for images that already exist to update blob mount possibilities #44757
Open
tianon wants to merge 1 commit into moby:master from
Conversation
Member
Author
If someone can help me know the best place to put them, I'm happy to try writing some integration tests too, but it's going to be a tiny bit complex. 😅
…nt possibilities

If we have the same image at `foo:bar` and `example.com/foo:bar` and we already have `foo:bar` locally and then pull `example.com/foo:bar`, Docker will update `RepoDigests` for the latter but **not** the blob mount sources data, so later pushes of images based on `foo:bar` to `example.com` will re-upload those layers, even though Docker itself **knows** they already exist and could/should blob mount them instead.

This shuffles that logic slightly so that they can be registered at the new name too. (This makes an especially large difference for big layers like Windows where re-uploading them is extremely expensive, not to mention unnecessarily potentially changing their registry digest due to recompression.)

Signed-off-by: Tianon Gravi <[email protected]>
09ce412 to
dee8902
Compare
Member
Author
(just a rebase)
Merged
cpuguy83
approved these changes
Apr 3, 2023
Member
cpuguy83
left a comment
From what I gather, this records metadata that the blob exists at the given registry reference so this can be used later for cross-blob mounts?
LGTM
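For readers unfamiliar with the mechanism discussed in this review: a cross-repository blob mount is a single request defined by the OCI Distribution Specification, `POST /v2/<name>/blobs/uploads/?mount=<digest>&from=<repository>`. The sketch below only builds that request so its shape is visible. The helper name `blob_mount_request` and the example values are illustrative assumptions, not part of moby or of this change.

```python
from urllib.parse import urlencode


def blob_mount_request(registry, target_repo, digest, source_repo):
    """Return (method, url) for a cross-repository blob mount.

    Hypothetical helper for illustration only. `source_repo` is the
    repository name inside the same registry, without the host part.
    """
    query = urlencode({"mount": digest, "from": source_repo})
    return "POST", f"https://{registry}/v2/{target_repo}/blobs/uploads/?{query}"


# Ask example.com to reuse a blob it already stores under "foo"
# for a push to "derived", instead of uploading the bytes again.
method, url = blob_mount_request("example.com", "derived", "sha256:abc", "foo")
print(method, url)
```

Per the specification, a registry answers `201 Created` when the mount succeeds and `202 Accepted` when it falls back to a regular upload session. The client can only send the request if it knows a source repository on the target registry that holds the blob, which is the metadata this change records.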
Member
Author
Yep, that's exactly right -- recording the repo during
If we have the same image at `foo:bar` and `example.com/foo:bar` and we already have `foo:bar` locally and then pull `example.com/foo:bar`, Docker will update `RepoDigests` for the latter but not the blob mount sources data, so later pushes of images based on `foo:bar` to `example.com` will re-upload those layers, even though Docker itself knows they already exist and could/should blob mount them instead.

This shuffles that logic slightly so that they can be registered at the new name too. (This makes an especially large difference for big layers like Windows where re-uploading them is extremely expensive, not to mention unnecessarily potentially changing their registry digest due to recompression.)
My own use case here is that I'm doing `docker load` on some layers that are already pushed to my target registry, then building new things on top and pushing those to my target registry, and Docker doesn't know that the layers already exist so it's re-pushing them (using tar-split so the `diff_ids` are the same, but it has to recompress so the actual content digest does change, which is unnecessarily confusing and unnecessarily wasting registry storage). With this change, I can `docker load` my local copy, then no-op `docker pull` the remote copy to prime Docker's knowledge of these layers existing so that the `docker push` of my new images based on them does blob mounts instead of recompressing and repushing! 🥳
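The bookkeeping described above can be modeled with a small toy, shown here as a hedged sketch rather than moby's actual code. It assumes a per-layer record of (registry digest, source repository) pairs keyed by `diff_id`. The names `MountSourceStore`, `registry_of`, and `pull`, and the placeholder digests, are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class BlobSource(NamedTuple):
    digest: str      # digest of the compressed blob as stored in the registry
    repository: str  # repository the blob is known to exist in


def registry_of(repository):
    """Simplified registry lookup: a first path component that looks like a
    host (contains '.' or ':', or is 'localhost') is the registry; anything
    else is treated as Docker Hub."""
    first, sep, _ = repository.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


class MountSourceStore:
    """Toy stand-in for the per-layer record of where a blob already exists."""

    def __init__(self):
        self._sources = defaultdict(list)

    def register(self, diff_id, digest, repository):
        source = BlobSource(digest, repository)
        if source not in self._sources[diff_id]:
            self._sources[diff_id].append(source)

    def mount_candidates(self, diff_id, target_repository):
        """Known sources on the same registry as the push target."""
        registry = registry_of(target_repository)
        return [s for s in self._sources.get(diff_id, [])
                if registry_of(s.repository) == registry]


def pull(store, repository, layers, local_layers):
    """Record the source repository for every layer of the pulled image,
    including layers that already exist locally (the behavior this PR adds)."""
    for diff_id, digest in layers:
        if diff_id not in local_layers:
            local_layers.add(diff_id)  # stand-in for downloading the layer
        store.register(diff_id, digest, repository)


store, local = MountSourceStore(), set()
layers = [("sha256:diff-a", "sha256:blob-a")]  # placeholder identifiers
pull(store, "foo", layers, local)              # layer arrives locally first
pull(store, "example.com/foo", layers, local)  # no-op pull still registers
candidates = store.mount_candidates("sha256:diff-a", "example.com/derived")
print(candidates)
```

In this model, the second `pull` downloads nothing but still gives a later push to `example.com` a source it can mount from. If `register` were skipped for layers that already exist locally, `candidates` would be empty and the push would have to upload the layer again.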