Adjust "docker pull" for images that already exist to update blob mount possibilities #44757

Open
tianon wants to merge 1 commit into moby:master from tianon:noop-pull-blob-mount

tianon commented Jan 5, 2023

If we have the same image at foo:bar and example.com/foo:bar and we already have foo:bar locally and then pull example.com/foo:bar, Docker will update RepoDigests for the latter but not the blob mount sources data, so later pushes of images based on foo:bar to example.com will re-upload those layers, even though Docker itself knows they already exist and could/should blob mount them instead.

This shuffles that logic slightly so that the layers can be registered under the new name too. (This makes an especially large difference for big layers, such as Windows layers, where re-uploading is extremely expensive, not to mention that recompression can needlessly change their registry digest.)

My own use case here is that I'm doing docker load on some layers that are already pushed to my target registry, then building new things on top and pushing those to my target registry. Docker doesn't know that the layers already exist there, so it re-pushes them. (It uses tar-split so the diff_ids are the same, but it has to recompress, so the actual content digest does change, which is unnecessarily confusing and wastes registry storage.) With this change, I can docker load my local copy, then no-op docker pull the remote copy to prime Docker's knowledge of these layers existing, so that the docker push of my new images based on them does blob mounts instead of recompressing and re-pushing! 🥳
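To make the before/after behaviour concrete, here is a minimal toy model in Python. It is not the daemon's actual code: `BlobSourceStore`, `pull`, `plan_push`, and the `register_existing` switch are hypothetical names invented for illustration, the digests are placeholders, and pairing each diff_id directly with a blob digest is a simplification.

```python
class BlobSourceStore:
    """Toy stand-in for the daemon's record of where each layer is known to exist."""

    def __init__(self):
        # diff_id -> set of (repository, compressed blob digest)
        self._sources = {}

    def register(self, diff_id, repository, blob_digest):
        self._sources.setdefault(diff_id, set()).add((repository, blob_digest))

    def mount_candidates(self, diff_id, target_registry):
        """Return the known sources for diff_id that live on target_registry."""
        return sorted(
            (repo, digest)
            for repo, digest in self._sources.get(diff_id, ())
            if repo.split("/", 1)[0] == target_registry
        )


def pull(store, local_layers, repository, manifest_layers, register_existing):
    """Simulate a pull; manifest_layers is a list of (diff_id, blob_digest)."""
    for diff_id, blob_digest in manifest_layers:
        if diff_id in local_layers:
            # The layer is already on disk, so nothing is downloaded.
            # register_existing=True models the behaviour proposed here.
            if register_existing:
                store.register(diff_id, repository, blob_digest)
            continue
        local_layers.add(diff_id)
        store.register(diff_id, repository, blob_digest)


def plan_push(store, target_registry, diff_ids):
    """Decide per layer whether a push could blob mount or must upload."""
    return {
        d: "mount" if store.mount_candidates(d, target_registry) else "upload"
        for d in diff_ids
    }


# A layer that arrived via "docker load": present locally, no registry source.
manifest = [("sha256:aaa", "sha256:111")]

before = BlobSourceStore()
pull(before, {"sha256:aaa"}, "example.com/foo", manifest, register_existing=False)
print(plan_push(before, "example.com", ["sha256:aaa"]))  # {'sha256:aaa': 'upload'}

after = BlobSourceStore()
pull(after, {"sha256:aaa"}, "example.com/foo", manifest, register_existing=True)
print(plan_push(after, "example.com", ["sha256:aaa"]))  # {'sha256:aaa': 'mount'}
```

In this model the no-op pull is the only thing that teaches the store about `example.com/foo`, which is why the later push can plan a mount instead of an upload.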

tianon commented Jan 5, 2023

If someone can help me know the best place to put them, I'm happy to try writing some integration tests too, but it's going to be a tiny bit complex. 😅

thaJeztah added the area/distribution, status/2-code-review, and kind/enhancement labels Jan 6, 2023
…nt possibilities

If we have the same image at `foo:bar` and `example.com/foo:bar` and we already have `foo:bar` locally and then pull `example.com/foo:bar`, Docker will update `RepoDigests` for the latter but **not** the blob mount sources data, so later pushes of images based on `foo:bar` to `example.com` will re-upload those layers, even though Docker itself **knows** they already exist and could/should blob mount them instead.

This shuffles that logic slightly so that they can be registered at the new name too.  (This makes an especially large difference for big layers like Windows where re-uploading them is extremely expensive, not to mention unnecessarily potentially changing their registry digest due to recompression.)

Signed-off-by: Tianon Gravi <[email protected]>
tianon force-pushed the noop-pull-blob-mount branch from 09ce412 to dee8902 January 9, 2023 19:06
tianon commented Jan 9, 2023

(just a rebase)

tianon mentioned this pull request Mar 24, 2023

cpuguy83 left a comment

From what I gather, this records metadata that the blob exists at the given registry reference so this can be used later for cross-blob mounts?

LGTM

tianon commented Apr 18, 2023

Yep, that's exactly right -- recording the repo during docker pull even if we have the blob already so that it can be used during push (because we might've gotten the blob out of band).
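For context on what the recorded repository enables at push time, the sketch below builds the cross-repository blob mount request described in the OCI Distribution Specification: a POST to the target repository's upload endpoint with `mount` and `from` query parameters. The helper names `blob_mount_url` and `interpret_mount_status`, the `https` scheme, and the repository names and digest are assumptions for illustration, not code from this PR.

```python
from urllib.parse import urlencode


def blob_mount_url(registry, target_repo, blob_digest, source_repo):
    """Build the URL for a cross-repository blob mount request (sent as a POST)."""
    query = urlencode({"mount": blob_digest, "from": source_repo})
    return f"https://{registry}/v2/{target_repo}/blobs/uploads/?{query}"


def interpret_mount_status(status_code):
    """Map the registry's response code to what the pusher has to do next."""
    if status_code == 201:
        # 201 Created: the blob was mounted, nothing needs to be uploaded.
        return "mounted"
    if status_code == 202:
        # 202 Accepted: no mount happened; a regular upload session was opened.
        return "upload"
    return "error"


# "foo" is the repository recorded during the no-op pull; "bar" is the push target.
print(blob_mount_url("example.com", "bar", "sha256:111", "foo"))
```

The `from` value is exactly the piece of information that is missing when the blob arrived out of band, which is why recording it during the pull matters.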
