Adds ability to flatten image after build #22641
Conversation
Hm, déjà vu #4232?

No, this is different. This is specifically for
But it also addresses the issue with image history. Also, maybe not

Another thing to consider is that, perhaps for security reasons, it would be undesirable for the image history to be preserved in the flattened image.

I wonder if it would make sense to also allow a number as an arg, so 0 is scratch, 1 is parent, etc., so you can leave e.g. two layers if you commonly use a pair of layers for builds (e.g. alpine plus your app base)?

@justincormack But then they can just create that image as the base.

@cpuguy83 Yes, I guess they would do that.

Thinking about this more, I'm wondering if we should make it default to squashing to the parent image. With content addressability you can no longer push/pull a build cache; as such, it doesn't make sense to keep all the extra layers as part of the image. We can then either provide an option to fully squash (to 1 layer)... or even defer such a decision.

We discussed this in the maintainers meetup, and there's no consensus yet. Things discussed:
Given that there's no consensus yet, we keep this in design review. We like the idea, however, so it stays open.

This would be nice for us. We have some security concerns around using SSH keys in our Dockerfile, so we're making a base image, using docker run to mount the SSH keys to run

@thaJeztah Hi, I'm interested in this feature; it's a must-have for a lot of projects, and last night I had a better idea of how to do it! We could add a new instruction modifier, similar to ONBUILD, named "SQUASH" for example. If an instruction is prefixed with SQUASH, the current command's filesystem is squashed together with the next instruction. This keeps the Dockerfile readable and lets us make "commits" to filesystem layers when we want.

We don't need to squash the whole image, because that is counterproductive in many deployment situations: if you squash everything, you upload and download the entire image every time, even if you already have 90% of the system on your server. We need to keep the ability to have more than one layer, and the ability to squash layers with docker itself. Today I achieve this by building an intermediate image and extending it with FROM, where the intermediate image is squashed. That means doing double work: I build the filesystem with docker, download it, unzip it, merge layers, zip it, and upload it back to docker. Not a great workflow.

I'd like to open a new PR with this feature, but first I need to check the source code to understand whether it's possible or too hard for me (I don't write Go, but I write in a lot of languages and it could be an interesting task for me :). What do you think? Before this idea I thought about a CLI flag too, but this approach would make the process very simple and manageable.

Just to work through an example to be sure I understand... So the

@docwhat

@docwhat I'm thinking about something like:

```dockerfile
FROM alpine                                   # will use all layers from parent image
SQUASH ADD /config/requirements.txt /requirements.txt
SQUASH RUN apt-get install -y python
RUN pip install /requirements.txt             # new layer here
SQUASH EXPOSE 80
SQUASH RUN /install-all.sh
SQUASH COPY foobar.tgz /foobar.tgz
RUN tar xf /foobar.tgz && rm -f /foobar.tgz   # new layer here
```

@cpuguy83 You already did it with the ONBUILD command, which 99% of users don't need :) My solution would allow making clean images without extra entries in the history (we can join the history messages into one in the resulting layer) and a very clean Dockerfile, without using third-party tools for a feature that belongs in docker. Right now you can only optimize your layers across multiple RUN commands; for EXPOSE, ADD, COPY, etc. we can't, and we can squash them only with third-party tools, which is not cool :( One big starting point here is that we already DO this. I just want to provide a better (and much faster) way to do it for the community.

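The "only across multiple RUN commands" point above refers to the one layer-control idiom available in a stock Dockerfile: chaining related shell steps into a single RUN so they produce one layer. A minimal sketch (package names are illustrative, not from this thread):

```dockerfile
FROM alpine
# One RUN, one layer: the build dependencies are installed, used, and removed
# within the same instruction, so they never persist in any layer. EXPOSE,
# ADD, and COPY offer no equivalent, which is what SQUASH would address.
RUN apk add --no-cache --virtual .build-deps gcc musl-dev \
    && touch /built-artifact \
    && apk del .build-deps
```
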
@vincentwoo Oh, thanks a bunch! A quick search yielded nothing on this flag before your reply.

Is there an indication of when this feature can be used in production?

@krm1 It's only in experimental because we are not sure that it is the right interface to expose. With it in experimental we can change it between versions (or replace it with something else).

@krm1 What's keeping you from using it now? As far as I can tell, the resulting artifact is a Docker image like all other Docker images. @cpuguy83 It really seems like the core concern for a

@cpuguy83 @alanbrent
No, experimental features would not be in the main code path; in addition, those features go through the same process as regular features. The difference is that the design of the feature may change, and we're accepting feedback to change the feature based on that. Experimental just gives us more freedom in that respect, knowing that we can make changes without causing a breaking change.

I think the question was about something else: does the

For now, the

Considering that one must explicitly opt in to this experimental squashing feature with

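For reference, here is how the daemon-level experimental opt-in and the squash option fit together, assuming a systemd host; paths and flags should be verified against your Docker version:

```shell
# Enable experimental features on the daemon (daemon-level opt-in).
echo '{ "experimental": true }' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker

# Build with the experimental squash option.
docker build --squash -t myapp:latest .
```
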
@JonathonReinhart not really,

I just used the new flag. Is there a way to use this flag with the automated build system, and/or will that be available once it is out of experimental?

Adds ability to flatten image after build
When running on a kernel which is not patched for the copy-up bug, overlay2 will use the naive diff driver. Cherry-picked from moby#28138, with some code backported from moby#22641.

Signed-off-by: Derek McGowan <[email protected]> (github: dmcgowan)
Signed-off-by: Lei Jitang <[email protected]>

Have you considered rewriting the Dockerfile to employ multi-stage building? I certainly understand that you may not want to radically change your Dockerfile, and/or that you have a large inventory of pre-existing Dockerfiles that might make this solution, at this time, prohibitively costly.

That said, the large reduction you noticed is mostly the result of defining what should be in your image by excluding, through deletion, what shouldn't be there. Exclusion can be problematic, especially when your build has been designed, intentionally or accidentally, to seamlessly "adapt" to new versions of tooling whose semantics have changed. For example, a compiler's set of exclusionary artifacts may be altered, extended, or relocated to a different path with the introduction of a new version, causing the statically defined delete operation that once eliminated these artifacts to ignore them. The unwanted artifacts then remain in the image, so the several-gigabyte reduction you attributed to squash may at some future point mysteriously reappear. Worse yet, failed exclusionary behavior may preserve an artifact that doesn't noticeably increase the image's size but presents a juicy exploit.

Therefore, instead of relying on an exclusionary mechanism, I would recommend encoding an inclusionary one. In adopting an inclusionary strategy, you must fully detail what you wish to exist in the resultant image. This is not as difficult as it sounds and has many benefits, including improving the security of your image and the resulting running container. In general, it's much easier to identify what needs to be included than excluded, as the desired outputs of build tooling represent its public interface, and this interface doesn't change as much as the tooling's private implementation. For example, a C++ project may have hundreds of object files and a small number of precompiled headers, all of which go into producing a single executable file. A developer may change the project's makefile to incorporate features from libraries, altering the build's implementation, but the end result of creating the desired executable is the same.

The recently introduced multi-stage feature allows the resultant image to be isolated from the other steps that build the final artifacts you wish to include (separation of concerns). It also provides a copy mechanism to transfer these desired final artifacts, like the executable mentioned above, from the build-polluted stages into the resultant image. Although I've personally been a proponent of multi-stage builds, I find the current implementation problematic. However, even in its current form, it offers superior facilities for building secure and minimally sized images than squash can ever achieve.

Given my assessment of squash, and a desire for "simplicity" by offering a single way to realize a solution, I would like to see squash - squashed, as multi-stage builds are much more capable of performing the same operation. Perhaps, if you read the tea leaves, you'll notice the accelerated development and deployment timeline of multi-stage support, as well as its speedy inclusion as a standard docker feature, while squash lingers in its experimental state. Please remember that I'm not a Docker maintainer and the musings above are my own, not Docker Inc.'s.

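To make the inclusionary approach concrete, here is a minimal multi-stage sketch; the image tags, paths, and the `myapp` name are illustrative:

```dockerfile
# Stage 1: build environment with the full toolchain, freely polluted
# by intermediate artifacts.
FROM golang:1.20 AS build
WORKDIR /src
COPY . .
RUN go build -o /out/myapp ./cmd/myapp

# Stage 2: resultant image; only the explicitly named artifact is copied in.
FROM alpine:3.18
COPY --from=build /out/myapp /usr/local/bin/myapp
ENTRYPOINT ["/usr/local/bin/myapp"]
```
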
Are there any indications of whether this will come out of experimental? I would really like to utilize this functionality, but I've no desire to enable any other experimental features. (I also have no interest in multi-stage builds at this time.)

- What I did

Allow built images to be squashed to their parent.
Squashing does not destroy any images or layers, and preserves the build cache.

- How I did it

Introduce a new CLI argument `--squash` to `docker build`.
Introduce a new param to the build API endpoint, `squash`.
Once the build is complete, docker creates a new image loading the diffs from each layer into a single new layer, and references all the parent's layers.

- How to verify it

Test the image: check for `/remove_me` being gone, make sure `hello\nworld` is in `/hello`, and make sure the `HELLO` env var's value is `world`.

- Description for the changelog

Add option to squash image layers to the `FROM` image after successful builds.

- A picture of a cute animal (not mandatory but encouraged)



Some of the implementation is a little rough around the edges, but I wanted to get this out there.
I also really wanted to make sure that when using the `full` option the user is prompted, to make sure they know what it's really doing.
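The verification steps in the description suggest a Dockerfile along these lines; this is a reconstruction for illustration, not the actual test fixture from the PR:

```dockerfile
FROM busybox
ENV HELLO world
RUN touch /remove_me
RUN printf 'hello\nworld' > /hello
RUN rm /remove_me
```

Built with the squash option, the resulting image should have `/remove_me` absent, `/hello` containing `hello\nworld`, and `HELLO=world` preserved in the image config.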