Fix SharedSize computation in ImageService.Image for filtered requests#42550
Fix SharedSize computation in ImageService.Image for filtered requests#42550thaJeztah merged 2 commits intomoby:masterfrom
ImageService.Image for filtered requests#42550Conversation
thaJeztah
left a comment
There was a problem hiding this comment.
overall looks good. I left some comments / suggestions; also probably ok to combine the general refactor changes to a single commit.
If you want to apply the changes I suggested inline; here's a diff:
diff --git a/daemon/images/images.go b/daemon/images/images.go
index 308a9ae39d..abe62f4bd2 100644
--- a/daemon/images/images.go
+++ b/daemon/images/images.go
@@ -83,8 +83,8 @@ func (i *ImageService) Images(imageFilters filters.Args, all bool, withExtraAttr
selectedImages = i.imageStore.Heads()
}
- summaries := make([]*types.ImageSummary, 0, len(selectedImages))
var (
+ summaries = make([]*types.ImageSummary, 0, len(selectedImages))
summaryMap map[*image.Image]*types.ImageSummary
allContainers []*container.Container
)
@@ -119,9 +119,8 @@ func (i *ImageService) Images(imageFilters filters.Args, all bool, withExtraAttr
continue
}
- layerID := img.RootFS.ChainID()
var size int64
- if layerID != "" {
+ if layerID := img.RootFS.ChainID(); layerID != "" {
l, err := i.layerStore.Get(layerID)
if err != nil {
// The layer may have been deleted between the call to `Map()` or
@@ -213,14 +212,8 @@ func (i *ImageService) Images(imageFilters filters.Args, all bool, withExtraAttr
rootFS.DiffIDs = nil
for _, id := range img.RootFS.DiffIDs {
rootFS.Append(id)
- chid := rootFS.ChainID()
-
- layerRefs[chid]++
- if _, ok := allLayers[chid]; !ok {
- return nil, fmt.Errorf("layer %v was not found (corruption?)", chid)
- }
+ layerRefs[rootFS.ChainID()]++
}
-
}
// Get Shared sizes
@@ -228,17 +221,20 @@ func (i *ImageService) Images(imageFilters filters.Args, all bool, withExtraAttr
rootFS := *img.RootFS
rootFS.DiffIDs = nil
+ // Indicate that we collected shared size information (default is -1, or "not set")
summary.SharedSize = 0
for _, id := range img.RootFS.DiffIDs {
rootFS.Append(id)
chid := rootFS.ChainID()
- diffSize, err := allLayers[chid].DiffSize()
- if err != nil {
- return nil, err
- }
-
if layerRefs[chid] > 1 {
+ if _, ok := allLayers[chid]; !ok {
+ return nil, fmt.Errorf("layer %v was not found (corruption?)", chid)
+ }
+ diffSize, err := allLayers[chid].DiffSize()
+ if err != nil {
+ return nil, err
+ }
summary.SharedSize += diffSize
}
}
daemon/images/images.go
Outdated
There was a problem hiding this comment.
Mostly an observation, but it's somewhat interesting that here, this is an "error", whereas in code above (layer.ErrLayerDoesNotExist), it's not an error.
There was a problem hiding this comment.
Possibly this check also makes more sense to be included in the for img, summary := range summaryMap { loop below, as that is the code that's actually using the allLayers map.
3f219de to
4ccebe3
Compare
daemon/images/images.go
Outdated
There was a problem hiding this comment.
Actually wondering now; if summary is unique per image, I think we can skip this whole withExtraAttrs block, and if summaryMap[img] already exists, we can instead pick the existing summary from imagesMap[img], and add it to summaries ? Something like
if withExtraAttrs {
if s, ok := summaryMap[img]; ok {
// Already calculated size for this image
summaries = append(summaries, summary)
continue
}
}(double check if I'm right though)
There was a problem hiding this comment.
Oh! I don't think that's correct; as this will be incrementing usage count 🤔
There was a problem hiding this comment.
Dang. Looking at the old code (which did layerRefs[chid]++ in this part); I think for the new code, it may be correct (we can possibly reuse the cache).
Man, this code is confusing 😂
There was a problem hiding this comment.
daemon/images/images.go
Outdated
There was a problem hiding this comment.
There's one thing I don't like about the code; this is not something new, but I've been looking at it when reviewing, and I thought I'd leave a comment about it (haven't thought about the best solution for this yet).
Here, we're constructing:
- a map of summaries, indexed by
img - a slice with summaries
I think the reasoning for the map is to de-duplicate multiple images that refer to the same actual "image" (so that we don't re-calculate the size multiple times).
The slice is what we return in the end, but in the code further down (withExtraAttrs), we are indirectly updating the slice by updating the map (?); (link to old code, but new code is similar);
Lines 239 to 241 in 314759d
I think that's confusing (not directly apparent what's happening); Not sure what a good solution is though.
There was a problem hiding this comment.
Let's file a separate ticket to look into this, since this PR is severely growing out of scope and it blocks #42531
b5d628c to
79a0ed9
Compare
ImageService.Image, fix SharedSize computation
79a0ed9 to
a5cf3ba
Compare
|
|
a5cf3ba to
2febcba
Compare
ImageService.Image, fix SharedSize computationImageService.Image
daemon/images/images.go
Outdated
There was a problem hiding this comment.
I'm unsure why this test moved downward (l 233) but if having a corrupted layer is really possible, then the earlier it is detected, the better. Every computation done between here and the new location will be wasted.
There was a problem hiding this comment.
@thaJeztah
I believe the reasoning is #42550 (comment)
daemon/images/images.go
Outdated
There was a problem hiding this comment.
Unrelated to this patch, the Append was alread done when counting layer references ?
There was a problem hiding this comment.
What this does, is that it computes a chain ID of each layer.
The chain ID is equal to diff ID for the bottom-most layer and for the rest it is given by hashing the parent chain ID and the current diff ID.
The logic is quite obfuscated - I added this commit to clear it up 78e0c73, but it will be merged in #42578 to keep the scope of this PR small(-er)
We could potentially cache these, but the computation is quite lightweight, so probably not much use of that.
There was a problem hiding this comment.
yes, this logic is somewhat confusing; what it comes down to is that rootFS.ChainID() is not a simple "getter", but is calculating the chain-id based on the DiffIDs; see
Lines 46 to 53 in 7b9275c
This code is basically doing a "shallow" copy of img.RootFS (to copy all other properties except for the DiffIDs), then adds the DiffIDs of the rootfs, and calls rootFS.ChainID() to calculate the ID.
Possibly we could call layer.CreateChainID(img.RootFS.DiffIDs) directly to get the chain-ID (after doing the windows-specific checks if needed), to make it clearer we're calculating something here.
(not for this PR though)
2febcba to
c337c0e
Compare
ImageService.ImageImageService.Image for filtered requests
- Rename image summary constructor
- Rename `newImage` into `newImageSummary`, since the returned type is
`*types.ImageSummary`
- Rename variables for clarity
- Rename `newImage` into `summary`, since the variable type is
`*types.ImageSummary`
- Rename `imagesMap` into `summaryMap`, since the value type
contained is `*types.ImageSummary`
- Only compute `DiffSize` when more than 1 reference to the layer
exists, since it is not used otherwise
- Move variable declarations closer to where they are used
Signed-off-by: Roman Volosatovs <[email protected]>
Co-authored-by: Sebastiaan van Stijn <[email protected]>
85a37a5 to
edb15c2
Compare
|
@AkihiroSuda @tonistiigi PTAL |
tonistiigi
left a comment
There was a problem hiding this comment.
First commit SGTM, we could merge that separately as well.
daemon/images/images.go
Outdated
There was a problem hiding this comment.
shouldn't this be just range i.imageStore.Map() and no danglingOnly check in lines above
There was a problem hiding this comment.
I guess you want to avoid calling Map() again for the non-dangling case. Maybe add a comment for it.
There was a problem hiding this comment.
Updated the comment
Signed-off-by: Roman Volosatovs <[email protected]>
edb15c2 to
af3e556
Compare
- What I did
ImageService.Imagesfor claritySharedSizecomputation for filtered image queries (prerequisite for Add support forshared-sizeparameter for images queries #42531)- How I did it
Partially applied thaJeztah@3e2193e and review comments
Regarding the fix: compute reference counts of layers used by all images and not only the filtered ones
- How to verify it
I tried to run both
docker imagesanddocker system dfonmasterand this changeset based on top - outputs seems to match