Skip to content

Fix SharedSize computation in ImageService.Image for filtered requests#42550

Merged
thaJeztah merged 2 commits intomoby:masterfrom
rvolosatovs:fix_image_shared_size
Jul 2, 2021
Merged

Fix SharedSize computation in ImageService.Image for filtered requests#42550
thaJeztah merged 2 commits intomoby:masterfrom
rvolosatovs:fix_image_shared_size

Conversation

@rvolosatovs
Copy link
Contributor

@rvolosatovs rvolosatovs commented Jun 22, 2021

- What I did

- How I did it
Partially applied thaJeztah@3e2193e and review comments
Regarding the fix: compute reference counts of layers used by all images and not only the filtered ones

- How to verify it
I tried to run both docker images and docker system df on master and this changeset based on top - outputs seems to match

@rvolosatovs
Copy link
Contributor Author

rvolosatovs commented Jun 22, 2021

blocking #42531 #42578

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks good. I left some comments / suggestions; also probably ok to combine the general refactor changes to a single commit.

If you want to apply the changes I suggested inline; here's a diff:

diff --git a/daemon/images/images.go b/daemon/images/images.go
index 308a9ae39d..abe62f4bd2 100644
--- a/daemon/images/images.go
+++ b/daemon/images/images.go
@@ -83,8 +83,8 @@ func (i *ImageService) Images(imageFilters filters.Args, all bool, withExtraAttr
 		selectedImages = i.imageStore.Heads()
 	}
 
-	summaries := make([]*types.ImageSummary, 0, len(selectedImages))
 	var (
+		summaries     = make([]*types.ImageSummary, 0, len(selectedImages))
 		summaryMap    map[*image.Image]*types.ImageSummary
 		allContainers []*container.Container
 	)
@@ -119,9 +119,8 @@ func (i *ImageService) Images(imageFilters filters.Args, all bool, withExtraAttr
 			continue
 		}
 
-		layerID := img.RootFS.ChainID()
 		var size int64
-		if layerID != "" {
+		if layerID := img.RootFS.ChainID(); layerID != "" {
 			l, err := i.layerStore.Get(layerID)
 			if err != nil {
 				// The layer may have been deleted between the call to `Map()` or
@@ -213,14 +212,8 @@ func (i *ImageService) Images(imageFilters filters.Args, all bool, withExtraAttr
 			rootFS.DiffIDs = nil
 			for _, id := range img.RootFS.DiffIDs {
 				rootFS.Append(id)
-				chid := rootFS.ChainID()
-
-				layerRefs[chid]++
-				if _, ok := allLayers[chid]; !ok {
-					return nil, fmt.Errorf("layer %v was not found (corruption?)", chid)
-				}
+				layerRefs[rootFS.ChainID()]++
 			}
-
 		}
 
 		// Get Shared sizes
@@ -228,17 +221,20 @@ func (i *ImageService) Images(imageFilters filters.Args, all bool, withExtraAttr
 			rootFS := *img.RootFS
 			rootFS.DiffIDs = nil
 
+			// Indicate that we collected shared size information (default is -1, or "not set")
 			summary.SharedSize = 0
 			for _, id := range img.RootFS.DiffIDs {
 				rootFS.Append(id)
 				chid := rootFS.ChainID()
 
-				diffSize, err := allLayers[chid].DiffSize()
-				if err != nil {
-					return nil, err
-				}
-
 				if layerRefs[chid] > 1 {
+					if _, ok := allLayers[chid]; !ok {
+						return nil, fmt.Errorf("layer %v was not found (corruption?)", chid)
+					}
+					diffSize, err := allLayers[chid].DiffSize()
+					if err != nil {
+						return nil, err
+					}
 					summary.SharedSize += diffSize
 				}
 			}

Comment on lines 219 to 220
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly an observation, but it's somewhat interesting that here, this is an "error", whereas in code above (layer.ErrLayerDoesNotExist), it's not an error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly this check also makes more sense to be included in the for img, summary := range summaryMap { loop below, as that is the code that's actually using the allLayers map.

@rvolosatovs rvolosatovs force-pushed the fix_image_shared_size branch from 3f219de to 4ccebe3 Compare June 28, 2021 14:34
@rvolosatovs rvolosatovs requested a review from thaJeztah June 28, 2021 14:35
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually wondering now; if summary is unique per image, I think we can skip this whole withExtraAttrs block, and if summaryMap[img] already exists, we can instead pick the existing summary from imagesMap[img], and add it to summaries ? Something like

if withExtraAttrs {
    if s, ok := summaryMap[img]; ok {
        // Already calculated size for this image
        summaries = append(summaries, summary)
        continue
    }
}

(double check if I'm right though)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! I don't think that's correct; as this will be incrementing usage count 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dang. Looking at the old code (which did layerRefs[chid]++ in this part); I think for the new code, it may be correct (we can possibly reuse the cache).

Man, this code is confusing 😂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 200 to 206
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's one thing I don't like about the code; this is not something new, but I've been looking at it when reviewing, and I thought I'd leave a comment about it (haven't thought about the best solution for this yet).

Here, we're constructing:

  • a map of summaries, indexed by img
  • a slice with summaries

I think the reasoning for the map is to de-duplicate multiple images that refer to the same actual "image" (so that we don't re-calculate the size multiple times).

The slice is what we return in the end, but in the code further down (withExtraAttrs), we are indirectly updating the slice by updating the map (?); (link to old code, but new code is similar);

if layerRefs[chid] > 1 {
newImage.SharedSize += diffSize
}

I think that's confusing (not directly apparent what's happening); Not sure what a good solution is though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's file a separate ticket to look into this, since this PR is severely growing out of scope and it blocks #42531

@rvolosatovs rvolosatovs force-pushed the fix_image_shared_size branch 5 times, most recently from b5d628c to 79a0ed9 Compare June 29, 2021 12:58
@rvolosatovs rvolosatovs changed the title Fix SharedSize computation on filtered image queries Refactor and optimize ImageService.Image, fix SharedSize computation Jun 29, 2021
@rvolosatovs rvolosatovs requested a review from thaJeztah June 29, 2021 13:23
@rvolosatovs rvolosatovs force-pushed the fix_image_shared_size branch from 79a0ed9 to a5cf3ba Compare June 29, 2021 13:24
@rvolosatovs
Copy link
Contributor Author

rvolosatovs commented Jun 29, 2021

I added some more clarity improvements and optimizations, but I think it's better to continue this work in another PR if other improvements are to be done to prevent the scope from growing even more.
Let's leave further refactoring and optimizations to another PR: #42578

@rvolosatovs rvolosatovs force-pushed the fix_image_shared_size branch from a5cf3ba to 2febcba Compare June 29, 2021 14:48
@rvolosatovs rvolosatovs changed the title Refactor and optimize ImageService.Image, fix SharedSize computation Fix SharedSize computation in ImageService.Image Jun 29, 2021
Copy link
Contributor

@fredericdalleau fredericdalleau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good imho

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure why this test moved downward (l 233) but if having a corrupted layer is really possible, then the earlier it is detected, the better. Every computation done between here and the new location will be wasted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thaJeztah
I believe the reasoning is #42550 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this patch, the Append was alread done when counting layer references ?

Copy link
Contributor Author

@rvolosatovs rvolosatovs Jun 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What this does, is that it computes a chain ID of each layer.
The chain ID is equal to diff ID for the bottom-most layer and for the rest it is given by hashing the parent chain ID and the current diff ID.

The logic is quite obfuscated - I added this commit to clear it up 78e0c73, but it will be merged in #42578 to keep the scope of this PR small(-er)

We could potentially cache these, but the computation is quite lightweight, so probably not much use of that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this logic is somewhat confusing; what it comes down to is that rootFS.ChainID() is not a simple "getter", but is calculating the chain-id based on the DiffIDs; see

moby/image/rootfs.go

Lines 46 to 53 in 7b9275c

// ChainID returns the ChainID for the top layer in RootFS.
func (r *RootFS) ChainID() layer.ChainID {
if runtime.GOOS == "windows" && r.Type == typeLayersWithBase {
logrus.Warnf("Layer type is unsupported on this platform. DiffIDs: '%v'", r.DiffIDs)
return ""
}
return layer.CreateChainID(r.DiffIDs)
}

This code is basically doing a "shallow" copy of img.RootFS (to copy all other properties except for the DiffIDs), then adds the DiffIDs of the rootfs, and calls rootFS.ChainID() to calculate the ID.

Possibly we could call layer.CreateChainID(img.RootFS.DiffIDs) directly to get the chain-ID (after doing the windows-specific checks if needed), to make it clearer we're calculating something here.

(not for this PR though)

@rvolosatovs rvolosatovs force-pushed the fix_image_shared_size branch from 2febcba to c337c0e Compare June 29, 2021 17:23
@rvolosatovs rvolosatovs requested a review from thaJeztah June 30, 2021 09:10
@rvolosatovs rvolosatovs changed the title Fix SharedSize computation in ImageService.Image Fix SharedSize computation in ImageService.Image for filtered requests Jun 30, 2021
- Rename image summary constructor
    - Rename `newImage` into `newImageSummary`, since the returned type is
      `*types.ImageSummary`
- Rename variables for clarity
    - Rename `newImage` into `summary`, since the variable type is
      `*types.ImageSummary`
    - Rename `imagesMap` into `summaryMap`, since the value type
      contained is `*types.ImageSummary`
- Only compute `DiffSize` when more than 1 reference to the layer
  exists, since it is not used otherwise
- Move variable declarations closer to where they are used

Signed-off-by: Roman Volosatovs <[email protected]>
Co-authored-by: Sebastiaan van Stijn <[email protected]>
@rvolosatovs rvolosatovs force-pushed the fix_image_shared_size branch from 85a37a5 to edb15c2 Compare June 30, 2021 09:34
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!!

@thaJeztah thaJeztah added this to the 21.xx milestone Jun 30, 2021
@thaJeztah
Copy link
Member

@AkihiroSuda @tonistiigi PTAL

Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First commit SGTM, we could merge that separately as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be just range i.imageStore.Map() and no danglingOnly check in lines above

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you want to avoid calling Map() again for the non-dangling case. Maybe add a comment for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the comment

@rvolosatovs rvolosatovs force-pushed the fix_image_shared_size branch from edb15c2 to af3e556 Compare July 2, 2021 09:47
@rvolosatovs rvolosatovs requested a review from tonistiigi July 2, 2021 09:48
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah thaJeztah merged commit ababae6 into moby:master Jul 2, 2021
@rvolosatovs rvolosatovs deleted the fix_image_shared_size branch July 2, 2021 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/refactor PR's that refactor, or clean-up code status/2-code-review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants