
Fix PushHandler cannot push image that contains duplicated blobs#5379

Merged
estesp merged 1 commit into containerd:master from ktock:fix-push-race
Apr 20, 2021
Conversation

@ktock
Member

@ktock ktock commented Apr 16, 2021

Fixes: #2706

Currently, remotes.PushHandler fails to push images that contain duplicated blobs. When multiple parallel calls to dockerPusher.Push occur with the same digest, each call races to update the StatusTracker. This corrupts the Status of the upload, and dockerPusher.Commit() occasionally fails.

This commit tries to solve this issue without breaking backward compatibility.

This commit introduces a new backward-compatible extension of StatusTracker called StatusTrackLocker, which implements a named mutex to ensure atomic updates.

Additionally, this commit adds the content.Ingester API to dockerPusher. If there is already an ongoing upload of a blob with ref "A", a second call with "A" returns ErrUnavailable (this behaviour follows the local content store's Ingester.Writer). PushHandler then retries getting the writer using content.OpenWriter. This is similar to the way remotes.FetchHandler handles duplicated blobs.

Test

The following image, used in the CRI tests, has many duplicated blobs.

gcr.io/k8s-staging-cri-tools/test-image-tags:1@sha256:be918dfb542a9fc19bc7acc7f2687de1a1657e949e6feb48b93b83ddfd36aa4f
Manifest index of the image:

$ crane manifest gcr.io/k8s-staging-cri-tools/test-image-tags:1@sha256:be918dfb542a9fc19bc7acc7f2687de1a1657e949e6feb48b93b83ddfd36aa4f
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 733,
         "digest": "sha256:982e9d1558d1ae505d6fa5feed582fc68022d241a9a2737a7c305b8ad028a432",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 733,
         "digest": "sha256:982e9d1558d1ae505d6fa5feed582fc68022d241a9a2737a7c305b8ad028a432",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 733,
         "digest": "sha256:982e9d1558d1ae505d6fa5feed582fc68022d241a9a2737a7c305b8ad028a432",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 733,
         "digest": "sha256:e802c171fa975dbf6ff94ce6534fdec9ef553ef548c14a3c519cc66305565128",
         "platform": {
            "architecture": "arm",
            "os": "linux",
            "variant": "v7"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 733,
         "digest": "sha256:e802c171fa975dbf6ff94ce6534fdec9ef553ef548c14a3c519cc66305565128",
         "platform": {
            "architecture": "arm",
            "os": "linux",
            "variant": "v7"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 733,
         "digest": "sha256:e802c171fa975dbf6ff94ce6534fdec9ef553ef548c14a3c519cc66305565128",
         "platform": {
            "architecture": "arm",
            "os": "linux",
            "variant": "v7"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 733,
         "digest": "sha256:fa6b7552b66b12f9c55fc22aad1787fc46bf12404ed691e1597fdf07e4f3b4fb",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 733,
         "digest": "sha256:fa6b7552b66b12f9c55fc22aad1787fc46bf12404ed691e1597fdf07e4f3b4fb",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 733,
         "digest": "sha256:fa6b7552b66b12f9c55fc22aad1787fc46bf12404ed691e1597fdf07e4f3b4fb",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 733,
         "digest": "sha256:fbcd9fac39a14e52d5785e9bfdbd8d14a440dc2ba9dba974ec38a1c8a52cb6f5",
         "platform": {
            "architecture": "mips64le",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 733,
         "digest": "sha256:fbcd9fac39a14e52d5785e9bfdbd8d14a440dc2ba9dba974ec38a1c8a52cb6f5",
         "platform": {
            "architecture": "mips64le",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 733,
         "digest": "sha256:fbcd9fac39a14e52d5785e9bfdbd8d14a440dc2ba9dba974ec38a1c8a52cb6f5",
         "platform": {
            "architecture": "mips64le",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 734,
         "digest": "sha256:43233a800b014c05c46cae348d6a8ad67373c4820e31c732de36f617ce033fc0",
         "platform": {
            "architecture": "ppc64le",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 734,
         "digest": "sha256:43233a800b014c05c46cae348d6a8ad67373c4820e31c732de36f617ce033fc0",
         "platform": {
            "architecture": "ppc64le",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 734,
         "digest": "sha256:43233a800b014c05c46cae348d6a8ad67373c4820e31c732de36f617ce033fc0",
         "platform": {
            "architecture": "ppc64le",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 734,
         "digest": "sha256:ab312d9ed299313abb11ee9a14f7bddc3beadcc20d35fb9402dafa00c9944244",
         "platform": {
            "architecture": "s390x",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 734,
         "digest": "sha256:ab312d9ed299313abb11ee9a14f7bddc3beadcc20d35fb9402dafa00c9944244",
         "platform": {
            "architecture": "s390x",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 734,
         "digest": "sha256:ab312d9ed299313abb11ee9a14f7bddc3beadcc20d35fb9402dafa00c9944244",
         "platform": {
            "architecture": "s390x",
            "os": "linux"
         }
      }
   ]
}

ctr i push fails with the master version of containerd:

# ctr version
Client:
  Version:  v1.5.0-rc.1-33-gf0890f9b3
  Revision: f0890f9b3a6a2fd22c427b3544ab6d774685e046
  Go version: go1.15.11

Server:
  Version:  v1.5.0-rc.1-33-gf0890f9b3
  Revision: f0890f9b3a6a2fd22c427b3544ab6d774685e046
  UUID: 7d22be4a-0a96-470b-a8a5-b208e2707c41
# ctr i pull --all-platforms gcr.io/k8s-staging-cri-tools/test-image-tags:1@sha256:be918dfb542a9fc19bc7acc7f2687de1a1657e949e6feb48b93b83ddfd36aa4f
...(omit)...
# ctr i tag gcr.io/k8s-staging-cri-tools/test-image-tags:1@sha256:be918dfb542a9fc19bc7acc7f2687de1a1657e949e6feb48b93b83ddfd36aa4f registry2:5000/test-image-tags:1
registry2:5000/test-image-tags:1
# ctr i push --plain-http registry2:5000/test-image-tags:1
ctr: failed commit on ref "layer-sha256:ef7b5b170c36ba0399faf34e9dd62aa6f65b6d5189d63f114d10e009f421718f": failed to do request: Put "http://registry2:5000/v2/test-image-tags/blobs/uploads/(...omit...)digest=sha256%3Aef7b5b170c36ba0399faf34e9dd62aa6f65b6d5189d63f114d10e009f421718f": net/http: HTTP/1.x transport connection broken: http: ContentLength=105 with Body length 0

But the above succeeds with this PR.

@theopenlab-ci

theopenlab-ci bot commented Apr 16, 2021

Build succeeded.


@samuelkarp
Member

Additionally, this commit introduces content.Ingester API to dockerPusher. If there is already an ongoing upload of blob with ref "A", the second call with "A" returns ErrUnavailable (This behaviour follows the local content store's Ingester.Writer). PushHandler retries to get the writer using content.OpenWriter.

From this description and the code, it sounds like ErrUnavailable is necessary to avoid duplicate pushes, but the only containerd bug is the concurrent updates to StatusTracker; if the updated dockerPusher were still used by the old remotes.PushHandler, without special handling for a content.Ingester, the bug would still be resolved. Is that an accurate understanding?

If so, would you mind updating the documentation on Resolver.Pusher to indicate that the returned Pusher may also satisfy content.Ingester and doing so can help avoid duplicate blob pushes?

@ktock
Member Author

ktock commented Apr 19, 2021

@samuelkarp Thanks for reviewing this.

if the updated dockerPusher is still used by the old remotes.PushHandler without special handling for a content.Ingester the bug would still be resolved. Is that an accurate understanding?

No, it isn't, because the duplication of pushes is also a source of this reported bug.

For example, if we have a 20-byte blob (ref = "A") and multiple (duplicated) calls to upload it occur, the following happens:

pushWriter X (ref = "A") does SetStatus("A", State{offset = 10, total = 20}) during pushWriter.Write; later, the slower pushWriter Y (ref = "A") does SetStatus("A", State{offset = 5, total = 20}) during its own pushWriter.Write. A race condition therefore happens on the State of "A" in StatusTracker (i.e. the offset is updated inconsistently).

This PR tries to solve this by de-duplicating push (i.e. introducing Ingester API).

Another possible (ideal) solution is to give a unique ref to each push, even if they point to the same blob. But that solution would break backward compatibility, because StatusTracker and the ref naming convention are public (clients who configure remotes.ResolverOptions.Tracker expect the current naming convention).

The lock from StatusTrackLocker held in dockerPusher.Writer only makes the following procedure atomic:

  • push checks for an existing State in StatusTracker by calling GetStatus
  • if there is already an ongoing push, it returns ErrUnavailable
  • if not, it reserves this ref in StatusTracker by calling SetStatus

This prevents the race condition that can happen while deciding which push to deduplicate, like the following (see also the code snippet):

Multiple pushes with the same ref "A" execute GetStatus("A") at the same time; all of them see err == ErrNotFound and each gets its own pushWriter, which then race to update the State of "A" in StatusTracker.

 func (p dockerPusher) push(ctx context.Context, desc ocispec.Descriptor, ref string, unavailableOnFail bool) (content.Writer, error) {
...(omit)...
	status, err := p.tracker.GetStatus(ref)
	if err == nil {
		if status.Committed && status.Offset == status.Total {
			return nil, errors.Wrapf(errdefs.ErrAlreadyExists, "ref %v", ref)
		}
...(omit)...
	} else if !errdefs.IsNotFound(err) {
		return nil, errors.Wrap(err, "failed to get status")
	}
...(omit)...

@theopenlab-ci

theopenlab-ci bot commented Apr 19, 2021

Build succeeded.

@samuelkarp
Member

@ktock Thanks for the explanation! Would you mind updating the documentation for Resolver.Pusher to indicate that the returned Pusher should also satisfy content.Ingester and that concurrent attempts to push the same blob should result in ErrUnavailable?

@ktock
Member Author

ktock commented Apr 19, 2021

@samuelkarp Thanks for the comment. Added the comment to Resolver.Pusher.

@theopenlab-ci

theopenlab-ci bot commented Apr 19, 2021

Build succeeded.


@samuelkarp samuelkarp (Member) left a comment

LGTM, thanks for bearing with me!

@theopenlab-ci

theopenlab-ci bot commented Apr 20, 2021

Build succeeded.

@estesp estesp (Member) left a comment

LGTM

@bechhansen

bechhansen commented Feb 23, 2022

Hi. We currently see this defect in our environment. I can't figure out in which release this defect was fixed.

@estesp
Member

estesp commented Feb 23, 2022

If you click on a commit hash, you should go to the commit (instead of PR) view and GitHub will show you all the tags in which the commit is included. In this case it is every release from v1.5.0 through the current v1.6.0 release. However, a bug in this code was just fixed 22 days ago in #6243 which is only in the v1.6.0 release at this time. As a bug fix it should be backported to release/1.5 as well (and I just marked it as such).

@bechhansen

If you click on a commit hash, you should go to the commit (instead of PR) view and GitHub will show you all the tags in which the commit is included. In this case it is every release from v1.5.0 through the current v1.6.0 release. However, a bug in this code was just fixed 22 days ago in #6243 which is only in the v1.6.0 release at this time. As a bug fix it should be backported to release/1.5 as well (and I just marked it as such).

Excellent. Thank you.

@bechhansen

We are using the latest Docker version 20.10.12 on both Linux and Windows, but as far as I can tell this version uses containerd v1.4.12. I don't see any way for us to update to something where this fix is included.

What are our options here, except for waiting until containerd is updated in the Docker distributions?

Is there a workaround for this issue? Would it e.g. be possible to configure containerd to not push images in parallel?


Our issue is not so much with containerd, but with https://github.com/getporter/porter that depends on containerd 1.3.0



Development

Successfully merging this pull request may close these issues.

Parallel pushes fail with invalid written size
