Description
If you build an image for multiple CPU architectures at the same time and use --push, the upload of the images will often get stuck in an endless loop.
The following line is printed over and over again:
error: failed to copy: failed to do request: Put "https://ghcr.io/v2/reconman/example-buildx-push/blobs/upload/a5521203-2c8d-49d5-bcde-d9ba8500a5b0?digest=sha256%3A1e1235e447358303a2d2975f6078eb4f82db3b64fe1ef840976f6033eac9a19f": write tcp 172.17.0.2:40356->140.82.113.33:443: write: connection reset by peer
I'm able to easily reproduce the issue by building a python-based image with all architectures allowed by the base image: https://github.com/reconman/example-buildx-push
I increased the number of layers by adding some RUN commands because I suspect that more layers increase the chance of failure.
When I changed --push to type=oci,dest=/tmp/image.tar and ran the following containerd commands manually, I encountered containerd/containerd#2706, so it may be related to that?
sudo ctr i import --base-name ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }} --digests --all-platforms /tmp/image.tar
while IFS= read -r line; do
  sudo ctr i push --user "${{ github.actor }}:${{ secrets.GITHUB_TOKEN }}" "$line"
done <<< "${{ steps.meta.outputs.tags }}"
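For anyone reproducing this with the manual push loop above, wrapping each push in a bounded retry keeps the job from repeating indefinitely when the connection is reset. The following is only a sketch: `retry` is a hypothetical helper that is not part of the original workflow, and the attempt count and delay are arbitrary assumptions. It does not address the underlying connection reset; it only caps the number of attempts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original workflow).
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  while true; do
    # Stop as soon as the command succeeds.
    if "$@"; then
      return 0
    fi
    # Give up once the attempt budget is used.
    if (( attempt >= max_attempts )); then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt} failed, retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$(( attempt + 1 ))
  done
}

# Inside the workflow loop, the push line could then become (assumed values 5 and 10):
#   retry 5 10 sudo ctr i push --user "${{ github.actor }}:${{ secrets.GITHUB_TOKEN }}" "$line"
```

With a cap in place, a persistent failure surfaces as a failed step with a non-zero exit code instead of an endlessly repeating log line.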
Here are the GitHub workflow logs with the BuildKit debug flag enabled: logs_1.zip