Skip to content

Conversation

@Weichen81
Copy link
Contributor

@Weichen81 Weichen81 commented Sep 29, 2017

Hi,

I am Wei Chen from ARM. Currently, I am working on arm-multi-arch support for containerd.
After #1535 had been merged, the containerd
has enabled the pull multi-arch image support. But the platform.variant field of OCI for ARM
hadn't been ready at that time, so all ARM images could not pull successfully.

"docker.io/library/hello-world:latest:
resolved |++++++++++++++++++++++++++++++++++++++|
index-sha256:3644c0788e3d3823f9e97f757f01d2ddc6eb5458df9d801:
done |++++++++++++++++++++++++++++++++++++++|
elapsed: 5.1 s
total: 2.7 Ki (533.0 B/s)
unpacking sha256:3d3823f9e97f757f01d2ddc6eb5458df9d801...
ctr: : manifest not found: not found"

In this PR we'll detect the ARM variants from /proc/cpuinfo. Because Linux
kernel has already detected the ABI, ISA and Features for us. We don't need to
parse them from registers again.

The PR has been tested on x86 and ARM64 servers, the ctr can pull images correctly now.

I also created an issue to discuss this problem #1576

@stevvooe @mattspencer-arm

@Weichen81 Weichen81 changed the title Arm multi arch Pull-multi-arch images for Arm Sep 29, 2017
@Weichen81 Weichen81 changed the title Pull-multi-arch images for Arm Fix pull-multi-arch images for Arm Sep 29, 2017
@codecov-io
Copy link

codecov-io commented Sep 29, 2017

Codecov Report

Merging #1575 into master will decrease coverage by 1.75%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #1575      +/-   ##
=========================================
- Coverage   50.65%   48.9%   -1.76%     
=========================================
  Files          83      27      -56     
  Lines        7290    4071    -3219     
=========================================
- Hits         3693    1991    -1702     
+ Misses       2901    1662    -1239     
+ Partials      696     418     -278
Flag Coverage Δ
#linux ?
Impacted Files Coverage Δ
fs/diff_unix.go 34.78% <0%> (-21.74%) ⬇️
fs/diff.go 41.43% <0%> (-17.42%) ⬇️
fs/hardlink_unix.go 50% <0%> (-10%) ⬇️
fs/copy.go 34.09% <0%> (-7.94%) ⬇️
fs/copy_linux.go 30.76% <0%> (-7.33%) ⬇️
fs/dtype_linux.go 42.85% <0%> (-7.15%) ⬇️
metadata/containers.go 47.31% <0%> (-6.4%) ⬇️
metadata/buckets.go 56% <0%> (-6.13%) ⬇️
metadata/images.go 57.27% <0%> (-5.61%) ⬇️
metadata/snapshot.go 49.34% <0%> (-5.57%) ⬇️
... and 74 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1a56054...cbc3301. Read the comment docs.

platforms/arm.go Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Does this need to be exported function?
  • This should return err (or just "unknown") rather than call log.Fatal()
  • ParseCpuInfo → getCPUVariant?
  • Please add tests

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely not a package export.

This should also be in a separate file, probably cpuinfo.go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Akihiro,

Does this need to be exported function?

No, I think it's enough to used this function inside this package.

This should return err (or just "unknown") rather than call log.Fatal()

Your comment is reasonable, I will refine this function to return proper "variant, err"

ParseCpuInfo → getCPUVariant?

I think it's better to rename ParseCpuInfo to getCPUInfo. Because this function
is not only used to parse Variant. We may parse other information like "features"
in the future. But I will add a new function getCPUVariant to wrap getCPUInfo.

Please add tests

Sorry, I am new to here. Did you mean something like platform_test.go?

Hi Stevvooe,

This is definitely not a package export.
This should also be in a separate file, probably cpuinfo.go.
Did you mean rename the arm.go to cpuinfo.go? Because this function has been in
a separate file already.

Thanks for your comments. I am new to golang :)

Add @Pennyzct who is working with me on arm-multi-arch image support.

platforms/arm.go Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this cannot be a separate type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will use an armCpuVariant string instead of this separate type.

platforms/arm.go Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functions like this belong in database.go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I will move it to database.go

platforms/arm.go Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this format standardized? Looking at a random box, this doesn't seem to be correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your reminder. It seems x86 and ARM has different space characters.
I will trim all leading and tailing space characters of the string.

@Weichen81
Copy link
Contributor Author

Hi Stevvooe, Akihiro,

Because I have public holidays in the following week. So I send the v2 updates as my understanding of your comments. If I have done something wrong please let me know. I can reply the email, but the code update would be deferred.

Regards,
Wei Chen

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this function needs to be changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't remove arm entries here.

@stevvooe
Copy link
Member

stevvooe commented Oct 2, 2017

@Weichen81 Is scanning /proc/cpuinfo the best way to detect architecture for ARM? Do you have a link to the format used so we can ensure that it works across kernel versions?

@Weichen81
Copy link
Contributor Author

Weichen81 commented Oct 3, 2017 via email

@Weichen81
Copy link
Contributor Author

Weichen81 commented Oct 3, 2017 via email

@Weichen81
Copy link
Contributor Author

Weichen81 commented Oct 3, 2017 via email

@estesp
Copy link
Member

estesp commented Oct 3, 2017

My personal opinion is that reading /proc/cpuinfo probably gives us the best-case coverage for ARM variants that are viable for use with container runtimes. I'm sure we could add complexity with support for older kernels and various edge cases, but seems that for container runtime use cases on ARM, we are talking about people who will need to be on recent kernels for support of modern hardware devices/platform support, and can even run a container runtime including runc and containerd (and/or higher layer Docker, K8s, etcd components).

I think we should go forward with this, which adds no additional dependencies on host OS tooling (e.g. lscpu) and will provide us proper differentiation of the key ARM platforms in use by container community players.

EDIT: Forgot to clearly state that since we are only running this parser when runtime.GOARCH == {arm family} it greatly limits our need to consider any other CPU architecture/family and possible changes in the format over time, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only concern here is we are parsing /proc/cpuinfo on any Linux OS as the code is right now. I would rather this be a "lazy" instantiation so it never runs unless needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this function call is useless for other architectures/OS. I would address this comment in next update.

@Weichen81
Copy link
Contributor Author

I think we should go forward with this, which adds no additional dependencies on host OS tooling (e.g. lscpu) and will provide us proper differentiation of the key ARM platforms in use by container community players.

Phil, thanks. I prefer to keep reading from /proc/cpuinfo to avoid adding additional dependency.

EDIT: Forgot to clearly state that since we are only running this parser when runtime.GOARCH == {arm family} it greatly limits our need to consider any other CPU architecture/family and possible changes in the format over time, etc.

Yes, currently we don't need to consider the differences of arm, x86, ppc and other CPU family's
output formats.

@stevvooe
Copy link
Member

stevvooe commented Oct 9, 2017

@Weichen81 Could address my review items?

@Weichen81
Copy link
Contributor Author

@stevvooe Ah, of course! That's undisputed, I will address your comments too :)
But if possible, I still want your comment on :

// The arch value should be normalized before being passed to this function.
func isKnownArch(arch string) bool {
switch arch {
case "386", "amd64", "amd64p32", "ppc64", "ppc64le", "mips", "mipsle", "mips64", "mips64le", "mips64p32", "mips64p32le", "ppc", "s390", "s390x", "sparc", "sparc64":

Don't remove arm entries here.

But, If we keep multiple points for arm arch check, if we have something update, we should modify these two points.

@Weichen81
Copy link
Contributor Author

@stevvooe After digging into the code, I think you're right. I should not remove the arm entries from
isKnownArch. Because we just need normalized arm entries like arm/arm64 in isArmArch. I would revert
this change.

Regards,
Wei Chen

@Pennyzct
Copy link

My only concern here is we are parsing /proc/cpuinfo on any Linux OS as the code is right now. I would rather this be a "lazy" instantiation so it never runs unless needed.
Hi~I'm penny, working with wei chen in Arm.
How about we make struct armCpuVariant singleton and collaborates sync.Once and getCPUVariant() function to initialize and acquire this single instance whenever needed.

@stevvooe
Copy link
Member

@Pennyzct Correct me if I'm wrong, but, in this case, doing an "init" time initialization will have the same effect as sync.Once. I think we'd want to use sync.Once if we wanted to defer calculation to only when it is actually used. That approach might be better, because I am not sure we want to have a side-effect fork/exec for a package that might be used widely in low-level scenarios.

@Pennyzct
Copy link

@stevvooe you're right, it's simple and neat to include the initialization of armCpuVariant in the "init", and adding the arch estimation will avoid useless work in no-Arm arch.
@estesp One more, I'm still confused that for now the instantiation still concentrates in start-up phase, and doesn't fit in running when needed("lazy instantiation").

@Weichen81
Copy link
Contributor Author

@stevvooe @estesp @AkihiroSuda
Hi Guys! Could you please to help me check the following log:

The command "if [ "$GOOS" = "linux" ]; then sudo PATH=$PATH GOPATH=$GOPATH make integration-parallel ; fi" exited with 2.

We had tested these patches internally and they worked well. I had checked my code and continuous-integration/travis-ci/pr log. I could not understand why they trigger this building error.

Regards,
Wei Chen

@estesp
Copy link
Member

estesp commented Oct 16, 2017

@Weichen81 you can ignore that error for now; there are some GC issues with snapshot removal and the parallel tests (cc: @dmcgowan)

@Weichen81
Copy link
Contributor Author

@estesp Thanks!

@crosbymichael
Copy link
Member

If you rebase on master it should resolve the test issues now. Thanks!

@Weichen81
Copy link
Contributor Author

@crosbymichael I have rebased these changes, and they have passed the tests. Thank you!

@crosbymichael
Copy link
Member

@Weichen81 thanks

@estesp @stevvooe this should be good to review again.

@Weichen81
Copy link
Contributor Author

Hi,
Sorry for disturbing, but any comments : )

Copy link
Member

@estesp estesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@estesp
Copy link
Member

estesp commented Dec 8, 2017

ping @stevvooe ; with the removal of the compat matching and all other comments have been resolved, I'm comfortable with this now. I personally think we should get this in and then start to resolve how matching happens on variant or multi-value supporting architectures.

@estesp estesp added this to the 1.1 milestone Dec 18, 2017
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bottom of a file is not a good place for this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be ordered like this:

var
init
func

@stevvooe
Copy link
Member

After a fix to the code here, I am okay with merging this. There is still work to be done in this area, so we may make changes, but this is limited for now.

On ARM platforms, we have to prepare the variant for spec.platform.
Currently, this action would only work on Linux/ARM system. We
introduce these two helpers to check running system's OS and Arch.

Change-Id: Iff14087699219413779dd6caf1bf9524db1cc19e
Signed-off-by: Wei Chen <[email protected]>
In the commit "26329b2b8d7fd4e290b2b0f0163547f2d79bb817",
dmcgowan/fix-pull-multi-arch, PR#1535. The containerd has enabled the pull
multi-arch image support. But the platform.variant field of OCI for ARM
hadn't been ready at that time, so all ARM images could not pull successfully.

"
docker.io/library/hello-world:latest:
resolved       |++++++++++++++++++++++++++++++++++++++|
index-sha256:3644c0788e3d3823f9e97f757f01d2ddc6eb5458df9d801:
done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 5.1 s
total:  2.7 Ki (533.0 B/s)
unpacking sha256:3d3823f9e97f757f01d2ddc6eb5458df9d801...
ctr: : manifest not found: not found
"

In this patch we'll detect the ARM variants from /proc/cpuinfo. Because Linux
kernel has already detected the ABI, ISA and Features for us. We don't need to
parse them from registers again.

Change-Id: I479b34bf3f52df9f7a6b3c77718b7d316dbf7f69
Signed-off-by: Wei Chen <[email protected]>
Signed-off-by: Penny Zheng <[email protected]>
The variant is required for platform match while pulling images
for ARM platforms. Currently, the cpuVariant only would be assigned
on linux/arm|arm64 platforms. Other platforms this variable would
be empty. So we can use this cpuVariant to initialize the Variant
field.

Change-Id: Ic065be9b502f1e662445daa61a0973bf56385b37
Signed-off-by: Wei Chen <[email protected]>
Signed-off-by: Penny Zheng <[email protected]>
This test case should cover the variant field now, as we have
add Variant to default value of spec.platform.

Change-Id: I8359007d40a4b8f6a072510fff2ba604a062afa1
Signed-off-by: Wei Chen <[email protected]>
Signed-off-by: Penny Zheng <[email protected]>
Change-Id: Id9558de2e41b08c41cf7d4b458774e99e24515a0
Signed-off-by: Wei Chen <[email protected]>
Signed-off-by: Penny Zheng <[email protected]>
@Weichen81
Copy link
Contributor Author

@stevvooe Thanks, I have updated this PR to address your comment.

Copy link
Member

@estesp estesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM

@estesp estesp merged commit 1d896a8 into containerd:master Jan 3, 2018
@ernoaapa ernoaapa mentioned this pull request Mar 5, 2018
ernoaapa added a commit to ernoaapa/containerd that referenced this pull request Mar 5, 2018
Merge Weichen81/arm-multi-arch fix to fix pulling multi-arch arm images.
containerd#1575
ernoaapa added a commit to ernoaapa/eliot that referenced this pull request Mar 25, 2018
… Arm"

We want to start follow stable releases of containerd, but there's
one fix containerd/containerd#1575 which
prevents pulling images on ARM.
Switched to use custom containerd version which is
v1.0.2 with Weichen81:arm-multi-arch fix.
ernoaapa added a commit to ernoaapa/eliot that referenced this pull request Mar 25, 2018
… Arm"

We want to start follow stable releases of containerd, but there's
one fix containerd/containerd#1575 which
prevents pulling images on ARM.
Switched to use custom containerd version which is
v1.0.2 with Weichen81:arm-multi-arch fix.
@ArchiFleKs
Copy link

ArchiFleKs commented May 26, 2018

Hi, I stil get this issue on ARM, I just compiled master this morning on raspberry pi3 arm7

@Pennyzct
Copy link

Hi~ @ArchiFleKs Could you give us the detailed error info and the image info ? 😊

@ArchiFleKs
Copy link

@Pennyzct I think my issue might be related to this one : kubernetes/kubernetes#63057

My log is :

mai 26 09:07:45 raiden containerd[410]: time="2018-05-26T09:07:45+02:00" level=error msg="RunPodSandbox for &PodSandboxMetadata{Name:kube-proxy-r8mtg,Uid:42c9a762-60b3-11e8-883a-0025900eb312,Namespace:kube-system,Attempt:0,} failed, error" error="failed to get sandbox image "k8s.gcr.io/pause:3.1": failed to pull image "k8s.gcr.io/pause:3.1": failed to get image information: failed to get image diffIDs: manifest sha256:f78411e19d84a252e53bff71a4407a5686c46983a2c2eeed83929b888179acea: not found"

@amaline
Copy link

amaline commented Jul 19, 2018

I think I am having the same issue on a Rock64 board. Tried upgrading to 1.1.2 from 1.1.0 - same issue although this is my first try at learning Kubernetes the very hard way.

# ctr --debug images pull docker.io/arm64v8/busybox:latest bb

at the end of the output:

DEBU[0002] fetch response received                       base="https://registry-1.docker.io/v2/arm64v8/busybox" digest=sha256:cd6101adb6f7d5c6f64f1bfb40ef0a63eea38274ac16c8c04bb949769962897b mediatype="application/vnd.docker.image.rootfs.diff.tar.gzip" response.headers=map[Date:[Thu, 19 Jul 2018 02:42:41 GMT] Accept-Ranges:[bytes] Cf-Cache-Status:[HIT] Last-Modified:[Tue, 17 Jul 2018 08:42:27 GMT] Vary:[Accept-Encoding] X-Amz-Id-2:[s46bR239AoE6YaCnASjLPYc1wPnuKZdPQU2zdR5m10N7X4XbfdMs7Dbu9G1hCZK1Nhx2Pe8uDKU=] Content-Type:[application/octet-stream] Cache-Control:[public, max-age=14400] Cf-Ray:[43c9d610c9b6241a-IAD] Etag:["ee1e2543ef566531891e4234117ade24"] Expect-Ct:[max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"] Content-Length:[770638] Connection:[keep-alive] Set-Cookie:[__cfduid=dfa6ff117471d06c72227703f7f5911851531968161; expires=Fri, 19-Jul-19 02:42:41 GMT; path=/; domain=.production.cloudflare.docker.com; HttpOnly; Secure __cfduid=dfa6ff117471d06c72227703f7f5911851531968161; expires=Fri, 19-Jul-19 02:42:41 GMT; path=/; domain=.production.cloudflare.docker.com; HttpOnly; Secure] Expires:[Thu, 19 Jul 2018 06:42:41 GMT] X-Amz-Request-Id:[EAEBE7B793E45832] X-Amz-Version-Id:[qS07hFJzeGjXZx02_qOLW4I675ejnmbz] Server:[cloudflare]] size=770638 status="200 OK" url="https://registry-1.docker.io/v2/arm64v8/busybox/blobs/sha256:cd6101adb6f7d5c6f64f1bfb40ef0a63eea38274ac16c8c04bb949769962897b"
DEBU[0003] unpacking                                     image="docker.io/arm64v8/busybox:latest"
unpacking sha256:6e91cbbf76bfb7fe4edcce7b5b5a0f551e9d0b3ba01107b2faa383b33560412d...
ctr: manifest sha256:6e91cbbf76bfb7fe4edcce7b5b5a0f551e9d0b3ba01107b2faa383b33560412d: not found

root@rock64b:~# ctr version
Client:
  Version:  v1.1.2
  Revision: 468a545b9edcd5932818eb9de8e72413e616e86e

Server:
  Version:  v1.1.2
  Revision: 468a545b9edcd5932818eb9de8e72413e616e86e

Same for the multi-arch image

root@rock64b:~# ctr images pull docker.io/library/busybox:latest bb
docker.io/library/busybox:latest:                                              resolved       |++++++++++++++++++++++++++++++++++++++| 
index-sha256:d21b79794850b4b15d8d332b451d95351d14c951542942a816eea69c9e04b240: done           |++++++++++++++++++++++++++++++++++++++| 
elapsed: 1.5 s                                                                 total:  2.6 Ki (1.8 KiB/s)                                       
unpacking sha256:d21b79794850b4b15d8d332b451d95351d14c951542942a816eea69c9e04b240...
ctr: manifest sha256:d21b79794850b4b15d8d332b451d95351d14c951542942a816eea69c9e04b240: not found

@rkojedzinszky
Copy link

I have the same with v1.2.6 installed from ubuntu repository. Actually, on an amd64 it can be reproduced, that arm/v7 image pull fails:

# ctr i pull --platform linux/arm/v7  docker.io/library/busybox:latest
docker.io/library/busybox:latest:                                                 resolved       |++++++++++++++++++++++++++++++++++++++| 
index-sha256:c94cf1b87ccb80f2e6414ef913c748b105060debda482058d2b8d0fce39f11b9:    done           |++++++++++++++++++++++++++++++++++++++| 
manifest-sha256:aba5fffb8a7a1a5376c970417b05706ff6ced29a696ee486dc14144b7221d9f1: done           |++++++++++++++++++++++++++++++++++++++| 
layer-sha256:235dcb7bd32b2dd4c7bbf16e6a81ead3c8250e62255b79adee7e2011d53342ca:    done           |++++++++++++++++++++++++++++++++++++++| 
config-sha256:b85bc5e010b4dcf4c2f5b2b58fc3fff7aab24fc223cc15efa7ed2f39ec253d34:   done           |++++++++++++++++++++++++++++++++++++++| 
elapsed: 1.5 s                                                                    total:   0.0 B (0.0 B/s)                                         
unpacking linux/arm/v7 sha256:c94cf1b87ccb80f2e6414ef913c748b105060debda482058d2b8d0fce39f11b9...
ctr: content digest sha256:76535ae6ed1a2a69f5c4fdf6d0a5b17da86f797cb7ac15c06dd74fe44e454db2: not found

@estesp
Copy link
Member

estesp commented Jul 2, 2019

@rkojedzinszky this is fixed in our latest release, 1.2.7, so there isn't much we can do about it--hopefully Ubuntu will update to 1.2.7 soonish.

@rkojedzinszky
Copy link

Thanks @estesp . Then either I do something wrong, or some bug is still present. Even if ctr can download the image, containerd daemon cannot, emits the same error message. I do this on arm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.