Skip to content

Fix .ensure-emptyfs on non-x86_64 architectures#21819

Merged
thaJeztah merged 1 commit intomoby:masterfrom
tophj-ibm:fix-ensure-emptyfs-other-architectures
Apr 9, 2016
Merged

Fix .ensure-emptyfs on non-x86_64 architectures#21819
thaJeztah merged 1 commit intomoby:masterfrom
tophj-ibm:fix-ensure-emptyfs-other-architectures

Conversation

@tophj-ibm
Copy link
Copy Markdown
Contributor

Now that we are checking if the image and host have the same architectures
via #21272, this value should be null so that the test passes on non-x86
machines.

ping @tianon

Signed-off-by: Christopher Jones [email protected]

@tianon
Copy link
Copy Markdown
Member

tianon commented Apr 6, 2016

Hah, if null is an acceptable value here, +1 LGTM! 🤘

@icecrime
Copy link
Copy Markdown
Contributor

icecrime commented Apr 6, 2016

I think this test needs updating:

14:27:22 FAIL: docker_cli_inspect_test.go:23: DockerSuite.TestInspectImage
14:27:22 
14:27:22 docker_cli_inspect_test.go:34:
14:27:22     c.Assert(id, checker.Equals, imageTestID)
14:27:22 ... obtained string = "sha256:20078ae01ef64194a0d4c32e4b3b180d7ce11c17c2bd945347d81505085cf8d1"
14:27:22 ... expected string = "sha256:11f64303f0f7ffdc71f001788132bca5346831939a956e3e975c93267d89a16d"

@icecrime icecrime added status/failing-ci Indicates that the PR in its current state fails the test suite status/2-code-review and removed status/0-triage labels Apr 6, 2016
@StefanScherer
Copy link
Copy Markdown
Contributor

Same problem to build for arm. Will test this PR...

@tophj-ibm tophj-ibm force-pushed the fix-ensure-emptyfs-other-architectures branch from a3e64ca to c60e484 Compare April 7, 2016 18:52
@tonistiigi
Copy link
Copy Markdown
Member

LGTM

Started arm ci also.

@tophj-ibm tophj-ibm force-pushed the fix-ensure-emptyfs-other-architectures branch from c60e484 to 5c6ce44 Compare April 7, 2016 20:02
@tophj-ibm
Copy link
Copy Markdown
Contributor Author

Looks like there was an issue with arm64 images having the "arm64" tag and the runtime.GOARCH reporting just "arm". I think this should fix it, can someone restart arm ci?

@StefanScherer
Copy link
Copy Markdown
Contributor

Still some errors in arm build show problems with that architecture check:

  • DockerSuite.TestEventsImagePull -> "pull hello-world" failed with errors: image is for architecture amd64, expected arm
  • DockerSuite.TestLoadZeroSizeLayer -> "load -i fixtures/load/emptyLayer.tar" failed with errors: Error response from daemon: image is for architecture x86_64, expected arm

Is pulling such images really an error?
What about the multiarch support in Docker for Mac/Windows Beta that uses built-in QEMU to run such images?

@tophj-ibm
Copy link
Copy Markdown
Contributor Author

Power is having the same problems, looks like an issue with pulling down x86 images from the tests. I wonder why hello-world is failing seeing as that is a frozen image, I'll look into it some more tomorrow

@thaJeztah thaJeztah removed the status/failing-ci Indicates that the PR in its current state fails the test suite label Apr 8, 2016
@thaJeztah
Copy link
Copy Markdown
Member

looks like it's all green now @tophj-ibm were you still making changes?

@tonistiigi
Copy link
Copy Markdown
Member

@thaJeztah I think arm reported wrong here.

@thaJeztah
Copy link
Copy Markdown
Member

@tonistiigi oh boy 😞

@StefanScherer
Copy link
Copy Markdown
Contributor

Yes, arm shows green, but there are issues since commit 194eaa5.

I wonder if the function archMatches should do the check only if GOOS is not linux as on Linux you may run multiarch Linux containers?

I tried to replace the docker binaries in latest Docker Beta for Windows / Mac, but on both I no longer can login into the VM to replace the bins 😞

@thaJeztah
Copy link
Copy Markdown
Member

tried to replace the docker binaries in latest Docker Beta for Windows / Mac, but on both I no longer can login into the VM to replace the bins 😞

Hm, yes, that would've been handy in this case; those features were removed to prevent people from relying on features that are only there for debugging purposes (and once users rely on it, it's a lot more troublesome to remove). I know there's still discussion going on to have some way to replace the binaries.

@StefanScherer
Copy link
Copy Markdown
Contributor

No problem. I understand: Yes is forever, No is not (now) 😄

@justincormack
Copy link
Copy Markdown
Contributor

@StefanScherer the portable way to get a full root shell on any docker host still works fine on Docker 4 mac/Win, docker run -it --privileged --pid=host debian nsenter -t 1 -m -n sh

@justincormack
Copy link
Copy Markdown
Contributor

The whole idea of only allowing containers to run on their "native" arch is very problematic. As Stefan says, Linux can run any architecture with transparent emulation, FreeBSD and SmartOS can run Linux containers, and now Windows can run Linux binaries. The checking seems pretty problematic except for the specific Windows related issue it was designed to fix, and will break lots of valid use cases.

@tophj-ibm
Copy link
Copy Markdown
Contributor Author

@justincormack agreed, I'll remove it except for the windows cases.

@tonistiigi
Copy link
Copy Markdown
Member

cc @jstarks @aaronlehmann

@jstarks
Copy link
Copy Markdown
Contributor

jstarks commented Apr 8, 2016

I'm very surprised that GOARCH for arm64 is "arm" -- this is inconsistent with the Golang documentation. Is the daemon running as a 32-bit binary?

Either way, I think these checks are important on all architectures and OSes -- the user shouldn't have to wait until container start to find out they got an image for the wrong OS or arch. What makes sense to me is to have different compatibility checks for different OSes -- if FreeBSD can run Linux containers, that should be codified in the engine somewhere.

But it does seem that this needs to be thought through some more, so I'm OK with reverting this part of my change for now (even on Windows) to unblock everyone.

@StefanScherer
Copy link
Copy Markdown
Contributor

I don't think that GOARCH for arm64 is "arm", I think this is "aarch64". I haven't investigated the problem in the arm build so far, this might be just one of the Docker images used in the tests which has a strange tag.

The arm builds run on Scaleway machines with 32bit ARM CPU's, upcoming 64bit ARM CPU's can run containers from the 32bit ARM Docker images as well.

The new Docker for Windows/Mac beta is able to run eg. ARM Docker images in a Hyper-V/xhyve Linux amd64 VM. An example can be found in slide 17 at https://blog.docker.com/2016/04/online-meetup-36-docker-windows-mac/

@jstarks
Copy link
Copy Markdown
Contributor

jstarks commented Apr 8, 2016

Docker for Windows/Mac does not run the engine on the Windows/Mac host. It runs the engine inside a Linux VM.

@tophj-ibm
Copy link
Copy Markdown
Contributor Author

@StefanScherer looking at https://jenkins.dockerproject.org/job/Docker-PRs-arm/252/console, it might be a frozen images tagging issue. I have no idea how the CI passed the second time though 😕

@StefanScherer
Copy link
Copy Markdown
Contributor

The MultiCPU support can also be used with native Linux Docker Engines, some additional features of Linux are required.

I think your check reveals some related problems. Indeed eg. the armhf/debian:jessie image seems to have a architecture tag "arm64", but this is only a 32bit ARM image. I'm working on another PR, but my builds are blocked at the moment. I think these docker images should be fixed as well.

But this check would restrict the MultiCPU support running Linux containers on Linux Docker Engines. If there is no tag in the images this check could be added in a Windows specific Go code. Just a first idea.

@tophj-ibm Thanks, yes one of the frozen images is armhf/debian. I'll have a closer look at the list of images.

@tophj-ibm
Copy link
Copy Markdown
Contributor Author

I'm just going to change the check to be a logrus error, that seems to be a reasonable compromise until we sort out the image issues and also fixes this specific issue this PR was supposed to do in the first place

@StefanScherer
Copy link
Copy Markdown
Contributor

@tophj-ibm That is a good approach. Just found another constellation. Pulling resin/rpi-raspbian shows "image is for architecture amd64, expected arm".
Now I understand the problem better. The Docker images was built on a Intel Linux Docker Engine by COPYing the rootfs for ARM into the image. That is still a valid Docker image for ARM, but the tag is wrong.

We shouldn't be forced to build ARM Docker images on (probably slow) ARM devices. Someone can build ARM Docker images eg. in the Cloud at Travis, Circle, ... servers which all are Intel only at the moment.

@jstarks
Copy link
Copy Markdown
Contributor

jstarks commented Apr 8, 2016

The COPY case is interesting. We really need a way in Dockerfile to tag the architecture for such cases.

@tophj-ibm tophj-ibm force-pushed the fix-ensure-emptyfs-other-architectures branch from 4550cf5 to 46846eb Compare April 8, 2016 20:27
@thaJeztah
Copy link
Copy Markdown
Member

@jstarks I was discussing this issue with @tonistiigi, and probably the best solution for now would be to (temporarily) revert #21272, and open a new PR for that where we can discuss the design, and see how we want to resolve it (do we need the check on all platforms, etc)?

Let me know what you think

@tophj-ibm
Copy link
Copy Markdown
Contributor Author

Okay, changed it to a warning and removed the arm64 special case because that's another issue. Can someone start the arm CI again?

@jstarks
Copy link
Copy Markdown
Contributor

jstarks commented Apr 8, 2016

@thaJeztah I would prefer if we just removed the checks but left the change to Image and the places where I fill in the new metadata. The metadata is important for Windows even if we don't do anything with it right away. What do you think?

@tophj-ibm
Copy link
Copy Markdown
Contributor Author

okay seems like a consensus, I'll go ahead and remove them

@tophj-ibm tophj-ibm force-pushed the fix-ensure-emptyfs-other-architectures branch from 46846eb to ba3016a Compare April 8, 2016 20:58
@thaJeztah
Copy link
Copy Markdown
Member

@tonistiigi wdyt; does this LGTY?

@tonistiigi
Copy link
Copy Markdown
Member

@tophj-ibm Do we need the emptyfs config change at all then if there are no validations?

@tophj-ibm
Copy link
Copy Markdown
Contributor Author

@tonistiigi nope

Now that we are checking if the image and host have the same architectures
via moby#21272, this value should be null so that the test passes on non-x86
machines

Signed-off-by: Christopher Jones <[email protected]>
@thaJeztah
Copy link
Copy Markdown
Member

hm, couple of failures on ARM, don't know if any is related, or if we need to mark more tests as "flaky" 😞

@thaJeztah
Copy link
Copy Markdown
Member

looks like a container named "test" wasn't cleaned up properly and is causing these errors, https://jenkins.dockerproject.org/job/Docker-PRs-arm/255/console

I'll restart

@tonistiigi
Copy link
Copy Markdown
Member

LGTM

@thaJeztah
Copy link
Copy Markdown
Member

LGTM, thanks!

@thaJeztah
Copy link
Copy Markdown
Member

win2lin is green, but GitHub missed the notification;
https://jenkins.dockerproject.org/job/Docker-PRs-Win2Lin/24689/console

2:16:45 d809c12bb911
22:16:49 INFO: Cleanup complete
22:16:49 Notifying endpoint 'HTTP:https://leeroy.dockerproject.org/notification/jenkins'
22:16:49 Finished: SUCCESS

@thaJeztah thaJeztah merged commit 3f39035 into moby:master Apr 9, 2016
@StefanScherer
Copy link
Copy Markdown
Contributor

Thanks! The arm build is green without failing tests. But there is an "exec format error" afterwards. I'll observe this.

@tophj-ibm tophj-ibm deleted the fix-ensure-emptyfs-other-architectures branch April 11, 2016 13:58
@AkihiroSuda
Copy link
Copy Markdown
Member

compat_{unix,windows}.go is no longer used because compat.go is removed.
Is is possible to remove compat_{unix,windows}.go as well?

@AkihiroSuda
Copy link
Copy Markdown
Member

Opened #22022 for clean-up

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants