Container fails to start on ubuntu14.04 LTS due to memory.swappiness write (Swarm remote API only) #17879

Closed
jtwile2 opened this issue Nov 10, 2015 · 17 comments

Comments

@jtwile2

jtwile2 commented Nov 10, 2015

Using latest docker-engine 1.9.0-0~trusty and swarm:1.0.0

When using the remote API (any version) through Swarm, containers fail to start with an error writing to the cgroup memory.swappiness file. The command line works fine, and sending the same requests directly to the Docker daemon, bypassing Swarm, also works.

$ uname -a
Linux twile-dev 3.13.0-58-generic #97-Ubuntu SMP Wed Jul 8 02:56:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

$ docker -H 127.0.0.1:3333 info
Containers: 2
Images: 21
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 1
 twile-dev: 172.17.42.1:2375
  └ Containers: 2
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 4.054 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.13.0-58-generic, operatingsystem=Ubuntu 14.04.3 LTS, storagedriver=aufs
CPUs: 1
Total Memory: 4.054 GiB
Name: 72bb050260aa

Swarm
$ curl http://127.0.0.1:3333/v1.21/version
{"ApiVersion":"1.21","Arch":"amd64","GitCommit":"087e245","GoVersion":"go1.5.1","KernelVersion":"","Os":"linux","Version":"swarm/1.0.0"}

Engine
$ curl http://172.17.42.1:2375/v1.21/version
{"Version":"1.9.0","ApiVersion":"1.21","GitCommit":"76d6bc9","GoVersion":"go1.4.2","Os":"linux","Arch":"amd64","KernelVersion":"3.13.0-58-generic","BuildTime":"Tue Nov 3 17:43:42 UTC 2015"}

Swarm Create
$ curl -H "Content-Type: application/json" --data @create.json http://127.0.0.1:3333/v1.21/containers/create
{"Id":"ca3db5144cc283678b0daf7a753f7467d9e5065d9bd22e0475baca57c6dfc178"}

$ curl -H "Content-Type: application/json" --data "{}" http://127.0.0.1:3333/v1.21/containers/ca3db5144cc283678b0daf7a753f7467d9e5065d9bd22e0475baca57c6dfc178/start
Cannot start container ca3db5144cc283678b0daf7a753f7467d9e5065d9bd22e0475baca57c6dfc178: [8] System error: write /sys/fs/cgroup/memory/docker/ca3db5144cc283678b0daf7a753f7467d9e5065d9bd22e0475baca57c6dfc178/memory.swappiness: invalid argument

Engine Create
$ curl -H "Content-Type: application/json" --data @create_daemon.json http://172.17.42.1:2375/v1.21/containers/create
{"Id":"007409fe7ce29b377778d14654273941d15a689f0731eb2cfaf09e84d38e452a","Warnings":null}

$ curl -H "Content-Type: application/json" --data "{}" http://172.17.42.1:2375/v1.21/containers/007409fe7ce29b377778d14654273941d15a689f0731eb2cfaf09e84d38e452a/start
$

Swarm create JSON data
{
  "Tty": false,
  "AttachStderr": false,
  "NetworkDisabled": false,
  "Image": "mysql:latest",
  "StdinOnce": false,
  "HostConfig": {
    "NetworkMode": "default",
    "MemorySwap": -1,
    "Memory": 1073741824
  },
  "AttachStdin": false,
  "CpuShares": 1,
  "AttachStdout": false,
  "OpenStdin": false
}

Engine create JSON data
{
  "Tty": false,
  "AttachStderr": false,
  "NetworkDisabled": false,
  "Image": "mysql:latest",
  "StdinOnce": false,
  "HostConfig": {
    "NetworkMode": "default",
    "MemorySwap": -1,
    "Memory": 1073741824
  },
  "AttachStdin": false,
  "CpuShares": 1024,
  "AttachStdout": false,
  "OpenStdin": false
}
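
The two request bodies above look almost identical, so it can help to diff them key by key. The following is a minimal Python sketch that does this; the JSON is copied from this report (with the `NetworkDisabled` key spelled out), and `diff_payloads` is a hypothetical helper name, not part of any Docker tooling. It only compares the request bodies and says nothing about what Swarm does with them.

```python
import json

# Request bodies copied from the report above.
SWARM_CREATE = json.loads("""
{"Tty": false, "AttachStderr": false, "NetworkDisabled": false,
 "Image": "mysql:latest", "StdinOnce": false,
 "HostConfig": {"NetworkMode": "default", "MemorySwap": -1, "Memory": 1073741824},
 "AttachStdin": false, "CpuShares": 1, "AttachStdout": false, "OpenStdin": false}
""")

ENGINE_CREATE = json.loads("""
{"Tty": false, "AttachStderr": false, "NetworkDisabled": false,
 "Image": "mysql:latest", "StdinOnce": false,
 "HostConfig": {"NetworkMode": "default", "MemorySwap": -1, "Memory": 1073741824},
 "AttachStdin": false, "CpuShares": 1024, "AttachStdout": false, "OpenStdin": false}
""")


def diff_payloads(a, b, prefix=""):
    """Return {dotted_key: (value_in_a, value_in_b)} for every differing key."""
    out = {}
    for key in sorted(set(a) | set(b)):
        path = prefix + key
        left, right = a.get(key), b.get(key)
        if isinstance(left, dict) and isinstance(right, dict):
            # Recurse into nested objects such as HostConfig.
            out.update(diff_payloads(left, right, path + "."))
        elif left != right:
            out[path] = (left, right)
    return out


print(diff_payloads(SWARM_CREATE, ENGINE_CREATE))
# prints {'CpuShares': (1, 1024)}
```

The only difference between the two bodies is `CpuShares` (1 vs 1024), and neither one sets `MemorySwappiness` in `HostConfig`.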

@GordonTheTurtle

Hi!

Please read this important information about creating issues.

If you are reporting a new issue, make sure that we do not have any duplicates already open. You can ensure this by searching the issue list for this repository. If there is a duplicate, please close your issue and add a comment to the existing issue instead.

If you suspect your issue is a bug, please edit your issue description to include the BUG REPORT INFORMATION shown below. If you fail to provide this information within 7 days, we cannot debug your issue and will close it. We will, however, reopen it if you later provide the information.

This is an automated, informational response.

Thank you.

For more information about reporting issues, see https://github.com/docker/docker/blob/master/CONTRIBUTING.md#reporting-other-issues


BUG REPORT INFORMATION

Use the commands below to provide key information from your environment:

docker version:
docker info:
uname -a:

Provide additional environment details (AWS, VirtualBox, physical, etc.):

List the steps to reproduce the issue:
1.
2.
3.

Describe the results you received:

Describe the results you expected:

Provide additional info you think is important:

----------END REPORT ---------

#ENEEDMOREINFO

@thaJeztah
Member

@abronan any ideas what can cause Swarm to act different here?

@jtwile2
Author

jtwile2 commented Nov 11, 2015

Should I move this issue to docker/swarm?

@thaJeztah
Member

@jtwile2 we can wait a short while to see if someone from the Swarm team can have a look

@johnjelinek

I can repro this problem when running docker-machine from Ubuntu 14.04 LTS and the swarm machines are all CentOS 7.1

@johnjelinek

Is there a workaround in place?

@johnjelinek

$ docker version
Client:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5
 Built:        Fri Nov 20 13:12:04 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      swarm/1.0.0
 API version:  1.21
 Go version:   go1.5.1
 Git commit:   087e245
 Built:
 OS/Arch:      linux/amd64
$ docker info
Containers: 20
Images: 33
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 5
 swarm1: xxx.xxx.xxx.xxx:12375
  └ Containers: 4
  └ Reserved CPUs: 0 / 7
  └ Reserved Memory: 0 B / 3.372 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.10.0-123.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), provider=generic, storagedriver=devicemapper
 swarm2: xxx.xxx.xxx.xxx:12375
  └ Containers: 4
  └ Reserved CPUs: 0 / 7
  └ Reserved Memory: 0 B / 3.372 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.10.0-123.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), provider=generic, storagedriver=devicemapper
 swarm3: xxx.xxx.xxx.xxx:12375
  └ Containers: 4
  └ Reserved CPUs: 0 / 7
  └ Reserved Memory: 0 B / 3.372 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.10.0-123.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), provider=generic, storagedriver=devicemapper
 swarm4: xxx.xxx.xxx.xxx:12375
  └ Containers: 4
  └ Reserved CPUs: 0 / 7
  └ Reserved Memory: 0 B / 3.372 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.10.0-123.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), provider=generic, storagedriver=devicemapper
 swarm5: xxx.xxx.xxx.xxx:12375
  └ Containers: 4
  └ Reserved CPUs: 0 / 7
  └ Reserved Memory: 0 B / 3.371 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.10.0-229.20.1.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), provider=generic, storagedriver=devicemapper
CPUs: 35
Total Memory: 16.86 GiB
Name: 610106ee8bd9
$ uname -a
Linux ubuntu 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

@johnjelinek

$ docker-compose up -d
Creating app_web_1
ERROR: Cannot start container 75e497d5baf276770d2f1398c8ea8319daf4eb9d0639ae79ae49c64b45342b10: [8] System error: write /sys/fs/cgroup/memory/system.slice/docker/75e497d5baf276770d2f1398c8ea8319daf4eb9d0639ae79ae49c64b45342b10/memory.swappiness: invalid argument

@abronan
Contributor

abronan commented Dec 3, 2015

@johnjelinek Any way you can try with docker master? We just merged #18285 that is fixing this issue with swarm and compose.

@johnjelinek

@abronan: do I need to deploy the master build to all of my swarm instances or just the box where I'm executing docker commands?

@abronan
Contributor

abronan commented Dec 3, 2015

@johnjelinek All of the swarm nodes, unfortunately. The Swarm Manager just passes the request on to the selected node's daemon, and since we don't set memory.swappiness in Swarm, its default value will still be 0. Without the fix on the node, this will fail the same way you described above.

If you're using docker-machine, I think the closest you can get to testing docker:master is passing the --engine-install-url "https://experimental.docker.com" flag to docker-machine create. AFAIK this is a nightly build, though, so the change should be there to test tomorrow. Or you can download the master binary directly onto the Swarm agent VMs and upgrade them yourself with docker-machine ssh (a bit painful, though).
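
If the 0 reaches the engine only because the create request leaves the swappiness unset, one thing to try while waiting for the fix is sending the value explicitly. The Python sketch below adds `MemorySwappiness: -1` to `HostConfig` before the body is posted through Swarm. This is an untested assumption, not something confirmed in this thread: `MemorySwappiness` is a `HostConfig` field in the remote API, but whether Swarm 1.0.0 forwards -1 unchanged, and whether the engine then skips the cgroup write, has not been verified here. `with_unset_swappiness` is a hypothetical helper name.

```python
import copy
import json


def with_unset_swappiness(create_body):
    """Return a copy of a create body with HostConfig.MemorySwappiness set to -1.

    Assumption (unverified): -1 is treated as "leave the kernel default alone".
    """
    body = copy.deepcopy(create_body)
    body.setdefault("HostConfig", {})["MemorySwappiness"] = -1
    return body


# Trimmed-down version of the create body from the original report.
original = {
    "Image": "mysql:latest",
    "HostConfig": {"NetworkMode": "default", "MemorySwap": -1, "Memory": 1073741824},
}

patched = with_unset_swappiness(original)
print(json.dumps(patched, indent=2))
```

The printed JSON could be saved as create.json and posted with the same curl command used in the original report.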

@johnjelinek

@abronan: did flags to docker change?

# systemctl status docker
docker.service
   Loaded: loaded (/etc/systemd/system/docker.service; enabled)
   Active: failed (Result: exit-code) since Thu 2015-12-03 00:13:02 EST; 3min 35s ago
  Process: 12184 ExecStart=/usr/bin/docker -d --exec-opt native.cgroupdriver=cgroupfs -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver devicemapper --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=generic --cluster-store=consul://xxx.xxx.xxx.xxx:8500 --cluster-advertise=ens33:2376 (code=exited, status=125)
 Main PID: 12184 (code=exited, status=125)

Dec 03 00:13:02 knifeparty systemd[1]: Started docker.service.
Dec 03 00:13:02 knifeparty docker[12184]: flag provided but not defined: -d
Dec 03 00:13:02 knifeparty docker[12184]: See '/usr/bin/docker --help'.
Dec 03 00:13:02 knifeparty systemd[1]: docker.service: main process exited, code=exited, status=125/n/a
Dec 03 00:13:02 knifeparty systemd[1]: Unit docker.service entered failed state.

@johnjelinek

oic, I changed -d to daemon ... stand by

@johnjelinek

@abronan: I upgraded docker on all the servers, but it didn't help:

$ docker version
Client:
 Version:      1.10.0-dev
 API version:  1.22
 Go version:   go1.5.1
 Git commit:   ee3e07d
 Built:        Thu Dec  3 00:48:27 2015
 OS/Arch:      linux/amd64

Server:
 Version:      swarm/1.0.0
 API version:  1.21
 Go version:   go1.5.1
 Git commit:   087e245
 Built:
 OS/Arch:      linux/amd64
$ docker-compose up -d
Recreating app_web_1
Starting app_redis_1
ERROR: Cannot start container 52b42e492cc19759eabc11ae61b9022a62d096bfd8eb151187f08d101dbd635f: [8] System error: write /sys/fs/cgroup/memory/system.slice/docker/52b42e492cc19759eabc11ae61b9022a62d096bfd8eb151187f08d101dbd635f/memory.swappiness: invalid argument

shaleman added a commit to contiv/netplugin that referenced this issue Dec 4, 2015
…hanges

svcplugin cleanup
Merging this as sanity failure seem to be variation of moby/moby#17879
@johnjelinek

@abronan: how do I downgrade docker now? I tried replacing the binary and also with docker-machine upgrade <machine-name>

docker: Error response from daemon: client is newer than server (client API version: 1.22, server API version: 1.21).

@johnjelinek

my fault ... I needed to delete the newer docker binary from my path.

@thaJeztah thaJeztah changed the title Container fails to start on ubuntu14.04 LTS due to memory.swappiness write (remote API only) Container fails to start on ubuntu14.04 LTS due to memory.swappiness write (Swarm remote API only) Dec 6, 2015
@thaJeztah
Member

This should be resolved in Swarm in docker-archive/classicswarm#1425

@jtwile2 thanks for reporting. I'm going to close this issue, because this is an issue in Swarm and should be resolved there. Please follow the discussion in docker-archive/classicswarm#1425, but feel free to comment here after I've closed it.
