services do not start: ingress-sbox is already present #36743

@tonistiigi


When services are created with port options, they sometimes fail to start. The logs show that the task receives an error on startup: "container ingress-sbox is already present in sandbox ingress_sbox". The manager then keeps retrying but always receives the same error.

level=error msg="fatal task error" error="container ingress-sbox is already present in sandbox ingress_sbox" module=node/agent/taskmanager node.id=ok1mq6f0r0yhmb6l7c5vplet7 service.id=ikjt6wfp2br0p5n47hk5dwl6m task.id=vb7zw9c3a7qwuq0qehn8pmg8b
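For triage, a daemon log can be scanned for this error and grouped by task. The following is only a minimal sketch: `parseFields` and `isIngressSandboxConflict` are hypothetical helper names, and the parser assumes the plain `key=value` / `key="quoted value"` layout of the line above, with no escaped quotes inside values.

```go
package main

import (
	"fmt"
	"strings"
)

// parseFields splits a key=value log line into a map. Values may be
// double-quoted. Escaped quotes inside values are NOT handled (assumption).
func parseFields(line string) map[string]string {
	fields := make(map[string]string)
	for len(line) > 0 {
		line = strings.TrimLeft(line, " ")
		eq := strings.IndexByte(line, '=')
		if eq < 0 {
			break
		}
		key := line[:eq]
		rest := line[eq+1:]
		var val string
		if strings.HasPrefix(rest, `"`) {
			// Quoted value: read up to the next double quote.
			end := strings.IndexByte(rest[1:], '"')
			if end < 0 {
				val, rest = rest[1:], ""
			} else {
				val, rest = rest[1:end+1], rest[end+2:]
			}
		} else {
			// Bare value: read up to the next space.
			sp := strings.IndexByte(rest, ' ')
			if sp < 0 {
				val, rest = rest, ""
			} else {
				val, rest = rest[:sp], rest[sp+1:]
			}
		}
		fields[key] = val
		line = rest
	}
	return fields
}

// isIngressSandboxConflict reports whether the parsed line carries the
// error text quoted in this issue.
func isIngressSandboxConflict(fields map[string]string) bool {
	return strings.Contains(fields["error"], "is already present in sandbox")
}

func main() {
	line := `level=error msg="fatal task error" error="container ingress-sbox is already present in sandbox ingress_sbox" module=node/agent/taskmanager node.id=ok1mq6f0r0yhmb6l7c5vplet7 service.id=ikjt6wfp2br0p5n47hk5dwl6m task.id=vb7zw9c3a7qwuq0qehn8pmg8b`
	f := parseFields(line)
	if isIngressSandboxConflict(f) {
		fmt.Printf("service=%s task=%s\n", f["service.id"], f["task.id"])
	}
}
```

Counting distinct `task.id` values per `service.id` over a whole log would show how many times the manager retried before the test gave up.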

One place where this shows up consistently is Moby CI. Both TestAPIServiceUpdatePort and TestSwarmPublishDuplicatePorts have failed ~70 times in the last 2 months (full stats and links to failures: https://gist.githubusercontent.com/tonistiigi/9e91cbb968dea83113d09ce5f9dbff79/raw/17696e6fc2cce473d101f70ce6bdeb75a4f35bfd/moby_master_stats)

Examples:

https://jenkins.dockerproject.org/job/Docker%20Master/label=ubuntu-1604-aufs-stable/9755/console

17:31:28 
17:31:28 ----------------------------------------------------------------------
17:31:28 FAIL: docker_api_swarm_service_test.go:32: DockerSwarmSuite.TestAPIServiceUpdatePort
17:31:28 
17:31:28 [da70551459fe6] waiting for daemon to start
17:31:28 [da70551459fe6] daemon started
17:31:28 
17:31:28 docker_api_swarm_service_test.go:38:
17:31:28     waitAndAssert(c, defaultReconciliationTimeout, d.CheckActiveContainerCount, checker.Equals, 1)
17:31:28 docker_utils_test.go:452:
17:31:28     c.Assert(v, checker, args...)
17:31:28 ... obtained int = 0
17:31:28 ... expected int = 1
17:31:28 
17:31:28 [da70551459fe6] exiting daemon
17:31:40 
17:31:40 ----------------------------------------------------------------------

Daemon logs: https://gist.github.com/tonistiigi/6cb35ce9947cdbbcf410cf9863945220#file-docker-log-L317


https://jenkins.dockerproject.org/job/Docker%20Master/label=ubuntu-1604-aufs-stable/9755/console

17:55:13 
17:55:13 ----------------------------------------------------------------------
17:55:13 FAIL: docker_cli_swarm_test.go:1610: DockerSwarmSuite.TestSwarmPublishDuplicatePorts
17:55:13 
17:55:13 [d682c9bd5a30d] waiting for daemon to start
17:55:13 [d682c9bd5a30d] daemon started
17:55:13 
17:55:13 docker_cli_swarm_test.go:1618:
17:55:13     // make sure task has been deployed.
17:55:13     waitAndAssert(c, defaultReconciliationTimeout, d.CheckActiveContainerCount, checker.Equals, 1)
17:55:13 docker_utils_test.go:452:
17:55:13     c.Assert(v, checker, args...)
17:55:13 ... obtained int = 0
17:55:13 ... expected int = 1
17:55:13 
17:55:13 [d682c9bd5a30d] exiting daemon
17:55:16 
17:55:16 ----------------------------------------------------------------------

Daemon logs: https://gist.github.com/tonistiigi/92a587084a19e388ac96ec5b096a4dd2#file-gistfile1-txt-L927

Reported previously in #30427 #36501

@fcrisciani

Labels: area/networking, area/swarm, area/testing, kind/bug