Add TC to check dynamic subnet for ingress network #39966

Closed
arkodg wants to merge 2 commits into moby:master from arkodg:add-tc-dynamic-ingress-network

Conversation

@arkodg
Contributor

@arkodg arkodg commented Sep 20, 2019

Needs to be rebased on top of #39953

@derek

derek bot commented Sep 20, 2019

Thank you for your contribution. I've just checked and your Pull Request doesn't appear to have any description.
That's something we need before your Pull Request can be merged. Please see our contributing guide.

@thaJeztah
Member

Rebased this branch to see if #39953 and this PR combined will make CI go green

@thaJeztah
Member

Not sure why x86 keeps failing; tests have completed successfully, but then the job gets killed after 2 hours. Was wondering whether this could be related to #39920, but not sure why it would fail only on this PR.

https://ci.docker.com/public/job/moby/job/PR-39966/5/execution/node/187/log/

00:50:44.516  DONE 1223 tests, 43 skipped in 2257.616s
00:50:44.516  ---> Making bundle: .integration-daemon-stop (in bundles/test-integration)
00:50:44.516  ++++ cat bundles/test-integration/docker.pid
00:50:44.516  +++ kill 7271
00:50:51.057  umount: bundles/test-integration/root: mountpoint not found
00:50:51.057  +++ /etc/init.d/apparmor stop
00:50:51.057  Clearing AppArmor profiles cache:.
00:50:51.057  All profile caches have been cleared, but no profiles have been unloaded.
00:50:51.057  Unloading profiles will leave already running processes permanently
00:50:51.057  unconfined, which can lead to unexpected situations.
00:50:51.057  
00:50:51.057  To set a process to complain mode, use the command line tool
00:50:51.057  'aa-complain'. To really tear down all profiles, run the init script
00:50:51.057  with the 'teardown' option."
00:50:51.057  
02:00:06.451  Sending interrupt signal to process
02:00:06.481  Sending interrupt signal to process

@thaJeztah thaJeztah force-pushed the add-tc-dynamic-ingress-network branch from 5e26dc6 to e2b5ac7 on September 23, 2019 15:13
@thaJeztah
Member

rebased #39953 and this one, in case there was a change/fix in master that wasn't yet in this branch

@thaJeztah
Member

Still failing; being killed after 2 hours. Could use some help on this one to figure out what's causing it; https://ci.docker.com/public/blue/organizations/jenkins/moby/detail/PR-39966/6/pipeline

@thaJeztah thaJeztah force-pushed the add-tc-dynamic-ingress-network branch 2 times, most recently from 2d06309 to 3a51d97 on September 30, 2019 17:57
@thaJeztah
Member

rebased; wondering if the failure was related to #40016

@thaJeztah
Member

still failing; something fishy going on

@dperny @arkodg

@arkodg
Contributor Author

arkodg commented Sep 30, 2019

@thaJeztah I'm not a CI expert, but the make tests are getting a SIGTERM after 1hr59min, and the Jenkinsfile timeout is set to 2hrs:

timeout(time: 2, unit: 'HOURS')

Does that theory sound right, @seemethere?

@arkodg
Contributor Author

arkodg commented Sep 30, 2019

cc superadmin @tiborvass

@thaJeztah
Member

I'm not a CI expert, but the make tests are getting a SIGTERM after 1hr59min, and the Jenkinsfile timeout is set to 2hrs

That's correct, but the test should take roughly 55 minutes to run, and on this PR it ends after 40 minutes;

00:40:43.713  DONE 1253 tests, 44 skipped, 2 failures in 2024.216s

Yet it still gets killed after 2 hours.

@thaJeztah
Member

thaJeztah commented Oct 1, 2019

Actually, I see there's a panic (slice bounds out of range) in a test; https://ci.docker.com/public/job/moby/job/PR-39966/9/execution/node/151/log/

00:37:28.660      --- FAIL: TestDockerSuite/TestEventsContainerEvents (1.19s)
00:37:28.660          suite.go:65: test suite panicked: runtime error: slice bounds out of range [:5] with capacity 4
00:37:28.660              goroutine 3978 [running]:
00:37:28.660              runtime/debug.Stack(0xc0026e3908, 0x1ad9bc0, 0xc0008100c0)
00:37:28.660              	/usr/local/go/src/runtime/debug/stack.go:24 +0x9d
00:37:28.660              github.com/docker/docker/internal/test/suite.failOnPanic(0xc00185e600)
00:37:28.660              	/go/src/github.com/docker/docker/internal/test/suite/suite.go:65 +0x57
00:37:28.660              panic(0x1ad9bc0, 0xc0008100c0)
00:37:28.660              	/usr/local/go/src/runtime/panic.go:679 +0x1b2
00:37:28.660              github.com/docker/docker/integration-cli.(*DockerSuite).TestEventsContainerEvents(0x2f7d7a8, 0xc00185e600)
00:37:28.660              	/go/src/github.com/docker/docker/integration-cli/docker_cli_events_test.go:89 +0x3c5
00:37:28.660              reflect.Value.call(0xc0000c4f00, 0xc0008036c0, 0x13, 0x1bfd18b, 0x4, 0xc000e8df30, 0x2, 0x2, 0xc00075c618, 0x40d903, ...)
00:37:28.660              	/usr/local/go/src/reflect/value.go:460 +0x5f6
00:37:28.660              reflect.Value.Call(0xc0000c4f00, 0xc0008036c0, 0x13, 0xc00075c730, 0x2, 0x2, 0xf, 0x0, 0x0)
00:37:28.660              	/usr/local/go/src/reflect/value.go:321 +0xb4
00:37:28.660              github.com/docker/docker/internal/test/suite.Run.func2(0xc00185e600)
00:37:28.660              	/go/src/github.com/docker/docker/internal/test/suite/suite.go:57 +0x2c2
00:37:28.660              testing.tRunner(0xc00185e600, 0xc0008dbea0)
00:37:28.660              	/usr/local/go/src/testing/testing.go:909 +0xc9
00:37:28.660              created by testing.(*T).Run
00:37:28.660              	/usr/local/go/src/testing/testing.go:960 +0x350

@thaJeztah
Member

hm... looks like there's a missing assert;

assert.Assert(c, len(events) >= 5) //Missing expected event
containerEvents := eventActionsByIDAndType(c, events, "container-events-test", "container")
assert.Assert(c, is.DeepEqual(containerEvents[:5], []string{"create", "attach", "start", "die", "destroy"}), out)

(which doesn't explain the failure, but explains the panic)

@thaJeztah
Member

opened #40026 to prevent the panic

@thaJeztah
Member

Wait; why is this one now failing the vendor check as well?


[2019-10-03T15:13:59.478Z] The result of vndr differs
[2019-10-03T15:13:59.478Z] 
[2019-10-03T15:13:59.478Z]  D vendor/golang.org/x/sync/singleflight/singleflight.go
[2019-10-03T15:13:59.478Z] 
[2019-10-03T15:13:59.478Z] Please vendor your package with github.com/LK4D4/vndr.

@thaJeztah
Member

OK, looks like it's a problem on master; opened #40037 to fix it

@thaJeztah thaJeztah force-pushed the add-tc-dynamic-ingress-network branch from b8c8a2b to 203b5fe on October 3, 2019 21:10
@arkodg arkodg force-pushed the add-tc-dynamic-ingress-network branch from 203b5fe to 7431bb3 on October 4, 2019 20:57
@thaJeztah thaJeztah force-pushed the add-tc-dynamic-ingress-network branch from 7431bb3 to 6010d04 on October 7, 2019 17:16
@arkodg arkodg force-pushed the add-tc-dynamic-ingress-network branch from 6010d04 to 9d37447 on October 10, 2019 00:44
@arkodg
Contributor Author

arkodg commented Oct 10, 2019

@thaJeztah I don't think it's hanging; it's getting killed after the Jenkins timeout of 2h. Can we please extend it and retry?

@andrewhsu
Contributor

I kicked off a Replay on the last run, but with the jenkins timeout set to 3 hrs instead of 2 hrs: https://ci.docker.com/public/job/moby/job/PR-39966/15/

@thaJeztah
Member

@thaJeztah I don't think it's hanging; it's getting killed after the Jenkins timeout of 2h. Can we please extend it and retry?

@arkodg see my earlier comment; it should never reach the 2 hour limit #39966 (comment)

@thaJeztah
Member

@andrewhsu download.docker.com is partially down, so we'll have to restart again after it's up again

@thaJeztah
Member

Kicked off CI again, as download.docker.com should be back up

thaJeztah and others added 2 commits October 22, 2019 00:44
full diff: moby/swarmkit@7dded76...a8bbe7d

changes included:

- moby/swarmkit#2867 Only update non-terminal tasks on node removal
  - related to moby/swarmkit#2806 Fix leaking task resources when nodes are deleted
- moby/swarmkit#2880 Bump to golang 1.12.9
- moby/swarmkit#2886 Bump vendoring to match current docker/docker master
  - regenerates protobufs
- moby/swarmkit#2890 Remove hardcoded IPAM config subnet value for ingress network
  - fixes [ENGORC-2651] Specifying --default-addr-pool for docker swarm init is not picked up by ingress network

Signed-off-by: Sebastiaan van Stijn <[email protected]>
@andrewhsu
Contributor

Hmm...looks like the PR check hit the 2 hr timeout:

00:42:24.678  === RUN   TestDockerSwarmSuite/TestSwarmVolumePlugin
00:43:11.289  === RUN   TestDockerSwarmSuite/TestUnlockEngineAndUnlockedSwarm
02:00:03.738  Sending interrupt signal to process

@arkodg
Contributor Author

arkodg commented Oct 23, 2019

@andrewhsu there is some issue with the latest swarmkit refpoint that needs to be triaged;
just cherry-picking the ingress network fix has no problems: docker-archive#402

@thaJeztah
Member

replaced by / included in #40309

@thaJeztah thaJeztah closed this Apr 15, 2020