Flaky Test: TestSwarmPublishDuplicatePorts on s390 #30427
/cc @aaronlehmann @aboch
Definitely not just a z issue; seen on power as well: https://jenkins.dockerproject.org/job/Docker%20Master%20(powerpc)/3513/console
Last night I tried to reproduce the issue with Docker commit 833f1f4. I ran the test 1000 times on an s390x Debian Jessie host with kernel 4.6 without a single failure.
Thanks @michael-holzheu. This seems to point to another test influencing it somehow; maybe something in the swarm isn't getting cleaned up properly.
Maybe obvious, but looking at the logs from failed runs such as 2306, 2311, 2313, or 2320, we see that it always took 31 seconds from the preceding test until the FAIL message appeared. For example, in run 2320: 09:11:47 to 09:12:18 = 31 seconds.
So the failure reason is the timeout.
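The 31-second gap can be double-checked from the two timestamps quoted for run 2320. The following is only a minimal sketch: `elapsed_seconds` is a hypothetical helper (not part of the Docker tree), and it assumes both `HH:MM:SS` timestamps fall on the same day.

```shell
#!/bin/bash
# Hypothetical helper: seconds between two HH:MM:SS timestamps on the same day.
elapsed_seconds() {
    local sh sm ss eh em es
    IFS=: read -r sh sm ss <<< "$1"
    IFS=: read -r eh em es <<< "$2"
    # The 10# prefix forces base 10 so that values like "09" are not parsed as octal.
    echo $(( (10#$eh * 3600 + 10#$em * 60 + 10#$es) - (10#$sh * 3600 + 10#$sm * 60 + 10#$ss) ))
}

# Timestamps quoted above for run 2320.
elapsed_seconds 09:11:47 09:12:18
```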
@tophj-ibm: I have now tried running the full DockerSwarmSuite in a loop:
#!/bin/bash
for i in {1..1000}
do
    echo "---------------------------------------------"
    echo "TEST: $i"
    echo "---------------------------------------------"
    TESTFLAGS='-check.f DockerSwarmSuite.*' \
    DOCKER_GRAPHDRIVER=vfs DOCKER_EXECDRIVER=native TIMEOUT="120m" \
    hack/make.sh test-integration-cli
done
12 loops were successful, but with run 13 the test cases began to fail.
For runs 14-23, the following 15 tests always fail:
TestSwarmPublishDuplicatePorts-s390x-full-DockerSwarmSuite.txt
I am not sure if this observation helps us with the TestSwarmPublishDuplicatePorts problem ...
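A loop that runs all 1000 iterations regardless of outcome buries the first failing run under later output. One possible variant stops at the first failure and keeps only that run's log, which may make it easier to see what changed once the suite starts failing. This is a sketch under assumptions: `run_until_failure` and the `run-N.log` file names are hypothetical and not part of the Docker tree.

```shell
#!/bin/bash
# Hypothetical helper: run a command up to MAX times and stop at the first failure.
# Usage: run_until_failure MAX command [args...]
run_until_failure() {
    local max=$1
    shift
    local i
    for ((i = 1; i <= max; i++)); do
        echo "TEST: $i"
        # Capture stdout and stderr of this iteration in its own log file.
        if ! "$@" > "run-$i.log" 2>&1; then
            echo "first failure in run $i (log kept in run-$i.log)"
            return 1
        fi
        # Successful runs are not interesting; drop their logs.
        rm -f "run-$i.log"
    done
    echo "all $max runs passed"
}

# Example invocation with the settings from the loop in this thread
# (commented out because it needs a Docker source checkout):
# run_until_failure 1000 env TESTFLAGS='-check.f DockerSwarmSuite.*' \
#     DOCKER_GRAPHDRIVER=vfs DOCKER_EXECDRIVER=native TIMEOUT="120m" \
#     hack/make.sh test-integration-cli
```

Because the function returns a non-zero status on the first failure, it can also be used directly in a CI step or combined with `&&`.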
Update: the last failing PR build was run 2020 (Apr 28, 2017): https://jenkins.dockerproject.org/job/Docker-PRs-s390x/2020/
Since then we have had 848 runs in which the DuplicatePorts test was successful.
Thanks @michael-holzheu - I think we can close this one then, but ping me if I closed prematurely :D
Seen failing again on arm in #33892 :(
Interesting; let me reopen |
Changes included:
- libnetwork#2147 Adding logs for ipam state
- libnetwork#2143 Fix race conditions in the overlay network driver
  - possibly addresses moby#36743 services do not start: ingress-sbox is already present
  - possibly addresses moby#30427 Flaky Test: TestSwarmPublishDuplicatePorts on s390
  - possibly addresses moby#36501 Flaky tests: Service "port" tests
- libnetwork#2142 Add wait time into xtables lock warning
- libnetwork#2135 filter xtables lock warnings when firewalld is active
- libnetwork#2140 Switch from x/net/context to context
- libnetwork#2134 Adding a recovery mechanism for a split gossip cluster
Signed-off-by: Sebastiaan van Stijn <[email protected]>
The CI sometimes fails on the TestSwarmPublishDuplicatePorts test case, for example during Build 1781.
On the other hand, a build in which only TestServiceUpdatePort fails hasn't happened for at least the last 29 builds.