
CI: Introduce flaky test finder#38523

Merged
thaJeztah merged 1 commit into moby:master from olljanat:flaky-test-finder
Jan 12, 2019

Conversation

@olljanat
Contributor

@olljanat olljanat commented Jan 9, 2019

- What I did
We all hate to see CI builds failing randomly due to flaky tests.

A lot of work is still needed before all builds run smoothly, but to make sure that new PRs do not make things worse, I implemented a "flaky test finder" which detects new integration tests in PR commits and runs stress tests for them.

- How I did it
Included a bash script which:

  • detects new tests in integration/*_test.go
  • runs hack/make.sh test-integration five times with -test.count 5 -test.run set
  • => new tests are run 25 times in total before continuing to the normal integration tests.
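
The extraction of test names into test flags works along these lines (a minimal, self-contained sketch: the diff output is hard-coded here as an example, while the real script reads it from git):

```shell
#!/usr/bin/env bash
# Sketch of the test-name extraction: turn added "+func TestX(..." diff
# lines into a pipe-separated pattern for -test.run. The diff output is
# hard-coded as an example; the real script derives it from the PR diff.
new_tests='+func TestCommitFlakyRandom1(t *testing.T) {
+func TestCommitFlakyRandom2(t *testing.T) {'

TESTARRAY=$(echo "$new_tests" | sed 's/+func //' | awk -F'\\(' '{print $1}' | tr '\n' '|')
export TESTFLAGS="-test.count 5 -test.run ${TESTARRAY%?}"
export TEST_REPEAT=5

echo "Using test flags: $TESTFLAGS"
```

`${TESTARRAY%?}` strips the trailing `|` left over from `tr`, which matches the "Using test flags" line in the Janky log below.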

- How to verify it
Included a second commit ( 4a5d478 ) which contains two example flaky tests. I will drop it after the design is approved.

@olljanat
Contributor Author

olljanat commented Jan 9, 2019

Janky worked correctly:

19:16:37 ---> Making bundle: test-integration-flaky (in bundles/test-integration-flaky)
19:16:38 
19:16:38 Found new integrations tests:
19:16:38 +func TestCommitFlakyRandom1(t *testing.T) {
19:16:38 +func TestCommitFlakyRandom2(t *testing.T) {
19:16:38 Running stress test for them.
19:16:38 Using test flags: -test.count 5 -test.run TestCommitFlakyRandom1|TestCommitFlakyRandom2
.
.
.
19:18:08 Running /go/src/github.com/docker/docker/integration/image
19:18:08 INFO: Testing against a local daemon
19:18:08 === RUN   TestCommitFlakyRandom1
19:18:08 --- PASS: TestCommitFlakyRandom1 (0.00s)
19:18:08 === RUN   TestCommitFlakyRandom2
19:18:08 --- PASS: TestCommitFlakyRandom2 (0.00s)
19:18:08 === RUN   TestCommitFlakyRandom1
19:18:08 --- PASS: TestCommitFlakyRandom1 (0.00s)
19:18:08 === RUN   TestCommitFlakyRandom2
19:18:08 --- PASS: TestCommitFlakyRandom2 (0.00s)
19:18:08 === RUN   TestCommitFlakyRandom1
19:18:08 --- FAIL: TestCommitFlakyRandom1 (0.00s)
19:18:08     commit_test.go:61: assertion failed: error is not nil: Game Over, Better Luck Next Time
19:18:08 === RUN   TestCommitFlakyRandom2
19:18:08 --- PASS: TestCommitFlakyRandom2 (0.00s)
19:18:08 === RUN   TestCommitFlakyRandom1
19:18:08 --- PASS: TestCommitFlakyRandom1 (0.00s)
19:18:08 === RUN   TestCommitFlakyRandom2
19:18:08 --- PASS: TestCommitFlakyRandom2 (0.00s)
19:18:08 === RUN   TestCommitFlakyRandom1
19:18:08 --- PASS: TestCommitFlakyRandom1 (0.00s)
19:18:08 === RUN   TestCommitFlakyRandom2
19:18:08 --- PASS: TestCommitFlakyRandom2 (0.00s)
19:18:08 FAIL

and it looks like the other CI jobs do not have the .git folder mounted, so they cannot use this.

And this is output from Janky after I dropped example flaky tests from this PR:

20:03:52  ---> Making bundle: test-integration-flaky (in bundles/test-integration-flaky)
20:03:52  No new tests added to integration.

@olljanat
Contributor Author

olljanat commented Jan 9, 2019

(reserved for my derek commands)

@derek derek bot added the area/testing label Jan 9, 2019
@codecov

codecov bot commented Jan 9, 2019

Codecov Report

Merging #38523 into master will increase coverage by 0.06%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #38523      +/-   ##
==========================================
+ Coverage   36.64%    36.7%   +0.06%     
==========================================
  Files         608      608              
  Lines       45173    45230      +57     
==========================================
+ Hits        16553    16603      +50     
- Misses      26339    26340       +1     
- Partials     2281     2287       +6

@olljanat
Contributor Author

olljanat commented Jan 9, 2019

@vdemeester @kolyshkin PTAL

@thaJeztah
Member

/cc @tonistiigi

@thaJeztah
Member

This is really cool!

I was also thinking about whether we could label flaky tests in the test itself and run them separately, e.g. with a special "skip";

skip.If(t, SkipFlakyTests)

Then either;

  • have a separate CI job for the flaky tests (retry those tests if they fail, and PASS if one of those runs passes)
  • same as above, but include them in the current CI jobs and run the integration tests twice; once with and once without the flaky tests
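
The retry idea in the first bullet could be sketched roughly like this (run_flaky_suite is a hypothetical stand-in for the real CI step, with the flakiness simulated):

```shell
#!/usr/bin/env bash
# Rough sketch of a "separate CI job for flaky tests": retry the flaky
# suite up to N times and PASS as soon as any single attempt passes.
# run_flaky_suite is a hypothetical stand-in that fails twice, then passes.
attempt_no=0
run_flaky_suite() {
    attempt_no=$((attempt_no + 1))
    [ "$attempt_no" -ge 3 ]
}

result=FAIL
for i in 1 2 3; do
    if run_flaky_suite; then
        result=PASS
        break
    fi
done
echo "flaky job: $result after $attempt_no attempt(s)"
```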

@tonistiigi
Member

btw, updated flaky stats https://gist.github.com/tonistiigi/9e91cbb968dea83113d09ce5f9dbff79

@olljanat
Contributor Author

@tonistiigi btw, how do you generate that one? I have been working on similar stats at https://github.com/olljanat/moby-jenkins-statistics but I can see that you have data from a longer period.

@olljanat
Contributor Author

olljanat commented Jan 10, 2019

I was also thinking if we could label flaky tests in the test itself, and either run them separate, e.g. have a special "skip";

skip.If(t, SkipFlakyTests)

@thaJeztah For existing known flaky tests that is an option, but I'm not sure how to actually run only the tests which have the SkipFlakyTests flag?

@thaJeztah
Member

I'm not sure how to actually run only the tests which have the SkipFlakyTests flag?

Might be hacky. Some quick thoughts;

  • Abuse testing.Short() to either skip or run them
  • Add a Flaky suffix to the test names, and use that to match flaky and non-flaky tests
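
A hypothetical sketch of the suffix idea (test names invented for illustration; grep stands in for Go's test-name regex matching):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the "Flaky suffix" idea: a naming convention
# lets a single -test.run regex split flaky from non-flaky tests.
# Test names are invented; grep stands in for Go's test-name matching.
all_tests='TestCommitBasic
TestCommitFlaky
TestServiceUpdateLabelFlaky'
flaky_pattern='Flaky$'

echo "a flaky-only job could use: -test.run '.*${flaky_pattern}'"
echo "matching tests:"
echo "$all_tests" | grep "$flaky_pattern"
```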

@tonistiigi
Member

btw, how do you generate that one?

https://gist.github.com/tonistiigi/2e474449ac80e25753a84712e19dfd4d . I have all master logs/statuses since build 9564 if you are interested. I lost a couple of days in the last run because I forgot to run it in December. Only master, as PRs may be failing because of actual bugs.

olljanat added a commit to olljanat/moby that referenced this pull request Jan 10, 2019
… tests for all existing integration tests

Signed-off-by: Olli Janatuinen <[email protected]>
@olljanat
Contributor Author

  • Add a Flaky suffix to the test names, and use that to match flaky and non-flaky tests

That is probably the easiest, but as most of the flaky tests are actually in the legacy integration-cli tests, it is better to do that in another PR.

Btw. I also tested running this logic against all existing tests in integration, and this was the result:

--- FAIL: TestServiceUpdateLabel (2.88s)
--- FAIL: TestServiceUpdateLabel (2.83s)
--- FAIL: TestServiceUpdateLabel (2.85s)
--- FAIL: TestServiceUpdateLabel (2.90s)
--- FAIL: TestCreateServiceSecretFileMode (33.23s)
--- FAIL: TestServiceUpdateLabel (2.84s)

#38520 will fix TestServiceUpdateLabel, and TestCreateServiceSecretFileMode is already reported in #37132

Member

@thaJeztah thaJeztah left a comment


some nits

@tianon could you have a look as well?

@olljanat olljanat force-pushed the flaky-test-finder branch 2 times, most recently from 7d0ef03 to b3e49df Compare January 10, 2019 20:18
@olljanat
Contributor Author

olljanat commented Jan 10, 2019

Updated 😄

@kolyshkin
Contributor

LGTM, and thanks 👍

Compares PR commit(s) to the HEAD of the moby/moby master branch and, if it
finds new (or renamed) integration tests, runs stress tests for them.

Signed-off-by: Olli Janatuinen <[email protected]>
Member

@vdemeester vdemeester left a comment


LGTM 🐯

@olljanat
Contributor Author

Hmm. Weird error on Janky. Maybe some server it tries to connect to is failing?

08:06:14 FAIL: docker_cli_daemon_plugins_test.go:161: DockerDaemonSuite.TestDaemonKillWithPlugins
08:06:14 
08:06:14 [d2ddfbf1a9745] waiting for daemon to start
08:06:14 [d2ddfbf1a9745] daemon started
08:06:14 
08:06:14 docker_cli_daemon_plugins_test.go:166:
08:06:14     c.Fatalf("Could not install plugin: %v %s", err, out)
08:06:14 ... Error: Could not install plugin: exit status 1 Error response from daemon: error parsing HTTP 404 response body: invalid character '<' looking for beginning of value: "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2 Final//EN\">\n<title>404 Not Found</title>\n<h1>Not Found</h1>\n<p>The requested URL was not found on the server.  If you entered the URL manually please check your spelling and try again.</p>\n"

Let's try once more...

Member

@thaJeztah thaJeztah left a comment


LGTM

let's wait a bit to see if @tianon wants a chance to review, but otherwise we can merge 🤗

@tianon
Member

tianon commented Jan 12, 2019

Between @thaJeztah and @cpuguy83, my syntax comments are all taken! 🙌 ❤️

This looks sane to me. 👍

@thaJeztah thaJeztah merged commit ad2765b into moby:master Jan 12, 2019
@thaJeztah
Member

Thanks!!

(
TESTARRAY=$(echo "$new_tests" | sed 's/+func //' | awk -F'\\(' '{print $1}' | tr '\n' '|')
export TESTFLAGS="-test.count 5 -test.run ${TESTARRAY%?}"
export TEST_REPEAT=5
Contributor


@olljanat @thaJeztah why set both -test.count 5 and TEST_REPEAT=5? That will run each test 25 times. Also, if you do set -test.count you need to export TIMEOUT=... (which will set -test.timeout) to 5 times the existing TIMEOUT.

Contributor Author


5 runs is not enough to catch the known flaky test TestCreateServiceSecretFileMode, for example. See the comments above where I tested running this against the existing tests.

The difference between this and just running 25 times is that this will restart the daemon after every 5 test runs.
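
The run structure can be sketched as follows (a simulation only: echo stands in for the actual daemon restarts, and the counters mirror the TEST_REPEAT and -test.count settings):

```shell
#!/usr/bin/env bash
# Sketch of the run structure: TEST_REPEAT controls outer iterations with
# a daemon restart between them, while -test.count repeats each test
# within one daemon lifetime. Daemon restarts are simulated with echo.
TEST_REPEAT=5
TEST_COUNT=5
total=0
for repeat in $(seq 1 "$TEST_REPEAT"); do
    echo "(re)starting daemon, iteration $repeat"
    for run in $(seq 1 "$TEST_COUNT"); do
        total=$((total + 1))
    done
done
echo "total runs per new test: $total"
```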

Increasing the timeout can of course be done if needed.

Btw. Did you see this timing out on some PR now?

Contributor


@olljanat yes on our internal EE fork. I'm fixing the bug in #38693

@kolyshkin
Contributor

The difference between this and just running 25 times is that this will restart the daemon after every 5 test runs

Makes sense to add a comment explaining that

@tiborvass
Contributor

Thanks, I'll fix it in #38693



9 participants