Skip to content

integration/container: add a base test for C/R#38452

Merged
thaJeztah merged 2 commits intomoby:masterfrom
avagin:cr-test
Mar 7, 2019
Merged

integration/container: add a base test for C/R#38452
thaJeztah merged 2 commits intomoby:masterfrom
avagin:cr-test

Conversation

@avagin
Copy link
Contributor

@avagin avagin commented Dec 29, 2018

- What I did

I spent a Friday night to write this test.

- How I did it

Silently

- How to verify it

Just run it

- Description for the changelog

This test creates a container, checkpoints it twice, restore it once and checks that a content of a tmpfs mount is restored correctly.

- A picture of a cute animal (not mandatory but encouraged)
f1c3233b87f1c6e022aa5f9a7117fe59

@avagin
Copy link
Contributor Author

avagin commented Dec 29, 2018

Cc: @kolyshkin @rst0git

@codecov
Copy link

codecov bot commented Dec 30, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@9c83848). Click here to learn what that means.
The diff coverage is 0%.

@@            Coverage Diff            @@
##             master   #38452   +/-   ##
=========================================
  Coverage          ?   36.48%           
=========================================
  Files             ?      613           
  Lines             ?    45882           
  Branches          ?        0           
=========================================
  Hits              ?    16741           
  Misses            ?    26847           
  Partials          ?     2294

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lifubang should we close #38028 in favour of this commit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. If maintainers think it should close, close it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any objections against config.json with some additional information.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

criu check --all ?

Copy link
Contributor

@fntlnz fntlnz Jan 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that for this test just checking the required features is enough, checking all would require more effort like enabling uffd and specific tmpfs configurations.
I would say, let's have this and then we can add more specific test cases.

@avagin avagin force-pushed the cr-test branch 2 times, most recently from eaac6e8 to 3923a97 Compare December 30, 2018 18:36
@derek derek bot added the rebuild/* label Dec 31, 2018
@olljanat
Copy link
Contributor

olljanat commented Jan 1, 2019

@avagin CI fails which looks to be caused by this change.

Experimental and Janky fails to:

10:01:39 --- FAIL: TestRenameAnonymousContainer (1.72s)
10:01:39     rename_test.go:169: assertion failed: 0 (int) != 1 (inspect.State.ExitCode int): container 8fc6fb78a11c77d8af79ae9e7d73c44ab6c624c10b89c1681484a69ff2b4649e exited with the wrong exitcode: {ContainerJSONBase:0xc00080da20 Mounts:[] Config:0xc0004e03c0 NetworkSettings:0xc0007c8300}

PowerPC and Z fails to:

09:53:38 Running /go/src/github.com/docker/docker/integration/container
09:53:38 INFO: Testing against a local daemon
09:53:38 === RUN   TestCheckpoint
09:53:41 --- FAIL: TestCheckpoint (3.03s)
09:53:41     checkpoint_test.go:55: Error (criu/util.c:816): exited, status=3
09:53:41         Warn  (criu/net.c:2840): Unable to get socket network namespace
09:53:41         Warn  (criu/net.c:2840): Unable to get tun network namespace
09:53:41         Warn  (criu/sk-unix.c:229): unix: Unable to open a socket file: Bad address
09:53:41         Warn  (criu/net.c:2840): Unable to get socket network namespace
09:53:41         Warn  (criu/kerndat.c:881): Can't keep kdat cache on non-tempfs
09:53:41         Looks good.
09:53:41     daemon.go:296: [d572613c2a932] waiting for daemon to start
09:53:41     daemon.go:328: [d572613c2a932] daemon started
09:53:41     checkpoint_test.go:105: ++ type -P true
09:53:41         ++ type -P ip6tables-restore
09:53:41         + mount --bind /bin/true /sbin/ip6tables-restore
09:53:41         ++ type -P true
09:53:41         ++ type -P ip6tables-save
09:53:41         + mount --bind /bin/true /sbin/ip6tables-save
09:53:41     checkpoint_test.go:146: assertion failed: false (false bool) != true (inspect.State.Running bool)
09:53:41     checkpoint_test.go:165: assertion failed: error is not nil: Error response from daemon: cannot start a paused container, try unpause instead
09:53:41     daemon.go:283: [d572613c2a932] exiting daemon
0

@derek derek bot added status/failing-ci Indicates that the PR in its current state fails the test suite area/testing labels Jan 1, 2019
Copy link
Contributor

@fntlnz fntlnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fntlnz
Copy link
Contributor

fntlnz commented Jan 1, 2019

I've been able to reproduce this test on my machine following the process that ended with success.

INFO: Testing against a local daemon                                                                                                                      
=== RUN   TestCheckpoint                                                                                                                                  
--- PASS: TestCheckpoint (4.78s)                                                                                                                          
    checkpoint_test.go:55: Error (criu/uffd.c:264): uffd: Lazy pages are not available: Function not implemented                                          
        Warn  (criu/kerndat.c:881): Can't keep kdat cache on non-tempfs                                                                                   
        Looks good.                                                                                                                                       
    daemon.go:296: [d5c0d8008e63a] waiting for daemon to start                                                                                            
    daemon.go:328: [d5c0d8008e63a] daemon started                                                                                                         
    checkpoint_test.go:105: ++ type -P true                                                                                                               
        ++ type -P ip6tables-restore                                                                                                                      
        + mount --bind /bin/true /sbin/ip6tables-restore                                                                                                  
        ++ type -P true                                                                                                                                   
        ++ type -P ip6tables-save                                                                                                                         
        + mount --bind /bin/true /sbin/ip6tables-save                                                                                                     
    daemon.go:283: [d5c0d8008e63a] exiting daemon                                                                                                         
PASS    

Not sure why it's complaining about pause,I ran multiple times and haven't experienced that, sending a rebuild to verify if it's flaky.

@fntlnz fntlnz added rebuild/* area/daemon Core Engine and removed rebuild/* labels Jan 1, 2019
@derek derek bot added rebuild/janky and removed status/failing-ci Indicates that the PR in its current state fails the test suite rebuild/janky labels Jan 2, 2019
@derek derek bot added the rebuild/z label Jan 2, 2019
Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can simplify the test by reusing some stuff from integration/internal/container?

@kolyshkin
Copy link
Contributor

CI failures

experimental

00:14:35.536 === RUN TestRenameAnonymousContainer
00:14:37.289 --- FAIL: TestRenameAnonymousContainer (1.75s)
00:14:37.289 rename_test.go:169: assertion failed: 0 (int) != 1 (inspect.State.ExitCode int): container baac251d5a1cb2221ffedf6f10acbad166b90e3549601e96d908e76762675a81 exited with the wrong exitcode: {ContainerJSONBase:0xc0007a4840 Mounts:[] Config:0xc000714500 NetworkSettings:0xc000235b00}

Haven't seen this one before, will take a look 🙄

janky

00:24:30.155 === RUN TestServiceUpdateSecrets
00:24:32.287 --- FAIL: TestServiceUpdateSecrets (2.13s)
00:24:32.287 daemon.go:296: [dd677d8ec1bb7] waiting for daemon to start
00:24:32.287 daemon.go:328: [dd677d8ec1bb7] daemon started
00:24:32.287 update_test.go:126: assertion failed: error is not nil: Error response from daemon: rpc error: code = Unknown desc = update out of sequence
00:24:32.287 daemon.go:283: [dd677d8ec1bb7] exiting daemon

Known flaky test; #37547

ppc

01:29:00.842 FAIL: docker_api_swarm_test.go:296: DockerSwarmSuite.TestAPISwarmLeaderElection

Known flaky test; #32673

@vdemeester vdemeester added the status/failing-ci Indicates that the PR in its current state fails the test suite label Jan 4, 2019
@olljanat
Copy link
Contributor

I've been able to reproduce this test on my machine following the process that ended with success.

@fntlnz how you run it?

I tested this with #38523 like this:

  • Cherry picked commit: f7c69d5
  • Renamed TestRenameAnonymousContainer --> TestRenameAnonymousContainer2 and commited it.
  • Enabled experimental mode with: export DOCKER_EXPERIMENTAL=true
  • Run flaky finder make test-integration-flaky

and result was that TestCheckpoint and TestRenameAnonymousContainer2 fails on every run. You can see log on: 38452_stress_test.log

@avagin avagin force-pushed the cr-test branch 2 times, most recently from 2a7a7b1 to 2c88895 Compare February 5, 2019 19:23
@kolyshkin
Copy link
Contributor

Derek add label: rebuild/*

@derek derek bot added the rebuild/* label Feb 20, 2019
@thaJeztah thaJeztah added rebuild/* and removed status/failing-ci Indicates that the PR in its current state fails the test suite rebuild/* labels Feb 21, 2019
@kolyshkin
Copy link
Contributor

@avagin can you please squash all the commits dealing with the test case (i.e. everything except the second one) into one?

@kolyshkin
Copy link
Contributor

The only CI failure is

01:27:19.604 FAIL: docker_cli_daemon_test.go:118: DockerDaemonSuite.TestDaemonRestartUnlessStopped

which should be fixed by a rebase since #38737 is merged.

A container checkpoint directory doesn't have config.json.

Signed-off-by: Andrei Vagin <[email protected]>
@kolyshkin
Copy link
Contributor

LGTM // cc @thaJeztah

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants