Flaky test: TestContainerConsumedStats on Windows CRI Integration Test  #7936

@fangn2

Description

The CRI integration test TestContainerConsumedStats fails frequently on Windows on the main branch (3 of the latest 6 CI builds): 1, 2, 3.

=== RUN   TestContainerConsumedStats
    container_stats_test.go:76: Create a pod config and run sandbox container
    main_test.go:672: Pull test image "registry.k8s.io/e2e-test-images/resource-consumer:1.10"
    container_stats_test.go:82: Create a container config and run container in a pod
    container_stats_test.go:99: Fetch initial stats for container
    container_stats_test.go:113: Initial container memory consumption is 41.117188 MB. Consume 100 MB and expect the reported stats to increase accordingly
E0106 20:39:21.086023    2732 remote_runtime.go:412] ExecSync 8d56f2e540bc6452b47fe46fccdc50c740cbc2ba02358ea286ac4abb680ee94a 'testlimit.exe -accepteula -d 25 -c 4' from runtime service failed: rpc error: code = DeadlineExceeded desc = failed to exec in container: timeout 30s exceeded: context deadline exceeded
    container_stats_test.go:129: 
        	Error Trace:	D:\a\containerd\containerd\src\github.com\containerd\containerd\container_stats_test.go:129
        	Error:      	Received unexpected error:
        	            	timeout exceeded
        	Test:       	TestContainerConsumedStats
--- FAIL: TestContainerConsumedStats (38.28s)

The random failure appears to have been introduced in #7892, which targeted Linux. Specifically, this change.

Without further research, I think checking the OS with goruntime.GOOS and restoring the previous test method on Windows should be sufficient to fix this. Raised PR #7935.
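A minimal sketch of the OS check suggested above, not the code from PR #7935. The names `statsCheckStrategy`, `strategyCurrent`, `strategyPrevious`, and `strategyFor` are hypothetical and only illustrate branching on `goruntime.GOOS`; the actual test bodies are not reproduced here.

```go
package main

import (
	"fmt"
	goruntime "runtime" // aliased as in the issue text
)

// statsCheckStrategy is a hypothetical label for how the test verifies
// consumed memory; it is not a type from the containerd code base.
type statsCheckStrategy string

const (
	// strategyCurrent stands for the check introduced in #7892 (aimed at Linux).
	strategyCurrent statsCheckStrategy = "current"
	// strategyPrevious stands for the check the test used before #7892.
	strategyPrevious statsCheckStrategy = "previous"
)

// strategyFor picks the verification method for the given GOOS value.
// Taking goos as a parameter keeps the branch testable on any platform.
func strategyFor(goos string) statsCheckStrategy {
	if goos == "windows" {
		return strategyPrevious
	}
	return strategyCurrent
}

func main() {
	// In a real test, the selected strategy would decide which assertions run.
	fmt.Println(strategyFor(goruntime.GOOS))
}
```

Passing the OS name in as an argument, rather than reading `goruntime.GOOS` inside the helper, is a design choice that lets both branches be exercised from a single platform.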

@AkihiroSuda Any suggestions?

Steps to reproduce the issue

Random.
Trigger several builds, wait and see.
Or check the latest failed runs on the main branch.

Describe the results you received and expected

Random failures of the CRI integration test TestContainerConsumedStats on Windows, which previously passed the build, have been seen since the change was merged into main.

I expect reliable and consistent test results.

What version of containerd are you using?

HEAD of main, commit c0c3546

Any other relevant information

No response

Show configuration if it is related to CRI plugin.

No response
