Failing kubelet /stats/summary API and ResourceMetricsAPI tests on cgroup v2 #99230

@harche

Which jobs are failing:

pull-kubernetes-node-crio-cgrpv2-e2e

Which test(s) are failing:

E2eNode Suite: [k8s.io] Summary API [NodeConformance] when querying /stats/summary should report resource usage through the stats api
E2eNode Suite: [k8s.io] ResourceMetricsAPI [NodeFeature:ResourceMetrics] when querying /resource/metrics should report resource usage through the resource metrics api

Since when has it been failing:

From what I have observed, these tests have always failed on cgroup v2 hosts.

Testgrid link:

https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/99110/pull-kubernetes-node-crio-cgrpv2-e2e/1361561789094957056/

Reason for failure:

cgroup v2 exposes a different set of files than cgroup v1 for reporting metrics. Many places in cadvisor, where the metrics are ultimately read, still assume cgroup v1 paths and files.
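To illustrate the kind of mismatch involved, here is a minimal Go sketch, not cadvisor's actual code. It relies on the documented kernel interface: a cgroup v2 (unified) hierarchy has a `cgroup.controllers` file at its root, memory usage is in `memory.current` rather than v1's `memory.usage_in_bytes`, and CPU usage is the `usage_usec` field of `cpu.stat` rather than v1's `cpuacct.usage`. The helper names (`isCgroupV2`, `memoryUsageFile`, `cpuUsageNanosV2`) and the `/sys/fs/cgroup` mount point are assumptions made for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// isCgroupV2 reports whether root looks like a cgroup v2 (unified) mount.
// A v2 hierarchy exposes cgroup.controllers at its root; v1 does not.
// Hypothetical helper, for illustration only.
func isCgroupV2(root string) bool {
	_, err := os.Stat(filepath.Join(root, "cgroup.controllers"))
	return err == nil
}

// memoryUsageFile returns the name of the file that holds the current
// memory usage in bytes for the given cgroup version.
func memoryUsageFile(v2 bool) string {
	if v2 {
		return "memory.current"
	}
	return "memory.usage_in_bytes"
}

// cpuUsageNanosV2 extracts usage_usec from the contents of a v2 cpu.stat
// file and converts it to nanoseconds, the unit v1's cpuacct.usage uses.
func cpuUsageNanosV2(cpuStat string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(cpuStat))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 2 && fields[0] == "usage_usec" {
			usec, err := strconv.ParseUint(fields[1], 10, 64)
			if err != nil {
				return 0, err
			}
			return usec * 1000, nil
		}
	}
	return 0, fmt.Errorf("usage_usec not found in cpu.stat")
}

func main() {
	// Assumes the conventional mount point; adjust for other setups.
	v2 := isCgroupV2("/sys/fs/cgroup")
	fmt.Println("cgroup v2:", v2, "memory file:", memoryUsageFile(v2))
}
```

Code that hardcodes only the v1 file names would find no such files on a v2 host and report missing or zero usage, which is consistent with the test failures described above.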

Anything else we need to know:

The following PRs address the issue partially, but not completely.

google/cadvisor#2801
google/cadvisor#2800

Metadata

Labels

kind/failing-test: Categorizes issue or PR as related to a consistently or frequently failing test.
sig/node: Categorizes an issue or PR as relevant to SIG Node.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Projects

Status

Done
