Zero value Kubelet PSI metrics emitted even if underlying OS doesn't enable it #136333
Closed
Labels
kind/bug, priority/important-longterm, sig/node, triage/accepted
Status
Done
What happened?
In Kubernetes 1.34, the KubeletPSI feature gate was set to true by default. When Kubernetes runs on an OS that does not have PSI enabled, no PSI metrics should be generated, yet they are still emitted with zero values. Here are the counts of the metrics observed during a test:
I suspect this part of the code:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cadvisor/cadvisor_linux.go#L105C53-L105C63
and the cAdvisor Prometheus collector does this: https://github.com/google/cadvisor/blob/master/metrics/prometheus.go#L1842
cc: @haircommander @bitoku
What did you expect to happen?
There should be no PSI metrics when the underlying OS does not have PSI enabled.
How can we reproduce it (as minimally and precisely as possible)?
Run Kubernetes 1.34 or later on a node whose OS does not have PSI enabled, and use Grafana to monitor the PSI metrics.
Anything else we need to know?
This is a nice-to-have. It has no real functional impact, and the added cardinality is negligible. However, the zero values confuse end users: in Grafana charts they give the visual impression that PSI is enabled.
I can propose a fix on this.
Kubernetes version
1.34
Cloud provider
OS version
Linux: 5.14.0-570.78.1.el9_6.x86_64
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)