Kubelet - Cadvisor Exposes Misleading IO metrics In Cgroup V2 #102285

#### What happened:
We recently upgraded our k8s node from the cgroup v1 to the cgroup v2 hierarchy and saw that a few IO-related metrics from kubelet (`/metrics/cadvisor`) show erroneous values. For example, `container_fs_writes_bytes_total` and `container_fs_writes_total` are always zero for all cgroups on that node, even though one pod is writing heavily to a PVC on that node, e.g.:
```
container_fs_writes_bytes_total{container="perfrunner-ssd",device="/dev/dm-3",id="/kubepods.slice/kubepods-pod8f151d4f_88fa_4d94_ae23_ad3573b3454a.slice/cri-containerd-24f81c30c494bf78da70e6b6d0d719a5f38ea723d448f4743f7d81c2fa83ef1d.scope",image="buster-fio:0.0.1-rc1",name="24f81c30c494bf78da70e6b6d0d719a5f38ea723d448f4743f7d81c2fa83ef1d",namespace="demo",pod="test-ssd-a-0"} 0 1621806845709
```

#### What you expected to happen:
We expected to see correct values, matching those in the `io.stat` file under the container's cgroup v2 hierarchy.
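
For reference, a cgroup v2 `io.stat` file holds one line per block device, in the form `MAJ:MIN rbytes=N wbytes=N rios=N wios=N dbytes=N dios=N`. Below is a minimal sketch of how the expected write counters could be read directly from that file. The helper name `io_stat_field` is hypothetical, and treating `wbytes` and `wios` as the counters the two write metrics should reflect is an assumption, not something confirmed from the kubelet or cAdvisor source.

```shell
#!/bin/sh
# Hypothetical helper: sum one io.stat key (e.g. wbytes, wios) across all
# devices listed in a cgroup v2 io.stat file.
# Usage: io_stat_field <path-to-io.stat> <key>
io_stat_field() {
    awk -v key="$2" '
        {
            # Each line looks like: "MAJ:MIN rbytes=N wbytes=N rios=N wios=N ..."
            for (i = 2; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == key) total += kv[2]
            }
        }
        # %.0f avoids 32-bit truncation of large byte counts in some awk builds
        END { printf "%.0f\n", total }
    ' "$1"
}
```

Assuming the cgroup v2 hierarchy is mounted at `/sys/fs/cgroup`, the file to pass in would be that mount point, followed by the `id` label of the metric shown above, followed by `/io.stat`; the key would be `wbytes` or `wios`.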

#### How to reproduce it (as minimally and precisely as possible):
Provision a k8s node with the environment below and enable cgroup v2 at boot with the following commands:
```
    sed -i -E "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\"$/GRUB_CMDLINE_LINUX_DEFAULT=\"systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all \"/g" /etc/default/grub
    update-grub
```
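
Once the node has booted with the new kernel command line, it may be worth confirming that the unified hierarchy is actually in effect before looking at the metrics. A minimal sketch follows, assuming the cgroup hierarchy is mounted at `/sys/fs/cgroup`; the helper name `cgroup_mode` is hypothetical. It relies on `stat -fc %T` reporting `cgroup2fs` for a cgroup v2 mount and `tmpfs` for a cgroup v1 layout.

```shell
#!/bin/sh
# Hypothetical helper: classify the filesystem type reported for the cgroup mount.
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo "v2" ;;       # unified (cgroup v2) hierarchy
        tmpfs)     echo "v1" ;;       # legacy (cgroup v1) layout
        *)         echo "unknown" ;;
    esac
}

# Assumes the cgroup hierarchy is mounted at /sys/fs/cgroup.
cgroup_mode "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)"
```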

#### Anything else we need to know?:
We tried the same kubelet version (v1.19.8) on another node with the cgroup v1 hierarchy and exactly the same workload, and the above metrics reported the expected values there, but they remain zero on the cgroup v2 node. kubelet 1.19.8 depends on cAdvisor v0.37.4. To debug, I increased the kubelet log verbosity (`--v=5`), but nothing unusual showed up in the logs. I also ran the standalone cAdvisor (v0.37.4) binary on the cgroup v2 node, and it had the same issue: its `/metrics` endpoint reports zero for the above metrics.
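
The check against the `/metrics` output can be sketched as a small filter that counts how many write series report zero. This is only an illustration: the helper name `zero_write_series` is hypothetical, and it simply reads Prometheus text-format lines on stdin, so it works the same for kubelet's `/metrics/cadvisor` output and for the standalone cAdvisor endpoint.

```shell
#!/bin/sh
# Hypothetical helper: count container_fs_writes_total and
# container_fs_writes_bytes_total series whose sample value is zero.
# Reads Prometheus text-format lines on stdin.
zero_write_series() {
    awk '
        /^container_fs_writes(_bytes)?_total[{ ]/ {
            line = $0
            if (index(line, "}"))
                sub(/^.*[}] */, "", line)    # drop metric name and label set
            else
                sub(/^[^ ]+ +/, "", line)    # no labels: drop metric name only
            split(line, parts, " ")          # parts[1] = value, parts[2] = optional timestamp
            if (parts[1] + 0 == 0) count++
        }
        END { printf "%d\n", count }
    '
}
```

The filter would typically be fed from a saved copy of the endpoint output, or from an HTTP client piped into it. A non-zero count on a node with active writers matches the behaviour described here.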

#### Environment:
- Kubernetes version (use `kubectl version`): **1.19.8**
- Cloud provider or hardware configuration: Doesn't matter.
- OS (e.g: `cat /etc/os-release`): _Debian GNU/Linux 10 (buster)_
- Kernel (e.g. `uname -a`): _5.10.0-0.bpo.3-cloud-amd64 #1 SMP Debian 5.10.13-1~bpo10+1 (2021-02-11) x86_64 GNU/Linux_
- Install tools:
- Network plugin and version (if this is a network-related bug):
- Others:
