Skip to content

[release/1.7] cri: Fix umarshal metrics#8472

Merged
fuweid merged 1 commit intocontainerd:release/1.7from
brandond:fix-cri-stats
May 5, 2023
Merged

[release/1.7] cri: Fix umarshal metrics#8472
fuweid merged 1 commit intocontainerd:release/1.7from
brandond:fix-cri-stats

Conversation

@brandond
Copy link
Copy Markdown
Contributor

@brandond brandond commented May 3, 2023

This is essentially the same change as #8335, but for the CRI server.

Without this fix, the kubelet fails to collects stats with:

time="2023-05-03T00:47:28.890947681Z" level=error msg="ListContainerStats failed" error="failed to convert to cri containerd stats format: failed to decode container metrics for \"6fb75f811841004943a338064d05f643b797afac8c09786551a0d5c0865c20b7\": failed to obtain cpu stats: unexpected metrics type: &Metrics{Pids:&PidsStat{Current:13,Limit:18446744073709551615,XXX_unrecognized:[],},CPU:&CPUStat{UsageUsec:1224890,UserUsec:1179817,SystemUsec:45073,NrPeriods:0,NrThrottled:0,ThrottledUsec:0,XXX_unrecognized:[],},Memory:&MemoryStat{Anon:12480512,File:8192,KernelStack:212992,Slab:260032,Sock:12288,Shmem:0,FileMapped:0,FileDirty:8192,FileWriteback:0,AnonThp:0,InactiveAnon:12476416,ActiveAnon:4096,InactiveFile:0,ActiveFile:8192,Unevictable:0,SlabReclaimable:92992,SlabUnreclaimable:167040,Pgfault:5710,Pgmajfault:0,WorkingsetRefault:0,WorkingsetActivate:0,WorkingsetNodereclaim:0,Pgrefill:0,Pgscan:0,Pgsteal:0,Pgactivate:2,Pgdeactivate:0,Pglazyfree:1,Pglazyfreed:0,ThpFaultAlloc:0,ThpCollapseAlloc:0,Usage:13189120,UsageLimit:18446744073709551615,SwapUsage:0,SwapLimit:18446744073709551615,XXX_unrecognized:[],},Rdma:&RdmaStat{Current:[]*RdmaEntry{},Limit:[]*RdmaEntry{},XXX_unrecognized:[],},Io:&IOStat{Usage:[]*IOEntry{},XXX_unrecognized:[],},Hugetlb:[]*HugeTlbStat{&HugeTlbStat{Current:0,Max:18446744073709551615,Pagesize:2MB,XXX_unrecognized:[],},&HugeTlbStat{Current:0,Max:18446744073709551615,Pagesize:1GB,XXX_unrecognized:[],},},MemoryEvents:&MemoryEvents{Low:0,High:0,Max:0,Oom:0,OomKill:0,XXX_unrecognized:[],},XXX_unrecognized:[],}"

Cherry-pick: #8473

@k8s-ci-robot
Copy link
Copy Markdown

Hi @brandond. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@brandond brandond changed the title [release-1.7] cri: Fix umarshal metrics [release/1.7] cri: Fix umarshal metrics May 3, 2023
@samuelkarp
Copy link
Copy Markdown
Member

/ok-to-test

@samuelkarp samuelkarp added the area/cri Container Runtime Interface (CRI) label May 3, 2023
@samuelkarp
Copy link
Copy Markdown
Member

@brandond Did you make this PR to main first? Or does main not need the change?

@brandond
Copy link
Copy Markdown
Contributor Author

brandond commented May 3, 2023

I can also PR to main if it needs it. I honestly haven't checked; I was working against the release branch (trying to get v1.7 working in k3s) so that's where I came first.

Edit: #8473

@brandond
Copy link
Copy Markdown
Contributor Author

brandond commented May 3, 2023

The failed windows tests don't appear to be related to this PR; are they expected to pass?

@samuelkarp
Copy link
Copy Markdown
Member

Edit: #8473

Thanks! For cherry-picks to release branches, we usually recommend running git cherry-pick -xs to record the SHA of the commit to main.

The failed windows tests don't appear to be related to this PR; are they expected to pass?

We can rerun CI to see if it's a flaky test.

@brandond
Copy link
Copy Markdown
Contributor Author

brandond commented May 3, 2023

Got it; I'll redo the cherry-pick for this PR if/when the one for main is merged.

Copy link
Copy Markdown
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM very nice thank you

@mikebrow
Copy link
Copy Markdown
Member

mikebrow commented May 3, 2023

/ok-to-test

Signed-off-by: Brad Davidson <[email protected]>
(cherry picked from commit 27f56e6)
Signed-off-by: Brad Davidson <[email protected]>
@brandond
Copy link
Copy Markdown
Contributor Author

brandond commented May 4, 2023

rebased with cherry-pick from main

Copy link
Copy Markdown
Member

@Iceber Iceber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@brandond
Copy link
Copy Markdown
Contributor Author

brandond commented May 4, 2023

Looks like one of the tests flaked, but I don't have access to retest.

@fuweid
Copy link
Copy Markdown
Member

fuweid commented May 5, 2023

reruning

Copy link
Copy Markdown
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/cri Container Runtime Interface (CRI) ok-to-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants