Skip to content

Conversation

@jsturtevant
Copy link
Contributor

Signed-off-by: James Sturtevant [email protected]

  • One-line PR description:

Adding Windows stats details to the cri pod container stats KEP

  • Other comments:

After opening kubernetes/kubernetes#110754 we discovered there was more design work need for Windows and was suggested at sig-node to do the work in KEP.

/sig windows
/sig node

@k8s-ci-robot k8s-ci-robot added sig/windows Categorizes an issue or PR as relevant to SIG Windows. sig/node Categorizes an issue or PR as relevant to SIG Node. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 22, 2022
@k8s-ci-robot k8s-ci-robot requested a review from dchen1107 July 22, 2022 03:57
@k8s-ci-robot k8s-ci-robot added the kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory label Jul 22, 2022
@jsturtevant
Copy link
Contributor Author

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 27, 2022
@jsturtevant jsturtevant force-pushed the add-windows-podstats-details branch 3 times, most recently from acae793 to 1fa4600 Compare July 27, 2022 16:17
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jul 27, 2022
@jsturtevant
Copy link
Contributor Author

/assign @haircommander @bobbypage @dchen1107


The field `LinuxPodSandboxStats` would stay the same (with duplicated fields) for backwards compatibility. We will also leave `WindowsPodSandboxStats` with a comment that it is not currently used. This option allows us to add specific fields in the future, Linux and Windows sub fields could be added to `LinuxPodSandboxStats` and `WindowsPodSandboxStats` if there are use cases for specific fields.

The downside to this approach is that we have two fields (`LinuxPodSandboxStats` and `PodSandboxStats`) which are initially identical. We could mark the individual fields in the `LinuxPodSandboxStats` as depreciated and allow only new linux only fields be added to `LinuxPodSandboxStats`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this approach makes sense. I think we should mention that eventually we will drop LinuxPodSandboxStats and WindowsPodSandboxStats in favor of the single structure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, I was thinking that we could possibly use the fields in the future for OS specific stats but probably better to handle at a later point if it ever comes up in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@haircommander
Copy link
Contributor

one more set of notes, but the approach LGTM

@jsturtevant jsturtevant force-pushed the add-windows-podstats-details branch from 1fa4600 to c896c46 Compare August 1, 2022 15:58
@marosset
Copy link
Contributor

marosset commented Sep 6, 2022

The original review comments in the first KEP around linux/windows fields asked for linux/windows specific structs to allow for differences, and that was agreed. Certain duplicate fields across all known platforms (more platforms coming) could be allowed in the base struct and we can comment fields with platform exceptions if needed, but that doesn't mean we have to make the pod stats response generic across platforms at the CRI api and deprecate linux differences that, I believe, will become more evident if and when pod resources currently being managed at the kubelet start to be managed at the lower levels.
My gut says keep with the original plan/path rather than start deprecating right away. If the windows team really wants to use the exact same fields, I don't see how that would be more complex for kubelet.

@haircommander @marosset what are your thoughts?

IIRC when this was discussed on the July 19th meeting we decided there wasn't a known use-case for having separate structs. Someone (I think @dchen1107 and/or @bobbypage ) stated that these metrics were primary used by the kubelet for eviction and metrics-server for scaling purposed.
If other non-core components needed more detailed metrics they could get them by other means.

@jsturtevant
Copy link
Contributor Author

IIRC when this was discussed on the July 19th meeting we decided there wasn't a known use-case for having separate structs. Someone (I think @dchen1107 and/or @bobbypage ) stated that these metrics were primary used by the kubelet for eviction and metrics-server for scaling purposed.
If other non-core components needed more detailed metrics they could get them by other means.

That was my understanding as well.

At sig-node meeting today, there was an update on the performance for moving the additional cadvisor metrics into the ListStats API: https://docs.google.com/document/d/1Ne57gvidMEWXR70OxxnRkYquAoMpt56o75oZtg-OeBg/edit#heading=h.i2n5jj2i4vso

In the proposal Linux specific fields are added to the stats call. I wasn't aware of this use case, but if this is the direction, then I think we would want to have these fields as separate as was initially proposed.

@jsturtevant
Copy link
Contributor Author

@bobbypage @haircommander @mikebrow I have this on the agenda to discuss tomorrow. I won't be able to make it but @marosset is going to attend. Hopefully we can get some consensus on the direction forward. Thanks!

Signed-off-by: James Sturtevant <[email protected]>
@jsturtevant jsturtevant force-pushed the add-windows-podstats-details branch from 2845607 to 73a5c8d Compare September 19, 2022 16:48
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 19, 2022
@jsturtevant
Copy link
Contributor Author

After further discussion at sig-node on Sep 13th, it was decided that we will keep Windows and Linux fields separate.

I've updated the KEP @haircommander @marosset @bobbypage @dcantah

@marosset
Copy link
Contributor

/milestone v1.26

@k8s-ci-robot k8s-ci-robot added this to the v1.26 milestone Sep 22, 2022
- balancing_enabled
- dm_operation_in_progress

These are not used currently but are be very specific to each implementation and could be collected by specialized tools on a as needed basis.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe reported as an unstructured Metric through the endpoint formally known as /metrics/cadvisor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

had the same thoughts, exposing them through that would be a good way to provide them to end users. If kubelet ends up needing them for decisions, we could introduce them to the formal Pod Stats call but can be done at a later time if need arises

@haircommander
Copy link
Contributor

haircommander commented Sep 22, 2022

/lgtm

there's one leading question I added, but I don't think we need to hammer out the details of it yet.

edit: oops, missed #3439 (comment), I'm on standby to re-lgtm :)

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 22, 2022
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 22, 2022
@haircommander
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 22, 2022
@jsturtevant
Copy link
Contributor Author

After further discussion at sig-node on Sep 13th, it was decided that we will keep Windows and Linux fields separate. The KEP has been updated to reflect this.

/assign @dchen1107 @mrunalp @mikebrow

@marosset
Copy link
Contributor

marosset commented Oct 4, 2022

/lgtm

@haircommander
Copy link
Contributor

/assign @derekwaynecarr

@bobbypage
Copy link
Member

/lgtm

@jsturtevant
Copy link
Contributor Author

/cc @mikebrow @derekwaynecarr for 1.26 enhancement freeze

@derekwaynecarr
Copy link
Member

/approve
/lgtm

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: derekwaynecarr, jsturtevant

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 6, 2022
@k8s-ci-robot k8s-ci-robot merged commit beaebac into kubernetes:master Oct 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/windows Categorizes an issue or PR as relevant to SIG Windows. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.