
[PodLevelResources] Pod Level Resources Feature Alpha#128407

Merged
k8s-ci-robot merged 15 commits into kubernetes:master from ndixita:pod-level-resources on Nov 8, 2024

@ndixita
Contributor

@ndixita ndixita commented Oct 29, 2024

What type of PR is this?

What this PR does / why we need it:

This PR implements Pod Level Resources, which requires the following changes:

  1. API changes to support Resources in PodSpec
  2. Kubectl changes to display Resources at pod-level
  3. Defaulting logic when pod-level requests are missing but limits are set.
  4. Validation logic for pod-level resources
  5. Scheduler changes to consume pod-level requests
  6. QoS class determination changes
  7. cgroup changes to use pod-level resources
  8. OOM score adjustment calculation changes to account for pod-level requests.
  9. e2e tests
  10. Limit Range changes for type Pod to check against pod-level resources, if set.
  11. Resource Quota changes to check against pod-level resources, if set.

Which issue(s) this PR fixes:

Fixes #
xref: kubernetes/enhancements#2837

Special notes for your reviewer:

API changes: 62c5355

Scheduler changes: cb4f2ea

Kubelet changes: inside pkg/kubelet

Does this PR introduce a user-facing change?

- Changed the Pod API to support `resources` at `spec` level for pod-level resources.
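
As a rough illustration of the shape this release note describes, the sketch below models a pod spec that carries `resources` at the `spec` level in addition to its containers. The Go types are simplified, hypothetical stand-ins (quantities are plain strings), not the real Kubernetes API types, and the container names are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ResourceList maps a resource name ("cpu", "memory") to a quantity string.
// Simplified stand-in for the real Kubernetes type.
type ResourceList map[string]string

// ResourceRequirements is a simplified stand-in holding requests and limits.
type ResourceRequirements struct {
	Requests ResourceList `json:"requests,omitempty"`
	Limits   ResourceList `json:"limits,omitempty"`
}

// Container keeps only the fields needed for this illustration.
type Container struct {
	Name      string                `json:"name"`
	Resources *ResourceRequirements `json:"resources,omitempty"`
}

// PodSpec shows the pod-level Resources field next to the containers.
type PodSpec struct {
	Resources  *ResourceRequirements `json:"resources,omitempty"`
	Containers []Container           `json:"containers"`
}

// examplePodSpec builds a spec with a pod-level budget and two containers
// that set no container-level resources (hypothetical values).
func examplePodSpec() PodSpec {
	return PodSpec{
		Resources: &ResourceRequirements{
			Requests: ResourceList{"cpu": "1", "memory": "1Gi"},
			Limits:   ResourceList{"cpu": "2", "memory": "2Gi"},
		},
		Containers: []Container{
			{Name: "app"},
			{Name: "sidecar"},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(examplePodSpec(), "", "  ")
	if err != nil {
		panic(err)
	}
	// Prints the spec as JSON, with "resources" as a sibling of "containers".
	fmt.Println(string(out))
}
```

In a real manifest the same structure would appear as `spec.resources` alongside `spec.containers`, and the field is only honored when the feature gate for this alpha feature is enabled.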

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- KEP: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2837-pod-level-resource-spec/README.md 

@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot added the do-not-merge/work-in-progress, size/XXL, do-not-merge/release-note-label-needed, cncf-cla: yes, do-not-merge/needs-kind, do-not-merge/needs-sig, needs-triage, needs-priority, area/code-generation, area/kubectl, kind/api-change, sig/api-machinery, sig/apps, sig/cli, sig/node, and sig/scheduling labels and removed the do-not-merge/needs-kind and do-not-merge/needs-sig labels on Oct 29, 2024
@ndixita ndixita force-pushed the pod-level-resources branch from 69592f0 to 7125ffa Compare October 30, 2024 06:36
@kannon92
Contributor

Hey @ndixita, which PR are you working on? I see #128406 and this one.

@ndixita
Contributor Author

ndixita commented Oct 30, 2024

/test all

@ndixita ndixita force-pushed the pod-level-resources branch from 7125ffa to c4fcf81 Compare October 30, 2024 17:44
@ndixita
Contributor Author

ndixita commented Oct 30, 2024

Addressing the default test failures

Comment thread pkg/apis/core/validation/validation.go Outdated
@ndixita ndixita closed this Oct 30, 2024
@ndixita ndixita deleted the pod-level-resources branch October 30, 2024 22:26
Contributor

@mrunalp mrunalp left a comment

Can we add a description to the commit: QOS changes for Pod Level resources?

@yujuhong
Contributor

yujuhong commented Nov 8, 2024

/lgtm

Still need to wait for the tests to pass

@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 0b1e70614e188256dbcc640d47a110d7a3363be5

@ndixita
Contributor Author

ndixita commented Nov 8, 2024

/test pull-kubernetes-e2e-kind

@ndixita
Contributor Author

ndixita commented Nov 8, 2024

/test pull-kubernetes-verify

@ndixita
Contributor Author

ndixita commented Nov 8, 2024

/test pull-kubernetes-node-kubelet-serial-podresources

1. Add Resources struct to PodSpec struct in both external and internal API packages
2. Add feature gate and logic for dropping disabled fields for Pod Level Resources
KEP: enhancements/keps/sig-node/2837-pod-level-resource-spec
@yujuhong
Contributor

yujuhong commented Nov 8, 2024

/lgtm

@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: c2e9c9f2c0ca6fa70f05d4791e9aa2ef752a3bf1

1. Add support for pod level resources in kubectl
2. Reuse the existing method to describe container resources and generalize it to describe both pod and container level resources
1. If pod-level limit is set, pod-level request is unset and container-level request is set: derive pod-level request from container-level requests
2. If pod-level limit is set, pod-level request is unset and container-level request is unset: set pod-level request equal to pod-level limit
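
The two defaulting rules above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Resources` type and the function name are hypothetical, quantities are plain integers, the rules are applied per resource, and "derive from container-level requests" is assumed to mean a plain sum across containers (init containers are ignored).

```go
package main

import "fmt"

// Resources holds quantities in integer base units (for example millicores
// for CPU, bytes for memory). A missing key means "unset".
type Resources map[string]int64

// defaultPodRequests returns the pod-level requests after defaulting.
// Assumption: "derived from container requests" is modeled as a simple sum.
func defaultPodRequests(podRequests, podLimits Resources, containerRequests []Resources) Resources {
	out := Resources{}
	for name, v := range podRequests {
		out[name] = v
	}
	for name, limit := range podLimits {
		if _, set := out[name]; set {
			continue // an explicit pod-level request is left untouched
		}
		var sum int64
		anySet := false
		for _, c := range containerRequests {
			if v, ok := c[name]; ok {
				sum += v
				anySet = true
			}
		}
		if anySet {
			out[name] = sum // rule 1: derive from container-level requests
		} else {
			out[name] = limit // rule 2: fall back to the pod-level limit
		}
	}
	return out
}

func main() {
	podLimits := Resources{"cpu": 2000, "memory": 1 << 30}
	containers := []Resources{{"cpu": 500}, {"cpu": 250}}
	got := defaultPodRequests(nil, podLimits, containers)
	fmt.Println("cpu request:", got["cpu"])
	fmt.Println("memory request:", got["memory"])
}
```

In this example the CPU request comes from the containers (rule 1), while the memory request falls back to the pod-level memory limit because no container sets one (rule 2).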
1. The effective container requests cannot be greater than pod-level requests
2. Individual container limits cannot be greater than pod-level limits
3. Only CPU & Memory are supported at pod-level
4. In-place container resource updates are not supported if pod-level resources are set
Note: the constraint that effective container requests cannot be greater than pod-level limits holds by transitivity: effective container requests <= pod-level requests and pod-level requests <= pod-level limits, therefore effective container requests <= pod-level limits.

Signed-off-by: ndixita <[email protected]>
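
The validation rules in the commit message above could be checked along these lines. This is a hedged sketch with hypothetical types and names, not the logic in `pkg/apis/core/validation`: "effective container requests" is assumed to be a plain sum over containers, quantities are integers, and rule 4 (in-place updates) is not modeled.

```go
package main

import (
	"fmt"
	"sort"
)

// Resources maps a resource name to an integer quantity; missing means unset.
type Resources map[string]int64

// Container is a simplified stand-in for a container's resource settings.
type Container struct {
	Name     string
	Requests Resources
	Limits   Resources
}

// Only these resources are accepted at pod level (rule 3).
var supportedResources = []string{"cpu", "memory"}

func isSupported(name string) bool {
	for _, s := range supportedResources {
		if s == name {
			return true
		}
	}
	return false
}

// validatePodLevelResources returns one message per violated rule.
func validatePodLevelResources(podRequests, podLimits Resources, containers []Container) []string {
	var errs []string

	// Rule 3: reject anything other than CPU and memory at pod level.
	unsupported := map[string]bool{}
	for _, rl := range []Resources{podRequests, podLimits} {
		for name := range rl {
			if !isSupported(name) {
				unsupported[name] = true
			}
		}
	}
	names := make([]string, 0, len(unsupported))
	for name := range unsupported {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic error order
	for _, name := range names {
		errs = append(errs, fmt.Sprintf("%s: not supported at pod level", name))
	}

	for _, name := range supportedResources {
		podReq, hasReq := podRequests[name]
		podLim, hasLim := podLimits[name]

		// Premise of the transitivity note: pod request <= pod limit.
		if hasReq && hasLim && podReq > podLim {
			errs = append(errs, fmt.Sprintf("%s: pod request exceeds pod limit", name))
		}
		// Rule 1: summed container requests must fit in the pod request.
		if hasReq {
			var sum int64
			for _, c := range containers {
				sum += c.Requests[name]
			}
			if sum > podReq {
				errs = append(errs, fmt.Sprintf("%s: container requests exceed pod request", name))
			}
		}
		// Rule 2: no single container limit may exceed the pod limit.
		if hasLim {
			for _, c := range containers {
				if lim, ok := c.Limits[name]; ok && lim > podLim {
					errs = append(errs, fmt.Sprintf("%s: limit of container %q exceeds pod limit", name, c.Name))
				}
			}
		}
	}
	return errs
}

func main() {
	containers := []Container{{Name: "app", Requests: Resources{"cpu": 1500}, Limits: Resources{"cpu": 3000}}}
	for _, e := range validatePodLevelResources(Resources{"cpu": 1000}, Resources{"cpu": 2000}, containers) {
		fmt.Println(e)
	}
}
```

Because the pod-level request is also checked against the pod-level limit, no separate check of container requests against pod-level limits is needed, which matches the transitivity note.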
1. Use pod-level resource when feature is enabled and resources are set at pod-level
2. Edge case handling: When a pod defines only CPU or memory limits at pod-level (but not both), and container-level requests/limits are unset, the pod-level requests stay empty for the resource without a pod-limit. The container's request for that resource is then set to the default request value from schedutil.
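
To make the scheduler edge case above concrete, here is a simplified sketch under stated assumptions. The function and type names are hypothetical, and the two default constants are stand-ins for the scheduler utility defaults (the values mirror what I believe those defaults to be, but treat them as assumptions). The real scheduler logic is more involved than this.

```go
package main

import "fmt"

// Resources maps a resource name to an integer quantity; missing means unset.
type Resources map[string]int64

// Stand-in defaults for containers that declare no request (assumed values).
const (
	defaultMilliCPURequest int64 = 100               // 0.1 core
	defaultMemoryRequest   int64 = 200 * 1024 * 1024 // 200 MiB
)

// schedulingRequests returns the requests a scheduler would account for.
// Pod-level values win when the feature is enabled and the resource is set;
// otherwise each container contributes its own request or the default.
func schedulingRequests(featureEnabled bool, podRequests Resources, containerRequests []Resources) Resources {
	defaults := Resources{"cpu": defaultMilliCPURequest, "memory": defaultMemoryRequest}
	out := Resources{}
	for _, name := range []string{"cpu", "memory"} {
		if featureEnabled {
			if v, ok := podRequests[name]; ok {
				out[name] = v
				continue
			}
		}
		var sum int64
		for _, c := range containerRequests {
			if v, ok := c[name]; ok && v > 0 {
				sum += v
			} else {
				sum += defaults[name] // container request is unset
			}
		}
		out[name] = sum
	}
	return out
}

func main() {
	// Only a pod-level CPU limit was set, so after defaulting only the
	// pod-level CPU request exists; memory is empty at pod level.
	podRequests := Resources{"cpu": 2000}
	containers := []Resources{{}, {}}
	got := schedulingRequests(true, podRequests, containers)
	fmt.Println("cpu:", got["cpu"])
	fmt.Println("memory:", got["memory"])
}
```

Here CPU is taken from the pod level, while memory falls back to the per-container default because neither the pod nor its containers declare a memory request.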
1. Pod cgroup configured to use resources from pod spec if feature is enabled and resources are set at pod-level
2. Container cgroup limits defaulted to pod-level limits if container limits are not set
Signed-off-by: ndixita <[email protected]>
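
The cgroup behavior described in the commit message above can be sketched as two small helpers. These are illustrative only: the function names are hypothetical, limits are modeled as optional integers (`nil` means unset), and the fallback for the pod cgroup when no pod-level limit applies (sum of container limits only if every container sets one) is an assumption about the pre-existing behavior, not something stated in this PR.

```go
package main

import "fmt"

// int64Ptr is a small helper for building optional values.
func int64Ptr(v int64) *int64 { return &v }

// effectiveContainerLimit picks the limit for a container's cgroup:
// the container's own limit if set, otherwise the pod-level limit.
func effectiveContainerLimit(containerLimit, podLimit *int64) *int64 {
	if containerLimit != nil {
		return containerLimit
	}
	return podLimit
}

// podCgroupLimit picks the limit for the pod's cgroup. The pod-level value
// is used when the feature is enabled and the limit is set. Otherwise
// (assumption) the container limits are summed, and the pod cgroup is left
// unlimited (nil) if any container has no limit.
func podCgroupLimit(featureEnabled bool, podLimit *int64, containerLimits []*int64) *int64 {
	if featureEnabled && podLimit != nil {
		return podLimit
	}
	var sum int64
	for _, l := range containerLimits {
		if l == nil {
			return nil
		}
		sum += *l
	}
	return &sum
}

func main() {
	podLimit := int64Ptr(512)
	fmt.Println("container without limit:", *effectiveContainerLimit(nil, podLimit))
	fmt.Println("container with limit:", *effectiveContainerLimit(int64Ptr(128), podLimit))
	fmt.Println("pod cgroup:", *podCgroupLimit(true, podLimit, []*int64{nil, int64Ptr(128)}))
}
```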
@pacoxu
Member

pacoxu commented Nov 8, 2024

/lgtm
after a typo fix

@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: fc17af58370a99e0823cef19139b071bb4be00e7

@ndixita
Contributor Author

ndixita commented Nov 8, 2024

/test pull-kubernetes-node-kubelet-serial-podresources

@k8s-ci-robot
Contributor

@ndixita: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-e2e-kind-alpha-beta-features | e11ac5d | link | false | `/test pull-kubernetes-e2e-kind-alpha-beta-features` |
| pull-kubernetes-node-e2e-alpha-ec2 | e11ac5d | link | false | `/test pull-kubernetes-node-e2e-alpha-ec2` |
| pull-kubernetes-e2e-gce-cos-alpha-features | b30e6c8 | link | false | `/test pull-kubernetes-e2e-gce-cos-alpha-features` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ndixita
Contributor Author

ndixita commented Nov 8, 2024

/test pull-kubernetes-e2e-gce-cos-alpha-features

@ndixita
Contributor Author

ndixita commented Nov 8, 2024

The tests timing out in pull-kubernetes-e2e-gce-cos-alpha-features are not related to this PR. They are related to the InPlaceVerticalScaling feature.

Labels

api-review, approved, area/code-generation, area/e2e-test-framework, area/kubeadm, area/kubectl, area/kubelet, area/release-eng, area/test, cncf-cla: yes, kind/api-change, lgtm, needs-priority, release-note, sig/api-machinery, sig/apps, sig/cli, sig/cluster-lifecycle, sig/node, sig/release, sig/scheduling, sig/testing, size/XXL, triage/accepted

Projects

Status: API review completed, 1.32
Archived in project
