KEP-4671: Implement Gang scheduling in kube-scheduler#134722

Merged
k8s-ci-robot merged 4 commits into kubernetes:master from macsko:gang_scheduling_scheduler
Nov 6, 2025

@macsko
Member

@macsko macsko commented Oct 20, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR adds a gang scheduling plugin and basic workload awareness (using a dedicated manager) to kube-scheduler, based on KEP-4671. This is the second PR for this KEP; the first one, with the API, is #134564.

Which issue(s) this PR is related to:

Part of #134471
Fixes #134475
Fixes #134476
Fixes #134477

KEP: kubernetes/enhancements#4671

Special notes for your reviewer:

This PR is dedicated to kube-scheduler changes. For API review, please post comments in #134564. The first six commits are part of #134564, so please filter the changes in this PR accordingly.

This PR may be merged together with #134564, depending on how quickly the reviews progress.

Does this PR introduce a user-facing change?

Introduced the GangScheduling kube-scheduler plugin to enable "all-or-nothing" scheduling. The Workload API in scheduling.k8s.io/v1alpha1 is used to express the desired policy.
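
For readers unfamiliar with the term, "all-or-nothing" can be illustrated with a minimal sketch. This is not the plugin's code: the `gang` type, its `minCount` and `waiting` fields, and the `admit` method are hypothetical names chosen for illustration, and the real plugin works through the scheduler framework rather than a plain slice.

```go
package main

import "fmt"

// gang is a hypothetical, simplified stand-in for a group of pods
// that should only be placed together.
type gang struct {
	minCount int      // assumed: number of pods that must be placeable at once
	waiting  []string // names of pods that currently have a feasible placement
}

// admit returns the pods to bind, or nil while the gang is incomplete.
func (g gang) admit() []string {
	if len(g.waiting) < g.minCount {
		return nil // hold every pod back: no partial placement
	}
	return g.waiting
}

func main() {
	partial := gang{minCount: 3, waiting: []string{"worker-0", "worker-1"}}
	full := gang{minCount: 3, waiting: []string{"worker-0", "worker-1", "worker-2"}}
	fmt.Println(len(partial.admit()), len(full.admit())) // prints "0 3"
}
```

The point of the sketch is only the threshold check: either enough members of the group can be placed and all of them proceed, or none do.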

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-scheduling/4671-gang-scheduling

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. kind/feature Categorizes issue or PR as related to a new feature. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 20, 2025
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Oct 20, 2025
@macsko
Member Author

macsko commented Oct 20, 2025

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 20, 2025
@k8s-ci-robot k8s-ci-robot added area/apiserver area/code-generation area/kubectl area/test kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/cli Categorizes an issue or PR as relevant to SIG CLI. labels Oct 20, 2025
@k8s-ci-robot k8s-ci-robot added sig/etcd Categorizes an issue or PR as relevant to SIG Etcd. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 20, 2025
@github-project-automation github-project-automation Bot moved this to Needs Triage in SIG Apps Oct 20, 2025
@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Oct 20, 2025
@github-project-automation github-project-automation Bot moved this to Needs Triage in SIG CLI Oct 20, 2025
@k8s-ci-robot k8s-ci-robot added the sig/testing Categorizes an issue or PR as relevant to SIG Testing. label Oct 20, 2025
@k8s-ci-robot k8s-ci-robot requested a review from dom4ha November 3, 2025 13:48

// deletePod completely deletes the pod from this group.
// It returns true when the group is empty after removal.
func (pgs *podGroupInfo) deletePod(podUID types.UID) bool {

Contributor

The intuitive interpretation of the return value is whether the pod got deleted or not. If anything, I'd suggest returning two values: one for the actual deletion (always true in this case, but that could change in the future?) and the other indicating whether the group is empty. Or, how much more expensive would it be to create a new function for an "isEmpty" check?

Member Author

Right, added a new method.
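
The split discussed in this thread can be sketched as follows. The thread does not show the method that was actually added, so the `empty` name, the `pods` field, the local `UID` type (standing in for `types.UID`), and the choice to have `deletePod` report whether the pod was present are all assumptions made for illustration.

```go
package main

import "fmt"

// UID stands in for k8s.io/apimachinery/pkg/types.UID so the sketch
// compiles without external dependencies.
type UID string

// podGroupInfo is a simplified, hypothetical version of the struct
// under review; its real fields are not shown in this thread.
type podGroupInfo struct {
	pods map[UID]struct{}
}

// deletePod removes the pod from this group and reports whether it
// was present, which matches the intuitive reading of the return value.
func (pgs *podGroupInfo) deletePod(podUID UID) bool {
	_, ok := pgs.pods[podUID]
	delete(pgs.pods, podUID)
	return ok
}

// empty reports whether the group has no pods left (assumed name).
func (pgs *podGroupInfo) empty() bool {
	return len(pgs.pods) == 0
}

func main() {
	pgs := &podGroupInfo{pods: map[UID]struct{}{"a": {}, "b": {}}}
	fmt.Println(pgs.deletePod("a"), pgs.empty()) // prints "true false"
	fmt.Println(pgs.deletePod("b"), pgs.empty()) // prints "true true"
}
```

Since the emptiness check is a single `len` call on the map, a separate method costs essentially nothing and keeps each return value unambiguous.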

logger.V(3).Info("Add event for unscheduled pod", "pod", klog.KObj(pod))
sched.SchedulingQueue.Add(logger, pod)
if utilfeature.DefaultFeatureGate.Enabled(features.GangScheduling) {
sched.SchedulingQueue.MoveAllToActiveOrBackoffQueue(logger, framework.EventUnscheduledPodAdd, nil, pod, nil)

Contributor

+1 for a mechanism that will avoid putting irrelevant pods into the queue
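
One possible shape for such a mechanism is a per-pod predicate that is consulted before a parked pod is moved. The sketch below is an assumption-laden illustration, not the scheduler's queue: the `pod` type, its `workload` field, and the `sameWorkload` and `requeue` helpers are hypothetical names.

```go
package main

import "fmt"

// pod is a hypothetical, minimal pod record; workload names the
// Workload it references ("" when it references none).
type pod struct {
	name     string
	workload string
}

// preCheck is a per-pod predicate consulted before requeueing.
type preCheck func(p pod) bool

// sameWorkload accepts only pods that reference the same workload
// as the newly added pod.
func sameWorkload(added pod) preCheck {
	return func(p pod) bool {
		return added.workload != "" && p.workload == added.workload
	}
}

// requeue returns the parked pods that pass the check; a nil check
// moves every pod, which is the behavior the comment wants to avoid.
func requeue(parked []pod, check preCheck) []pod {
	var moved []pod
	for _, p := range parked {
		if check == nil || check(p) {
			moved = append(moved, p)
		}
	}
	return moved
}

func main() {
	parked := []pod{{"a-0", "train-a"}, {"b-0", "train-b"}, {"solo", ""}}
	moved := requeue(parked, sameWorkload(pod{"a-1", "train-a"}))
	fmt.Println(len(moved), moved[0].name) // prints "1 a-0"
}
```

With a filter like this, adding one gang member would wake only its siblings instead of every unschedulable pod.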

@macsko macsko force-pushed the gang_scheduling_scheduler branch 2 times, most recently from 3802f5b to 9f9ff21 Compare November 4, 2025 08:46
@macsko
Member Author

macsko commented Nov 4, 2025

/test pull-kubernetes-integration
Unrelated flake

Member

@sanposhiho sanposhiho left a comment

only one nit, others lgtm

Comment thread pkg/scheduler/framework/types.go Outdated
{Event: fwk.ClusterEvent{Resource: fwk.StorageClass, ActionType: fwk.All}},
{Event: fwk.ClusterEvent{Resource: fwk.ResourceClaim, ActionType: fwk.All}},
{Event: fwk.ClusterEvent{Resource: fwk.DeviceClass, ActionType: fwk.All}},
{Event: fwk.ClusterEvent{Resource: fwk.Workload, ActionType: fwk.All}},

Member

register it only if the gate is enabled?

Member Author

Good point, added
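
A minimal sketch of the gated registration suggested here, using local stand-in types rather than the real `fwk` package. The `clusterEvent` type, the `allEvents` function, and the boolean parameter are hypothetical. In the scheduler the flag would presumably come from the `GangScheduling` feature gate, and the final code is not shown in this thread.

```go
package main

import "fmt"

// clusterEvent is a local stand-in for the framework's event type.
type clusterEvent struct {
	Resource   string
	ActionType string
}

// allEvents builds the catch-all event list and appends the Workload
// event only when gang scheduling is enabled.
func allEvents(gangSchedulingEnabled bool) []clusterEvent {
	events := []clusterEvent{
		{Resource: "StorageClass", ActionType: "All"},
		{Resource: "ResourceClaim", ActionType: "All"},
		{Resource: "DeviceClass", ActionType: "All"},
	}
	if gangSchedulingEnabled {
		events = append(events, clusterEvent{Resource: "Workload", ActionType: "All"})
	}
	return events
}

func main() {
	fmt.Println(len(allEvents(false)), len(allEvents(true))) // prints "3 4"
}
```

Keeping the event out of the list when the gate is off means clusters that do not use the feature never react to Workload changes.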

@macsko macsko force-pushed the gang_scheduling_scheduler branch from 9f9ff21 to 525ca9c Compare November 4, 2025 18:46
Member

@sanposhiho sanposhiho left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 4, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 18e3fbeb2d8866222f5234175b70e75d16899f50

@k8s-ci-robot
Contributor

@macsko: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-apidiff-client-go | 50fa39b | link | false | /test pull-kubernetes-apidiff-client-go |
| pull-kubernetes-e2e-kind-alpha-beta-features | 50fa39b | link | false | /test pull-kubernetes-e2e-kind-alpha-beta-features |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

@wojtek-t
Member

wojtek-t commented Nov 6, 2025

/lgtm
/approve

@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: a6e5193895a298c8cce734359287f61dc06ae321

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: macsko, sanposhiho, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wojtek-t
Member

wojtek-t commented Nov 6, 2025

/hold cancel

@k8s-triage-robot

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

@dom4ha
Member

dom4ha commented Nov 6, 2025

Congrats Maciek, great job on Gang and Workload API!

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/code-generation area/kubectl area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/etcd Categorizes an issue or PR as relevant to SIG Etcd. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

- KEP-4671: Implement integration tests
- KEP-4671: Create GangScheduling kube-scheduler plugin
- KEP-4671: Implement Workload tracker in kube-scheduler