Extended Toleration Operators for Threshold-Based Placement #5471
Open
Labels
- lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
- sig/apps: Categorizes an issue or PR as relevant to SIG Apps.
- sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.
- stage/alpha: Denotes an issue tracking an enhancement targeted for Alpha status.
Enhancement Description
Many production Kubernetes clusters blend on-demand (higher-SLA) and spot/preemptible (lower-SLA) nodes to optimize costs while maintaining reliability for critical workloads. Platform teams need a safe default that keeps most workloads away from risky capacity, while allowing specific workloads to opt in with explicit thresholds such as "SLA ≥ 95%".
Currently, NodeAffinity supports numeric comparisons (Gt, Lt, etc.) but lacks the operational benefits that taints/tolerations provide:
- NodeAffinity is per-pod; keeping most pods away from low-SLA nodes requires editing every workload. Taints invert control: nodes declare risk; only pods with matching tolerations may land.
- Taints support NoExecute with tolerationSeconds, enabling operators to drain/evict pods when a node's SLA degrades or spot instances are reclaimed.
This enhancement extends core/v1 Toleration to support numeric comparison operators (Lt, Gt) when matching Node Taints. This preserves the well-understood safety model of taints/tolerations while enabling threshold-based placement for SLA-aware scheduling.
Benefits for DRA and AI Workloads
- NoExecute/tolerationSeconds for graceful drain and controlled failover
The scheduler impact is limited to the existing TaintToleration Filter; no new scheduling stages or algorithms are required.
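To make the proposed matching concrete, here is a minimal Go sketch of how a toleration with Lt/Gt might be checked against a node taint. It is illustrative only and is not the KEP's implementation: the Taint and Toleration structs are simplified stand-ins for the core/v1 types, the taint key example.com/sla is hypothetical, and the comparison direction (taint value compared against toleration value, with both parsed as base-10 integers) is an assumption modeled on how NodeAffinity's Gt/Lt compare a node label against the selector value.

```go
package main

import (
	"fmt"
	"strconv"
)

// Exists and Equal are today's toleration operators; Lt and Gt are the
// operators this enhancement proposes.
const (
	OpExists = "Exists"
	OpEqual  = "Equal"
	OpLt     = "Lt"
	OpGt     = "Gt"
)

// Taint and Toleration are simplified stand-ins for the core/v1 types.
type Taint struct {
	Key, Value, Effect string
}

type Toleration struct {
	Key, Operator, Value, Effect string
}

// Tolerates reports whether tol tolerates taint.
// Assumption: for Lt/Gt both values must parse as base-10 integers and the
// check reads "taint value <op> toleration value". A value that does not
// parse never matches.
func Tolerates(tol Toleration, taint Taint) bool {
	// An empty effect on the toleration matches every taint effect.
	if tol.Effect != "" && tol.Effect != taint.Effect {
		return false
	}
	// Simplification: an empty key matches every taint key.
	if tol.Key != "" && tol.Key != taint.Key {
		return false
	}
	switch tol.Operator {
	case OpExists:
		return true
	case "", OpEqual: // an empty operator is treated as Equal
		return tol.Value == taint.Value
	case OpLt, OpGt:
		taintVal, err := strconv.ParseInt(taint.Value, 10, 64)
		if err != nil {
			return false
		}
		tolVal, err := strconv.ParseInt(tol.Value, 10, 64)
		if err != nil {
			return false
		}
		if tol.Operator == OpLt {
			return taintVal < tolVal
		}
		return taintVal > tolVal
	}
	return false
}

func main() {
	// With only strict comparisons, "SLA >= 95" on integers is written as Gt 94.
	tol := Toleration{Key: "example.com/sla", Operator: OpGt, Value: "94", Effect: "NoSchedule"}
	for _, v := range []string{"90", "95", "99", "high"} {
		taint := Taint{Key: "example.com/sla", Value: v, Effect: "NoSchedule"}
		fmt.Printf("taint sla=%s tolerated: %v\n", v, Tolerates(tol, taint))
	}
}
```

Under these assumptions the pod tolerates nodes tainted with sla=95 and sla=99, but not sla=90 or the non-numeric value. Pods that carry no toleration for the key stay off all such nodes, which is the safe default described above.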
/sig scheduling
/sig apps
/stage alpha
/cc @ahg-g @alculquicondor @johnbelamaric @sanposhiho @kubernetes/sig-scheduling-misc
- (k/enhancements) update PR(s): KEP-5471 Extended Toleration Operators for Threshold-Based Placement #5473
- (k/k) update PR(s):
- (k/website) update PR(s):