Antiaffinity pods scheduled to the same node during scheduler leader-election #65257
/sig api-machinery
I would like to add the anti-affinity predicate to kubelet Admit to fix this issue.
This can happen when the original leader takes too long (e.g., longer than the lease duration) to renew its leadership.
#65094 can reduce this probability to some degree, but it cannot eliminate having more than one scheduler working concurrently during the transition window.
To solve this completely, we have to prevent any overlap between two leaders.
/sig-scheduling
@hzxuzhonghu Thanks for the information.
/remove-sig api-machinery
@wenjiaswe: Those labels are not set on the issue: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Automatic merge from submit-queue (batch tested with PRs 65094, 65533, 63522, 65694, 65702). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

set leader election client and renew timeout

**What this PR does / why we need it**: set the leader-election client timeout; set a timeout for tryAcquireOrRenew

**Which issue(s) this PR fixes**: Fixes #65090 #65257

**Release note**:
```release-note
NONE
```
We cannot do that: anti-affinity also supports topology keys such as "zone", but the kubelet should not know the status of other nodes. That's why we did not include pod affinity/anti-affinity in kubelet admit.
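To illustrate the point about topology keys: a zone-scoped anti-affinity rule like the hypothetical manifest below can only be evaluated with the pod state of every node in the zone, which the admitting kubelet does not have (names, labels, and image are placeholders; `topology.kubernetes.io/zone` is the current well-known zone key).

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-a          # hypothetical name
  labels:
    app: web
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web
        # Zone-scoped: evaluating this needs the pods of all nodes in
        # the zone, not just the admitting kubelet's own node.
        topologyKey: topology.kubernetes.io/zone
  containers:
  - name: web
    image: nginx       # placeholder image
```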
@k82cn That makes sense to me.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@DylanBLE: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
Two pods with an anti-affinity rule are scheduled to the same node.
What you expected to happen:
Two pods with an anti-affinity rule are scheduled to different nodes.
How to reproduce it (as minimally and precisely as possible):
na
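The report leaves reproduction steps empty. A manifest of the kind described would be two pods that each reject co-location with the other at node scope, e.g. the hypothetical spec below (names, labels, and image are placeholders), created twice with different names during a scheduler leader switch.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pa             # create a second copy named pb
  labels:
    group: exclusive
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            group: exclusive
        # Node-scoped: no two pods with this label may share a node.
        topologyKey: kubernetes.io/hostname
  containers:
  - name: app
    image: nginx       # placeholder image
```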
Anything else we need to know?:
The bug is caused by the scheduler during a leader switch.
Here is what happened:
Suppose there are two schedulers, SA (active) and SB (standby), two pods with anti-affinity, PA and PB, and two nodes, NA and NB.
Here is the log of SA:

SA: tw-node2221, SB: tw-node2222.
SA lost the election at 16:03:35 but continued to schedule pods until it found conflicts in its cache, and then quit.
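The sequence above can be sketched as a toy model (hypothetical names, not scheduler code): because each scheduler checks anti-affinity only against its own cache, the stale leader SA and the new leader SB can both see NA as conflict-free and each bind one of the anti-affinity pods there.

```go
package main

import "fmt"

// schedulerCache maps a node name to the pods that scheduler
// believes are running there. Each scheduler has its own copy.
type schedulerCache map[string][]string

// violatesAntiAffinity reports whether placing a pod on node would
// conflict with peer, according to this scheduler's cache only.
func violatesAntiAffinity(cache schedulerCache, node, peer string) bool {
	for _, p := range cache[node] {
		if p == peer {
			return true
		}
	}
	return false
}

func main() {
	sa := schedulerCache{} // stale leader's view of the cluster
	sb := schedulerCache{} // new leader's view of the cluster

	// SA, not yet aware it lost the lease, sees no conflict on NA.
	if !violatesAntiAffinity(sa, "NA", "PB") {
		sa["NA"] = append(sa["NA"], "PA")
		fmt.Println("SA binds PA to NA")
	}
	// SB, consulting its own cache, also sees no conflict on NA,
	// so both anti-affinity pods end up on the same node.
	if !violatesAntiAffinity(sb, "NA", "PA") {
		sb["NA"] = append(sb["NA"], "PB")
		fmt.Println("SB binds PB to NA")
	}
}
```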
Environment:
- Kubernetes version (`kubectl version`): v1.5.6
- Kernel (`uname -a`): 4.4.64-1.el7.elrepo.x86_64