-
Notifications
You must be signed in to change notification settings - Fork 40.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix toleration comparison & merging logic #81732
Conversation
Forgot to update PodTolerationRestriction. Fixed. |
2c5f10f
to
0888c7f
Compare
0888c7f
to
f82f55c
Compare
Rebased. |
/remove-sig node Unless I'm missing something, not sure this touches sig-node code. Please feel free to add the sig back if I'm incorrect :) |
@tallclair I think the changes as noted make sense. the admission plugin is still alpha so I think the behavior change is fine as well and represent what I agree was probably oversights in the original design. /approve |
7e7b78d
to
2e08288
Compare
Rebased. |
Hi all, code freeze is coming up in just a few days (Thursday, end of day, PST), so we need to make sure that this PR will be merged for v1.16 or moved to v1.17. Looks like it is coming along, but can you comment on it's status with respect to the milestone? |
/assign @liggitt Derek already lgtm'd it, Jordan said it's on his queue (please let me know if you're not going to get to it in time). I think this is on-track for 1.16 |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: derekwaynecarr, liggitt, tallclair The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/retest
Regards,
Draven
…On Aug 27, 2019, 6:18 PM +0800, Kubernetes Prow Robot ***@***.***>, wrote:
@tallclair: The following test failed, say /retest to rerun them all:
Test name
Commit
Details
Rerun command
pull-kubernetes-e2e-gce-100-performance
2e08288
link
/test pull-kubernetes-e2e-gce-100-performance
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
What type of PR is this?
/kind bug
What this PR does / why we need it:
The list of tolerations on a pod is effectively represents a union of the taints tolerated by each, so it is entirely reasonable to have 2 tolerations like:
However, the old logic would consider those 2 tolerations a conflict, and overwrite one with the other in a merge.
This PR fixes the toleration helper methods to represent a logical merge and comparison. It does that by recognizing that some tolerations are supersets of others, due to wildcard matches. For example, the above tolerations would be redundant if there was a 3rd toleration:
{"key": "foo", "operator": "Exists"}
.These fixes are needed to better support RuntimeClass scheduling, but also affect the alpha PodTolerationRestriction admission controller.
Special notes for your reviewer:
There are 2 design decisions I want to highlight, and am open to discussion on:
isSuperset
logic for checking the whitelist though, and I feel that simplifying with the smart merge offers a cleaner user experience.TolerationSeconds
- The eviction logic actually takes the minimum toleration time. So technically a lower toleration time would be the superset of a higher time. However, I don't think that was the intention (that logic probably wasn't considering duplicate tolerations for the same taint, but rather across all taints), so I've stuck with using the maximum toleration time for the superset. I think that is more inline with the intention of both the PodTolerationRestriction and RuntimeClass admission controllers (probably irrelevant for the latter).Does this PR introduce a user-facing change?:
/sig scheduling
/sig node
/priority important-soon
/milestone v1.16
I wasn't sure who to assign this to, so went with the original author and reviewers:
/assign @aveshagarwal
/cc @timothysc @smarterclayton @derekwaynecarr @davidopp @liggitt