Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix toleration comparison & merging logic #81732

Merged
merged 2 commits into from
Aug 27, 2019

Conversation

tallclair
Copy link
Member

What type of PR is this?
/kind bug

What this PR does / why we need it:

The list of tolerations on a pod is effectively represents a union of the taints tolerated by each, so it is entirely reasonable to have 2 tolerations like:

- key: foo
  operator: Equal
  value: bar
- key: foo
  operator: Equal
  value: baz

However, the old logic would consider those 2 tolerations a conflict, and overwrite one with the other in a merge.

This PR fixes the toleration helper methods to represent a logical merge and comparison. It does that by recognizing that some tolerations are supersets of others, due to wildcard matches. For example, the above tolerations would be redundant if there was a 3rd toleration: {"key": "foo", "operator": "Exists"}.

These fixes are needed to better support RuntimeClass scheduling, but also affect the alpha PodTolerationRestriction admission controller.

Special notes for your reviewer:

There are 2 design decisions I want to highlight, and am open to discussion on:

  1. The decision to "smart merge" - Since tolerations represent a union, we could perform the merge by simply appending one list to the other (and maybe eliminating exact duplicates). We still need the isSuperset logic for checking the whitelist though, and I feel that simplifying with the smart merge offers a cleaner user experience.
  2. TolerationSeconds - The eviction logic actually takes the minimum toleration time. So technically a lower toleration time would be the superset of a higher time. However, I don't think that was the intention (that logic probably wasn't considering duplicate tolerations for the same taint, but rather across all taints), so I've stuck with using the maximum toleration time for the superset. I think that is more inline with the intention of both the PodTolerationRestriction and RuntimeClass admission controllers (probably irrelevant for the latter).

Does this PR introduce a user-facing change?:

Fix an issue with toleration merging & whitelist checking in the PodTolerationRestriction admission controller.

/sig scheduling
/sig node
/priority important-soon
/milestone v1.16

I wasn't sure who to assign this to, so went with the original author and reviewers:
/assign @aveshagarwal
/cc @timothysc @smarterclayton @derekwaynecarr @davidopp @liggitt

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. labels Aug 21, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.16 milestone Aug 21, 2019
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Aug 21, 2019
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 21, 2019
@k8s-ci-robot k8s-ci-robot requested a review from liggitt August 21, 2019 17:10
@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Aug 21, 2019
@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Aug 21, 2019
@tallclair
Copy link
Member Author

Forgot to update PodTolerationRestriction. Fixed.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 21, 2019
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 21, 2019
@tallclair
Copy link
Member Author

Rebased.

@mattjmcnaughton
Copy link
Contributor

/remove-sig node

Unless I'm missing something, not sure this touches sig-node code. Please feel free to add the sig back if I'm incorrect :)

@k8s-ci-robot k8s-ci-robot removed the sig/node Categorizes an issue or PR as relevant to SIG Node. label Aug 22, 2019
@derekwaynecarr
Copy link
Member

@tallclair I think the changes as noted make sense. the admission plugin is still alpha so I think the behavior change is fine as well and represent what I agree was probably oversights in the original design.

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Aug 26, 2019
@tallclair
Copy link
Member Author

Rebased.

@josiahbjorgaard
Copy link
Contributor

Hi all, code freeze is coming up in just a few days (Thursday, end of day, PST), so we need to make sure that this PR will be merged for v1.16 or moved to v1.17. Looks like it is coming along, but can you comment on it's status with respect to the milestone?

@tallclair
Copy link
Member Author

/assign @liggitt
For approval.

Derek already lgtm'd it, Jordan said it's on his queue (please let me know if you're not going to get to it in time). I think this is on-track for 1.16

@liggitt
Copy link
Member

liggitt commented Aug 27, 2019

/approve
/retest

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: derekwaynecarr, liggitt, tallclair

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 27, 2019
Copy link
Contributor

@draveness draveness left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 27, 2019
@draveness
Copy link
Contributor

draveness commented Aug 27, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants