
The cilium-operator pod is scheduled to another node#28856

Closed
jignyasamishra wants to merge 1 commit into cilium:main from jignyasamishra:jig-issue#28549

Conversation

@jignyasamishra
Contributor

@jignyasamishra jignyasamishra commented Oct 29, 2023

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Are you a user of Cilium? Please add yourself to the Users doc
  • Thanks for contributing!

I reproduced this issue on my local machine. For cilium-operator, the tolerations field was set to - operator: Exists, which is why the pods kept being rescheduled onto the tainted node. I think changing the tolerations field for cilium-operator will fix the issue.

Fixes: #28549

Changed tolerations on the operator to allow nodes to drain

@jignyasamishra jignyasamishra requested review from a team as code owners October 29, 2023 14:06
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 29, 2023
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Oct 29, 2023
@dylandreimerink dylandreimerink added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Oct 30, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 30, 2023
Contributor

@squeed squeed left a comment


Two changes needed:

  • Can you please edit values.yaml.tmpl instead? Then, do a make -C install/kubernetes to propagate your changes
  • The operator does need some tolerations, so that it can be scheduled in early-stage clusters. Can you please add, for a start, tolerations for node.cilium.io/agent-not-ready, {{.Values.agentNotReadyTaintKey}}, and node.kubernetes.io/not-ready

@aanm aanm self-requested a review October 31, 2023 10:13
@jignyasamishra
Contributor Author

Two changes needed:

  • Can you please edit values.yaml.tmpl instead? Then, do a make -C install/kubernetes to propagate your changes
  • The operator does need some tolerations, so that it can be scheduled in early-stage clusters. Can you please add, for a start, tolerations for node.cilium.io/agent-not-ready, {{.Values.agentNotReadyTaintKey}}, and node.kubernetes.io/not-ready

Thank you so much for your review!! I am working on it.

@jignyasamishra
Contributor Author

  • The operator does need some tolerations, so that it can be scheduled in early-stage clusters. Can you please add, for a start, tolerations for node.cilium.io/agent-not-ready, {{.Values.agentNotReadyTaintKey}}, and node.kubernetes.io/not-ready

@squeed in my approach, I think in /install/kubernetes/cilium/values.yaml.tmpl I would add a toleration for node.kubernetes.io/not-ready:

  - key: node.kubernetes.io/not-ready
    effect: NoSchedule
    operator: Exists

and for node.cilium.io/agent-not-ready:

tolerations:
  - key: node.cilium.io/agent-not-ready
    effect: NoSchedule
    operator: Exists

As for agentNotReadyTaintKey: "node.cilium.io/agent-not-ready", I am not sure how to go about it, since agentNotReadyTaintKey prevents Cilium operations from being performed on tainted nodes, and adding the above toleration for node.cilium.io/agent-not-ready would let the operator tolerate exactly that taint. Can you guide me on this? Thank you
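Putting the two snippets above together, a user-facing override in values.yaml might look like the following. This is a sketch based on the snippets in this thread; the exact key layout under operator: is an assumption, not the chart's settled schema:

```yaml
operator:
  tolerations:
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoSchedule
    - key: node.cilium.io/agent-not-ready
      operator: Exists
      effect: NoSchedule
```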

@squeed
Contributor

squeed commented Nov 8, 2023

@jignyasamishra basically, the problem is that the operator's set of tolerations is over-broad. So, we need to give it a decent set of default tolerations, but allow users to override those defaults.
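To make the over-breadth concrete, here is a minimal Python sketch of Kubernetes toleration matching (an illustration, not the actual scheduler code). It shows why a bare operator: Exists toleration with no key matches every taint, including the node.kubernetes.io/unschedulable taint that the node controller applies to cordoned nodes:

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if a single toleration matches a single taint."""
    # An empty key with operator Exists matches all keys.
    if toleration.get("key") and toleration["key"] != taint["key"]:
        return False
    # An empty effect matches all effects.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        return True
    return toleration.get("value") == taint.get("value")


# Taint added when a node is cordoned for draining:
drain_taint = {"key": "node.kubernetes.io/unschedulable", "effect": "NoSchedule"}

# The over-broad toleration the operator currently ships with:
print(tolerates({"operator": "Exists"}, drain_taint))  # True

# A scoped toleration does not match the drain taint:
print(tolerates({"key": "node.kubernetes.io/not-ready",
                 "operator": "Exists"}, drain_taint))  # False
```

With only scoped tolerations, the operator pod no longer matches the cordon taint, so kubectl drain behaves as expected.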

Helm makes these sorts of changes possible, but it does take a bit of doing. So, you need to make two changes:

1: in values.yaml.tmpl: make the "default" value invalid

operator:
  #-- Tolerations for cilium-operator.
  # @default a reasonable set of tolerations to allow for running on the control plane
  tolerations: ~ 

Then, in cilium-operator/deployment.yaml:

tolerations:
{{- if not (kindIs "invalid" .Values.operator.tolerations) }}
  {{- toYaml .Values.operator.tolerations | trim | nindent 2 }}
{{- else }}
  - key: "node-role.kubernetes.io/control-plane"
    operator: Exists
  - key: "node-role.kubernetes.io/master" # deprecated
    operator: Exists
  - key: node.kubernetes.io/not-ready
    operator: Exists
  {{- with (coalesce .Values.agentNotReadyTaintKey "node.cilium.io/agent-not-ready") }}
  - key: {{ . }}
    operator: Exists
  {{- end }}
{{- end }}

Something like that. Make sense?
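For reference, with operator.tolerations left unset (~), a template along these lines would be expected to render roughly the following defaults. This is a sketch; the exact indentation depends on where the tolerations block sits in the Deployment spec:

```yaml
tolerations:
  - key: "node-role.kubernetes.io/control-plane"
    operator: Exists
  - key: "node-role.kubernetes.io/master" # deprecated
    operator: Exists
  - key: node.kubernetes.io/not-ready
    operator: Exists
  - key: node.cilium.io/agent-not-ready
    operator: Exists
```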

@aanm
Member

aanm commented Nov 9, 2023

@jignyasamishra basically, the problem is that the operator's set of tolerations is over-broad. So, we need to give it a decent set of default tolerations, but allow users to override those defaults.

Helm makes these sorts of changes possible, but it does take a bit of doing. So, you need to make two changes:

1: in values.yaml.tmpl: make the "default" value invalid

operator:
  #-- Tolerations for cilium-operator.
  # @default a reasonable set of tolerations to allow for running on the control plane
  tolerations: ~ 

Then, in cilium-operator/deployment.yaml:

tolerations:
{{- if not (kindIs "invalid" .Values.operator.tolerations) }}
  {{- toYaml .Values.operator.tolerations | trim | nindent 2 }}
{{- else }}
  - key: "node-role.kubernetes.io/control-plane"
    operator: Exists
  - key: "node-role.kubernetes.io/master" # deprecated
    operator: Exists
  - key: node.kubernetes.io/not-ready
    operator: Exists
  {{- with (coalesce .Values.agentNotReadyTaintKey "node.cilium.io/agent-not-ready") }}
  - key: {{ . }}
    operator: Exists
  {{- end }}
{{- end }}

Something like that. Make sense?

@squeed why is this a bug that Cilium has to fix? The default tolerations are "- operator: Exists". Users can still override those if they feel it appropriate. I'm only saying this because it's easier to know what the tolerations are and that you need to modify them than to not know what they are and not understand why the Cilium operator is not being scheduled to a particular node because of taints. This is the primary reason why we only have "Exists" as the default toleration.

@squeed
Contributor

squeed commented Nov 10, 2023

@aanm in this case, adding a reasonable set of tolerations can be an advantage, since the operator is a Deployment. In other words, if there is a maintenance issue on one of the control-plane nodes, the operator will be rescheduled. This is not relevant for DaemonSets, since they shouldn't be de-scheduled anyway.

It really stinks to have the operator not running because the node has gone missing. The node will be tainted accordingly, but the operator never rescheduled.

Whatever we do, we should basically match the tolerations in #28947, since they are the same.

@aanm
Member

aanm commented Nov 14, 2023

@aanm in this case, adding a reasonable set of tolerations can be an advantage, since the operator is a Deployment. In other words, if there is a maintenance issue on one of the control-plane nodes, the operator will be rescheduled. This is not relevant for DaemonSets, since they shouldn't be de-scheduled anyway.

@squeed I understand your point, but at the same time this does not happen in environments where you are trying out Cilium for the first time. My point is that if you know your control-plane nodes are being maintained, then you will also know how to change the tolerations of an existing Deployment.

First-time users will not know this, and if they are running in an environment that has node taints that are not tolerated by the Cilium operator, they will not be able to try out Cilium.

The Helm chart should provide sane defaults that allow users to run in 99% of their environments. If their environment is special, then they should change the Helm values accordingly.

It really stinks to have the operator not running because the node has gone missing. The node will be tainted accordingly, but the operator never rescheduled.

Whatever we do, we should basically match the tolerations in #28947, since they are the same.

@lmb lmb requested a review from squeed November 22, 2023 12:25
@squeed
Contributor

squeed commented Nov 28, 2023

The Helm chart should provide sane defaults that allow users to run in 99% of their environments. If their environment is special, then they should change the Helm values accordingly.

Agreed; that's why I propose it provide a reasonable set of tolerations. Tolerating all possible taints runs counter to general expectations, especially for something that isn't part of an individual node's networking.

In other words, people generally expect kubectl cordon to work, and I don't see any reason why we shouldn't support that by default.

@ti-mo ti-mo added the dont-merge/discussion A discussion is ongoing and should be resolved before merging, regardless of reviews & tests status. label Dec 5, 2023
@github-actions

github-actions bot commented Jan 5, 2024

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Jan 5, 2024
@github-actions

This pull request has not seen any activity since it was marked stale.
Closing.

@github-actions github-actions bot closed this Jan 19, 2024
@MatthiasWinzeler

@squeed can I pick this up again? I just ran into this: a node couldn't be drained because the cilium-operator was rescheduled onto it right away. I'll happily add a set of sane default tolerations.



Development

Successfully merging this pull request may close these issues.

tolerations: [{operator: Exists}] on operator prevents node drain

6 participants