api: document force-delete effect of TerminationGracePeriodSeconds=0 #112564
A pod with Spec.TerminationGracePeriodSeconds set to 0 is force-deleted even when the client doing the delete doesn't explicitly ask for it. That follows from https://github.com/kubernetes/kubernetes/blob/0f582f7c3f504e807550310d00f130cb5c18c0c3/pkg/registry/core/pod/strategy.go#L151-L171 choosing TerminationGracePeriodSeconds as the value for GracePeriodSeconds when nothing is chosen explicitly. This effect was not obvious from the documentation of the field and might be something that users should avoid.
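To make the defaulting described above concrete, here is a minimal sketch of it. This is a simplified model, not the code in strategy.go: the types podSpec and deleteOptions and the function effectiveGracePeriod are hypothetical stand-ins, and only the "explicit value wins, otherwise fall back to the pod's setting" step is modeled. Returning 0 when neither value is set is an assumption of this sketch.

```go
package main

import "fmt"

// podSpec and deleteOptions are hypothetical stand-ins for the real
// Kubernetes API types; only the two fields discussed here are modeled.
type podSpec struct {
	TerminationGracePeriodSeconds *int64
}

type deleteOptions struct {
	GracePeriodSeconds *int64
}

// effectiveGracePeriod models only the defaulting step: an explicit value in
// the delete options wins; otherwise the pod's own setting is used. Returning
// 0 when neither is set is an assumption of this sketch.
func effectiveGracePeriod(spec podSpec, opts deleteOptions) int64 {
	if opts.GracePeriodSeconds != nil {
		return *opts.GracePeriodSeconds
	}
	if spec.TerminationGracePeriodSeconds != nil {
		return *spec.TerminationGracePeriodSeconds
	}
	return 0
}

func main() {
	zero, thirty := int64(0), int64(30)

	// Plain delete of a pod whose spec sets the grace period to 0.
	fmt.Println(effectiveGracePeriod(
		podSpec{TerminationGracePeriodSeconds: &zero}, deleteOptions{}))

	// Explicit force-delete of a pod with a 30 second grace period.
	fmt.Println(effectiveGracePeriod(
		podSpec{TerminationGracePeriodSeconds: &thirty},
		deleteOptions{GracePeriodSeconds: &zero}))

	// Plain delete of a pod with a 30 second grace period.
	fmt.Println(effectiveGracePeriod(
		podSpec{TerminationGracePeriodSeconds: &thirty}, deleteOptions{}))
}
```

In this model the first two calls both yield 0, which is the surprise this PR is about: a delete with empty options on such a pod is indistinguishable from an explicit force-delete. The third call yields 30.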
- // signal (no opportunity to shut down).
+ // signal (no opportunity to shut down) and also turns all Pod deletions for the pod into
+ // force-deletes (apiserver removes the Pod object immediately). Forced deletions can be potentially
+ // disruptive for some workloads and their Pods.
"Forced deletions ..." is a copy of the advice from https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination-forced.
It's debatable whether it should get repeated here.
Even the comment above your change is not exactly true:
The value zero indicates stop immediately via the kill signal (no opportunity to shut down).
The doc reads:
Setting the grace period to 0 forcibly and immediately deletes the Pod from the API server. If the pod was still running on a node, that forcible deletion triggers the kubelet to begin immediate cleanup.
[...]
When a force deletion is performed, the API server does not wait for confirmation from the kubelet that the Pod has been terminated on the node it was running on. It removes the Pod in the API immediately so a new Pod can be created with the same name. On the node, Pods that are set to terminate immediately will still be given a small grace period before being force killed.
So here's my suggestion:
The value zero indicates that the pod should be force-deleted immediately without waiting for confirmation
that it has been terminated. On the node, pods that are set to terminate immediately will still be given a
small grace period before being force killed. Forced deletions can be potentially disruptive for some workloads
and their Pods.
I think that explains both what happens on the apiserver and the node, and also gives the warning you're looking for, wdyt?
So "The value zero indicates stop immediately via the kill signal (no opportunity to shut down)." (from the original documentation) is really just plain wrong?
On the node, pods that are set to terminate immediately will still be given a small grace period before being force killed.
Where does this small grace period come from? Can it be configured?
I don't know anything but what I've seen in the doc you linked :-) I doubt it can be configured. But I think that's almost irrelevant, what matters most is that it's removed from the apiserver without confirmation.
What problem are you trying to solve? It sounds like you need some policy to prevent terminationGracePeriod from being set to 0.
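A policy of the kind suggested here could be as small as the following check. This is only a sketch under stated assumptions: podSpec, errZeroGracePeriod and checkGracePeriod are hypothetical names, and how such a check would be wired into a cluster (for example through an admission mechanism) is not covered.

```go
package main

import (
	"errors"
	"fmt"
)

// podSpec is a hypothetical stand-in for the real Kubernetes PodSpec type;
// only the field under discussion is modeled.
type podSpec struct {
	TerminationGracePeriodSeconds *int64
}

// errZeroGracePeriod is the error this sketch returns for a rejected spec.
var errZeroGracePeriod = errors.New("terminationGracePeriodSeconds must not be 0")

// checkGracePeriod rejects a spec that explicitly sets the grace period to 0
// and accepts everything else, including an unset value.
func checkGracePeriod(spec podSpec) error {
	if spec.TerminationGracePeriodSeconds != nil && *spec.TerminationGracePeriodSeconds == 0 {
		return errZeroGracePeriod
	}
	return nil
}

func main() {
	zero, one := int64(0), int64(1)
	fmt.Println(checkGracePeriod(podSpec{TerminationGracePeriodSeconds: &zero}))
	fmt.Println(checkGracePeriod(podSpec{TerminationGracePeriodSeconds: &one}))
	fmt.Println(checkGracePeriod(podSpec{}))
}
```

Only the explicit zero is rejected; a value of one, or no value at all, passes the check.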
I don't know anything but what I've seen in the doc you linked :-)
Those docs also might be wrong... If we want the API documentation to be correct, we probably need to determine what the implementation in kubelet actually does.
What problem are you trying to solve?
I wanted test pods to be killed immediately by kubelet but without also enabling a force delete because I wanted to go through the normal pod deletion process. From the documentation it sounded like TerminationGracePeriodSeconds: 0 would do that ("The value zero indicates stop immediately via the kill signal"), but then I discovered that a Delete(..., metav1.DeleteOptions{}) (i.e. a normal delete) acted like a force-delete (Delete(..., metav1.DeleteOptions{GracePeriodSeconds: &zero})).
This was surprising. I want to avoid that surprise for others, either:
- by fixing the documentation to describe accurately what TerminationGracePeriodSeconds does, or
- by making the implementation behave as implied by the documentation (control kill period in kubelet, without the force-delete side effect).
If someone wants it to be one, why wouldn't they set it to one?
They might want it to be zero, without realizing the full implications. Fixing the documentation would help here because in practice, a delay of one second is close enough - users just need to know it.
@klueska do you know how quickly kubelet kills pods when TerminationGracePeriodSeconds: 0? Is it immediately (because the Pod object is gone) or after a certain grace period?
It would be counter-intuitive if pods got killed more slowly for TerminationGracePeriodSeconds: 0 than for TerminationGracePeriodSeconds: 1.
/assign @apelisse
FWIW, I also wouldn't mind changing the behavior so that if
If someone wants it to be one, why wouldn't they set it to one?
/assign @smarterclayton he was trying to clarify this too in #102025; it seems the bot closed it
Thanks @aojea. That other PR shows that the current behavior is unintentional and should be changed. Let's put this doc update on hold and instead see whether we can finish the implementation update. /hold
No progress on #102025, but as I think that's the right solution, I'll close this one here. /close
@pohly: Closed this PR.
What type of PR is this?
/kind documentation
What this PR does / why we need it:
A pod with Spec.TerminationGracePeriodSeconds set to 0 is force-deleted even when the client doing the delete doesn't explicitly ask for it. That follows from kubernetes/pkg/registry/core/pod/strategy.go (lines 151 to 171 in 0f582f7) choosing TerminationGracePeriodSeconds as the value for GracePeriodSeconds when nothing is chosen explicitly.
This effect was not obvious from the documentation of the field and might be something that users should avoid.
Does this PR introduce a user-facing change?