Delay Deletion of a Pod until volumes are cleaned up #41456
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. The following people have approved this PR: dashpole. Needs approval from an approver in each of these OWNERS files. We suggest the following people:
The branch was force-pushed from ad3fd4f to 1d38818.
@dashpole Could you put some details about the issue you discovered in the previous PR here, so that we can learn something from it? Thanks!
@jingxu97 The issue from the previous PR, as shown here, was that some updates to the API server were breaking the invariant that pod status always progresses forward. For example, status can go PENDING -> RUNNING -> FAILED, but should never go PENDING -> RUNNING -> FAILED -> RUNNING, which is what was happening in the previous PR. By adding debugging statements and reproducing the bug, I was able to see that calls to statusManager.SetPodStatus were correctly updating the status to FAILED, and that a subsequent call to statusManager.TerminatePod was resetting the status to RUNNING. This led to the discovery of #41436 in TerminatePod, which I believe has solved the problem.
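For illustration, here is a minimal, self-contained Go sketch of that forward-only invariant; the `PodPhase` type, the rank map, and `isPhaseRegression` are hypothetical stand-ins for this comment only, not the actual statusManager code:

```go
package main

import "fmt"

// PodPhase mirrors the coarse pod lifecycle phases discussed above.
// This is a standalone illustration, not the Kubernetes API type.
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
)

// phaseRank orders phases so an update can be flagged if it would move a
// pod "backwards", e.g. FAILED -> RUNNING.
var phaseRank = map[PodPhase]int{
	PodPending:   0,
	PodRunning:   1,
	PodSucceeded: 2,
	PodFailed:    2,
}

// isPhaseRegression reports whether applying newPhase after oldPhase would
// violate the forward-only invariant described in the comment above.
func isPhaseRegression(oldPhase, newPhase PodPhase) bool {
	return phaseRank[newPhase] < phaseRank[oldPhase]
}

func main() {
	fmt.Println(isPhaseRegression(PodRunning, PodFailed)) // false: a normal transition
	fmt.Println(isPhaseRegression(PodFailed, PodRunning)) // true: the regression seen in the previous PR
}
```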
    if !ok {
    	return false
    }
    return !kubepod.IsMirrorPod(pod) && m.podDeletionSafety.OkToDeletePod(pod)
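The check above only reports a pod as deletable when it is not a mirror pod and the deletion-safety hook agrees. Below is a rough, self-contained sketch of what such a hook could look like; the `Pod` struct, `deletionSafetyProvider` interface, and `kubeletSafety` type are illustrative assumptions, not the kubelet's actual implementation:

```go
package main

import "fmt"

// Pod is a stripped-down stand-in for the API object; the real code works
// with *v1.Pod and the kubelet's container and volume state.
type Pod struct {
	Name              string
	RunningContainers int
	VolumesInUse      int
}

// deletionSafetyProvider is a hypothetical version of the hook consulted in
// the snippet above: the pod should only be removed from the API server once
// the node has finished tearing it down.
type deletionSafetyProvider interface {
	OkToDeletePod(pod *Pod) bool
}

// kubeletSafety says a pod is safe to delete only after all of its
// containers have exited and its volumes have been cleaned up.
type kubeletSafety struct{}

func (kubeletSafety) OkToDeletePod(pod *Pod) bool {
	return pod.RunningContainers == 0 && pod.VolumesInUse == 0
}

func main() {
	var safety deletionSafetyProvider = kubeletSafety{}
	fmt.Println(safety.OkToDeletePod(&Pod{Name: "a", VolumesInUse: 1})) // false: volumes still attached
	fmt.Println(safety.OkToDeletePod(&Pod{Name: "b"}))                  // true: nothing left to clean up
}
```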
#41507 might be relevant here.
Here we go again :)

/lgtm
Automatic merge from submit-queue (batch tested with PRs 41466, 41456, 41550, 41238, 41416) |
@dashpole: The following test(s) failed:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Automatic merge from submit-queue (batch tested with PRs 42734, 42745, 42758, 42814, 42694)

Create DefaultPodDeletionTimeout for e2e tests

In our e2e and e2e_node tests, we had a number of different timeouts for deletion. Recent changes to the way deletion works (kubernetes#41644, kubernetes#41456) have resulted in some timeouts in e2e tests. kubernetes#42661 was the most recent fix for this. Most of these tests are not meant to test pod deletion latency, but rather just to clean up pods after a test is finished. For this reason, we should change all these tests to use a standard, fairly high timeout for deletion.

cc @vishh @Random-Liu
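As a rough sketch of that idea, the snippet below shows a single shared deletion timeout used by test cleanup; the 3-minute value and the `deletePodAndWait` helper are assumptions for illustration, not the actual e2e framework code:

```go
package main

import (
	"fmt"
	"time"
)

// DefaultPodDeletionTimeout captures the idea in the PR above: one shared,
// fairly high timeout for tests that only need to clean up pods, instead of
// a different ad-hoc timeout per test. The 3-minute value is illustrative.
const DefaultPodDeletionTimeout = 3 * time.Minute

// deletePodAndWait is a hypothetical helper showing how a test would use the
// shared timeout rather than hard-coding its own.
func deletePodAndWait(podName string, timeout time.Duration) error {
	fmt.Printf("deleting %s, waiting up to %s for it to disappear\n", podName, timeout)
	// In a real test this would issue the delete and poll the API server
	// until the pod is gone or the timeout expires.
	return nil
}

func main() {
	_ = deletePodAndWait("test-pod", DefaultPodDeletionTimeout)
}
```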
#41436 fixed the bug that caused #41095 and #40239 to have to be reverted. Now that the bug is fixed, this shouldn't cause problems.
@vishh @derekwaynecarr @sjenning @jingxu97 @kubernetes/sig-storage-misc