-
Notifications
You must be signed in to change notification settings - Fork 42k
Fix subpath issues with orphaned pod cleanup #72291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
/priority critical-urgent |
pkg/util/mount/mount_linux.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened up #72290 for refactoring and fixing these inter-pkg dependency issues
|
/retest |
pkg/kubelet/kubelet_getters.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modify the comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
pkg/kubelet/kubelet_volumes.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the above podVolumesExist() checking, it calls getMountedVolumePathListFromDisk() to check whether there are mounted volumes exist. And then it calls getPodVolumePathListFromDisk to check whether there are still volume paths exist. I think there are some redundancy here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it might be redundant, but I don't want to remove it as part of this bug fix, as this code is not very well tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave it comment that this may be able to be cleaned up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened up #72346
6bfc578 to
639278a
Compare
pkg/kubelet/kubelet_getters.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per documentation:
// IsNotMountPoint is more expensive than IsLikelyNotMountPoint.
How frequently is getPodVolumeSubpathsDir(...) called? Want to be careful we don't introduce a performance regression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to remove this commit. Changing to IsNotMountPoint is not strictly necessary to fix this particular bug.
But we should investigate other callers of these functions to see if this may be wrong in other contexts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened up #72347 to investigate all uses of IsLikelyNotMountPoint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is it necessary to call IsLikelyNotMountPoint function at all then? There's no protection that it won't end up removing data if mount is from the same device anyway (which was the case for the bug we hit), so if you are ok with that, it seems like calling the function just gives false sense of confidence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IsLikelyNotMountPoint still works for most other volume types that are actually different devices.
The reason why I didn't want to change this function here is because it's being called in other paths as well, and we only wanted to do a targeted fix for this particular issue. We can do a more holistic investigation in #72346 and #72347 for further areas to cleanup.
639278a to
3ebbbbd
Compare
|
/lgtm |
|
/test pull-kubernetes-local-e2e-containerized |
|
/retest |
|
I manually tested this by:
|
|
/retest |
|
@msau42: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
/assign @Random-Liu |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: msau42, Random-Liu, saad-ali The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
| continue | ||
| } | ||
|
|
||
| // If there are any volume-subpaths, do not cleanup directories |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i do not fully follow why we should not cleanup directories if volume-subpaths exists ? May be add some additional comments to explain that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means that all the volumes have not been cleanly unmounted yet
…-upstream-release-1.12 Automated cherry pick of #72291: Check for volume-subpaths directory in orpahaned pod
…-upstream-release-1.13 Automated cherry pick of #72291: Check for volume-subpaths directory in orpahaned pod
…-upstream-release-1.11 Automated cherry pick of #72291: Check for volume-subpaths directory in orpahaned pod
What type of PR is this?
/kind bug
What this PR does / why we need it:
Check for existence of volume-subpaths directory before cleaning up the pod directory
Which issue(s) this PR fixes:
Fixes #72257
Special notes for your reviewer:
Does this PR introduce a user-facing change?: