NodeUnstageVolume not called because unmounter fails when vol_data.json is deleted #101911

@pohly

Description

What happened:

The upcoming csi-driver-host-path release v1.7.0 will have a check in DeleteVolume that returns an error when the volume is still attached, staged, or published (kubernetes-csi/csi-driver-host-path#260).

Some of the jobs in the csi-driver-host-path repo with Kubernetes 1.21.0 are failing because NodeUnstageVolume is not called, causing DeleteVolume to fail repeatedly until the test times out.
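For context, here is a minimal sketch in Go of the kind of guard that kubernetes-csi/csi-driver-host-path#260 introduces. The type, field names, helper, and the choice of gRPC status code are illustrative assumptions, not the driver's actual code:

```go
package hostpath

import (
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// hostPathVolume is a hypothetical stand-in for the driver's per-volume
// bookkeeping; the field names are chosen for illustration only.
type hostPathVolume struct {
	isAttached  bool
	isStaged    bool
	isPublished bool
}

type hostPath struct {
	volumes map[string]*hostPathVolume
}

// deleteVolume refuses to remove a volume that the node side has not
// cleaned up yet. Until NodeUnpublishVolume/NodeUnstageVolume have run,
// deletion keeps failing, and a test that waits for it times out.
func (hp *hostPath) deleteVolume(volID string) error {
	vol, ok := hp.volumes[volID]
	if !ok {
		// Unknown volume: treat as already deleted (DeleteVolume is idempotent).
		return nil
	}
	if vol.isAttached || vol.isStaged || vol.isPublished {
		return status.Errorf(codes.Internal,
			"volume %q is still attached, staged, or published", volID)
	}
	delete(hp.volumes, volID)
	return nil
}
```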

One example: volume ID 3f3a132b-b25c-11eb-9aba-a6b5a7ade690, which corresponds to pvc-c0971c32-17c3-427f-a606-1bbf893caf89.

There's one kubelet error that seems relevant:

May 11 13:30:23 csi-prow-worker2 kubelet[247]: E0511 13:30:23.786580     247 reconciler.go:193] "operationExecutor.UnmountVolume failed (controllerAttachDetachEnabled true) for volume \"test-volume\" (UniqueName: \"kubernetes.io/csi/hostpath.csi.k8s.io^ca29f6e4-b25c-11eb-8da9-1e598a3983d2\") pod \"a6441c77-7c03-4d33-8a80-0a37d895f8a9\" (UID: \"a6441c77-7c03-4d33-8a80-0a37d895f8a9\") : UnmountVolume.NewUnmounter failed for volume \"test-volume\" (UniqueName: \"kubernetes.io/csi/hostpath.csi.k8s.io^ca29f6e4-b25c-11eb-8da9-1e598a3983d2\") pod \"a6441c77-7c03-4d33-8a80-0a37d895f8a9\" (UID: \"a6441c77-7c03-4d33-8a80-0a37d895f8a9\") : kubernetes.io/csi: unmounter failed to load volume data file [/var/lib/kubelet/pods/a6441c77-7c03-4d33-8a80-0a37d895f8a9/volumes/kubernetes.io~csi/pvc-c0971c32-17c3-427f-a606-1bbf893caf89/mount]: kubernetes.io/csi: failed to open volume data file [/var/lib/kubelet/pods/a6441c77-7c03-4d33-8a80-0a37d895f8a9/volumes/kubernetes.io~csi/pvc-c0971c32-17c3-427f-a606-1bbf893caf89/vol_data.json]: open /var/lib/kubelet/pods/a6441c77-7c03-4d33-8a80-0a37d895f8a9/volumes/kubernetes.io~csi/pvc-c0971c32-17c3-427f-a606-1bbf893caf89/vol_data.json: no such file or directory" err="UnmountVolume.NewUnmounter failed for volume \"test-volume\" (UniqueName: \"kubernetes.io/csi/hostpath.csi.k8s.io^ca29f6e4-b25c-11eb-8da9-1e598a3983d2\") pod \"a6441c77-7c03-4d33-8a80-0a37d895f8a9\" (UID: \"a6441c77-7c03-4d33-8a80-0a37d895f8a9\") : kubernetes.io/csi: unmounter failed to load volume data file [/var/lib/kubelet/pods/a6441c77-7c03-4d33-8a80-0a37d895f8a9/volumes/kubernetes.io~csi/pvc-c0971c32-17c3-427f-a606-1bbf893caf89/mount]: kubernetes.io/csi: failed to open volume data file [/var/lib/kubelet/pods/a6441c77-7c03-4d33-8a80-0a37d895f8a9/volumes/kubernetes.io~csi/pvc-c0971c32-17c3-427f-a606-1bbf893caf89/vol_data.json]: open /var/lib/kubelet/pods/a6441c77-7c03-4d33-8a80-0a37d895f8a9/volumes/kubernetes.io~csi/pvc-c0971c32-17c3-427f-a606-1bbf893caf89/vol_data.json: no such file or directory"
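The kubelet's CSI plugin reconstructs the unmounter from the vol_data.json it persisted when the volume was set up. Below is a minimal, self-contained sketch of that load step (type and function names are illustrative, not kubelet's exact API): once the file has been deleted, the unmounter can never be constructed, the pod-level unmount never completes, and the reconciler retries forever without reaching NodeUnstageVolume.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// volumeData mirrors the kind of information kubelet persists in
// vol_data.json at setup time (field names here are illustrative).
type volumeData struct {
	DriverName   string `json:"driverName"`
	VolumeHandle string `json:"volumeHandle"`
}

// loadVolumeData reads vol_data.json from a pod volume directory.
// If the file is missing, constructing an unmounter fails, so the
// kubelet reconciler keeps retrying the unmount instead of moving on
// to NodeUnstageVolume.
func loadVolumeData(podVolDir string) (*volumeData, error) {
	dataFile := filepath.Join(podVolDir, "vol_data.json")
	f, err := os.Open(dataFile)
	if err != nil {
		return nil, fmt.Errorf("failed to open volume data file [%s]: %w", dataFile, err)
	}
	defer f.Close()

	data := &volumeData{}
	if err := json.NewDecoder(f).Decode(data); err != nil {
		return nil, fmt.Errorf("failed to parse volume data file [%s]: %w", dataFile, err)
	}
	return data, nil
}

func main() {
	// With vol_data.json deleted from this directory, the call reproduces
	// the "no such file or directory" error from the kubelet log above.
	dir := "/var/lib/kubelet/pods/a6441c77-7c03-4d33-8a80-0a37d895f8a9/volumes/kubernetes.io~csi/pvc-c0971c32-17c3-427f-a606-1bbf893caf89"
	if _, err := loadVolumeData(dir); err != nil {
		fmt.Println("UnmountVolume.NewUnmounter would fail here:", err)
	}
}
```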

What you expected to happen:

NodeUnstageVolume should be called. The kubelet unstages a volume only after every pod-level unmount has succeeded, so as long as NewUnmounter keeps failing, NodeUnstageVolume is never reached and DeleteVolume keeps being rejected.

How to reproduce it (as minimally and precisely as possible):

Run CSI_PROW_KUBERNETES_VERSION=1.21.0 CSI_PROW_TESTS=parallel CSI_SNAPSHOTTER_VERSION=v4.0.0 ./.prow.sh in the csi-driver-host-path repo and hope that it fails.

Alternatively, retest in kubernetes-csi/csi-driver-host-path#289.

Anything else we need to know?:

Random observation: in the two cases that I looked at, the affected volume was published twice for different pods.

Metadata

Labels

kind/bug (Categorizes issue or PR as related to a bug)
sig/storage (Categorizes an issue or PR as relevant to SIG Storage)
triage/accepted (Indicates an issue or PR is ready to be actively worked on)
