Skip to content

Retry unmount on EBUSY and return errors#1868

Merged
estesp merged 2 commits intocontainerd:masterfrom
crosbymichael:unmount
Dec 4, 2017
Merged

Retry unmount on EBUSY and return errors#1868
estesp merged 2 commits intocontainerd:masterfrom
crosbymichael:unmount

Conversation

@crosbymichael
Copy link
Copy Markdown
Member

This is another WIP to fix #1785.

Signed-off-by: Michael Crosby [email protected]

@crosbymichael
Copy link
Copy Markdown
Member Author

@dmcgowan @Random-Liu @yanxuean

Please take a look at this fix to ensure unmount works or we get an error.

I remember in the past cadvisor causing things like this as its crawing the rootfs of mounts. If cadvisor is running on the CI this could be why we are seeing it here. The logic for unmounting is in the mount.unmount function in this PR and all unmounting in containerd has been updated to use the mount.Unmount function.

This is another WIP to fix containerd#1785.

Signed-off-by: Michael Crosby <[email protected]>
@codecov-io
Copy link
Copy Markdown

codecov-io commented Dec 4, 2017

Codecov Report

Merging #1868 into master will decrease coverage by 0.11%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1868      +/-   ##
==========================================
- Coverage   49.18%   49.07%   -0.12%     
==========================================
  Files          86       86              
  Lines        8244     8263      +19     
==========================================
  Hits         4055     4055              
- Misses       3519     3538      +19     
  Partials      670      670
Flag Coverage Δ
#linux 52.45% <0%> (-0.15%) ⬇️
#windows 44.25% <ø> (ø) ⬆️
Impacted Files Coverage Δ
mount/mount_linux.go 0% <0%> (ø) ⬆️
oci/spec_opts_unix.go 5.75% <0%> (-0.16%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e427fd6...a8b543f. Read the comment docs.

Comment thread container_opts_unix.go
err = filepath.Walk(root, incrementFS(root, uid, gid))
if uerr := mount.Unmount(root, 0); err == nil {
err = uerr
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case where both error do you think it would make sense to log.Warn() the unmount error? Or is the error message likely to be basically the same as the original walk error and doesn't provide any extra information?

Comment thread mount/mount_linux.go
return nil
}

func unmount(target string, flags int) error {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't actually fix the issue. It just reduce the chance, right?

Should we still continue os.RemoveAll when unmount fails?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have had this type of logic in the past for unmounting, EBUSY/EAGAIN means "try again" so its still an other all valid fix for the type of issue.

@dmcgowan suggested that we change RemoveAll to Remove so that we either leak a temp dir if it fails but not RM snapshot data.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

@AkihiroSuda
Copy link
Copy Markdown
Member

Can we support context.WithTimeout?

@Random-Liu
Copy link
Copy Markdown
Member

@crosbymichael
Copy link
Copy Markdown
Member Author

@Random-Liu ya, was just pushing the change now

Signed-off-by: Michael Crosby <[email protected]>
@dmcgowan
Copy link
Copy Markdown
Member

dmcgowan commented Dec 4, 2017

LGTM

Support for timeout and additional logging sound good, but can be done in follow up. Let's get this merged so we can verify the fix.

Copy link
Copy Markdown
Member

@estesp estesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@estesp estesp merged commit cc969fb into containerd:master Dec 4, 2017
@crosbymichael crosbymichael deleted the unmount branch December 4, 2017 19:04
@crosbymichael crosbymichael restored the unmount branch December 4, 2017 19:16
@crosbymichael crosbymichael deleted the unmount branch December 4, 2017 19:17
@crosbymichael crosbymichael restored the unmount branch December 4, 2017 19:19
Comment thread diff/walking/differ.go
return emptyDesc, errors.Wrap(err, "failed to create temporary directory")
}
defer os.RemoveAll(dir)
defer os.Remove(dir)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed a pr #1877 , PTAL

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fail to start container, rootfs seems not ready.

8 participants