More context for errors by kolyshkin · Pull Request #38068 · moby/moby

kolyshkin · 2018-10-23T01:39:50Z

Some functions that moby uses return "raw" errors from the kernel. Wrap those to provide more error context to the caller. Please see individual commit descriptions for details.

AkihiroSuda · 2018-10-23T02:09:17Z

pkg/mount/mount.go

Could you implement causer? See https://godoc.org/github.com/pkg/errors#Cause

Good idea. Done.

AkihiroSuda · 2018-10-23T02:10:28Z

LGTM but a small nit.

Could you port this to containerd, because we are going to deprecate docker/pkg/mount pkg and use containerd

kolyshkin · 2018-10-23T05:24:40Z

> Could you port this to containerd, because we are going to deprecate docker/pkg/mount pkg and use containerd

Yes, and I was about to start working on that... but there's much more to be ported (mountinfo parser, for one thing).

AkihiroSuda · 2018-10-23T05:35:10Z

05:28:00 pkg/mount/mount.go:41:1:warning: exported method MountError.Cause should have comment or be unexported (golint)
05:28:00 pkg/mount/mount.go:13:6:warning: type name will be used as mount.MountError by other packages, and that stutters; consider calling this Error (golint)

kolyshkin · 2018-10-23T22:27:43Z

> 05:28:00 pkg/mount/mount.go:41:1:warning: exported method MountError.Cause should have comment or be unexported (golint)
> 05:28:00 pkg/mount/mount.go:13:6:warning: type name will be used as mount.MountError by other packages, and that stutters; consider calling this Error (golint)

Should be fixed now.

thaJeztah · 2018-10-24T22:10:30Z

Failures look legit (at least, same failures on janky, power, and z)

03:41:31 --- FAIL: TestSubtreePrivate (0.04s)
03:41:31     sharedsubtree_linux_test.go:65: %!v(PANIC=runtime error: invalid memory address or nil pointer dereference)
03:41:31 --- FAIL: TestSubtreeSharedSlave (0.06s)
03:41:31     sharedsubtree_linux_test.go:249: %!v(PANIC=runtime error: invalid memory address or nil pointer dereference)
03:41:31 --- FAIL: TestSubtreeUnbindable (0.02s)
03:41:31     sharedsubtree_linux_test.go:330: mount /tmp/mount-tests/source:/tmp/mount-tests/target, flags: 0x1000: invalid argument
03:41:31 FAIL

kolyshkin · 2018-10-25T02:02:16Z

I have now carefully audited all users of pkg/mount and made all the necessary changes (in most cases it is simplification of logging as the error already contains everything needed).

kolyshkin · 2018-10-25T03:50:39Z

Failure in experimental CI is a flaky test, TestServiceWithDefaultAddressPoolInit, #37836

Windows failure is unrelated

kolyshkin · 2018-10-25T04:42:02Z

z failure is a known flaky test, TestSwarmClusterRotateUnlockKey, see #33041

daemon/graphdriver/devmapper/deviceset.go

thaJeztah · 2018-10-25T09:33:43Z

daemon/graphdriver/devmapper/deviceset.go

Same here; perhaps s/mount failed/failed to mount/ ?

thaJeztah · 2018-10-25T09:35:16Z

daemon/graphdriver/devmapper/driver.go

Looks like this if is redundant now, and we can just return umountErr (perhaps you find the explicit if more clear though, so up to you 😄)

thaJeztah · 2018-10-25T09:36:55Z

daemon/graphdriver/zfs/zfs.go

Do you think we'd want to keep zfs in the error message (to more easily find where to look for the issue?) or is this already logged with storagedriver=zfs ? 🤔 (in that case; ignore 😂)

I think it will be clear from the context -- after all, we only support one graph driver at a time, and its name is logged.

I think we should keep zfs in there (same with other graphdriver's error messages as well).

"error creating zfs mount" is much easier to search and find in code or to find other people reporting a similar error.

In general, though, I don't think having unique error strings to aid developers in bug report analysis is a good strategy. It would be much better to provide some context in error messages (like file name and line number where the error happens). This is the way I was doing it back in my C days (using preprocessor macros). Alas, logrus doesn't do it. Perhaps it will get this functionality one day? Could be possible with debug.Stack() or smth.

"errors.WithStack" from "github.com/pkg/errors"... but still a good error message that signals where the error came from is important.

thaJeztah · 2018-10-25T11:53:34Z

pkg/mount/mount.go

Looks like we always want to ignore the EINVAL case (a similar check is in RecursiveUnmount); perhaps move this to unmount() and return nil if the error is due to syscall.EINVAL;

diff --git a/pkg/mount/mount.go b/pkg/mount/mount.go
index 12f968cbb8..3edd031acf 100644
--- a/pkg/mount/mount.go
+++ b/pkg/mount/mount.go
@@ -4,9 +4,7 @@ import (
 	"sort"
 	"strconv"
 	"strings"
-	"syscall"
 
-	"github.com/pkg/errors"
 	"github.com/sirupsen/logrus"
 )
 
@@ -125,12 +123,7 @@ func ForceMount(device, target, mType, options string) error {
 // Unmount lazily unmounts a filesystem on supported platforms, otherwise
 // does a normal unmount.
 func Unmount(target string) error {
-	err := unmount(target, mntDetach)
-	if err != nil && errors.Cause(err) == syscall.EINVAL {
-		// ignore "not mounted" error
-		err = nil
-	}
-	return err
+	return unmount(target, mntDetach)
 }
 
 // RecursiveUnmount unmounts the target and all mounts underneath, starting with
@@ -150,16 +143,6 @@ func RecursiveUnmount(target string) error {
 		logrus.Debugf("Trying to unmount %s", m.Mountpoint)
 		err = unmount(m.Mountpoint, mntDetach)
 		if err != nil {
-			// If the error is EINVAL either this whole package is wrong (invalid flags passed to unmount(2)) or this is
-			// not a mountpoint (which is ok in this case).
-			// Meanwhile calling `Mounted()` is very expensive.
-			//
-			// We've purposefully used `syscall.EINVAL` here instead of `unix.EINVAL` to avoid platform branching
-			// Since `EINVAL` is defined for both Windows and Linux in the `syscall` package (and other platforms),
-			// this is nicer than defining a custom value that we can refer to in each platform file.
-			if errors.Cause(err) == syscall.EINVAL {
-				continue
-			}
 			if i == len(mounts)-1 {
 				if mounted, e := Mounted(m.Mountpoint); e != nil || mounted {
 					return err
diff --git a/pkg/mount/mounter_linux.go b/pkg/mount/mounter_linux.go
index bfa87a05ee..6f289e06d2 100644
--- a/pkg/mount/mounter_linux.go
+++ b/pkg/mount/mounter_linux.go
@@ -1,6 +1,9 @@
 package mount // import "github.com/docker/docker/pkg/mount"
 
 import (
+	"syscall"
+
+	"github.com/pkg/errors"
 	"golang.org/x/sys/unix"
 )
 
@@ -77,6 +80,16 @@ func unmount(target string, flag int) error {
 	if err == nil {
 		return nil
 	}
+	// If the error is EINVAL either this whole package is wrong (invalid flags passed to unmount(2)) or this is
+	// not a mountpoint (which is ok in this case).
+	// Meanwhile calling `Mounted()` is very expensive.
+	//
+	// We've purposefully used `syscall.EINVAL` here instead of `unix.EINVAL` to avoid platform branching
+	// Since `EINVAL` is defined for both Windows and Linux in the `syscall` package (and other platforms),
+	// this is nicer than defining a custom value that we can refer to in each platform file.
+	if errors.Cause(err) == syscall.EINVAL {
+		return nil
+	}
 	return &Error{
 		op:     "umount",
 		target: target,

sure; it'll be cleaner this way

thaJeztah · 2018-10-25T11:55:21Z

plugin/manager_linux.go

~~Should this be "Failed to unmount" ?~~ Ah, I see; the .Error() returns the "op" as part of the error, so this will include unmount.

Perhaps we should prefix that error with "failed to <op>" otherwise we'll have to add that everywhere 🤔

I am following the style of errors from os package (and the general unix error style) which is in the form <op>: <error>. In fact it makes sense here as well, something that would look like:

... level=warning msg="umount <dir>: device busy"

which gives no less info than

... level=warning msg="Failed to umount <dir>: device busy"

so I'll drop the "failed to" from this place.

thaJeztah · 2018-10-25T11:59:27Z

pkg/mount/mount.go

Should we have the error start with "failed to " (e.g.)?

Well, all the error messages (say from os package) are in the form op: msg, for example

open: permission denied

or

readlink: no such file or directory

I think we should follow the pattern. Brevity is the soul of wit 😸, besides, "failed to" is redundant here -- it's an error, so it is implied that something has failed.

codecov · 2018-10-25T18:03:57Z

Codecov Report

❗ No coverage uploaded for pull request base (master@3e44f58).
The diff coverage is 27.69%.

@@            Coverage Diff            @@
##             master   #38068   +/-   ##
=========================================
  Coverage          ?    36.1%           
=========================================
  Files             ?      610           
  Lines             ?    45296           
  Branches          ?        0           
=========================================
  Hits              ?    16352           
  Misses            ?    26706           
  Partials          ?     2238

thaJeztah · 2018-10-27T01:03:39Z

Seeing one more failure;

18:17:27 pkg/mount/mounter_freebsd.go:1::warning: file is not goimported (goimports)

kolyshkin · 2018-10-27T01:41:04Z

> Seeing one more failure

I forgot a hunk :( should be fixed now

kolyshkin · 2018-11-28T19:28:44Z

rebased
modified UnmountIpcMount() to not use mount.Mounted(), as per "Don't log EINVAL when unmount IPC" #33329 (comment)

kolyshkin · 2018-11-28T22:29:41Z

PPC failure is #36903

syscall.Stat (and Lstat), unlike functions from os pkg, return "raw" errors (like EPERM or EINVAL), and those are propagated up the function call stack unchanged, and get logged and/or returned to the user as is.

Wrap those into os.PathError{} so the error message will at least have function name and file name.

Note we use Capitalized function names to distinguish between functions in os and ours.

Signed-off-by: Kir Kolyshkin <[email protected]>

As standard mount.Unmount does what we need, let's use it.

In addition, this adds ignoring "not mounted" condition, which was previously implemented (see PR #33329, commit cfa2591) via a very expensive call to mount.Mounted().

Signed-off-by: Kir Kolyshkin <[email protected]>

The function is not needed as it's just a shallow wrapper around unix.Mount().

Signed-off-by: Kir Kolyshkin <[email protected]>

It has been pointed out that we're ignoring EINVAL from umount(2) everywhere, so let's move it to a lower-level function. Also, its implementation should be the same for any UNIX incarnation, so let's consolidate it.

Signed-off-by: Kir Kolyshkin <[email protected]>

@cpuguy83

The errors returned from Mount and Unmount functions are raw syscall.Errno errors (like EPERM or EINVAL), which provide no context about what has happened and why.

Similar to os.PathError type, introduce mount.Error type with some context. The error messages will now look like this:

> mount /tmp/mount-tests/source:/tmp/mount-tests/target, flags: 0x1001: operation not permitted

or

> mount tmpfs:/tmp/mount-test-source-516297835: operation not permitted

Before this patch, it was just

> operation not permitted

[v2: add Cause()]
[v3: rename MountError to Error, document Cause()]
[v4: fixes; audited all users]
[v5: make Error type private; changes after @cpuguy83 reviews]

Signed-off-by: Kir Kolyshkin <[email protected]>

kolyshkin · 2018-12-11T04:08:52Z

Rebased; conflicts resolved.

I believe this one is ready to be merged and is very helpful in diagnosing bugs (the issue of the day this PR could greatly help with is #38252)

kolyshkin · 2018-12-11T04:10:10Z

@cpuguy83 PTAL; I know you wanted volume mount errors to be wrapped as well but I think it could be done in a separate PR -- this one is more about system mounts (as in mount(2)).

cpuguy83

LGTM

thaJeztah

LGTM

GordonTheTurtle added the status/0-triage label Oct 23, 2018

AkihiroSuda reviewed Oct 23, 2018

kolyshkin force-pushed the err branch from 3892bd2 to 848792b Compare October 23, 2018 05:23

AkihiroSuda approved these changes Oct 23, 2018

kolyshkin force-pushed the err branch from 848792b to cac6beb Compare October 23, 2018 22:27

AkihiroSuda added the rebuild/* label Oct 24, 2018

GordonTheTurtle removed the rebuild/* label Oct 24, 2018

kolyshkin force-pushed the err branch from cac6beb to 0566522 Compare October 25, 2018 02:01

kolyshkin requested a review from cpuguy83 as a code owner October 25, 2018 02:01

kolyshkin force-pushed the err branch from 0566522 to aac7c1a Compare October 25, 2018 02:24

thaJeztah added status/2-code-review rebuild/* and removed status/0-triage labels Oct 25, 2018

GordonTheTurtle removed the rebuild/* label Oct 25, 2018

thaJeztah requested changes Oct 25, 2018

kolyshkin force-pushed the err branch from aac7c1a to 722db3f Compare October 25, 2018 18:03

kolyshkin force-pushed the err branch from 722db3f to 6445fee Compare October 27, 2018 01:40

thaJeztah added the rebuild/janky label Oct 27, 2018

GordonTheTurtle removed the rebuild/janky label Oct 27, 2018

GordonTheTurtle assigned AkihiroSuda Nov 7, 2018

kolyshkin force-pushed the err branch 2 times, most recently from dbfa09b to 8571150 Compare November 8, 2018 20:17

kolyshkin force-pushed the err branch 2 times, most recently from aaedc36 to f4d618f Compare November 28, 2018 19:27

kolyshkin mentioned this pull request Dec 11, 2018

"docker cp" not working when bind mounting cgroup dir with btrfs storage driver #38252

Open

kolyshkin added 5 commits December 10, 2018 20:06

aufs: get rid of mount()

2f98b5f

The function is not needed as it's just a shallow wrapper around unix.Mount(). Signed-off-by: Kir Kolyshkin <[email protected]>

kolyshkin force-pushed the err branch from f4d618f to 6533136 Compare December 11, 2018 04:07

cpuguy83 approved these changes Dec 11, 2018

thaJeztah approved these changes Dec 11, 2018

thaJeztah added status/4-merge status/2-code-review rebuild/* and removed status/2-code-review status/4-merge labels Dec 11, 2018

GordonTheTurtle removed the rebuild/* label Dec 11, 2018

thaJeztah added status/4-merge and removed status/2-code-review labels Dec 11, 2018

vdemeester added the rebuild/windowsRS5-process label Dec 12, 2018

vdemeester merged commit d4a6e1c into moby:master Dec 12, 2018

kolyshkin mentioned this pull request Jun 4, 2019

[18.09 backport ENGCORE-830] aufs optimizations #39107 docker-archive/engine#262

Merged

kolyshkin mentioned this pull request Mar 7, 2020

EnsureRemoveAll, RecursiveUnmount: don't call Mounted around Unmount #40637

Merged
