Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support more overlayfs layers #2502

Merged

Conversation

fuweid
Copy link
Member

@fuweid fuweid commented Jul 26, 2018

Auto-detect longest common dir in lowerdir option and compact it if the
option size is hitting one page size. If does, Use chdir + CLONE to do
mount thing to avoid hitting one page argument buffer in linux kernel
mount.

More information is here #2497

@fuweid fuweid changed the title support more overlayfs layers [WIP] support more overlayfs layers Jul 26, 2018
@fuweid fuweid force-pushed the bugfix_support_more_overlayfs_layers branch 2 times, most recently from 6b8f114 to 47d8a50 Compare July 26, 2018 09:52
@codecov-io
Copy link

codecov-io commented Jul 26, 2018

Codecov Report

Merging #2502 into master will decrease coverage by 0.1%.
The diff coverage is 36.56%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2502      +/-   ##
==========================================
- Coverage   45.05%   44.95%   -0.11%     
==========================================
  Files          94       95       +1     
  Lines        9796     9928     +132     
==========================================
+ Hits         4414     4463      +49     
- Misses       4662     4743      +81     
- Partials      720      722       +2
Flag Coverage Δ
#linux 48.88% <36.56%> (-0.2%) ⬇️
#windows 41.57% <ø> (ø) ⬆️
Impacted Files Coverage Δ
sys/mount_linux.go 0% <0%> (ø)
mount/mount_linux.go 31.01% <60.49%> (+31.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4fb9230...67b54c6. Read the comment docs.

@fuweid fuweid changed the title [WIP] support more overlayfs layers support more overlayfs layers Jul 26, 2018
Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid circular dependency on docker/docker?

@fuweid
Copy link
Member Author

fuweid commented Jul 26, 2018

I see.... the docker/docker/builder using the containerd/containerd/mount.... 🤔

@@ -41,6 +41,7 @@ import (
shimapi "github.com/containerd/containerd/runtime/v1/shim/v1"
"github.com/containerd/ttrpc"
"github.com/containerd/typeurl"
"github.com/docker/docker/pkg/reexec"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using this as a dependency is a big no no. Either reimplement or find another solution.

DRY is a liability here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes @stevvooe. I will find other way to handle this. 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was looking around for a suggestion, but I think @dmcgowan covered the options in #2497 (comment).

Copy link
Member

@dmcgowan dmcgowan Jul 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a way to create a Fmountat which we could use here dmcgowan@d8f6d09

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevvooe yes. the option provided by @dmcgowan works! I have changed the code. Please take a look.

@fuweid fuweid changed the title support more overlayfs layers [WIP] support more overlayfs layers Jul 27, 2018
newdirs = append(newdirs, dir[len(commondir):])
}

m.Options = append(m.Options[:idx], m.Options[idx+1:]...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make a copy of m before altering


// NOTE: the snapshot id is based on digits. in order to avoid to get
// snapshots/x, need to back to parent snapshots dir. however, there is
// assumption that the common dir is ${root}/io.containerd.v1.overlayfs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be ${root}/io.containerd.v1.overlayfs/snapshots

@fuweid fuweid force-pushed the bugfix_support_more_overlayfs_layers branch 7 times, most recently from fd89efb to b41352b Compare July 30, 2018 13:27
@fuweid fuweid changed the title [WIP] support more overlayfs layers support more overlayfs layers Jul 30, 2018
_ "unsafe" // required for go:linkname.
)

//go:linkname beforeFork syscall.runtime_BeforeFork
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this officially supported across different versions of Go?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# https://github.com/golang/go/blob/release-branch.go1.8/src/syscall/exec_linux.go#L46
func runtime_BeforeFork()
func runtime_AfterFork()
# https://github.com/golang/go/blob/release-branch.go1.9/src/syscall/exec_linux.go#L52
func runtime_BeforeFork()
func runtime_AfterFork()
func runtime_AfterForkInChild()

1.9 or higher supports the three features. But, I think go1.10 can work well than go1.9 since the runtime.LockOSThread has been improved.

more input from https://golang.org/doc/go1.10#runtime

)

// avoid hitting one page limit of mount argument buffer
if m.Type == "overlay" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should still be a page size check in here, we need to offload that to a linux only section though


// avoid hitting one page limit of mount argument buffer
if m.Type == "overlay" {
options = copyOptions(options)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This copy should be done as part of the next function, and only when necessary

}

// read exit status from child thread.
n, err = readlen(p[0], (*byte)(unsafe.Pointer(&errno)), int(unsafe.Sizeof(errno)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there no other way to do this without a raw syscall? That should only be necessary in the child process

runtime.LockOSThread()
defer runtime.UnlockOSThread()

if _, errno = forkAndMountat(dirfd,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are still missing a waitpid or else we will have zombie babies


// open pipe to receive exit status from other thread
p[0], p[1] = -1, -1
if err = syscall.Pipe(p[:]); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually don't think you need pipes where, just waitpid(wait4) the exit status and use that. You can always wait on a process, running or not, that you forked off.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I make it complicated and it can cause the zombie process. I have updated the code. Please take a look.

@dmcgowan dmcgowan added this to the 1.2 milestone Jul 30, 2018
@fuweid fuweid force-pushed the bugfix_support_more_overlayfs_layers branch 3 times, most recently from 1ca0406 to 7a57b5c Compare July 31, 2018 15:17
return errors.Wrapf(err, "failed to find pid=%d process", pid)
}

if ps, err = p.Wait(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would almost recommend using unix.Wait4 here to follow the rest of the theme of low level syscalls in this function

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make senses. I will update it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for all your help

Copy link
Member Author

@fuweid fuweid Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add loop to retry if the Wait4 return EINTR? it is possible?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya, good idea

@fuweid fuweid force-pushed the bugfix_support_more_overlayfs_layers branch from 7a57b5c to b9d7839 Compare July 31, 2018 18:00

// optionsSize returns the byte size of options of mount.
func optionsSize(opts []string) int {
size := 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should still set some buffer room here, maybe start size at 512

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the buffer will be used in pagesize check. have updated it.

@fuweid fuweid force-pushed the bugfix_support_more_overlayfs_layers branch from b9d7839 to 3a3c261 Compare August 1, 2018 00:45
@crosbymichael
Copy link
Member

mount code LGTM

@fuweid
Copy link
Member Author

fuweid commented Aug 3, 2018

ping @dmcgowan @stevvooe @AkihiroSuda

please take a look~ thanks

//
// NOTE: 512 is a buffer during pagesize check.
if m.Type == "overlay" && optionsSize(options) >= pagesize-512 {
chdir, options = compactLowerdirOption(options)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there should be another check here that after compacting the options do fit a page, i.e. something like

if optionsSize(options) >= pagesize-512 {
	return errors.Errorf("can't mount: lower layers list too long")
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yet better, move this check to after parseMountOptions() as it's simpler and more precise in there:

	flags, data := parseMountOptions(options)
	if len(data) >= pagesize {
		return errors.Errorf("can't mount: mount options string too long")
	}
```
or smth like that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make senses. will update it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kolyshkin I add another check after compact and show the error if the mount option is too long.
if we got empty chdir or the data is too long, return the mount options is too long. It is readable to user instead of failed to mount: no such file or directory.

Auto-detect longest common dir in lowerdir option and compact it if the
option size is hitting one page size. If does, Use chdir + CLONE to do
mount thing to avoid hitting one page argument buffer in linux kernel
mount.

Signed-off-by: Wei Fu <[email protected]>
@fuweid fuweid force-pushed the bugfix_support_more_overlayfs_layers branch from 3a3c261 to 67b54c6 Compare August 7, 2018 03:00
@dmcgowan
Copy link
Member

dmcgowan commented Aug 7, 2018

LGTM

1 similar comment
@crosbymichael
Copy link
Member

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants