Skip to content

Use naive diff for overlay2 when opaque copy up bug present#28138

Merged
cpuguy83 merged 1 commit intomoby:masterfrom
dmcgowan:handle-overlay2-copy-up-bug
Nov 10, 2016
Merged

Use naive diff for overlay2 when opaque copy up bug present#28138
cpuguy83 merged 1 commit intomoby:masterfrom
dmcgowan:handle-overlay2-copy-up-bug

Conversation

@dmcgowan
Copy link
Copy Markdown
Member

@dmcgowan dmcgowan commented Nov 7, 2016

- What I did

Update overlay2 driver to use naive diff when kernel has bug which copies up opaque directory flag

- How I did it

Added a check on kernels older than 4.8 for the bug.

- How to verify it

Run this Dockerfile on kernel without patched fix and should work

FROM alpine

RUN mkdir /dir1 && touch /dir1/f1
RUN rm -rf /dir1 && mkdir /dir1 && touch /dir1/f2
RUN touch /dir1/f3 
RUN [ -f /dir1/f2 ]

Closes #25244
... for all kernels which support overlay2

Ping @tonistiigi

Copy link
Copy Markdown
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to add a test for #25244

Comment thread daemon/graphdriver/overlay2/overlay.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do this like https://github.com/docker/docker/blob/master/daemon/graphdriver/aufs/aufs.go#L643 with sync.Once and lazy loading. Kernel check could probably be skipped then.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should we skip the kernel check though? The fix was merged before 4.8 was released so we can be confident that 4.8 or later have the fixed. Kernel tests were also added around this issue to prevent regression in later versions. Ideally users on newer Kernels won't have to incur the additional startup time required for this check.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can keep that check if you like it, although kernel versions are abused all the time and it is hard to predict who will decide to release something with a wrong version. With lazy loading, the additional time for full check should be very small.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use sync.Once, not every workflow will need to diff so it makes sense to do lazily.

Comment thread daemon/graphdriver/overlay2/overlay.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about printing something like "Please consider updating the kernel to 4.8.0 or later" here

Comment thread daemon/graphdriver/overlay2/check.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

windows fails to compile during UT:
https://jenkins.dockerproject.org/job/Docker-PRs-WoW-RS1/5763/console

00:09:14.608 # github.com/docker/docker/daemon/graphdriver/overlay2
00:09:14.609 daemon\graphdriver\overlay2\check.go:53: undefined: syscall.Mount
00:09:14.609 daemon\graphdriver\overlay2\check.go:57: undefined: syscall.Unmount

Please add the build tag?

@icecrime icecrime added the status/failing-ci Indicates that the PR in its current state fails the test suite label Nov 9, 2016
@dmcgowan dmcgowan force-pushed the handle-overlay2-copy-up-bug branch from 38ee5c0 to 8985860 Compare November 9, 2016 19:52
@dmcgowan
Copy link
Copy Markdown
Member Author

dmcgowan commented Nov 9, 2016

@AkihiroSuda added build tag, updated the error to mention updating to fix, and also added to the status output (if this merges first you will need to rebase your other PR).

@tonistiigi
Copy link
Copy Markdown
Member

@dmcgowan Need to mark that test for linux as well.

@dmcgowan dmcgowan force-pushed the handle-overlay2-copy-up-bug branch from 8985860 to bc430aa Compare November 9, 2016 21:14
@tonistiigi tonistiigi removed the status/failing-ci Indicates that the PR in its current state fails the test suite label Nov 9, 2016
When running on a kernel which is not patched for the copy up bug
overlay2 will use the naive diff driver.

Signed-off-by: Derek McGowan <[email protected]> (github: dmcgowan)
@dmcgowan dmcgowan force-pushed the handle-overlay2-copy-up-bug branch from bc430aa to 64b43ed Compare November 9, 2016 21:42
@tonistiigi
Copy link
Copy Markdown
Member

LGTM

1 similar comment
@cpuguy83
Copy link
Copy Markdown
Member

LGTM

@tristanz
Copy link
Copy Markdown

tristanz commented Jan 6, 2017

Am I correct to read that this patch means that for kernels without backports yet (RHEL 7), overlay2 will now have same issues with inode exhaustion as overlay in 1.13?

If so, the documentation (https://docs.docker.com/engine/userguide/storagedriver/overlayfs-driver/) should be updated to reflect that the overlay2 benefits do not apply to kernels older than 4.8 without backports, even though it will work.

@tonistiigi
Copy link
Copy Markdown
Member

@tristanz No, this only changes the diff algorithm, making docker commit bit slower. The optimizations from on disk layout still apply.

@dmcgowan
Copy link
Copy Markdown
Member Author

dmcgowan commented Jan 6, 2017

@tristanz the backport does not impact storage for overlay2. The only potential impact is running into #21555.

@tristanz
Copy link
Copy Markdown

tristanz commented Jan 6, 2017

@tonistiigi @dmcgowan thanks! Reading #21555 it looks like this is no longer critical bug after #27209 mitigation. With this PR and #27209 are there any major issues with overlay2 on RHEL 7.2+ beyond SELinux not working yet?

liusdu pushed a commit to liusdu/moby that referenced this pull request Oct 30, 2017
When running on a kernel which is not patched for the copy up bug
overlay2 will use the naive diff driver.

cherry-pick from moby#28138
and backport some code from moby#22641

Signed-off-by: Derek McGowan <[email protected]> (github: dmcgowan)
Signed-off-by: Lei Jitang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Disappearing npm packages when building with overlay2

9 participants