snapshots/devmapper: deactivate thin device after committed#3262

Merged
crosbymichael merged 2 commits into containerd:master from renzhengeek:renzhen/devmapper-bugfix
May 9, 2019
Conversation

@renzhengeek

  1. Reason to deactivate a committed snapshot

The thin device will not be used for IO after it is committed,
and further thin snapshotting works fine with an inactive thin
device as the origin. The benefits of deactivating are:

  • the device is not unnecessarily visible, avoiding any unexpected IO;
  • the kernel data structures needed to maintain an active dm device are freed.

Quote from the kernel doc (Documentation/device-mapper/thin-provisioning.txt):

"
ii) Using an internal snapshot.

Once created, the user doesn't have to worry about any connection
between the origin and the snapshot. Indeed the snapshot is no
different from any other thinly-provisioned device and can be
snapshotted itself via the same method. It's perfectly legal to
have only one of them active, and there's no ordering requirement on
activating or removing them both. (This differs from conventional
device-mapper snapshots.)
"

  2. A thin-pool metadata bug goes away naturally

A problem occurs when suspending/resuming the origin thin device
fails while creating a snapshot:

"failed to create snapshot device from parent vg0-mythinpool-snap-3"
error="failed to save initial metadata for snapshot "vg0-mythinpool-snap-19":
object already exists"

This issue occurs because, when snapshot creation fails, the
snapshotter.store transaction can be rolled back, but the thin-pool
metadata boltdb fails to roll back in PoolDevice.CreateSnapshotDevice().
The metadata therefore becomes inconsistent: the snapshot ID is not
taken in snapshotter.store, but is saved in the pool metadata boltdb.

The cause is that, in PoolDevice.CreateSnapshotDevice(), the defer
calls are invoked in last-in, first-out order. When the error happens
in the "resume device" defer call, the metadata has already been saved
and the snapshot created, with no chance to be rolled back.

Signed-off-by: Eric Ren [email protected]

@renzhengeek
Author

The following error:

"failed to create snapshot device from parent vg0-mythinpool-snap-3"
error="failed to save initial metadata for snapshot "vg0-mythinpool-snap-19":
object already exists"

can be reproduced by fault injection in code like this:

in snapshots/devmapper/pool_device.go::CreateSnapshotDevice()

                                                                
	// Resume back base thin device on exit
	defer func() {
		//retErr = multierror.Append(retErr, p.resumeDevice(ctx, baseInfo)).ErrorOrNil()
		log.G(context.TODO()).Debugf("In Resume back base thin device")
		retErr = multierror.Append(retErr, errors.New("fake resumeDevice failure!")).ErrorOrNil()
	}()

@renzhengeek
Author

@mxpv Hi, please help review; any comments would be appreciated :)

@renzhengeek
Author

@mxpv Hi, if you don't like the idea of deactivating the committed snapshot, another idea I can think of to solve the rollback issue is this - use a single defer call, and put the device resume action first:

// CreateSnapshotDevice creates and activates new thin-device from parent thin-device (makes snapshot)
func (p *PoolDevice) CreateSnapshotDevice(ctx context.Context, deviceName string, snapshotName string, virtualSizeBytes uint64) (retErr error) {
	var (
		devSuspended = false
		metaSaved    = false
		snapCreated  = false
	)
...
	defer func() {
		// Resume back base thin device on exit
		if devSuspended {
			retErr = multierror.Append(retErr, p.resumeDevice(ctx, baseInfo))
		}

		// Rollback snapshot creation
		if snapCreated && retErr != nil {
			retErr = multierror.Append(retErr, p.deleteDevice(ctx, snapInfo))
		}

		// Rollback metadata
		if metaSaved && retErr != nil {
			retErr = multierror.Append(retErr, p.metadata.RemoveDevice(ctx, snapInfo.Name))
		}
	}()
...

@mxpv
Member

mxpv commented May 7, 2019

In general I like the idea of deactivating a device after Commit, as no IO is expected anymore and it simplifies the logic (we don't have to submit extra activate/deactivate/check-active calls to devmapper).

Some snapshotter tests are failing now, can you have a look at those?

@renzhengeek
Author

In general I like the idea of deactivating a device after Commit, as no IO is expected anymore and it simplifies the logic (we don't have to submit extra activate/deactivate/check-active calls to devmapper).

Thanks.

Some snapshotter tests are failing now, can you have a look at those?

Sure:)

@renzhengeek
Author

renzhengeek commented May 8, 2019

Hmm, strange - the CI failed at this case: FAIL: TestSnapshotterSuite/128LayersMount,
while my local testing passes; I tried 10 times and none failed:
go test -test.root -run TestSnapshotterSuite/128LayersMount

@renzhengeek renzhengeek force-pushed the renzhen/devmapper-bugfix branch from 68a9760 to 8045d2b Compare May 8, 2019 07:05
@renzhengeek
Author

Hmm, the AppVeyor failure seems unrelated to this PR:

[00:07:51]     --- FAIL: TestContentClient/CommitErrorState (4.00s)
[00:07:51]         testsuite.go:117: Cleanup failed: unknown
[00:07:51]             github.com/containerd/containerd/errdefs.init.ializers
[00:07:51]             	C:/gopath/src/github.com/containerd/containerd/errdefs/errors.go:39
[00:07:51]             runtime.main
[00:07:51]             	C:/go/src/runtime/proc.go:188
[00:07:51]             runtime.goexit
[00:07:51]             	C:/go/src/runtime/asm_amd64.s:1337
[00:07:51]             remove C:\Program Files\containerd\root-test\io.containerd.content.v1.content\ingest\9101c5f249b4cb38a50dddda86a787c09f31099573c87c525fc50774ac21d05d\.tmp-updatedat747542151: The process cannot access the file because it is being used by another process.

@fuweid
Member

fuweid commented May 8, 2019

@renzhengeek the AppVeyor failure is a flaky case. You can repush to make it happy...

@renzhengeek renzhengeek force-pushed the renzhen/devmapper-bugfix branch 3 times, most recently from af4daee to f87245b Compare May 8, 2019 11:26
@renzhengeek
Author

Hmm, strange - the CI failed at this case: FAIL: TestSnapshotterSuite/128LayersMount,
while my local testing passes; I tried 10 times and none failed:
go test -test.root -run TestSnapshotterSuite/128LayersMount

The failure shows that the new snapshot doesn't capture all the data, which seems to be caused by the umount flags:

UMOUNT(2)

MNT_DETACH (since Linux 2.4.11)
              Perform a lazy unmount: make the mount point unavailable for
              new accesses, immediately disconnect the filesystem and all
              filesystems mounted below it from each other and from the
              mount table, and actually perform the unmount when the mount
              point ceases to be busy.

I am wondering whether there is a proper way to flush the data before deactivating the snapshot.

@mxpv
Member

mxpv commented May 8, 2019

I ran tests on Ubuntu 18. Everything works as expected except TransitivityTest, which failed because parseDmsetupError couldn't recognize the ENXIO error. Something like this fixed the problem:

func parseDmsetupError(output string) string {
 ...
	line := lines[0]

	// Handle output like "Device /dev/mapper/snapshotter-suite-pool-snap-1 not found"
	if strings.HasSuffix(line, "not found") {
		return unix.ENXIO.Error()
	}

	const failedSubstr = "failed: "

As for Travis, can you try it without MNT_DETACH ?

@renzhengeek renzhengeek force-pushed the renzhen/devmapper-bugfix branch 3 times, most recently from f36ca76 to e637578 Compare May 9, 2019 01:21
@renzhengeek
Author

Hi,

I ran tests on Ubuntu 18. Everything works as expected except TransitivityTest, which failed because parseDmsetupError couldn't recognize the ENXIO error. Something like this fixed the problem:

func parseDmsetupError(output string) string {
 ...
	line := lines[0]

	// Handle output like "Device /dev/mapper/snapshotter-suite-pool-snap-1 not found"
	if strings.HasSuffix(line, "not found") {
		return unix.ENXIO.Error()
	}

	const failedSubstr = "failed: "

Done and pushed, thanks!

As for Travis, can you try it without MNT_DETACH ?

I'm afraid changing testutil.Unmount() may affect other places; instead, I added a SuspendDevice call before removal in Commit() to ensure the IO has reached disk. I think Commit() is a much less frequent operation than Prepare()/View() (creating a snapshot), so adding a suspend operation is acceptable :)

Travis passes now. Will trigger it a few more times to see if it's stable enough.

@renzhengeek renzhengeek force-pushed the renzhen/devmapper-bugfix branch 2 times, most recently from 971d292 to 70e0c16 Compare May 9, 2019 02:12
Member

@mxpv mxpv left a comment


instead, I added a SuspendDevice call before removal in Commit() to ensure the IO has reached disk

Yep, suspend will do the trick too, great idea.
LGTM

@renzhengeek renzhengeek force-pushed the renzhen/devmapper-bugfix branch from 70e0c16 to d03417d Compare May 9, 2019 02:59
@renzhengeek
Author

Travis failed with one failure unrelated to this PR:

Summarizing 1 Failure:
[Fail] [k8s.io] Container runtime should support basic operations on container [BeforeEach] runtime should support removing container [Conformance] 
/home/travis/gopath/src/github.com/kubernetes-sigs/cri-tools/pkg/framework/util.go:213
Ran 66 of 73 Specs in 55.701 seconds
FAIL! -- 65 Passed | 1 Failed | 0 Pending | 7 Skipped
Ginkgo ran 1 suite in 55.814877871s
Test Suite Failed
--- FAIL: TestCRISuite (55.82s)
    cri_test.go:121: Failed to run tests in paralllel: exit status 1
FAIL

Let's trigger CI again.

@renzhengeek renzhengeek force-pushed the renzhen/devmapper-bugfix branch from d03417d to c560040 Compare May 9, 2019 03:32
@codecov-io

codecov-io commented May 9, 2019

Codecov Report

Merging #3262 into master will decrease coverage by 0.02%.
The diff coverage is 55.55%.


@@            Coverage Diff             @@
##           master    #3262      +/-   ##
==========================================
- Coverage   44.56%   44.53%   -0.03%     
==========================================
  Files         113      113              
  Lines       12194    12195       +1     
==========================================
- Hits         5434     5431       -3     
- Misses       5928     5930       +2     
- Partials      832      834       +2
Flag        Coverage Δ
#linux      48.5% <55.55%> (-0.04%) ⬇️
#windows    39.8% <ø> (ø) ⬆️

Impacted Files                            Coverage Δ
snapshots/devmapper/dmsetup/dmsetup.go    80.23% <20%> (-1.97%) ⬇️
snapshots/devmapper/snapshotter.go        66.39% <33.33%> (-0.96%) ⬇️
snapshots/devmapper/pool_device.go        61.4% <75%> (+0.5%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d68b593...6f463d3.

For a block device like devicemapper, umounting before committing an
active snapshot syncs its data onto disk. Otherwise:

1. deactivating a block device still in use will fail, or cause IO errors
when forced;
2. a new snapshot taken from it will not capture data not yet written to disk.

Signed-off-by: Eric Ren <[email protected]>
@renzhengeek renzhengeek force-pushed the renzhen/devmapper-bugfix branch from c560040 to 6f463d3 Compare May 9, 2019 04:27
Member

@fuweid fuweid left a comment


LGTM

This is a good improvement. When we run several instances from the same image, it can save time.

@crosbymichael
Member

LGTM
