Skip to content

Commit 63b7587

Browse files
author
Eric Ren
committed
snapshots/devmapper: fix race windown causing IO hangup
The issue beblow happens several times beforing the root cause found: 1. A `fdisk -l` process has being hung up for a long time; 2. A image layer snapshot device is visiable to dmsetup, which should *not* happen because it should be deactivated after `Commit()`; The backtrace of `fdisk` is always the same over time: ```bash [<ffffffff810bbc6a>] io_schedule+0x2a/0x80 [<ffffffff81295a3f>] do_blockdev_direct_IO+0x1e9f/0x2f10 [<ffffffff81296aea>] __blockdev_direct_IO+0x3a/0x40 [<ffffffff81290e43>] blkdev_direct_IO+0x43/0x50 [<ffffffff811b8a14>] generic_file_read_iter+0x374/0x960 [<ffffffff81291ad5>] blkdev_read_iter+0x35/0x40 [<ffffffff8125229b>] new_sync_read+0xfb/0x240 [<ffffffff81252406>] __vfs_read+0x26/0x40 [<ffffffff81252b96>] vfs_read+0x96/0x130 [<ffffffff812540e5>] SyS_read+0x55/0xc0 [<ffffffff81003c04>] do_syscall_64+0x74/0x180 ``` The root cause is, in Commit(), there's a race window between `SuspendDevice()` and `DeactivateDevice()`, which may cause the IOs of a process or command like `fdisk` on the "suspended" device hang up forever. It has twofold: 1. The IOs suspends on the devices; 2. The device is in `Suspended` state, because it's deactivated with `deferred` flag and without `force` flag; So they cannot make progress. One reproducer is: 1. enlarge the race window by putting sleep seconds there; 2. run `while true; do sudo fdisk -l; sleep 0.5; done` on one terminal; 3. and pull image on another terminal; Fixes it by: 1. Resume the devices again after flushing IO by suspend; 2. Remove device without `deferred` flag; Fix: #4234 Signed-off-by: Eric Ren <[email protected]>
1 parent b1f5146 commit 63b7587

2 files changed

Lines changed: 25 additions & 3 deletions

File tree

snapshots/devmapper/pool_device.go

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -347,6 +347,16 @@ func (p *PoolDevice) SuspendDevice(ctx context.Context, deviceName string) error
347347
return nil
348348
}
349349

350+
func (p *PoolDevice) ResumeDevice(ctx context.Context, deviceName string) error {
351+
if err := p.transition(ctx, deviceName, Resuming, Resumed, func() error {
352+
return dmsetup.ResumeDevice(deviceName)
353+
}); err != nil {
354+
return errors.Wrapf(err, "failed to resume device %q", deviceName)
355+
}
356+
357+
return nil
358+
}
359+
350360
// DeactivateDevice deactivates thin device
351361
func (p *PoolDevice) DeactivateDevice(ctx context.Context, deviceName string, deferred, withForce bool) error {
352362
if !p.IsLoaded(deviceName) {

snapshots/devmapper/snapshotter.go

Lines changed: 15 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -277,14 +277,26 @@ func (s *Snapshotter) Commit(ctx context.Context, name, key string, opts ...snap
277277
return err
278278
}
279279

280-
// The thin snapshot is not used for IO after committed, so
281-
// suspend to flush the IO and deactivate the device.
280+
// After committed, the snapshot device will not be directly
281+
// used anymore. We'd better deativate it to make it *invisible*
282+
// in userspace, so that tools like LVM2 and fdisk cannot touch it,
283+
// and avoid useless IOs on it.
284+
//
285+
// Before deactivation, we need to flush the outstanding IO by suspend.
286+
// Afterward, we resume it again to prevent a race window which may cause
287+
// a process IO hang. See the issue below for details:
288+
// (https://github.com/containerd/containerd/issues/4234)
282289
err = s.pool.SuspendDevice(ctx, deviceName)
283290
if err != nil {
284291
return err
285292
}
286293

287-
return s.pool.DeactivateDevice(ctx, deviceName, true, false)
294+
err = s.pool.ResumeDevice(ctx, deviceName)
295+
if err != nil {
296+
return err
297+
}
298+
299+
return s.pool.DeactivateDevice(ctx, deviceName, false, false)
288300
})
289301
}
290302

0 commit comments

Comments
 (0)