Better error recovery in device mapper #3470
Conversation
Signed-off-by: Maksym Pavlenko <[email protected]>
Build succeeded.
    if activeErr != nil {
        return activeErr
    }
Hi,
What if a crash happens in between these steps and the rollback code doesn't get a chance to run?
This possibility also applies to other places that change the device or metadata, right?
What if a crash happens in between these steps and the rollback code doesn't get a chance to run?

Each devmapper operation is wrapped with a state transition function (have a look at 1 and 2). If we fail to activate a device and a crash happens before any defer runs, you'll end up with something like this in your metadata database (meaning that the device didn't finish the transition):
DevName=X ID=9 State=Activating Error="failed to activate"
Essentially each operation on a devmapper device is recorded (https://github.com/containerd/containerd/blob/master/snapshots/devmapper/device_info.go#L34), so you can get a clear picture of the device state and the last operation at any time. For example, CreateThinDevice uses the following states:
Unknown <-- preallocated metadata info and devID
Creating <-- createDevice
Created
Activating <-- activateDevice
Activated
Hi,
Thanks for your detailed explanation ;-) Yes, I know that in the crash case the metadata DB correctly reflects the device state before crashing, such as Creating, Activating, etc.
Actually, my question was: in that case, there is no chance for the rollback code to mark the device as faulty, right?
If so, after the machine comes back up and containerd starts again, we still get an "object already exists" error from AddDevice() - 878a320#diff-4132e94f99d36d21b09c446fb20687faL100.
If you crash right between create and rollback, then yes, the device won't be marked as Faulty. In that case, what behavior would you expect to see?
If a crash happens, containerd has to restart, so devmapper will be reloaded. Could we take that chance to scan the device list and mark as Faulty any device whose state is not a final one (Deactivated or Activated)? Alternatively, I feel it's safe to clean such a device out of the metadata and thin pool completely.
If you crash right between create and rollback, then yes, the device won't be marked as Faulty. In that case, what behavior would you expect to see?
Hi,
I just came up with this idea; I'd like to know what you think. Thanks.
diff --git a/snapshots/devmapper/pool_device.go b/snapshots/devmapper/pool_device.go
index 22e78c1..4eaab6e 100644
--- a/snapshots/devmapper/pool_device.go
+++ b/snapshots/devmapper/pool_device.go
@@ -150,6 +150,16 @@ func (p *PoolDevice) CreateThinDevice(ctx context.Context, deviceName string, vi
retErr = multierror.Append(retErr, p.metadata.MarkFaulty(ctx, info.Name))
return
}
+
+ // If ErrAlreadyExists is returned, it indicates the device name is already in the
+ // metadata store but the upper layer is not aware of it. This may occur when a crash
+ // happened and there was no chance to mark the device faulty. So mark it here, and
+ // the upper layer will retry with the same device name.
+ if metaErr == ErrAlreadyExists {
+ log.G(ctx).Warnf("conflicting thin device %s, mark it faulty", info.Name)
+ retErr = multierror.Append(retErr, p.metadata.MarkFaulty(ctx, info.Name))
+ return
+ }
}()
// Save initial device metadata and allocate new device ID from store
@@ -241,6 +251,12 @@ func (p *PoolDevice) CreateSnapshotDevice(ctx context.Context, deviceName string
retErr = multierror.Append(retErr, p.metadata.MarkFaulty(ctx, snapInfo.Name))
return
}
+
+ if metaErr == ErrAlreadyExists {
+ log.G(ctx).Warnf("conflicting snapshot %s, mark it faulty", snapInfo.Name)
+ retErr = multierror.Append(retErr, p.metadata.MarkFaulty(ctx, snapInfo.Name))
+ return
+ }
}()
I've addressed this use case in #3489. In case of crash, the snapshotter will mark devices as faulty after restart. Can you check if it works for you?
Great, thanks!
Hmm, devmapper: deferred remove can break consistency
This makes sense. devmapper needs more precise control over when to do unmounts, and that should certainly happen before remove. Though I think we don't need
Signed-off-by: Maksym Pavlenko <[email protected]>
Build succeeded.
Codecov Report
@@ Coverage Diff @@
## master #3470 +/- ##
==========================================
- Coverage 44.24% 44.22% -0.03%
==========================================
Files 124 124
Lines 13732 13780 +48
==========================================
+ Hits 6076 6094 +18
- Misses 6725 6747 +22
- Partials 931 939 +8
Continue to review full report at Codecov.
Yes, thanks.
Do you plan to add some documentation as part of this PR for what a user can do to remediate a faulty device ID and mark it as ok once it has been remediated? If not, is that planned for a follow-up PR?
Once it's correctly marked as faulty, I think it's safe to just delete those devices that the upper layer is not aware of and that contain user data.
@samuelkarp Yes, I plan to update the README in a follow-up PR.
@renzhengeek SGTM. At init time we can check for devices that are not in a final state and mark them faulty. However, I'd prefer to play it safe and leave any device deletion to the user, in order to prevent unexpected situations. Sometimes it's possible to restore a device instead of cleaning it up.
LGTM |
Should any of this cleanup/err handling get taken back into
@estesp no, this is devmapper-specific err handling.
This PR attempts to mitigate some of the cases described in #3436.
It introduces a faulty state for devices and devmapper device IDs.
A device that failed to create/activate and then failed to roll back (delete) will be marked as faulty in the metadata store (instead of cleaning up the metadata and ending up in an inconsistent state). The snapshotter will try to recreate the device with a different devID in a subsequent call. This allows it to move on and prevents the snapshotter from getting stuck entirely on the first faulty ID.
A faulty device ID means that the snapshotter could not handle it (could neither create nor roll it back) and it needs manual handling. It will be marked in the metadata store and won't be reused by the snapshotter until it is addressed and manually marked as ok.
@renzhengeek @jiazhiguang Could you please give it a try in your environment?