Require explicit permission to start a recovery after 30 generations

A long run of consecutive unfinished recoveries (recoveries that never reach the fully_recovered state) is very bad for cluster performance. The metadata for every one of these TLog generations is stored in the coordinated state, so each recovery adds more work for the next one.

This makes problems that cause repeated failures more dangerous: if no action is taken, the cluster degrades further as more and more recoveries are attempted.

To prevent this failure cascade, after a certain number of recoveries the user should be required to tell the system it is okay to attempt a recovery. This will give administrators a chance to fix the root cause before doing more recoveries.
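
As a rough illustration of the proposed behavior, here is a minimal Python sketch of such a gate. It is not FoundationDB code: `RecoveryGate`, `RecoveryNotPermitted`, and every method name are hypothetical. The threshold of 30 comes from the issue title. Treating the operator's approval as valid for a single recovery attempt is an assumption, since the issue does not say how long permission should last.

```python
from dataclasses import dataclass

# Threshold taken from the issue title; the constant name is hypothetical.
MAX_UNFINISHED_GENERATIONS = 30


class RecoveryNotPermitted(Exception):
    """Raised when a recovery needs explicit operator approval first."""


@dataclass
class RecoveryGate:
    """Hypothetical gate tracking consecutive unfinished recoveries."""

    max_unfinished: int = MAX_UNFINISHED_GENERATIONS
    unfinished_generations: int = 0
    operator_approved: bool = False

    def approve_next_recovery(self) -> None:
        # The operator signals that the root cause has been looked at.
        self.operator_approved = True

    def begin_recovery(self) -> None:
        # Refuse to start once the limit is reached, unless approved.
        over_limit = self.unfinished_generations >= self.max_unfinished
        if over_limit and not self.operator_approved:
            raise RecoveryNotPermitted(
                f"{self.unfinished_generations} unfinished generations; "
                "explicit permission is required to continue"
            )
        # Assumption: approval covers exactly one recovery attempt.
        self.operator_approved = False
        self.unfinished_generations += 1

    def mark_fully_recovered(self) -> None:
        # Reaching fully_recovered ends the run of unfinished generations.
        self.unfinished_generations = 0
        self.operator_approved = False
```

With this shape, a cluster that keeps failing stops automatically at the threshold and waits for an administrator, while a single successful recovery resets the counter.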

Require explicit permission to start a recovery after 30 generations #2796

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Require explicit permission to start a recovery after 30 generations #2796

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions