Ratekeeper cannot control based on storage server issues when the list of storage servers cannot be read #1642

@ajbeamon

Description

If ratekeeper cannot read the list of storage servers in the cluster, it still computes the transaction limit as if there were no storage server issues. This is particularly problematic when the reason we can't read the list is that the storage servers themselves are having issues.

For example, if all storage servers fall sufficiently far behind, attempts to read the list of storage servers will repeatedly fail with future_version errors. Once ratekeeper stops controlling on storage server state, a write-only workload can keep running and dig the cluster into a deeper hole.
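
To make the failure mode concrete, here is a minimal sketch of one possible fail-safe: treat a stale storage server list as a reason to throttle rather than as a sign of health. This is not ratekeeper's actual code. The names `StorageListState` and `transaction_limit`, the `stale_after` threshold, and the `fallback_limit` value are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageListState:
    """Hypothetical record of the last successful read of the storage server list."""
    last_success_time: Optional[float]  # seconds; None if the list was never read


def transaction_limit(
    state: StorageListState,
    now: float,
    limit_from_storage: float,
    stale_after: float = 5.0,     # assumed staleness threshold, not a real knob
    fallback_limit: float = 0.0,  # assumed conservative limit while blind
) -> float:
    """Return a transaction limit that fails safe when the server list is stale."""
    never_read = state.last_success_time is None
    if never_read or now - state.last_success_time > stale_after:
        # No trustworthy view of storage health: do not assume all is well.
        return min(limit_from_storage, fallback_limit)
    return limit_from_storage
```

With these assumed defaults, `transaction_limit(StorageListState(None), now=10.0, limit_from_storage=1000.0)` clamps to the fallback, while a list read one second ago leaves the computed limit untouched. Whether the right fallback is zero, a small fixed rate, or the last known good limit is a design question this sketch does not settle.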
