This repository was archived by the owner on Jul 22, 2020. It is now read-only.

How to configure for HA Alertmanagers? #37

@mattbostock

We should determine and document how to configure Unsee when Alertmanager is running in a highly available (HA) setup.

Version 0.5.x of Alertmanager introduces HA capability, which roughly works as follows:

  • Prometheus must be configured to send alerts to all instances (Alertmanager instances do not share alert data between them)
  • Alertmanager shares silences and notification events between instances in the HA group using the Mesh gossip library
  • Alertmanager will avoid sending a notification if one of its peers has already sent a notification for the same alert

That model works well for sending notifications (you should receive a notification as long as the alert reached at least one of the peers), but less well for API queries, since each Alertmanager instance may have a different view of which alerts are currently firing.

Seems to me that the options are:

  1. Rely on one instance and accept that some alerts may be missing; if they're severe enough, we should be paged for them anyway.

  2. Try to poll all Alertmanager instances and merge the results.
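
For reference, option 2 could look roughly like the Python sketch below. It is only an illustration, not how Unsee works: the `/api/v1/alerts` path and the `{"data": [...]}` response shape are assumptions about the Alertmanager API, and `fetch_alerts`, `fingerprint` and `merge_alerts` are hypothetical names. Alerts are treated as identical when their label sets match.

```python
import json
from urllib.request import urlopen


def fetch_alerts(base_url, timeout=5):
    """Fetch the alert list from one Alertmanager instance.

    Assumption: the instance serves JSON at /api/v1/alerts shaped like
    {"data": [{"labels": {...}, ...}]}; adjust for your version.
    """
    url = base_url.rstrip("/") + "/api/v1/alerts"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)["data"]


def fingerprint(alert):
    """Identify an alert by its sorted label set (hypothetical dedup key)."""
    return tuple(sorted(alert["labels"].items()))


def merge_alerts(instances, fetch=fetch_alerts):
    """Return the union of alerts from all reachable instances."""
    merged = {}
    for url in instances:
        try:
            alerts = fetch(url)
        except (OSError, ValueError, KeyError):
            # An unreachable or misbehaving peer should not hide
            # the alerts that the other peers can see.
            continue
        for alert in alerts:
            # Keep the first copy seen of each distinct label set.
            merged.setdefault(fingerprint(alert), alert)
    return list(merged.values())
```

Even in this small form the merge has to pick a deduplication key and decide what to do when a peer is down, which is the extra complexity that option 1 avoids.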

My intuition says that option 1 is by far preferable, for simplicity's sake. If we agree, then we should document that approach.
