Conversation
inhibit/inhibit.go
Outdated
| } | ||
| } | ||
| ih.marker.SetInhibited(fp, false) | ||
| ih.marker.SetStatus(fp, types.Active) |
There was a problem hiding this comment.
I'm not sure if this is semantically correct. It appears that alerts will be marked in the notify pipeline, so setting them to active here would be temporary until they get checked as part of the pipeline. Can someone confirm? @brancz
There was a problem hiding this comment.
If we don't know the real status until notify can re-check it again then maybe we should reset it into types.Unprocessed, wouldn't that make more sense?
There was a problem hiding this comment.
Yeah, it has to go through both the inhibit and silence exec stages, so it'll be set correctly.
types/types.go
Outdated
| type state uint8 | ||
|
|
||
| const ( | ||
| Unprocessed state = iota |
There was a problem hiding this comment.
I'm not sure if Unprocessed is the best name for it, I was thinking about calling it New but that on the other hand might be mistaken for an alert that just fired rather than indicate what we really mean here.
There was a problem hiding this comment.
Yeah I'm not sure what name to provide. I don't feel strongly one way or the other.
types/types.go
Outdated
| status = "unknown" | ||
| } | ||
|
|
||
| // TODO: Do we want to return all values? |
There was a problem hiding this comment.
This would make sense for Silenced status as there might be multiple silences that match our alert. But this means that notify would need to check all, which I see that this PR is adding, so I would say yes, we should return everything.
8181492 to
ac6f8eb
Compare
types/types.go
Outdated
| type State uint8 | ||
|
|
||
| const ( | ||
| Unprocessed State = iota |
There was a problem hiding this comment.
Always prefix enums with the original type.
The type should be AlertState to know what it's even about. Thereby the values would be AlertStateUnprocessed etc.
| } | ||
|
|
||
| func TestAlertsGC(t *testing.T) { | ||
| dir, err := ioutil.TempDir("", "alerts_test") |
types/types.go
Outdated
| Unprocessed: "unprocessed", | ||
| Active: "active", | ||
| Silenced: "silenced", | ||
| Inhibited: "inhibited", |
There was a problem hiding this comment.
An alert can be silenced and inhibited. More so once we move towards a model where this is shown immediately and not just after initial processing.
So "unprocessed", "active", and "suppressed" might be more accurate? @brian-brazil
There was a problem hiding this comment.
Sounds understandable to me
There was a problem hiding this comment.
I think there's a difference between silenced and inhibited from the user perspective and Alertmanager should distinguish between these two states. Inhibition and silencing is used for different purposes. One is "oh it's silenced, someone is working on it" the other is "oh it's caused by some other issue" (btw. there's no way to ACK an alert, so silences might be used as ACKs).
If alert is inhibited than it's clear that it's suppressed by another alert, so when that other alert is resolved our inhibited alert will fire. Silenced alert on the other hand will never fire while the silence is active. If you just see "suppressed" you won't know what that really means and whenever this alert will fire if there's an action to take in the near future (it won't if it's silenced).
There was a problem hiding this comment.
True, but then from AlertStatus we could infer what sort of suppressed it is: if it has silenceIDs in the silencedBy field, it's silenced, if it has alertIDs in the inhibitedBy field it's inhibited. I suppose I'm not a big fan of having to infer the state, though.
types/types.go
Outdated
| // AlertStatus stores the state and values associated with an Alert. | ||
| type AlertStatus struct { | ||
| status State | ||
| values []string |
There was a problem hiding this comment.
Just having silence IDs here as a bunch of strings might fit with the UI currently offers, but it has semantically no real meaning from an API perspective.
What if we also want to show which alerts inhibited the given alert? How would we distinguish its ID from that of a silence.
I think the status should rather have "status", "silencedBy" (list of silence IDs), and "inhibitedBy" (list of alert IDs).
Also the attributes are both unexported so they shouldn't be marshalled by the JSON decoder, no?
There was a problem hiding this comment.
AlertStatus implements the marshaler interface so we can convert the enum into a string representation, so that handles marshaling the unexported fields.
I flatten the AlertStatus inside APIAlert in dispatch.go:139, but I'm not sure if that's the "right" way to do it.
inhibit/inhibit.go
Outdated
| // distinguish AlertStateSuppressed between silenced | ||
| // and/or inhibited. In the future there will be unique | ||
| // identifiers for alerts. | ||
| ih.marker.SetInhibited(fp, "123") |
There was a problem hiding this comment.
I put these semantics in place now even though we won't use them. This does lead to potential confusing behavior if we attempt to SetInhibited elsewhere before unique identifiers for alerts exist, but the symmetry with SetSilenced made sense to me.
This will be used to track and expose the status of an alert, it will replace dedicated fields in the Marker
Serialize and return all values to clients.
The inhibited stage is the first in the notify pipeline. If the alert here isn't inhibited and the pipeline continues, the alert's state will correctly be changed to Active or Silenced in the silence stage.
Differentiate based on whether the state stores a list of silenceIDs or inhibitIDs
5d03da7 to
5d1464b
Compare
885c14f to
b1c6c67
Compare
| // identifiers for alerts. | ||
| ih.marker.SetInhibited(fp, "123") | ||
| if inhibitedByFP, eq := r.hasEqual(lset); r.TargetMatchers.Match(lset) && eq { | ||
| ih.marker.SetInhibited(fp, fmt.Sprintf("%d", inhibitedByFP)) |
|
👍 |
|
thanks for getting this started @prymitive !!! |
|
Is there any current estimate when would the next release be out? One that would contain this and possibly #748 |
|
A patch release should go out very soon to release #750 |
Adding on to #711