[BUG] is_clearly_healthy short-circuit ignores Alertmanager, Coralogix, and Honeycomb evidence keys #670

@kaushal-bakrania

Summary

_INVESTIGATED_EVIDENCE_KEYS in app/nodes/root_cause_diagnosis/evidence_checker.py#L16-L36 is missing five keys that the investigation merge step actually writes to the evidence dict: alertmanager_alerts, alertmanager_silences, coralogix_logs, coralogix_error_logs, honeycomb_traces (see the _map_alertmanager_* / _map_coralogix_logs / _map_honeycomb_traces entries in app/nodes/investigate/processing/post_process.py).

As a result, condition 4 of is_clearly_healthy() never fires for pure-Alertmanager, pure-Coralogix, or pure-Honeycomb healthy states. This is the same drift bug that #582 fixed for EKS.
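To illustrate the mechanism, here is a minimal sketch of how an allow-list check like this can silently stop matching. It is not the real implementation: the body of is_clearly_healthy() is not shown in this issue, so the function names, the "info"-only severity rule, and the reading of condition 4 as "at least one known evidence key is present" are all assumptions made for the example.

```python
# Hypothetical, simplified stand-in for the real check. The actual logic in
# evidence_checker.py is not reproduced here; every name ending in _sketch
# or _SKETCH is invented for illustration.
INVESTIGATED_KEYS_SKETCH = frozenset({"grafana_logs"})  # illustrative subset


def investigated_sketch(evidence: dict) -> bool:
    """Assumed form of condition 4: some known evidence key is present."""
    return any(key in INVESTIGATED_KEYS_SKETCH for key in evidence)


def is_clearly_healthy_sketch(alert: dict, evidence: dict) -> bool:
    """Short-circuit only for resolved, info-severity alerts with known evidence."""
    resolved = alert.get("state") == "resolved"
    low_severity = alert.get("commonLabels", {}).get("severity") == "info"
    return resolved and low_severity and investigated_sketch(evidence)


alert = {"state": "resolved", "commonLabels": {"severity": "info"}, "commonAnnotations": {}}

# A key on the allow-list lets the fast path fire.
listed = is_clearly_healthy_sketch(alert, {"grafana_logs": []})
# A key the merge step writes but the allow-list omits defeats it.
unlisted = is_clearly_healthy_sketch(alert, {"coralogix_logs": []})
print(listed, unlisted)
```

Under these assumptions the alert and evidence are equally healthy in both calls; only the membership of the key in the allow-list differs, which is why the drift shows up as extra cost rather than a wrong diagnosis.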

Impact

Every resolved low-severity alert from these three stacks pays a full LLM RCA round-trip instead of taking the fast path. The impact is extra cost and latency; correctness is not affected.

Reproduction

python3 -c "
from app.nodes.root_cause_diagnosis.evidence_checker import is_clearly_healthy, _INVESTIGATED_EVIDENCE_KEYS
missing = [k for k in ('alertmanager_alerts','alertmanager_silences','coralogix_logs','coralogix_error_logs','honeycomb_traces') if k not in _INVESTIGATED_EVIDENCE_KEYS]
print('missing:', missing)

alert = {'state':'resolved','commonLabels':{'severity':'info'},'commonAnnotations':{}}
for key in ('alertmanager_alerts','coralogix_logs','honeycomb_traces','grafana_logs'):
    print(f'  {key:25s} short-circuit -> {is_clearly_healthy(alert, {key: []})}')
"

Fix PR incoming.
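One way to keep the two lists from drifting again is a small guard that compares the keys the merge step writes against the allow-list. The sketch below is self-contained and uses stand-in sets; how the real mapper keys in post_process.py would be collected is not shown in this issue, so WRITTEN_KEYS_SKETCH, INVESTIGATED_KEYS_SKETCH, and find_drift are hypothetical names.

```python
# Hypothetical stand-ins. In the real code these would come from the mappers
# in post_process.py and from _INVESTIGATED_EVIDENCE_KEYS respectively.
WRITTEN_KEYS_SKETCH = {
    "grafana_logs",
    "alertmanager_alerts",
    "alertmanager_silences",
    "coralogix_logs",
    "coralogix_error_logs",
    "honeycomb_traces",
}
INVESTIGATED_KEYS_SKETCH = {"grafana_logs"}  # the allow-list before the fix


def find_drift(written: set, investigated: set) -> list:
    """Return keys that are written to evidence but absent from the allow-list."""
    return sorted(set(written) - set(investigated))


drift = find_drift(WRITTEN_KEYS_SKETCH, INVESTIGATED_KEYS_SKETCH)
print("missing:", drift)
```

A test asserting that this list is empty would fail as soon as a new mapper key is added without updating the allow-list.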
