Summary
_INVESTIGATED_EVIDENCE_KEYS in app/nodes/root_cause_diagnosis/evidence_checker.py#L16-L36 is missing five keys that the investigation merge step actually writes to the evidence dict: alertmanager_alerts, alertmanager_silences, coralogix_logs, coralogix_error_logs, honeycomb_traces (see the _map_alertmanager_* / _map_coralogix_logs / _map_honeycomb_traces entries in app/nodes/investigate/processing/post_process.py).
Condition 4 of is_clearly_healthy() therefore never fires for pure-Alertmanager, pure-Coralogix, or pure-Honeycomb healthy states. Same drift bug that #582 fixed for EKS.
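The failure mode can be illustrated with a minimal, self-contained sketch. It does not reproduce the real is_clearly_healthy(). The names INVESTIGATED_KEYS_BEFORE, INVESTIGATED_KEYS_AFTER, and was_investigated are hypothetical, and the membership check is only an assumption about how condition 4 consults the allowlist. grafana_logs is assumed to be in the allowlist already because the reproduction uses it as the control key.

```python
# Hypothetical stand-in for the allowlist; the real set lives in evidence_checker.py
# and contains more keys than shown here.
INVESTIGATED_KEYS_BEFORE = frozenset({"grafana_logs"})

# The five keys this issue reports as missing.
MISSING_KEYS = frozenset({
    "alertmanager_alerts",
    "alertmanager_silences",
    "coralogix_logs",
    "coralogix_error_logs",
    "honeycomb_traces",
})

INVESTIGATED_KEYS_AFTER = INVESTIGATED_KEYS_BEFORE | MISSING_KEYS


def was_investigated(evidence: dict, allowlist: frozenset) -> bool:
    """Assumed shape of condition 4: some allowlisted key is present in evidence."""
    return any(key in allowlist for key in evidence)


if __name__ == "__main__":
    for key in ("alertmanager_alerts", "coralogix_logs", "honeycomb_traces", "grafana_logs"):
        evidence = {key: []}  # key present but empty, as in the reproduction
        before = was_investigated(evidence, INVESTIGATED_KEYS_BEFORE)
        after = was_investigated(evidence, INVESTIGATED_KEYS_AFTER)
        print(f"{key:25s} before={before} after={after}")
```

Under these assumptions, an evidence dict holding only one of the five missing keys is treated as "nothing investigated", so the fast path is skipped even though the investigation ran and found nothing.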
Impact
Every resolved low-severity alert from these three stacks pays a full LLM RCA round-trip instead of taking the fast path. Cost + latency, not correctness.
Reproduction
python3 -c "
from app.nodes.root_cause_diagnosis.evidence_checker import is_clearly_healthy, _INVESTIGATED_EVIDENCE_KEYS
missing = [k for k in ('alertmanager_alerts','alertmanager_silences','coralogix_logs','coralogix_error_logs','honeycomb_traces') if k not in _INVESTIGATED_EVIDENCE_KEYS]
print('missing:', missing)
alert = {'state':'resolved','commonLabels':{'severity':'info'},'commonAnnotations':{}}
for key in ('alertmanager_alerts','coralogix_logs','honeycomb_traces','grafana_logs'):
    print(f' {key:25s} short-circuit -> {is_clearly_healthy(alert, {key: []})}')
"
Fix PR incoming.