Skip to content

fix: recognise grafana_alert_rules in check_evidence_availability (#684)#685

Merged
VaibhavUpreti merged 1 commit intoTracer-Cloud:mainfrom
kaushal-bakrania:fix/check-evidence-grafana-alert-rules
Apr 19, 2026
Merged

fix: recognise grafana_alert_rules in check_evidence_availability (#684)#685
VaibhavUpreti merged 1 commit intoTracer-Cloud:mainfrom
kaushal-bakrania:fix/check-evidence-grafana-alert-rules

Conversation

@kaushal-bakrania
Copy link
Copy Markdown
Contributor

Summary

Context

Same drift-bug class as commit 3dbfb52 (EKS keys) and open PR #672 (Alertmanager / Coralogix / Honeycomb keys): the key is in the frozenset and emitted by _map_grafana_alert_rules, but the OR-chain guard was not updated in lockstep. A healthy Grafana alert whose only evidence is grafana_alert_rules fell through to _handle_insufficient_evidence instead of satisfying the evidence gate.

Test plan

  • pytest tests/nodes/root_cause_diagnosis/test_evidence_checker.py — 22 passed
  • ruff check app/nodes/root_cause_diagnosis/evidence_checker.py — clean
  • mypy app/nodes/root_cause_diagnosis/evidence_checker.py — clean

…acer-Cloud#684)

The key is already in _INVESTIGATED_EVIDENCE_KEYS and emitted by the
_map_grafana_alert_rules mapper, but check_evidence_availability's
has_cloudwatch_evidence OR-chain never probed it. A healthy alert whose
only evidence was grafana_alert_rules fell through to
_handle_insufficient_evidence instead of satisfying the evidence gate.

Same drift-bug class as commit 3dbfb52 (EKS) and open PR Tracer-Cloud#672 (AM/CL/HC).
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 19, 2026

Greptile Summary

This PR fixes a key-drift bug where grafana_alert_rules was present in _INVESTIGATED_EVIDENCE_KEYS and emitted by _map_grafana_alert_rules, but was missing from the has_cloudwatch_evidence OR-chain in check_evidence_availability — causing a healthy Grafana-alert-rules-only investigation to fall through to _handle_insufficient_evidence. The one-line addition is correct, targeted, and consistent with how the same bug was previously fixed for EKS keys.

Confidence Score: 5/5

Safe to merge — fix is correct, minimal, and all existing tests pass.

Only P2 findings remain: a suggestion to add a cross-reference comment documenting the coupling between _INVESTIGATED_EVIDENCE_KEYS and the OR-chain, and a suggestion for a direct check_evidence_availability regression test. Neither blocks merge.

No files require special attention.

Important Files Changed

Filename Overview
app/nodes/root_cause_diagnosis/evidence_checker.py Adds grafana_alert_rules to the has_cloudwatch_evidence OR-chain; the key was already present in _INVESTIGATED_EVIDENCE_KEYS — fix is correct and minimal.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[check_evidence_availability called] --> B{evidence keys present?}
    B -- grafana_alert_rules new --> C[has_cloudwatch_evidence = True]
    B -- grafana_logs / grafana_metrics / cloudwatch_logs / eks_* / etc. --> C
    B -- none of the above --> D[has_cloudwatch_evidence = False]
    C --> E[Return tuple: tracer / cloudwatch / alert evidence]
    D --> E
    E --> F{downstream router}
    F -- has_cloudwatch_evidence True --> G[Normal diagnosis path]
    F -- has_cloudwatch_evidence False --> H[_handle_insufficient_evidence - bug: grafana_alert_rules fell here before fix]
Loading

Comments Outside Diff (1)

  1. app/nodes/root_cause_diagnosis/evidence_checker.py, line 55-80 (link)

    P2 Structural drift risk between OR-chain and _INVESTIGATED_EVIDENCE_KEYS

    The has_cloudwatch_evidence OR-chain and _INVESTIGATED_EVIDENCE_KEYS are maintained in two separate places, which is exactly the class of bug this PR fixes. Several keys in the OR-chain (grafana_error_logs, grafana_traces, lambda_logs, s3_object, s3_marker, etc.) are not in _INVESTIGATED_EVIDENCE_KEYS, so a pure unification isn't possible — but any key added to _INVESTIGATED_EVIDENCE_KEYS in the future also needs a manual entry here. A short comment cross-referencing the frozenset would make the coupling explicit and reduce the chance of a future drift:

        # NOTE: every key in _INVESTIGATED_EVIDENCE_KEYS must also appear in this
        # OR-chain (see drift-bug history in commit 3dbfb521 and PRs #672, #684).
        has_cloudwatch_evidence = bool(
    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: app/nodes/root_cause_diagnosis/evidence_checker.py
    Line: 55-80
    
    Comment:
    **Structural drift risk between OR-chain and `_INVESTIGATED_EVIDENCE_KEYS`**
    
    The `has_cloudwatch_evidence` OR-chain and `_INVESTIGATED_EVIDENCE_KEYS` are maintained in two separate places, which is exactly the class of bug this PR fixes. Several keys in the OR-chain (`grafana_error_logs`, `grafana_traces`, `lambda_logs`, `s3_object`, `s3_marker`, etc.) are not in `_INVESTIGATED_EVIDENCE_KEYS`, so a pure unification isn't possible — but any key added to `_INVESTIGATED_EVIDENCE_KEYS` in the future also needs a manual entry here. A short comment cross-referencing the frozenset would make the coupling explicit and reduce the chance of a future drift:
    
    ```python
        # NOTE: every key in _INVESTIGATED_EVIDENCE_KEYS must also appear in this
        # OR-chain (see drift-bug history in commit 3dbfb521 and PRs #672, #684).
        has_cloudwatch_evidence = bool(
    ```
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: app/nodes/root_cause_diagnosis/evidence_checker.py
Line: 55-80

Comment:
**Structural drift risk between OR-chain and `_INVESTIGATED_EVIDENCE_KEYS`**

The `has_cloudwatch_evidence` OR-chain and `_INVESTIGATED_EVIDENCE_KEYS` are maintained in two separate places, which is exactly the class of bug this PR fixes. Several keys in the OR-chain (`grafana_error_logs`, `grafana_traces`, `lambda_logs`, `s3_object`, `s3_marker`, etc.) are not in `_INVESTIGATED_EVIDENCE_KEYS`, so a pure unification isn't possible — but any key added to `_INVESTIGATED_EVIDENCE_KEYS` in the future also needs a manual entry here. A short comment cross-referencing the frozenset would make the coupling explicit and reduce the chance of a future drift:

```python
    # NOTE: every key in _INVESTIGATED_EVIDENCE_KEYS must also appear in this
    # OR-chain (see drift-bug history in commit 3dbfb521 and PRs #672, #684).
    has_cloudwatch_evidence = bool(
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: app/nodes/root_cause_diagnosis/evidence_checker.py
Line: 62

Comment:
**No direct `check_evidence_availability` test for this key**

The parametrized test in `TestIsClearlyHealthyExistingSources` covers `grafana_alert_rules` for `is_clearly_healthy`, but there is no test that calls `check_evidence_availability` directly with `{"grafana_alert_rules": ...}` and asserts `has_cloudwatch_evidence is True`. Given this is the exact code path the bug affected, a targeted test here would provide a regression lock.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix: recognise grafana_alert_rules in ch..." | Re-trigger Greptile

or evidence.get("grafana_error_logs") is not None
or evidence.get("grafana_traces") is not None
or evidence.get("grafana_metrics") is not None
or evidence.get("grafana_alert_rules") is not None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 No direct check_evidence_availability test for this key

The parametrized test in TestIsClearlyHealthyExistingSources covers grafana_alert_rules for is_clearly_healthy, but there is no test that calls check_evidence_availability directly with {"grafana_alert_rules": ...} and asserts has_cloudwatch_evidence is True. Given this is the exact code path the bug affected, a targeted test here would provide a regression lock.

Prompt To Fix With AI
This is a comment left during a code review.
Path: app/nodes/root_cause_diagnosis/evidence_checker.py
Line: 62

Comment:
**No direct `check_evidence_availability` test for this key**

The parametrized test in `TestIsClearlyHealthyExistingSources` covers `grafana_alert_rules` for `is_clearly_healthy`, but there is no test that calls `check_evidence_availability` directly with `{"grafana_alert_rules": ...}` and asserts `has_cloudwatch_evidence is True`. Given this is the exact code path the bug affected, a targeted test here would provide a regression lock.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Member

@VaibhavUpreti VaibhavUpreti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome @kaushal-bakrania !

@VaibhavUpreti VaibhavUpreti merged commit 61e567e into Tracer-Cloud:main Apr 19, 2026
8 checks passed
@kaushal-bakrania kaushal-bakrania deleted the fix/check-evidence-grafana-alert-rules branch April 19, 2026 23:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: grafana_alert_rules key missing from check_evidence_availability OR-chain

2 participants