fix(airflow): correct tool source, StrictConfigModel, surfaces, exception-recovery test #1077

yashksaini-coder merged 3 commits into main
Conversation
…s, test recovery path

- Fix P1: change source='tracer_web' → source='airflow' on all 3 Airflow @tool decorators so the prioritizer's +2 score bump fires correctly when 'airflow' appears in detected sources (prioritization.py:58)
- Add 'airflow' to EvidenceSource Literal in app/types/evidence.py
- Fix P2: switch AirflowConfig from BaseModel → StrictConfigModel (extra='forbid') to match the rest of the integration config models
- Add surfaces=('investigation', 'chat') to all 3 tools, matching the convention used by other integrations (PostgreSQL, MySQL, GitHub, etc.)
- Add P2 test: test_get_recent_airflow_failures_partial_run_error_preserves_evidence covers the exception-recovery loop in get_recent_airflow_failures — verifies evidence from a successful run is preserved when another run's task-instance fetch raises HTTPStatusError

Follows up on #570.

Co-authored-by: Copilot <[email protected]>
Greptile Summary

This PR fixes the root cause of Airflow tools never receiving the +2 prioritization score boost: all three tools were registered with `source="tracer_web"`.

Confidence Score: 5/5. Safe to merge — all changes are targeted bug fixes with no regressions. All four changes are correct and complete: the `source` field is fixed on all three tools, `EvidenceSource` is updated, the `StrictConfigModel` migration matches every other integration config, and the new test exercises a previously-untested code path. No P0 or P1 issues found. No files require special attention.

Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant P as Prioritizer
    participant R as ToolRegistry
    participant T as TracerAirflowDAGTool
    participant C as AirflowConfig (StrictConfigModel)
    participant A as Airflow API
    P->>R: get_available_actions()
    R->>T: load tools (source="airflow")
    P->>P: score += 2 (source "airflow" matches incident sources)
    P->>T: get_recent_airflow_failures(config, dag_id)
    T->>C: build_airflow_config(raw) — extra fields rejected early
    T->>A: GET /dags/{dag_id}/dagRuns
    A-->>T: dag_runs[]
    loop for each dag_run
        T->>A: GET /dagRuns/{run_id}/taskInstances
        alt HTTP 500
            A-->>T: HTTPStatusError
            T->>T: log warning, continue (evidence preserved)
        else OK
            A-->>T: task_instances[]
            T->>T: filter failed/up_for_retry states
        end
    end
    T-->>P: evidence[]
```
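The per-run recovery path in the diagram can be sketched as follows. This is a minimal sketch, not the real `get_recent_airflow_failures`: `HTTPStatusError` here is a local stand-in for `httpx.HTTPStatusError`, and `fetch_task_instances` is a hypothetical callable standing in for the Airflow REST call.

```python
# Sketch of the per-run recovery loop: a failing task-instance fetch for
# one run must not discard evidence already collected from other runs.
import logging

logger = logging.getLogger(__name__)


class HTTPStatusError(Exception):
    """Local stand-in for httpx.HTTPStatusError."""


def collect_failures(fetch_task_instances, run_ids):
    evidence = []
    for run_id in run_ids:
        try:
            task_instances = fetch_task_instances(run_id)
        except HTTPStatusError as exc:
            # Log a warning and continue -- evidence gathered so far survives.
            logger.warning("task-instance fetch failed for run %s: %s", run_id, exc)
            continue
        for ti in task_instances:
            if ti["state"] in {"failed", "up_for_retry"}:
                evidence.append({"run_id": run_id, "task_id": ti["task_id"]})
    return evidence
```

The new test exercises exactly this branch: one run's fetch raises, the loop continues, and the evidence list built from the successful run is returned intact.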
Pull request overview
Follow-up to prior Airflow integration work to ensure Airflow tools are correctly prioritized and consistently registered, while tightening config validation and adding missing exception-recovery coverage.
Changes:

- Corrected Airflow tool metadata (`source="airflow"`) so action prioritization properly scores these tools for Airflow incidents.
- Updated `AirflowConfig` to inherit from `StrictConfigModel` (fail-fast on unknown config fields).
- Added a unit test that exercises per-run exception recovery in `get_recent_airflow_failures` to ensure partial evidence is preserved.
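A minimal sketch of what the `StrictConfigModel` change buys, assuming `StrictConfigModel` is essentially a pydantic `BaseModel` with `extra="forbid"`; the field names (`base_url`, `api_token`) are illustrative, not the real schema:

```python
# Hypothetical sketch: a strict base model rejects unknown keys at
# validation time instead of silently dropping them.
from pydantic import BaseModel, ConfigDict


class StrictConfigModel(BaseModel):
    # Unknown fields raise ValidationError instead of being ignored.
    model_config = ConfigDict(extra="forbid")


class AirflowConfig(StrictConfigModel):
    base_url: str
    api_token: str


AirflowConfig(base_url="http://airflow:8080", api_token="t")  # ok
# AirflowConfig(base_url=..., api_token=..., tokne=...) raises ValidationError,
# surfacing typos from the config store early rather than at request time.
```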
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `tests/integrations/test_airflow.py` | Adds a regression test covering partial API failure behavior while preserving previously collected evidence. |
| `app/types/evidence.py` | Extends the canonical `EvidenceSource` set with `"airflow"` so tooling/source logic can recognize it. |
| `app/tools/TracerAirflowDAGTool/__init__.py` | Fixes tool source to `"airflow"` and adds `surfaces=("investigation","chat")` for consistency and prioritization. |
| `app/integrations/airflow.py` | Switches `AirflowConfig` to `StrictConfigModel` to forbid unknown fields and normalize inputs consistently. |
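The `EvidenceSource` extension and the import-time check it feeds can be sketched like this; the names mirror the PR but the shapes are simplified, and the real Literal has more members:

```python
# Sketch of validating a tool's declared source against the EvidenceSource
# Literal at import time via __init_subclass__, as the PR description notes.
from typing import Literal, get_args

EvidenceSource = Literal["tracer_web", "airflow"]  # real Literal has more members


class BaseTool:
    source: str = ""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Fail at import time if a subclass declares an unknown source.
        if cls.source not in get_args(EvidenceSource):
            raise TypeError(f"unknown evidence source: {cls.source!r}")


class TracerAirflowDAGTool(BaseTool):
    source = "airflow"  # accepted only because "airflow" is in the Literal
```

This is why adding `"airflow"` to the Literal had to ship with the source change: without it, the corrected `source="airflow"` would fail at import.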
search_logs() returns {success, logs, total} directly — not a nested
{logs: {success, logs}, monitors, events} shape like fetch_all() does.
Three tests were written against the wrong shape:
- test_search_logs_success: removed monitors/events assertions, fixed
result['logs'][0]['message'] (was result['logs']['logs'][0]['message'])
- test_search_logs_empty_data: fixed result['logs'] == [] (was
result['logs']['logs']), removed incorrect monitors/events checks
- test_search_logs_http_error: fixed result['error'] (was
result['logs']['error']), removed unnecessary mock_instance.get stub
All 19 tests in test_datadog_client.py now pass.
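The flat-versus-nested distinction the fix hinges on can be shown with hypothetical stubs; the payloads below are illustrative, not real Datadog data:

```python
# Illustrative stubs contrasting the two return shapes: search_logs() is
# flat, while fetch_all() nests per-category results one level deeper.
def search_logs(query: str) -> dict:
    logs = [{"message": "disk full"}]
    return {"success": True, "logs": logs, "total": len(logs)}


def fetch_all() -> dict:
    return {
        "logs": {"success": True, "logs": [{"message": "disk full"}]},
        "monitors": {"success": True, "monitors": []},
        "events": {"success": True, "events": []},
    }


result = search_logs("service:web")
assert result["logs"][0]["message"] == "disk full"  # correct, flat access
# result["logs"]["logs"][0] -- the old assertion -- would fail here,
# because result["logs"] is already the list of log entries.
```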
Co-authored-by: Copilot <[email protected]>
…t.py from main

The file was removed from main (cb37b6e). Accepting the upstream deletion.

Co-authored-by: Copilot <[email protected]>
Nice @yashksaini-coder, I was about to follow up on this as well. Thanks for taking care of it and adding the extra cleanup and tests!

💜 One more reason the project grows. Thanks @yashksaini-coder — your contribution just landed! 👋 Join us on Discord - OpenSRE: hang out, contribute, or hunt for features and issues. Everyone's welcome.

Follow-up to #570 + fixes pre-existing test failures
Addresses all remaining issues from the post-merge review comment: #570 (comment)
Changes
P1 — Fix `source="airflow"` on all 3 tools (was `"tracer_web"`)

Root cause: `prioritization.py:58` awards a `+2` score when `action.source in sources`. For an Airflow incident `sources` contains `"airflow"`, but all 3 tools had `source="tracer_web"` — so they scored 0 instead of 2 during prioritization.

Also added `"airflow"` to the `EvidenceSource` Literal (`app/types/evidence.py`), which is validated at import time by `BaseTool.__init_subclass__()`.

P2 — `AirflowConfig(BaseModel)` → `AirflowConfig(StrictConfigModel)`

Matches every other integration config model (`DatadogConfig`, `GrafanaConfig`, etc.). `StrictConfigModel` sets `extra="forbid"` to reject unexpected fields from the store early.

P2 — Exception-recovery test

New test `test_get_recent_airflow_failures_partial_run_error_preserves_evidence`:

- `run_ok` returns a failed task, `run_bad` raises `HTTPStatusError` (500)
- verifies evidence from `run_ok` is preserved and `run_bad` was actually attempted

Nit — `surfaces=("investigation", "chat")` on all 3 tools

Matches the convention used by `PostgreSQLTableStatsTool`, `GitHubCommitsTool`, `ClickHouseSystemHealthTool`, and others.

Bonus — Fix 3 pre-existing broken `test_datadog_client` tests

`search_logs()` returns `{"success": True, "logs": [...], "total": N}` — a flat dict. Three tests (`test_search_logs_success`, `test_search_logs_empty_data`, `test_search_logs_http_error`) were written as if it returns `fetch_all()` output (nested `{logs: {...}, monitors: {...}, events: {...}}`). Fixed assertions to match the actual return shape.

Test results

Files changed

- `app/types/evidence.py`: add `"airflow"` to `EvidenceSource` Literal
- `app/integrations/airflow.py`: `BaseModel` → `StrictConfigModel`
- `app/tools/TracerAirflowDAGTool/__init__.py`: `source="airflow"`, `surfaces=("investigation","chat")` on all 3 tools
- `tests/integrations/test_airflow.py`: new exception-recovery test
- `tests/services/test_datadog_client.py`: fixed `search_logs` assertions
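The P1 scoring rule can be sketched as follows. This is a minimal illustration of the check described at `prioritization.py:58`; the real prioritizer presumably works on richer action objects, and the `Action` shape here is hypothetical.

```python
# Minimal sketch of the +2 source bump: an action whose source appears in
# the incident's detected sources outranks one that does not.
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    source: str


def score(action: Action, sources: set[str]) -> int:
    points = 0
    if action.source in sources:  # the check described at prioritization.py:58
        points += 2
    return points


incident_sources = {"airflow"}
fixed = Action("get_recent_airflow_failures", source="airflow")
broken = Action("get_recent_airflow_failures", source="tracer_web")
assert score(fixed, incident_sources) == 2   # after the fix
assert score(broken, incident_sources) == 0  # why the bump never fired before
```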