feat(autofix): migrate to explorer agent #104615
Conversation
Codecov Report
❌ Patch coverage is … Additional details and impacted files:
@@            Coverage Diff            @@
##           master   #104615   +/-  ##
=========================================
  Coverage   80.51%    80.52%
=========================================
  Files        9329      9337       +8
  Lines      400842    401462     +620
  Branches    25705     25705
=========================================
+ Hits       322756    323278     +522
- Misses      77620     77718      +98
  Partials      466       466
@@ -0,0 +1,11 @@
CODE_CHANGES_PROMPT = """Implement the fix for issue {short_id}: "{title}" (culprit: {culprit})
will be less error-prone to wrap this in a function which inputs parameters and outputs a dedented formatted string
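A minimal sketch of that suggestion (the helper name and `textwrap.dedent` usage are illustrative, not from the PR):

```python
from textwrap import dedent


def code_changes_prompt(short_id: str, title: str, culprit: str) -> str:
    """Hypothetical helper: takes issue fields, returns a dedented formatted prompt."""
    return dedent(
        f"""\
        Implement the fix for issue {short_id}: "{title}" (culprit: {culprit})
        """
    )
```

This keeps the template next to its parameters and avoids stray indentation leaking into the prompt.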
When you have enough information, generate the triage artifact with:
- suspect_commit: If you can identify a likely culprit commit:
  - sha: The git commit SHA
double checking that explorer will see short SHAs, not the 40 char ones?
yeah, this whole model will need to be reworked soon to enable better UI, just have this here as a placeholder so I can test in prod with real data
@@ -0,0 +1,92 @@
from __future__ import annotations
- five_whys: Chain of "why" statements leading to the root cause.
- reproduction_steps: Steps that would reproduce this issue
i like the event timeline artifact + prompting in current RCA. "repro steps" could be reasonably interpreted as how to repro in a test env, rather than how the issue/event at hand unfolded
that's fair, but I was intentionally going for a minimal, test env repro. I kinda like it better. But we'll see, it's very easy to change prompts and UI once we can feel it out in prod
@@ -0,0 +1,17 @@
"""
Prompts for Explorer-based Autofix steps.
nit: prompts in one file seems simpler. same is done for schemas. then no need for init logic and can order them more linearly
Guidelines:
1. Use your tools to fetch the issue details and examine the evidence
2. Investigate the trace, replay, logs, other issues, trends, and other telemetry when available to gain a deeper understanding of the issue
(for later) does transitioning to explorer mean we can encourage it to look at other events in the issue to capture a slightly broader picture? feel like it's nicer for RCA to not mention transient, event-specific info. maybe explorer will search events on its own if it needs to, since i see there's already prompting + tool descs around issue vs event
yep this is definitely possible. we can prompt engineer later, but in local testing on mock issues, explorer was already using Discover to find multiple samples so it might just happen naturally
    metadata=metadata,
)
else:
    return client.continue_run(
how does context accumulation work? is it that all artifacts for a run_id are included as context? feel like the impact and triage runs should only have RCA as context, not sure
yeah, the whole run is one continuous chat, and the agent can freely update any artifact it has previously been tasked with generating when it deems it appropriate. This is intentional because the siloing of steps in Autofix was what made the UX so rigid (imagine investigating the solution and the agent realizes the root cause was wrong: you'd have to rethink all the way up the convo instead of it just editing the root cause with the new info)
    artifacts: dict[str, Artifact],
) -> None:
    """
    Continue to the next step if stopping_point hasn't been reached.
would be cool to trigger triage and impact concurrently w/ solution
it would be, i just didn't want to mess with automation too much in this PR. we can def discuss what ideal automation flow looks like in a world with 5 steps and steps that can run in any order
Adds UI to support the new explorer-backed Autofix agent. Controlled by the explorer FF and a new one; the old UI should be preserved as-is when not behind the flags.

<img width="1610" height="1686" alt="image" src="https://github.com/user-attachments/assets/c8b095a5-e2f0-4c58-af53-a9f291f55a6d" />
<img width="720" height="591" alt="image" src="https://github.com/user-attachments/assets/1a0fb45f-5825-4f4f-be02-3e1ae893a94c" />

Have tickets for follow-up work like background agents, better code changes display, and a better suspect commit card.

Backend: #104615
Part of AIML-2004 and AIML-1732
- impacts: List of specific impacts, each with:
  - label: What is impacted (e.g., "User Authentication", "Payment Flow")
  - impact_description: One line describing the impact
  - evidence: Evidence or reasoning for this assessment
Bug: Impact assessment prompt missing required `rating` field
The `ImpactItem` schema in `artifact_schemas.py` requires a `rating` field of type `Literal["low", "medium", "high"]`, but the `impact_assessment_prompt` function only instructs the agent to include `label`, `impact_description`, and `evidence`. Since `rating` is not optional (no default value), artifacts generated by the agent following these prompt instructions will fail Pydantic validation when the schema is applied.
Additional Locations (1)
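A minimal repro of the mismatch, assuming the schema shape described above (field names come from the comment; this is a sketch, not the real `ImpactItem` from `artifact_schemas.py`):

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class ImpactItem(BaseModel):
    """Sketch of the schema described above, not the real class."""

    label: str
    impact_description: str
    evidence: str
    rating: Literal["low", "medium", "high"]  # required: no default, so omitting it fails


# An artifact carrying only the fields the prompt asks for fails validation:
try:
    ImpactItem(
        label="Payment Flow",
        impact_description="checkout requests error out",
        evidence="spike in 500s on /checkout",
    )
except ValidationError as e:
    missing = [err["loc"] for err in e.errors()]
    print(missing)  # [('rating',)]
```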
elif "impact_assessment" in artifacts and artifacts["impact_assessment"].data:
    webhook_event = "impact_assessment_completed"
    webhook_payload["impact_assessment"] = artifacts["impact_assessment"].data
Bug: Invalid webhook event types for triage and impact
The `_send_step_webhook` method uses event names `"triage_completed"` and `"impact_assessment_completed"`, but these are not defined in the `SentryAppEventType` enum. The `broadcast_webhooks_for_organization` task validates event types against this enum and raises `SentryAppSentryError` for invalid types (lines 949-956 in `sentry_apps.py`). This will cause webhook failures when the triage or impact_assessment steps complete.
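A sketch of the failure mode, under the stated assumption that the task only dispatches events present in the enum (the enum members below are an illustrative subset, not the real list from sentry):

```python
import enum


class SentryAppEventType(enum.Enum):
    # Illustrative subset; the real enum lives in the sentry codebase.
    ROOT_CAUSE_COMPLETED = "root_cause_completed"
    SOLUTION_COMPLETED = "solution_completed"


VALID_EVENTS = {e.value for e in SentryAppEventType}


def check_event(event: str) -> None:
    # Mirrors the validation described above: unknown events raise instead of dispatching.
    if event not in VALID_EVENTS:
        raise ValueError(f"invalid webhook event type: {event}")


check_event("root_cause_completed")  # ok
# check_event("triage_completed") would raise, matching the reported bug
```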
elif "root_cause" in artifacts and artifacts["root_cause"].data:
    webhook_event = "root_cause_completed"
    webhook_payload["root_cause"] = artifacts["root_cause"].data
Bug: Webhook step detection wrong when triage/impact artifacts exist
The `_send_step_webhook` method determines which step completed by checking artifact presence in a fixed order (triage first, then impact_assessment, etc.). Since artifacts accumulate across all steps in a run, if triage or impact_assessment ran before other pipeline steps, subsequent step completions will incorrectly trigger the triage/impact webhook instead of the correct one. For example, if a user runs root_cause, then triage, then solution, the solution completion would check `"triage" in artifacts` first (true from the earlier step) and incorrectly send a triage webhook instead of a solution webhook.
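One possible fix, sketched under the assumption that the pipeline knows which step just finished and can thread it through (function and key names are illustrative, not the real code):

```python
def pick_webhook_event(completed_step: str, artifacts: dict[str, dict]) -> tuple[str, dict]:
    """Derive the webhook from the step that just finished, not from artifact presence.

    Avoids the fixed-order `if "triage" in artifacts` scan, which misfires once
    artifacts from earlier steps have accumulated on the run.
    """
    payload: dict[str, dict] = {}
    data = artifacts.get(completed_step)
    if data:
        payload[completed_step] = data
    return f"{completed_step}_completed", payload


# A leftover triage artifact no longer shadows a later solution completion:
artifacts = {"triage": {"suspect_commit": "abc123"}, "solution": {"plan": "..."}}
event, payload = pick_webhook_event("solution", artifacts)
assert event == "solution_completed"
assert "triage" not in payload
```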
Runs Autofix's 3 steps plus 2 new ones on the Seer Explorer Client rather than calling its own Seer agent. Supports both manual runs and automation runs.
Controlled by the seer explorer feature flag and a new one, so we can start with internal testing for now.
Does not currently support automation or manual handoff to 3rd-party agents, but I have a ticket to add that back along with other improvements (e.g. the suspect commit artifact will need improvement too).
Frontend: #104618
Part of AIML-2004