Skip to content

feat(code_review): Set Sentry tags & context#108435

Merged
armenzg merged 22 commits intomasterfrom
code-review-sdk-tags
Feb 20, 2026
Merged

feat(code_review): Set Sentry tags & context#108435
armenzg merged 22 commits intomasterfrom
code-review-sdk-tags

Conversation

@armenzg
Copy link
Copy Markdown
Member

@armenzg armenzg commented Feb 18, 2026

Summary

  • Add sentry_sdk.set_tags() in the code review Celery task (process_github_webhook_event) to enrich errors with correlation metadata
  • Uses the same tag names as Seer's extract_context() so errors can be searched consistently across both Sentry and Seer projects (e.g., scm_repo_full_name:getsentry/sentry pr_id:42)
  • Tags set: scm_provider, scm_owner, scm_repo_name, scm_repo_full_name, pr_id, sentry_organization_id, sentry_integration_id, github_event
  • Gracefully handles minimal payloads (e.g., check_run rerun events that lack repo data)

Previously, the Celery task had zero sentry_sdk usage, so exceptions captured by instrumented_task had no code-review-specific context for debugging.

Test plan

  • Added TestSetSentryTags with 3 tests: PR event payload, check_run minimal payload, missing owner/name
  • All 32 existing webhook tests still pass

Made with Cursor

…lation

Set minimal Sentry SDK tags in the code review Celery task using the
same naming conventions as Seer (scm_provider, scm_owner, pr_id, etc.)
so errors can be searched consistently across both projects.

Co-authored-by: Cursor <[email protected]>
@armenzg armenzg requested a review from a team as a code owner February 18, 2026 15:18
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Feb 18, 2026
@armenzg armenzg marked this pull request as draft February 18, 2026 17:10
@armenzg armenzg removed the request for review from a team February 18, 2026 17:10
…tions

Merge _set_sentry_tags and log_seer_request into a single _set_tags_and_log
function to eliminate duplicate payload parsing. Filter out None values
before calling sentry_sdk.set_tags() to prevent "None" string pollution
in tag searches (e.g. from check_run minimal payloads).

Co-authored-by: Cursor <[email protected]>
Rename extract_github_info output keys from github_* to scm_* to match
Seer's extract_context() tag names, so errors are searchable with the
same queries across both projects.

- github_owner → scm_owner
- github_repo_name → scm_repo_name
- github_repo_full_name → scm_repo_full_name
- github_event_url → scm_event_url
- Add scm_provider = "github" (always set)

_set_tags_and_log now uses extract_github_info and constructs scm_event_url
from the task payload trigger type, mirroring Seer's extract_context()
logic (ON_NEW_COMMIT links to the commit, ON_COMMAND_PHRASE links to the
comment, default is the PR URL).

Co-authored-by: Cursor <[email protected]>
armenzg and others added 2 commits February 18, 2026 13:01
extract_github_info now calls sentry_sdk.set_tags() as a side effect
so any caller (handlers.py, the Celery task) automatically tags the
scope with the scm_* fields without extra boilerplate.

_set_tags in task.py is simplified to only handle the task-payload-
specific extras (pr_id, sentry_organization_id, sentry_integration_id)
and the trigger-based scm_event_url override.

Co-authored-by: Cursor <[email protected]>
armenzg and others added 3 commits February 18, 2026 13:31
…extract_github_info

- Add organization_id, organization_slug, integration_id params to
  extract_github_info so callers can set all Seer-consistent scope tags
  from one place
- Pass these from handlers.py (which already has organization/integration)
  so the full set of scm_* + sentry_* tags is set at webhook entry
- Remove extra dict threading through handlers now that all context
  is on the Sentry scope automatically
- Fix regression: restore github_to_seer_latency metric that was lost
  when log_seer_request was refactored; emits as
  seer.code_review.task.github_to_seer_latency before each Seer call

Co-authored-by: Cursor <[email protected]>
…with tags parameter

- Added a `tags` parameter to `handle_check_run_event` and `handle_pull_request_event` to allow passing Sentry SDK tags directly.
- Updated the logic in `handle_check_run_event` to merge incoming tags with check_run-specific overrides.
- Refactored tests in `TestExtractGithubInfo` to include organization and integration IDs when calling `get_tags`, ensuring consistent tagging across events.
status = "success"
should_record_latency = True
try:
_set_tags(event_payload, github_event)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Seer, we decided to set tags & context with Sentry rather than passing extra to the log lines.

organization: Organization,
repo: Repository,
integration: RpcIntegration | None = None,
integration: RpcIntegration,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an unrelated typing change I can pull out.

You can see that we always pass integration:

self._handle(
github_event=github_event,
integration=integration,
event=event,
organization=orgs[repo.organization_id],
repo=repo,
github_delivery_id=github_delivery_id,
)

and that it's not None earlier in that function:

if integration is None or not installs:
# It seems possible for the GH or GHE app to be installed on their
# end, but the integration to not exist. Possibly from deleting in
# Sentry first or from a failed install flow (where the integration
# didn't get created in the first place)
logger.info(
"github.missing-integration",
extra={
"action": event.get("action"),
"repository": event.get("repository", {}).get("full_name", None),
"external_id": str(external_id),
},
)
metrics.incr("github.webhook.integration_does_not_exist")
return

github_event: GithubWebhookType,
event: Mapping[str, Any],
extra: Mapping[str, str | None],
tags: Mapping[str, Any],
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main handler is going to set the tags and context, however, we want to pass the tags to the scheduled task so we can set it there as well (the info is lost when we schedule a task).


if action is None:
logger.error(Log.MISSING_ACTION.value, extra=extra)
logger.error(Log.MISSING_ACTION.value)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we now set the context and tags we don't need to pass it to the logs just so we can filter out logs.

action=validated_event.action,
html_url=validated_event.check_run.html_url,
enqueued_at_str=datetime.now(timezone.utc).isoformat(),
tags=task_tags,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just schedules the task with the tags so it can be set inside the task.

enqueued_at_str: str,
github_event: str,
event_payload: Mapping[str, Any],
tags: Mapping[str, Any] | None = None,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Temporarily optional since for a brief moment we will have task scheduled without tags.

payload = event_payload

log_seer_request(event_payload, github_event)
record_github_to_seer_latency(event_payload)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor unrelated refactor. We're going to do the logging within the make_seer_request instead of here.

- github_event: The GitHub event type (e.g., "pull_request", "check_run", "issue_comment")
- github_event_action: The event action (e.g., "opened", "closed", "created")
- pr_id: The pull request number (when available in the event)
- scm_event_url: URL to the specific event (check_run, pull_request, comment, or commit)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Seer we use the scm_ prefix instead of github_ since one day we will support other integrations.

Args:
event: The GitHub webhook event payload
github_event: The GitHub event type (e.g., "pull_request", "check_run", "issue_comment")
organization_id: Sentry organization ID
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@armenzg armenzg marked this pull request as ready for review February 20, 2026 13:43
@armenzg armenzg requested review from a team as code owners February 20, 2026 13:43
Comment on lines +478 to +484
if github_event == "issue_comment":
result["trigger"] = "on_command_phrase"
elif github_event == "pull_request":
if github_event_action in ("opened", "ready_for_review"):
result["trigger"] = "on_ready_for_review"
elif github_event_action == "synchronize":
result["trigger"] = "on_new_commit"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using

class SeerCodeReviewTrigger(StrEnum):
    UNKNOWN = "unknown"
    ON_COMMAND_PHRASE = "on_command_phrase"
    ON_READY_FOR_REVIEW = "on_ready_for_review"
    ON_NEW_COMMIT = "on_new_commit"

elif github_event_action == "synchronize":
result["trigger"] = "on_new_commit"
elif github_event_action == "closed":
result["trigger"] = "pr-closed"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

underscore vs dash - should we keep consistent?

also, all the other have "on_" prefix (which I don't like but again, for the sake of consistency...)

# Extract pr_id from the event.
pr_id = event.get("pull_request", {}).get("number") or event.get("issue", {}).get("number")
if pr_id:
result["pr_id"] = str(pr_id)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any chance we could use pr_number instead of pr_id? The latter evokes a database ID. But IDK what we currently do in seer at the moment.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 4 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

@armenzg armenzg merged commit 62365c5 into master Feb 20, 2026
77 checks passed
@armenzg armenzg deleted the code-review-sdk-tags branch February 20, 2026 16:43
priscilawebdev pushed a commit that referenced this pull request Feb 24, 2026
## Summary

- Add `sentry_sdk.set_tags()` in the code review Celery task
(`process_github_webhook_event`) to enrich errors with correlation
metadata
- Uses the **same tag names as Seer's `extract_context()`** so errors
can be searched consistently across both Sentry and Seer projects (e.g.,
`scm_repo_full_name:getsentry/sentry pr_id:42`)
- Tags set: `scm_provider`, `scm_owner`, `scm_repo_name`,
`scm_repo_full_name`, `pr_id`, `sentry_organization_id`,
`sentry_integration_id`, `github_event`
- Gracefully handles minimal payloads (e.g., check_run rerun events that
lack repo data)

Previously, the Celery task had zero `sentry_sdk` usage, so exceptions
captured by `instrumented_task` had no code-review-specific context for
debugging.

## Test plan

- [x] Added `TestSetSentryTags` with 3 tests: PR event payload,
check_run minimal payload, missing owner/name
- [x] All 32 existing webhook tests still pass


Made with [Cursor](https://cursor.com)

---------

Co-authored-by: Cursor <[email protected]>
mchen-sentry pushed a commit that referenced this pull request Feb 24, 2026
## Summary

- Add `sentry_sdk.set_tags()` in the code review Celery task
(`process_github_webhook_event`) to enrich errors with correlation
metadata
- Uses the **same tag names as Seer's `extract_context()`** so errors
can be searched consistently across both Sentry and Seer projects (e.g.,
`scm_repo_full_name:getsentry/sentry pr_id:42`)
- Tags set: `scm_provider`, `scm_owner`, `scm_repo_name`,
`scm_repo_full_name`, `pr_id`, `sentry_organization_id`,
`sentry_integration_id`, `github_event`
- Gracefully handles minimal payloads (e.g., check_run rerun events that
lack repo data)

Previously, the Celery task had zero `sentry_sdk` usage, so exceptions
captured by `instrumented_task` had no code-review-specific context for
debugging.

## Test plan

- [x] Added `TestSetSentryTags` with 3 tests: PR event payload,
check_run minimal payload, missing owner/name
- [x] All 32 existing webhook tests still pass


Made with [Cursor](https://cursor.com)

---------

Co-authored-by: Cursor <[email protected]>
@github-actions github-actions bot locked and limited conversation to collaborators Mar 8, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

claude-code-assisted Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants