feat(code-review): Add CodeReviewEvent model and migration [internal] #108531
feat(code-review): Add CodeReviewEvent model and migration [internal] #108531
Conversation
Add the CodeReviewEvent model to track Seer PR code review lifecycle events. This is the data foundation for the PR Review Dashboard, split out as the first of three PRs.
|
This PR has a migration; here is the generated SQL for for --
-- Create model CodeReviewEvent
--
CREATE TABLE "sentry_code_review_event" ("id" bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY, "organization_id" bigint NOT NULL, "repository_id" integer NOT NULL CHECK ("repository_id" >= 0), "pr_number" integer NULL, "pr_title" text NULL, "pr_author" text NULL, "pr_url" text NULL, "pr_state" varchar(16) NULL, "raw_event_type" varchar(64) NOT NULL, "raw_event_action" varchar(64) NOT NULL, "trigger_id" varchar(64) NULL, "trigger" varchar(64) NULL, "trigger_user" text NULL, "trigger_at" timestamp with time zone NOT NULL, "target_commit_sha" varchar(64) NULL, "status" varchar(32) NOT NULL, "denial_reason" text NULL, "date_added" timestamp with time zone NOT NULL, "webhook_received_at" timestamp with time zone NULL, "preflight_completed_at" timestamp with time zone NULL, "task_enqueued_at" timestamp with time zone NULL, "sent_to_seer_at" timestamp with time zone NULL, "review_started_at" timestamp with time zone NULL, "review_completed_at" timestamp with time zone NULL, "seer_run_id" varchar(64) NULL, "comments_posted" integer NULL CHECK ("comments_posted" >= 0), "review_result" jsonb NULL);
CREATE UNIQUE INDEX CONCURRENTLY "unique_org_repo_trigger_id" ON "sentry_code_review_event" ("organization_id", "repository_id", "trigger_id") WHERE "trigger_id" IS NOT NULL;
CREATE INDEX CONCURRENTLY "sentry_code_review_event_organization_id_1ce9fe63" ON "sentry_code_review_event" ("organization_id");
CREATE INDEX CONCURRENTLY "sentry_code_review_event_date_added_cd622aec" ON "sentry_code_review_event" ("date_added");
CREATE INDEX CONCURRENTLY "sentry_code_organiz_4f4b09_idx" ON "sentry_code_review_event" ("organization_id", "trigger_at");
CREATE INDEX CONCURRENTLY "sentry_code_organiz_7ba32c_idx" ON "sentry_code_review_event" ("organization_id", "repository_id", "trigger_at");
CREATE INDEX CONCURRENTLY "sentry_code_organiz_76bbd1_idx" ON "sentry_code_review_event" ("organization_id", "repository_id", "pr_number"); |
There was a problem hiding this comment.
Here are a few good suggestions (I have some inline comments):
1. Tie status to CodeReviewEventStatus
You have a CodeReviewEventStatus enum but status is a plain CharField, so the DB/ORM doesn’t restrict or document allowed values. You can wire it up like CodeReviewTrigger in repositorysettings.py:
- Add an
as_choices()(or equivalent) onCodeReviewEventStatusand use it in the field’schoices. - Optionally set a default (e.g.
CodeReviewEventStatus.WEBHOOK_RECEIVED) so every row has a status.
That gives validation, better docs, and a clear default.
2. comments_posted type
comments_posted is IntegerField(null=True). For a count that should be non‑negative, use BoundedPositiveIntegerField (or a non‑negative bounded int if you have one) so the DB enforces >= 0. Requires a migration.
3. Index on status
If you’ll query by status (e.g. “in progress”, “failed”, “completed”), add an index that matches how you filter. For example:
Index(fields=["organization_id", "status"])
orIndex(fields=["organization_id", "status", "trigger_at"])
depending on your actual query patterns. Add only if you have (or expect) such queries.
Armen says: We can add indexes as we find uses cases rather than now. They're cheap to add.
4. review_failed_at (optional)
You have review_completed_at but not review_failed_at. If you want to distinguish “when the review finished successfully” vs “when we recorded failure” for analytics or debugging, add a nullable review_failed_at and set it when status becomes REVIEW_FAILED. Otherwise, using only review_completed_at (and maybe status) is fine.
Armen says: We can rename this to review_last_status_change or we can get rid of the field until we have a use case. We could also query Seer for when we have use case for it.
5. repository_id as FK (optional)
repository_id is a BoundedPositiveIntegerField. If you want referential integrity and easier joins to Repository, consider a FlexibleForeignKey("sentry.Repository", ...) instead. Staying with an integer ID is also a valid pattern in Sentry for cross‑references. Only change if you want FK constraints and ORM relations.
Armen says: The same as organization_id.
6. pr_state and raw_event_type / raw_event_action
pr_state: The comment documents expected values (“open”, “closed”, “merged”). If you’re fine with any string, leave as-is; if you want to lock it down, you could use a smallchoiceslist (would require a migration if you add a check constraint).raw_event_type/raw_event_action: Keeping them unconstrained is reasonable so provider-specific values don’t require schema changes.
Summary: The highest-impact, low-friction improvements are (1) wiring status to CodeReviewEventStatus with choices and a default, and (2) using BoundedPositiveIntegerField for comments_posted. The rest depend on how you query the table and how strict you want the schema to be.
There was a problem hiding this comment.
- Tie status to CodeReviewEventStatus
Agree — adding choices + default is low-friction and improves self-documentation. Will add as_choices() and default=CodeReviewEventStatus.WEBHOOK_RECEIVED.
- comments_posted type
Agree — switching to BoundedPositiveIntegerField for consistency with Sentry conventions. Will include in the same migration.
- Index on status
Skipping for now. The only status-based queries are in the stats endpoint, where they're already scoped by organization_id + trigger_at range — so they hit the existing composite index and filter status in-memory. Easy to add later if needed.
- review_failed_at
review_completed_at intentionally represents "pipeline reached terminal state" — the status field distinguishes success vs failure. Adding a separate timestamp for one status value isn't worth the complexity.
- repository_id as FK
Keeping as integer, consistent with current conventions in the repo:
- commit.py — repository_id = BoundedPositiveIntegerField()
- pullrequest.py — repository_id = BoundedPositiveIntegerField()
- releaseheadcommit.py — repository_id = BoundedPositiveIntegerField()
- pr_state and raw_event_type / raw_event_action
Leaving as-is. pr_state values come from provider APIs (GitHub today, potentially others later) — constraining them would require migrations when provider semantics change. raw_event_* being unconstrained is intentional.
|
|
||
| __relocation_scope__ = RelocationScope.Excluded | ||
|
|
||
| organization_id = BoundedBigIntegerField(db_index=True) |
There was a problem hiding this comment.
Any reason we didn't use a full fk here?
There was a problem hiding this comment.
Just doing what all (or at least all I have had a look at) other models do - same as the comment below for repository_id
There was a problem hiding this comment.
Just doing what all (or at least all I have had a look at) other models do - same as the comment below for repository_id
| __relocation_scope__ = RelocationScope.Excluded | ||
|
|
||
| organization_id = BoundedBigIntegerField(db_index=True) | ||
| repository_id = BoundedPositiveIntegerField() |
There was a problem hiding this comment.
I've mentioned this above in another comment, see below. Having said that, I don't have strong opinions about this and was just following other places that did this. If this is a big deal, we can change the model because there's no data yet (would only be added once #108533 is merged)
repository_id as FK
Keeping as integer, consistent with current conventions in the repo:
commit.py — repository_id = BoundedPositiveIntegerField()
pullrequest.py — repository_id = BoundedPositiveIntegerField()
releaseheadcommit.py — repository_id = BoundedPositiveIntegerField()
| denial_reason = models.TextField(null=True) | ||
|
|
||
| # Timestamps for pipeline stages | ||
| date_added = models.DateTimeField(default=timezone.now, db_index=True) |
There was a problem hiding this comment.
Probably we should have used DefaultFieldsModel here
There was a problem hiding this comment.
Hmm, good point, I didn't know about these. Yeah if we go ahead and recreate the model (depends on the PK discussion above), than I can do this too.
| app_label = "sentry" | ||
| db_table = "sentry_code_review_event" |
There was a problem hiding this comment.
Should this model be in its own project? Are you starting a new module for wherever this will be used?
There was a problem hiding this comment.
Not expecting to add any more at the moment.
…#108531) ## Summary > [!NOTE] > This is currently an internal/experimental tool, not meant to be released to customers as is. Take it as a conversation starter and source of internal UX feedback. - Add `CodeReviewEvent` model to track Seer PR code review lifecycle events (trigger, completion, status) - Includes composite indexes for efficient querying by org, repo, and PR number - Unique constraint on `(organization_id, repository_id, trigger_id)` to prevent duplicate events **PR 1 of 3** for the PR Review Dashboard feature. - Project overview: https://linear.app/getsentry/project/pr-review-dashboard-dcc8b47f6d28/overview - Full PR (being split): vaind#1 ## Test plan - Migration applies cleanly on a fresh database - Backend logic and API endpoints (PR 2) will add tests exercising this model
…#108531) ## Summary > [!NOTE] > This is currently an internal/experimental tool, not meant to be released to customers as is. Take it as a conversation starter and source of internal UX feedback. - Add `CodeReviewEvent` model to track Seer PR code review lifecycle events (trigger, completion, status) - Includes composite indexes for efficient querying by org, repo, and PR number - Unique constraint on `(organization_id, repository_id, trigger_id)` to prevent duplicate events **PR 1 of 3** for the PR Review Dashboard feature. - Project overview: https://linear.app/getsentry/project/pr-review-dashboard-dcc8b47f6d28/overview - Full PR (being split): vaind#1 ## Test plan - Migration applies cleanly on a fresh database - Backend logic and API endpoints (PR 2) will add tests exercising this model
Address post-merge feedback on CodeReviewEvent from PR #108531: - Use DefaultFieldsModel base class for standard date_added/date_updated fields (wedamija, Dan Fuller) - Add FlexibleForeignKey for organization and repository instead of plain integer fields, enabling cascade deletion (wedamija, Dan Fuller, Armen) - Change __relocation_scope__ from Excluded to Organization so data is preserved during org relocation (markstory) - Add backup test coverage in create_exhaustive_organization() The migration uses SeparateDatabaseAndState because the underlying column names (organization_id, repository_id) don't change — only Django's field representation and the addition of FK constraints. Co-Authored-By: Claude <[email protected]>
Address post-merge feedback on CodeReviewEvent from PR getsentry#108531: - Use DefaultFieldsModel base class for standard date_added/date_updated fields (wedamija, Dan Fuller) - Add FlexibleForeignKey for organization and repository instead of plain integer fields, enabling cascade deletion (wedamija, Dan Fuller, Armen) - Change __relocation_scope__ from Excluded to Organization so data is preserved during org relocation (markstory) - Add backup test coverage in create_exhaustive_organization() The migration uses SeparateDatabaseAndState because the underlying column names (organization_id, repository_id) don't change — only Django's field representation and the addition of FK constraints. Co-Authored-By: Claude <[email protected]>
Remove the CodeReviewEvent model class and mark the table for pending deletion. The table has no data and will be recreated with an updated schema (DefaultFieldsModel, FlexibleForeignKey, Organization relocation scope) per reviewer feedback on PR #108531. The CodeReviewEventStatus enum is kept since migration 1032 references it for the status field default. Co-Authored-By: Claude <[email protected]>
Remove the `CodeReviewEvent` model class and mark the table for pending deletion. The table has no data and will be recreated with an updated schema per reviewer feedback on PR #108531. **This is PR 1 of 3:** 1. **This PR** — `SafeDeleteModel` with `MOVE_TO_PENDING` (removes model from Django state, keeps table in DB) 2. `SafeDeleteModel` with `DELETE` (drops the table) 3. Recreate the model with updated schema: `DefaultFieldsModel`, `FlexibleForeignKey` for org/repo, `RelocationScope.Organization` The `CodeReviewEventStatus` enum is kept in the file since migration 1032 references it for the `status` field default. --------- Co-authored-by: Claude <[email protected]>
…9424) Recreate the `CodeReviewEvent` table with reviewer feedback from PR #108531 applied: - `DefaultFieldsModel` base class for standard `date_added`/`date_updated` fields - `FlexibleForeignKey` for organization and repository (cascade deletion, referential integrity) - `RelocationScope.Global` to match `Repository`'s scope (required for org-scoped exports since Repository is Global) - `DateUpdatedComparator` for `date_added`/`date_updated` in backup comparators - Backup test coverage in `create_exhaustive_organization()` Migration is a clean auto-generated `CreateModel` — no manual edits. **This is PR 3 of 3:** 1. #109420 — `SafeDeleteModel` with `MOVE_TO_PENDING` (merged) 2. #109422 — `SafeDeleteModel` with `DELETE` (merged) 3. **This PR** — Recreate with updated schema --------- Co-authored-by: Claude <[email protected]>
…9424) Recreate the `CodeReviewEvent` table with reviewer feedback from PR #108531 applied: - `DefaultFieldsModel` base class for standard `date_added`/`date_updated` fields - `FlexibleForeignKey` for organization and repository (cascade deletion, referential integrity) - `RelocationScope.Global` to match `Repository`'s scope (required for org-scoped exports since Repository is Global) - `DateUpdatedComparator` for `date_added`/`date_updated` in backup comparators - Backup test coverage in `create_exhaustive_organization()` Migration is a clean auto-generated `CreateModel` — no manual edits. **This is PR 3 of 3:** 1. #109420 — `SafeDeleteModel` with `MOVE_TO_PENDING` (merged) 2. #109422 — `SafeDeleteModel` with `DELETE` (merged) 3. **This PR** — Recreate with updated schema --------- Co-authored-by: Claude <[email protected]>
Summary
Note
This is currently an internal/experimental tool, not meant to be released to customers as is. Take it as a conversation starter and source of internal UX feedback.
CodeReviewEventmodel to track Seer PR code review lifecycle events (trigger, completion, status)(organization_id, repository_id, trigger_id)to prevent duplicate eventsPR 1 of 3 for the PR Review Dashboard feature.
Test plan