
Commit 56782fb

✨ feat(ci): decomposed-review prompt for Claude Code review — structured output
Fourth of four PRs from the fullsend-ai/fullsend automation evaluation. Adopts fullsend's "decomposed code review" pattern (docs/problems/code-review.md): split the review into specialized concerns (correctness, security, style) rather than one monolithic pass. Done as a single LLM call with structured output, not three parallel jobs, to keep token cost neutral.

The prompt asks the existing /code-review:code-review plugin to organize its findings into three explicit sections with P0/P1/P2 priority tags, and to write "None." under any section that has nothing to report so it doesn't fabricate issues. The SECURITY section references docs/security/SECURITY-AI.md (added in PR #8249) so the reviewer explicitly watches for the six threat categories (external prompt injection, insider credentials, DoS, agent drift, supply chain, agent-to-agent injection) on any PR that touches LLM-calling code.

The only change is the `prompt:` field in the existing `.github/workflows/claude-code-review.yml`. No new actions, no new secrets, no cost increase.

Expected behavior after merge:

- Every PR's Claude Code review comment now has a CORRECTNESS / SECURITY / STYLE structure instead of prose.
- Every issue is tagged P0/P1/P2 so reviewers can triage quickly.
- A pure doc PR should show "None." in all three sections, not fabricated nits.
- A PR touching LLM-calling code should produce at least one item in SECURITY referencing prompt-injection risk.

Signed-off-by: Andrew Anderson <[email protected]>
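The sectioned comment the prompt enforces is easy to consume downstream. As an illustration only (nothing like this ships in the commit), a minimal Python sketch of splitting such a comment into its three sections; the function name and the sample body are hypothetical:

```python
import re

SECTIONS = ("CORRECTNESS", "SECURITY", "STYLE")

def split_review(body: str) -> dict[str, str]:
    """Split a structured review comment into its three sections.

    Returns a mapping of section name -> section body text; a section
    the reviewer left empty contains the literal "None.".
    """
    parts = {}
    for name in SECTIONS:
        # Capture from "## NAME" up to the next heading or end of text.
        m = re.search(rf"^## {name}\n(.*?)(?=^## |\Z)", body, re.M | re.S)
        parts[name] = m.group(1).strip() if m else ""
    return parts

# Hypothetical comment body matching the contract the prompt defines.
sample = """\
## CORRECTNESS
- **P0** `api/client.py:42` response parsed before checking status code

## SECURITY
None.

## STYLE
- **P2** `api/client.py:10` magic number 3 for retry count
"""

sections = split_review(sample)
```

Because empty sections contain exactly `None.`, a consumer can distinguish "reviewed, nothing found" from a malformed or truncated comment.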
1 parent b30010f commit 56782fb

1 file changed

Lines changed: 26 additions & 1 deletion

File tree

.github/workflows/claude-code-review.yml

```diff
@@ -41,7 +41,32 @@ jobs:
           claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
           plugin_marketplaces: 'https://github.com/anthropics/claude-code.git'
           plugins: 'code-review@claude-code-plugins'
-          prompt: '/code-review:code-review ${{ github.repository }}/pull/${{ github.event.pull_request.number }}'
+          # Decomposed-review prompt: single LLM call, structured output
+          # with three explicit concern sections + priority ranking.
+          # Adapted from fullsend-ai/fullsend's "decomposed code review"
+          # pattern (docs/problems/code-review.md) — kept as a single call
+          # rather than 3 parallel jobs to avoid tripling the token cost.
+          # See docs/security/SECURITY-AI.md for the security concerns the
+          # SECURITY section is asked to watch for.
+          prompt: |
+            /code-review:code-review ${{ github.repository }}/pull/${{ github.event.pull_request.number }}
+
+            After producing the review, organize your comments into exactly three sections in this order:
+
+            ## CORRECTNESS
+            Bugs, race conditions, logic errors, broken edge cases, incorrect assumptions, type mismatches, missing null/undefined guards. Things that will make the code behave wrong at runtime.
+
+            ## SECURITY
+            Authentication/authorization gaps, input validation, secret handling, injection (SQL, command, prompt), unsafe deserialization, path traversal, CORS, CSRF, and — for any LLM-calling code — the six threat categories from `docs/security/SECURITY-AI.md` (external prompt injection, insider credentials, DoS/resource exhaustion, agent drift, supply chain, agent-to-agent injection).
+
+            ## STYLE
+            Naming, comments, idiomatic patterns, dead code, missing constants (magic numbers/strings), hardcoded routes, localization gaps. Things that affect readability and maintainability but not correctness.
+
+            For every issue you raise, prefix it with a priority tag: **P0** (must fix before merge), **P1** (should fix before merge), or **P2** (nice to have, follow-up OK).
+
+            If a section has nothing to report, write exactly `None.` on a line by itself under that heading — do not fabricate issues to fill the section.
+
+            Keep the total response scannable: prefer bullets over prose, link to specific file:line references, and don't repeat the same issue across sections.
           # See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
           # or https://code.claude.com/docs/en/cli-reference for available options
```
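The expected-behavior bullets above amount to a checkable contract for each section: either exactly `None.`, or bullets that each carry a priority tag. As an illustration only (this commit adds no such check), a minimal sketch of validating one section against that contract; `section_ok` and the sample strings are hypothetical:

```python
import re

# A finding must carry a bold P0/P1/P2 tag, per the prompt's triage rule.
PRIORITY = re.compile(r"\*\*P[012]\*\*")

def section_ok(text: str) -> bool:
    """Check one review section against the prompt's contract:
    either exactly 'None.', or bullets that all carry a priority tag."""
    if text.strip() == "None.":
        return True
    bullets = [ln for ln in text.splitlines() if ln.lstrip().startswith("- ")]
    return bool(bullets) and all(PRIORITY.search(ln) for ln in bullets)

# A docs-only PR's empty section passes; an untagged finding fails.
empty_ok = section_ok("None.")
tagged_ok = section_ok("- **P1** `api/client.py:42` response parsed before status check")
untagged_ok = section_ok("- missing null guard in parser")
```

Such a check could run as a follow-up workflow step to catch the reviewer drifting back to unstructured prose, though the commit itself relies on the prompt alone.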
