✨ feat(ci): decomposed-review prompt for Claude Code review — structured output#8253
✨ feat(ci): decomposed-review prompt for Claude Code review — structured output#8253clubanderson merged 2 commits intomainfrom
Conversation
Updates SECURITY-MODEL.md §3 to reflect #8248, which registers Ollama, llama.cpp, LocalAI, vLLM, LM Studio, RHAIIS, Groq, OpenRouter and Open WebUI as chat-only agent providers in InitializeProviders. Changes: - Provider table flips the Registered column from "no" to "yes (chat only)" for the nine HTTP providers that are now wired into the agent dropdown, and adds rows for the six new local LLM runners with their env vars and default URLs. - Explains the chat-only capability flag and why missions still route through the tool-capable CLI agents (registry.go:303 rationale). - Adds a "Local LLM strategy" subsection that cross-links the docs.kubestellar.io local-llm-strategy page and the eight install missions on kubestellar/console-kb. - Replaces the "Planned follow-up" subsection with active recipes for each runner — Ollama loopback default, in-cluster Service URLs for llama.cpp/LocalAI/vLLM/RHAIIS, LM Studio workstation default, and Groq/OpenRouter/Open WebUI gateway overrides. The "# PLANNED — not yet wired at runtime" bash comments are removed. The threat model claims about kubeconfig and credentials staying out of the request body are unchanged and still authoritative. Signed-off-by: Andrew Anderson <[email protected]>
…red output Fourth of four PRs from the fullsend-ai/fullsend automation evaluation. Adopts fullsend's "decomposed code review" pattern (docs/problems/code-review.md) — split review into specialized concerns (correctness, security, style) rather than one monolithic pass. Done as a single LLM call with structured output, not 3 parallel jobs, to keep token cost neutral. The prompt asks the existing /code-review:code-review plugin to organize its findings into three explicit sections with P0/P1/P2 priority tags, and to write "None." under any section that has nothing to report so it doesn't fabricate issues. The SECURITY section references docs/security/SECURITY-AI.md (added in PR #8249) so the reviewer explicitly watches for the six threat categories — external prompt injection, insider credentials, DoS, agent drift, supply chain, agent-to-agent injection — on any PR that touches LLM-calling code. Only change is the `prompt:` field in the existing `.github/workflows/claude-code-review.yml`. No new actions, no new secrets, no cost increase. Expected behavior after merge: - Every PR's Claude Code review comment now has a CORRECTNESS / SECURITY / STYLE structure instead of prose. - Every issue is tagged P0/P1/P2 so reviewers can triage quickly. - A pure doc PR should show "None." in all three sections, not fabricated nits. - A PR touching LLM-calling code should produce at least one item in SECURITY referencing prompt-injection risk. Signed-off-by: Andrew Anderson <[email protected]>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
✅ Deploy Preview for kubestellarconsole canceled.
|
|
👋 Hey @clubanderson — thanks for opening this PR!
This is an automated message. |
There was a problem hiding this comment.
Pull request overview
Updates the Claude Code review workflow prompt to request a decomposed, structured review output (correctness/security/style with P0–P2 priorities), and revises the security model documentation around local/self-hosted LLM providers.
Changes:
- Adjust
.github/workflows/claude-code-review.ymlto use a multi-line prompt that enforces three review sections and priority tags. - Substantially rewrite
docs/security/SECURITY-MODEL.md’s “Local / Self-Hosted LLMs” section (provider table + guidance/recipes).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
.github/workflows/claude-code-review.yml |
Changes the Claude Code Action prompt to require decomposed/structured review output with priorities. |
docs/security/SECURITY-MODEL.md |
Updates documentation about AI provider registration and local LLM strategy/recipes (currently inconsistent with runtime code). |
| | Groq (OpenAI-compatible, HTTP) | `groq` | `GROQ_API_KEY` | `GROQ_MODEL` | `GROQ_BASE_URL` | **yes (chat only)** | `pkg/agent/provider_groq.go` | | ||
| | OpenRouter (OpenAI-compatible, HTTP) | `openrouter` | `OPENROUTER_API_KEY` | `OPENROUTER_MODEL` | `OPENROUTER_BASE_URL` | **yes (chat only)** | `pkg/agent/provider_openrouter.go` | | ||
| | Open WebUI (OpenAI-compatible, HTTP) | `open-webui` | `OPEN_WEBUI_API_KEY` | `OPEN_WEBUI_MODEL` | `OPEN_WEBUI_URL` | **yes (chat only)** | `pkg/agent/provider_openwebui.go` | | ||
| | Ollama (local, OpenAI-compatible) | `ollama` | `OLLAMA_API_KEY` (optional) | `OLLAMA_MODEL` | `OLLAMA_URL` (default `http://127.0.0.1:11434`) | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` | | ||
| | llama.cpp server | `llamacpp` | `LLAMACPP_API_KEY` (optional) | `LLAMACPP_MODEL` | `LLAMACPP_URL` | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` | | ||
| | LocalAI | `localai` | `LOCALAI_API_KEY` (optional) | `LOCALAI_MODEL` | `LOCALAI_URL` | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` | | ||
| | vLLM | `vllm` | `VLLM_API_KEY` (optional) | `VLLM_MODEL` | `VLLM_URL` | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` | | ||
| | LM Studio | `lm-studio` | `LM_STUDIO_API_KEY` (optional) | `LM_STUDIO_MODEL` | `LM_STUDIO_URL` (default `http://127.0.0.1:1234`) | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` | | ||
| | Red Hat AI Inference Server | `rhaiis` | `RHAIIS_API_KEY` (optional) | `RHAIIS_MODEL` | `RHAIIS_URL` | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` | |
There was a problem hiding this comment.
The table now marks Groq/OpenRouter/Open WebUI (and several local runners) as “Registered? yes (chat only)”, but pkg/agent/registry.go:283-307 shows InitializeProviders() only registers CLI/tool-capable providers and explicitly excludes API-only agents. Also, the referenced pkg/agent/provider_local_openai_compat.go file does not exist in the repo. Please correct the table to match current runtime registration (or update the code to actually register these providers, and reference the correct source file).
| | Red Hat AI Inference Server | `rhaiis` | `RHAIIS_API_KEY` (optional) | `RHAIIS_MODEL` | `RHAIIS_URL` | **yes (chat only)** | `pkg/agent/provider_local_openai_compat.go` | | ||
|
|
||
| Note the asymmetry: the upstream OpenAI provider source file hard-codes its hostname as a package-level variable in `pkg/agent/provider_openai.go:15` (no `OPENAI_BASE_URL` override). Groq, OpenRouter, and Open WebUI do parse base-URL env vars, but because those providers are not registered at runtime today, setting those env vars does not actually route AI traffic through a local endpoint. | ||
| "Chat only" means the provider reports `CapabilityChat` but not `CapabilityToolExec`. AI missions that need to execute cluster commands (kubectl, helm) still route through the tool-capable CLI agents (`claude`, `codex`, `gemini-cli`, `antigravity`, `goose`, `copilot-cli`, `bob`); local LLM providers are selectable in the agent dropdown for analysis and chat workflows but do not drive missions. See `pkg/agent/registry.go:303` for the rationale comment and `promoteExecutingDefault()` which keeps a mission-capable agent as the default whenever one is available. |
There was a problem hiding this comment.
This paragraph lists claude as a tool-capable CLI agent and implies promoteExecutingDefault() keeps a “mission-capable agent” default. In code, the registered CLI provider name is claude-code (provider_claudecode.go:130-132), while claude refers to the HTTP API provider (which is intentionally unregistered). Also, promoteExecutingDefault() only promotes away from suggest-only agents like copilot-cli, not from generic chat-only providers. Please adjust the provider names/behavior description to match the actual registry behavior.
| "Chat only" means the provider reports `CapabilityChat` but not `CapabilityToolExec`. AI missions that need to execute cluster commands (kubectl, helm) still route through the tool-capable CLI agents (`claude`, `codex`, `gemini-cli`, `antigravity`, `goose`, `copilot-cli`, `bob`); local LLM providers are selectable in the agent dropdown for analysis and chat workflows but do not drive missions. See `pkg/agent/registry.go:303` for the rationale comment and `promoteExecutingDefault()` which keeps a mission-capable agent as the default whenever one is available. | |
| "Chat only" means the provider reports `CapabilityChat` but not `CapabilityToolExec`. AI missions that need to execute cluster commands (kubectl, helm) still route through the tool-capable CLI agents (`claude-code`, `codex`, `gemini-cli`, `antigravity`, `goose`, `copilot-cli`, `bob`); local LLM providers are selectable in the agent dropdown for analysis and chat workflows but do not drive missions. See `pkg/agent/registry.go:303` for the rationale comment and `promoteExecutingDefault()`, which promotes away from suggest-only agents such as `copilot-cli` rather than guaranteeing a generic chat-only provider becomes the default. |
| #### Ollama | ||
|
|
||
| [Ollama](https://ollama.com) exposes an OpenAI-compatible endpoint at `http://localhost:11434/v1`. Once the Groq provider is registered, `GROQ_BASE_URL` would be honored verbatim and you could repurpose the Groq provider to point at Ollama: | ||
| [Ollama](https://ollama.com) exposes an OpenAI-compatible endpoint at `http://localhost:11434/v1`. kc-agent ships with Ollama as a registered provider; the env var defaults to the loopback endpoint so on a workstation with Ollama running, the provider becomes Available automatically: | ||
|
|
||
| ```bash | ||
| # PLANNED — not yet wired at runtime | ||
| export GROQ_API_KEY=unused-but-nonempty # kc-agent only checks for non-empty | ||
| export GROQ_BASE_URL=http://localhost:11434/v1 | ||
| export GROQ_MODEL=llama3.1:8b | ||
| export OLLAMA_URL=http://127.0.0.1:11434 # optional — this is also the default | ||
| export OLLAMA_MODEL=llama3.2 | ||
| ./bin/kc-agent | ||
| ``` | ||
|
|
||
| kc-agent would then call `http://localhost:11434/v1/chat/completions` (see `pkg/agent/provider_groq.go` where `baseURL + groqChatCompletionsPath` is assembled) with the standard OpenAI request shape. Ollama would handle it natively. | ||
| kc-agent calls `${OLLAMA_URL}/v1/chat/completions` (see `pkg/agent/provider_local_openai_compat.go` — the generic LocalOpenAICompatProvider factory). The dropdown lists "Ollama (Local)"; selecting it routes chat through Ollama. |
There was a problem hiding this comment.
The Ollama example/config in this section relies on OLLAMA_URL/OLLAMA_MODEL and claims kc-agent calls ${OLLAMA_URL}/v1/chat/completions via pkg/agent/provider_local_openai_compat.go. Those env vars and that provider file aren’t present in the codebase, and InitializeProviders() doesn’t register an ollama provider today. Please update this section to describe the currently supported self-hosted path (CLI-based agents), or add/land the missing provider implementation + registration in the same PR/series before documenting it as “working path today.”
|
|
||
| For every issue you raise, prefix it with a priority tag: **P0** (must fix before merge), **P1** (should fix before merge), or **P2** (nice to have, follow-up OK). | ||
|
|
||
| If a section has nothing to report, write exactly `None.` on a line by itself under that heading — do not fabricate issues to fill the section. |
There was a problem hiding this comment.
The instruction says to write exactly None. (with backticks). That wording can cause the model to literally include backticks in the output, which will break any downstream parsing that expects the plain text line None.. Consider changing the prompt to require the exact line None. without backticks, or explicitly say “write the plain text line: None.”
| If a section has nothing to report, write exactly `None.` on a line by itself under that heading — do not fabricate issues to fill the section. | |
| If a section has nothing to report, write the plain text line: None. Put it on a line by itself under that heading — do not fabricate issues to fill the section. |
| # Decomposed-review prompt: single LLM call, structured output | ||
| # with three explicit concern sections + priority ranking. | ||
| # Adapted from fullsend-ai/fullsend's "decomposed code review" | ||
| # pattern (docs/problems/code-review.md) — kept as a single call | ||
| # rather than 3 parallel jobs to avoid tripling the token cost. | ||
| # See docs/security/SECURITY-AI.md for the security concerns the | ||
| # SECURITY section is asked to watch for. | ||
| prompt: | |
There was a problem hiding this comment.
PR description says “Only the prompt: field in .github/workflows/claude-code-review.yml”, but this PR also changes docs/security/SECURITY-MODEL.md substantially. Please either update the PR description to reflect the docs changes, or split the docs edits into a separate PR so reviewers can scope/triage appropriately.
|
Thank you for your contribution! Your PR has been merged. Check out what's new:
Stay connected: Slack #kubestellar-dev | Multi-Cluster Survey |
|
Post-merge build verification passed ✅ Both Go and frontend builds compiled successfully against merge commit |
Summary
Fourth and final PR from the fullsend-ai/fullsend automation evaluation. Adopts fullsend's "decomposed code review" pattern — split review into specialized concerns (correctness, security, style) rather than one monolithic generalist pass.
Implemented as a single LLM call with structured output, not 3 parallel jobs, to keep token cost neutral. The prompt asks the existing
/code-review:code-reviewplugin to organize its findings into three explicit sections with P0/P1/P2 priority tags.What changes
Only the
prompt:field in.github/workflows/claude-code-review.yml. No new actions, no new secrets, no new jobs. The existinganthropics/claude-code-action@v1invocation stays — just asks for a different output shape.The new review structure
Every issue is tagged P0 (must fix before merge) / P1 (should fix) / P2 (follow-up OK) so reviewers can triage at a glance.
The SECURITY section explicitly references
docs/security/SECURITY-AI.md(added in #8249) so the reviewer watches for the six AI threat categories — external prompt injection, insider credentials, DoS/resource exhaustion, agent drift, supply chain, agent-to-agent injection — on every PR that touches LLM-calling code.Anti-fabrication guard
The prompt explicitly instructs: "If a section has nothing to report, write exactly
None.on a line by itself under that heading — do not fabricate issues to fill the section." Without this, the LLM tends to invent nits to avoid looking unhelpful on pure doc PRs.Test plan
None.Series recap
This is the final PR in a four-PR series from the fullsend-ai/fullsend automation evaluation:
SECURITY-AI.md— six-threat AI automation threat modeltier-classifier.yml— label PRs tier/0 through tier/3Deferred work (documented in local memory but not shipping): per-role GitHub Apps with OIDC isolation, git-based intent ledger. Both need org-level coordination beyond a single PR cycle.
🤖 Generated with Claude Code