π docs(security): add SECURITY-AI.md β AI automation threat model#8249
π docs(security): add SECURITY-AI.md β AI automation threat model#8249clubanderson merged 1 commit intomainfrom
Conversation
Adapts the threat taxonomy from fullsend-ai/fullsend (docs/problems/security-threat-model.md) to console's specific LLM surfaces. Closes the gap where SECURITY-MODEL.md covers runtime web threats but says nothing about the prompt injection, supply chain, and agent drift exposures the project already has through Claude Code review, auto-qa, ga4-error-monitor, and kc-agent/MCP. Contents: - Scope table listing every current LLM-calling surface + its input source + what the LLM can do. - Six threat categories (external prompt injection, insider/creds, DoS/resource exhaustion, agent drift, supply chain, agent-to-agent injection) β each with definition, how-it-applies, current mitigations, recommended next steps. - Exotic-attacks section (invisible Unicode steganography, temporal split-payload / xz-style, zero-trust-between-agents principle). - Audit checklist for future LLM workflows β reviewers should run through this before approving any PR that adds a new LLM call. Cross-references: - docs/security/SECURITY-MODEL.md gets a new "AI / Automation Surface" section pointing at the new doc. - CLAUDE.md gets a "AI / LLM Surfaces" note under Critical Rules so agent sessions see the audit checklist before writing new LLM-calling code. No runtime code changes. Pure documentation. First of four PRs from a fullsend-ai/fullsend vs console-automation evaluation β the other three address tier-based change classification, webhook-driven Copilot PR monitoring, and structured-output decomposed review prompts. Signed-off-by: Andrew Anderson <[email protected]>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
β Deploy Preview for kubestellarconsole canceled.
|
|
π Hey @clubanderson β thanks for opening this PR!
This is an automated message. |
There was a problem hiding this comment.
Pull request overview
Adds AI/LLM automation threat-model documentation to complement the existing runtime security model, and cross-links it from core security/agent guidance docs.
Changes:
- Introduces
docs/security/SECURITY-AI.mddescribing AI automation/LLM threat categories plus an audit checklist for new LLM-calling surfaces. - Updates
docs/security/SECURITY-MODEL.mdto reference the new AI threat model and adds an βAI / Automation Surfaceβ section. - Updates
CLAUDE.mdto point contributors/agents at the new AI threat model before adding new LLM surfaces.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| docs/security/SECURITY-MODEL.md | Adds link and a new section describing the AI/automation threat surface and pointing to the new doc. |
| docs/security/SECURITY-AI.md | New AI automation threat model document: scope table, threat taxonomy, and audit checklist. |
| CLAUDE.md | Adds a βAI / LLM Surfacesβ rule pointing to the new AI threat model. |
| **Current mitigations.** GitHub Actions secret store (encrypted at rest). Secret is only accessible to workflows running on the main repo, not forks. | ||
|
|
||
| **Recommended next steps.** | ||
| - **Per-role GitHub Apps with OIDC isolation** (deferred work β documented in `project_automation_fullsend_comparison.md`): split the single token into distinct apps per role. Blast radius of one compromise shrinks to that role's scope. |
There was a problem hiding this comment.
project_automation_fullsend_comparison.md is referenced here but doesnβt exist in this repo, so readers canβt follow the pointer. Add the referenced document (with a link) or remove/replace this reference with an existing doc/issue/PR link.
| - **Per-role GitHub Apps with OIDC isolation** (deferred work β documented in `project_automation_fullsend_comparison.md`): split the single token into distinct apps per role. Blast radius of one compromise shrinks to that role's scope. | |
| - **Per-role GitHub Apps with OIDC isolation** (deferred work): split the single token into distinct apps per role. Blast radius of one compromise shrinks to that role's scope. |
| The runtime model above (backend, kc-agent, browser) is only part of the picture. The repo also runs LLM-backed GitHub workflows β Claude Code review on every PR, auto-qa and auto-qa-tuner on a cron, a GA4 β GitHub issue pipeline, and the kc-agent itself. These bring threat surfaces that don't look like classic web attacks (prompt injection, supply chain, agent drift, token isolation). | ||
|
|
||
| See **[`SECURITY-AI.md`](SECURITY-AI.md)** for the AI-specific threat model β six threat categories, current mitigations, and an audit checklist for adding new LLM-calling workflows. |
There was a problem hiding this comment.
This paragraph describes auto-qa, auto-qa-tuner, and the GA4 issue pipeline as βLLM-backed GitHub workflowsβ, but those workflows donβt call an LLM in this repo. To avoid misleading the threat model, either rename this to βautomation/agent-triggering workflowsβ or explicitly distinguish βLLM-callingβ vs βcreates inputs consumed by LLM agentsβ.
| The runtime model above (backend, kc-agent, browser) is only part of the picture. The repo also runs LLM-backed GitHub workflows β Claude Code review on every PR, auto-qa and auto-qa-tuner on a cron, a GA4 β GitHub issue pipeline, and the kc-agent itself. These bring threat surfaces that don't look like classic web attacks (prompt injection, supply chain, agent drift, token isolation). | |
| See **[`SECURITY-AI.md`](SECURITY-AI.md)** for the AI-specific threat model β six threat categories, current mitigations, and an audit checklist for adding new LLM-calling workflows. | |
| The runtime model above (backend, kc-agent, browser) is only part of the picture. The repo also runs AI-related GitHub automation: Claude Code review on every PR is LLM-backed, while auto-qa, auto-qa-tuner, and the GA4 β GitHub issue pipeline are automation or agent-triggering workflows that create inputs later consumed by LLM-based agents. These bring threat surfaces that don't look like classic web attacks (prompt injection, supply chain, agent drift, token isolation). | |
| See **[`SECURITY-AI.md`](SECURITY-AI.md)** for the AI-specific threat model β six threat categories, current mitigations, and an audit checklist for adding new LLM-calling workflows or automation that feeds LLM agents. |
| | Surface | Where | What triggers it | Who controls the input | What the LLM can do | | ||
| |---|---|---|---|---| | ||
| | Claude Code review | `.github/workflows/claude-code-review.yml` | Every PR | Any PR author (including forks) | Read repo, post review comments, no write access to main | | ||
| | auto-qa / auto-qa-tuner | `.github/workflows/auto-qa.yml`, `.github/workflows/auto-qa-tuner.yml` | Scheduled cron | Maintainers (workflow contents) + repo history | Open issues, propose patches | | ||
| | ai-fix / scanner workflows | `.github/workflows/ai-fix.yml` (currently disabled) and manually-dispatched scanner sessions | Manual or automated scheduling | Maintainers | Open PRs against branches | | ||
| | GA4 error monitor β issue pipeline | `.github/workflows/ga4-error-monitor.yml` | Hourly cron | Google Analytics 4 production event stream (real user traffic) | Open issues with attacker-influenceable text in the title/body | | ||
| | kc-agent + MCP handlers | `cmd/kc-agent/main.go`, `pkg/mcp/*` | User opens an agent session in their browser | The user running the session | Execute kubectl operations against the user's kubeconfig | |
There was a problem hiding this comment.
The scope table uses || at the start of each row (e.g., || Surface | ...), which wonβt render as a Markdown table on GitHub. Use standard |-delimited table rows (single leading/trailing |) so the scope table displays correctly.
| ## Scope: where LLMs run in this project | ||
|
|
||
| The console codebase touches LLM capabilities in five places. This is the complete list as of the document's last update β if you are reviewing a PR that adds a new LLM surface, please update this table. | ||
|
|
||
| | Surface | Where | What triggers it | Who controls the input | What the LLM can do | | ||
| |---|---|---|---|---| | ||
| | Claude Code review | `.github/workflows/claude-code-review.yml` | Every PR | Any PR author (including forks) | Read repo, post review comments, no write access to main | | ||
| | auto-qa / auto-qa-tuner | `.github/workflows/auto-qa.yml`, `.github/workflows/auto-qa-tuner.yml` | Scheduled cron | Maintainers (workflow contents) + repo history | Open issues, propose patches | | ||
| | ai-fix / scanner workflows | `.github/workflows/ai-fix.yml` (currently disabled) and manually-dispatched scanner sessions | Manual or automated scheduling | Maintainers | Open PRs against branches | | ||
| | GA4 error monitor β issue pipeline | `.github/workflows/ga4-error-monitor.yml` | Hourly cron | Google Analytics 4 production event stream (real user traffic) | Open issues with attacker-influenceable text in the title/body | | ||
| | kc-agent + MCP handlers | `cmd/kc-agent/main.go`, `pkg/mcp/*` | User opens an agent session in their browser | The user running the session | Execute kubectl operations against the user's kubeconfig | | ||
|
|
There was a problem hiding this comment.
This section says there are βfiveβ LLM surfaces and that the table is the βcomplete listβ, but the repo also has .github/workflows/claude.yml which invokes anthropics/claude-code-action@v1 (another LLM-calling surface). Either add it to the table or loosen the βcomplete listβ wording so the doc stays accurate.
| The console codebase touches LLM capabilities in five places. This is the complete list as of the document's last update β if you are reviewing a PR that adds a new LLM surface, please update this table. | ||
|
|
||
| | Surface | Where | What triggers it | Who controls the input | What the LLM can do | | ||
| |---|---|---|---|---| | ||
| | Claude Code review | `.github/workflows/claude-code-review.yml` | Every PR | Any PR author (including forks) | Read repo, post review comments, no write access to main | | ||
| | auto-qa / auto-qa-tuner | `.github/workflows/auto-qa.yml`, `.github/workflows/auto-qa-tuner.yml` | Scheduled cron | Maintainers (workflow contents) + repo history | Open issues, propose patches | | ||
| | ai-fix / scanner workflows | `.github/workflows/ai-fix.yml` (currently disabled) and manually-dispatched scanner sessions | Manual or automated scheduling | Maintainers | Open PRs against branches | | ||
| | GA4 error monitor β issue pipeline | `.github/workflows/ga4-error-monitor.yml` | Hourly cron | Google Analytics 4 production event stream (real user traffic) | Open issues with attacker-influenceable text in the title/body | |
There was a problem hiding this comment.
The table and threat text treat ga4-error-monitor.yml, auto-qa.yml, and auto-qa-tuner.yml as βLLMβ workflows, but those workflows donβt call an LLM in-repo (they use scripts/gh and open issues/config commits). Consider reframing them as βautomation surfacesβ (LLM-adjacent) or explicitly distinguishing βcalls LLMβ vs βcreates artifacts consumed by LLM agentsβ.
| **Definition.** A single credential β the `CLAUDE_CODE_OAUTH_TOKEN` secret β currently powers every LLM-calling workflow in the repo. Compromise of that one secret grants the attacker the union of every workflow's capabilities. | ||
|
|
||
| **How it applies to console.** The secret is used by at least `claude-code-review.yml`, `auto-qa.yml`, `auto-qa-tuner.yml`, `ai-fix.yml`, and any manually-dispatched scanner workflows. If it's exfiltrated (fork leak, workflow log leak, supply-chain compromise of the `anthropics/claude-code-action` action itself), the attacker can post review comments, open issues, and potentially create branches on behalf of the account. |
There was a problem hiding this comment.
This section claims CLAUDE_CODE_OAUTH_TOKEN is used by auto-qa.yml, auto-qa-tuner.yml, and ai-fix.yml, but currently only claude-code-review.yml and claude.yml reference that secret. Update the list to match current secret usage (and/or call out the other secrets these workflows actually use).
| **Definition.** A single credential β the `CLAUDE_CODE_OAUTH_TOKEN` secret β currently powers every LLM-calling workflow in the repo. Compromise of that one secret grants the attacker the union of every workflow's capabilities. | |
| **How it applies to console.** The secret is used by at least `claude-code-review.yml`, `auto-qa.yml`, `auto-qa-tuner.yml`, `ai-fix.yml`, and any manually-dispatched scanner workflows. If it's exfiltrated (fork leak, workflow log leak, supply-chain compromise of the `anthropics/claude-code-action` action itself), the attacker can post review comments, open issues, and potentially create branches on behalf of the account. | |
| **Definition.** A shared credential β the `CLAUDE_CODE_OAUTH_TOKEN` secret β is currently referenced by multiple AI-related workflows in the repo, including `claude-code-review.yml` and `claude.yml`. Compromise of that secret grants the attacker the union of the capabilities exposed through the workflows that actually use it. | |
| **How it applies to console.** Based on current workflow references, `CLAUDE_CODE_OAUTH_TOKEN` is used by `claude-code-review.yml` and `claude.yml`. Other AI-related workflows such as `auto-qa.yml`, `auto-qa-tuner.yml`, `ai-fix.yml`, and manually-dispatched scanner sessions may use different credentials; they should be documented separately rather than being treated as consumers of this secret. If `CLAUDE_CODE_OAUTH_TOKEN` is exfiltrated (workflow log leak, supply-chain compromise of the `anthropics/claude-code-action` action itself, or another secret-handling failure), the attacker can act with the permissions available to the workflows that use it. |
β¦red output Fourth of four PRs from the fullsend-ai/fullsend automation evaluation. Adopts fullsend's "decomposed code review" pattern (docs/problems/code-review.md) β split review into specialized concerns (correctness, security, style) rather than one monolithic pass. Done as a single LLM call with structured output, not 3 parallel jobs, to keep token cost neutral. The prompt asks the existing /code-review:code-review plugin to organize its findings into three explicit sections with P0/P1/P2 priority tags, and to write "None." under any section that has nothing to report so it doesn't fabricate issues. The SECURITY section references docs/security/SECURITY-AI.md (added in PR #8249) so the reviewer explicitly watches for the six threat categories β external prompt injection, insider credentials, DoS, agent drift, supply chain, agent-to-agent injection β on any PR that touches LLM-calling code. Only change is the `prompt:` field in the existing `.github/workflows/claude-code-review.yml`. No new actions, no new secrets, no cost increase. Expected behavior after merge: - Every PR's Claude Code review comment now has a CORRECTNESS / SECURITY / STYLE structure instead of prose. - Every issue is tagged P0/P1/P2 so reviewers can triage quickly. - A pure doc PR should show "None." in all three sections, not fabricated nits. - A PR touching LLM-calling code should produce at least one item in SECURITY referencing prompt-injection risk. Signed-off-by: Andrew Anderson <[email protected]>
|
Thank you for your contribution! Your PR has been merged. Check out what's new:
Stay connected: Slack #kubestellar-dev | Multi-Cluster Survey |
β¦red output (#8253) * π docs(security): mark local LLM providers registered + active Updates SECURITY-MODEL.md Β§3 to reflect #8248, which registers Ollama, llama.cpp, LocalAI, vLLM, LM Studio, RHAIIS, Groq, OpenRouter and Open WebUI as chat-only agent providers in InitializeProviders. Changes: - Provider table flips the Registered column from "no" to "yes (chat only)" for the nine HTTP providers that are now wired into the agent dropdown, and adds rows for the six new local LLM runners with their env vars and default URLs. - Explains the chat-only capability flag and why missions still route through the tool-capable CLI agents (registry.go:303 rationale). - Adds a "Local LLM strategy" subsection that cross-links the docs.kubestellar.io local-llm-strategy page and the eight install missions on kubestellar/console-kb. - Replaces the "Planned follow-up" subsection with active recipes for each runner β Ollama loopback default, in-cluster Service URLs for llama.cpp/LocalAI/vLLM/RHAIIS, LM Studio workstation default, and Groq/OpenRouter/Open WebUI gateway overrides. The "# PLANNED β not yet wired at runtime" bash comments are removed. The threat model claims about kubeconfig and credentials staying out of the request body are unchanged and still authoritative. Signed-off-by: Andrew Anderson <[email protected]> * β¨ feat(ci): decomposed-review prompt for Claude Code review β structured output Fourth of four PRs from the fullsend-ai/fullsend automation evaluation. Adopts fullsend's "decomposed code review" pattern (docs/problems/code-review.md) β split review into specialized concerns (correctness, security, style) rather than one monolithic pass. Done as a single LLM call with structured output, not 3 parallel jobs, to keep token cost neutral. The prompt asks the existing /code-review:code-review plugin to organize its findings into three explicit sections with P0/P1/P2 priority tags, and to write "None." under any section that has nothing to report so it doesn't fabricate issues. The SECURITY section references docs/security/SECURITY-AI.md (added in PR #8249) so the reviewer explicitly watches for the six threat categories β external prompt injection, insider credentials, DoS, agent drift, supply chain, agent-to-agent injection β on any PR that touches LLM-calling code. Only change is the `prompt:` field in the existing `.github/workflows/claude-code-review.yml`. No new actions, no new secrets, no cost increase. Expected behavior after merge: - Every PR's Claude Code review comment now has a CORRECTNESS / SECURITY / STYLE structure instead of prose. - Every issue is tagged P0/P1/P2 so reviewers can triage quickly. - A pure doc PR should show "None." in all three sections, not fabricated nits. - A PR touching LLM-calling code should produce at least one item in SECURITY referencing prompt-injection risk. Signed-off-by: Andrew Anderson <[email protected]> --------- Signed-off-by: Andrew Anderson <[email protected]>
|
Post-merge build verification passed β Both Go and frontend builds compiled successfully against merge commit |
β¦odel links Follow-up to #8210 which added the Security posture section to the install modal and a Security tab to the mission detail view. The original PR linked to the source-grounded repo version of SECURITY-MODEL.md β functional but less user-friendly than the rendered docs site, and missed the AI threat model entirely. Changes: SetupInstructionsDialog.tsx (Run KubeStellar Console Locally modal): - Primary security link now points at https://kubestellar.io/docs/console/main/console/security-model/ (rendered docs site, main version). - Added AI automation threat model link (SECURITY-AI.md from #8249) to surface prompt-injection / supply-chain / agent-drift concerns. - Kept the repo version as a secondary "source-grounded" link with smaller muted styling β useful for readers who want the exact file/line claims SECURITY-MODEL.md makes. MissionDetailView.tsx (mission security tab): - Same docs.kubestellar.io primary URL swap. - Added AI threat model link both to the populated-tab footer and the empty-state fallback. Three new URL constants replace the single hardcoded GH link: SECURITY_DOC_URL = docs.kubestellar.io (primary) SECURITY_DOC_REPO_URL = github.com/.../SECURITY-MODEL.md (secondary) SECURITY_AI_DOC_URL = github.com/.../SECURITY-AI.md Signed-off-by: Andrew Anderson <[email protected]>
β¦odel links (#8348) Follow-up to #8210 which added the Security posture section to the install modal and a Security tab to the mission detail view. The original PR linked to the source-grounded repo version of SECURITY-MODEL.md β functional but less user-friendly than the rendered docs site, and missed the AI threat model entirely. Changes: SetupInstructionsDialog.tsx (Run KubeStellar Console Locally modal): - Primary security link now points at https://kubestellar.io/docs/console/main/console/security-model/ (rendered docs site, main version). - Added AI automation threat model link (SECURITY-AI.md from #8249) to surface prompt-injection / supply-chain / agent-drift concerns. - Kept the repo version as a secondary "source-grounded" link with smaller muted styling β useful for readers who want the exact file/line claims SECURITY-MODEL.md makes. MissionDetailView.tsx (mission security tab): - Same docs.kubestellar.io primary URL swap. - Added AI threat model link both to the populated-tab footer and the empty-state fallback. Three new URL constants replace the single hardcoded GH link: SECURITY_DOC_URL = docs.kubestellar.io (primary) SECURITY_DOC_REPO_URL = github.com/.../SECURITY-MODEL.md (secondary) SECURITY_AI_DOC_URL = github.com/.../SECURITY-AI.md Signed-off-by: Andrew Anderson <[email protected]>
Summary
First of four PRs from a fullsend-ai/fullsend vs console-automation evaluation.
`docs/security/SECURITY-MODEL.md` covers runtime web threats well but says nothing about the LLM attack surfaces this project already ships: Claude Code review on every PR, auto-qa / auto-qa-tuner cron jobs, the GA4 β GitHub issue pipeline, kc-agent + MCP handlers. This PR adds a dedicated `SECURITY-AI.md` sibling doc adapting the threat taxonomy from fullsend-ai/fullsend.
What it contains
Cross-references added
What's in the next 3 PRs
This is part of a 4-PR series based on the fullsend evaluation:
Zero runtime changes
Pure documentation. No code paths modified. No existing behavior changes.
Test plan
π€ Generated with Claude Code