Security Model
Unleashed's primary threat is accidental destruction by a hallucinating LLM — not a human attacker with network access. Claude Code can:
- Delete files outside the project directory
- Overwrite system configuration
- Exfiltrate secrets via Bash commands (e.g., `curl` with env vars)
- Run destructive git operations (`push --force`, `reset --hard`)
These failures are probabilistic (LLM hallucination), not deterministic (targeted attack). The safety system is designed for this threat model; it does not try to stop a sophisticated attacker who controls the terminal.
| Asset | Value | Threat |
|---|---|---|
| Project source code | High | Accidental deletion, destructive git operations |
| System files | Critical | Writes to C:\Windows, Program Files, AppData |
| User data outside projects | High | OneDrive, Dropbox, personal documents |
| API keys and secrets | Critical | Exfiltration via curl, env var logging |
| Git history | High | Force push, hard reset |
The following threats are out of scope:
- Targeted attack on unleashed itself — an attacker with shell access can just kill the process
- Malicious Claude Code updates — we trust Anthropic's distribution
- Side-channel attacks — timing, power analysis, etc.
- Social engineering — the user is the only operator
```mermaid
flowchart TD
    subgraph TRUSTED ["Trusted (User's Machine)"]
        direction TB
        USER["User"]
        UNLEASHED["Unleashed Process"]
        CC["Claude Code Process"]
        FS["File System<br/>(within safe_paths)"]
    end
    subgraph SEMI ["Semi-Trusted (Boundaries)"]
        direction TB
        FSOUT["File System<br/>(outside safe_paths)"]
        GIT["Git Remote<br/>(push operations)"]
    end
    subgraph UNTRUSTED ["Untrusted (External)"]
        direction TB
        API["Anthropic API<br/>(sentinel Haiku calls)"]
        WEB["Internet<br/>(curl, wget, WebFetch)"]
        LLM["LLM Output<br/>(Claude's responses)"]
    end
    USER --> UNLEASHED
    UNLEASHED --> CC
    CC --> FS
    CC --> FSOUT
    CC --> GIT
    CC --> WEB
    UNLEASHED -->|"sentinel check"| API
    CC -.->|"generates"| LLM
    LLM -.->|"could contain<br/>prompt injection"| CC
    style UNTRUSTED fill:#ef4444,stroke:#991b1b,color:#fff
    style SEMI fill:#fbbf24,stroke:#92400e,color:#000
    style TRUSTED fill:#4ade80,stroke:#166534,color:#000
```
| Destination | Data Sent | When | Privacy Impact |
|---|---|---|---|
| Anthropic API (Haiku) | Tool type + command text (first 500 chars) + CWD | Every sentinel UNCERTAIN verdict | Low — command text only, no file contents |
| Git remote | Committed code | `git push` (auto-approved) | Project-scoped |
| Web (curl, wget, etc.) | Whatever the command specifies | When Claude runs web-fetching commands | Depends on command — sentinel evaluates |
Critical: Sentinel API calls send command text to Anthropic. If the command contains inline secrets (e.g., curl -H "Authorization: Bearer sk-...") those secrets reach the Haiku API. This is inherent to the evaluation model — sentinel must see the command to evaluate it. See "Privacy Considerations" below.
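To make the data flow concrete, here is a minimal sketch of what a sentinel request body could look like. `SentinelPayload` and `build_payload` are hypothetical names, not taken from the Unleashed source; only the three fields and the 500-character limit come from the table above.

```python
from dataclasses import dataclass, asdict

MAX_COMMAND_CHARS = 500  # limit stated in the data-flow table


@dataclass
class SentinelPayload:
    """Hypothetical container for the fields sent to the Haiku API."""
    tool_type: str
    command: str
    cwd: str


def build_payload(tool_type: str, command: str, cwd: str) -> dict:
    # Only the first 500 characters of the command are included.
    # No file contents are attached, but the text is not scrubbed:
    # an inline secret inside that prefix is sent as-is.
    truncated = command[:MAX_COMMAND_CHARS]
    return asdict(SentinelPayload(tool_type, truncated, cwd))
```

Truncation bounds the payload size but does nothing for secrets that appear early in the command, which is why the note above is marked critical.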
| Issue | Severity | Status | Description |
|---|---|---|---|
| #38 | P0 Critical | Open | API key in sentinel exception messages — str(e) may contain API key from HTTP headers |
| #41 | P0 Critical | Open | `in_approval` stuck flag: no timeout, could freeze all approvals |
| #39 | P2 | Open | Permission pattern spoofing — LLM output could trigger false approval |
| #42 | P1 | Open | Bare except: break in stdin reader — silent thread death |
| #43 | P1 | Open | Context buffer too small — sentinel may miss tool type |
| #3 | P2 | Open | API key isolation causes silent hang in standalone sentinel |
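One possible direction for #41 is to give the flag an expiry so a missed reset cannot block approvals forever. The sketch below is illustrative only: `ApprovalFlag`, `timeout_s`, and the 30-second default are assumptions and do not describe the current Unleashed code or a planned fix.

```python
import time


class ApprovalFlag:
    """Hypothetical stand-in for a bare boolean in_approval flag."""

    def __init__(self, timeout_s: float = 30.0, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock          # injectable so the timeout can be tested
        self._started_at = None

    def set(self) -> None:
        self._started_at = self._clock()

    def clear(self) -> None:
        self._started_at = None

    def is_set(self) -> bool:
        if self._started_at is None:
            return False
        if self._clock() - self._started_at > self._timeout_s:
            # Stale flag: treat as released so approvals can resume.
            self._started_at = None
            return False
        return True
```

Using a monotonic clock avoids false expiry or false persistence when the wall clock is adjusted.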
See Sentinel Safety Gate for full details.
- Local rules: 19 hard block patterns, 12 safe patterns, path-based rules
- API evaluation: Haiku LLM evaluates ambiguous commands
- Fail-open: API errors don't freeze the session
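The three layers above can be sketched as a single decision function. This is a minimal illustration, assuming that local rules return an UNCERTAIN verdict for ambiguous commands and that "fail-open" means an API error results in approval; `Verdict`, `evaluate`, `local_check`, and `api_check` are hypothetical names.

```python
from enum import Enum
from typing import Callable


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    UNCERTAIN = "uncertain"


def evaluate(command: str,
             local_check: Callable[[str], Verdict],
             api_check: Callable[[str], Verdict]) -> Verdict:
    verdict = local_check(command)
    if verdict is not Verdict.UNCERTAIN:
        return verdict  # local rules are decisive, no API call is made
    try:
        return api_check(command)
    except Exception:
        # Fail-open (assumed meaning): an API failure must not
        # freeze the session, so the command is allowed.
        return Verdict.ALLOW
```

Note that hard blocks are decided locally, so an API outage never weakens them; fail-open only applies to commands the local rules could not classify.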
Operations are allowed or blocked based on directory:
- Safe paths: `C:\Users\mcwiz\Projects\`, `/c/Users/mcwiz/Projects/`
- Excluded paths: OneDrive, AppData, `.cache`, `.local`, Dropbox, Google Drive, Windows, Program Files
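A path check along these lines could look like the sketch below. It assumes that excluded names are matched against whole path components and that Git Bash style paths (`/c/...`) map onto drive letters; `normalize` and `is_path_allowed` are hypothetical names, and the real matching rules may differ.

```python
import ntpath
from pathlib import PureWindowsPath

SAFE_ROOT = PureWindowsPath(r"C:\Users\mcwiz\Projects")
EXCLUDED_PARTS = {"onedrive", "appdata", ".cache", ".local", "dropbox",
                  "google drive", "windows", "program files"}


def normalize(path: str) -> PureWindowsPath:
    # Map Git Bash style /c/Users/... onto C:/Users/...
    if len(path) >= 3 and path[0] == "/" and path[1].isalpha() and path[2] == "/":
        path = f"{path[1].upper()}:{path[2:]}"
    # normpath collapses ".." so traversal cannot escape the safe root.
    return PureWindowsPath(ntpath.normpath(path))


def is_path_allowed(path: str) -> bool:
    p = normalize(path)
    parts = {part.lower() for part in p.parts}
    if parts & EXCLUDED_PARTS:
        return False  # exclusions win even inside the safe root
    return p == SAFE_ROOT or SAFE_ROOT in p.parents
```

This sketch is purely lexical: it does not resolve symlinks or junctions, and it would miss variants such as a OneDrive folder with a suffix in its name.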
19 regex patterns for always-dangerous commands: dd, mkfs, shred, format, git push --force, git reset --hard, rm -rf /, etc.
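For illustration, a handful of hard-block patterns might be written as follows. These six regexes are assumptions made for this sketch, not the 19 patterns shipped with Unleashed, and `is_hard_blocked` is a hypothetical name.

```python
import re

# Illustrative subset only; the real pattern list is not reproduced here.
HARD_BLOCK_PATTERNS = [
    re.compile(r"(^|[\s;&|])dd\s+.*\bof="),              # dd writing to a target
    re.compile(r"\bmkfs(\.\w+)?\b"),                      # mkfs, mkfs.ext4, ...
    re.compile(r"\bshred\b"),
    re.compile(r"\bgit\s+push\b.*(--force\b|\s-f\b)"),    # force push
    re.compile(r"\bgit\s+reset\s+--hard\b"),
    re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f[a-zA-Z]*\s+/(\s|$)"),  # rm -rf /
]


def is_hard_blocked(command: str) -> bool:
    """Return True if any hard-block pattern matches the command."""
    return any(p.search(command) for p in HARD_BLOCK_PATTERNS)
```

Regex matching on command text is easy to evade deliberately (for example `rm -fr /` is not covered here), which fits the stated threat model of accidental damage rather than a determined attacker.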
A 256-byte overlap between consecutive PTY read chunks ensures that a permission pattern split across a read boundary is still detected.
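The overlap technique can be sketched as a small scanner that keeps the tail of the previous chunk and searches the tail plus the new chunk together. `OverlapScanner` and the prompt text in the usage example are hypothetical; only the 256-byte figure comes from the description above.

```python
OVERLAP_BYTES = 256  # overlap size stated above


class OverlapScanner:
    """Detect a byte pattern even when it straddles two read chunks."""

    def __init__(self, pattern: bytes, overlap: int = OVERLAP_BYTES):
        if overlap <= 0:
            raise ValueError("overlap must be positive")
        self._pattern = pattern
        self._overlap = overlap
        self._tail = b""

    def feed(self, chunk: bytes) -> bool:
        window = self._tail + chunk
        idx = window.find(self._pattern)
        if idx >= 0:
            # Keep only bytes after the match so the same prompt
            # is not reported again on the next chunk.
            rest = window[idx + len(self._pattern):]
            self._tail = rest[-self._overlap:]
            return True
        self._tail = window[-self._overlap:]
        return False
```

A short usage example with a prompt split across two reads:

```python
scanner = OverlapScanner(b"Do you want to proceed?")
scanner.feed(b"...output... Do you wa")   # False, prompt is incomplete
scanner.feed(b"nt to proceed? [y/n]")     # True, found across the boundary
```

The guarantee only holds for patterns no longer than the overlap plus one byte, so the overlap has to be sized against the longest prompt being matched.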
- Sentinel API calls: Command text (up to 500 chars) is sent to Anthropic's Haiku API for evaluation. This includes Bash commands, file paths, and command arguments. No file contents are sent.
- Mirror transcripts: Stored locally in `logs/`. Contain a filtered view of the session. Not transmitted anywhere.
- Shadow logs: Stored locally in `logs/`. Contain tool type and arguments for every detected permission prompt.
- Friction logs: Stored locally as JSONL. Contain permission prompt timing and verdict data.
- No telemetry: Unleashed sends no analytics, usage data, or crash reports.
| Key | Storage | Used By |
|---|---|---|
| `ANTHROPIC_API_KEY` | Environment variable (via `~/.agentos_secrets`) | Claude Code |
| `AGENTOS_SENTINEL_KEY` | Environment variable (via `~/.agentos_secrets`) | Sentinel gate |
Both keys are read from environment variables, which `.bash_profile` populates by sourcing `~/.agentos_secrets`. The secrets file is not version-controlled.
Known leak path (#38): When the Anthropic SDK raises an exception, str(e) may include HTTP request headers containing the API key. The current _api_check() returns str(e) as the error reason, which propagates to stderr and log files.
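One way to narrow this leak path is to scrub the exception text before it is returned as the error reason. The sketch below is not the current `_api_check()` behavior or a committed fix; `safe_error_reason` is a hypothetical helper, and the `sk-` prefix pattern is an assumption based on the key shape shown earlier on this page.

```python
import re

# Assumed key shape: tokens that start with "sk-" (illustrative only).
_KEY_PATTERN = re.compile(r"sk-[A-Za-z0-9_\-]{8,}")


def safe_error_reason(exc: Exception, known_secrets: tuple = ()) -> str:
    """Return an error string with key-like substrings redacted."""
    text = f"{type(exc).__name__}: {exc}"
    # First remove exact values of secrets the process already knows.
    for secret in known_secrets:
        if secret:
            text = text.replace(secret, "[REDACTED]")
    # Then catch anything that merely looks like a key.
    return _KEY_PATTERN.sub("[REDACTED]", text)
```

Passing the live key values as `known_secrets` covers keys that do not match the pattern. A stricter alternative is to log only the exception type and drop the message entirely.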
See For Security Reviewers for a structured review guide.