Permission patterns can be spoofed by LLM-generated output #39

@martymcenroe

Summary

Unleashed detects Claude Code's permission prompts by matching raw PTY output against four byte-string patterns (PERMISSION_PATTERNS at ~line 280 of unleashed-c-21.py):

PERMISSION_PATTERNS = [
    b"Allow this action?",
    b"Allow tool?",
    b"Do you want to proceed?",
    b"Press Enter to allow",
]

These patterns are matched against the full PTY output stream, which includes both Claude Code's UI chrome and the LLM's generated text. If Claude's response contains one of these exact strings (for example in a code comment, in documentation, or via prompt injection from a malicious file), unleashed treats it as a real permission prompt and sends a CR keystroke.
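A minimal sketch of the failure mode, assuming the matching amounts to a substring test over each PTY chunk. The helper name `looks_like_permission_prompt` and the sample chunk are hypothetical and are not taken from unleashed-c-21.py:

```python
PERMISSION_PATTERNS = [
    b"Allow this action?",
    b"Allow tool?",
    b"Do you want to proceed?",
    b"Press Enter to allow",
]


def looks_like_permission_prompt(chunk: bytes) -> bool:
    """Naive check: True if any pattern occurs anywhere in the chunk."""
    return any(pattern in chunk for pattern in PERMISSION_PATTERNS)


# Hypothetical PTY chunk: model output quoting a file, not a real UI prompt.
echoed_text = b"# README says: Allow this action? rm -rf build/\r\n"

# The substring test cannot tell echoed text from UI chrome, so it fires.
print(looks_like_permission_prompt(echoed_text))  # prints True
```

Nothing in the check looks at where the bytes came from, so any text that reaches the terminal can trigger it.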

Attack Scenario

  1. A malicious README.md or source file contains the text "Allow this action?" followed by a description of a destructive command
  2. Claude reads the file and echoes its contents in the terminal
  3. Unleashed matches the pattern in the echoed output
  4. do_approval() fires, sending a CR to the PTY
  5. If Claude happens to be at any interactive prompt at that moment, the CR could approve something unintended

Why This Is Low-Probability but Worth Tracking

  • Claude Code's Ink-based UI renders permission prompts with specific ANSI formatting (colors, positioning) that raw text wouldn't have
  • The pattern match would need to align with the in_approval flag timing
  • The context_buffer extraction would likely produce garbage context, not a real tool invocation

But the fundamental issue remains: the permission detection is purely text-based with no structural verification. A more robust approach would anchor patterns to specific ANSI sequences that only Claude Code's UI framework emits.
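One way the anchoring idea could look, as a sketch only. It assumes the real prompt text is immediately preceded by an SGR (color/style) escape sequence. The exact codes Claude Code emits are not documented here, so the pattern below accepts any SGR sequence, and the names `ANCHORED_PROMPT` and `is_anchored_prompt` are hypothetical:

```python
import re

# ASSUMPTION: the genuine prompt is styled, so an SGR escape sequence
# (ESC [ ... m) appears directly before the question text. A real
# implementation would pin the specific codes observed from Claude Code.
SGR = rb"\x1b\[[0-9;]*m"

ANCHORED_PROMPT = re.compile(
    SGR
    + rb"\s*(?:Allow this action\?|Allow tool\?"
    + rb"|Do you want to proceed\?|Press Enter to allow)"
)


def is_anchored_prompt(chunk: bytes) -> bool:
    """True only if a pattern appears right after an SGR sequence."""
    return ANCHORED_PROMPT.search(chunk) is not None


styled = b"\x1b[1;33mAllow this action?\x1b[0m\r\n"  # hypothetical UI bytes
plain = b"# README says: Allow this action? rm -rf build/\r\n"

print(is_anchored_prompt(styled))
print(is_anchored_prompt(plain))
```

Accepting any SGR sequence is weaker than matching specific codes: model output that the terminal UI renders with its own styling could still satisfy it, so this raises the bar rather than closing the gap.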

Proposed Mitigation

  1. Short term: Add ANSI anchor sequences to patterns (e.g., match the specific color codes Claude Code uses before "Allow this action?")
  2. Medium term: Look for the full permission prompt structure (the Yes/No option rendering, not just the question text)
  3. Long term: Claude Code should expose a machine-readable permission event (IPC, file watcher, or structured output) rather than requiring screen-scraping
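The medium-term idea could be sketched as a structural check: the question text counts only if Yes and No option lines follow shortly after it. The option labels, the numbering format, and the 400-byte window are assumptions for illustration, not observed Claude Code output, and `has_prompt_structure` is a hypothetical helper:

```python
import re

# Strips CSI escape sequences (colors, cursor movement) before matching.
ANSI_ESCAPE = re.compile(rb"\x1b\[[0-9;?]*[A-Za-z]")

QUESTIONS = (
    b"Allow this action?",
    b"Allow tool?",
    b"Do you want to proceed?",
)

# ASSUMPTION: options render on their own lines as "Yes" / "No",
# optionally numbered ("1. Yes") and preceded by a cursor glyph.
OPTION_YES = re.compile(rb"^\W*(?:\d\.\s*)?Yes\b", re.MULTILINE)
OPTION_NO = re.compile(rb"^\W*(?:\d\.\s*)?No\b", re.MULTILINE)

WINDOW = 400  # bytes after the question in which options must appear


def has_prompt_structure(buffer: bytes) -> bool:
    """True if a question is followed by both a Yes and a No option line."""
    text = ANSI_ESCAPE.sub(b"", buffer).replace(b"\r\n", b"\n")
    for question in QUESTIONS:
        start = text.find(question)
        if start == -1:
            continue
        end = start + len(question)
        tail = text[end:end + WINDOW]
        if OPTION_YES.search(tail) and OPTION_NO.search(tail):
            return True
    return False


real = (
    b"\x1b[1mDo you want to proceed?\x1b[0m\r\n"
    b"\x1b[36m> 1. Yes\x1b[0m\r\n"
    b"  2. No\r\n"
)
echoed = b"Allow this action? rm -rf build/\r\nNothing else here\r\n"

print(has_prompt_structure(real))
print(has_prompt_structure(echoed))
```

A file that reproduces the whole prompt layout would still pass this check, which is why the long-term item (a machine-readable permission event) remains the more robust fix.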

Affected Files

  • src/unleashed-c-21.py: PERMISSION_PATTERNS definition and _reader_pty() matching logic

Context

This is inherent to the screen-scraping approach that unleashed uses. The PTY wrapper reads raw terminal bytes and pattern-matches — it has no semantic understanding of whether a matched string is a real UI element or echoed text. The sentinel integration (issue #12) mitigates the consequences of false approvals but doesn't prevent the false detection itself.

Labels

  • on-ice: Feature/fix on hold, not in active development
  • security: Security vulnerability or hardening
