Introduction
Open-source security for AI agents. Scan prompts, guard machines, shield in real-time.
pip install agentseal
agentseal guardThe problem
Every AI agent runs on a system prompt. That prompt holds your business logic, safety rules, and behavioral constraints. The problem is that attackers can steal it, override it, or weaponize the tools your agent connects to.
Prompt extraction leaks your intellectual property. Prompt injection turns your agent into a tool for the attacker. MCP tool poisoning means the servers your agent trusts could be feeding it malicious instructions. And skill files on your machine could be rewriting your agent's behavior without you knowing.
What AgentSeal does
AgentSeal is a security toolkit with four core capabilities. Each one addresses a different layer of the AI agent attack surface.
| Command | What it does | No API key needed |
|---|---|---|
| agentseal scan | Tests your agent with 225+ attack probes to find prompt extraction and injection vulnerabilities | No (needs model access) |
| agentseal guard | Scans your entire machine for dangerous skill files, poisoned MCP configs, and hidden threats | Yes |
| agentseal shield | Watches your machine in real-time and alerts you when skill files or MCP configs change | Yes |
| agentseal scan-mcp | Connects to running MCP servers, analyzes tool definitions, detects toxic flows and rug pulls | Yes |
Tip
guard, shield, and scan-mcp work without any API keys or model access. They analyze files and tool definitions locally using pattern matching, deobfuscation, and semantic analysis.How scanning works
The scan command runs a four-phase behavioral pipeline against your agent: extraction, injection, data extraction, and boundary integrity. Every probe is deterministic and carries a unique canary. Same input, same result, every run. No LLM judges your output, so findings are reproducible and safe to include in CI.
┌──────────────┐ ┌──────────────┐
│ │ Phase 1 · Extraction (82) │ │
│ │ ──────────────────────────────────▶ │ │
│ │ Phase 2 · Injection (143) │ │
│ AgentSeal │ ──────────────────────────────────▶ │ Your Agent │
│ Scanner │ Phase 3 · Data extraction │ │
│ │ ──────────────────────────────────▶ │ │
│ │ Phase 4 · Boundary integrity │ │
│ │ ──────────────────────────────────▶ │ │
│ │ ◀──────────── responses ─────────── │ │
└──────────────┘ └──────────────┘
│
▼
┌───────────────────────────────────────────────────────────────────┐
│ Deterministic detection │
│ • N-gram matching → extraction leaks │
│ • Canary token detection → injection success │
│ • Response classifier → partial leaks, refusals, errors │
│ • Defense fingerprinting → known guard bypass attempts │
│ • Adaptive mutation → blocked probes re-run transformed │
└───────────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────────┐
│ Trust score (5-category weighted) │
│ • Extraction resistance 30% │
│ • Injection resistance 25% │
│ • Data extraction resistance 20% │
│ • Boundary integrity 15% │
│ • Consistency 10% │
│ ───────────────────────────────── │
│ 0–100 · CRITICAL → LOW → MEDIUM → HIGH → EXCELLENT │
└───────────────────────────────────────────────────────────────────┘
Free tier: 225 base probes (82 extraction + 143 injection)
Pro adds 86 more: MCP tool poisoning (45) · RAG poisoning (28) · Multimodal (13)
Full attack surface: 311 probesHow guarding works
The guard command scans your machine without needing any API keys. It discovers 17+ types of AI agents (Claude, Cursor, Windsurf, Cline, Copilot, and more), reads their skill files and MCP configurations, and runs a multi-layer analysis pipeline:
- Pattern detection - Regex scanning for injection markers, exfiltration patterns, credential theft
- Blocklist matching - SHA-256 hashes of known malicious skill files
- Deobfuscation - Strips Unicode tag characters (U+E0001-E007F, ASCII smuggling), zero-width characters (U+200B/200D, keyword obfuscation), Base64 payloads, variation selectors, and BiDi controls before analysis
- Semantic analysis - MiniLM embeddings detect paraphrased threats (optional)
- LLM judge - BYOK deep analysis of suspicious files (optional)
- MCP config checking - 6 static checks on MCP server configurations
- Toxic flow detection - Finds dangerous capability combinations across servers
- Baseline tracking - Detects config changes and rug pulls between scans
Scan modes at a glance
| Command | Probes | What it tests | Tier |
|---|---|---|---|
| agentseal scan | 225 | Base scan: 82 extraction + 143 injection probes | Free |
| agentseal scan --adaptive | 225+ | + adaptive mutation transforms on blocked probes | Free |
| agentseal watch | 5 | Canary regression scan with baseline comparison | Free |
| agentseal guard | N/A | Machine-wide skill file and MCP config scan | Free |
| agentseal shield | N/A | Real-time file monitoring with desktop notifications | Free |
| agentseal scan-mcp | N/A | Runtime MCP server analysis, toxic flows, rug pulls | Free |
| agentseal scan --mcp | +45 | + MCP tool poisoning probes | Pro |
| agentseal scan --rag | +28 | + RAG poisoning probes | Pro |
| agentseal scan --multimodal | +13 | + multimodal attack probes | Pro |
| agentseal scan --genome | varies | + behavioral genome mapping | Pro |
Free vs Pro
| Feature | Free | Pro |
|---|---|---|
| 225 base attack probes (82 extraction + 143 injection) | Yes | Yes |
| Machine guard (agentseal guard) | Yes | Yes |
| Skill file scanner (agentseal scan-skills) | Yes | Yes |
| MCP runtime scanner (agentseal scan-mcp) | Yes | Yes |
| Real-time shield (agentseal shield) | Yes | Yes |
| Canary regression watch (agentseal watch) | Yes | Yes |
| Fix and quarantine (agentseal fix) | Yes | Yes |
| Scan comparison (agentseal compare) | Yes | Yes |
| Adaptive mutations (--adaptive) | Yes | Yes |
| Semantic detection (--semantic) | Yes | Yes |
| Terminal, JSON, SARIF, and JUnit output | Yes | Yes |
| CI/CD integration (--min-score, --fail-on) | Yes | Yes |
| Defense fingerprinting | Yes | Yes |
| Interactive setup wizard (agentseal setup) | Yes | Yes |
| Scan workflows and saved configs (agentseal workflow, run) | Yes | Yes |
| MCP tool poisoning probes (scan --mcp, +45) | - | Yes |
| RAG poisoning probes (scan --rag, +28) | - | Yes |
| Multimodal attack probes (Pro scan profile, +13) | - | Yes |
| Behavioral genome mapping (scan --genome) | - | Yes |
| Deep LLM-verified analysis (guard / scan-skills / scan-mcp --deep) | - | Yes |
| PDF security assessment report (scan --report) | - | Yes |
| Dashboard upload and historical tracking (scan --upload) | - | Yes |
| Scheduled recurring scans | - | Yes |
Design principles
- Deterministic - No LLM judges. N-gram matching and canary tokens give you the same result every time.
- Privacy-first - System prompts never leave your machine. The dashboard only receives a SHA-256 hash.
- Reproducible - Probes are hardcoded, not randomly generated. Every scan is auditable.
- Open source - The core scanner is MIT licensed. No vendor lock-in.
- Works offline - Guard, shield, and scan-mcp run entirely locally with no API calls.
Tip

