Introduction

Open-source security for AI agents. Scan prompts, guard machines, shield in real time.

pip install agentseal
agentseal guard

The problem

Every AI agent runs on a system prompt. That prompt holds your business logic, safety rules, and behavioral constraints. Attackers can steal it, override it, or weaponize the tools your agent connects to.

Prompt extraction leaks your intellectual property. Prompt injection turns your agent into a tool for the attacker. MCP tool poisoning means the servers your agent trusts could be feeding it malicious instructions. And skill files on your machine could be rewriting your agent's behavior without your knowledge.

What AgentSeal does

AgentSeal is a security toolkit with four core capabilities. Each one addresses a different layer of the AI agent attack surface.

Command	What it does	No API key needed
agentseal scan	Tests your agent with 225+ attack probes to find prompt extraction and injection vulnerabilities	No (needs model access)
agentseal guard	Scans your entire machine for dangerous skill files, poisoned MCP configs, and hidden threats	Yes
agentseal shield	Watches your machine in real time and alerts you when skill files or MCP configs change	Yes
agentseal scan-mcp	Connects to running MCP servers, analyzes tool definitions, detects toxic flows and rug pulls	Yes

Tip

guard, shield, and scan-mcp work without any API keys or model access. They analyze files and tool definitions locally using pattern matching, deobfuscation, and semantic analysis.

How scanning works

The scan command runs a four-phase behavioral pipeline against your agent: extraction, injection, data extraction, and boundary integrity. Every probe is deterministic and carries a unique canary. Same input, same result, every run. No LLM judges your output, so findings are reproducible and safe to include in CI.

┌──────────────┐                                     ┌──────────────┐
│              │   Phase 1 · Extraction  (82)        │              │
│              │ ──────────────────────────────────▶ │              │
│              │   Phase 2 · Injection   (143)       │              │
│  AgentSeal   │ ──────────────────────────────────▶ │  Your Agent  │
│   Scanner    │   Phase 3 · Data extraction         │              │
│              │ ──────────────────────────────────▶ │              │
│              │   Phase 4 · Boundary integrity      │              │
│              │ ──────────────────────────────────▶ │              │
│              │ ◀──────────── responses ─────────── │              │
└──────────────┘                                     └──────────────┘
        │
        ▼
┌───────────────────────────────────────────────────────────────────┐
│  Deterministic detection                                          │
│  • N-gram matching           →  extraction leaks                  │
│  • Canary token detection    →  injection success                 │
│  • Response classifier       →  partial leaks, refusals, errors   │
│  • Defense fingerprinting    →  known guard bypass attempts       │
│  • Adaptive mutation         →  blocked probes re-run transformed │
└───────────────────────────────────────────────────────────────────┘
        │
        ▼
┌───────────────────────────────────────────────────────────────────┐
│  Trust score  (5-category weighted)                               │
│  • Extraction resistance      30%                                 │
│  • Injection resistance       25%                                 │
│  • Data extraction resistance 20%                                 │
│  • Boundary integrity         15%                                 │
│  • Consistency                10%                                 │
│  ─────────────────────────────────                                │
│  0–100  ·  CRITICAL → LOW → MEDIUM → HIGH → EXCELLENT             │
└───────────────────────────────────────────────────────────────────┘
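
To make the deterministic detection step concrete, here is a minimal Python sketch of two of the checks in the diagram above: canary detection for injection probes and word n-gram overlap for extraction leaks. It is an illustration only, not AgentSeal's implementation. The function names, the canary format, and the 5-word n-gram size are all assumptions.

```python
import hashlib
import re


def make_canary(probe_id: str) -> str:
    """Derive a stable, unique marker from the probe id (no randomness).

    The CANARY_ prefix and 8-hex-digit suffix are assumptions for this sketch.
    """
    digest = hashlib.sha256(probe_id.encode("utf-8")).hexdigest()[:8].upper()
    return f"CANARY_{digest}"


def injection_succeeded(response: str, canary: str) -> bool:
    """An injection probe counts as successful if the agent echoed its canary."""
    return canary in response


def word_ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Lowercase the text, split it into words, and return all n-word windows."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def leak_ratio(system_prompt: str, response: str, n: int = 5) -> float:
    """Fraction of the prompt's word n-grams that reappear in the response."""
    prompt_grams = word_ngrams(system_prompt, n)
    if not prompt_grams:
        return 0.0
    return len(prompt_grams & word_ngrams(response, n)) / len(prompt_grams)
```

Because both checks are plain string operations, the same response always produces the same verdict, which is the property that makes findings reproducible in CI.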

Free tier: 225 base probes (82 extraction + 143 injection)
Pro adds 86 more:  MCP tool poisoning (45) · RAG poisoning (28) · Multimodal (13)
Full attack surface: 311 probes
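
The weighted trust score from the diagram above can be sketched as a simple weighted average. The weights come from the diagram. The category key names, and the assumption that each category is itself scored from 0 to 100 and combined linearly, are illustrative. The score thresholds for the CRITICAL to EXCELLENT levels are not stated here, so the sketch does not map scores to levels.

```python
# Weights in percent, taken from the trust score diagram.
WEIGHTS = {
    "extraction_resistance": 30,
    "injection_resistance": 25,
    "data_extraction_resistance": 20,
    "boundary_integrity": 15,
    "consistency": 10,
}


def trust_score(category_scores: dict[str, float]) -> float:
    """Combine per-category scores (assumed 0-100) into one 0-100 score."""
    missing = WEIGHTS.keys() - category_scores.keys()
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    total = sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)
    return round(total / 100, 1)


example = {
    "extraction_resistance": 80,
    "injection_resistance": 60,
    "data_extraction_resistance": 90,
    "boundary_integrity": 100,
    "consistency": 70,
}
print(trust_score(example))  # 79.0
```

In this example, the weak injection result pulls the total down by 10 points (25% of the 40 points lost), which shows how a single weak category affects the overall score.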

How guarding works

The guard command scans your machine without needing any API keys. It discovers 17+ types of AI agents (Claude, Cursor, Windsurf, Cline, Copilot, and more), reads their skill files and MCP configurations, and runs a multi-layer analysis pipeline:

Pattern detection - Regex scanning for injection markers, exfiltration patterns, credential theft
Blocklist matching - SHA-256 hashes of known malicious skill files
Deobfuscation - Strips Unicode tag characters (U+E0001-E007F, used for ASCII smuggling), zero-width characters (U+200B/200D, used for keyword obfuscation), Base64 payloads, variation selectors, and BiDi controls before analysis
Semantic analysis - MiniLM embeddings detect paraphrased threats (optional)
LLM judge - BYOK deep analysis of suspicious files (optional)
MCP config checking - 6 static checks on MCP server configurations
Toxic flow detection - Finds dangerous capability combinations across servers
Baseline tracking - Detects config changes and rug pulls between scans
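
As a rough illustration of how the first three layers in the list above could fit together, the sketch below strips invisible characters, runs a couple of regex checks, and compares a SHA-256 hash against a blocklist. Everything here is hypothetical: the patterns, the finding names, and the function names are made up for the example and are not AgentSeal's rules. Base64 handling is left out.

```python
import hashlib
import re

# Invisible code points of the kinds named in the deobfuscation step.
_INVISIBLE = re.compile(
    "["
    "\U000E0001-\U000E007F"  # Unicode tag characters (ASCII smuggling)
    "\u200B-\u200D"          # zero-width space, non-joiner, joiner
    "\uFE00-\uFE0F"          # variation selectors
    "\u202A-\u202E"          # BiDi embeddings and overrides
    "\u2066-\u2069"          # BiDi isolates
    "]"
)

# Illustrative patterns only; a real rule set would be far larger.
_PATTERNS = {
    "injection_marker": re.compile(
        r"ignore (all )?(previous|prior) instructions", re.IGNORECASE
    ),
    "credential_access": re.compile(
        r"\.ssh/id_[a-z0-9]+|\.aws/credentials", re.IGNORECASE
    ),
}


def deobfuscate(text: str) -> str:
    """Remove invisible characters so keyword checks see the real text."""
    return _INVISIBLE.sub("", text)


def scan_skill_text(text: str, blocklist: frozenset[str] = frozenset()) -> list[str]:
    """Return the names of all findings for one skill file's contents."""
    findings = []
    # Blocklist check runs on the raw content, before any normalization.
    if hashlib.sha256(text.encode("utf-8")).hexdigest() in blocklist:
        findings.append("blocklisted")
    clean = deobfuscate(text)
    if clean != text:
        findings.append("hidden_characters")
    findings += [name for name, rx in _PATTERNS.items() if rx.search(clean)]
    return findings
```

Running the pattern checks on the cleaned text matters: a phrase such as "ignore previous instructions" with a zero-width space inside "ignore" would slip past a regex applied to the raw file.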

Scan modes at a glance

Command	Probes	What it tests	Tier
agentseal scan	225	Base scan: 82 extraction + 143 injection probes	Free
agentseal scan --adaptive	225+	+ adaptive mutation transforms on blocked probes	Free
agentseal watch	5	Canary regression scan with baseline comparison	Free
agentseal guard	N/A	Machine-wide skill file and MCP config scan	Free
agentseal shield	N/A	Real-time file monitoring with desktop notifications	Free
agentseal scan-mcp	N/A	Runtime MCP server analysis, toxic flows, rug pulls	Free
agentseal scan --mcp	+45	+ MCP tool poisoning probes	Pro
agentseal scan --rag	+28	+ RAG poisoning probes	Pro
agentseal scan --multimodal	+13	+ multimodal attack probes	Pro
agentseal scan --genome	varies	+ behavioral genome mapping	Pro
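
The toxic flow and rug pull checks listed for scan-mcp can be pictured with a small sketch. This is an assumption-laden illustration, not the tool's actual logic: the capability labels, the risky combinations, and the idea of fingerprinting a tool definition by hashing its canonical JSON are all hypothetical choices made for the example.

```python
import hashlib
import json

# Hypothetical capability labels and risky pairings, for illustration only.
TOXIC_COMBINATIONS = [
    ({"read_private_data", "network_egress"}, "private data can be exfiltrated"),
    ({"untrusted_input", "shell_exec"}, "untrusted content can reach a shell"),
]


def toxic_flows(servers: dict[str, set[str]]) -> list[str]:
    """Flag risky capability sets that appear once all servers are combined."""
    combined = set().union(*servers.values())
    return [reason for needed, reason in TOXIC_COMBINATIONS if needed <= combined]


def fingerprint(tool: dict) -> str:
    """Hash a tool definition in canonical JSON form (sorted keys, no spaces)."""
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def rug_pulls(baseline: dict[str, str], tools: list[dict]) -> list[str]:
    """Names of known tools whose definition changed since the baseline."""
    return [
        tool["name"]
        for tool in tools
        if tool["name"] in baseline and baseline[tool["name"]] != fingerprint(tool)
    ]
```

Neither server in a toxic flow needs to be malicious on its own; the risk only appears when their capabilities are considered together, which is why the check works on the union across servers.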

Free vs Pro

Feature	Free	Pro
225 base attack probes (82 extraction + 143 injection)	Yes	Yes
Machine guard (agentseal guard)	Yes	Yes
Skill file scanner (agentseal scan-skills)	Yes	Yes
MCP runtime scanner (agentseal scan-mcp)	Yes	Yes
Real-time shield (agentseal shield)	Yes	Yes
Canary regression watch (agentseal watch)	Yes	Yes
Fix and quarantine (agentseal fix)	Yes	Yes
Scan comparison (agentseal compare)	Yes	Yes
Adaptive mutations (--adaptive)	Yes	Yes
Semantic detection (--semantic)	Yes	Yes
Terminal, JSON, SARIF, and JUnit output	Yes	Yes
CI/CD integration (--min-score, --fail-on)	Yes	Yes
Defense fingerprinting	Yes	Yes
Interactive setup wizard (agentseal setup)	Yes	Yes
Scan workflows and saved configs (agentseal workflow, run)	Yes	Yes
MCP tool poisoning probes (scan --mcp, +45)	-	Yes
RAG poisoning probes (scan --rag, +28)	-	Yes
Multimodal attack probes (scan --multimodal, +13)	-	Yes
Behavioral genome mapping (scan --genome)	-	Yes
Deep LLM-verified analysis (guard / scan-skills / scan-mcp --deep)	-	Yes
PDF security assessment report (scan --report)	-	Yes
Dashboard upload and historical tracking (scan --upload)	-	Yes
Scheduled recurring scans	-	Yes

Design principles

Deterministic - No LLM judges. N-gram matching and canary tokens give you the same result every time.
Privacy-first - System prompts never leave your machine. The dashboard only receives a SHA-256 hash.
Reproducible - Probes are hardcoded, not randomly generated. Every scan is auditable.
Open source - The core scanner is MIT licensed. No vendor lock-in.
Works offline - Guard, shield, and scan-mcp run entirely locally with no API calls.

Tip

See AgentSeal in action: Browse 9,100+ analyzed MCP servers, compare agents on the security leaderboard, or check the LLM security benchmark.
