Introduction

Open-source security for AI agents. Scan prompts, guard machines, shield in real-time.

pip install agentseal
agentseal guard

The problem

Every AI agent runs on a system prompt. That prompt holds your business logic, safety rules, and behavioral constraints. The problem is that attackers can steal it, override it, or weaponize the tools your agent connects to.

Prompt extraction leaks your intellectual property. Prompt injection turns your agent into a tool for the attacker. MCP tool poisoning means the servers your agent trusts could be feeding it malicious instructions. And skill files on your machine could be rewriting your agent's behavior without you knowing.

What AgentSeal does

AgentSeal is a security toolkit with four core capabilities. Each one addresses a different layer of the AI agent attack surface.

Command | What it does | No API key needed
agentseal scan | Tests your agent with 225+ attack probes to find prompt extraction and injection vulnerabilities | No (needs model access)
agentseal guard | Scans your entire machine for dangerous skill files, poisoned MCP configs, and hidden threats | Yes
agentseal shield | Watches your machine in real-time and alerts you when skill files or MCP configs change | Yes
agentseal scan-mcp | Connects to running MCP servers, analyzes tool definitions, detects toxic flows and rug pulls | Yes

Tip

guard, shield, and scan-mcp work without any API keys or model access. They analyze files and tool definitions locally using pattern matching, deobfuscation, and semantic analysis.

How scanning works

The scan command runs a four-phase behavioral pipeline against your agent: extraction, injection, data extraction, and boundary integrity. Every probe is deterministic and carries a unique canary. Same input, same result, every run. No LLM judges your output, so findings are reproducible and safe to include in CI.
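
As a rough illustration of why checks like these are reproducible, the sketch below shows canary detection and word n-gram overlap in Python. It is not AgentSeal's implementation: the function names, the probe ID format, the way the canary is derived, and the 5-word window are all assumptions made for the example.

```python
import hashlib


def canary_for(probe_id: str) -> str:
    """Derive a stable, unique marker from a (hypothetical) probe ID."""
    digest = hashlib.sha256(probe_id.encode("utf-8")).hexdigest()
    return f"CANARY-{digest[:12].upper()}"


def injection_succeeded(response: str, canary: str) -> bool:
    """An injection probe tells the agent to emit the canary.
    Finding the canary in the response means the agent obeyed."""
    return canary in response


def word_ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """All runs of n consecutive lowercased words in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def leak_ratio(system_prompt: str, response: str, n: int = 5) -> float:
    """Fraction of the prompt's word n-grams that reappear in the response."""
    prompt_grams = word_ngrams(system_prompt, n)
    if not prompt_grams:
        return 0.0
    shared = prompt_grams & word_ngrams(response, n)
    return len(shared) / len(prompt_grams)
```

Both checks are plain string operations, so the same response always produces the same verdict. A real scanner would still need to pick a leak threshold and normalize punctuation, which this sketch leaves out.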

┌──────────────┐                                     ┌──────────────┐
│              │   Phase 1 · Extraction  (82)        │              │
│              │ ──────────────────────────────────▶ │              │
│              │   Phase 2 · Injection   (143)       │              │
│  AgentSeal   │ ──────────────────────────────────▶ │  Your Agent  │
│   Scanner    │   Phase 3 · Data extraction         │              │
│              │ ──────────────────────────────────▶ │              │
│              │   Phase 4 · Boundary integrity      │              │
│              │ ──────────────────────────────────▶ │              │
│              │ ◀──────────── responses ─────────── │              │
└──────────────┘                                     └──────────────┘
        │
        ▼
┌───────────────────────────────────────────────────────────────────┐
│  Deterministic detection                                          │
│  • N-gram matching           →  extraction leaks                  │
│  • Canary token detection    →  injection success                 │
│  • Response classifier       →  partial leaks, refusals, errors   │
│  • Defense fingerprinting    →  known guard bypass attempts       │
│  • Adaptive mutation         →  blocked probes re-run transformed │
└───────────────────────────────────────────────────────────────────┘
        │
        ▼
┌───────────────────────────────────────────────────────────────────┐
│  Trust score  (5-category weighted)                               │
│  • Extraction resistance      30%                                 │
│  • Injection resistance       25%                                 │
│  • Data extraction resistance 20%                                 │
│  • Boundary integrity         15%                                 │
│  • Consistency                10%                                 │
│  ─────────────────────────────────                                │
│  0–100  ·  CRITICAL → LOW → MEDIUM → HIGH → EXCELLENT             │
└───────────────────────────────────────────────────────────────────┘
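
The weighting in the trust score box can be sketched as a weighted average. The weights below are the ones shown in the diagram; everything else is an assumption for illustration, including the category key names and the idea that each category is itself scored on a 0 to 100 scale. The cut-offs between CRITICAL and EXCELLENT are not stated here, so the sketch stops at the numeric score.

```python
# Weights from the diagram, expressed as whole percentages so the
# arithmetic stays exact. The key names are hypothetical.
WEIGHTS = {
    "extraction_resistance": 30,
    "injection_resistance": 25,
    "data_extraction_resistance": 20,
    "boundary_integrity": 15,
    "consistency": 10,
}


def trust_score(category_scores: dict[str, float]) -> float:
    """Combine per-category scores (assumed 0-100 each) into one 0-100 score."""
    missing = WEIGHTS.keys() - category_scores.keys()
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= category_scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    weighted = sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)
    return weighted / 100
```

For example, category scores of 80, 60, 90, 70, and 100 (in the order listed) give 24 + 15 + 18 + 10.5 + 10 = 77.5.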

Free tier: 225 base probes (82 extraction + 143 injection)
Pro adds 86 more:  MCP tool poisoning (45) · RAG poisoning (28) · Multimodal (13)
Full attack surface: 311 probes

How guarding works

The guard command scans your machine without needing any API keys. It discovers 17+ types of AI agents (Claude, Cursor, Windsurf, Cline, Copilot, and more), reads their skill files and MCP configurations, and runs a multi-layer analysis pipeline:

  • Pattern detection - Regex scanning for injection markers, exfiltration patterns, credential theft
  • Blocklist matching - SHA-256 hashes of known malicious skill files
  • Deobfuscation - Before analysis, strips Unicode tag characters (U+E0001 to U+E007F, used for ASCII smuggling), zero-width characters (U+200B and U+200D, used for keyword obfuscation), variation selectors, and BiDi controls, and decodes Base64 payloads
  • Semantic analysis - MiniLM embeddings detect paraphrased threats (optional)
  • LLM judge - BYOK deep analysis of suspicious files (optional)
  • MCP config checking - 6 static checks on MCP server configurations
  • Toxic flow detection - Finds dangerous capability combinations across servers
  • Baseline tracking - Detects config changes and rug pulls between scans
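
To make the first three layers concrete, here is a minimal Python sketch of deobfuscation followed by pattern and blocklist checks. It is an illustration under stated assumptions, not AgentSeal's code: the regexes, the exact code point ranges for variation selectors and BiDi controls, the 16-character Base64 threshold, and every function name are hypothetical.

```python
import base64
import binascii
import hashlib
import re

# Invisible characters to strip. The tag and zero-width ranges follow the
# list above (U+200C is included for completeness); the variation selector
# and BiDi ranges are assumptions.
INVISIBLE = re.compile(
    "["
    "\U000E0001-\U000E007F"  # Unicode tag characters (ASCII smuggling)
    "\u200B-\u200D"          # zero-width space, non-joiner, joiner
    "\uFE00-\uFE0F"          # variation selectors
    "\u202A-\u202E"          # BiDi embedding and override controls
    "\u2066-\u2069"          # BiDi isolate controls
    "]"
)
B64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")

# Hypothetical example patterns; a real rule set would be far larger.
PATTERNS = {
    "injection_marker": re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    "credential_access": re.compile(r"\.ssh/id_[a-z0-9]+|\.aws/credentials", re.I),
}


def hidden_tag_text(text: str) -> str:
    """Map tag characters back to the ASCII they shadow (code point - 0xE0000)."""
    return "".join(
        chr(ord(ch) - 0xE0000) for ch in text if 0xE0020 <= ord(ch) <= 0xE007E
    )


def decode_base64_payloads(text: str) -> list[str]:
    """Decode Base64-looking tokens that turn out to be valid UTF-8 text."""
    decoded = []
    for token in B64_TOKEN.findall(text):
        try:
            decoded.append(base64.b64decode(token, validate=True).decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue
    return decoded


def scan_text(text: str) -> list[str]:
    """Return the names of patterns found in the visible or hidden content."""
    clean = INVISIBLE.sub("", text)
    candidates = [clean, hidden_tag_text(text), *decode_base64_payloads(clean)]
    return sorted(
        {name for name, pat in PATTERNS.items() if any(pat.search(c) for c in candidates)}
    )


def is_blocklisted(content: bytes, known_bad_sha256: set[str]) -> bool:
    """Exact-match check against SHA-256 digests of known malicious files."""
    return hashlib.sha256(content).hexdigest() in known_bad_sha256
```

Running the patterns after deobfuscation is the point of the ordering: a phrase split by zero-width characters or hidden in tag characters would slip past a regex applied to the raw file.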

Scan modes at a glance

Command | Probes | What it tests | Tier
agentseal scan | 225 | Base scan: 82 extraction + 143 injection probes | Free
agentseal scan --adaptive | 225+ | + adaptive mutation transforms on blocked probes | Free
agentseal watch | 5 | Canary regression scan with baseline comparison | Free
agentseal guard | N/A | Machine-wide skill file and MCP config scan | Free
agentseal shield | N/A | Real-time file monitoring with desktop notifications | Free
agentseal scan-mcp | N/A | Runtime MCP server analysis, toxic flows, rug pulls | Free
agentseal scan --mcp | +45 | + MCP tool poisoning probes | Pro
agentseal scan --rag | +28 | + RAG poisoning probes | Pro
agentseal scan --multimodal | +13 | + multimodal attack probes | Pro
agentseal scan --genome | varies | + behavioral genome mapping | Pro

Free vs Pro

Feature | Free | Pro
225 base attack probes (82 extraction + 143 injection) | Yes | Yes
Machine guard (agentseal guard) | Yes | Yes
Skill file scanner (agentseal scan-skills) | Yes | Yes
MCP runtime scanner (agentseal scan-mcp) | Yes | Yes
Real-time shield (agentseal shield) | Yes | Yes
Canary regression watch (agentseal watch) | Yes | Yes
Fix and quarantine (agentseal fix) | Yes | Yes
Scan comparison (agentseal compare) | Yes | Yes
Adaptive mutations (--adaptive) | Yes | Yes
Semantic detection (--semantic) | Yes | Yes
Terminal, JSON, SARIF, and JUnit output | Yes | Yes
CI/CD integration (--min-score, --fail-on) | Yes | Yes
Defense fingerprinting | Yes | Yes
Interactive setup wizard (agentseal setup) | Yes | Yes
Scan workflows and saved configs (agentseal workflow, run) | Yes | Yes
MCP tool poisoning probes (scan --mcp, +45) | - | Yes
RAG poisoning probes (scan --rag, +28) | - | Yes
Multimodal attack probes (Pro scan profile, +13) | - | Yes
Behavioral genome mapping (scan --genome) | - | Yes
Deep LLM-verified analysis (guard / scan-skills / scan-mcp --deep) | - | Yes
PDF security assessment report (scan --report) | - | Yes
Dashboard upload and historical tracking (scan --upload) | - | Yes
Scheduled recurring scans | - | Yes

Design principles

  • Deterministic - Scan findings come from n-gram matching and canary tokens, not from an LLM judge, so you get the same result every time.
  • Privacy-first - System prompts never leave your machine. The dashboard only receives a SHA-256 hash.
  • Reproducible - Probes are hardcoded, not randomly generated. Every scan is auditable.
  • Open source - The core scanner is MIT licensed. No vendor lock-in.
  • Works offline - Guard, shield, and scan-mcp run entirely locally with no API calls.

Tip

See AgentSeal in action: Browse 9,100+ analyzed MCP servers, compare agents on the security leaderboard, or check the LLM security benchmark.
© 2026 AgentSeal