MetaLLM v2.0 -- AI/ML Security Testing Framework

MetaLLM is a Metasploit-inspired security testing framework purpose-built for AI and ML systems. It provides 61 working modules spanning LLM prompt attacks, RAG poisoning, agentic AI exploitation, MLOps infrastructure compromise, API security testing, and network-layer ML attacks -- all driven through an interactive CLI with tab completion, session tracking, and structured reporting mapped to MITRE ATLAS and OWASP LLM Top 10 2025.

What Makes MetaLLM Different

There are other AI security testing tools. Here is where MetaLLM fits relative to them:

Capability	MetaLLM	Garak	PyRIT	Promptfoo
Metasploit-style operator workflow (`use` / `set` / `run` / `sessions`)	Yes	No	No	No
Full-stack coverage (network to model to agent)	Yes	No	Partial	No
MCP tool poisoning module	Yes	No	No	No
Multi-turn adaptive jailbreaks (crescendo, context buildup)	Yes	No	Yes	No
MLOps infrastructure exploits (Jupyter, MLflow, W&B, TensorBoard)	Yes	No	No	No
Session manager with loot tracking	Yes	No	No	No
SQLite target database for engagement persistence	Yes	No	No	No
MITRE ATLAS + OWASP mapping in reports	Yes	No	Partial	Partial

MetaLLM is not a replacement for any of these tools. It fills a gap: an operator-oriented framework for hands-on AI red team engagements that covers the full attack surface from network reconnaissance through model exploitation to post-exploitation loot collection.

Quick Start

Prerequisites

Python 3.10 or higher
pip

Install

git clone https://github.com/perfecXion-ai/MetaLLM.git
cd MetaLLM

python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

pip install -r requirements.txt

Launch

python metallm.py

Basic Workflow

metalllm> use exploit/llm/prompt_injection
metalllm exploit(prompt_injection)> show options
metalllm exploit(prompt_injection)> set TARGET_URL http://target.example.com/api/chat
metalllm exploit(prompt_injection)> set PROVIDER openai
metalllm exploit(prompt_injection)> set MODEL gpt-4
metalllm exploit(prompt_injection)> run

metalllm> sessions -l          # List active sessions
metalllm> sessions -i 1        # Interact with session 1
metalllm> report generate       # Generate assessment report

CLI Commands

Command	Description
`use <module>`	Select a module
`show options`	Display configurable options for the active module
`set <OPTION> <value>`	Set a module option
`run`	Execute the active module
`check`	Probe the target without full exploitation
`search <term>`	Search modules by name or keyword
`show modules`	List all available modules
`sessions -l`	List active sessions
`sessions -i <id>`	Interact with a session
`targets`	Manage the target database
`report generate`	Generate HTML/Markdown/JSON report
`back`	Deselect the active module
`help`	Show available commands

Module Categories

Exploit Modules (44)

LLM -- 12 modules

Module	Description
`exploit/llm/prompt_injection`	Multi-technique prompt injection
`exploit/llm/prompt_injection_basic`	Basic prompt injection patterns
`exploit/llm/prompt_injection_advanced`	Advanced injection with context manipulation
`exploit/llm/jailbreak`	Standard jailbreak techniques
`exploit/llm/jailbreak_dan`	DAN-style jailbreak variants
`exploit/llm/system_prompt_leak`	System prompt leakage via indirect methods
`exploit/llm/system_prompt_extraction`	Direct system prompt extraction (multi-technique)
`exploit/llm/context_manipulation`	Context window manipulation attacks
`exploit/llm/adaptive_jailbreak`	Multi-turn adaptive jailbreaks (crescendo, context buildup)
`exploit/llm/flipattack`	FlipAttack -- word/segment reversal bypass
`exploit/llm/encoding_bypass`	Base64, ROT13, hex, unicode encoding bypasses
--	(11 unique modules; prompt_injection counted once)

RAG -- 5 modules

Module	Description
`exploit/rag/vector_injection`	Inject adversarial vectors into retrieval pipeline
`exploit/rag/document_poisoning`	Poison documents ingested by RAG systems
`exploit/rag/knowledge_corruption`	Corrupt knowledge base entries
`exploit/rag/retrieval_manipulation`	Manipulate retrieval ranking and results
`exploit/rag/poisoning`	General RAG poisoning techniques

Agent / MCP -- 10 modules

Module	Description
`exploit/agent/goal_hijacking`	Redirect agent goals via injected instructions
`exploit/agent/tool_misuse`	Abuse agent tool-calling interfaces
`exploit/agent/memory_manipulation`	Tamper with agent memory stores
`exploit/agent/plugin_abuse`	Exploit plugin loading and execution
`exploit/agent/langchain_rce`	LangChain RCE via unsafe deserialization
`exploit/agent/langchain_tool_injection`	Inject malicious tools into LangChain agents
`exploit/agent/crewai_task_manipulation`	Manipulate CrewAI task definitions
`exploit/agent/autogpt_goal_corruption`	Corrupt AutoGPT goal state
`exploit/agent/protocol_message_injection`	Inject messages into agent protocol layer
`exploit/agent/mcp_tool_poisoning`	Poison MCP tool definitions and responses

MLOps Infrastructure -- 9 modules

Module	Description
`exploit/mlops/pickle_deserialization`	Pickle deserialization RCE
`exploit/mlops/mlflow_model_poison`	MLflow model poisoning
`exploit/mlops/mlflow_model_poisoning`	MLflow model registry poisoning variant
`exploit/mlops/jupyter_rce`	Jupyter Notebook RCE
`exploit/mlops/jupyter_notebook_rce`	Jupyter Notebook kernel exploitation
`exploit/mlops/wandb_credential_theft`	Weights & Biases credential extraction
`exploit/mlops/wandb_data_exfiltration`	W&B experiment data exfiltration
`exploit/mlops/model_registry_manipulation`	Model registry tampering
`exploit/mlops/tensorboard_attack`	TensorBoard exploitation

API -- 3 modules

Module	Description
`exploit/api/api_key_extraction`	Extract API keys from LLM responses and configs
`exploit/api/excessive_agency`	Test for excessive agency / over-permissioned actions
`exploit/api/unauthorized_access`	API authorization bypass

Network -- 5 modules

Module	Description
`exploit/network/adversarial_examples`	Craft adversarial inputs for ML models
`exploit/network/model_extraction`	Model stealing via API queries
`exploit/network/model_inversion`	Reconstruct training data from model outputs
`exploit/network/membership_inference`	Determine if data was in the training set
`exploit/network/api_key_harvesting`	Harvest API keys from network traffic

Auxiliary Modules (16)

Scanner -- 5 modules

Module	Description
`auxiliary/scanner/llm_api_scanner`	Discover LLM API endpoints
`auxiliary/scanner/mlops_discovery`	Find MLOps platforms (MLflow, Kubeflow, etc.)
`auxiliary/scanner/rag_endpoint_enum`	Enumerate RAG system endpoints
`auxiliary/scanner/agent_framework_detect`	Detect agent frameworks in use
`auxiliary/scanner/ai_service_port_scan`	Port scan for AI/ML services

Fingerprint -- 4 modules

Module	Description
`auxiliary/fingerprint/llm_model_detector`	Identify which LLM is behind an API
`auxiliary/fingerprint/capability_prober`	Probe model capabilities and limits
`auxiliary/fingerprint/safety_filter_detect`	Detect and characterize safety filters
`auxiliary/fingerprint/embedding_model_id`	Identify embedding models

Discovery -- 3 modules

Module	Description
`auxiliary/discovery/vector_db_enum`	Enumerate vector databases (Pinecone, Weaviate, Qdrant, etc.)
`auxiliary/discovery/model_registry_scan`	Scan model registries
`auxiliary/discovery/training_infra_disc`	Discover training infrastructure

DoS Testing -- 3 modules

Module	Description
`auxiliary/dos/token_exhaustion`	Token budget exhaustion
`auxiliary/dos/rate_limit_test`	Rate limit boundary testing
`auxiliary/dos/context_overflow`	Context window overflow

LLM Auxiliary -- 2 modules

Module	Description
`auxiliary/llm/fingerprint`	LLM fingerprinting via behavioral analysis
`auxiliary/llm/fuzzer`	LLM input fuzzing

Post-Exploitation (1)

Module	Description
`post/llm/context_extraction`	Extract conversation context and data from compromised sessions

Architecture Overview

MetaLLM/
├── metallm.py                  # Entry point -- launches interactive CLI
├── cli/
│   ├── console.py              # REPL with tab completion and command history
│   ├── commands.py             # Command implementations
│   ├── completer.py            # Tab completion engine
│   └── formatter.py            # Output formatting
├── metallm/
│   ├── base/                   # Base classes: Module, Target, Result, Option
│   └── core/
│       ├── module_loader.py    # Dynamic module discovery and loading
│       ├── session.py          # Session manager (active sessions, loot)
│       ├── db.py               # SQLite target database
│       ├── llm_client.py       # Unified LLM client (OpenAI, Anthropic, Ollama, Google, generic HTTP)
│       └── reporting.py        # Report generation with MITRE ATLAS + OWASP mapping
├── modules/
│   ├── exploits/               # 44 exploit modules
│   │   ├── llm/                # Prompt injection, jailbreaks, extraction
│   │   ├── rag/                # Vector injection, poisoning, corruption
│   │   ├── agent/              # Goal hijacking, MCP poisoning, RCE
│   │   ├── mlops/              # Pickle deser, MLflow, Jupyter, W&B
│   │   ├── api/                # Key extraction, excessive agency
│   │   └── network/            # Model extraction, adversarial examples
│   ├── auxiliary/              # 16 auxiliary modules
│   │   ├── scanner/            # API scanning, port scanning
│   │   ├── fingerprint/        # Model detection, filter detection
│   │   ├── discovery/          # Vector DB, registry, infra discovery
│   │   ├── dos/                # Token exhaustion, rate limits
│   │   └── llm/                # Fingerprinting, fuzzing
│   └── post/                   # 1 post-exploitation module
└── tests/                      # 120 unit + 17 integration tests

Key Components

Unified LLM Client -- Modules do not make raw HTTP calls. LLMClient handles provider-specific request formatting for OpenAI, Anthropic, Ollama, Google (Gemini), and any OpenAI-compatible endpoint. Modules call client.send(prompt) and get text back.

Session Manager -- After successful exploitation, a session is created with loot storage. Operators can list sessions, interact with them, background them, and close them -- exactly like Metasploit.

Target Database -- SQLite-backed persistence for targets, engagements, findings, and loot. Resume work across sessions without re-entering target information.

Reporting Engine -- Generates self-contained HTML, Markdown, and JSON reports. Every finding is mapped to MITRE ATLAS techniques (49 technique IDs) and OWASP LLM Top 10 2025 categories.

Testing

MetaLLM has 137 tests: 120 unit tests covering base classes, module loading, and core framework, plus 17 integration tests that run real exploits against a live Ollama instance.

Run Unit Tests

pytest tests/test_base.py -v

Run Integration Tests (requires local Ollama with llama3.2:1b)

ollama pull llama3.2:1b
pytest tests/test_integration_ollama.py -v -s -m integration

Integration tests validate that modules actually work end-to-end:

System Prompt Extraction -- Runs all extraction techniques against a known system prompt. Verifies the module extracts recognizable content (e.g., keywords from the target prompt).
Encoding Bypass -- Tests base64, ROT13, hex, unicode, and other encoding techniques. Verifies the module executes all techniques without errors.
FlipAttack -- Tests word reversal, segment reversal, and all flip techniques. Verifies multi-technique execution and scoring.
Adaptive Jailbreak -- Runs crescendo and context buildup strategies over multiple conversation turns. Verifies multi-turn state management and alternating user/assistant message flow.
LLM Client -- Direct tests against Ollama: basic send, system prompt influence, conversation history, and response timing.

These are not mocked. Every test sends real prompts to a real model and validates real responses.

Responsible Use

MetaLLM is a security testing tool. It is designed for authorized use only.

Requirements for use:

Obtain explicit written authorization before testing any system you do not own
Conduct testing only in authorized environments
Follow coordinated vulnerability disclosure for any findings
Comply with all applicable laws and regulations
Use results to improve defenses, not to cause harm

Do not use MetaLLM for:

Unauthorized access to any system
Attacks against production systems without authorization
Data theft or exfiltration
Any activity that violates applicable law

The authors are not responsible for misuse. Users bear sole responsibility for compliance with all applicable laws and regulations.

Contributing

Contributions are welcome. To add a module:

Fork the repository
Create a feature branch
Write your module following the existing patterns in modules/exploits/base.py
Add tests
Submit a pull request

All modules must define: name, description, author, options, and implement run(). Exploit modules should also implement check() for non-destructive vulnerability probing.

License

MIT License. See LICENSE for details.

Credits

Author: Scott Thornton Contact: [email protected]

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
cli		cli
docs		docs
infrastructure/gcp		infrastructure/gcp
metaLLM		metaLLM
modules		modules
plugins		plugins
tests		tests
.gitignore		.gitignore
ARCHITECTURE.md		ARCHITECTURE.md
CHANGELOG.md		CHANGELOG.md
CLI_GUIDE.md		CLI_GUIDE.md
LICENSE		LICENSE
MODULE_LIBRARY.md		MODULE_LIBRARY.md
README.md		README.md
SECURITY.md		SECURITY.md
metallm		metallm
metallm.py		metallm.py
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
setup.py		setup.py
structlog.py		structlog.py
test_cli.py		test_cli.py
test_new_module.py		test_new_module.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MetaLLM v2.0 -- AI/ML Security Testing Framework

What Makes MetaLLM Different

Quick Start

Prerequisites

Install

Launch

Basic Workflow

CLI Commands

Module Categories

Exploit Modules (44)

Auxiliary Modules (16)

Post-Exploitation (1)

Architecture Overview

Key Components

Testing

Run Unit Tests

Run Integration Tests (requires local Ollama with llama3.2:1b)

Responsible Use

Contributing

License

Credits

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 1

Languages

Folders and files

Latest commit

History

Repository files navigation

MetaLLM v2.0 -- AI/ML Security Testing Framework

What Makes MetaLLM Different

Quick Start

Prerequisites

Install

Launch

Basic Workflow

CLI Commands

Module Categories

Exploit Modules (44)

Auxiliary Modules (16)

Post-Exploitation (1)

Architecture Overview

Key Components

Testing

Run Unit Tests

Run Integration Tests (requires local Ollama with llama3.2:1b)

Responsible Use

Contributing

License

Credits

About

Topics

Resources

License

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 1

Languages

Packages