The self-improving AI agent built by Nous Research.
Hermes Agent is an autonomous AI agent with a built-in learning loop. It creates skills from experience, improves them during use, searches its own past conversations for context, and builds a deepening model of who you are across sessions. It runs on a $5 VPS, a GPU cluster, or serverless infrastructure -- not tied to your laptop.
Current version: v0.5.0 (v2026.3.28) -- released March 28, 2026.
Hermes Agent is not a coding copilot or a chatbot wrapper. It is a multi-platform, multi-provider autonomous agent that:
- Executes tool calls sequentially or concurrently via `ThreadPoolExecutor` (up to 8 parallel workers)
- Manages long-running conversations with context compression and session lineage
- Persists memory, skills, and session history in SQLite across restarts
- Streams responses token-by-token from the model to the user interface (v0.3.0)
- Runs as a CLI, a messaging gateway (Telegram, Discord, Slack, WhatsApp, Signal, DingTalk, SMS, Mattermost, Matrix, Webhook, Email, Home Assistant), or an IDE integration (VS Code, Zed, JetBrains via ACP)
- Exposes an OpenAI-compatible `/v1/chat/completions` API server with REST cron job management (v0.4.0)
- Delegates work to isolated subagents and spawns scheduled cron jobs
- Exports training trajectories for SFT data generation and RL fine-tuning
The agent core is AIAgent in run_agent.py. All other subsystems -- gateway, CLI, ACP server, cron scheduler -- use this single agent core, so behavior is consistent across all platforms.
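The "single agent core" design can be sketched as follows. This is an illustrative pattern, not the actual Hermes API: the class and method names here (`AgentCore`, `run_turn`, the frontend wrappers) are hypothetical stand-ins for `AIAgent` and its callers.

```python
# Illustrative sketch of the single-core pattern: every surface (CLI,
# gateway, ACP, cron) wraps the same agent object, so behavior matches.
class AgentCore:
    """Stand-in for AIAgent: one loop shared by every frontend."""

    def run_turn(self, user_text: str) -> str:
        # The real core would call the LLM and execute tools here.
        return f"agent reply to: {user_text}"


class CLIFrontend:
    def __init__(self, core: AgentCore):
        self.core = core

    def handle(self, line: str) -> str:
        return self.core.run_turn(line)


class GatewayFrontend:
    def __init__(self, core: AgentCore):
        self.core = core

    def on_message(self, platform: str, text: str) -> str:
        # Platform adapters differ; the agent behavior does not.
        return self.core.run_turn(text)


core = AgentCore()
cli_reply = CLIFrontend(core).handle("hi")
gw_reply = GatewayFrontend(core).on_message("telegram", "hi")
```

Because both frontends delegate to the same core object, a fix or feature in the agent loop lands everywhere at once.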
Real-time token-by-token delivery in the CLI and all gateway platforms. Responses stream as they are generated instead of arriving as a single block. Supported on Telegram, Discord, Slack, and the interactive CLI. WhatsApp, Signal, Email, and Home Assistant fall back to non-streaming automatically (no message-edit API). Implemented in _interruptible_streaming_api_call() with graceful fallback to non-streaming on any error.
- Persistent memory -- agent-curated facts written to `MEMORY.md`, with periodic nudges to persist durable knowledge
- Autonomous skill creation -- after complex tasks (5+ tool calls), the agent creates reusable skill documents
- Skill self-improvement -- skills are patched during use when they are outdated, incomplete, or wrong
- FTS5 session search -- full-text search across all past sessions with LLM summarization for cross-session recall
- Honcho integration -- dialectic user modeling that builds a persistent model of who you are across sessions
Telegram, Discord, Slack, WhatsApp, Signal, DingTalk, SMS (Twilio), Mattermost, Matrix, Webhook, Email (IMAP/SMTP), and Home Assistant -- all from a single gateway process. Six new adapters were added in v0.4.0. Unified session management, media attachments, voice transcription, and per-platform tool configuration. The gateway auto-reconnects failed platforms with exponential backoff (v0.4.0).
Drop Python files into ~/.hermes/plugins/ to extend Hermes with custom tools, commands, and hooks. No forking required.
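A drop-in plugin file might look like the sketch below. The registration hook shown (`register(registry)`) is an assumption for illustration; consult plugins.md for the actual plugin interface.

```python
# Hypothetical shape of a file dropped into ~/.hermes/plugins/.
# The register() entry point and registry type are assumptions, not the
# documented Hermes plugin API -- see plugins.md for the real interface.
def shout(text: str) -> str:
    """A custom tool: upper-case the input."""
    return text.upper()


def register(registry: dict) -> None:
    """Called by the plugin loader (name assumed) to wire in tools."""
    registry["shout"] = shout


# What a loader might do after importing the file:
registry: dict = {}
register(registry)
result = registry["shout"]("hello hermes")
```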
Use any model without code changes. Supported providers:
- Nous Portal -- first-class provider; 400+ models via Nous inference (https://inference-api.nousresearch.com/v1)
- OpenRouter -- 200+ models (https://openrouter.ai/api/v1)
- Anthropic (native, v0.3.0) -- direct API with Claude Code credential auto-discovery, OAuth PKCE flows, and native prompt caching
- OpenAI -- GPT-5 variants via chat completions or Codex Responses API
- Vercel AI Gateway (v0.3.0) -- access Vercel's model catalog and infrastructure (https://ai-gateway.vercel.sh/v1)
- GitHub Copilot (v0.4.0) -- OAuth auth, 400k context, token validation
- Alibaba Cloud / DashScope (v0.4.0) -- DashScope v1 runtime
- Kilo Code (v0.4.0) -- direct API-key provider
- OpenCode Zen / OpenCode Go (v0.4.0) -- provider backends
- Hugging Face (v0.5.0) -- HF Inference API with curated agentic model picker and live `/models` probe
- z.ai/GLM, Kimi/Moonshot, MiniMax -- direct API-key providers
- Custom endpoints -- any OpenAI-compatible API
Switch with hermes model -- no code changes, no lock-in.
Expose Hermes as a /v1/chat/completions endpoint with a /api/jobs REST API for cron job management. Hardened with input limits, field whitelists, SQLite-backed response persistence across restarts, CORS origin protection, and Idempotency-Key support (v0.5.0). The API server exposes its own hermes-api-server toolset.
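A request to the server follows the standard OpenAI chat-completions shape. The sketch below builds such a request with the stdlib only; the port (8000) and model name are assumptions for illustration, and sending the body with `urllib.request` or any OpenAI-compatible client works the same way.

```python
import json

# Sketch of a request to a local Hermes API server. The URL port and the
# "hermes" model name are assumptions; the payload shape is the standard
# OpenAI chat-completions format the server is compatible with.
url = "http://localhost:8000/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    # v0.5.0: repeating a request with the same key is safe to retry.
    "Idempotency-Key": "req-42",
}
payload = {
    "model": "hermes",
    "messages": [{"role": "user", "content": "What's on my cron schedule?"}],
}
body = json.dumps(payload)
```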
Claude Code-style @file and @url context injection with tab completions in the CLI. Files and web pages are inserted directly into the conversation. Access is restricted to the workspace -- reading secrets outside the workspace is blocked.
hermes mcp commands for installing, configuring, and authenticating MCP servers, with a full OAuth 2.1 PKCE flow for remote MCP servers. MCP servers are exposed as standalone toolsets and are configurable interactively in hermes tools.
pre_llm_call, post_llm_call, on_session_start, and on_session_end hooks now fire in the agent loop and CLI/gateway, completing the plugin hook system. Plugins can also register slash commands and extend the TUI.
Full uv2nix build, NixOS module with persistent container mode, auto-generated config keys from Python source, and suffix PATHs for agent-friendliness. See installation.md for Nix installation details. Contributed by @alt-glitch.
Multiple independent tool calls run in parallel via ThreadPoolExecutor (max 8 workers), significantly reducing latency for multi-tool turns. Interactive tools (e.g. clarify) force sequential execution. Path-scoped tools (read_file, write_file, patch) run concurrently only when targeting independent paths. Message and result ordering is preserved when reinserting tool responses into conversation history.
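The ordering guarantee above falls out naturally from the executor API: `ThreadPoolExecutor.map` returns results in submission order even when calls finish out of order. A minimal sketch of the pattern (not the Hermes implementation):

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Independent tool calls fan out to a worker pool; executor.map yields
# results in submission order regardless of completion order.
def fake_tool(call):
    name, delay = call
    time.sleep(delay)  # simulate I/O-bound tool work
    return f"{name}: done"

calls = [("web_search", 0.02), ("read_file", 0.01), ("fetch_url", 0.0)]
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(fake_tool, calls))  # order matches `calls`
```

Here `fetch_url` finishes first, but `results` still lists `web_search` first, which is what lets tool responses be reinserted into conversation history in the original call order.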
ContextCompressor monitors token usage and compresses context when approaching the model's context limit (default threshold: 50% of the context window). The algorithm protects the first protect_first_n turns (default: 3) and the last protect_last_n turns (default: 4), summarizes the middle section via an auxiliary model call (call_llm(task="compression")), and sanitizes orphaned tool-call/result pairs. Session lineage is preserved via parent_session_id chains in the SQLite state store.
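The protection windows can be sketched as a simple slicing step. Parameter names are taken from the text above; the summarization call itself is out of scope here, and this is an illustrative sketch rather than the ContextCompressor implementation.

```python
# Sketch of the head/tail protection described above: only the middle
# slice is handed to the auxiliary summarization call.
def split_for_compression(turns, protect_first_n=3, protect_last_n=4):
    """Return (head, middle, tail); only `middle` gets summarized."""
    if len(turns) <= protect_first_n + protect_last_n:
        return turns, [], []  # too short -- nothing safe to compress
    head = turns[:protect_first_n]
    tail = turns[-protect_last_n:]
    middle = turns[protect_first_n:-protect_last_n]
    return head, middle, tail

turns = [f"turn-{i}" for i in range(10)]
head, middle, tail = split_for_compression(turns)
```

With ten turns and the defaults, turns 0-2 and 6-9 survive verbatim while turns 3-5 are replaced by a summary.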
Local, Docker, SSH, Daytona, Singularity, and Modal. Daytona and Modal offer serverless persistence -- the environment hibernates when idle and wakes on demand, costing nearly nothing between sessions.
Local and SSH terminal backends maintain shell state across tool calls -- cd, environment variables, and aliases persist within a session.
Codex-inspired approval system that learns which commands are safe and remembers your preferences. /stop kills the current agent run immediately.
Push-to-talk in the CLI, voice notes in Telegram and Discord, Discord voice channel support, and local Whisper transcription via faster-whisper. Configurable STT backends with a stt.enabled config flag.
Native MCP client with stdio and HTTP transports, selective tool loading with utility policies, sampling (server-initiated LLM requests), and auto-reload when mcp_servers config changes. Tools are prefixed as mcp_<server>_<tool_name>.
VS Code, Zed, and JetBrains connect to Hermes as an agent backend over stdio/JSON-RPC. Full slash command support. The hermes-acp entry point runs the ACP server.
When privacy.redact_pii is enabled, personally identifiable information is automatically scrubbed before sending context to LLM providers.
agent/redact.py provides regex-based secret redaction for logs and tool output. It masks API keys (OpenAI, GitHub, Slack, Google, AWS, Stripe, SendGrid, HuggingFace, and others), env variable assignments containing secret names, JSON secret fields, Authorization headers, Telegram bot tokens, private key blocks, database connection string passwords, and E.164 phone numbers. Short tokens (< 18 chars) are fully masked; longer tokens preserve the first 6 and last 4 characters. Controlled via security.redact_secrets in config.yaml or HERMES_REDACT_SECRETS env var. A RedactingFormatter log handler applies redaction to all log output.
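The masking rule described above (full mask under 18 characters, otherwise keep the first 6 and last 4) can be sketched as follows. This is a simplified stand-in, not the `agent/redact.py` implementation, and the function name is hypothetical.

```python
# Sketch of the token-masking rule: short tokens are fully masked,
# longer tokens keep the first 6 and last 4 characters for debugging.
def mask_token(token: str, min_partial_len: int = 18) -> str:
    if len(token) < min_partial_len:
        return "*" * len(token)
    return token[:6] + "*" * (len(token) - 10) + token[-4:]

masked = mask_token("sk-abcdefghijklmnopqrstuvwx")
```

Keeping the ends of long keys lets you tell two credentials apart in logs without ever exposing enough of either to be usable.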
Batch trajectory generation, trajectory compression for training datasets (trajectory_compressor.py), and Atropos RL environments including WebResearchEnv, YC-Bench, and Agentic On-Policy Distillation (OPD, v0.3.0). The pipeline supports SFT data generation and GRPO/PPO RL fine-tuning.
70+ bundled and optional skills across 15+ categories. Compatible with the agentskills.io open standard. Skills support per-platform enable/disable, conditional activation based on tool availability, and prerequisite validation. The Skills Hub integrates with both ClawHub and skills.sh (v0.3.0).
- Python: 3.11 or newer (`requires-python = ">=3.11"` in `pyproject.toml`)
- Operating systems: Linux, macOS, WSL2. Windows native is not supported; use WSL2.
- Git: Required by the installer
Key Python dependencies (from pyproject.toml):
| Package | Purpose |
|---|---|
| `openai` | OpenAI-compatible API client |
| `anthropic>=0.39.0` | Native Anthropic API client |
| `prompt_toolkit` | Interactive CLI TUI |
| `rich` | Terminal output formatting |
| `pyyaml` | Configuration file parsing |
| `pydantic>=2.12.5` | Data validation |
| `faster-whisper>=1.0.0` | Local voice transcription |
| `firecrawl-py>=4.16.0` | Web content extraction |
| `edge-tts>=7.2.7` | Free text-to-speech (no API key needed) |
Optional extras (install with `pip install "hermes-agent[extra]"`):
| Extra | Contents |
|---|---|
| `messaging` | Telegram, Discord, Slack, WhatsApp gateway |
| `voice` | Push-to-talk CLI and voice note transcription |
| `mcp` | Model Context Protocol client |
| `honcho` | Honcho AI user modeling |
| `modal` / `daytona` | Serverless sandbox backends |
| `rl` | Atropos RL training integration |
| `acp` | ACP IDE integration server |
| `cron` | Cron job scheduler |
| `tts-premium` | ElevenLabs premium TTS |
| `all` | All of the above |
```bash
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
```

Works on Linux, macOS, and WSL2. The installer handles Python, Node.js, dependencies, and the hermes command. No prerequisites except Git.
After installation:
```bash
source ~/.bashrc   # reload shell (or: source ~/.zshrc)
hermes             # start chatting
```

```bash
hermes               # Interactive CLI -- start a conversation
hermes model         # Choose your LLM provider and model
hermes tools         # Configure which tools are enabled (curses UI)
hermes setup         # Run the full setup wizard (configures everything at once)
hermes gateway       # Start the messaging gateway (Telegram, Discord, etc.)
hermes mcp           # Install, configure, and authenticate MCP servers (v0.4.0)
hermes claw migrate  # Migrate settings and memories from OpenClaw
hermes doctor        # Diagnose configuration issues across all providers
hermes update        # Update to the latest version (with auto-restart for gateway)
```

Hermes has two entry points: start the terminal UI with hermes, or run the gateway and talk to it from Telegram, Discord, Slack, WhatsApp, Signal, or other platforms. Once you are in a conversation, many slash commands are shared across both interfaces.
| Action | CLI | Messaging platforms |
|---|---|---|
| Start chatting | hermes | Run hermes gateway setup + hermes gateway start, then send the bot a message |
| Start fresh conversation | /new or /reset | /new or /reset |
| Resume a named session | /resume [session] (v0.5.0) | /resume [session] (v0.5.0) |
| Change model | hermes model | hermes model |
| Set a personality | /personality [name] | /personality [name] |
| Retry or undo the last turn | /retry, /undo | /retry, /undo |
| Compress context / check usage | /compress, /usage, /insights [--days N] | /compress, /usage, /insights [days] |
| Toggle config status bar | /statusbar (v0.4.0) | -- |
| Queue a prompt without interrupting | /queue [prompt] (v0.4.0) | /queue [prompt] (v0.4.0) |
| Switch approval mode | /permission [mode] (v0.4.0) | /permission [mode] (v0.4.0) |
| Open an interactive browser session | /browser (v0.4.0) | -- |
| Show live cost / usage in gateway | -- | /cost (v0.4.0) |
| Approve / deny a pending command | -- | /approve, /deny (v0.4.0) |
| Toggle tool output verbosity | -- | /verbose (v0.5.0) |
| Browse skills | /skills or /<skill-name> | /skills or /<skill-name> |
| Interrupt current work | Ctrl+C or send a new message | /stop or send a new message |
| Platform-specific status | /platforms | /status, /sethome |
| Document | Contents |
|---|---|
| architecture.md | Component diagram, agent loop, context compression, session storage, internal design |
| changelog.md | v0.5.0, v0.4.0, v0.3.0 and v0.2.0 release notes with full feature lists |
| streaming.md | Unified streaming infrastructure, token delivery, platform support (v0.3.0) |
| Document | Contents |
|---|---|
| installation.md | Install methods (one-liner, manual, Windows), prerequisites, setup wizard |
| configuration.md | Complete config.yaml reference, env vars, precedence rules |
| cli-reference.md | All CLI commands, slash commands, flags, environment variables |
| providers.md | All 17 LLM providers, credential resolution, routing, OAuth flows |
| Document | Contents |
|---|---|
| plugins.md | Plugin architecture, discovery, registration, custom tools (v0.3.0) |
| hooks.md | Gateway and plugin lifecycle hooks, event types |
| browser.md | Browser automation, Browserbase, Browser Use, CDP connect |
| checkpoints.md | Filesystem checkpoints, rollback, git worktree isolation |
| cron.md | Scheduled task system, schedule formats, job management |
| voice-mode.md | Voice interaction, STT/TTS providers, Discord voice channels (v0.3.0) |
| batch-processing.md | Batch trajectory generation for SFT/RL training data |
| delegation.md | Subagent spawning, task isolation, parallel workstreams |
| api-server.md | OpenAI-compatible HTTP API, endpoints, compatible frontends |
| skins.md | CLI themes, custom skins, banner configuration |
| Document | Contents |
|---|---|
| skills.md | Skills system, SKILL.md format, 94 bundled + 12 optional skills catalog |
| tools.md | All 40+ built-in tools reference with parameters and toolsets |
| toolsets.md | Toolset definitions, compositions, per-platform configuration |
| mcp.md | MCP client integration, stdio/HTTP transports, sampling, tool filtering |
| Document | Contents |
|---|---|
| security.md | Five-layer security model, PII redaction, approvals, tirith scanning, OAuth |
| memory.md | MEMORY.md/USER.md system, Honcho integration, session storage |
| acp.md | ACP server for IDE integration (VS Code, Zed, JetBrains) |
| Document | Contents |
|---|---|
| gateway.md | Gateway architecture, session management, authorization, streaming |
| messaging/ | Per-platform setup guides for all 12 messaging platforms |
| messaging/telegram.md | Telegram Bot API setup, forum topics, voice notes |
| messaging/discord.md | Discord bot setup, voice channels, threads |
| messaging/slack.md | Slack Bolt + Socket Mode setup |
| messaging/whatsapp.md | WhatsApp Baileys bridge setup |
| messaging/signal.md | Signal via signal-cli-rest-api |
| messaging/email.md | Email via IMAP/SMTP |
| messaging/matrix.md | Matrix with E2E encryption |
| messaging/mattermost.md | Mattermost team chat |
| messaging/dingtalk.md | DingTalk enterprise messaging |
| messaging/homeassistant.md | Home Assistant smart home |
| messaging/open-webui.md | Open WebUI / API server frontend |
| messaging/sms.md | SMS via Twilio |
| Document | Contents |
|---|---|
| contributing.md | Dev setup, contribution guide, PR process, priorities |
Official documentation: hermes-agent.nousresearch.com/docs
- Discord: discord.gg/NousResearch
- Skills Hub: agentskills.io
- Issues: github.com/NousResearch/hermes-agent/issues
- Discussions: github.com/NousResearch/hermes-agent/discussions
MIT -- see LICENSE.
Built by Nous Research.