Bittensor Subnet 11 — Optimize AI agent policies through decentralized competition
TrajectoryRL is a Bittensor subnet where miners compete to optimize AI agent policies for real-world tasks. Validators evaluate policy packs against deterministic scenarios, rewarding the policies that make agents safe, efficient, and reliable.
┌──────────────────────────────────────────────────────────────┐
│ TRAJECTORYRL SUBNET (SN11) │
│ │
│ MINERS VALIDATORS │
│ ┌───────────────┐ ┌───────────────────┐ │
│ │ Upload │ on-chain │ Read commitments │ │
│ │ pack.json to │ commitment │ from chain │ │
│ │ public HTTP │─────────────────> │ │ │
│ │ endpoint │ │ Fetch packs via │ │
│ └───────────────┘ │ HTTP, verify │ │
│ │ │ hash + timestamp │ │
│ │ │ │ │
│ │ │ Evaluate via │ │
│ │ │ ClawBench │ │
│ │ └───────────────────┘ │
│ │ │ │
│ │ │ set_weights │
│ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ BITTENSOR BLOCKCHAIN │ │
│ │ Commitments, weights, TAO rewards │ │
│ └──────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
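The "fetch packs via HTTP, verify hash" step in the diagram boils down to a content-addressing check. A minimal sketch (the real validator logic lives in this repo; only the idea is shown here):

```python
import hashlib

def verify_pack(pack_bytes: bytes, committed_sha256: str) -> bool:
    """Content-addressing check: the fetched pack must hash to the
    SHA256 digest the miner committed on-chain."""
    return hashlib.sha256(pack_bytes).hexdigest() == committed_sha256

pack = b'{"agents_md": "..."}'
commitment = hashlib.sha256(pack).hexdigest()  # what the miner commits on-chain
assert verify_pack(pack, commitment)
assert not verify_pack(pack + b" ", commitment)  # any tampering fails the check
```

Because the commitment is on-chain, miners can host their packs anywhere: validators trust the hash, not the hosting endpoint.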
- No server required — Miners upload packs to any HTTP endpoint and commit metadata on-chain. No public IP, no uptime needed.
- Two-phase evaluation — Phase 1 checks pack integrity; Phase 2 runs ClawBench scenarios with fixed fixtures, with an LLM-as-judge scoring trajectories against natural-language criteria
- Content-addressed — Packs identified by SHA256 hash, verified against on-chain commitment
- Winner-take-all — Best miner gets 100% of rewards; first-mover advantage protects early innovators
- Anti-copy — On-chain block timestamps + NCD similarity detection + first-mover threshold (delta=0.05)
See INCENTIVE_MECHANISM.md for full scoring, rewards, and anti-gaming details.
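The anti-copy rules above can be illustrated with a small sketch. The delta value comes from the bullet list; the NCD here is the standard zlib-based normalized compression distance, and the subnet's actual implementation may differ in detail:

```python
import zlib

DELTA = 0.05  # first-mover threshold from the anti-copy bullet above

def ncd(a: bytes, b: bytes) -> float:
    """Normalized compression distance: near 0 for near-copies,
    near 1 for unrelated packs."""
    ca, cb = len(zlib.compress(a)), len(zlib.compress(b))
    cab = len(zlib.compress(a + b))
    return (cab - min(ca, cb)) / max(ca, cb)

def challenger_wins(incumbent_score: float, challenger_score: float) -> bool:
    """A later submission must beat the earlier one by more than DELTA."""
    return challenger_score > incumbent_score + DELTA

original = b"Confirm before destructive actions. Stop when the goal is met. " * 20
near_copy = original.replace(b"destructive", b"dangerous")
unrelated = bytes(range(256)) * 50

assert ncd(original, near_copy) < ncd(original, unrelated)  # copies stand out
assert not challenger_wins(0.80, 0.84)  # within delta: first mover keeps winning
assert challenger_wins(0.80, 0.86)      # genuine improvement displaces it
```

Together these mean a copied pack with a marginal tweak both looks like a copy (low NCD) and fails to clear the delta threshold.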
Unoptimized GLM-5: $12,300/month
Stage 1 — Prompt optimization (AGENTS.md tuning):
Optimized prompts + stop rules: $3,300/month (73% reduction)
Stage 2 — Hybrid routing (AGENTS.md + injected skills):
Multi-LLM dynamic routing: $900/month (93% reduction)
├─ Qwen 3.5 (Alibaba) handles 40% of sub-tasks (tool calls, lookups)
├─ GLM-5 (Z.ai) handles 25% (structured extraction, formatting)
├─ Gemini 3 Flash (Google) handles 20% (search, summarization)
├─ GPT-5.2 (OpenAI) handles 10% (reasoning, drafting)
└─ Claude Opus 4.6 (Anthropic) handles 5% (complex judgment calls)
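As a sketch of stage 2, the routing breakdown above can be expressed as a simple lookup table. The model names and shares come from the example; the task-type keys and fallback choice are hypothetical, not the subnet's API:

```python
# Task-type → model routing table mirroring the stage-2 breakdown above.
ROUTES = {
    "tool_call":  "qwen-3.5",         # ~40% of sub-tasks: tool calls, lookups
    "extraction": "glm-5",            # ~25%: structured extraction, formatting
    "search":     "gemini-3-flash",   # ~20%: search, summarization
    "reasoning":  "gpt-5.2",          # ~10%: reasoning, drafting
    "judgment":   "claude-opus-4.6",  # ~5%: complex judgment calls
}

def route(task_type: str) -> str:
    """Send each sub-task to its designated model; default to the cheapest."""
    return ROUTES.get(task_type, "qwen-3.5")

# Sanity-check the stage-wise savings quoted above.
baseline, stage1, stage2 = 12_300, 3_300, 900
assert round((1 - stage1 / baseline) * 100) == 73  # prompt optimization
assert round((1 - stage2 / baseline) * 100) == 93  # hybrid routing
```

The point of the policy pack is that this routing knowledge lives in prompts and rules, so it transfers across deployments without any serving infrastructure.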
Validators run via Docker with automatic updates from GHCR via Watchtower. When new code is pushed to prod, GitHub Actions builds a new image and Watchtower auto-pulls and restarts within 5 minutes.
# Install btcli
pip install bittensor-cli
# Create or import your wallet
btcli wallet create --wallet-name my-validator
# Register hotkey on SN11 (~0.2 TAO burn fee)
btcli subnets register --wallet-name my-validator --hotkey default --netuid 11
# Stake alpha so your weights count (must be top 64 by stake for validator permit)
btcli stake add --wallet-name my-validator --hotkey default --netuid 11 --amount 100

cat > .env.validator <<'EOF'
WALLET_NAME=my-validator
WALLET_HOTKEY=default
NETUID=11
NETWORK=finney
CLAWBENCH_LLM_API_KEY=your-api-key
CLAWBENCH_LLM_BASE_URL=https://open.bigmodel.cn/api/paas/v4/
CLAWBENCH_DEFAULT_MODEL=zhipu/glm-5
EOF

Supported providers (any OpenAI-compatible API works):
| Provider | CLAWBENCH_LLM_BASE_URL | CLAWBENCH_DEFAULT_MODEL |
|---|---|---|
| Zhipu AI (default) | https://open.bigmodel.cn/api/paas/v4 | zhipu/glm-5 |
| Chutes | https://llm.chutes.ai/v1 | chutes/zai-org/GLM-5-TEE |
| OpenRouter | https://openrouter.ai/api/v1 | openrouter/zhipu/glm-5 |
| Variable | Required | Description |
|---|---|---|
| WALLET_NAME | Yes | Bittensor wallet name |
| WALLET_HOTKEY | Yes | Hotkey name (usually default) |
| NETUID | Yes | Subnet UID (11) |
| NETWORK | Yes | finney, test, or local |
| CLAWBENCH_LLM_API_KEY | Yes | API key for the LLM provider (e.g. Zhipu AI, Chutes, OpenRouter) |
| CLAWBENCH_LLM_BASE_URL | Yes | Base URL for the OpenAI-compatible API |
| CLAWBENCH_DEFAULT_MODEL | Yes | LLM model for evaluation (default: zhipu/glm-5) |
| JUDGE_MODEL | No | LLM model for judge (defaults to CLAWBENCH_DEFAULT_MODEL) |
| JUDGE_API_KEY | No | API key for judge (defaults to CLAWBENCH_LLM_API_KEY) |
| JUDGE_BASE_URL | No | Base URL for judge (defaults to CLAWBENCH_LLM_BASE_URL) |
# Start validator + Watchtower (auto-updates from GHCR)
docker compose -f docker/docker-compose.validator.yml --env-file .env.validator up -d
# View logs
docker compose -f docker/docker-compose.validator.yml logs -f validator

The Docker container reads wallet keyfiles from the mounted ~/.bittensor/wallets/ directory. No btcli is needed inside the container.
Tip: Watchtower checks for new images every 5 minutes. To update immediately:
docker compose -f docker/docker-compose.validator.yml pull
docker compose -f docker/docker-compose.validator.yml --env-file .env.validator up -d
See VALIDATOR_OPERATIONS.md for cost model, auto-update details, and operational guidance.
Mining means writing policy packs — system prompts, tool usage rules, and stop conditions — that make AI agents perform tasks safely and cheaply. No GPU, no server, no uptime required.
IP Notice: All policy packs submitted to TrajectoryRL are published to public repositories and licensed under the MIT License. By submitting a pack, you agree that your submission is freely available for anyone — including TrajectoryRL, other miners, and third parties — to use, modify, and redistribute. Do not submit content you are not willing to release publicly under MIT.
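Conceptually, a pack bundles your policy text and is addressed by its hash. A purely illustrative sketch (the real pack.json schema is produced by `python neurons/miner.py build`, not by this snippet):

```python
import hashlib
import json

# Illustrative only: real packs come from `python neurons/miner.py build`.
policy = {
    "agents_md": (
        "# Policy\n"
        "- Prefer read-only tools; confirm before destructive actions.\n"
        "- Stop once the task goal is met or after 10 tool calls.\n"
    ),
}
blob = json.dumps(policy, sort_keys=True).encode()
digest = hashlib.sha256(blob).hexdigest()  # the content address committed on-chain
print(f"pack sha256: {digest}")
```

Any byte-level change to the pack changes this digest, which is why validators can fetch packs from untrusted HTTP endpoints.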
pip install bittensor-cli
btcli wallet create --wallet-name my-miner
btcli subnets register --wallet-name my-miner --hotkey default --netuid 11

cat > .env.miner <<'EOF'
WALLET_NAME=my-miner
WALLET_HOTKEY=default
NETUID=11
NETWORK=finney
LLM_API_KEY=your-api-key
LLM_BASE_URL=https://open.bigmodel.cn/api/paas/v4/
LLM_MODEL=zhipu/glm-5
EOF

Tip: Any OpenAI-compatible provider works. For OpenRouter, use LLM_BASE_URL=https://openrouter.ai/api/v1 and LLM_MODEL=zhipu/glm-5.
git clone https://github.com/trajectoryRL/trajectoryRL.git
cd trajectoryRL
pip install -e .
# Run in default mode: generates AGENTS.md → builds pack → uploads → submits
python neurons/miner.py run --mode default

Note: Simply letting the LLM randomly generate AGENTS.md may not get you a good score. You need to actively optimize and improve your policy pack — study the ClawBench scenarios, understand what makes an agent perform well, and iteratively refine your prompts, tool rules, and stop conditions.
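One way to structure that iteration, as a sketch: generate several AGENTS.md variants, score each locally, and keep the best. Here score_agents_md is a hypothetical stand-in for running ClawBench locally (e.g. via scripts/run_episode.py) and parsing the score:

```python
def pick_best(variants: dict, score_agents_md) -> str:
    """Return the name of the highest-scoring AGENTS.md variant."""
    return max(variants, key=lambda name: score_agents_md(variants[name]))

# Hypothetical scores you might have collected from local ClawBench runs:
local_scores = {"baseline": 0.62, "tighter-stop-rules": 0.71, "fewer-tools": 0.68}
variants = {name: f"AGENTS.md draft: {name}" for name in local_scores}
best = pick_best(variants, lambda text: local_scores[text.split(": ")[1]])
assert best == "tighter-stop-rules"
```

Repeat with new variants seeded from the current best until local scores plateau, then build and submit the winning pack.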
# Build pack from your own AGENTS.md
python neurons/miner.py build --agents-md ./AGENTS.md -o pack.json
# Validate pack locally
python neurons/miner.py validate pack.json
# Check on-chain status
python neurons/miner.py status

cd clawbench
pip install -e .
# Set CLAWBENCH_LLM_API_KEY, CLAWBENCH_LLM_BASE_URL, CLAWBENCH_DEFAULT_MODEL in .env
# Example Zhipu: CLAWBENCH_LLM_BASE_URL=https://open.bigmodel.cn/api/paas/v4/, CLAWBENCH_DEFAULT_MODEL=zhipu/glm-5
# Example Chutes: CLAWBENCH_LLM_BASE_URL=https://llm.chutes.ai/v1, CLAWBENCH_DEFAULT_MODEL=chutes/zai-org/GLM-5-TEE
# Example OpenRouter: CLAWBENCH_LLM_BASE_URL=https://openrouter.ai/api/v1, CLAWBENCH_DEFAULT_MODEL=openrouter/zhipu/glm-5
# Test a single scenario
python scripts/run_episode.py --scenario inbox_triage --variant optimized --json
# Test all scenarios
python scripts/run_batch.py

See MINER_OPERATIONS.md for full details: automated mode, S3 upload, pack format, and scoring targets.
A standalone CLI for querying live subnet data — validators, miners, scores, submissions, and eval logs. Designed for both humans and AI agents (Claude Code, Cursor, Codex, OpenClaw, Manus).
pip install trajrl
trajrl status # Network health overview
trajrl validators # List all validators
trajrl scores # Per-miner scores (auto-picks validator)
trajrl miner --uid <uid> # Miner detail + diagnostics
trajrl download -u <uid> # Download miner's pack + eval results
trajrl submissions --failed # Recent failed submissions
trajrl logs --show # Download and display latest cycle log
trajrl logs --type cycle # List cycle log archives

Outputs JSON automatically when piped, Rich tables when interactive. See trajrl/README.md for full documentation.
- Incentive Mechanism — Scoring, rewards, winner-take-all, and anti-copy protection
- Validator Operations — Cost model, auto-updates, and operational guidance
- Miner Operations — Pack format, run modes, local testing, and submission workflow
- ClawBench — Evaluation framework (scenarios, fixtures, scoring)
- trajrl CLI — Query live subnet data from the terminal
- GitHub: https://github.com/trajectoryRL/trajectoryRL
- Website: https://trajrl.com
This project is licensed under the MIT License.
All miner-submitted policy packs are public and released under the same MIT License. By participating as a miner, you acknowledge that your submissions become open-source contributions available to everyone.