TrajectoryRL

Bittensor Subnet 11 — Optimize AI agent policies through decentralized competition

TrajectoryRL is a Bittensor subnet where miners compete to optimize AI agent policies for real-world tasks. Validators evaluate policy packs using deterministic scenarios, rewarding agents that are safe, efficient, and reliable.

Overview

┌──────────────────────────────────────────────────────────────┐
│                   TRAJECTORYRL SUBNET (SN11)                 │
│                                                              │
│  MINERS                              VALIDATORS              │
│  ┌───────────────┐                   ┌───────────────────┐   │
│  │ Upload        │   on-chain        │ Read commitments  │   │
│  │ pack.json to  │   commitment      │ from chain        │   │
│  │ public HTTP   │─────────────────> │                   │   │
│  │ endpoint      │                   │ Fetch packs via   │   │
│  └───────────────┘                   │ HTTP, verify      │   │
│        │                             │ hash + timestamp  │   │
│        │                             │                   │   │
│        │                             │ Evaluate via      │   │
│        │                             │ ClawBench         │   │
│        │                             └───────────────────┘   │
│        │                                      │              │
│        │                                      │ set_weights  │
│        ▼                                      ▼              │
│  ┌──────────────────────────────────────────────────────┐    │
│  │              BITTENSOR BLOCKCHAIN                    │    │
│  │   Commitments, weights, TAO rewards                  │    │
│  └──────────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────────┘

No server required — Miners upload packs to any HTTP endpoint and commit metadata on-chain. No public IP, no uptime needed.
Two-phase evaluation — ClawBench scenarios with fixed fixtures; LLM-as-judge scores trajectories against natural-language criteria (Phase 1: pack integrity, Phase 2: trajectory quality)
Content-addressed — Packs identified by SHA256 hash, verified against on-chain commitment
Winner-take-all — Best miner gets 100% of rewards; first-mover advantage protects early innovators
Anti-copy — On-chain block timestamps + NCD similarity detection + first-mover threshold (delta=0.05)
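The content-addressing check in the list above can be sketched in a few lines. This is a minimal illustration, not the validator's actual code: it assumes the on-chain commitment stores a hex-encoded SHA256 digest of the raw pack.json bytes, and the function names `sha256_hex` and `verify_pack` are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_pack(pack_bytes: bytes, committed_hash: str) -> bool:
    """Check fetched pack bytes against a committed hex digest.

    Assumption: the commitment is the digest of the raw file bytes.
    The real subnet may canonicalize the JSON before hashing.
    """
    return sha256_hex(pack_bytes) == committed_hash.strip().lower()


# Hypothetical pack content, used only to exercise the functions.
pack_bytes = b'{"example": "pack"}'
committed = sha256_hex(pack_bytes)

print(verify_pack(pack_bytes, committed))         # unchanged bytes match
print(verify_pack(pack_bytes + b" ", committed))  # any modification fails
```

Because the digest covers every byte, a miner cannot alter a pack after committing without the mismatch being detected at fetch time.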

See INCENTIVE_MECHANISM.md for full scoring, rewards, and anti-gaming details.
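For readers unfamiliar with NCD (normalized compression distance), the sketch below shows the general technique using zlib, along with one possible reading of the first-mover threshold. Both parts are assumptions for illustration: the compressor, the similarity cutoff, and the additive interpretation of delta (a challenger must exceed the incumbent's score by more than 0.05) are not confirmed here; INCENTIVE_MECHANISM.md is the authoritative description.

```python
import zlib


def clen(data: bytes) -> int:
    """Compressed length of data, used as an approximation of its information content."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for near-duplicates, near 1 for unrelated data."""
    cx, cy, cxy = clen(x), clen(y), clen(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def beats_incumbent(challenger: float, incumbent: float, delta: float = 0.05) -> bool:
    """Hypothetical first-mover rule: the challenger must win by more than delta."""
    return challenger > incumbent + delta


# Hypothetical inputs: a repetitive policy text and unrelated byte data.
original = b"Always confirm before sending email. " * 20
unrelated = bytes(range(256))

print(ncd(original, original) < ncd(original, unrelated))  # a copy is closer than unrelated data
print(beats_incumbent(0.80, 0.78))  # margin too small under this reading
print(beats_incumbent(0.90, 0.78))  # clear improvement
```

The idea is that a copied pack compresses well when concatenated with its source, so its distance to the original is small, while a small score margin alone is not enough to displace the earlier submission.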

Example ROI (1,000 tasks/day)

Unoptimized GLM-5:                       $12,300/month

Stage 1 — Prompt optimization (AGENTS.md tuning):
  Optimized prompts + stop rules:         $3,300/month  (73% reduction)

Stage 2 — Hybrid routing (AGENTS.md + injected skills):
  Multi-LLM dynamic routing:               $900/month  (93% reduction)
    ├─ Qwen 3.5 (Alibaba) handles 40% of sub-tasks (tool calls, lookups)
    ├─ GLM-5 (Z.ai) handles 25% (structured extraction, formatting)
    ├─ Gemini 3 Flash (Google) handles 20% (search, summarization)
    ├─ GPT-5.2 (OpenAI) handles 10% (reasoning, drafting)
    └─ Claude Opus 4.6 (Anthropic) handles 5% (complex judgment calls)
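The percentages in the example above follow directly from the monthly figures. The short check below recomputes them; the per-task cost additionally assumes a 30-day month, which the example does not state.

```python
BASELINE = 12_300  # unoptimized monthly cost in USD, from the example above

STAGES = {
    "Stage 1 (prompt optimization)": 3_300,
    "Stage 2 (hybrid routing)": 900,
}

# Share of sub-tasks handled by each model in Stage 2, in percent.
ROUTING_SHARES = {
    "Qwen 3.5": 40,
    "GLM-5": 25,
    "Gemini 3 Flash": 20,
    "GPT-5.2": 10,
    "Claude Opus 4.6": 5,
}


def reduction_pct(baseline: float, optimized: float) -> float:
    """Cost reduction relative to the baseline, in percent."""
    return 100 * (baseline - optimized) / baseline


for name, cost in STAGES.items():
    print(f"{name}: ${cost:,}/month, {reduction_pct(BASELINE, cost):.0f}% reduction")

# Assumption: 1,000 tasks/day over a 30-day month.
tasks_per_month = 1_000 * 30
print(f"Baseline cost per task: ${BASELINE / tasks_per_month:.2f}")
print(f"Routing shares sum to {sum(ROUTING_SHARES.values())}%")
```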

Quick Start

For Validators

Validators run in Docker, and Watchtower applies automatic updates from GHCR. When new code is pushed to prod, GitHub Actions builds a new image, and Watchtower pulls it and restarts the validator within 5 minutes.

1. Prerequisites (one-time)

# Install btcli
pip install bittensor-cli

# Create or import your wallet
btcli wallet create --wallet-name my-validator

# Register hotkey on SN11 (~0.2 TAO burn fee)
btcli subnets register --wallet-name my-validator --hotkey default --netuid 11

# Stake alpha so your weights count (must be top 64 by stake for validator permit)
btcli stake add --wallet-name my-validator --hotkey default --netuid 11 --amount 100

2. Configure environment

cat > .env.validator <<'EOF'
WALLET_NAME=my-validator
WALLET_HOTKEY=default
NETUID=11
NETWORK=finney
CLAWBENCH_LLM_API_KEY=your-api-key
CLAWBENCH_LLM_BASE_URL=https://open.bigmodel.cn/api/paas/v4/
CLAWBENCH_DEFAULT_MODEL=zhipu/glm-5
EOF

Supported providers (any OpenAI-compatible API works):

| Provider | `CLAWBENCH_LLM_BASE_URL` | `CLAWBENCH_DEFAULT_MODEL` |
|---|---|---|
| Zhipu AI (default) | `https://open.bigmodel.cn/api/paas/v4` | `zhipu/glm-5` |
| Chutes | `https://llm.chutes.ai/v1` | `chutes/zai-org/GLM-5-TEE` |
| OpenRouter | `https://openrouter.ai/api/v1` | `openrouter/zhipu/glm-5` |

| Variable | Required | Description |
|---|---|---|
| `WALLET_NAME` | Yes | Bittensor wallet name |
| `WALLET_HOTKEY` | Yes | Hotkey name (usually `default`) |
| `NETUID` | Yes | Subnet UID (`11`) |
| `NETWORK` | Yes | `finney`, `test`, or `local` |
| `CLAWBENCH_LLM_API_KEY` | Yes | API key for the LLM provider (e.g. Zhipu AI, Chutes, OpenRouter) |
| `CLAWBENCH_LLM_BASE_URL` | Yes | Base URL for the OpenAI-compatible API |
| `CLAWBENCH_DEFAULT_MODEL` | Yes | LLM model for evaluation (default: `zhipu/glm-5`) |
| `JUDGE_MODEL` | No | LLM model for judge (defaults to `CLAWBENCH_DEFAULT_MODEL`) |
| `JUDGE_API_KEY` | No | API key for judge (defaults to `CLAWBENCH_LLM_API_KEY`) |
| `JUDGE_BASE_URL` | No | Base URL for judge (defaults to `CLAWBENCH_LLM_BASE_URL`) |
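The required and optional variables above, including the judge fallbacks, can be checked before starting the container. The sketch below is a hypothetical preflight helper, not part of the validator; `resolve_config` and its error message are illustrative names, and only the variable names and fallback rules come from the table.

```python
import os

REQUIRED = [
    "WALLET_NAME",
    "WALLET_HOTKEY",
    "NETUID",
    "NETWORK",
    "CLAWBENCH_LLM_API_KEY",
    "CLAWBENCH_LLM_BASE_URL",
    "CLAWBENCH_DEFAULT_MODEL",
]

# Each optional judge variable falls back to its ClawBench counterpart.
JUDGE_FALLBACKS = {
    "JUDGE_MODEL": "CLAWBENCH_DEFAULT_MODEL",
    "JUDGE_API_KEY": "CLAWBENCH_LLM_API_KEY",
    "JUDGE_BASE_URL": "CLAWBENCH_LLM_BASE_URL",
}


def resolve_config(env: dict) -> dict:
    """Validate required variables and fill in judge defaults."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise ValueError(f"Missing required variables: {', '.join(missing)}")
    config = {key: env[key] for key in REQUIRED}
    for judge_key, fallback_key in JUDGE_FALLBACKS.items():
        config[judge_key] = env.get(judge_key) or env[fallback_key]
    return config


# Example values mirror the .env.validator file shown earlier.
example_env = {
    "WALLET_NAME": "my-validator",
    "WALLET_HOTKEY": "default",
    "NETUID": "11",
    "NETWORK": "finney",
    "CLAWBENCH_LLM_API_KEY": "your-api-key",
    "CLAWBENCH_LLM_BASE_URL": "https://open.bigmodel.cn/api/paas/v4/",
    "CLAWBENCH_DEFAULT_MODEL": "zhipu/glm-5",
}
print(resolve_config(example_env)["JUDGE_MODEL"])
```

To check the real environment instead of the example values, pass `dict(os.environ)` to `resolve_config`.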

3. Start validator

# Start validator + Watchtower (auto-updates from GHCR)
docker compose -f docker/docker-compose.validator.yml --env-file .env.validator up -d

# View logs
docker compose -f docker/docker-compose.validator.yml logs -f validator

The Docker container reads wallet keyfiles from the mounted ~/.bittensor/wallets/ directory. No btcli is needed inside the container.
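Since the container depends on the mounted wallet directory, it can help to confirm the hotkey file exists on the host before starting. The sketch below assumes the usual btcli on-disk layout of `<wallets root>/<wallet name>/hotkeys/<hotkey name>`; that layout and the helper names are assumptions, so adjust the path if your installation differs.

```python
from pathlib import Path

# Directory mounted into the container, as described above.
DEFAULT_ROOT = Path.home() / ".bittensor" / "wallets"


def hotkey_file(wallet_name: str, hotkey: str, root: Path = DEFAULT_ROOT) -> Path:
    """Expected location of a hotkey keyfile (assumed btcli layout)."""
    return Path(root) / wallet_name / "hotkeys" / hotkey


def wallet_ready(wallet_name: str, hotkey: str, root: Path = DEFAULT_ROOT) -> bool:
    """True if the hotkey keyfile exists at the expected location."""
    return hotkey_file(wallet_name, hotkey, root).is_file()


# Uses the wallet and hotkey names from the .env.validator example.
print(wallet_ready("my-validator", "default"))
```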

Tip: Watchtower checks for new images every 5 minutes. To update immediately:
docker compose -f docker/docker-compose.validator.yml pull
docker compose -f docker/docker-compose.validator.yml --env-file .env.validator up -d

See VALIDATOR_OPERATIONS.md for cost model, auto-update details, and operational guidance.

For Miners

Mining means writing policy packs — system prompts, tool usage rules, and stop conditions — that make AI agents perform tasks safely and cheaply. No GPU, no server, no uptime required.

IP Notice: All policy packs submitted to TrajectoryRL are published to public repositories and licensed under the MIT License. By submitting a pack, you agree that your submission is freely available for anyone — including TrajectoryRL, other miners, and third parties — to use, modify, and redistribute. Do not submit content you are not willing to release publicly under MIT.

1. Prerequisites (one-time)

pip install bittensor-cli

btcli wallet create --wallet-name my-miner
btcli subnets register --wallet-name my-miner --hotkey default --netuid 11

2. Configure environment

cat > .env.miner <<'EOF'
WALLET_NAME=my-miner
WALLET_HOTKEY=default
NETUID=11
NETWORK=finney
LLM_API_KEY=your-api-key
LLM_BASE_URL=https://open.bigmodel.cn/api/paas/v4/
LLM_MODEL=zhipu/glm-5
EOF

Tip: Any OpenAI-compatible provider works. For OpenRouter, use LLM_BASE_URL=https://openrouter.ai/api/v1 and LLM_MODEL=zhipu/glm-5.

3. Start mining

git clone https://github.com/trajectoryRL/trajectoryRL.git
cd trajectoryRL
pip install -e .

# Run in default mode: generates AGENTS.md → builds pack → uploads → submits
python neurons/miner.py run --mode default

Note: An AGENTS.md generated by the LLM in default mode, with no further tuning, is unlikely to score well. Actively optimize your policy pack: study the ClawBench scenarios, understand what makes an agent perform well, and iteratively refine your prompts, tool rules, and stop conditions.

4. Manual operations (optional)

# Build pack from your own AGENTS.md
python neurons/miner.py build --agents-md ./AGENTS.md -o pack.json

# Validate pack locally
python neurons/miner.py validate pack.json

# Check on-chain status
python neurons/miner.py status

5. Local testing with ClawBench

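# clawbench is a git submodule; fetch it first if the directory is empty
git submodule update --init clawbench
cd clawbench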
pip install -e .
# Set CLAWBENCH_LLM_API_KEY, CLAWBENCH_LLM_BASE_URL, CLAWBENCH_DEFAULT_MODEL in .env
# Example Zhipu:      CLAWBENCH_LLM_BASE_URL=https://open.bigmodel.cn/api/paas/v4/, CLAWBENCH_DEFAULT_MODEL=zhipu/glm-5
# Example Chutes:     CLAWBENCH_LLM_BASE_URL=https://llm.chutes.ai/v1,              CLAWBENCH_DEFAULT_MODEL=chutes/zai-org/GLM-5-TEE
# Example OpenRouter: CLAWBENCH_LLM_BASE_URL=https://openrouter.ai/api/v1,           CLAWBENCH_DEFAULT_MODEL=openrouter/zhipu/glm-5

# Test a single scenario
python scripts/run_episode.py --scenario inbox_triage --variant optimized --json

# Test all scenarios
python scripts/run_batch.py

See MINER_OPERATIONS.md for full details: automated mode, S3 upload, pack format, and scoring targets.

trajrl CLI

A standalone CLI for querying live subnet data — validators, miners, scores, submissions, and eval logs. Designed for both humans and AI agents (Claude Code, Cursor, Codex, OpenClaw, Manus).

pip install trajrl

trajrl status                       # Network health overview
trajrl validators                   # List all validators
trajrl scores                       # Per-miner scores (auto-picks validator)
trajrl miner --uid <uid>            # Miner detail + diagnostics
trajrl download -u <uid>            # Download miner's pack + eval results
trajrl submissions --failed         # Recent failed submissions
trajrl logs --show                  # Download and display latest cycle log
trajrl logs --type cycle            # List cycle log archives

Outputs JSON automatically when piped, Rich tables when interactive. See trajrl/README.md for full documentation.
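The piped-versus-interactive behavior is a common CLI pattern based on checking whether the output stream is a terminal. The sketch below illustrates that pattern only; it is not trajrl's implementation, it prints a plain text table rather than a Rich table, and the row data is hypothetical.

```python
import io
import json
import sys


def render(rows, stream=sys.stdout):
    """Write rows as JSON when piped, or as an aligned text table on a terminal."""
    if not stream.isatty():
        # Non-interactive consumer (pipe, file, agent): emit machine-readable JSON.
        json.dump(rows, stream)
        stream.write("\n")
        return
    if not rows:
        return
    headers = list(rows[0])
    widths = {h: max(len(h), *(len(str(r[h])) for r in rows)) for h in headers}
    stream.write("  ".join(h.ljust(widths[h]) for h in headers) + "\n")
    for r in rows:
        stream.write("  ".join(str(r[h]).ljust(widths[h]) for h in headers) + "\n")


# Hypothetical rows; the real CLI's field names may differ.
rows = [{"uid": 1, "score": 0.91}, {"uid": 2, "score": 0.87}]

# An in-memory buffer is not a terminal, so this takes the JSON branch.
buffer = io.StringIO()
render(rows, buffer)
print(json.loads(buffer.getvalue()) == rows)
```

This is why an agent or script that captures the command's output receives parseable JSON without passing an extra flag.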

Documentation

Incentive Mechanism — Scoring, rewards, winner-take-all, and anti-copy protection
Validator Operations — Cost model, auto-updates, and operational guidance
Miner Operations — Pack format, run modes, local testing, and submission workflow
ClawBench — Evaluation framework (scenarios, fixtures, scoring)
trajrl CLI — Query live subnet data from the terminal

Community

License

This project is licensed under the MIT License.

All miner-submitted policy packs are public and released under the same MIT License. By participating as a miner, you acknowledge that your submissions become open-source contributions available to everyone.

Built on Bittensor | Powered by ClawBench
