Skip to content

nginx for AI agents. Cost budgets, rate limiting, circuit breakers, live dashboard.

Notifications You must be signed in to change notification settings

tobySolutions/agentgateway

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

agentgateway

nginx for AI agents.

Build npm License: MIT


The Problem

You have a fleet of AI agents -- researchers, coders, reviewers, planners -- all making API calls to Claude, GPT-4, and other models. Without a gateway, you have:

  • No visibility into which agent is spending how much
  • No budget controls -- a runaway agent can burn through your API credits in minutes
  • No rate limiting per agent -- one noisy agent starves the others
  • No circuit breakers -- one failing provider cascades errors everywhere
  • No audit trail -- good luck debugging what happened at 3am

agentgateway sits between your agents and the LLM providers. It proxies every request, tracks every token, enforces budgets, and gives you a real-time dashboard to see exactly what your fleet is doing.

Installation

npm install -g agentgateway

Quick Start

Create a gateway.yaml:

gateway:
  port: 8080

agents:
  researcher:
    model: claude-sonnet-4-20250514
    provider: anthropic
    budget:
      max_cost_per_hour: 5.00
    rate_limit: 100/min

  writer:
    model: claude-haiku-4-5-20251001
    provider: anthropic
    budget:
      max_cost_per_hour: 1.00
    rate_limit: 50/min

dashboard:
  port: 4040

Start the gateway:

agentgateway start -c gateway.yaml

Point your agents at the gateway instead of the API directly:

# Before (direct to Anthropic)
curl https://api.anthropic.com/v1/messages ...

# After (through agentgateway)
curl http://localhost:8080/v1/messages \
  -H "x-agent-id: researcher" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -d '{"model": "claude-sonnet-4-20250514", "messages": [...]}'

CLI Commands

Command Description
agentgateway start Start the proxy server
agentgateway status Show status of all agents
agentgateway logs Stream audit logs
agentgateway logs -f Follow logs in real-time
agentgateway pause <agent> Pause an agent
agentgateway pause <agent> -r Resume a paused agent
agentgateway budget <agent> View agent budget
agentgateway budget <agent> --hourly 10 Update hourly budget
agentgateway dashboard Open web dashboard

Architecture

                                   agentgateway
 ┌──────────┐     ┌─────────────────────────────────────────────┐
 │  Agent 1  │────▶│                                             │
 │ researcher│     │  ┌───────────┐  ┌──────────┐  ┌─────────┐  │     ┌──────────┐
 └──────────┘     │  │   Rate    │  │  Budget  │  │ Circuit │  │────▶│ Anthropic│
                  │  │  Limiter  │─▶│ Tracker  │─▶│ Breaker │  │     └──────────┘
 ┌──────────┐     │  └───────────┘  └──────────┘  └─────────┘  │
 │  Agent 2  │────▶│                                             │
 │   coder   │     │  ┌───────────┐  ┌──────────┐  ┌─────────┐  │     ┌──────────┐
 └──────────┘     │  │  Policy   │  │  Audit   │  │  Proxy  │  │────▶│  OpenAI  │
                  │  │  Engine   │  │  Logger  │  │  Layer  │  │     └──────────┘
 ┌──────────┐     │  └───────────┘  └──────────┘  └─────────┘  │
 │  Agent 3  │────▶│                                             │
 │  writer   │     │         ┌──────────────────┐               │
 └──────────┘     │         │    Dashboard     │               │
                  │         │   :4040 (WS)     │               │
                  │         └──────────────────┘               │
                  └─────────────────────────────────────────────┘

Features

Per-Agent Budget Control

Set hourly, daily, and total cost limits per agent. Agents are automatically paused when they exceed their budget.

Token Bucket Rate Limiting

Each agent gets its own rate limiter. Configure requests per second, minute, hour, or day. Burst-friendly token bucket algorithm.

Circuit Breaker

Full state machine (CLOSED -> OPEN -> HALF_OPEN -> CLOSED) with configurable error thresholds, reset timeouts, and sliding window error tracking.

Policy Engine

Define rules in YAML that evaluate against agent state:

policies:
  - name: high-error-rate
    condition: "error_rate > 0.5"
    action: pause

  - name: runaway-spend
    condition: "cost_this_hour > 50.0"
    action: deny

SSE Streaming Support

Full passthrough of Server-Sent Events with mid-stream token counting. Your agents get streaming responses with zero added latency.

Real-Time Dashboard

Dark-themed web dashboard showing agent status, costs, call rates, errors, and circuit breaker states. Updates via WebSocket every second.

Structured Audit Log

Every request logged with: agent ID, model, provider, tokens (in/out), cost, latency, and HTTP status. Query by agent, time range, or export as JSON.

Multi-Provider

Route different agents to different providers. Run your planner on Claude Opus, your coder on GPT-4o, and your summarizer on Haiku -- all through one gateway.

Configuration Reference

gateway:
  port: 8080            # Proxy server port
  host: 0.0.0.0         # Bind address

agents:
  <agent-id>:
    model: string        # Default model for this agent
    provider: string     # "anthropic" or "openai"
    api_key: string      # Optional per-agent API key
    budget:
      max_cost_per_hour: number
      max_cost_per_day: number     # Optional
      max_cost_total: number       # Optional lifetime cap
    rate_limit: string   # Format: "<count>/<unit>" (e.g. "100/min")
    circuit_breaker:
      error_threshold: number      # Errors before tripping (default: 5)
      reset_timeout: number        # Ms before retry (default: 30000)
      half_open_requests: number   # Successes to close (default: 3)
    tags: string[]       # Optional labels

policies:
  - name: string
    condition: string    # e.g. "error_rate > 0.5"
    action: string       # "pause", "deny", "alert", "throttle"
    params: object       # Action-specific parameters

dashboard:
  port: 4040
  host: 0.0.0.0

audit:
  log_file: string       # Path to JSONL audit log
  max_entries: number    # Max in-memory entries

Condition Fields

Field Type Description
error_rate number Error ratio in current hour (0-1)
cost_this_hour number Dollar cost in current hour
cost_today number Dollar cost today
total_cost number Lifetime dollar cost
calls_this_hour number API calls in current hour
call_count number Lifetime API calls
error_count number Lifetime errors
status string Agent status (active/paused/idle)

API Endpoints

The gateway exposes management endpoints alongside the proxy:

Endpoint Method Description
/health GET Health check
/api/status GET All agent statuses
/api/agents/:id/pause POST Pause an agent
/api/agents/:id/resume POST Resume an agent
/api/budget/:id PUT Update agent budget
/api/logs GET Recent audit entries

Development

# Clone and install
git clone https://github.com/tobySolutions/agentgateway.git
cd agentgateway
npm install

# Build all packages
npm run build

# Development mode (watch)
npm run dev

License

MIT

About

nginx for AI agents. Cost budgets, rate limiting, circuit breakers, live dashboard.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors