v0.30 - Agents Unleashed

Describe what you want. Go home.
Sandcastle ships it.

Which provider. Which model. Which agent. What happens when it fails. What it costs. Where the data lives. Who approves what. You used to build all of that yourself. Now you describe what you want and go home. European-built, open source.

1. Install - One command. No database, no Redis, no Docker. Just Python.
2. Describe - Tell AI what you need. Or pick from 236 ready-made templates.
3. Ship - It runs. It fails over. It tracks costs. You go home.
or from source
$ git clone https://github.com/gizmax/Sandcastle.git && cd Sandcastle && uv sync && uv run sandcastle serve
7 AI Providers
21 Step Types
236 Templates
63 Integrations
14,600+ Tests Passing
EU AI Act Compliant
Zero vendor lock-in - 7 AI providers + 15 managed agent templates. Cloud agents, local models, OCR - all composable. EU data residency built in.

You became a developer to build things.

Not to manage API keys. Not to debug rate limits at 3 AM. Not to explain to your EU client why their data just passed through a Virginia data center. Somewhere between the framework docs and the third provider outage this month, you forgot what you were actually trying to build. Sandcastle remembers.

Without Sandcastle
  • Three months building glue code before the first real feature
  • Locked into one provider. Pricing changes? Start over.
  • Invoice arrives. Surprise. No one tracked the costs.
  • Provider goes down at 3 AM. Your phone rings.
  • EU client asks where their data lives. You check Slack.
  • Compliance audit next month. Nobody documented anything.
With Sandcastle
  • Describe it, ship it. Tell AI what you need - in production today, not next quarter.
  • 7 providers, one setting. Switch from Claude to Mistral in one line. Run locally with oMLX.
  • You know what it costs. Per provider. Per workflow. Per step. Right now.
  • Provider down? You sleep. Auto-failover switches to the next one.
  • EU data stays in EU. One toggle. Hard enforcement. Not a policy doc.
  • Agents as building blocks. 15 templates. Or describe what you need. AI designs the agent.
  • Audit trail from day one. Tamper-evident. EU AI Act ready. Always on.
🔗
Pipelines, not scripts
"Scrape 50 pages, enrich each one, score and rank - in one YAML file."
🔀
Any provider, one setting
"Claude today, Mistral tomorrow, Ollama on your laptop. Change one line."
💰
Know what it costs
"$120 on Claude last month. Same work on Mistral? $45. Here's a button."
😴
Sleep through outages
"Provider hit rate limit at 3 AM. Sandcastle switched to the next one. You slept."
🇪🇺
EU data stays in EU
"One toggle. Not a policy document - hard enforcement at the routing layer."
🎯
Right model, right job
"Critical analysis gets Claude. Simple formatting gets Haiku. Automatic."
📄
Parse anything
"PDF, DOCX, XLSX. 4 OCR engines including GLM-OCR at 94.6% accuracy. Runs locally, free."
🛡️
Compliance from day one
"EU AI Act, audit trail, PII redaction, kill switch. Not an add-on - built in."
🤖
Agents as steps
"Claude researches in the cloud. Mistral formats for 1/10th cost. oMLX runs locally. All in one workflow."
📊
Documents in, reports out
"Scan a contract with 94.6% OCR. Analyze it with AI. Get a PDF report with charts. One workflow."

Everything you need for production agents

From sandbox execution to production-grade orchestration - everything you need in one package.

🔀

DAG Workflow Engine

Define multi-step pipelines in YAML. Dependencies, parallel branches, data passing between steps.

⚡

Parallel Execution

Steps at the same DAG layer run concurrently. Fan out over lists with configurable concurrency.
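
One way to picture "the same DAG layer" is to group steps by their dependencies: a step lands in the first layer where everything it depends on has already run. The sketch below is illustrative only and is not Sandcastle's scheduler; `dag_layers` and the step ids are hypothetical names.

```python
def dag_layers(depends_on: dict[str, list[str]]) -> list[list[str]]:
    """Group step ids into layers; each layer depends only on earlier layers."""
    remaining = {step: set(deps) for step, deps in depends_on.items()}
    layers: list[list[str]] = []
    while remaining:
        # Steps with no unmet dependencies can all run concurrently.
        ready = sorted(step for step, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle or unknown step id")
        layers.append(ready)
        for step in ready:
            del remaining[step]
        for deps in remaining.values():
            deps.difference_update(ready)
    return layers


# Hypothetical pipeline: two scrapers feed one enrichment step.
steps = {
    "scrape_a": [],
    "scrape_b": [],
    "enrich": ["scrape_a", "scrape_b"],
    "rank": ["enrich"],
}
layers = dag_layers(steps)
print(layers)
```

Here `scrape_a` and `scrape_b` share the first layer, so they are the pair that could run side by side.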

🏖️

Sandshore Runtime

Purpose-built agent runtime with circuit breaker, pool management, health caching, and production-grade optimization.

📦

Pluggable Sandboxes

Four backends: E2B cloud microVMs, Docker containers, Cloudflare Workers edge, or local subprocess. Switch with one env var.

🧩

21 Step Types

llm, http, code, condition, classify, loop, race, sensor, gate, transform, notify, delegate, parse, openclaw, and more. Mix AI with $0 deterministic steps.

🌐

Universal Advisor

Any LLM powers all AI features - Claude, Mistral, OpenAI, Ollama, Google, MiniMax. One config, short aliases like sonnet, opus, openai/gpt-4o, ollama/llama3.

🔄

Smart Auto-Failover

Automatic fallback on 429 or 5xx. Per-key cooldown tracking, ordered failover chains. SLO-aware routing picks the best model for critical ops, cheapest for simple ones.
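
The idea of an ordered failover chain with cooldowns can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Sandcastle's implementation: `FailoverChain`, `ProviderError`, the 60-second cooldown, and tracking cooldowns by provider name (rather than per API key) are all hypothetical choices, and SLO-aware routing is left out.

```python
import time


class ProviderError(Exception):
    """Hypothetical error carrying the HTTP status a provider returned."""

    def __init__(self, status: int):
        super().__init__(f"provider returned HTTP {status}")
        self.status = status


def is_retryable(status: int) -> bool:
    # Fail over on rate limits (429) and server errors (5xx) only.
    return status == 429 or 500 <= status < 600


class FailoverChain:
    def __init__(self, providers, cooldown_seconds=60.0, clock=time.monotonic):
        self.providers = providers            # ordered list of (name, callable)
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.cooling_until = {}               # name -> time when it may be retried

    def call(self, prompt):
        for name, send in self.providers:
            if self.clock() < self.cooling_until.get(name, float("-inf")):
                continue                      # still cooling down, skip it
            try:
                return name, send(prompt)
            except ProviderError as err:
                if not is_retryable(err.status):
                    raise                     # e.g. a 400 is the caller's problem
                self.cooling_until[name] = self.clock() + self.cooldown_seconds
        raise RuntimeError("all providers are failing or cooling down")


def rate_limited(prompt):
    raise ProviderError(429)


def healthy(prompt):
    return f"echo: {prompt}"


chain = FailoverChain([("primary", rate_limited), ("backup", healthy)])
used, reply = chain.call("hello")
print(used, reply)
```

After the first call, `primary` sits in cooldown, so later calls go straight to `backup` until the cooldown expires.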

✨

AI Workflow Generator

Describe what you need in plain English. sandcastle generate creates a complete workflow YAML with steps, dependencies, and model selection.

🧠

Agent Memory - No OpenAI Required

Persistent context across runs powered by Anthropic Claude + local fastembed/ONNX embeddings. Semantic search auto-injects relevant memories into prompts. Zero OpenAI dependency.

🎯

Cost-Latency Optimizer

SLO-based model routing with feedback loop, model degradation alerts, and automatic severity-based recommendations.

🧬

AutoPilot v2

Thompson Sampling variant selection, Welch's t-test significance testing, and progressive rollout (canary 10% -> partial 50% -> full 100%).
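
For readers new to Thompson Sampling, here is a minimal Beta-Bernoulli sketch of the selection step only. It assumes each run is scored as a simple pass or fail; `ThompsonSelector`, the variant names, and the success rates are made up for illustration, and the significance testing and staged rollout are not modeled.

```python
import random


class ThompsonSelector:
    """Pick variants by sampling from a Beta posterior over each success rate."""

    def __init__(self, variants, seed=None):
        self.rng = random.Random(seed)
        # [alpha, beta] per variant, starting from a uniform Beta(1, 1) prior.
        self.stats = {variant: [1, 1] for variant in variants}

    def choose(self):
        draws = {
            variant: self.rng.betavariate(alpha, beta)
            for variant, (alpha, beta) in self.stats.items()
        }
        return max(draws, key=draws.get)

    def record(self, variant, success):
        # A success bumps alpha, a failure bumps beta.
        self.stats[variant][0 if success else 1] += 1


# Simulated world with made-up success rates for two prompt variants.
true_rates = {"prompt_a": 0.2, "prompt_b": 0.8}
selector = ThompsonSelector(true_rates, seed=7)
world = random.Random(11)
picks = {name: 0 for name in true_rates}

for _ in range(500):
    variant = selector.choose()
    picks[variant] += 1
    selector.record(variant, world.random() < true_rates[variant])

print(picks)
```

Traffic drifts toward the stronger variant as evidence accumulates, while the weaker one still gets occasional trials.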

💡

Cost Intelligence

Per-provider cost breakdown with proactive savings recommendations. Shows "Would cost $45 via Mistral" comparisons so you always pick the right model for the budget.

🛡️

Policy Engine

Declarative rules for PII redaction, secret blocking, cost guards. Applied per step or globally.

💰

Budget Guardrails

Cost tracking per step and per run. Set hard budgets and get alerts before they blow up your invoice.
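
A hard budget with an early alert can be modeled as a small accumulator. This is a sketch under assumptions, not Sandcastle's API: `RunBudget`, `BudgetExceeded`, and the 80% alert ratio are hypothetical.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push the run past its hard limit."""


class RunBudget:
    def __init__(self, hard_limit_usd: float, alert_ratio: float = 0.8):
        self.hard_limit_usd = hard_limit_usd
        self.alert_at_usd = hard_limit_usd * alert_ratio
        self.per_step: dict[str, float] = {}

    @property
    def total_usd(self) -> float:
        return sum(self.per_step.values())

    def charge(self, step_id: str, cost_usd: float) -> bool:
        """Record a step cost; return True once spend reaches the alert level."""
        if self.total_usd + cost_usd > self.hard_limit_usd:
            raise BudgetExceeded(f"step {step_id!r} would exceed the run budget")
        self.per_step[step_id] = self.per_step.get(step_id, 0.0) + cost_usd
        return self.total_usd >= self.alert_at_usd


budget = RunBudget(hard_limit_usd=1.0)
alerts = [
    budget.charge("fetch", 0.0),      # http step, no LLM cost
    budget.charge("enrich", 0.5),
    budget.charge("route", 0.375),    # crosses the 80% alert level
]
try:
    budget.charge("extra", 0.25)      # would total 1.125 and is refused
    blocked = False
except BudgetExceeded:
    blocked = True

print(alerts, blocked, budget.total_usd)
```

The refused charge is never recorded, so the run total stays at the last accepted value.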

👤

HITL Approvals

Pause workflows for human review. Approve, reject, or edit the data before the next step runs. Multi-strategy gates.

⚡

Circuit Breaker

Automatic failure detection with CLOSED/OPEN/HALF_OPEN states. Prevents cascading failures across your pipeline.
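
The three states form a small state machine. The sketch below shows the generic circuit breaker pattern, not Sandcastle's code; `CircuitBreaker`, `CircuitOpenError`, the failure threshold, and the reset timeout are hypothetical, and a fake clock keeps the demo free of sleeps.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"        # calls pass through
    OPEN = "open"            # calls are rejected immediately
    HALF_OPEN = "half_open"  # one trial call is allowed


class CircuitOpenError(Exception):
    pass


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = State.CLOSED
        self.failures = 0
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state is State.OPEN:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; call rejected")
            self.state = State.HALF_OPEN
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, opens the circuit.
            if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
                self.state = State.OPEN
                self.opened_at = self.clock()
            raise
        self.failures = 0
        self.state = State.CLOSED
        return result


def flaky():
    raise ConnectionError("provider down")


now = [0.0]  # fake clock
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])
observed = []

for _ in range(2):
    try:
        breaker.call(flaky)
    except ConnectionError:
        pass
observed.append(breaker.state)               # two failures open the circuit

try:
    breaker.call(lambda: "ok")
except CircuitOpenError:
    observed.append("rejected")              # still inside the reset timeout

now[0] = 11.0
observed.append(breaker.call(lambda: "ok"))  # trial call succeeds
observed.append(breaker.state)
print(observed)
```

Rejecting calls while OPEN is what stops a failing dependency from dragging down every step that uses it.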

🚦

Progressive Rollout

Deploy experiment winners safely: canary (10%) to partial (50%) to full (100%). Statistical significance required before each stage advance.

🇪🇺

EU Data Residency

Toggle one setting - all AI processing stays in EU or local. Built-in compliance with data residency requirements, not just a promise. Works with any provider that offers EU endpoints.

🔧

63 Tool Connectors

Slack, GitHub, OpenAI, Anthropic, AWS S3, Google Sheets, Airtable, Discord, Supabase, Pinecone, Stripe, Shopify, PagerDuty, Datadog, Langfuse, Qdrant, GCS, Azure Blob, Exa, and 44 more. Add tools: [slack] to any step.

🤝

A2A Protocol

Google's Agent-to-Agent protocol. Agent card discovery at /.well-known/agent.json, JSON-RPC 2.0 task management.

📺

AG-UI Streaming

CopilotKit's Agent-User Interaction protocol. Real-time SSE streaming of agent state, tool calls, and text deltas to any frontend.

🔌

MCP Server

Built-in Model Context Protocol server. Run workflows, check status, and manage schedules from Claude Desktop, Cursor, or Windsurf.

📡

Webhooks + SSE

Real-time event streaming via SSE. Webhook dispatcher for external integrations. Live updates, no polling.

🔗

Named Connections

Multiple credential instances per tool. slack:engineering, postgresql:analytics - named connections resolve to the right credentials automatically.

🌐

Browser RPA

Five modes: Playwright (selector-based), Computer Use (vision AI), DOM Extract (accessibility tree), LightPanda (10x faster headless via CDP), Browserbase (cloud-hosted, zero cold-start). Action caching, CAPTCHA escalation, execution replay.

🧠

AI Connectors

OpenAI, Anthropic, ElevenLabs, Tavily, Firecrawl, Pinecone. Chain multiple AI providers, vector search, web scraping, and text-to-speech in your workflows.

🚀

DevOps Connectors

Vercel, Cloudflare Workers, Datadog, PagerDuty, AWS S3, Redis. Deploy, monitor, alert, and manage infrastructure from workflow steps.

⌨️

Full CLI Suite

templates, runs, replay, fork, approve, reject, generate, doctor. Global --json flag for scripting.

📊

Real-time Dashboard

Runs, costs, schedules, dead letters, approvals, experiments, policy violations - all in one place. Visual workflow builder included.

📋

236 Templates

118 built-in + 118 community templates. Marketing, sales, engineering, support, HR, legal, and general AI. Full Community Hub with one-click install and uninstall.

🧪

Evaluation Framework

A/B test models and prompts per step. Automatic quality evaluation and best-variant deployment with AutoPilot.

🚀

REST API + Docs

Full OpenAPI spec. Interactive docs at /api/docs. Zero-config local mode - sandcastle init + sandcastle serve.

🏪

Community Hub

Browse, install, and share workflow templates. One-click install from the dashboard or sandcastle hub install author/name from CLI.

📡

OpenTelemetry

Optional OTLP instrumentation with workflow and step-level spans. Includes cost, duration, and token counts as span attributes. Install with pip install sandcastle-ai[otel].

🤖
AI Providers - mix per step
  • Claude - Opus, Sonnet, Haiku
  • OpenAI - Codex, Codex Mini
  • MiniMax - M2.5
  • Gemini - via OpenRouter
  • Ollama - Local models
  • Mistral - Large, Small, Codestral
  • oMLX - Apple Silicon local
📦
Sandbox Backends - switch via env var
  • E2B - Cloud sandboxes (default)
  • Docker - Local containers
  • Local - Subprocess (dev only)
  • Cloudflare - Edge Workers

Workflows as YAML

No SDKs, no boilerplate. Declare your pipeline, Sandcastle handles the rest.

lead-enrichment.yaml hybrid
name: "Lead Enrichment"
default_model: sonnet

steps:
  - id: "fetch"
    type: http              # $0 - no LLM needed
    http_config:
      url: "https://api.example.com/company/{input.id}"
      method: GET

  - id: "enrich"
    type: llm               # single API call, no sandbox
    depends_on: ["fetch"]
    prompt: "Research: {steps.fetch.output}"

  - id: "route"
    type: classify          # LLM-based routing
    depends_on: ["enrich"]
    classify_config:
      categories: [hot, warm, cold]
      input: "{steps.enrich.output}"
      branches:
        hot: [priority-outreach]
        warm: [nurture-sequence]
        cold: [archive]

Mix agent + lightweight steps. One file.

Not every step needs a full AI agent. Use http for API calls ($0), code for Python snippets, condition and classify for branching, loop for iteration, race for parallel competition, sensor for polling, transform for templates, notify for alerts - and llm or full agent steps where you actually need AI.

  • 21 step types - standard, llm, http, code, condition, classify, loop, race, sensor, gate, transform, notify, delegate, approval, sub_workflow, parse, openclaw, map, retry, switch, agent
  • Smart branching - condition, classify, and race route to different paths
  • $0 steps - http, code, condition, transform, notify cost nothing - no LLM call
  • Data passing - reference prior outputs with {steps.id.output}
  • Per-step models - Claude, OpenAI, MiniMax, Gemini - mix per step
  • Automatic retries - exponential backoff on failure
  • Persistent storage - results saved to disk (local) or S3 (production)
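
To make the data-passing bullet concrete, here is a rough sketch of how placeholders such as `{input.id}` and `{steps.fetch.output}` could be resolved before a step runs. It is an assumption about the mechanics, not Sandcastle's template engine; `render` and the regular expression are hypothetical, and only these two placeholder forms are handled.

```python
import re

# Matches {input.<key>} or {steps.<step_id>.output}.
PLACEHOLDER = re.compile(
    r"\{(?:input\.(?P<key>\w+)|steps\.(?P<step>[\w-]+)\.output)\}"
)


def render(template: str, run_input: dict, step_outputs: dict) -> str:
    """Replace placeholders with values from the run input or earlier steps."""

    def substitute(match: re.Match) -> str:
        if match.group("key") is not None:
            return str(run_input[match.group("key")])
        return str(step_outputs[match.group("step")])

    return PLACEHOLDER.sub(substitute, template)


url = render("https://api.example.com/company/{input.id}", {"id": 42}, {})
prompt = render("Research: {steps.fetch.output}", {}, {"fetch": "Acme Corp"})
print(url)
print(prompt)
```

A reference to a step that has not produced output raises `KeyError` in this sketch, which is one simple way to surface a missing `depends_on`.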

See everything. Control everything.

Runs, costs, schedules, dead letters, approvals, experiments, policy violations - all in one place. Includes a visual workflow builder for drag-and-drop pipeline design.

Talk to Sandcastle from your AI editor

Built-in MCP (Model Context Protocol) server. Claude Desktop, Cursor, and Windsurf can run workflows, check status, and manage schedules - all from the chat interface.

claude_desktop_config.json
{
  "mcpServers": {
    "sandcastle": {
      "command": "sandcastle",
      "args": ["mcp"]
    }
  }
}

8 tools. 3 resources. Zero config.

Add one JSON block to your client config. The MCP server connects to a running sandcastle serve instance and exposes the full workflow API.

  • run_workflow - run a saved workflow by name with optional input
  • run_workflow_yaml - run a workflow from inline YAML definition
  • get_run_status - detailed status with all step results
  • list_runs - browse runs with status and workflow filters
  • cancel_run - stop a queued or running workflow
  • save_workflow - save YAML workflow definitions to the server
  • create_schedule / delete_schedule - manage cron schedules
  • Resources - read-only access to workflows, schedules, and health

236 workflow templates, ready to deploy

Skip the blank page. Pick a template, tweak the prompts, and ship. Browse the Community Hub or install via CLI.

  • Marketing (20) - Blog to Social, SEO Content, Ad Copy, Email Campaign, Content Calendar, Competitor Analysis...
  • Engineering (17) - API Docs, Data Extractor, Jira Triage, Release Notes, Slack Standup, Incident Responder...
  • General AI (17) - Research Agent, PDF Summary, Chain of Thought, Invoice Processor, Course Creator...
  • Sales & CRM (15) - Lead Enrichment, Lead Scoring, CRM Sync, Meeting Recap, Churn Predictor, Pipeline Autopilot...
  • HR & Legal (11) - Resume Screener, Contract Review, Compliance Checker, Onboarding, Job Description, Recruiting Pipeline...
  • Support (8) - Ticket Triage, FAQ Generator, SLA Watchdog, Customer Health Check, Multi-Channel Router, Voice Agent...

Your workflows improve themselves

Set a goal. Walk away. Come back to a faster, cheaper, more accurate workflow.

Baseline score: 45 -> Best score: 79 (+75%)
🧰

Autonomous Optimization

Sandcastle mutates prompts, swaps models, removes unnecessary steps - and keeps only what improves your score. Inspired by Karpathy's autoresearch.

📈

Composite Scoring

Quality, cost, and speed measured together. Every mutation is evaluated against your eval suite. No subjective "looks good" - pure metrics.

The Great Simplification

The best optimization is often removal. Sandcastle discovers that simpler workflows outperform complex ones - just like the AI that taught itself to be a quant.


Enterprise-Ready Compliance

Built-in EU AI Act compliance, tamper-evident audit trail, and PII redaction - so you can deploy AI in regulated industries.

🛡️

EU AI Act Compliance

  • Risk classification: minimal / limited / high / unacceptable
  • Compliance mode enforcement for high-risk workflows
  • Transparency reports (Article 13)
  • Annex IV technical documentation generator
  • Emergency stop for high-risk workflows
🔒

Tamper-Evident Audit Trail

  • SHA-256 hash chain on all audit events
  • Full lifecycle tracking across 7 executor events
  • Chain integrity verification endpoint
  • Per-run and global audit trail APIs
  • Admin route actions fully logged
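
A SHA-256 hash chain makes tampering detectable because each entry's hash covers the previous entry's hash. The sketch below shows the general technique using only the standard library; the function names, event fields, and all-zero genesis value are assumptions, not Sandcastle's actual schema or verification endpoint.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry


def event_hash(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes the same way.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(chain: list, event: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev_hash": prev, "hash": event_hash(prev, event)})


def verify_chain(chain: list) -> bool:
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != event_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


chain: list = []
for name in ("run.started", "step.completed", "run.completed"):  # made-up event types
    append_event(chain, {"type": name, "run_id": "run-1"})

intact = verify_chain(chain)
chain[1]["event"]["type"] = "step.failed"    # simulate someone editing history
still_valid = verify_chain(chain)
print(intact, still_valid)
```

Editing any stored event changes its recomputed hash, so verification fails at that entry unless every later hash is rewritten as well.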
🔐

Privacy Router (PII Redaction)

  • 7 PII patterns: email, phone, SSN, credit card, IP, IBAN, DOB
  • Per-workflow or per-server configuration
  • Redact or audit-only modes
  • Zero-dependency regex engine
  • Applied to all LLM inputs and outputs
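
Regex-based redaction with an audit-only switch can be sketched as follows. The three patterns here are simplified illustrations written for this example, not the patterns Sandcastle ships, and `scan` is a hypothetical helper name.

```python
import re

# Simplified example patterns; real-world PII detection needs stricter rules.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def scan(text: str, redact: bool = True):
    """Return (text, findings); text is masked only when redact is True."""
    findings = []
    for label, pattern in PATTERNS.items():
        findings.extend((label, match) for match in pattern.findall(text))
        if redact:
            text = pattern.sub(f"[{label}]", text)
    return text, findings


message = "Contact jane@example.com from 10.0.0.7, SSN 123-45-6789."
redacted, found = scan(message)
audited, found_audit = scan(message, redact=False)   # audit-only mode
print(redacted)
print(found)
```

In audit-only mode the text passes through unchanged while the findings are still collected for logging.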

Ready to ship agent workflows?

Install Sandcastle in under 30 seconds. No infrastructure, no setup - just works.

$ pip install sandcastle-ai

Boring tech, reliable results

No exotic dependencies. Battle-tested tools you already know. Local mode needs zero infrastructure.

  • 🐍 Python 3.12 - API Server
  • ⚡ FastAPI - REST + SSE
  • 🗄️ SQLite / PostgreSQL - Local / Production DB
  • 🔴 In-process / Redis - Local / Production Queue
  • 📦 Filesystem / S3 - Local / Production Storage
  • ⚛️ React + TS - Dashboard
  • 🎨 Tailwind CSS - Styling
  • 🏖️ Sandshore - Agent Runtime

Free forever. Seriously.

Sandcastle is source-available under the Business Source License 1.1. It is free for non-production use and converts to Apache 2.0 in 2030. No feature gates, no "contact sales" buttons. Every feature you see on this page ships in the free version - because there is only one version.

☕

Buy me a coffee

I built Sandcastle in my free time and I plan to keep improving it - new features, better docs, bug fixes, community support. If it saves you time or you just think it's cool, a coffee goes a long way.

It keeps me caffeinated and motivated to ship the next update.

☕ Fuels new features 🐛 Faster bug fixes 📖 Better documentation