Sandcastle is a production-ready workflow orchestrator for AI agents. It lets you define agent pipelines in YAML, run them with pluggable sandbox backends, and monitor everything through a real-time dashboard. It includes 63 integrations, EU AI Act compliance features, and costs $0 to start.

Is Sandcastle EU AI Act compliant?

Yes. Sandcastle includes built-in EU AI Act compliance features: risk classification, a tamper-evident audit trail (SHA-256 hash chain), transparency reports, Annex IV documentation, emergency stop, PII redaction, and EU data residency enforcement. When residency enforcement is enabled, all AI processing stays in the EU.

How do I install Sandcastle?

Install with pip: pip install sandcastle-ai. Then run sandcastle init to configure, and sandcastle serve to start the dashboard. Zero external dependencies required for local mode.

What LLM providers does Sandcastle support?

Sandcastle supports multi-provider model routing across Anthropic Claude, OpenAI GPT, Google Gemini, MiniMax, Mistral, OpenRouter, local models via Ollama, and on-device inference via oMLX on Apple Silicon. You can mix and match models per workflow step.

Does Sandcastle have a free tier?

Sandcastle is open source under a BSL 1.1 license. The core product, including all 63 integrations, the dashboard, the CLI, and the EU AI Act compliance features, is free to use. Run pip install sandcastle-ai to get started.

v0.30 - Agents Unleashed

Describe what you want. Go home.
Sandcastle ships it.

Name: Sandcastle
Rating: 4.9 (47 reviews)
Author: Tomas Pflanzer

Which provider. Which model. Which agent. What happens when it fails. What it costs. Where the data lives. Who approves what. You used to build all of that yourself. Now you describe what you want and go home. European-built, open source.



Install

One command. No database, no Redis, no Docker. Just Python.

Describe

Tell AI what you need. Or pick from 236 ready-made templates.

Ship

It runs. It fails over. It tracks costs. You go home.

or from source

$ git clone https://github.com/gizmax/Sandcastle.git && cd Sandcastle && uv sync && uv run sandcastle serve

The Problem

You became a developer to build things.

Not to manage API keys. Not to debug rate limits at 3 AM. Not to explain to your EU client why their data just passed through a Virginia data center. Somewhere between the framework docs and the third provider outage this month, you forgot what you were actually trying to build. Sandcastle remembers.

✕ Without Sandcastle

✕ Three months building glue code before the first real feature
✕ Locked into one provider. Pricing changes? Start over.
✕ Invoice arrives. Surprise. No one tracked the costs.
✕ Provider goes down at 3 AM. Your phone rings.
✕ EU client asks where their data lives. You check Slack.
✕ Compliance audit next month. Nobody documented anything.

✓ With Sandcastle

✓ Describe it, ship it. Tell AI what you need - in production today, not next quarter.
✓ 7 providers, one setting. Switch from Claude to Mistral in one line. Run locally with oMLX.
✓ You know what it costs. Per provider. Per workflow. Per step. Right now.
✓ Provider down? You sleep. Auto-failover switches to the next one.
✓ EU data stays in EU. One toggle. Hard enforcement. Not a policy doc.
✓ Agents as building blocks. 15 templates. Or describe what you need. AI designs the agent.
✓ Audit trail from day one. Tamper-evident. EU AI Act ready. Always on.

🔗

Pipelines, not scripts

"Scrape 50 pages, enrich each one, score and rank - in one YAML file."

🔀

Any provider, one setting

"Claude today, Mistral tomorrow, Ollama on your laptop. Change one line."

💰

Know what it costs

"$120 on Claude last month. Same work on Mistral? $45. Here's a button."

😴

Sleep through outages

"Provider hit rate limit at 3 AM. Sandcastle switched to the next one. You slept."

🇪🇺

EU data stays in EU

"One toggle. Not a policy document - hard enforcement at the routing layer."

🎯

Right model, right job

"Critical analysis gets Opus. Simple formatting gets Haiku. Automatic."

📄

Parse anything

"PDF, DOCX, XLSX. 4 OCR engines including GLM-OCR at 94.6% accuracy. Runs locally, free."

🛡️

Compliance from day one

"EU AI Act, audit trail, PII redaction, kill switch. Not an add-on - built in."

🤖

Agents as steps

"Claude researches in the cloud. Mistral formats for 1/10th cost. oMLX runs locally. All in one workflow."

📊

Documents in, reports out

"Scan a contract with 94.6% OCR. Analyze it with AI. Get a PDF report with charts. One workflow."

Capabilities

Everything you need for production agents

From sandbox execution to production-grade orchestration - everything you need in one package.

Core Engine | Intelligence | Safety | Integrations | Developer Experience

🔀

DAG Workflow Engine

Define multi-step pipelines in YAML. Dependencies, parallel branches, data passing between steps.

⚡

Parallel Execution

Steps at the same DAG layer run concurrently. Fan out over lists with configurable concurrency.
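To make the idea of a "DAG layer" concrete, here is a minimal sketch of how steps could be grouped into layers from their depends_on lists. It is an illustration only, not Sandcastle's actual scheduler: the function name dag_layers and the example step ids (score in particular) are hypothetical.

```python
def dag_layers(depends_on: dict[str, list[str]]) -> list[list[str]]:
    """Group step ids into layers; every step's dependencies sit in earlier layers."""
    remaining = {step: set(deps) for step, deps in depends_on.items()}
    layers: list[list[str]] = []
    while remaining:
        # Steps with no unmet dependencies could run concurrently.
        ready = sorted(step for step, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("cycle or unknown dependency in workflow")
        layers.append(ready)
        for step in ready:
            del remaining[step]
        for deps in remaining.values():
            deps.difference_update(ready)
    return layers


# Hypothetical workflow: two steps fan out from "fetch" and join at "route".
steps = {
    "fetch": [],
    "enrich": ["fetch"],
    "score": ["fetch"],
    "route": ["enrich", "score"],
}
print(dag_layers(steps))
```

In this sketch, enrich and score land in the same layer because both depend only on fetch, so an executor could run them side by side.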

🏖️

Sandshore Runtime

Purpose-built agent runtime with circuit breaker, pool management, health caching, and production-grade optimization.

📦

Pluggable Sandboxes

Four backends: E2B cloud microVMs, Docker containers, Cloudflare Workers edge, or local subprocess. Switch with one env var.

🧩

21 Step Types

llm, http, code, condition, classify, loop, race, sensor, gate, transform, notify, delegate, parse, openclaw, and more. Mix AI with $0 deterministic steps.

🌐

Universal Advisor

Any LLM powers all AI features - Claude, Mistral, OpenAI, Ollama, Google, MiniMax. One config, short aliases like sonnet, opus, openai/gpt-4o, ollama/llama3.

🔄

Smart Auto-Failover

Automatic fallback on 429 or 5xx. Per-key cooldown tracking, ordered failover chains. SLO-aware routing picks the best model for critical ops, cheapest for simple ones.
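The failover behavior described here (fall back on 429 or 5xx, track cooldowns, walk an ordered chain) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Sandcastle's implementation: ProviderError, FailoverChain, and the 60-second cooldown are hypothetical, and cooldowns are tracked per provider name rather than per API key.

```python
import time


class ProviderError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status: int):
        super().__init__(f"provider returned HTTP {status}")
        self.status = status


def is_retryable(status: int) -> bool:
    # Fail over on rate limits (429) and server errors (5xx) only.
    return status == 429 or 500 <= status <= 599


class FailoverChain:
    def __init__(self, providers, cooldown_seconds=60.0, clock=time.monotonic):
        self.providers = list(providers)   # ordered (name, callable) pairs
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.cooling_until = {}            # name -> time it may be tried again

    def call(self, prompt):
        last_error = None
        for name, send in self.providers:
            if self.cooling_until.get(name, float("-inf")) > self.clock():
                continue                   # still cooling down, skip it
            try:
                return name, send(prompt)
            except ProviderError as err:
                if not is_retryable(err.status):
                    raise                  # e.g. 400 or 401: failing over will not help
                self.cooling_until[name] = self.clock() + self.cooldown_seconds
                last_error = err
        raise RuntimeError("all providers unavailable") from last_error


def rate_limited(prompt):
    raise ProviderError(429)


def healthy(prompt):
    return f"echo: {prompt}"


chain = FailoverChain([("primary", rate_limited), ("secondary", healthy)])
print(chain.call("hello"))
```

After the first call, primary is on cooldown, so later calls within the cooldown window go straight to secondary.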

✨

AI Workflow Generator

Describe what you need in plain English. sandcastle generate creates a complete workflow YAML with steps, dependencies, and model selection.

🧠

Agent Memory - No OpenAI Required

Persistent context across runs powered by Anthropic Claude + local fastembed/ONNX embeddings. Semantic search auto-injects relevant memories into prompts. Zero OpenAI dependency.

🎯

Cost-Latency Optimizer

SLO-based model routing with feedback loop, model degradation alerts, and automatic severity-based recommendations.

🧬

AutoPilot v2

Thompson Sampling variant selection, Welch's t-test significance testing, and progressive rollout (canary 10% -> partial 50% -> full 100%).
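For readers unfamiliar with Thompson Sampling, the sketch below shows the general technique for picking between workflow variants with a Beta-Bernoulli model. It is not AutoPilot's code: the class name, the variant names, and the simulated success rates are all hypothetical.

```python
import random


class ThompsonSampler:
    def __init__(self, variants, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior per variant, stored as [successes + 1, failures + 1].
        self.params = {v: [1, 1] for v in variants}

    def choose(self):
        # Draw one plausible success rate per variant and pick the highest draw.
        draws = {v: self.rng.betavariate(a, b) for v, (a, b) in self.params.items()}
        return max(draws, key=draws.get)

    def record(self, variant, success):
        self.params[variant][0 if success else 1] += 1


# Simulated ground truth, for illustration only.
true_rates = {"prompt_a": 0.9, "prompt_b": 0.1}
sampler = ThompsonSampler(true_rates, seed=0)
outcome_rng = random.Random(1)
pulls = {v: 0 for v in true_rates}
for _ in range(500):
    variant = sampler.choose()
    pulls[variant] += 1
    sampler.record(variant, outcome_rng.random() < true_rates[variant])
print(pulls)
```

As evidence accumulates, the sampler sends most traffic to the stronger variant while still occasionally probing the weaker one.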

💡

Cost Intelligence

Per-provider cost breakdown with proactive savings recommendations. Shows "Would cost $45 via Mistral" comparisons so you always pick the right model for the budget.

🛡️

Policy Engine

Declarative rules for PII redaction, secret blocking, cost guards. Applied per step or globally.

💰

Budget Guardrails

Cost tracking per step and per run. Set hard budgets and get alerts before they blow up your invoice.
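A hard budget with an early alert can be sketched in a few lines. This is an illustrative model only: RunBudget, BudgetExceeded, and the 80% alert threshold are hypothetical names and values, not Sandcastle's API.

```python
class BudgetExceeded(Exception):
    pass


class RunBudget:
    def __init__(self, hard_limit_usd: float, alert_ratio: float = 0.8):
        self.hard_limit_usd = hard_limit_usd
        self.alert_ratio = alert_ratio
        self.step_costs: dict[str, float] = {}   # cost tracked per step
        self.alerts: list[str] = []

    @property
    def total(self) -> float:
        return sum(self.step_costs.values())      # cost of the whole run

    def charge(self, step_id: str, cost_usd: float) -> None:
        # Refuse the charge before it happens if it would break the hard limit.
        if self.total + cost_usd > self.hard_limit_usd:
            raise BudgetExceeded(f"step {step_id!r} would exceed ${self.hard_limit_usd:.2f}")
        self.step_costs[step_id] = self.step_costs.get(step_id, 0.0) + cost_usd
        if self.total >= self.alert_ratio * self.hard_limit_usd:
            self.alerts.append(f"{self.total / self.hard_limit_usd:.0%} of budget used")


budget = RunBudget(hard_limit_usd=1.00)
budget.charge("fetch", 0.0)      # an http step costs nothing
budget.charge("enrich", 0.85)    # crosses the alert threshold
try:
    budget.charge("route", 0.30)
except BudgetExceeded as err:
    print("blocked:", err)
print(budget.alerts)
```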

👤

HITL Approvals

Pause workflows for human review. Approve, reject, or edit the data before the next step runs. Multi-strategy gates.

⚡

Circuit Breaker

Automatic failure detection with CLOSED/OPEN/HALF_OPEN states. Prevents cascading failures across your pipeline.
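The three states named above follow the standard circuit breaker pattern. Here is a minimal sketch of that pattern; the thresholds, method names, and injectable clock are assumptions made for illustration and do not describe Sandshore's internals.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = 0.0

    def allow(self) -> bool:
        # After the timeout, an OPEN breaker moves to HALF_OPEN and lets calls probe.
        if self.state == "OPEN" and self.clock() - self.opened_at >= self.reset_timeout:
            self.state = "HALF_OPEN"
        return self.state != "OPEN"

    def record_success(self) -> None:
        self.state = "CLOSED"
        self.failures = 0

    def record_failure(self) -> None:
        self.failures += 1
        # A failed probe, or too many failures while CLOSED, opens the breaker.
        if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
            self.state = "OPEN"
            self.opened_at = self.clock()


now = [0.0]  # fake clock so the example runs instantly
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])
breaker.record_failure()
breaker.record_failure()
print(breaker.state, breaker.allow())   # OPEN False
now[0] = 10.0
print(breaker.allow(), breaker.state)   # True HALF_OPEN
breaker.record_success()
print(breaker.state)                    # CLOSED
```

While the breaker is OPEN, callers skip the failing dependency entirely, which is what stops one broken step from dragging down the rest of a pipeline.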

🚦

Progressive Rollout

Deploy experiment winners safely: canary (10%) to partial (50%) to full (100%). Statistical significance required before each stage advance.
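As a rough illustration of gating each stage on a significance check, the sketch below computes Welch's t statistic (with the Welch-Satterthwaite degrees of freedom) and advances one stage at a time. The stage percentages come from the text above; the function names, the sample scores, and the fixed cutoff of 2.0 are hypothetical simplifications. A real test would compare t against the t-distribution for the computed degrees of freedom.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, degrees_of_freedom) for Welch's unequal-variance t-test."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


STAGES = [10, 50, 100]  # canary -> partial -> full, in percent of traffic


def next_stage(current, t, t_critical=2.0):
    """Advance one stage only when the candidate clears the (assumed) cutoff."""
    if t < t_critical or current == STAGES[-1]:
        return current
    return STAGES[STAGES.index(current) + 1]


# Made-up quality scores for illustration.
candidate = [0.82, 0.85, 0.88, 0.84, 0.86]
baseline = [0.70, 0.74, 0.69, 0.72, 0.71]
t, df = welch_t(candidate, baseline)
print(round(t, 2), round(df, 2), next_stage(10, t))
```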

🇪🇺

EU Data Residency

Toggle one setting - all AI processing stays in EU or local. Built-in compliance with data residency requirements, not just a promise. Works with any provider that offers EU endpoints.

🔧

63 Tool Connectors

Slack, GitHub, OpenAI, Anthropic, AWS S3, Google Sheets, Airtable, Discord, Supabase, Pinecone, Stripe, Shopify, PagerDuty, Datadog, Langfuse, Qdrant, GCS, Azure Blob, Exa, and 44 more. Add tools: [slack] to any step.

🤝

A2A Protocol

Google's Agent-to-Agent protocol. Agent card discovery at /.well-known/agent.json, JSON-RPC 2.0 task management.

📺

AG-UI Streaming

CopilotKit's Agent-User Interaction protocol. Real-time SSE streaming of agent state, tool calls, and text deltas to any frontend.

🔌

MCP Server

Built-in Model Context Protocol server. Run workflows, check status, and manage schedules from Claude Desktop, Cursor, or Windsurf.

📡

Webhooks + SSE

Real-time event streaming via SSE. Webhook dispatcher for external integrations. Live updates, no polling.

🔗

Named Connections

Multiple credential instances per tool. slack:engineering, postgresql:analytics - named connections resolve to the right credentials automatically.

🌐

Browser RPA

Five modes: Playwright (selector-based), Computer Use (vision AI), DOM Extract (accessibility tree), LightPanda (10x faster headless via CDP), Browserbase (cloud-hosted, zero cold-start). Action caching, CAPTCHA escalation, execution replay.

🧠

AI Connectors

OpenAI, Anthropic, ElevenLabs, Tavily, Firecrawl, Pinecone. Chain multiple AI providers, vector search, web scraping, and text-to-speech in your workflows.

🚀

DevOps Connectors

Vercel, Cloudflare Workers, Datadog, PagerDuty, AWS S3, Redis. Deploy, monitor, alert, and manage infrastructure from workflow steps.

⌨️

Full CLI Suite

templates, runs, replay, fork, approve, reject, generate, doctor. Global --json flag for scripting.

📊

Real-time Dashboard

Runs, costs, schedules, dead letters, approvals, experiments, policy violations - all in one place. Visual workflow builder included.

📋

236 Templates

118 built-in + 118 community templates. Marketing, sales, engineering, support, HR, legal, and general AI. Full Community Hub with one-click install and uninstall.

🧪

Evaluation Framework

A/B test models and prompts per step. Automatic quality evaluation and best-variant deployment with AutoPilot.

🚀

REST API + Docs

Full OpenAPI spec. Interactive docs at /api/docs. Zero-config local mode - sandcastle init + sandcastle serve.

🏪

Community Hub

Browse, install, and share workflow templates. One-click install from the dashboard or sandcastle hub install author/name from CLI.

📡

OpenTelemetry

Optional OTLP instrumentation with workflow and step-level spans. Includes cost, duration, and token counts as span attributes. Install with pip install sandcastle-ai[otel].

🤖

AI Providers - mix per step

Claude

Opus, Sonnet, Haiku

OpenAI

Codex, Codex Mini

MiniMax

M2.5

Gemini

via OpenRouter

Ollama

Local models

Mistral

Large, Small, Codestral

oMLX

Apple Silicon local

📦

Sandbox Backends - switch via env var

E2B

Cloud sandboxes (default)

Docker

Local containers

Local

Subprocess (dev only)

Cloudflare

Edge Workers

Define

Workflows as YAML

No SDKs, no boilerplate. Declare your pipeline, Sandcastle handles the rest.

lead-enrichment.yaml hybrid

name: "Lead Enrichment"
default_model: sonnet

steps:
  - id: "fetch"
    type: http              # $0 - no LLM needed
    http_config:
      url: "https://api.example.com/company/{input.id}"
      method: GET

  - id: "enrich"
    type: llm               # single API call, no sandbox
    depends_on: ["fetch"]
    prompt: "Research: {steps.fetch.output}"

  - id: "route"
    type: classify          # LLM-based routing
    depends_on: ["enrich"]
    classify_config:
      categories: [hot, warm, cold]
      input: "{steps.enrich.output}"
      branches:
        hot: [priority-outreach]
        warm: [nurture-sequence]
        cold: [archive]

Mix agent + lightweight steps. One file.

Not every step needs a full AI agent. Use http for API calls ($0), code for Python snippets, condition and classify for branching, loop for iteration, race for parallel competition, sensor for polling, transform for templates, notify for alerts - and llm or full agent steps where you actually need AI.

✓ 21 step types - standard, llm, http, code, condition, classify, loop, race, sensor, gate, transform, notify, delegate, approval, sub_workflow, parse, openclaw, map, retry, switch, agent
✓ Smart branching - condition, classify, and race route to different paths
✓ $0 steps - http, code, condition, transform, notify cost nothing - no LLM call
✓ Data passing - reference prior outputs with {steps.id.output}
✓ Per-step models - Claude, OpenAI, MiniMax, Gemini - mix per step
✓ Automatic retries - exponential backoff on failure
✓ Persistent storage - results saved to disk (local) or S3 (production)
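The data-passing references used in the YAML example, {input.id} and {steps.fetch.output}, can be pictured as simple template substitution. The following is a sketch of that idea only; the render function and its regex are hypothetical, and the real engine may support richer paths or escaping.

```python
import re

# Matches {input.<key>} and {steps.<id>.output}, the two forms shown in the example.
PLACEHOLDER = re.compile(r"\{(input|steps)\.([A-Za-z0-9_.-]+)\}")


def render(template: str, run_input: dict, step_outputs: dict) -> str:
    """Replace placeholders with values from the run input or prior step outputs."""

    def substitute(match: re.Match) -> str:
        scope, path = match.group(1), match.group(2)
        if scope == "input":
            return str(run_input[path])
        step_id, _, field = path.rpartition(".")
        if field != "output":
            raise KeyError(f"unsupported reference: {match.group(0)}")
        return str(step_outputs[step_id])

    return PLACEHOLDER.sub(substitute, template)


print(render("https://api.example.com/company/{input.id}", {"id": 42}, {}))
print(render("Research: {steps.fetch.output}", {}, {"fetch": "Acme Corp"}))
```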

Monitor

See everything. Control everything.

Runs, costs, schedules, dead letters, approvals, experiments, policy violations - all in one place. Includes a visual workflow builder for drag-and-drop pipeline design.


Integrate

Talk to Sandcastle from your AI editor

Built-in MCP (Model Context Protocol) server. Claude Desktop, Cursor, and Windsurf can run workflows, check status, and manage schedules - all from the chat interface.

claude_desktop_config.json

{
  "mcpServers": {
    "sandcastle": {
      "command": "sandcastle",
      "args": ["mcp"]
    }
  }
}

8 tools. 3 resources. Zero config.

Add one JSON block to your client config. The MCP server connects to a running sandcastle serve instance and exposes the full workflow API.

✓ run_workflow - run a saved workflow by name with optional input
✓ run_workflow_yaml - run a workflow from inline YAML definition
✓ get_run_status - detailed status with all step results
✓ list_runs - browse runs with status and workflow filters
✓ cancel_run - stop a queued or running workflow
✓ save_workflow - save YAML workflow definitions to the server
✓ create_schedule / delete_schedule - manage cron schedules
✓ Resources - read-only access to workflows, schedules, and health


Templates

236 workflow templates, ready to deploy

Skip the blank page. Pick a template, tweak the prompts, and ship. Browse the Community Hub or install via CLI.

Marketing (20): Blog to Social, SEO Content, Ad Copy, Email Campaign, Content Calendar, Competitor Analysis...

Engineering (17): API Docs, Data Extractor, Jira Triage, Release Notes, Slack Standup, Incident Responder...

General AI (17): Research Agent, PDF Summary, Chain of Thought, Invoice Processor, Course Creator...

Sales & CRM (15): Lead Enrichment, Lead Scoring, CRM Sync, Meeting Recap, Churn Predictor, Pipeline Autopilot...

HR & Legal (11): Resume Screener, Contract Review, Compliance Checker, Onboarding, Job Description, Recruiting Pipeline...

Support (8): Ticket Triage, FAQ Generator, SLA Watchdog, Customer Health Check, Multi-Channel Router, Voice Agent...


Workflow Evolution

Your workflows improve themselves

Set a goal. Walk away. Come back to a faster, cheaper, more accurate workflow.

Baseline → evolving... → Best (+75%)

🧰

Autonomous Optimization

Sandcastle mutates prompts, swaps models, removes unnecessary steps - and keeps only what improves your score. Inspired by Karpathy's autoresearch.

📈

Composite Scoring

Quality, cost, and speed measured together. Every mutation is evaluated against your eval suite. No subjective "looks good" - pure metrics.

✨

The Great Simplification

The best optimization is often removal. Sandcastle discovers that simpler workflows outperform complex ones - just like the AI that taught itself to be a quant.


Enterprise Trust

Enterprise-Ready Compliance

Built-in EU AI Act compliance, tamper-evident audit trail, and PII redaction - so you can deploy AI in regulated industries.

🛡️

EU AI Act Compliance

✓ Risk classification: minimal / limited / high / unacceptable
✓ Compliance mode enforcement for high-risk workflows
✓ Transparency reports (Article 13)
✓ Annex IV technical documentation generator
✓ Emergency stop for high-risk workflows

🔒

Tamper-Evident Audit Trail

✓ SHA-256 hash chain on all audit events
✓ Full lifecycle tracking across 7 executor events
✓ Chain integrity verification endpoint
✓ Per-run and global audit trail APIs
✓ Admin route actions fully logged
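A SHA-256 hash chain makes an audit trail tamper-evident because each entry's hash covers the previous entry's hash. The sketch below shows the general technique; the entry layout, the all-zero genesis value, and the event names are assumptions for illustration, not Sandcastle's storage format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry


def event_hash(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(chain: list[dict], event: dict) -> None:
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev_hash": prev_hash, "hash": event_hash(prev_hash, event)})


def verify_chain(chain: list[dict]) -> bool:
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash or entry["hash"] != event_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


trail: list[dict] = []
append_event(trail, {"type": "run.started", "run_id": "r1"})
append_event(trail, {"type": "step.completed", "run_id": "r1", "step": "fetch"})
print(verify_chain(trail))           # True
trail[0]["event"]["run_id"] = "r2"   # tamper with an early event
print(verify_chain(trail))           # False
```

Editing any past event changes its hash, which no longer matches what the next entry recorded, so verification fails from that point on.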

🔏

Privacy Router (PII Redaction)

✓ 7 PII patterns: email, phone, SSN, credit card, IP, IBAN, DOB
✓ Per-workflow or per-server configuration
✓ Redact or audit-only modes
✓ Zero-dependency regex engine
✓ Applied to all LLM inputs and outputs
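Regex-based redaction with a redact mode and an audit-only mode might look something like the sketch below. It covers only three of the seven pattern types, and the expressions, labels, and scan function are simplified assumptions rather than the patterns Sandcastle ships.

```python
import re

# Simplified illustrative patterns; production-grade patterns need more care.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def scan(text: str, redact: bool = True) -> tuple[str, list[str]]:
    """Return (text, findings); the text is rewritten only in redact mode."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            findings.append(label)
            if redact:
                text = pattern.sub(f"[{label}]", text)
    return text, findings


message = "Contact jane@example.com from 10.0.0.1, SSN 123-45-6789"
print(scan(message))                 # redact mode
print(scan(message, redact=False))   # audit-only mode: report, do not change
```

Running the same scan on both the prompt and the model's response is one way to cover LLM inputs and outputs alike.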

Ready to ship agent workflows?

Install Sandcastle in under 30 seconds. No infrastructure, no setup - just works.

$ pip install sandcastle-ai


Built With

Boring tech, reliable results

No exotic dependencies. Battle-tested tools you already know. Local mode needs zero infrastructure.

🐍

Python 3.12

API Server

⚡

FastAPI

REST + SSE

🗄️

SQLite / PostgreSQL

Local / Production DB

🔴

In-process / Redis

Local / Production Queue

📦

Filesystem / S3

Local / Production Storage

⚛️

React + TS

Dashboard

🎨

Tailwind CSS

Styling

🏖️

Sandshore

Agent Runtime

Open Source

Free forever. Seriously.

Sandcastle is fully open source under the Business Source License 1.1. Free for non-production use, converts to Apache 2.0 in 2030. No feature gates, no "contact sales" buttons. Every feature you see on this page ships in the free version - because there is only one version.

☕

Buy me a coffee

I built Sandcastle in my free time and I plan to keep improving it - new features, better docs, bug fixes, community support. If it saves you time or you just think it's cool, a coffee goes a long way.

It keeps me caffeinated and motivated to ship the next update.


☕ Fuels new features 🐛 Faster bug fixes 📖 Better documentation

Check out my other projects

Apps I built and shipped. Give them a try.
