Sandbox Agent

Sandbox Agent is an autonomous coding agent that generates, executes, and verifies code inside isolated Docker containers. It is built with pure Python and Pydantic, avoiding the bloat and abstraction layers of complex agent frameworks.

Core Capability: Sandboxed Execution

The primary function of this agent is Safe Code Verification. Instead of running generated code directly on your host machine, the agent:

Generates Implementation: Writes Python scripts to solve a specific task.
Containerizes: Automatically writes a custom Dockerfile tailored to the script's dependencies.
Builds & Isolates: Builds a Docker image, ensuring the environment is clean and reproducible.
Verifies: Runs the container to capture stdout and stderr.
Self-Corrects: If the container execution fails (e.g., missing dependencies, syntax errors), the agent reads the logs, patches the code, and rebuilds the container until it passes.

Because generated code runs only inside a container and never directly on your host OS, side effects and potentially malicious logic stay confined to a throwaway environment.
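The build, run, and retry steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the names `RunResult`, `build_and_run`, `verify_with_retries`, the `patch` callback, and the `max_attempts` cap are all assumptions made for the example. Only the `docker build -t` and `docker run --rm` invocations are standard Docker CLI usage.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class RunResult:
    """Captured outcome of one command (hypothetical helper type)."""
    exit_code: int
    stdout: str
    stderr: str


def run_command(cmd: Sequence[str]) -> RunResult:
    """Run a command on the host and capture stdout and stderr as text."""
    proc = subprocess.run(list(cmd), capture_output=True, text=True)
    return RunResult(proc.returncode, proc.stdout, proc.stderr)


def build_and_run(
    context_dir: str,
    tag: str,
    runner: Callable[[Sequence[str]], RunResult] = run_command,
) -> RunResult:
    """Build an image from context_dir, then run it once in a throwaway container."""
    build = runner(["docker", "build", "-t", tag, context_dir])
    if build.exit_code != 0:
        return build  # a failed build is reported the same way as a failed run
    return runner(["docker", "run", "--rm", tag])


def verify_with_retries(
    context_dir: str,
    tag: str,
    patch: Callable[[RunResult], None],
    max_attempts: int = 3,  # assumed cap so the loop always terminates
    runner: Callable[[Sequence[str]], RunResult] = run_command,
) -> RunResult:
    """Rebuild and rerun until the container exits with 0 or attempts run out."""
    result = build_and_run(context_dir, tag, runner)
    for _ in range(max_attempts - 1):
        if result.exit_code == 0:
            break
        patch(result)  # e.g. ask the LLM to rewrite the script or Dockerfile
        result = build_and_run(context_dir, tag, runner)
    return result
```

Passing the `runner` as a parameter keeps the loop testable without a Docker daemon. In a real run, `patch` would be the place where the logs are handed back to the model.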

Architecture: Pure Python + Pydantic

This project deliberately avoids heavy agent frameworks (like LangChain, CrewAI, or AutoGen) in favor of a lightweight, transparent, and strictly typed foundation:

No Black Box Logic: Control flow is explicit in main.py and agent/core.py, not hidden behind framework magic.
Type Safety: Uses Pydantic for rigorous schema definition of tools (see agent/tools.py) and message history.
Direct API Control: Manages LLM context windows and tool calling parameters directly, ensuring predictable behavior with both Gemini and local Ollama models.
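To illustrate what explicit, typed control flow can look like without a framework, here is a small dispatch sketch. It is not taken from agent/core.py or agent/tools.py. The project defines its schemas with Pydantic; this example swaps in standard-library dataclasses so it runs without dependencies, and the names `ToolCall`, `TOOLS`, `dispatch`, and the `write_file` tool are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class ToolCall:
    """A parsed tool request (stand-in for a Pydantic model)."""
    name: str
    arguments: Dict[str, Any]


def write_file(path: str, content: str) -> str:
    """Hypothetical tool: only reports what it would do, writes nothing."""
    return f"would write {len(content)} chars to {path}"


# The registry is a plain dict, so every available tool is visible in one place.
TOOLS: Dict[str, Callable[..., str]] = {"write_file": write_file}


def dispatch(call: ToolCall) -> str:
    """Look up the tool by name and call it; return errors as text for the model."""
    tool = TOOLS.get(call.name)
    if tool is None:
        return f"error: unknown tool {call.name!r}"
    try:
        return tool(**call.arguments)
    except TypeError as exc:
        # Raised for missing or unexpected keyword arguments.
        return f"error: bad arguments for {call.name}: {exc}"
```

Returning errors as strings rather than raising lets the agent loop feed them back to the model as an ordinary tool result.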

Features

Autonomous Debug Loop: The agent detects failures in the container output (such as exit code 127, the shell's "command not found" status, or a Python traceback) and iterates on fixes without human intervention.
Dynamic Environments: Creates throwaway containers for every task, preventing dependency conflicts.
Flexible Intelligence: The reasoning backend is a configurable LLM, either a local model served by Ollama or cloud inference through Gemini.
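One way the debug loop could tell failures apart is to look at the exit code and the last line of stderr. The sketch below is an assumption about how such a check might be written, not the project's implementation; `classify_failure` and its return labels are made up for illustration.

```python
import re

# Matches a final stderr line such as "ModuleNotFoundError: No module named 'x'".
_EXC_LINE = re.compile(r"^([A-Za-z_][\w.]*(?:Error|Exception))\b")


def classify_failure(exit_code: int, stderr: str) -> str:
    """Return a coarse label for a container run (labels are illustrative)."""
    if exit_code == 0:
        return "ok"
    if exit_code == 127:
        # Shell convention: the command to execute was not found.
        return "command_not_found"
    lines = [line for line in stderr.splitlines() if line.strip()]
    if lines:
        match = _EXC_LINE.match(lines[-1])
        if match:
            return "python_exception:" + match.group(1)
    return "unknown_failure"
```

Checking the last line instead of searching for a traceback header also covers syntax errors in the entry script, which Python reports without a "Traceback" line.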

Setup

uv sync
ollama pull qwen2.5-coder:7b

Usage

$env:LLM_PROVIDER="ollama"
uv run main.py

The first line uses PowerShell syntax. In bash or zsh, run export LLM_PROVIDER=ollama instead.

Configuration

Variable	Default	Description
`LLM_PROVIDER`	`gemini`	Backend for reasoning (`ollama` or `gemini`).
`OLLAMA_MODEL`	`qwen2.5-coder:7b`	Local model tag.

Note on Tool Parsing

Tool calls are extracted from the model's text output with explicit regex-based JSON parsing. This supports models like qwen2.5-coder, whose native tool-call output can be inconsistent. More capable models should support native tool calling without this workaround.
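The idea can be sketched as follows: use a regex to find candidate object starts in the model's reply, then let the standard `json` decoder validate each candidate. This is an illustrative approach, not the project's parser, and the `name` and `arguments` keys are an assumed tool-call shape.

```python
import json
import re
from typing import Any, Dict, Optional

_OBJECT_START = re.compile(r"\{")
_DECODER = json.JSONDecoder()


def extract_tool_call(text: str) -> Optional[Dict[str, Any]]:
    """Return the first JSON object in text that looks like a tool call, else None."""
    for match in _OBJECT_START.finditer(text):
        try:
            # raw_decode parses one JSON value starting at the given index
            # and ignores whatever text follows it.
            obj, _end = _DECODER.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue  # this brace did not start valid JSON; try the next one
        if (
            isinstance(obj, dict)
            and isinstance(obj.get("name"), str)
            and isinstance(obj.get("arguments"), dict)
        ):
            return obj
    return None
```

Letting the decoder handle nesting avoids trying to match balanced braces with a regex alone, which breaks on nested arguments or braces inside strings.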
