TejasS1233/sandboxing_agent

Sandbox Agent

An autonomous coding agent designed to safely generate, execute, and verify code within isolated environments. It is built with pure Python and Pydantic, avoiding the bloat and abstraction layers of complex agent frameworks.

Core Capability: Sandboxed Execution

The primary function of this agent is Safe Code Verification. Instead of running generated code directly on your host machine, the agent:

  1. Generates Implementation: Writes Python scripts to solve a specific task.
  2. Containerizes: Automatically writes a custom Dockerfile tailored to the script's dependencies.
  3. Builds & Isolates: Builds a Docker image, ensuring the environment is clean and reproducible.
  4. Verifies: Runs the container to capture stdout and stderr.
  5. Self-Corrects: If the container execution fails (e.g., missing dependencies, syntax errors), the agent reads the logs, patches the code, and rebuilds the container until it passes.

This keeps generated code from executing directly on your local OS: it runs only inside the container, which contains potential side effects or malicious logic.
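The containerize, build, and run steps can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function names, the `python:3.12-slim` base image, the timeout, and the `--network none` flag are all assumptions chosen for the example.

```python
import subprocess
from pathlib import Path


def render_dockerfile(script_name: str, packages: list[str]) -> str:
    """Return Dockerfile text for one generated script (hypothetical layout)."""
    lines = ["FROM python:3.12-slim", "WORKDIR /app"]  # base image is an assumption
    if packages:
        # Install only what the script needs so the image stays small.
        lines.append("RUN pip install --no-cache-dir " + " ".join(packages))
    lines += [f"COPY {script_name} .", f'CMD ["python", "{script_name}"]']
    return "\n".join(lines) + "\n"


def build_and_run(workdir: Path, tag: str, timeout: int = 120) -> tuple[int, str, str]:
    """Build the image in workdir, run it once, return (exit_code, stdout, stderr)."""
    build = subprocess.run(
        ["docker", "build", "-t", tag, str(workdir)],
        capture_output=True, text=True, timeout=timeout,
    )
    if build.returncode != 0:
        # A failed build is reported the same way as a failed run.
        return build.returncode, build.stdout, build.stderr
    run = subprocess.run(
        # --rm discards the container afterwards; --network none is an
        # assumed extra restriction, not something the README specifies.
        ["docker", "run", "--rm", "--network", "none", tag],
        capture_output=True, text=True, timeout=timeout,
    )
    return run.returncode, run.stdout, run.stderr
```

Calling `render_dockerfile("solution.py", ["requests"])` produces the Dockerfile text, which would be written next to the script before `build_and_run` is invoked on that directory.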

Architecture: Pure Python + Pydantic

This project deliberately avoids heavy agent frameworks (like LangChain, CrewAI, or AutoGen) in favor of a lightweight, transparent, and strictly typed foundation:

  • No Black Box Logic: Control flow is explicit in main.py and agent/core.py, not hidden behind framework magic.
  • Type Safety: Uses Pydantic for rigorous schema definition of tools (see agent/tools.py) and message history.
  • Direct API Control: Manages LLM context windows and tool calling parameters directly, ensuring predictable behavior with both Gemini and local Ollama models.
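To illustrate the kind of typed schema this refers to, here is a small sketch of a tool argument model and a message model in Pydantic. The `write_file` tool, its fields, and the `parse_write_file` helper are hypothetical; the actual definitions live in agent/tools.py and may look different.

```python
from typing import Literal, Optional

from pydantic import BaseModel, Field, ValidationError


class WriteFileArgs(BaseModel):
    """Arguments for a hypothetical write_file tool."""
    path: str = Field(..., min_length=1, description="Path relative to the task workspace")
    content: str = Field(..., description="Full text to write")


class Message(BaseModel):
    """One entry in the conversation history sent to the LLM."""
    role: Literal["system", "user", "assistant", "tool"]
    content: str


def parse_write_file(raw: dict) -> Optional[WriteFileArgs]:
    """Validate raw tool arguments; return None instead of raising on bad input."""
    try:
        return WriteFileArgs(**raw)
    except ValidationError:
        return None
```

Because validation happens at the boundary, malformed arguments from the model are rejected before any tool runs, and the rest of the control flow can rely on typed attributes such as `args.path`.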

Features

  • Autonomous Debug Loop: The agent detects crash logs (like Exit Code 127 or Python tracebacks) and iterates on fixes without human intervention.
  • Dynamic Environments: Creates throwaway containers for every task, preventing dependency conflicts.
  • Flexible Intelligence: The LLM backend that drives the logic is configurable for local inference (Ollama) or cloud inference (Gemini).
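The autonomous debug loop can be pictured with the sketch below. It is an assumption-laden outline rather than the logic in agent/core.py: `RunResult`, `classify_failure`, `debug_loop`, and the `max_attempts` cap are hypothetical names, and the LLM call and the sandbox runner are passed in as plain callables so the loop itself stays easy to test.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunResult:
    exit_code: int
    stdout: str
    stderr: str


def classify_failure(result: RunResult) -> str:
    """Map a failed run to a coarse hint for the next prompt."""
    if result.exit_code == 127:
        # 127 is the conventional "command not found" exit status.
        return "command not found: a binary or dependency is missing from the image"
    if "Traceback (most recent call last)" in result.stderr:
        return "python exception: see the traceback"
    return f"non-zero exit code {result.exit_code}"


def debug_loop(
    generate: Callable[[str], str],
    run_in_sandbox: Callable[[str], RunResult],
    task: str,
    max_attempts: int = 3,
) -> tuple[bool, str, int]:
    """Generate code, run it, and feed failures back until success or the cap.

    Returns (succeeded, last_code, attempts_used).
    """
    prompt = task
    code = ""
    for attempt in range(1, max_attempts + 1):
        code = generate(prompt)
        result = run_in_sandbox(code)
        if result.exit_code == 0:
            return True, code, attempt
        # Include the diagnosis and raw stderr so the model can patch the code.
        hint = classify_failure(result)
        prompt = f"{task}\n\nPrevious attempt failed ({hint}).\nstderr:\n{result.stderr}"
    return False, code, max_attempts
```

The attempt cap is a design choice added here so that a task the model cannot solve does not rebuild containers indefinitely.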

Setup

uv sync
ollama pull qwen2.5-coder:7b

Usage

# PowerShell (in bash/zsh, use: export LLM_PROVIDER=ollama)
$env:LLM_PROVIDER="ollama"
uv run main.py

Configuration

Variable      Default           Description
LLM_PROVIDER  gemini            Backend for reasoning (ollama or gemini).
OLLAMA_MODEL  qwen2.5-coder:7b  Local model tag.
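As a rough sketch of how these two variables and their defaults might be read at startup, consider the snippet below. The `Settings` class and `load_settings` function are hypothetical names for illustration, and rejecting unknown providers is an assumption about desired behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    llm_provider: str
    ollama_model: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read configuration from the environment, applying the documented defaults."""
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "gemini").strip().lower()
    if provider not in ("ollama", "gemini"):
        raise ValueError(f"LLM_PROVIDER must be 'ollama' or 'gemini', got {provider!r}")
    return Settings(provider, env.get("OLLAMA_MODEL", "qwen2.5-coder:7b"))
```

Passing a mapping instead of relying on `os.environ` makes the function easy to exercise in tests, for example `load_settings({"LLM_PROVIDER": "ollama"})`.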

Note on Tool Parsing

We extract tool calls from the model's text output by matching JSON with explicit regular expressions. This supports models like qwen2.5-coder, which do not emit native tool calls consistently. More capable models should support direct tool invocation without this workaround.
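One way such parsing could work is sketched below. It is not the parser used in this repository: the `extract_tool_call` name and the assumed payload shape (a JSON object with a `name` key) are illustrative only. The sketch first looks for a fenced JSON block, then falls back to the widest brace-delimited span in the text.

```python
import json
import re
from typing import Optional

FENCE = "`" * 3  # markdown code fence, built this way to keep the example readable

# Matches a fenced block whose body is a JSON object; the "json" tag is optional.
_FENCED = re.compile(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL)


def extract_tool_call(text: str) -> Optional[dict]:
    """Return the first JSON object in text that has a "name" key, else None."""
    candidates = _FENCED.findall(text)
    # Fallback: the widest brace-delimited span in the raw text.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start:end + 1])
    for candidate in candidates:
        try:
            obj = json.loads(candidate)
        except json.JSONDecodeError:
            continue  # not valid JSON, try the next candidate
        if isinstance(obj, dict) and "name" in obj:
            return obj
    return None
```

When nothing parses, the function returns `None`, which lets the caller treat the reply as plain text or ask the model to retry.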

About

A lightweight, autonomous sandboxed coding agent using Pure Python, Pydantic, and Docker.
