An autonomous coding agent designed to safely generate, execute, and verify code within isolated environments. Built with pure Python and Pydantic, avoiding the bloat and abstraction layers of complex agent frameworks.
The primary function of this agent is Safe Code Verification. Instead of running generated code directly on your host machine, the agent:
- Generates Implementation: Writes Python scripts to solve a specific task.
- Containerizes: Automatically writes a custom `Dockerfile` tailored to the script's dependencies.
- Builds & Isolates: Builds a Docker image, ensuring the environment is clean and reproducible.
- Verifies: Runs the container and captures `stdout` and `stderr`.
- Self-Corrects: If the container execution fails (e.g., missing dependencies, syntax errors), the agent reads the logs, patches the code, and rebuilds the container until it passes.
This ensures that no generated code executes on your local OS, properly isolating potential side effects or malicious logic.
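The generate-build-run-repair cycle above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names (`verify_in_container`, `debug_loop`) and the Docker CLI invocation details are assumptions for the example.

```python
# Sketch of the containerized verify loop (hypothetical helper names;
# the real control flow lives in main.py / agent/core.py).
import pathlib
import subprocess
import tempfile


def verify_in_container(code: str, dockerfile: str, tag: str = "agent-task"):
    """Build an image from the generated Dockerfile and run it, capturing logs."""
    with tempfile.TemporaryDirectory() as ctx:
        root = pathlib.Path(ctx)
        (root / "main.py").write_text(code)
        (root / "Dockerfile").write_text(dockerfile)
        build = subprocess.run(["docker", "build", "-t", tag, ctx],
                               capture_output=True, text=True)
        if build.returncode != 0:
            return False, build.stderr
        run = subprocess.run(["docker", "run", "--rm", tag],
                             capture_output=True, text=True)
        return run.returncode == 0, run.stdout + run.stderr


def debug_loop(generate, verify, max_iters: int = 5):
    """Ask the LLM for code, verify it, and feed failure logs back until it passes."""
    feedback = None
    for _ in range(max_iters):
        code, dockerfile = generate(feedback)
        ok, logs = verify(code, dockerfile)
        if ok:
            return code, logs
        feedback = logs  # the agent reads the logs and patches the code
    raise RuntimeError("exceeded max repair iterations")
```

Because `debug_loop` takes `generate` and `verify` as plain callables, the repair loop can be exercised without Docker by stubbing them out.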
This project deliberately avoids heavy agent frameworks (like LangChain, CrewAI, or AutoGen) in favor of a lightweight, transparent, and strictly typed foundation:
- No Black Box Logic: Control flow is explicit in `main.py` and `agent/core.py`, not hidden behind framework magic.
- Type Safety: Uses Pydantic for rigorous schema definition of tools (see `agent/tools.py`) and message history.
- Direct API Control: Manages LLM context windows and tool-calling parameters directly, ensuring predictable behavior with both Gemini and local Ollama models.
- Autonomous Debug Loop: The agent detects crash signals (like `Exit Code 127` or Python tracebacks) and iterates on fixes without human intervention.
- Dynamic Environments: Creates throwaway containers for every task, preventing dependency conflicts.
- Flexible Intelligence: Powered by an LLM backend (configurable for local or cloud inference) to drive the logic.
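A Pydantic-typed tool definition of the kind described above might look like this. The `WriteFile` model and its fields are illustrative, not the actual schemas from `agent/tools.py`.

```python
# Hypothetical tool schema sketch; the project's real definitions live in agent/tools.py.
from pydantic import BaseModel, Field


class WriteFile(BaseModel):
    """Tool schema the LLM must satisfy when asking to write a file."""
    path: str = Field(description="Relative path inside the build context")
    content: str = Field(description="Full file contents")


# The JSON schema handed to the model is derived directly from the type:
schema = WriteFile.model_json_schema()

# Incoming tool calls are validated before anything executes:
call = WriteFile.model_validate(
    {"path": "Dockerfile", "content": "FROM python:3.12-slim"}
)
```

Validation failures raise a `pydantic.ValidationError`, which gives the agent a structured error message to feed back to the model instead of silently executing a malformed call.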
```shell
uv sync
ollama pull qwen2.5-coder:7b
$env:LLM_PROVIDER="ollama"
uv run main.py
```

| Variable | Default | Description |
|---|---|---|
| `LLM_PROVIDER` | `gemini` | Backend for reasoning (`ollama` or `gemini`). |
| `OLLAMA_MODEL` | `qwen2.5-coder:7b` | Local model tag. |
We use explicit JSON regex parsing for tool calls to support models like qwen2.5-coder, which can be inconsistent with native tool outputs. More advanced models should support direct tool invocation without these workarounds.
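A minimal sketch of that regex-based extraction, assuming the model emits tool calls inside fenced ```json blocks (the pattern and function name are illustrative, not the project's actual code):

```python
# Pull a JSON tool call out of free-form model output (illustrative sketch).
import json
import re

# Matches the first fenced ```json block and captures the object inside it.
TOOL_CALL_RE = re.compile(r"```json\s*(\{.*?\})\s*```", re.DOTALL)


def extract_tool_call(text: str):
    """Return the parsed tool call, or None if no valid JSON block is found."""
    m = TOOL_CALL_RE.search(text)
    if not m:
        return None
    try:
        return json.loads(m.group(1))
    except json.JSONDecodeError:
        return None


reply = (
    "Sure, writing the file:\n"
    '```json\n{"tool": "write_file", "args": {"path": "main.py"}}\n```'
)
call = extract_tool_call(reply)
```

Returning `None` on malformed JSON lets the agent re-prompt the model rather than crash, which is exactly the inconsistency this workaround exists to absorb.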