An intelligent problem-solving framework that leverages large language models (LLMs) to autonomously solve tasks by generating and executing Python scripts in an iterative loop.
Solver is a framework that enables AI agents to tackle complex problems by:
- Analyzing a task description
- Reasoning about the solution approach
- Generating Python code to execute
- Learning from execution results
- Iterating until the problem is solved
The system uses local LLMs via Ollama to maintain privacy and control over the reasoning process.
┌─────────────────────────┐
│   Problem Description   │
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│ LLM Reasoning (Ollama)  │
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│ Generate Python Script  │
│     + Explanation       │
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│     Execute Script      │
│     Capture Output      │
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│     Problem Solved?     │
└────┬───────────────┬────┘
     │ Yes           │ No
     ▼               ▼
   DONE         Repeat Loop
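The loop in the diagram can be sketched in a few lines of Python. The names `solve`, `ask_llm`, and `run_script` below are illustrative stand-ins, not the framework's actual API, and the iteration cap is an assumption:

```python
import subprocess
import sys
import tempfile

DELIMITER = "========================="  # separates reasoning from code

def run_script(code):
    """Write generated code to a temporary file, run it, capture all output."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    proc = subprocess.run([sys.executable, path], capture_output=True, text=True)
    return proc.stdout + proc.stderr

def solve(task, ask_llm, max_iters=5):
    """Reason -> generate -> execute -> check, until the code part is empty."""
    history = f"Task: {task}"
    for _ in range(max_iters):
        reasoning, code = ask_llm(history).split(DELIMITER, 1)
        if not code.strip():  # an empty code part signals the task is complete
            return reasoning.strip()
        output = run_script(code)
        history += f"\n--- execution output ---\n{output}"
    return history  # iteration budget exhausted
```

Here `ask_llm` would wrap the Ollama call; execution results are fed back into the prompt so the model can recover from errors on the next pass.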
- Autonomous Execution: LLM generates and executes Python code iteratively
- Local Processing: Uses Ollama for local LLM inference (supports remote hosts)
- Flexible Models: Works with any LLM available through Ollama
- Result Tracking: Logs LLM reasoning and execution results
- Error Handling: Gracefully handles execution errors and returns them to the LLM
solver/
├── run.py # Example usage and entry point
├── experimenter_lib.py # Main Experimenter class
├── exec_lib.py # Script execution utilities
└── README.md # This file
- run.py: Demonstrates how to use the Solver framework with an example task
- experimenter_lib.py: Core Experimenter class that orchestrates the LLM-driven problem-solving loop
- exec_lib.py: Helper functions for executing generated Python scripts safely
- Python 3.8+
- Ollama: Download from ollama.ai
- LLM Model: Pull a model using ollama pull <model-name>
- Install Ollama (if not already installed):

  # macOS/Linux
  curl -fsSL https://ollama.ai/install.sh | sh
  # Then start Ollama
  ollama serve

- Pull an LLM model (in a new terminal):

  # Example: pull a coding-capable model
  ollama pull qwen3-coder:30b

- Install Python dependencies:

  pip install langchain langchain-ollama
from experimenter_lib import Experimenter
# Define your task
task = "Scrape the top 10 movie titles from IMDB"
# Specify the model (use exact name from 'ollama list')
model_name = "qwen3-coder:30b"
# Create experimenter (ollama_url=None uses local Ollama)
experimenter = Experimenter(task, model_name)
# Run the experiment
result = experimenter.run_experiment()
print(result)

Configuration constants:
- TEMPERATURE: Controls LLM randomness (0.0 = deterministic, higher = more creative)
- CONTEXT_LEN: Maximum context window for the model, in tokens
- RESPONSE_DELIMITER: Separator between the reasoning and the code in the LLM's response
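For reference, this is roughly how those knobs map onto the langchain-ollama client. The sketch below uses real ChatOllama parameters (temperature, num_ctx), but the specific values and how Experimenter wires them internally are assumptions:

```python
from langchain_ollama import ChatOllama

# Sketch: values shown are examples, not the framework's defaults.
llm = ChatOllama(
    model="qwen3-coder:30b",
    temperature=0.0,   # TEMPERATURE: 0.0 = deterministic
    num_ctx=32768,     # CONTEXT_LEN: context window in tokens
)
```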
# Local Ollama (default)
experimenter = Experimenter(task, model_name, ollama_url=None)
# Remote Ollama host
experimenter = Experimenter(task, model_name, ollama_url="http://192.168.1.100:11434")

The LLM must respond with exactly two parts, separated by a delimiter line (=========================):
First part: Reasoning and explanation
of what the agent plans to do
=========================
Second part: Python code to execute
(or empty if task is complete)
Example:
I need to fetch web content. I'll use requests library.
=========================
import requests
response = requests.get("https://example.com")
print(response.text[:100])
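Splitting a response in this format is a small string operation. The `parse_answer` helper below is an illustrative sketch, not the framework's `_parse_answer` implementation:

```python
# Delimiter line separating reasoning from code, per the format above.
RESPONSE_DELIMITER = "========================="

def parse_answer(response):
    """Split an LLM response into (reasoning, code) on the delimiter line."""
    reasoning, _, code = response.partition(RESPONSE_DELIMITER)
    return reasoning.strip(), code.strip()
```

An empty code part after the delimiter indicates the task is complete.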
Constructor:
Experimenter(task_description, model_name, ollama_url=None)

Main Method:
run_experiment(): Starts the problem-solving loop, returns final system response
Internal Methods:
_run_query(): Sends the prompt to the LLM via Ollama
_parse_answer(): Splits the LLM response into reasoning and code
_create_system_prompt(): Builds the system prompt for the LLM
cb_proc(command):
- Writes Python code to script.py
- Executes it using python3
- Returns stdout/stderr output
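A minimal sketch of that write-then-execute step. The file name script.py matches the docs, but the timeout and the use of sys.executable (rather than a literal python3) are assumptions:

```python
import subprocess
import sys

def cb_proc(command):
    """Write the generated code to script.py, execute it, return combined output."""
    with open("script.py", "w") as f:
        f.write(command)
    proc = subprocess.run(
        [sys.executable, "script.py"],  # docs say python3; sys.executable is portable
        capture_output=True,
        text=True,
        timeout=120,  # assumed guard; the real framework's timeout is unspecified
    )
    return proc.stdout + proc.stderr
```

Returning stderr alongside stdout is what lets the LLM see tracebacks and attempt recovery on the next iteration.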
_experimenter.log: Last LLM response (reasoning + code)
script.py: Most recently generated and executed Python script
task = "Scrape the latest news headlines from BBC News"
experimenter = Experimenter(task, "qwen3-coder:30b")
result = experimenter.run_experiment()

task = "Calculate statistics (mean, median, std) for the CSV file data.csv and save results to results.csv"
experimenter = Experimenter(task, "qwen3-coder:30b")
result = experimenter.run_experiment()

- Clear Task Descriptions: Write specific, unambiguous task descriptions
- Model Selection: Use models with strong coding capabilities (e.g., DeepSeek-Coder, Qwen-Coder)
- Error Handling: The framework returns errors to the LLM for recovery
- Testing: Test with simple tasks first before complex ones
- Resource Management: Monitor system resources during execution
- LLM Dependent: Quality depends on chosen model's capabilities
- External Dependencies: Generated code must have access to required libraries
- Determinism: With TEMPERATURE=0.0, the same task produces consistent results, though output still depends on the model
- Execution Risk: Runs arbitrary Python code; only use with trusted models and tasks
- Ensure Ollama is running: ollama serve
- Check that Ollama is accessible at the specified URL
- List available models: ollama list
- Pull the model: ollama pull <model-name>
- Use the exact model name in code
- Check script.py for syntax errors
- Ensure required Python packages are installed
- Review _experimenter.log for the LLM's reasoning
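When connectivity is in doubt, a quick probe against the Ollama base URL can rule out network issues. This helper is illustrative, not part of the framework; the default URL is Ollama's standard local port:

```python
import urllib.request

def ollama_reachable(url="http://localhost:11434"):
    """Return True if an HTTP GET to the Ollama base URL succeeds."""
    try:
        with urllib.request.urlopen(url, timeout=3) as resp:
            return resp.status == 200
    except OSError:  # connection refused, timeout, DNS failure, ...
        return False
```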
- Use Quantized Models: Faster execution with reduced memory (e.g., Q5_K_M)
- Reduce Context: Lower CONTEXT_LEN for faster inference (if the task allows)
- Model Selection: Smaller models run faster but may have reduced reasoning ability
This project is provided as-is for research and educational purposes.
Improvements, bug reports, and feature requests are welcome.