Experimenter: proof-of-concept AI-Powered Problem Solving Agent

itfx3035/experimenter

Solver - AI-Powered Problem Solving Agent

An intelligent problem-solving framework that leverages large language models (LLMs) to autonomously solve tasks by generating and executing Python scripts in an iterative loop.

Overview

Solver is a framework that enables AI agents to tackle complex problems by:

  1. Analyzing a task description
  2. Reasoning about the solution approach
  3. Generating Python code to execute
  4. Learning from execution results
  5. Iterating until the problem is solved

The system uses local LLMs via Ollama to maintain privacy and control over the reasoning process.
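The five steps above can be sketched as a small loop. Note that `solve`, `ask_llm`, and `run_script` below are illustrative placeholders, not the repository's actual API; they stand in for the Ollama call and the script executor.

```python
# Minimal sketch of the reason -> generate -> execute -> iterate loop.
# `ask_llm(history)` returns the raw LLM response; `run_script(code)`
# executes generated Python and returns its captured output.

DELIMITER = "=" * 25  # reasoning/code separator (see "Response Format")

def solve(task, ask_llm, run_script, max_steps=5):
    """Iterate until the model returns an empty code section."""
    history = f"Task: {task}"
    for _ in range(max_steps):
        reasoning, code = ask_llm(history).split(DELIMITER, 1)
        code = code.strip()
        if not code:                      # empty code part => model says done
            return reasoning.strip()
        output = run_script(code)         # execute and capture stdout/stderr
        history += f"\n--- last output ---\n{output}"
    return "max steps reached"
```

Feeding each execution result back into the prompt history is what lets the model recover from errors in the next iteration.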

How It Works

┌─────────────────────────────────────────┐
│  Problem Description                    │
└────────────────┬────────────────────────┘
                 │
                 ▼
        ┌────────────────┐
        │  LLM Reasoning │ (via Ollama)
        └────────┬───────┘
                 │
        ┌────────▼────────────────┐
        │ Generate Python Script  │
        │ + Explanation           │
        └────────┬────────────────┘
                 │
        ┌────────▼────────────────┐
        │  Execute Script         │
        │  Capture Output         │
        └────────┬────────────────┘
                 │
        ┌────────▼────────────────┐
        │   Problem Solved?       │
        └────┬──────────────┬─────┘
             │ Yes          │ No
             ▼              ▼
          DONE         Repeat Loop

Features

  • Autonomous Execution: LLM generates and executes Python code iteratively
  • Local Processing: Uses Ollama for local LLM inference (supports remote hosts)
  • Flexible Models: Works with any LLM available through Ollama
  • Result Tracking: Logs LLM reasoning and execution results
  • Error Handling: Gracefully handles execution errors and returns them to the LLM

Project Structure

solver/
├── run.py                   # Example usage and entry point
├── experimenter_lib.py      # Main Experimenter class
├── exec_lib.py              # Script execution utilities
└── README.md                # This file

File Descriptions

  • run.py: Demonstrates how to use the Solver framework with an example task
  • experimenter_lib.py: Core Experimenter class that orchestrates the LLM-driven problem solving loop
  • exec_lib.py: Helper functions for executing generated Python scripts safely

Installation

Prerequisites

  • Python 3.8+
  • Ollama: Download from ollama.ai
  • LLM Model: Pull a model using ollama pull <model-name>

Setup

  1. Install Ollama (if not already installed)

    # macOS/Linux
    curl -fsSL https://ollama.ai/install.sh | sh
    
    # Then start Ollama
    ollama serve
  2. Pull an LLM model (in a new terminal)

    # Example: Pull a coding-capable model
    ollama pull qwen3-coder:30b
  3. Install Python dependencies

    pip install langchain langchain-ollama

Usage

Basic Example

from experimenter_lib import Experimenter

# Define your task
task = "Scrape the top 10 movie titles from IMDb"

# Specify the model (use exact name from 'ollama list')
model_name = "qwen3-coder:30b"

# Create experimenter (ollama_url=None uses local Ollama)
experimenter = Experimenter(task, model_name)

# Run the experiment
result = experimenter.run_experiment()
print(result)

Configuration

In experimenter_lib.py:

  • TEMPERATURE: Controls LLM randomness (0.0 = deterministic, higher = more creative)
  • CONTEXT_LEN: Maximum context window for the model (tokens)
  • RESPONSE_DELIMITER: Separator between reasoning and code
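As a rough illustration, these constants might be declared as follows; the concrete values here are assumptions, not the repository's actual settings.

```python
# Illustrative configuration constants (values are assumptions).
TEMPERATURE = 0.0              # 0.0 = deterministic decoding
CONTEXT_LEN = 8192             # context window in tokens (model-dependent)
RESPONSE_DELIMITER = "=" * 25  # separates reasoning from code in responses
```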

In your code:

# Local Ollama (default)
experimenter = Experimenter(task, model_name, ollama_url=None)

# Remote Ollama host
experimenter = Experimenter(task, model_name, ollama_url="http://192.168.1.100:11434")

Response Format

The LLM must respond with exactly two parts, separated by a delimiter line (=========================):

First part: Reasoning and explanation
of what the agent plans to do

=========================
Second part: Python code to execute
(or empty if task is complete)

Example:

I need to fetch web content. I'll use the requests library.

=========================
import requests
response = requests.get("https://example.com")
print(response.text[:100])
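A parser matching this two-part contract could look like the following; `parse_answer` is a hypothetical stand-in for the project's `_parse_answer` method.

```python
RESPONSE_DELIMITER = "=" * 25

def parse_answer(response: str) -> tuple[str, str]:
    """Split an LLM response into (reasoning, code).

    An empty code part signals that the task is complete.
    """
    reasoning, sep, code = response.partition(RESPONSE_DELIMITER)
    if not sep:
        # No delimiter found: treat the whole response as reasoning.
        return response.strip(), ""
    return reasoning.strip(), code.strip()
```

Using `str.partition` rather than `str.split` keeps any later `=` runs inside the generated code intact, since only the first delimiter occurrence is consumed.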

Key Components

Experimenter Class

Constructor:

Experimenter(task_description, model_name, ollama_url=None)

Main Method:

  • run_experiment(): Starts the problem-solving loop, returns final system response

Internal Methods:

  • _run_query(): Sends prompt to LLM via Ollama
  • _parse_answer(): Splits LLM response into reasoning and code
  • _create_system_prompt(): Builds the system prompt for the LLM
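A system prompt builder along these lines would enforce the response format described above; the wording is illustrative, not the repository's actual prompt.

```python
RESPONSE_DELIMITER = "=" * 25

def create_system_prompt(task_description: str) -> str:
    """Build a system prompt that enforces the two-part response format."""
    return (
        "You are an autonomous problem-solving agent.\n"
        f"Task: {task_description}\n\n"
        "Reply in exactly two parts separated by this delimiter line:\n"
        f"{RESPONSE_DELIMITER}\n"
        "Part 1: your reasoning and plan. Part 2: Python code to run next,\n"
        "or leave it empty when the task is complete."
    )
```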

Execution Function

cb_proc(command)
  • Writes Python code to script.py
  • Executes it using python3
  • Returns stdout/stderr output
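An executor of this shape can be sketched as below. It uses `sys.executable` for portability where the project invokes `python3` directly, and the `timeout` parameter is an added assumption, not necessarily part of the original function.

```python
import subprocess
import sys
from pathlib import Path

def cb_proc(command: str, script_path: str = "script.py", timeout: int = 60) -> str:
    """Write generated code to disk, run it, and return combined output."""
    Path(script_path).write_text(command, encoding="utf-8")
    try:
        proc = subprocess.run(
            [sys.executable, script_path],
            capture_output=True, text=True, timeout=timeout,
        )
        # Return both streams so the LLM sees errors and can recover.
        return proc.stdout + proc.stderr
    except subprocess.TimeoutExpired:
        return f"ERROR: script exceeded {timeout}s timeout"
```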

Output Files

  • _experimenter.log: Last LLM response (reasoning + code)
  • script.py: Most recently generated and executed Python script

Example Workflows

Example 1: Web Scraping

task = "Scrape the latest news headlines from BBC News"
experimenter = Experimenter(task, "qwen3-coder:30b")
result = experimenter.run_experiment()

Example 2: Data Processing

task = "Calculate statistics (mean, median, std) for the CSV file data.csv and save results to results.csv"
experimenter = Experimenter(task, "qwen3-coder:30b")
result = experimenter.run_experiment()

Best Practices

  1. Clear Task Descriptions: Write specific, unambiguous task descriptions
  2. Model Selection: Use models with strong coding capabilities (e.g., DeepSeek-Coder, Qwen-Coder)
  3. Error Handling: The framework returns errors to the LLM for recovery
  4. Testing: Test with simple tasks first before complex ones
  5. Resource Management: Monitor system resources during execution

Limitations

  • LLM Dependent: Quality depends on chosen model's capabilities
  • External Dependencies: Generated code must have access to required libraries
  • Determinism: With TEMPERATURE=0.0 the same task yields broadly consistent results, though exact outputs still vary with the chosen model and its version
  • Execution Risk: Runs arbitrary, LLM-generated Python code; use only with trusted models and tasks

Troubleshooting

Connection refused Error

  • Ensure Ollama is running: ollama serve
  • Check if Ollama is accessible at the specified URL

Model Not Found

  • List available models: ollama list
  • Pull the model: ollama pull model-name
  • Use exact model name in code

Script Execution Failures

  • Check script.py for syntax errors
  • Ensure required Python packages are installed
  • Review _experimenter.log for LLM reasoning

Performance Tips

  • Use Quantized Models: Faster execution with reduced memory (e.g., Q5_K_M)
  • Reduce Context: Lower CONTEXT_LEN for faster inference (if task allows)
  • Model Selection: Smaller models run faster but may have reduced reasoning ability

License

This project is provided as-is for research and educational purposes.

Contributing

Improvements, bug reports, and feature requests are welcome.
