Skip to main content

Nano SWE Agent - A simple AI software engineering agent

Project description

mini-swe-agent banner

The 100 line AI agent that solves GitHub issues & more

📣 Gemini 3 Pro reaches 74% on SWE-bench verified with mini-swe-agent!
📣 New blogpost: Randomly switching between GPT-5 and Sonnet 4 boosts performance

Docs Slack PyPI - Version

In 2024, SWE-bench & SWE-agent helped kickstart the coding agent revolution.

We now ask: What if SWE-agent was 100x smaller, and still worked nearly as well?

The mini agent is for

  • Researchers who want to benchmark, fine-tune or RL without assumptions, bloat, or surprises
  • Developers who like to own, understand, and modify their tools
  • Engineers who want something trivial to sandbox & to deploy anywhere

Here's some details:

  • Minimal: Just 100 lines of python (+100 total for env, model, script) — no fancy dependencies!
  • Performant: Scores >74% on the SWE-bench verified benchmark benchmark; starts faster than Claude Code
  • Deployable: In addition to local envs, you can use docker, podman, singularity, apptainer, and more
  • Cutting edge: Built by the Princeton & Stanford team behind SWE-bench and SWE-agent.
  • Widely adopted: In use by Meta, NVIDIA, Essential AI, Anyscale, and others
  • Tested: Codecov
More motivation (for research)

SWE-agent jump-started the development of AI agents in 2024. Back then, we placed a lot of emphasis on tools and special interfaces for the agent. However, one year later, as LMs have become more capable, a lot of this is not needed at all to build a useful agent! In fact, the mini agent

  • Does not have any tools other than bash — it doesn't even use the tool-calling interface of the LMs. This means that you can run it with literally any model. When running in sandboxed environments you also don't need to take care of installing a single package — all it needs is bash.
  • Has a completely linear history — every step of the agent just appends to the messages and that's it. So there's no difference between the trajectory and the messages that you pass on to the LM. Great for debugging & fine-tuning.
  • Executes actions with subprocess.run — every action is completely independent (as opposed to keeping a stateful shell session running). This makes it trivial to execute the actions in sandboxes (literally just switch out subprocess.run with docker exec) and to scale up effortlessly. Seriously, this is a big deal, trust me.

This makes it perfect as a baseline system and for a system that puts the language model (rather than the agent scaffold) in the middle of our attention. You can see the result on the SWE-bench (bash only) leaderboard, that evaluates the performance of different LMs with mini.

More motivation (as a tool)

Some agents are overfitted research artifacts. Others are UI-heavy frontend monsters.

The mini agent wants to be a hackable tool, not a black box.

  • Simple enough to understand at a glance
  • Convenient enough to use in daily workflows
  • Flexible to extend

Unlike other agents (including our own swe-agent), it is radically simpler, because it:

  • Does not have any tools other than bash — it doesn't even use the tool-calling interface of the LMs. Instead of implementing custom tools for every specific thing the agent might want to do, the focus is fully on the LM utilizing the shell to its full potential. Want it to do something specific like opening a PR? Just tell the LM to figure it out rather than spending time to implement it in the agent.
  • Executes actions with subprocess.run — every action is completely independent (as opposed to keeping a stateful shell session running). This is a big deal for the stability of the agent, trust me.
  • Has a completely linear history — every step of the agent just appends to the messages that are passed to the LM in the next step and that's it. This is great for debugging and understanding what the LM is prompted with.
Should I use SWE-agent or mini-SWE-agent?

You should use mini-swe-agent if

  • You want a quick command line tool that works locally
  • You want an agent with a very simple control flow
  • You want even faster, simpler & more stable sandboxing & benchmark evaluations
  • You are doing FT or RL and don't want to overfit to a specific agent scaffold

You should use swe-agent if

  • You need specific tools or want to experiment with different tools
  • You want to experiment with different history processors
  • You want very powerful yaml configuration without touching code

What you get with both

  • Excellent performance on SWE-Bench
  • A trajectory browser
Simple UI (mini) Visual UI (mini -v)

mini

miniv

Batch inference Trajectory browser

swebench

inspector

Python bindings More in the docs
agent = DefaultAgent(
    LitellmModel(model_name=...),
    LocalEnvironment(),
)
agent.run("Write a sudoku game")

Let's get started!

Option 1: If you just want to try out the CLI (package installed in anonymous virtual environment)

pip install uv && uvx mini-swe-agent [-v]
# or
pip install pipx && pipx ensurepath && pipx run mini-swe-agent [-v]

Option 2: Install CLI & python bindings in current environment

pip install mini-swe-agent
mini -v  # run the CLI

Option 3: Install from source (developer setup)

git clone https://github.com/SWE-agent/mini-swe-agent.git
cd mini-swe-agent && pip install -e .
mini [-v]  # run the CLI

Read more in our documentation:

Attribution

If you found this work helpful, please consider citing the SWE-agent paper in your work:

@inproceedings{yang2024sweagent,
  title={{SWE}-agent: Agent-Computer Interfaces Enable Automated Software Engineering},
  author={John Yang and Carlos E Jimenez and Alexander Wettig and Kilian Lieret and Shunyu Yao and Karthik R Narasimhan and Ofir Press},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024},
  url={https://arxiv.org/abs/2405.15793}
}

Our other projects:

SWE-agent    SWE-ReX    SWE-bench    SWE-smith    CodeClash    sb-cli

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mini_swe_agent-1.17.4.tar.gz (50.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mini_swe_agent-1.17.4-py3-none-any.whl (81.2 kB view details)

Uploaded Python 3

File details

Details for the file mini_swe_agent-1.17.4.tar.gz.

File metadata

  • Download URL: mini_swe_agent-1.17.4.tar.gz
  • Upload date:
  • Size: 50.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mini_swe_agent-1.17.4.tar.gz
Algorithm Hash digest
SHA256 52d8b84d69eb6fca12ec10dc37387c9e0d70826f80f0fef91ecb30f5ba2b4bcc
MD5 4c1d7516ec79c6bb81603ed5ef2b565e
BLAKE2b-256 54737cb2e5afdda99aafb74134afd180e12ad203871530de42740b6039c4793f

See more details on using hashes here.

File details

Details for the file mini_swe_agent-1.17.4-py3-none-any.whl.

File metadata

File hashes

Hashes for mini_swe_agent-1.17.4-py3-none-any.whl
Algorithm Hash digest
SHA256 a8df18447d51210007420588512355013aaaecb95001727f3cb86d9847ff85c8
MD5 60b3ec70e5e92b8a52ba0af1883d5e29
BLAKE2b-256 de1134b8223f13e7b5344945f04ed97f327b89a1cfa56cc7e805393b5087d113

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page