ml-ralph

An autonomous ML engineering agent with a terminal user interface. ml-ralph automates the experiment loop — planning, execution, analysis, and learning extraction — so you can iterate on ML projects faster.

You define your goals through a PRD. The agent works through stories autonomously, runs experiments, tracks metrics, and accumulates structured learnings across iterations.

Getting started

bunx @pentoai/ml-ralph

That's it. Run it inside any ML project directory and the TUI will launch in tmux.

Requirements

Bun v1.0+
tmux (brew install tmux)
Claude Code CLI, installed and authenticated

The cognitive framework

ml-ralph operates as a paranoid scientist. Its core assumption: results are probably misleading, data is probably corrupted, and conclusions should be broken before they're trusted. It allocates roughly 70% of effort to understanding and verification, 20% to strategy, and 10% to execution.

The agent works through a 4-phase cognitive cycle:

UNDERSTAND → STRATEGIZE → EXECUTE → REFLECT
     ↑                                  │
     └──────────────────────────────────┘

Understand — Verify data integrity (row counts, label distributions, sample inspection). Run exploratory analysis. Research prior art. Build a mental model and explicitly list all assumptions. Nothing happens until this is done.

Strategize — Generate 3–5 competing hypotheses. For each: what's expected, why, and what will be learned. Think 5–6 steps ahead. Pick the path with the best learning-to-effort ratio. Run the smallest experiment that tests the hypothesis.

Execute — Run the experiment. Log metrics and observations as work happens, not after. Surprises are more valuable than confirmations.

Reflect — Verify results are real, not artifacts of bugs, leakage, or evaluation errors. Try to break your own result before trusting it. Then decide:

Too good? → Verify harder
Verified and promising? → Strategize next step
Surprised or confused? → Go back to Understand
Stuck after 2–3 experiments? → Strategic retreat to Understand
All success criteria met and verified? → Complete

Strategic retreat — going back to understand when stuck — is a first-class concept, not a failure. Understanding is progress.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
.claude		.claude
.codex/skills/ml-ralph		.codex/skills/ml-ralph
.github/workflows		.github/workflows
docs		docs
src		src
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
RALPH_CONTEXT.md		RALPH_CONTEXT.md
README.md		README.md
biome.json		biome.json
bun.lock		bun.lock
package.json		package.json
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ml-ralph

Getting started

Requirements

The cognitive framework

License

About

Uh oh!

Releases 2

Packages

Contributors 2

Uh oh!

Languages

pentoai/ml-ralph

Folders and files

Latest commit

History

Repository files navigation

ml-ralph

Getting started

Requirements

The cognitive framework

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Contributors 2

Uh oh!

Languages

Packages