This guide shows how to run Shinka with coding agents using the project skills:
- shinka-setup: scaffold task files (evaluate.py, initial.<ext>, optional run config)
- shinka-convert: snapshot an existing repo into a Shinka task directory
- shinka-run: launch and iterate evolution batches via shinka_run
- shinka-inspect: load top-performing programs into a compact context bundle
It covers:
- installing Shinka
- installing Claude Code and/or Codex CLI
- installing the skills from this GitHub repo with npx skills add
- running a practical setup -> run -> inspect loop
From a clean machine:
pip install shinka-evolve
# or
uv pip install shinka-evolve
Set API keys (example):
cp .env.example .env 2>/dev/null || true
# Edit .env and add OPENAI_API_KEY / ANTHROPIC_API_KEY as needed
Install one or both agent CLIs (Claude Code, Codex):
npm install -g @anthropic-ai/claude-code
claude --version
npm install -g @openai/codex
codex --version
The Shinka skills live directly in this repo under skills/. You do not need to copy files by hand or publish a separate npm package.
Install all current Shinka skills globally for Claude Code and Codex:
npx skills add SakanaAI/ShinkaEvolve --skill '*' -g -a claude-code -a codex -y
This installs from the GitHub repo source. The explicit --skill '*' makes "install all skills" unambiguous and avoids interactive prompts.
Installed skills currently include:
- shinka-setup
- shinka-convert
- shinka-run
- shinka-inspect
Use this if you want the skills installed only for the current repo:
npx skills add SakanaAI/ShinkaEvolve --skill '*' -a claude-code -a codex -y
Typical project paths:
- Claude Code: .claude/skills/
- Codex: .agents/skills/
For the global install command above, the relevant skill roots are:
- Claude Code: ~/.claude/skills/
- Codex: ~/.codex/skills/
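To confirm where an install landed, the skill roots listed above can be scanned from Python. This is a minimal sketch, assuming each skill is installed as an entry named after the skill (for example shinka-setup) directly under its root; the helper name find_shinka_skills is hypothetical and not part of Shinka or the skills CLI.

```python
from pathlib import Path

# Skill roots named in this guide. Project roots are relative to the repo root.
GLOBAL_ROOTS = [Path.home() / ".claude" / "skills", Path.home() / ".codex" / "skills"]
PROJECT_ROOTS = [Path(".claude/skills"), Path(".agents/skills")]


def find_shinka_skills(roots):
    """Map each existing root to the sorted shinka-* entries found directly in it.

    Assumption: a skill shows up as an entry named after the skill itself.
    """
    found = {}
    for root in roots:
        if root.is_dir():
            found[str(root)] = sorted(p.name for p in root.glob("shinka-*"))
    return found


# Roots that do not exist are skipped, so an empty printout means nothing is installed.
for root, skills in find_shinka_skills(GLOBAL_ROOTS + PROJECT_ROOTS).items():
    print(f"{root}: {', '.join(skills) or 'no shinka-* skills found'}")
```

Run it from the repo root so the project-level paths resolve correctly.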
For a narrower install:
npx skills add SakanaAI/ShinkaEvolve --skill shinka-setup -g -a claude-code -a codex -y
Ask the agent to scaffold a new task directory and evaluator contract.
Example prompt:
Use shinka-setup to scaffold a new task in examples/my_task.
Language: python.
Goal: maximize <metric>.
Illustration (setup flow):
Expected output:
- initial.<ext> with evolve block
- evaluate.py producing metrics.json + correct.json
- optional run_evo.py / shinka.yaml scaffolds when requested
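The evaluator contract above amounts to writing two JSON files per evaluation. The sketch below only illustrates that shape. The helper write_results and the key names combined_score, correct, and error are assumptions made for illustration; the evaluate.py that shinka-setup generates defines the actual contract.

```python
import json
from pathlib import Path


def write_results(results_dir, score, error=None):
    """Write metrics.json and correct.json into results_dir.

    The key names below are illustrative assumptions, not a confirmed schema.
    """
    out = Path(results_dir)
    out.mkdir(parents=True, exist_ok=True)
    metrics = {"combined_score": float(score)}            # hypothetical key
    correct = {"correct": error is None, "error": error}  # hypothetical keys
    (out / "metrics.json").write_text(json.dumps(metrics, indent=2))
    (out / "correct.json").write_text(json.dumps(correct, indent=2))
    return metrics, correct
```

Keeping the score and the correctness flag in separate files lets a failed candidate be recorded as incorrect without inventing a score for it.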
Use shinka_run for agent-driven evolution loops.
Minimal batch:
shinka_run \
--task-dir examples/my_task \
--results_dir results/my_task_agent \
--num_generations 10
With core knobs via --set:
shinka_run \
--task-dir examples/my_task \
--results_dir results/my_task_agent \
--num_generations 20 \
--set evo.max_api_costs=0.5 \
--set evo.llm_models='["gpt-5-mini","gemini-3-flash-preview"]' \
--set db.num_islands=2 \
--set db.parent_selection_strategy=weighted
Illustration (run flow):
Use shinka-inspect after one or more batches to generate an agent-ready context file.
Minimal:
python skills/shinka-inspect/scripts/inspect_best_programs.py \
--results-dir results/my_task_agent \
--k 5
With filters and explicit output:
python skills/shinka-inspect/scripts/inspect_best_programs.py \
--results-dir results/my_task_agent \
--k 8 \
--min-generation 10 \
--max-code-chars 5000 \
--out results/my_task_agent/inspect/top_programs.md
Output:
- default file: results/my_task_agent/shinka_inspect_context.md
- contains ranking + code snippets for top programs
- designed to be loaded directly into coding-agent context
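When the inspect step is driven from a script, it helps to build the command and resolve the output path in one place. This sketch uses only the flags and default file name shown above. The helper names inspect_command and context_path are hypothetical, and the default path assumes the context file is written directly under the results directory, as in the example.

```python
import sys
from pathlib import Path

# Script path as used in this guide, relative to the repo root.
INSPECT_SCRIPT = "skills/shinka-inspect/scripts/inspect_best_programs.py"


def inspect_command(results_dir, k=5, out=None):
    """Build the argv for the inspect script using flags shown in this guide."""
    argv = [sys.executable, INSPECT_SCRIPT, "--results-dir", str(results_dir), "--k", str(k)]
    if out is not None:
        argv += ["--out", str(out)]
    return argv


def context_path(results_dir, out=None):
    """Return where the context markdown is expected to be written."""
    if out is not None:
        return Path(out)
    return Path(results_dir) / "shinka_inspect_context.md"
```

A caller would pass inspect_command(...) to subprocess.run(..., check=True) and then read context_path(...) to load the bundle into the agent's context.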
When using the shinka-run skill:
- unless the user explicitly requests fully autonomous execution, ask for config confirmation between batches
- keep --results_dir the same across continuation batches so prior state can reload
- change --results_dir only when intentionally forking a new run
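The continuation rule above can be made hard to violate by generating every batch command from one function with a fixed results directory. The following is a sketch under stated assumptions: it uses only the shinka_run flags shown in this guide, the helper shinka_run_command is hypothetical, and list values are JSON-encoded to mirror the evo.llm_models example.

```python
import json
import shlex


def shinka_run_command(task_dir, results_dir, num_generations, overrides=None):
    """Build a shinka_run argv from the flags used in this guide."""
    argv = [
        "shinka_run",
        "--task-dir", task_dir,
        "--results_dir", results_dir,
        "--num_generations", str(num_generations),
    ]
    for key, value in (overrides or {}).items():
        # Lists and dicts become compact JSON; scalars pass through as plain text.
        if isinstance(value, (list, dict)):
            text = json.dumps(value, separators=(",", ":"))
        else:
            text = str(value)
        argv += ["--set", f"{key}={text}"]
    return argv


# Two consecutive batches that share one results directory, so state can reload.
RESULTS_DIR = "results/my_task_agent"
batches = [
    {"num_generations": 10},
    {"num_generations": 20, "overrides": {"evo.max_api_costs": 0.5, "db.num_islands": 2}},
]
for batch in batches:
    # shlex.join quotes each argument so the printed line is safe to paste into a shell.
    print(shlex.join(shinka_run_command("examples/my_task", RESULTS_DIR, **batch)))
```

The loop only prints the commands, which leaves room for the confirmation step between batches before anything is executed.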
Before first run:
- shinka_run --help works
- task dir has evaluate.py + initial.<ext>
- API keys are available in environment
- npx skills list shows the installed Shinka skills
- for global installs, skills appear under ~/.claude/skills/ and/or ~/.codex/skills/
- for project installs, skills appear under .claude/skills/ and/or .agents/skills/
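Part of the checklist above can be automated. This sketch covers the first three items only, with one simplification: it checks that shinka_run is on PATH rather than actually invoking shinka_run --help. The function name preflight is hypothetical, and the API key check assumes that either of the two keys mentioned earlier is sufficient.

```python
import os
import shutil
from pathlib import Path


def preflight(task_dir, env=None):
    """Return a mapping of check name to pass/fail for a task directory."""
    env = os.environ if env is None else env
    task = Path(task_dir)
    return {
        # Weaker than running `shinka_run --help`: only confirms the CLI is on PATH.
        "shinka_run on PATH": shutil.which("shinka_run") is not None,
        "evaluate.py present": (task / "evaluate.py").is_file(),
        "initial.<ext> present": any(task.glob("initial.*")),
        # Assumption: one of these two keys is enough for the chosen models.
        "API key set": any(env.get(k) for k in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")),
    }


for check, ok in preflight("examples/my_task").items():
    print(f"{'ok  ' if ok else 'FAIL'} {check}")
```

The skill-install items are left to npx skills list, since the skills CLI is the authority on what it installed.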
After each batch:
- check run artifacts/logs under the chosen results_dir
- review score and correctness trend
- run shinka-inspect and review the generated context markdown
- choose next batch config (budget, models, islands, attempts, generations)



