RoboMonkey

Scaling Test-Time Sampling and Verification for Vision-Language-Action Models

🛠️ Setup

Clone this repository:

git clone --recurse-submodules https://github.com/robomonkey-vla/RoboMonkey.git

Use the provided script to set up all dependencies:

bash scripts/setup.sh

This setup has been tested on 2×RTX 4090 GPUs using this Docker image.

✅ Action Verifier

Spin up the action verifier server:

conda activate monkey-verifier
cd monkey-verifier/src
python infer_server.py

⚡ VLA Serving Engine

Launch OpenVLA using our optimized SGLang-based engine:

conda activate sglang-vla
cd sglang-vla
CUDA_VISIBLE_DEVICES=1 python openvla_server.py --seed 1

🤖 SIMPLER Environment

Running RoboMonkey

Activate the environment and run the evaluation script as follows:

conda activate simpler_env
export PRISMATIC_DATA_ROOT=. && export PYTHONPATH=.
cd openvla-mini

xvfb-run --auto-servernum -s "-screen 0 640x480x24" \
python experiments/robot/simpler/run_simpler_eval.py \
  --task_suite_name simpler_put_eggplant_in_basket \
  --initial_samples 9 \
  --augmented_samples 32

initial_samples: Number of actions generated by the base policy.
augmented_samples: Number of actions generated via Gaussian perturbation.
task_suite_name: One of simpler_put_eggplant_in_basket, simpler_stack_cube, simpler_spoon_on_towel, simpler_carrot_on_plate.
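
The two sampling parameters can be pictured with a short sketch. This is not the repository's implementation. It assumes the augmented actions are drawn from a per-dimension Gaussian fitted to the initial samples, and `score_action` is a hypothetical stand-in for the action verifier. Picking the highest-scoring candidate is likewise an assumption about how the verifier's output is used.

```python
import random
import statistics


def augment_actions(initial_actions, num_augmented, rng):
    """Draw new actions from a per-dimension Gaussian fitted to the initial samples."""
    dims = list(zip(*initial_actions))  # transpose: one tuple per action dimension
    means = [statistics.mean(d) for d in dims]
    stds = [statistics.pstdev(d) for d in dims]
    return [
        [rng.gauss(m, s) for m, s in zip(means, stds)]
        for _ in range(num_augmented)
    ]


def select_action(candidates, score_action):
    """Return the candidate with the highest (hypothetical) verifier score."""
    return max(candidates, key=score_action)


def score_action(action):
    """Hypothetical verifier: prefers actions close to the origin."""
    return -sum(x * x for x in action)


# Toy data: 9 initial samples of a 3-D action, standing in for base-policy outputs.
rng = random.Random(1)
initial = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(9)]
augmented = augment_actions(initial, 32, rng)
best = select_action(augmented, score_action)
```

In this sketch, a single initial sample yields a fitted standard deviation of zero, so every augmented action equals that sample.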

Baseline without Verifier

To disable the verifier and use the base policy:

--initial_samples 1 --augmented_samples 1
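
To run the baseline over every task suite, the command line can be assembled programmatically. The sketch below only builds and prints the commands, using the script path and flags shown in this README. The helper name `build_eval_command` is hypothetical and not part of the repository.

```python
import shlex

# Task suites listed in this README.
TASK_SUITES = [
    "simpler_put_eggplant_in_basket",
    "simpler_stack_cube",
    "simpler_spoon_on_towel",
    "simpler_carrot_on_plate",
]


def build_eval_command(task_suite, initial_samples, augmented_samples):
    """Return the evaluation command as an argument list (hypothetical helper)."""
    return [
        "xvfb-run", "--auto-servernum", "-s", "-screen 0 640x480x24",
        "python", "experiments/robot/simpler/run_simpler_eval.py",
        "--task_suite_name", task_suite,
        "--initial_samples", str(initial_samples),
        "--augmented_samples", str(augmented_samples),
    ]


# Baseline: one initial sample and one augmented sample per task suite.
baseline_commands = [build_eval_command(task, 1, 1) for task in TASK_SUITES]
for cmd in baseline_commands:
    print(shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` from inside `openvla-mini`, with the `simpler_env` environment active and the exports above in place.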

📊 Evaluation Results

Task	Initial Samples	Augmented Samples	Seed 1	Seed 2	Seed 3	Average	Baseline	Improvement over Baseline ↑
Eggplant in Basket	9	32	76%	66%	78%	73%	54%	+19%
Carrot on Plate	5	16	24%	24%	26%	25%	20%	+5%
Spoon on Towel	5	32	46%	46%	50%	47%	45%	+2%
Stack Cube	9	32	46%	40%	48%	45%	35%	+10%
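
The last three columns follow from the per-seed numbers: the average is the mean of the three seeds rounded to the nearest percent, and the final column is that average minus the baseline, in percentage points. A small sketch that reproduces the arithmetic from the table values (the `summarize` helper is illustrative only):

```python
# Per-seed success rates and baselines, copied from the table (in percent).
results = {
    "Eggplant in Basket": {"seeds": [76, 66, 78], "baseline": 54},
    "Carrot on Plate": {"seeds": [24, 24, 26], "baseline": 20},
    "Spoon on Towel": {"seeds": [46, 46, 50], "baseline": 45},
    "Stack Cube": {"seeds": [46, 40, 48], "baseline": 35},
}


def summarize(seeds, baseline):
    """Return (rounded mean over seeds, gain over baseline in percentage points)."""
    average = round(sum(seeds) / len(seeds))
    return average, average - baseline


for task, row in results.items():
    average, gain = summarize(row["seeds"], row["baseline"])
    print(f"{task}: average {average}%, {gain:+d} points over baseline")
```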

Logs are saved under: openvla-mini/experiments/log/

📚 Acknowledgements

We thank the authors of OpenVLA, SGLang, SimplerEnv, LLaVA-RLHF, and OpenVLA-mini for their contributions to the open-source community. Our implementation builds upon these projects. For comprehensive details and the latest updates, please consult the official documentation and repositories of the respective projects.

If you find this project helpful, please consider citing:

@article{kwok25robomonkey,
  title={RoboMonkey: Scaling Test-Time Sampling and Verification for Vision-Language-Action Models},
  author={Jacky Kwok and Christopher Agia and Rohan Sinha and Matt Foutter and Shulu Li and Ion Stoica and Azalia Mirhoseini and Marco Pavone},
  journal={arXiv preprint arXiv:2506.17811},
  year={2025},
}

🔎 Troubleshooting

If you encounter the following error: No Vulkan extensions found for window surface creation (hint: set VK_ICD_FILENAMES to `locate icd.json`)

You can resolve this by running the script that installs Vulkan dependencies and sets up the correct ICD configuration:

bash scripts/vulkan.sh
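
Before or after running the script, it can help to see what the Vulkan loader has to work with. The diagnostic sketch below prints the current value of `VK_ICD_FILENAMES` and lists ICD manifest files in two directories that Vulkan loaders commonly search on Linux. The directory list is an assumption about a typical system, and the sketch does not describe what `scripts/vulkan.sh` itself does.

```python
import glob
import os

# Directories commonly searched for ICD manifests on Linux (an assumption here).
ICD_DIRS = ["/usr/share/vulkan/icd.d", "/etc/vulkan/icd.d"]


def find_icd_files(directories=ICD_DIRS):
    """Return all ICD manifest (*.json) files found in the given directories."""
    found = []
    for directory in directories:
        found.extend(sorted(glob.glob(os.path.join(directory, "*.json"))))
    return found


current = os.environ.get("VK_ICD_FILENAMES")
print("VK_ICD_FILENAMES =", current if current else "(unset)")
for path in find_icd_files():
    print("candidate ICD manifest:", path)
```

If the variable is unset and a manifest for your GPU is listed, exporting `VK_ICD_FILENAMES` with that path matches the hint in the error message.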
