Robotics: Science and Systems (RSS) 2025
IMLE Policy: Fast and Sample Efficient Visuomotor Policy Learning via Implicit Maximum Likelihood Estimation
Krishan Rana†,1, Robert Lee†, David Pershouse1, Niko Suenderhauf1
†Equal contribution, 1Queensland University of Technology
Download our source code:
git clone https://github.com/krishanrana/imle_policy.git
cd imle_policy
Create a virtual environment with Python 3.10 and activate it, e.g. with miniconda:
conda create -y -n imle_policy -c conda-forge python=3.10 evdev=1.9.0 xorg-x11-proto-devel-cos6-x86_64 glew mesa-libgl-devel-cos6-x86_64 libglib
conda activate imle_policy
Install all requirements:
pip install -e .
Download MuJoCo for the Kitchen and UR3 Block Push environments:
./get_mujoco.sh
Download all the required datasets and extract (~25GB):
cd imle_policy
wget https://huggingface.co/datasets/krishanrana/imle_policy/resolve/main/datasets.zip && unzip datasets.zip && rm datasets.zip
To download and extract only the PushT sim dataset:
cd imle_policy
wget https://huggingface.co/datasets/krishanrana/imle_policy/resolve/main/pusht_dataset/datasets.zip && unzip datasets.zip && rm datasets.zip
To train IMLE Policy on the PushT task with all the default parameters, run:
python train.py --task pusht --method rs_imle
Note: you will be prompted to log in to your wandb account the first time you run this.
Available options:
- task: pusht, Lift, NutAssemblySquare, PickPlaceCan, ToolHang, TwoArmTransport, kitchen, ur3_blockpush
- method: rs_imle, diffusion, flow_matching
- dataset_percentage: fixed subsample of the full dataset, ranging from 0.1 to 1.0
- epsilon: IMLE Policy-specific hyperparameter that controls the rejection sampling threshold
- n_samples_per_condition: IMLE Policy-specific hyperparameter that controls the number of samples generated per conditioning input
- use_traj_consistency: IMLE Policy-specific hyperparameter that controls whether trajectory consistency is used
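These options combine freely on the command line. A couple of illustrative invocations follow; note that only --task and --method appear in the documented command above, so the remaining flag names are assumed to mirror the option names and should be checked against train.py:

# Hypothetical examples -- flag names beyond --task and --method are assumptions.
# Train on 10% of the PushT dataset:
python train.py --task pusht --method rs_imle --dataset_percentage 0.1
# Adjust the rejection sampling threshold and samples per condition:
python train.py --task kitchen --method rs_imle --epsilon 0.05 --n_samples_per_condition 20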
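For intuition on what epsilon and n_samples_per_condition control, here is a minimal NumPy sketch of a rejection-sampling IMLE objective. This is an illustration of the general RS-IMLE idea, not the repository's actual loss code, and the function and variable names are our own:

```python
import numpy as np

def rs_imle_loss(data, samples, epsilon):
    """Sketch of a rejection-sampling IMLE objective.

    data:    (N, D) batch of ground-truth actions
    samples: (M, D) actions generated for the same conditioning
             (M plays the role of n_samples_per_condition)
    """
    # Pairwise distances between every data point and every generated sample.
    dists = np.linalg.norm(data[:, None, :] - samples[None, :, :], axis=-1)  # (N, M)
    # Rejection step: discard samples already within epsilon of *any* data
    # point, so the generator is pushed toward regions it does not yet cover.
    accepted = dists.min(axis=0) > epsilon  # (M,)
    if not accepted.any():
        return 0.0
    # Each data point is pulled toward its nearest accepted sample.
    return dists[:, accepted].min(axis=1).mean()
```

A larger epsilon rejects more of the generated samples before the nearest-neighbour matching, while a larger number of samples per condition makes it more likely that an accepted sample lies close to each data point.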
If you found our code helpful, please consider citing:
@inproceedings{rana2025imle,
  title     = {IMLE Policy: Fast and Sample Efficient Visuomotor Policy Learning via Implicit Maximum Likelihood Estimation},
  author    = {Rana, Krishan and Lee, Robert and Pershouse, David and Suenderhauf, Niko},
  booktitle = {Proceedings of Robotics: Science and Systems (RSS)},
  year      = {2025}
}
The authors would like to thank the maintainers of the open-source code upon which this project was built:
- The policy architectures, diffusion policy implementation, and Push-T environment are built on the Diffusion Policy repository.
- The RS-IMLE implementation was adapted from the RS-IMLE repository.
- The Lift, NutAssemblySquare, PickPlaceCan, ToolHang, and TwoArmTransport environments are provided by Robomimic.
- The Kitchen environment is provided by D4RL.
- The UR3 Block Push environment is adapted from the VQ-BeT repository.
