
Multi-Modal Policy Consensus

Multi-Modal Manipulation via Multi-Modal Policy Consensus

📄 Paper | 🌐 Project Page | 📝 Blog | 📜 Deepwiki

Haonan Chen¹,⁴, Jiaming Xu¹*, Hongyu Chen¹*, Kaiwen Hong¹, Binghao Huang², Chaoqi Liu¹,
Jiayuan Mao³, Yunzhu Li², Yilun Du⁴†, Katherine Driggs-Campbell¹†

¹ University of Illinois Urbana-Champaign ² Columbia University ³ Massachusetts Institute of Technology ⁴ Harvard University

*Equal contribution †Equal advising

Teaser

Citation

If you use this work in your research, please cite:

@misc{chen2025multimodalmanipulationmultimodalpolicy,
      title={Multi-Modal Manipulation via Multi-Modal Policy Consensus},
      author={Haonan Chen and Jiaming Xu and Hongyu Chen and Kaiwen Hong and Binghao Huang and Chaoqi Liu and Jiayuan Mao and Yunzhu Li and Yilun Du and Katherine Driggs-Campbell},
      year={2025},
      eprint={2509.23468},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2509.23468},
}

Project Structure

This project is organized into several key directories (a layout sketch follows the list):

  • modular_policy/: Contains the core policy implementation, including models, environments, and training/evaluation scripts.
  • scripts/: Utility scripts for data generation, processing, analysis, and policy evaluation.
  • third_party/: External dependencies and libraries.
  • data/: Dataset storage directory.
  • output/: Default directory for experiment outputs, checkpoints, and evaluation results.
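
A layout sketch for orientation (the nested paths are illustrative examples taken from commands later in this README, not an exhaustive listing):

policyconsensus/
├── modular_policy/   # core policy code; configs under modular_policy/config/
├── scripts/          # manip_data.py, manip_policy.py, eval_policy.py, ...
├── third_party/      # external dependencies
├── data/             # e.g. data/rlbench/mt4_expert_200.zarr
└── output/           # e.g. output/checkpoints/, output/eval_results/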

Quick Setup

Automated Setup (Recommended)

For local Linux systems:

./setup.sh

For cluster environments:

./setup.sh --cluster

For real robot environments:

./setup.sh --real

The script automatically does the following (a quick verification sketch follows the list):

  • ✅ Configures environment variables in ~/.bashrc
  • ✅ Downloads and installs MuJoCo 210
  • ✅ Downloads and installs CoppeliaSim (local only)
  • ✅ Creates conda environment with all dependencies
  • ✅ Installs the package in development mode
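
A quick way to check that the automated setup finished (a minimal sketch; the paths and environment name are taken from the manual steps and commands below, so adjust them if your install locations differ):

ls ~/.mujoco/mujoco210/bin               # MuJoCo 210 files present
ls $COPPELIASIM_ROOT                     # CoppeliaSim (local Linux only)
conda env list | grep policy-consensus   # conda environment created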

Manual Setup

If you prefer manual installation or encounter issues:

Click to expand manual installation steps

1. Configure environment variables - Add to ~/.bashrc:

# Common (both local and cluster)
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${HOME}/.mujoco/mujoco210/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
export MUJOCO_GL=egl

export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH

# Local Linux only (skip for cluster)
export COPPELIASIM_ROOT=${HOME}/.coppeliasim
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$COPPELIASIM_ROOT
export QT_QPA_PLATFORM_PLUGIN_PATH=$COPPELIASIM_ROOT

2. Install MuJoCo:

mkdir -p ~/.mujoco && cd ~/.mujoco
wget https://mujoco.org/download/mujoco210-linux-x86_64.tar.gz -O mujoco210.tar.gz --no-check-certificate
tar -xvzf mujoco210.tar.gz

3. Install CoppeliaSim (local Linux only):

wget https://downloads.coppeliarobotics.com/V4_1_0/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04.tar.xz
mkdir -p $COPPELIASIM_ROOT && tar -xf CoppeliaSim_Edu_V4_1_0_Ubuntu20_04.tar.xz -C $COPPELIASIM_ROOT --strip-components 1
rm -rf CoppeliaSim_Edu_V4_1_0_Ubuntu20_04.tar.xz

4. Create conda environment:

mamba install mamba==1.5.9 -n base -c conda-forge

# Local (simulation)
mamba env create -f conda_environment.yaml

# Cluster
mamba env create -f conda_environment_cluster.yaml

# Real robot
mamba env create -f conda_environment_real.yaml

# Activate and install
conda activate policy-consensus
pip install -e .
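
To sanity-check the manual install (a minimal sketch; the module name modular_policy is assumed from the directory layout above):

python -c "import modular_policy; print('modular_policy imported OK')"
echo $MUJOCO_GL   # should print: egl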

Getting Started

After setup:

source ~/.bashrc  # If first time setup
conda activate policy-consensus

Pre-Collected Datasets

Download pre-collected demonstration datasets from Hugging Face:

  • Simulation (RLBench MT4): mt4_expert_200.zarr
  • Real-World (UR5e Robot): puzzle_expert_50.zarr

Usage:

# Download and extract simulation dataset
wget https://huggingface.co/datasets/haonan-chen/policy-consensus/resolve/main/mt4_expert_200.zarr.tar.gz -O mt4_expert_200.zarr.tar.gz
mkdir -p data/rlbench
tar -xzf mt4_expert_200.zarr.tar.gz -C data/rlbench/

# Download and extract real-world dataset
wget https://huggingface.co/datasets/haonan-chen/policy-consensus/resolve/main/puzzle_expert_50.zarr.tar.gz -O puzzle_expert_50.zarr.tar.gz
mkdir -p data
tar -xzf puzzle_expert_50.zarr.tar.gz -C data/

# Verify datasets
python scripts/inspect_and_replay_dataset.py data/rlbench/mt4_expert_200.zarr --inspect-only
python scripts/inspect_and_replay_dataset.py data/puzzle_expert_50.zarr --inspect-only
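
If you prefer to poke at a dataset directly from Python rather than through the script, a minimal sketch (this assumes the .zarr files are plain zarr stores; take the actual key names from the --inspect-only output above):

import zarr

# Open the downloaded dataset read-only
root = zarr.open("data/rlbench/mt4_expert_200.zarr", mode="r")

# Walk the top level and print array shapes (groups print as "(group)")
for name, item in root.items():
    print(name, getattr(item, "shape", "(group)"))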

For real-world data collection and the complete workflow, see the Real-World Guide.

Quick Start

End-to-end workflow for simulation tasks:

# 1. Download pre-collected dataset (see above) OR generate your own:
python scripts/manip_data.py gen rlbench -t mt4 -c 200

# 2. Train compositional policy (RGB + PCD + DINO)
python scripts/manip_policy.py \
  --config-name="train_dp_unets_spec_rgb_pcd_dino" \
  task=rlbench/mt4 \
  task.dataset.zarr_path=data/rlbench/mt4_expert_200.zarr \
  training.num_epochs=1001

# 3. Evaluate trained policy
python scripts/eval_policy.py sim \
  -c output/checkpoints/latest.ckpt \
  -o output/eval_results/ \
  -n 200

For real-world tasks: See the Real-World Guide for hardware setup, dataset downloads, and complete workflows.


Generate Data (Simulation)

This section details how to generate demonstration data in simulation environments (RLBench). For real-world data collection, please refer to the Real-World Robotic Manipulation Guide.

Run the scripted policy associated with the simulation benchmark to generate data.

python scripts/manip_data.py gen rlbench -t [task name] -c [num episodes]

# e.g. generate 200 demos for each task in the MT4 set of the RLBench benchmark
python scripts/manip_data.py gen rlbench -t mt4 -c 200
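
Before launching a full generation run, a small smoke test can save time; the sketch below reuses the inspection script from the dataset section (the output path is a placeholder — use whatever location the generation script reports):

# Generate a handful of episodes as a quick sanity check
python scripts/manip_data.py gen rlbench -t mt4 -c 5

# Inspect the generated store (replace the placeholder path)
python scripts/inspect_and_replay_dataset.py data/rlbench/<generated_dataset>.zarr --inspect-only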

Train Policy

Simulation (RLBench Multi-Task)

Ours - Compositional Policy (RGB + PCD + DINO):

python scripts/manip_policy.py \
  --config-name="train_dp_unets_spec_rgb_pcd_dino" \
  task=rlbench/mt4 \
  task.dataset.zarr_path=data/rlbench/mt4_expert_200.zarr \
  training.num_epochs=1001

Baselines:

# Feature Concatenation (RGB + PCD + DINO)
python scripts/manip_policy.py \
  --config-name="train_dp_unet_rgb_pcd_dino" \
  task=rlbench/mt4 \
  task.dataset.zarr_path=data/rlbench/mt4_expert_200.zarr \
  training.num_epochs=1001

# RGB Only
python scripts/manip_policy.py \
  --config-name="train_dp_unet_rgb" \
  task=rlbench/mt4 \
  task.dataset.zarr_path=data/rlbench/mt4_expert_200.zarr \
  training.num_epochs=1001

Real-World Tasks

See the Real-World Guide for complete workflow including:

  • Data collection procedures
  • Pre-collected dataset downloads
  • Training configurations
  • Evaluation protocols and safety guidelines

Ours - Specialized RGB + Tactile (Policy Composition):

python scripts/manip_policy.py \
  --config-name="train_dp_unets_spec_rgb_tactile" \
  task=real/puzzle_expert \
  task.dataset.zarr_path=data/puzzle_expert_50.zarr \
  training.num_epochs=1001 \
  policy.obs_encoders.0.model_library.RobomimicRgbEncoder.crop_shape='[91, 121]' \
  policy.obs_encoders.1.model_library.RobomimicTactileEncoder.crop_shape='[15, 30]'

Baselines:

# Feature Concatenation (RGB + Tactile)
python scripts/manip_policy.py \
  --config-name="train_dp_unet_rgb_tactile" \
  task=real/puzzle_expert \
  task.dataset.zarr_path=data/puzzle_expert_50.zarr \
  training.num_epochs=1001

# RGB Only
python scripts/manip_policy.py \
  --config-name="train_dp_unet_rgb" \
  task=real/puzzle_expert \
  task.dataset.zarr_path=data/puzzle_expert_50.zarr \
  task.dataset.obs_keys="['camera_0_color','camera_1_color','robot_joint']"

See modular_policy/config/ for all available configurations.
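
The training entry point accepts Hydra-style --config-name and key=value overrides, so any configuration above can be adjusted from the command line; a sketch of a shortened run using only override keys already shown in this README (the epoch count is just an illustrative value):

python scripts/manip_policy.py \
  --config-name="train_dp_unet_rgb" \
  task=rlbench/mt4 \
  task.dataset.zarr_path=data/rlbench/mt4_expert_200.zarr \
  training.num_epochs=101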

Dataset Visualization and Inspection

The inspection tool works with both simulation and real-world datasets, though we primarily use it for real-world data analysis. For more examples and details, refer to the Real-World Robotic Manipulation Guide.

Visualize episodes as videos:

python scripts/inspect_and_replay_dataset.py data/puzzle_expert_50.zarr --video-only --episodes 0 1 2 3 4

Inspect dataset structure and statistics:

python scripts/inspect_and_replay_dataset.py data/puzzle_expert_50.zarr --inspect-only

Advanced options

Video generation:

  • --output-dir <dir>: Custom output directory
  • --fps <num>: Frame rate (default: 10)
  • --no-tactile: Disable tactile visualization

Dataset inspection:

  • --no-samples: Skip sample data (faster for large datasets)
  • --max-episodes <num>: Inspect more episodes in detail

The script automatically detects camera views, visualizes tactile data with heatmaps, and provides dataset statistics including episode counts, data keys, and quality checks.
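
To reproduce a similar tactile heatmap outside the script, a minimal sketch with zarr and matplotlib (the data/tactile key below is a placeholder — substitute the key reported by --inspect-only, and note the frame layout may differ):

import zarr
import matplotlib.pyplot as plt

root = zarr.open("data/puzzle_expert_50.zarr", mode="r")

# Placeholder key: substitute the tactile key reported by --inspect-only
frame = root["data"]["tactile"][0]

plt.imshow(frame.squeeze(), cmap="hot")   # render one tactile frame as a heatmap
plt.colorbar()
plt.savefig("tactile_frame_0.png")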

Evaluate Policy

Simulation

Basic Evaluation:

python scripts/eval_policy.py sim \
  -c output/checkpoints/policy.ckpt \
  -o output/eval_results/ \
  -n 200

Examples:

# Evaluate our compositional policy (RLBench MT4)
python scripts/eval_policy.py sim \
  -c output/rlbench_compositional/checkpoints/ep-1000_sr-0.999.ckpt \
  -o output/eval/rlbench/compositional \
  -n 200

# Evaluate baseline - Feature Concatenation (RGB + PCD + DINO)
python scripts/eval_policy.py sim \
  -c output/rlbench_concat_baseline/checkpoints/ep-1000_sr-0.999.ckpt \
  -o output/eval/rlbench/concat_baseline \
  -n 200

# Evaluate baseline - RGB only
python scripts/eval_policy.py sim \
  -c output/rlbench_rgb_baseline/checkpoints/ep-1000_sr-0.999.ckpt \
  -o output/eval/rlbench/rgb_baseline \
  -n 200

# Evaluate multiple checkpoints in a directory
python scripts/eval_policy.py sim \
  -c output/my_experiment/checkpoints/ \
  -o output/eval/my_experiment \
  -n 200

Key Parameters:

  • -c, --checkpoint: Path to checkpoint file (.ckpt) or directory of checkpoints
  • -o, --output_dir: Output directory for evaluation results
  • -n, --num_episodes: Number of episodes to evaluate (default: 200)

Real-World

See the Real-World Guide for complete real-world evaluation workflow including robot setup, safety protocols, and interactive controls.

python modular_policy/workspace/eval_policy_real.py \
  -i output/checkpoints/policy.ckpt \
  -o output/eval_results/

License

This project is licensed under the MIT License - see the LICENSE file for details.
