Repository for the paper "ManipDreamer3D: Synthesizing Plausible Robotic Manipulation Video with Occupancy-aware 3D Trajectory"

myendless1/ManipDreamer3D

🦾 ManipDreamer3D

Synthesizing Plausible Robotic Manipulation Video with Occupancy-aware 3D Trajectory
Official implementation of the paper ManipDreamer3D (arXiv:2509.05314)



🌟 Overview

ManipDreamer3D is a 3D-aware video generation framework designed for plausible robotic manipulation synthesis.
It integrates occupancy-aware 3D trajectory planning with visual generative modeling, achieving both visual realism and physical feasibility.

🔹 Key Highlights

  • 🧠 Occupancy-based 3D path planning ensures physical plausibility
  • 🎥 Generates consistent multi-frame robotic motion videos
  • 📈 Achieves 67.9% success rate, comparable to CogACT (67.5%) in SimplerEnv

🚀 Quick Installation

Tested on Ubuntu 22.04 / CUDA 11.8 / Python 3.10
⚠️ Important: Install sam2 first, then other dependencies.

  1. Set up the environment
conda create -n md3d python=3.10 -y
conda activate md3d

pip install uv -i https://pypi.tuna.tsinghua.edu.cn/simple
export UV_INDEX_URL="https://pypi.tuna.tsinghua.edu.cn/simple/"

uv pip install git+https://github.com/facebookresearch/sam2.git
pip install torch==2.1.2+cu118 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
CUDA_HOME=/usr/local/cuda-11.8 uv pip install -e . --no-build-isolation
CUDA_HOME=/usr/local/cuda-11.8 uv pip install -r requirements.txt
  2. Install NKSR (for occupancy reconstruction)
git clone https://github.com/nv-tlabs/NKSR.git && cd NKSR
CUDA_HOME=/usr/local/cuda-11.8 uv pip install --no-build-isolation package/
cd ..
  3. Install SimplerEnv (for policy evaluation)
git clone https://github.com/simpler-env/SimplerEnv --recurse-submodules
cd SimplerEnv
uv pip install -r requirements_full_install.txt
cd ManiSkill2_real2sim && uv pip install .
cd .. && uv pip install .
pip install --upgrade "jax[cuda11_pip]==0.4.20" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
cd ..
  4. Fix the Vulkan dependency if needed
conda install conda-forge::libvulkan-loader -y
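
Steps 1 and 2 pass CUDA_HOME=/usr/local/cuda-11.8 to the build commands, so it can help to confirm that a CUDA toolkit is actually at that path before compiling. The following is a minimal sketch, not part of the repository: the function name check_cuda_home is hypothetical, and it only checks that an executable nvcc exists under the given directory.

```shell
# Hypothetical helper (not part of ManipDreamer3D): verify that a CUDA
# toolkit directory contains an executable nvcc before building extensions.
check_cuda_home() {
    # Default to the path used in the install commands above.
    cuda_home="${1:-/usr/local/cuda-11.8}"
    if [ ! -x "$cuda_home/bin/nvcc" ]; then
        echo "nvcc not found under $cuda_home" >&2
        return 1
    fi
    echo "Using CUDA toolkit at $cuda_home"
}

# Warn instead of aborting so the snippet is safe to paste into a shell.
check_cuda_home /usr/local/cuda-11.8 \
    || echo "Adjust CUDA_HOME in the install commands to match your CUDA install."
```

If your toolkit lives elsewhere, pass that directory as the first argument and use the same path for CUDA_HOME in the commands above.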

⚙️ Environment Variables

Create an env.sh file with the following contents, then run source env.sh to load the environment variables:

## env.sh
export USER_ROOT="/path/to/your/workspace"
conda activate md3d

# CUDA 11.8 toolchain paths
export PATH=/usr/local/cuda-11.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH

# API key for OpenAI or DeepSeek
export api_key="your_openai_api_key"

# SimplerEnv data path
export MS2_REAL2SIM_ASSET_DIR=../SimplerEnv/ManiSkill2_real2sim/data

# PyTorch memory tuning
export PYTORCH_CUDA_ALLOC_CONF="max_split_size_mb:64"
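
After sourcing env.sh, a quick check that every variable is set can save a failed run later. The sketch below is illustrative only: the function name check_md3d_env is hypothetical, the variable names are taken from the env.sh listing above, and it tests only that each one is non-empty, not that the paths or key are valid.

```shell
# Hypothetical helper (not part of ManipDreamer3D): report which of the
# variables defined in env.sh are unset or empty in the current shell.
check_md3d_env() {
    missing=""
    for name in USER_ROOT api_key MS2_REAL2SIM_ASSET_DIR PYTORCH_CUDA_ALLOC_CONF; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Missing environment variables:$missing" >&2
        return 1
    fi
    echo "All ManipDreamer3D environment variables are set."
}

# Warn instead of aborting so the snippet is safe to paste into a shell.
check_md3d_env || echo "Fix env.sh and run 'source env.sh' again."
```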

📦 Download Pretrained Models

mkdir -p manipdreamer3d/weights
wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth \
     -O manipdreamer3d/weights/groundingdino_swint_ogc.pth
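
Because wget runs with -q and -O, a failed download prints nothing and can leave an empty file at the target path. A minimal sketch for checking the result before training follows. The function name check_weights is hypothetical, and it only confirms that the file exists and is non-empty, not that its contents are intact.

```shell
# Hypothetical helper (not part of ManipDreamer3D): succeed only if the
# checkpoint file exists and has a non-zero size.
check_weights() {
    path="$1"
    if [ ! -s "$path" ]; then
        echo "Checkpoint missing or empty: $path" >&2
        return 1
    fi
    echo "Found checkpoint: $path"
}

# Warn instead of aborting so the snippet is safe to paste into a shell.
check_weights manipdreamer3d/weights/groundingdino_swint_ogc.pth \
    || echo "Re-run the wget command above without -q to see the download error."
```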

🧩 Training

bash scripts/train/train_md3d.sh

🎯 Citation

If you find this work useful, please consider citing:

@article{li2025manipdreamer3d,
  title={ManipDreamer3D: Synthesizing Plausible Robotic Manipulation Video with Occupancy-aware 3D Trajectory},
  author={Li, Ying and Wei, Xiaobao and Chi, Xiaowei and Li, Yuming and Zhao, Zhongyu and Wang, Hao and Ma, Ningning and Lu, Ming and Zhang, Shanghang},
  journal={arXiv preprint arXiv:2509.05314},
  year={2025}
}
