🎇MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation
We present MoMa-Kitchen, a benchmark dataset with over 100k auto-generated samples featuring affordance-grounded manipulation positions and egocentric RGB-D data, and propose NavAff, a lightweight model that learns optimal navigation termination for seamless manipulation transitions. Our approach generalizes across diverse robotic platforms and arm configurations, addressing the critical gap between navigation proximity and manipulation readiness in mobile manipulation.
- Paper uploaded to arXiv
- MoMa-Kitchen Dataset release
- NavAff Model Training Code release
- Data Collection Code and Assets release
- Create a conda environment with Python 3.8
```bash
conda create --name MoMaKitchen python=3.8
conda activate MoMaKitchen
```

- Install PyTorch

Please change the index URL to match your CUDA version:

```bash
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu124
```
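Before installing the remaining dependencies, you can confirm that the installed wheel actually sees your GPU. This is a minimal sanity-check sketch, not part of the official setup:

```python
# Quick sanity check: verify that PyTorch was built with CUDA support
# and can see at least one GPU before proceeding.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA build:", torch.version.cuda)           # e.g. "12.4" for the cu124 wheel
print("GPU available:", torch.cuda.is_available())
```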
- Install Main Dependencies

```bash
pip install -r requirements.txt
```

Download the robot info data from Google Drive (24.7 MB).
Then unzip this file and change the info_root path in config.yaml.
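To make sure the path is set correctly before training, a small check along these lines can help. This is a minimal sketch that assumes config.yaml is a flat YAML file with an info_root key holding a directory path, as described above, and that PyYAML is installed; adjust the key name if the actual layout differs:

```python
# Minimal sketch: load config.yaml and verify that info_root points to
# an existing directory. Assumes a flat YAML layout with an "info_root" key.
from pathlib import Path
import yaml

with open("config.yaml", "r") as f:
    cfg = yaml.safe_load(f)

info_root = Path(cfg["info_root"])
assert info_root.is_dir(), f"info_root does not exist: {info_root}"
print("info_root OK:", info_root)
```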
Download the RGB-D and processed point cloud data from Hugging Face: https://huggingface.co/datasets/IPEC-COMMUNITY/MoMa-Kitchen-Data
(Optional) After downloading (via git lfs), you can delete the .git folder in the dataset directory to save space.
```bash
rm -rf .git
```

Then remember to change the rgbd_root path to Path/To/Your/Datafolder in config.yaml.
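If you prefer to skip git lfs entirely, the same dataset can usually be fetched with the huggingface_hub Python API, which avoids the .git folder altogether. This is an alternative sketch rather than the documented workflow; the local_dir below is a placeholder that you should replace and then use as your rgbd_root:

```python
# Alternative download sketch using huggingface_hub instead of git lfs.
# Replace local_dir with the folder you will set as rgbd_root in config.yaml.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="IPEC-COMMUNITY/MoMa-Kitchen-Data",
    repo_type="dataset",
    local_dir="Path/To/Your/Datafolder",  # placeholder path
)
```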
Start training:

```bash
bash train_on_ali.sh
```

The training process takes nearly 12 hours on a single A100 GPU.