To acquire and process public datasets, follow the NoMaD instructions. For Habitat datasets, install Habitat with conda install habitat-sim==0.2.4 withbullet headless -c conda-forge -c aihabitat, then use collect_datasets.py to collect trajectories in the format this project requires. We use the Matterport3D (MP3D) dataset for data collection.
Your dataset must follow the directory structure below. If you are collecting a custom dataset, please organize it accordingly:
├── <dataset_name>
│ ├── <name_of_traj1>
│ │ ├── 0.jpg
│ │ ├── 1.jpg
│ │ ├── ...
│ │ ├── T_1.jpg
│ │ └── traj_data.pkl
│ ├── <name_of_traj2>
│ │ ├── 0.jpg
│ │ ├── ...
│ │ └── traj_data.pkl
│ ...
│ └── <name_of_trajN>
│   ├── 0.jpg
│   ├── ...
│   └── traj_data.pkl
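Before training, it can help to check that a custom dataset matches this layout. The following is a minimal sketch, not part of the repository: the function names `check_trajectory` and `check_dataset` are hypothetical, and it assumes that frames are named with consecutive integers starting at 0 and that each trajectory folder holds exactly one `traj_data.pkl`. It does not open the pickle file, since its contents are not described here.

```python
from pathlib import Path


def check_trajectory(traj_dir: Path) -> list[str]:
    """Return the problems found in one trajectory folder (empty if none)."""
    problems = []
    if not (traj_dir / "traj_data.pkl").is_file():
        problems.append("missing traj_data.pkl")
    # Assumption: frames are named 0.jpg, 1.jpg, ... with no gaps.
    indices = sorted(int(p.stem) for p in traj_dir.glob("*.jpg") if p.stem.isdigit())
    if not indices:
        problems.append("no numbered .jpg frames")
    elif indices != list(range(len(indices))):
        problems.append("frame indices are not consecutive from 0")
    return problems


def check_dataset(root: Path) -> dict[str, list[str]]:
    """Map each problematic trajectory folder name to its list of problems."""
    report = {}
    for traj_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        problems = check_trajectory(traj_dir)
        if problems:
            report[traj_dir.name] = problems
    return report
```

Calling `check_dataset(Path("<dataset_name>"))` returns an empty dictionary when every trajectory folder passes these checks.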
Use data_split.py to split your data into training and testing sets.
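As a rough illustration of what a trajectory-level split involves, the sketch below shuffles trajectory folder names and cuts the list at a given fraction. It is not the code of data_split.py. The function name `split_trajectories`, the default fraction of 0.8, and the seed are assumptions. It also assumes that the `-s` value used in this README is the fraction of trajectories assigned to training, so that `-s 0` places every trajectory in the test split.

```python
import random


def split_trajectories(names, train_fraction=0.8, seed=0):
    """Shuffle trajectory names and return (train_names, test_names)."""
    if not 0.0 <= train_fraction <= 1.0:
        raise ValueError("train_fraction must be in [0, 1]")
    shuffled = sorted(names)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    cut = int(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]
```

Splitting by whole trajectories, rather than by individual frames, keeps frames of the same trajectory from appearing in both sets.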
Split Training Data:
python data_split.py -i <path_to_train_data> -d <train_dataset_name>
Split Test Data:
python data_split.py -i <path_to_test_data> -d <test_dataset_name> -s 0
Train the model using the provided configuration file:
python train.py --config config/config_shortcut_w_pretrain.yaml
Evaluation involves three steps: preparing ground truth frames, generating future frames, and calculating metrics (LPIPS, DreamSim, FID).
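The three steps can be chained from a small driver script. The sketch below only assembles the commands listed in Steps 1 to 3 of this section and, unless `dry_run` is disabled, prints them without executing anything. The helper names `build_eval_commands` and `run_evaluation` are hypothetical and are not part of the repository. The checkpoint is fixed to `latest`, as in the commands here.

```python
import subprocess


def build_eval_commands(run_name, dataset):
    """Return the three evaluation commands as argument lists, in order."""
    infer = [
        "python", "isolated_infer.py",
        "--exp", f"logs/{run_name}",
        "--ckp", "latest",
        "--datasets", dataset,
    ]
    return [
        infer + ["--gt", "1"],  # Step 1: prepare ground truth frames
        infer + ["--gt", "0"],  # Step 2: generate future frames
        [  # Step 3: calculate metrics
            "python", "isolated_eval.py",
            "--gt_dir", "output/gt",
            "--exp_dir", f"output/{run_name}_latest",
            "--datasets", dataset,
        ],
    ]


def run_evaluation(run_name, dataset, dry_run=True):
    """Print each command; execute it too when dry_run is False."""
    for cmd in build_eval_commands(run_name, dataset):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the chain at the first failing step
            subprocess.run(cmd, check=True)
```

Run it from the repository root, for example `run_evaluation("<run_name>", "<test_dataset_name>", dry_run=False)`, so that the relative paths `logs/` and `output/` resolve.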
Step 1: Prepare Ground Truth Frames
python isolated_infer.py --exp logs/<run_name> --ckp latest --datasets <test_dataset_name> --gt 1
Step 2: Generate Future Frames
python isolated_infer.py --exp logs/<run_name> --ckp latest --datasets <test_dataset_name> --gt 0
Step 3: Calculate Metrics
python isolated_eval.py --gt_dir output/gt --exp_dir output/<run_name>_latest --datasets <test_dataset_name>
Perform waypoint prediction using the trained World Model.
Before running inference, download the Distance Model Weights into models_dist/weights.
We provide an intuitive inference script for testing:
python inference.py
If you find this work useful in your research, please consider citing:
@misc{shen2026efficientmultimodalnavigationonestep,
title={An Efficient and Multi-Modal Navigation System with One-Step World Model},
author={Wangtian Shen and Ziyang Meng and Jinming Ma and Mingliang Zhou and Diyun Xiang},
year={2026},
eprint={2601.12277},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2601.12277},
}