This repository contains the implementation of Latent BKI and aims to reproduce the experimental results reported in the paper.
conda env create -f environment.yml # this may take some time
conda activate latentbki_env
Some packages need to be installed manually:
- CLIP: follow the instructions in its repo.
- torchsparse (sudo apt-get install libsparsehash-dev, pip install --upgrade git+https://github.com/mit-han-lab/[email protected])
- pytorch3d
(Optional) Install ROS Noetic to visualize the map in Rviz.
Note: If you run into any issues, also check the dependent codebases.
Follow VLMap's Generate dataset section to get the MP3D sequences. The generated ground-truth semantics are slightly incorrect, so we provide a modification to obtain correct ground-truth data here.
Follow 2DPASS' Data Preparation section to obtain the SemanticKITTI dataset under the Dataset folder.
Download SPVCNN model checkpoint from here and put it under ./TwoDPASS/pretrained/SPVCNN.
- Download the Record3D app on an iPhone/iPad, record a video, and export it as a .r3d file.
- Extract the files in the .r3d file the same way you would extract a zip file; you will get a folder named rgbd and a metadata file.
- Run Data/select_r3d_frames.py with customized parameters to create the following real-world dataset folder structure:
/[your dataset path]
├──
├── ...
└── real_world/
    ├── [sequence name]
        ├── conf/
        |   ├── 000000.npy
        |   ├── 000001.npy
        |   └── ...
        ├── depth/
        |   ├── 000000.npy
        |   ├── 000001.npy
        |   └── ...
        ├── rgb/
        |   ├── 000000.jpg
        |   ├── 000001.jpg
        |   └── ...
        ├── intrinsics.txt
        └── poses.txt
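
Before running the mapping scripts, it can help to confirm that a sequence folder matches the layout above. The following is a minimal sketch using only the Python standard library. The function name `check_sequence_layout` is hypothetical and not part of this repository. It assumes that `intrinsics.txt` and `poses.txt` sit alongside `conf/`, `depth/`, and `rgb/` inside the sequence folder, and that the three folders should hold the same number of frames.

```python
from pathlib import Path


def check_sequence_layout(seq_dir):
    """Return a list of problems found in one real_world/[sequence name] folder.

    Hypothetical helper (not part of this repository). An empty list means the
    folder matches the expected layout.
    """
    seq_dir = Path(seq_dir)
    problems = []

    # Expected sub-folders and the file extension of the frames inside each.
    expected_dirs = {"conf": ".npy", "depth": ".npy", "rgb": ".jpg"}
    frame_counts = {}
    for name, ext in expected_dirs.items():
        sub = seq_dir / name
        if not sub.is_dir():
            problems.append(f"missing folder: {name}/")
            continue
        frame_counts[name] = len([p for p in sub.iterdir() if p.suffix == ext])
        if frame_counts[name] == 0:
            problems.append(f"no {ext} files in {name}/")

    # Expected top-level text files.
    for name in ("intrinsics.txt", "poses.txt"):
        if not (seq_dir / name).is_file():
            problems.append(f"missing file: {name}")

    # Assumption: every frame has a confidence map, a depth map and an RGB image.
    if len(set(frame_counts.values())) > 1:
        problems.append(f"frame counts differ: {frame_counts}")

    return problems
```

Calling `check_sequence_layout("/[your dataset path]/real_world/[sequence name]")` and printing the returned list gives a quick overview of anything missing before a long mapping run.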
You can download an already processed Record3D data example, my_house_long.zip, here.
Download mp3d_pca_64.pkl here to ./PCAonGPU/PCA_instance
In ./config/mp3d.yaml, provide paths for the following required parameters:
- data_dir: "/path/to/realworld/dataset/folder"
- pca_path: "/path/to/trained/pca/.pkl/file"
The other parameters are optional if you only want to reproduce the results.
Modify realworld.yaml under ./config
Required parameters:
- num_classes: [number of classes desired]
- data_dir: "/path/to/realworld/dataset/folder"
- pca_path: "/path/to/trained/pca/.pkl/file"
- intrinsic: [matrix from intrinsics.txt]
- sequences: [
[your_sequences_name]
]
- category: [
[List of words you want to decode]
]
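
The `intrinsic` entry expects the camera matrix stored in `intrinsics.txt`. As a rough sketch, assuming that file holds the nine entries of a 3x3 matrix as whitespace- or comma-separated numbers (check your own file, since the exact layout written by `Data/select_r3d_frames.py` is not specified here), the values could be turned into a nested list as follows. `parse_intrinsics` is a hypothetical helper name, and the nested-list shape is also an assumption; adapt it to the shape used in the provided realworld.yaml.

```python
def parse_intrinsics(text):
    """Parse nine numbers into a 3x3 row-major nested list.

    Hypothetical helper; assumes whitespace- or comma-separated values.
    """
    values = [float(tok) for tok in text.replace(",", " ").split()]
    if len(values) != 9:
        raise ValueError(f"expected 9 values for a 3x3 matrix, got {len(values)}")
    return [values[0:3], values[3:6], values[6:9]]


if __name__ == "__main__":
    # Illustrative values only, not taken from a real recording.
    example = "500.0 0.0 320.0\n0.0 500.0 240.0\n0.0 0.0 1.0"
    print(parse_intrinsics(example))
```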
Optional parameters:
- feature_size: [PCA downsampled size, default 64]
- grid_mask: [ignore points outside local grid, default True]
- down_sample_feature: [default True]
- raw_data: [Set to True only if features are saved to disk]
- subsample_points: [how many pixel features to use; the default, 1, uses all features]
- feature_dir: [set it only if you save latent feature to disk]
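
A quick sanity check on the loaded configuration can catch missing entries before a long mapping run. The sketch below operates on a plain dictionary (for example, the result of loading realworld.yaml with a YAML parser of your choice). `missing_required` and `REQUIRED_KEYS` are hypothetical names, not part of this repository; the key list simply mirrors the required parameters listed above.

```python
# Required parameters for realworld.yaml, as listed above.
REQUIRED_KEYS = (
    "num_classes",
    "data_dir",
    "pca_path",
    "intrinsic",
    "sequences",
    "category",
)


def missing_required(config):
    """Return the required keys that are absent or empty in a config dictionary."""
    missing = []
    for key in REQUIRED_KEYS:
        value = config.get(key)
        # Treat None, empty strings and empty lists as "not set".
        if value is None or value == "" or value == []:
            missing.append(key)
    return missing
```

If `missing_required(config)` returns a non-empty list, those entries still need to be filled in before running the mapping script.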
NOTE: semantic_kitti.yaml provides additional parameters, such as the feature size. Because we use the dataloader from 2DPASS, change the following parameters in TwoDPASS/config/SPVCNN-semantickitti.yaml:
train_data_loader:
  data_path: "/path/to/kitti/dataset/sequences"
val_data_loader:
  data_path: "/path/to/kitti/dataset/sequences"
In ./generate_results.py, set MODEL_NAME to one of the following:
- "LatentBKI_default": latent mapping using MP3D
- "LatentBKI_kitti": latent mapping using SemanticKITTI
- "LatentBKI_vlmap": includes the VLMap heuristic for the comparison experiment
- "LatentBKI_realworld": map real-world environment captured by Record3D
The generated latent map and evaluation results for each sequence are saved under the Results folder.
In ./inference.py, provide the following parameters:
- RESULT_SAVE: the folder that contains the map you want to evaluate
- MODEL_NAME: The model you used to create the above map
- scenes: the sequences you want to evaluate
The evaluation results are written to a results.txt file in the folder you provided as RESULT_SAVE.
- Run ./publish_map.py with latent_map and category_map set to the map you want to visualize.
- Open Rviz and subscribe to the topic visualization_marker_array.
- Run ./publish_map.py with customized MODEL_NAME and latent_map_path parameters.
- Open Rviz and subscribe to the topics Open_Query/Heatmap and Open_Query/Uncertainty.
- In the terminal, follow the prompt to query an arbitrary word.
