Data

ShapeLLM

ShapeLLM uses text-3D paired data from Cap3D for first-stage alignment training, and uses GPT-4V to construct multi-view-based general data for supervised fine-tuning (SFT). To equip the model with 3D visual grounding capability, we also use GPT-4 to build SFT data for embodied understanding from GAPartNet. All data except cap3d_pcs can be downloaded directly from Hugging Face. The cap3d_pcs data must be obtained in .pt form from the Cap3D repository.

| Data file name | Size |
| --- | --- |
| cap3d_objaverse_785k.json | 242 MB |
| cap3d_objaverse_sft_45k.json | 16.9 MB |
| gapartnet_sft_27k_openai.json | 12.5 MB |
| gapartnet_pcs.zip | 4.59 GB |
| cap3d_pcs | 173.8 GB |

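Because cap3d_pcs alone is 173.8 GB, it may be worth checking free disk space before downloading. The following is a minimal sketch, not part of the repository: the sizes are copied from the table above (assuming 1 GB = 1000 MB), and `has_enough_space` and its `margin_gb` parameter are hypothetical names introduced here for illustration. The margin is a rough allowance for extracting gapartnet_pcs.zip, whose unpacked size is not stated.

```python
import shutil

# Sizes copied from the table above, in GB (assumption: 1 GB = 1000 MB).
SIZES_GB = {
    "cap3d_objaverse_785k.json": 0.242,
    "cap3d_objaverse_sft_45k.json": 0.0169,
    "gapartnet_sft_27k_openai.json": 0.0125,
    "gapartnet_pcs.zip": 4.59,
    "cap3d_pcs": 173.8,
}


def required_gb():
    """Total download size of all listed files, in GB."""
    return sum(SIZES_GB.values())


def has_enough_space(path, margin_gb=5.0):
    """Return True if the filesystem holding `path` has room for the data.

    `margin_gb` is a hypothetical safety margin, not a documented figure.
    """
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb() + margin_gb


if __name__ == "__main__":
    print(f"Required: about {required_gb():.1f} GB")
    print("Enough space:", has_enough_space("."))
```
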
Organize the data under ./playground/data/shapellm/ as follows:
│playground/data/shapellm/
├── cap3d_objaverse_785k.json
├── cap3d_objaverse_sft_45k.json
├── gapartnet_sft_27k_openai.json
├── gapartnet_pcs
│   ├── Box_100129_0_0.npy
│   └── ...
└── cap3d_pcs
    ├── 00000054c36d44a2a483bdbff31d8edf.pt
    └── ...
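
The layout above can be checked before training with a short script. This is a minimal sketch under stated assumptions: `missing_shapellm_entries` is a hypothetical helper, not a function from the repository, and it only tests that the three JSON files and the two point-cloud folders exist, not that the folders are complete.

```python
from pathlib import Path

# Entries taken from the directory tree above. The two point-cloud
# folders are checked for existence only, not for completeness.
EXPECTED_FILES = [
    "cap3d_objaverse_785k.json",
    "cap3d_objaverse_sft_45k.json",
    "gapartnet_sft_27k_openai.json",
]
EXPECTED_DIRS = ["gapartnet_pcs", "cap3d_pcs"]


def missing_shapellm_entries(root):
    """Return the expected entries that are absent under `root`."""
    root = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    missing = missing_shapellm_entries("./playground/data/shapellm")
    print("All entries found." if not missing else f"Missing: {missing}")
```

An empty list means every top-level entry is in place. A folder name ending in `/` in the output means that directory has not been created or extracted yet.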

ReCon++

The overall directory structure should be:

│llava/
│ReConV2/
│  └──data/
│    ├──OpenShape/
│    ├──ModelNet/
│    ├──ModelNetFewshot/
│    └──ScanObjectNN/

OpenShape Dataset:

│OpenShape/
├──objaverse-processed/
│  └── merged_for_training_final/
│      ├── 3D-FUTURE/
│      ├── ABO/
│      ├── Objaverse/
│      └── ShapeNet/
├──meta_data/
│      ├── modelnet40/
│      ├── scanobjectnn/
│      ├── split/
│      ├── gpt4_filtering.json
│      ├── lvis_cat_name_pt_feat.npy
│      └── point_feat_knn.npy

Download: You can download the processed data from OpenShape on Hugging Face. The rendered image data is not needed.

ModelNet40 Dataset:

│ModelNet/
├──modelnet40_normal_resampled/
│  ├── modelnet40_shape_names.txt
│  ├── modelnet40_train.txt
│  ├── modelnet40_test.txt
│  ├── modelnet40_train_8192pts_fps.dat
│  └── modelnet40_test_8192pts_fps.dat

Download: You can download the processed data from the Point-BERT repo, or download the raw data from the official website and process it yourself.
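
To sanity-check a downloaded `.dat` file, it can be opened directly. The sketch below rests on an assumption that this document does not state: that each `*_8192pts_fps.dat` file is a Python pickle holding a `(points, labels)` pair, as written by the Point-BERT preprocessing. `load_modelnet_dat` is a hypothetical helper name. If the file stores NumPy arrays, NumPy must be installed for unpickling to succeed, and pickle files should only be loaded from sources you trust.

```python
import pickle


def load_modelnet_dat(path):
    """Load a processed ModelNet40 file.

    Assumption: the file is a pickled (points, labels) pair with one
    label per point cloud. Only unpickle files from a trusted source.
    """
    with open(path, "rb") as f:
        points, labels = pickle.load(f)
    if len(points) != len(labels):
        raise ValueError(
            f"{path}: {len(points)} point clouds but {len(labels)} labels"
        )
    return points, labels


if __name__ == "__main__":
    pts, lbls = load_modelnet_dat(
        "ReConV2/data/ModelNet/modelnet40_normal_resampled/"
        "modelnet40_test_8192pts_fps.dat"
    )
    print(f"Loaded {len(pts)} samples")
```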

ModelNet Few-shot Dataset:

│ModelNetFewshot/
├──5way10shot/
│  ├── 0.pkl
│  ├── ...
│  ├── 9.pkl
├──5way20shot/
│  ├── ...
├──10way10shot/
│  ├── ...
├──10way20shot/
│  ├── ...

Download: Please download the data from the Point-BERT repo. We use the same data splits.
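
The few-shot layout can be enumerated programmatically to confirm that no fold is missing. This sketch assumes that all four settings contain the same ten folds (`0.pkl` to `9.pkl`) that the tree shows explicitly for `5way10shot`; the function names are hypothetical and not part of the repository.

```python
from pathlib import Path

WAYS = (5, 10)
SHOTS = (10, 20)
NUM_FOLDS = 10  # 0.pkl ... 9.pkl, assumed identical for every setting


def expected_fewshot_files(root):
    """List every fold file expected under the ModelNetFewshot folder."""
    root = Path(root)
    return [
        root / f"{way}way{shot}shot" / f"{fold}.pkl"
        for way in WAYS
        for shot in SHOTS
        for fold in range(NUM_FOLDS)
    ]


def missing_fewshot_files(root):
    """Return the expected fold files that do not exist on disk."""
    return [p for p in expected_fewshot_files(root) if not p.is_file()]


if __name__ == "__main__":
    missing = missing_fewshot_files("ReConV2/data/ModelNetFewshot")
    print(f"{len(missing)} fold files missing")
```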

ScanObjectNN Dataset:

│ScanObjectNN/
├──main_split/
│  ├── training_objectdataset_augmentedrot_scale75.h5
│  ├── test_objectdataset_augmentedrot_scale75.h5
│  ├── training_objectdataset.h5
│  ├── test_objectdataset.h5
├──main_split_nobg/
│  ├── training_objectdataset.h5
│  └── test_objectdataset.h5

Download: Please download the data from the official website.