Our datasets can be accessed on Hugging Face or through the download links below.
CADCODER/GenCAD-Code: A dataset of 163k images of CAD models paired with CadQuery Python code. This dataset is derived from the DeepCAD dataset.
CADCODER/real_photo_test: A dataset of 400 images of 3D printed CAD objects from the test subset of the DeepCAD dataset.
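As a convenience, the two datasets above can be pulled with the Hugging Face `datasets` library. The following is a minimal sketch, not part of this repository: the helper name `load_cadcoder_dataset` is hypothetical, and no split or column names are assumed, since they are not documented here.

```python
# Repository IDs as listed above.
DATASET_IDS = ("CADCODER/GenCAD-Code", "CADCODER/real_photo_test")


def load_cadcoder_dataset(repo_id="CADCODER/GenCAD-Code", split=None, streaming=True):
    """Load one of the CADCODER datasets from the Hugging Face Hub.

    Hypothetical helper. With split=None, every available split is returned,
    so no split name has to be guessed. streaming=True avoids downloading
    the full dataset before the first example can be inspected.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id, split=split, streaming=streaming)
```

Calling `load_cadcoder_dataset()` requires network access and `pip install datasets`. Printing the returned object should show which splits exist before you rely on any of them.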
We provide the scripts we used to create our datasets below, in case you'd like to adapt them to your own needs.
- Download the DeepCAD vectors from this link into the deepcad_derived directory and unzip.
- Create the necessary environment using:
conda env create -f environment.yml
- With the environment activated, run the following to convert the .h5 vector files into Python CadQuery files. It should take ~2 minutes for all files to generate.
python scripts/h5tocadquery.py
- Download the rendered CAD images from the GenCAD dataset using this link. Download the zip into the deepcad_derived/data directory and unzip. TODO: provide actual script to generate these images.
- Merge all the components of the dataset and upload it to Hugging Face using the following:
python scripts/gencadcode_to_hf.py
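If you adapt the merge step to your own data, the core of it is pairing each rendered image with its generated CadQuery file. The sketch below shows one way this might look. It is not the logic of scripts/gencadcode_to_hf.py: the function name, the `.png` and `.py` extensions, the record keys, and the assumption that an image and its script share a file stem are all illustrative.

```python
from pathlib import Path


def pair_images_with_code(image_dir, code_dir, image_ext=".png", code_ext=".py"):
    """Return one record per model that has both a rendered image and a script.

    Hypothetical helper: assumes an image and its CadQuery file share the
    same file stem (for example 0001.png and 0001.py).
    """
    # Index the generated CadQuery scripts by stem for constant-time lookup.
    code_by_stem = {p.stem: p for p in Path(code_dir).rglob(f"*{code_ext}")}

    records = []
    for image_path in sorted(Path(image_dir).rglob(f"*{image_ext}")):
        code_path = code_by_stem.get(image_path.stem)
        if code_path is None:
            continue  # skip images that have no matching script
        records.append(
            {
                "id": image_path.stem,
                "image": str(image_path),
                "cadquery": code_path.read_text(encoding="utf-8"),
            }
        )
    return records
```

The resulting list can be turned into a dataset with `datasets.Dataset.from_list(records)` and uploaded with its `push_to_hub` method, which requires being logged in to the Hub.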
First, load HEICs into the real_photo_test_set/heics directory. Then, run the following to convert HEICs into PNGs:
python scripts/process_heic.py --heic_dir real_photo_test_set/heics --save_dir real_photo_test_set/pngs
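For readers who want to adapt the conversion, a HEIC-to-PNG pass can be sketched as follows. This is an illustration and not the contents of scripts/process_heic.py. It assumes the third-party `pillow-heif` package is used to let Pillow read HEIC files, and the function names are hypothetical.

```python
from pathlib import Path


def png_path_for(heic_path, save_dir):
    """Map an input HEIC path to its output PNG path (same stem, .png suffix)."""
    return Path(save_dir) / (Path(heic_path).stem + ".png")


def convert_heics(heic_dir, save_dir):
    """Convert every HEIC/HEIF file in heic_dir to a PNG in save_dir."""
    # Lazy imports: Pillow and pillow-heif are third-party packages.
    from PIL import Image
    from pillow_heif import register_heif_opener

    register_heif_opener()  # registers HEIC/HEIF support with Pillow

    save_dir = Path(save_dir)
    save_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for heic_path in sorted(Path(heic_dir).iterdir()):
        if heic_path.suffix.lower() not in {".heic", ".heif"}:
            continue  # ignore anything that is not a HEIC/HEIF image
        out_path = png_path_for(heic_path, save_dir)
        with Image.open(heic_path) as img:
            img.save(out_path, format="PNG")
        written.append(out_path)
    return written
```

With the directories used above, this would be called as `convert_heics("real_photo_test_set/heics", "real_photo_test_set/pngs")`.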
Next, run the script to push the dataset to Hugging Face:
python scripts/upload_realphoto_to_hf.py