This repository is the official implementation of TreeSBA.
TreeSBA: Tree-Transformer for Self-Supervised Sequential Brick Assembly
Mengqi Guo, Chen Li, Yuyang Zhao, Gim Hee Lee
Inferring step-wise actions to assemble 3D objects with primitive bricks from images is a challenging task due to complex constraints and the vast number of possible combinations. Recent studies have demonstrated promising results on sequential LEGO brick assembly through the utilization of LEGO-Graph modeling to predict sequential actions. However, existing approaches are class-specific and require significant computational and 3D annotation resources. In this work, we first propose a computationally efficient breadth-first search (BFS) LEGO-Tree structure to model the sequential assembly actions by considering connections between consecutive layers. Based on the LEGO-Tree structure, we then design a class-agnostic tree-transformer framework to predict the sequential assembly actions from the input multi-view images. A major challenge of the sequential brick assembly task is that the step-wise action labels are costly and tedious to obtain in practice. We mitigate this problem by leveraging synthetic-to-real transfer learning. Specifically, our model is first pre-trained on synthetic data with full supervision from the available action labels. We then circumvent the requirement for action labels in the real data by proposing an action-to-silhouette projection that replaces action labels with input image silhouettes for self-supervision. Without any annotation on the real data, our model outperforms existing methods with 3D supervision by 7.8% and 11.3% in mIoU on the MNIST and ModelNet Construction datasets, respectively.
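To make the two key ideas above concrete, below is a minimal, hypothetical Python sketch (not taken from the released code): (a) a breadth-first traversal that orders brick placements layer by layer, in the spirit of the BFS LEGO-Tree, and (b) a naive action-to-silhouette projection that voxelises placed bricks and max-projects them into a binary silhouette for self-supervision. All function and variable names here are illustrative assumptions.

```python
# Illustrative sketch only -- not the repository's implementation.
from collections import deque
import numpy as np

def bfs_action_sequence(bricks, connections, root=0):
    """Order brick indices breadth-first so each brick appears after its
    parent in the previous layer (BFS LEGO-Tree-style traversal)."""
    children = {i: [] for i in range(len(bricks))}
    for parent, child in connections:          # (parent, child) index pairs
        children[parent].append(child)
    order, queue, seen = [], deque([root]), {root}
    while queue:
        node = queue.popleft()
        order.append(node)
        for c in children[node]:
            if c not in seen:
                seen.add(c)
                queue.append(c)
    return order

def actions_to_silhouette(bricks, grid=(32, 32, 32), axis=2):
    """Voxelise placed bricks and max-project along one axis to obtain a
    binary silhouette that can be compared against an input-view silhouette."""
    vox = np.zeros(grid, dtype=bool)
    for (x, y, z, dx, dy, dz) in bricks:       # position + extent of each brick
        vox[x:x+dx, y:y+dy, z:z+dz] = True
    return vox.any(axis=axis)                  # 2D silhouette
```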
Pull repo.
git clone git@github.com:dreamguo/TreeSBA.git
Create conda environment.
conda env create -f environment.yml
conda activate TreeSBA
Download the RAD, MNIST-C, and ModelNet-C data used in the paper here and put them under the ./dataset folder.
Please organize the dataset folder as follows:
├── dataset
│   ├── graph_dat
│   │   ├── random13to18.dat    # RAD-S dataset
│   │   ├── random15to50.dat    # RAD dataset
│   │   ├── random.dat          # RAD-1k dataset
│   │   └── ...
│   ├── tree_actions
│   │   ├── random13to18        # RAD-S dataset
│   │   └── ...
│   ├── tree_dep_actions
│   │   ├── random13to18        # RAD-S dataset
│   │   └── ...
│   ├── voxel
│   │   ├── mnist_all           # MNIST-C dataset
│   │   ├── modelnet_all        # ModelNet-C40 dataset
│   │   └── ...
│   └── voxel_img
│       ├── random13to18        # RAD-S dataset
│       └── ...
├── ...
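As a quick sanity check (a hypothetical helper, not part of this repository), the following Python snippet verifies that the top-level ./dataset subfolders listed above are present:

```python
# Hypothetical helper: check that the expected ./dataset subfolders exist.
from pathlib import Path

EXPECTED = ["graph_dat", "tree_actions", "tree_dep_actions", "voxel", "voxel_img"]

def check_dataset(root="dataset"):
    missing = [d for d in EXPECTED if not (Path(root) / d).is_dir()]
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")

if __name__ == "__main__":
    check_dataset()
```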
Pre-train on RAD dataset.
CUDA_VISIBLE_DEVICES=0 python run.py --config configs/RAD.txt
Pre-train on RAD-S dataset.
CUDA_VISIBLE_DEVICES=0 python run.py --config configs/RAD-S.txt
Fine-tune on MNIST-C dataset.
CUDA_VISIBLE_DEVICES=0 python run.py --config configs/MNIST-C.txt
Fine-tune on ModelNet-C3 dataset.
CUDA_VISIBLE_DEVICES=0 python run.py --config configs/ModelNet-C.txt
We also provide pre-trained checkpoints here; please put them under the ./pretrained_model folder.
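To confirm a downloaded checkpoint loads correctly before running inference, you can use the small sketch below; it assumes the .pt files are standard PyTorch checkpoints (an assumption, not something documented here).

```python
# Hypothetical sanity check: load a checkpoint with PyTorch and inspect it.
# Assumes the .pt files are standard torch-serialized objects (e.g. state dicts).
import torch

ckpt = torch.load("pretrained_model/mnist_all.pt", map_location="cpu")
if isinstance(ckpt, dict):
    print("Checkpoint keys:", list(ckpt.keys())[:10])
else:
    print("Loaded object of type:", type(ckpt))
```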
Test on MNIST-C dataset.
CUDA_VISIBLE_DEVICES=0 python run.py --config configs/MNIST-C.txt --inference 1 --save_obj 1 --load_model_path pretrained_model/mnist_all.pt
Test on ModelNet-C3 dataset.
CUDA_VISIBLE_DEVICES=0 python run.py --config configs/ModelNet-C.txt --inference 1 --save_obj 1 --load_model_path pretrained_model/modelnet_all3.pt
Test on ModelNet-C40 dataset.
CUDA_VISIBLE_DEVICES=0 python run.py --config configs/ModelNet-C.txt --inference 1 --save_obj 1 --load_model_path pretrained_model/modelnet_all40.pt
Generate the RAD dataset.
python prepare_dataset/prepareRAD.py
Our code for generating the LEGO dataset is based on Combinatorial-3D-Shape-Generation.
If you make use of our work, please cite our paper:
@inproceedings{Guo2024TreeSBA,
  author    = {Guo, Mengqi and Li, Chen and Zhao, Yuyang and Lee, Gim Hee},
  title     = {TreeSBA: Tree-Transformer for Self-Supervised Sequential Brick Assembly},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year      = {2024},
}
This work is based on GenerativeLEGO and Combinatorial-3D-Shape-Generation. If you use this code in your research, please also acknowledge their work.