Trans-Diff/TransDiff


TransDiff: Diffusion-Based Method for Manipulating Transparent Objects Using a Single RGB-D Image

ICRA 2025

Haoxiao Wang¹, Kaichen Zhou¹*, Binrui Gu¹, Zhiyuan Feng², Weijie Wang³, Peilin Sun³, Yicheng Xiao⁴, Jianhua Zhang¹, Hao Dong¹*

¹Peking University, ²Tsinghua University, ³Zhejiang University, ⁴Southeast University

*equal contributions, *corresponding author


Installation | Dataset | Training | Testing | Results | Citation




TransDiff Overview
TransDiff is a diffusion-based method for estimating the depth of transparent objects. Guided by cues extracted from the RGB image, such as edges and surface normals, the model refines the depth map step by step through a denoising process. Despite the challenges posed by reflection and refraction, TransDiff produces accurate, material-agnostic depth maps and outperforms existing methods on both synthetic and real-world datasets.
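
To make the idea of refining depth through denoising concrete, here is a minimal sketch of a generic deterministic DDIM-style update taken from the diffusion literature. It is not the TransDiff network: `ddim_step`, `refine_depth`, and the `alphas` schedule are hypothetical names chosen for illustration, and `predict_noise` stands in for the learned model that would be conditioned on RGB features.

```python
import math


def ddim_step(x_t, eps, alpha_t, alpha_prev):
    """One deterministic DDIM update (eta = 0), applied element-wise.

    alpha_t and alpha_prev are cumulative signal levels in (0, 1].
    """
    out = []
    for x, e in zip(x_t, eps):
        # Estimate the clean value implied by the current noise prediction.
        x0_pred = (x - math.sqrt(1.0 - alpha_t) * e) / math.sqrt(alpha_t)
        # Re-noise that estimate to the next, less noisy level.
        out.append(math.sqrt(alpha_prev) * x0_pred
                   + math.sqrt(1.0 - alpha_prev) * e)
    return out


def refine_depth(noisy_depth, predict_noise, alphas):
    """Run the denoising loop.

    alphas is ordered from noisiest to cleanest and should end with 1.0.
    predict_noise(x, alpha_t) is a placeholder for the trained model.
    """
    x = list(noisy_depth)
    for alpha_t, alpha_prev in zip(alphas[:-1], alphas[1:]):
        eps = predict_noise(x, alpha_t)
        x = ddim_step(x, eps, alpha_t, alpha_prev)
    return x
```

With a perfect noise predictor the loop recovers the clean depth values exactly; in practice the quality of the result depends on how well the network predicts the noise at each step.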

Dataset

Download the ClearGrasp dataset from the official ClearGrasp release. It contains RGB-D images of transparent objects for depth completion and manipulation tasks.

Dataset Structure

data/
├── cleargrasp/
│   ├── train/
│   │   ├── rgb/
│   │   ├── depth/
│   │   ├── mask/
│   │   └── init_depth/
│   └── test/
│       ├── rgb/
│       ├── depth/
│       ├── mask/
│       └── init_depth/
└── data_json/
    └── cleargrasp_train_0_1.json
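
Before launching training, it can help to confirm that the folders match the tree above. The following is a small sketch, assuming only the directory names listed in that tree; `missing_dirs` is a hypothetical helper and is not part of the repository.

```python
from pathlib import Path

SPLITS = ("train", "test")
SUBDIRS = ("rgb", "depth", "mask", "init_depth")


def missing_dirs(data_root):
    """Return the expected directories (relative to data_root) that are absent."""
    root = Path(data_root)
    expected = [root / "cleargrasp" / split / sub
                for split in SPLITS for sub in SUBDIRS]
    expected.append(root / "data_json")
    return [p.relative_to(root).as_posix() for p in expected if not p.is_dir()]
```

Calling `missing_dirs("data")` returns an empty list when the layout is complete, and otherwise lists the folders that still need to be created or downloaded.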

Installation

Prerequisites

Our released implementation is tested on:

  • Ubuntu 20.04 / Ubuntu 22.04
  • Python 3.10.x
  • NVIDIA CUDA 12.4
  • 8x NVIDIA RTX 4090 / 8x NVIDIA A100 GPUs

Environment

conda create -n transdiff python=3.10
conda activate transdiff
pip install -r requirements.txt

Training

Quick Start Script

Use the provided script to train on the ClearGrasp dataset:

cd src
chmod +x run_cleargrasp.sh
./run_cleargrasp.sh

Script contents:

#!/bin/bash
python main.py \
    --dir_data DATA_PATH \
    --data_name CLEARGRASP \
    --split_json DATA_JSON_PATH \
    --patch_height 144 --patch_width 256 \
    --gpus 0,1,2,3,4,5,6,7 \
    --loss "1.0*L1+1.0*L2+1.0*DDIM" \
    --epochs 30 \
    --batch_size 64 \
    --max_depth 1.5 \
    --save CLEARGRASP_results \
    --model_name Transdiff_Diffusion \
    --backbone_module swin \
    --backbone_name swin_large_naive_l4w722422k \
    --head_specify DDIMDepthEstimate_Swin_ADDHAHI
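
The `--loss` argument in the script reads as a weighted sum of named terms. How `main.py` actually parses it is not shown here, so the sketch below is only an assumption about the format: `parse_loss_spec` and `combine` are hypothetical helpers that split a string such as `"1.0*L1+1.0*L2+1.0*DDIM"` into (weight, name) pairs and combine per-term values accordingly.

```python
def parse_loss_spec(spec):
    """Split 'w1*NAME1+w2*NAME2+...' into a list of (weight, name) pairs."""
    terms = []
    for part in spec.split("+"):
        weight, name = part.split("*")
        terms.append((float(weight), name.strip()))
    return terms


def combine(spec, values):
    """Weighted sum of per-term loss values, keyed by term name."""
    return sum(weight * values[name] for weight, name in parse_loss_spec(spec))
```

Under this reading, changing the string to something like `"1.0*L1+0.5*DDIM"` would drop the L2 term and halve the weight of the DDIM term.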

Testing

Inference on Test Set

python main.py \
    --dir_data DATA_PATH \
    --data_name CLEARGRASP \
    --split_json DATA_JSON_PATH \
    --patch_height 144 --patch_width 256 \
    --gpus 0 \
    --max_depth 1.5 \
    --batch_size 1 \
    --test_only \
    --pretrain path/to/trained/model.pt \
    --save test_results \
    --save_image \
    --model_name Transdiff_Diffusion \
    --backbone_module swin \
    --backbone_name swin_large_naive_l4w722422k \
    --head_specify DDIMDepthEstimate_Swin_ADDHAHI

Common Installation Issues

1. MMDetection Installation:

# If mmdet installation fails, try:
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
mim install mmdet

2. OpenEXR Installation:

# On Ubuntu/Debian:
sudo apt-get install libopenexr-dev

# On CentOS/RHEL:
sudo yum install OpenEXR-devel

# Using conda:
conda install -c conda-forge openexr-python
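
After applying the fixes above, a quick way to see what is still missing is to probe for the modules without importing them. This is a sketch under the assumption that the packages expose the import names `mmengine`, `mmcv`, `mmdet`, and `OpenEXR`; whether all of them appear in `requirements.txt` is not confirmed here, and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed import names for the packages discussed above.
REQUIRED = ("mmengine", "mmcv", "mmdet", "OpenEXR")


def missing_modules(names=REQUIRED):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Running `missing_modules()` inside the `transdiff` conda environment returns an empty list once every dependency is installed.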

Citation

If you find this work useful in your research, please cite:

@article{wang2025transdiff,
  title={Transdiff: Diffusion-based method for manipulating transparent objects using a single rgb-d image},
  author={Wang, Haoxiao and Zhou, Kaichen and Gu, Binrui and Feng, Zhiyuan and Wang, Weijie and Sun, Peilin and Xiao, Yicheng and Zhang, Jianhua and Dong, Hao},
  journal={arXiv preprint arXiv:2503.12779},
  year={2025}
}
