Official implementation of "Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation"

Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation

Shaocong Xu, Songlin Wei, Qizhe Wei, Zheng Geng, Hong Li, Licheng Shen, Qianpu Sun, Shu Han, Bin Ma, Bohan Li, Chongjie Ye, Yuhang Zheng, Nan Wang, Saining Zhang, and Hao Zhao

🌟 Takeaways

DKT is a foundation model for video depth and normal estimation that handles transparent objects 🫙, in-the-wild scenes 🌎, and arbitrary-length videos ⏳. It supports downstream applications such as robot manipulation and policy learning.

[Figure: teaser]

✨ News

  • [25-12-04] 🔥🔥🔥 DKT is now released. Have fun!

🤗 Pretrained Models

Our pretrained models are available on the Hugging Face Hub:

| Version | Hugging Face Model |
| --- | --- |
| DKT-Depth-1-3B | DKT-Depth-1-3B-v1.1 |
| DKT-Depth-14B | DKT-Depth-14B |
| DKT-Normal-14B | DKT-Normal-14B |

📦 Installation

Run the following commands to clone the repository and install its dependencies:

git clone https://github.com/Daniellli/DKT.git
cd DKT
pip install -r requirements.txt
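
After installation, a quick sanity check can confirm that the packages imported in the Usage section resolve. The sketch below is not part of the repository: the helper name `missing_modules` is hypothetical, and it assumes the script is run from the repository root so that the top-level `dkt` and `tools` packages are on the import path.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # `dkt` and `tools` are the top-level packages imported in the Usage section.
    missing = missing_modules(["dkt", "tools"])
    if missing:
        print("Not importable from this directory:", ", ".join(missing))
    else:
        print("dkt and tools are importable.")
```

This only checks that the modules can be located; it does not load model weights or verify that every dependency in requirements.txt is installed.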

🤖 Gradio Demo

  • Online demo: DKT
  • Local demo:
python app.py

💡 Usage

import os

from dkt.pipelines.pipelines import DKTPipeline
from tools.common_utils import save_video

# Build the pipeline.
pipe = DKTPipeline()

# Run inference on an example video.
# Set vis_pc to `True` to also obtain the estimated point cloud.
demo_path = 'examples/1.mp4'
prediction = pipe(demo_path, vis_pc=False)

# Save the colored depth map as a video.
save_dir = 'logs'
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, 'demo.mp4')
save_video(prediction['colored_depth_map'], output_path, fps=25)
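
To process several videos, the same calls can be wrapped in a loop. The following is a minimal sketch rather than an API provided by DKT: `run_on_videos` is a hypothetical helper, and the `_depth.mp4` output naming is an assumption. It relies only on the call pattern shown above, namely `pipe(path, vis_pc=False)` returning a dict with a `'colored_depth_map'` entry, and `save_video(frames, output_path, fps=...)`.

```python
import os
from typing import Callable, Dict, Iterable


def run_on_videos(pipe: Callable, video_paths: Iterable[str], save_fn: Callable,
                  save_dir: str = "logs", fps: int = 25) -> Dict[str, str]:
    """Run `pipe` on each video and save its colored depth map.

    Returns a mapping from each input path to the path of the saved video.
    """
    os.makedirs(save_dir, exist_ok=True)
    outputs = {}
    for path in video_paths:
        # Same call as in the single-video example above.
        prediction = pipe(path, vis_pc=False)
        # Name the output after the input file (hypothetical naming scheme).
        stem = os.path.splitext(os.path.basename(path))[0]
        output_path = os.path.join(save_dir, f"{stem}_depth.mp4")
        save_fn(prediction["colored_depth_map"], output_path, fps=fps)
        outputs[path] = output_path
    return outputs
```

With the real pipeline, this would be called as `run_on_videos(DKTPipeline(), ['examples/1.mp4'], save_video)`. Passing the pipeline and save function as arguments means the pipeline is constructed once and reused across videos.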



📜 Citation

@article{dkt2025,
  title   = {Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation},
  author  = {Shaocong Xu and Songlin Wei and Qizhe Wei and Zheng Geng and Hong Li and Licheng Shen and Qianpu Sun and Shu Han and Bin Ma and Bohan Li and Chongjie Ye and Yuhang Zheng and Nan Wang and Saining Zhang and Hao Zhao},
  journal = {arXiv preprint arXiv:2512.23705},
  url     = {https://arxiv.org/abs/2512.23705},
  year    = {2025}
}

💗 Acknowledgements

Our code builds on several excellent recent works, including MoGe, WAN, and DiffSynth-Studio. We sincerely thank their authors for their contributions.

📧 Contact

If you have any questions, please feel free to contact Shaocong Xu (daniellesry at gmail.com).
