DC-AE 1.5: Accelerating Diffusion Model Convergence with Structured Latent Space

Junyu Chen†, Dongyun Zou, Wenkun He, Junsong Chen, Enze Xie, Song Han, Han Cai†
MIT, NVIDIA
(* indicates equal contribution)

Abstract

We present DC-AE 1.5, a new family of deep compression autoencoders for high-resolution diffusion models. Increasing an autoencoder's latent channel count is a highly effective way to improve its reconstruction quality. However, it slows the convergence of diffusion models, so generation quality gets worse even though reconstruction quality gets better. This issue limits the quality upper bound of latent diffusion models and hinders the adoption of autoencoders with higher spatial compression ratios. We introduce two key innovations to address this challenge: i) Structured Latent Space, a training-based approach that imposes a channel-wise structure on the latent space, with the front latent channels capturing object structures and the latter latent channels capturing image details; ii) Augmented Diffusion Training, a training strategy that adds diffusion training objectives on the object latent channels to accelerate convergence. With these techniques, DC-AE 1.5 delivers faster convergence and better diffusion scaling results than DC-AE. On ImageNet 512x512, DC-AE-1.5-f64c128 delivers better image generation quality than DC-AE-f32c32 while being 4x faster.
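
The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the channel structure is imposed or how the extra objectives are weighted. The sketch assumes that "front channels" means the first `keep` channels of a `(C, H, W)` latent, that the latter channels are zeroed by a mask, and that the extra objective is a mean squared error restricted to the front channels. The names `front_channel_mask`, `masked_mse`, and `channel_choices` are hypothetical.

```python
import numpy as np


def front_channel_mask(num_channels: int, keep: int) -> np.ndarray:
    """Return a (C, 1, 1) mask: 1 for the first `keep` channels, 0 for the rest."""
    if not 0 < keep <= num_channels:
        raise ValueError("keep must be in [1, num_channels]")
    mask = np.zeros((num_channels, 1, 1), dtype=np.float32)
    mask[:keep] = 1.0
    return mask


def masked_mse(pred: np.ndarray, target: np.ndarray, keep: int) -> float:
    """Mean squared error over the first `keep` channels of (C, H, W) arrays."""
    mask = front_channel_mask(pred.shape[0], keep)
    sq_err = mask * (pred - target) ** 2
    # Normalize by the number of unmasked elements, not the full tensor size.
    n_active = keep * pred.shape[1] * pred.shape[2]
    return float(sq_err.sum() / n_active)


rng = np.random.default_rng(0)
latent = rng.standard_normal((128, 8, 8)).astype(np.float32)

# Assumption: a channel count is sampled per training step from a fixed set.
channel_choices = [16, 32, 64, 128]
keep = int(rng.choice(channel_choices))

# Keep only the front channels; the latter channels are zeroed out.
partial_latent = latent * front_channel_mask(latent.shape[0], keep)
```

In this sketch, a decoder trained on `partial_latent` would have to reconstruct from the front channels alone, and a diffusion model trained with `masked_mse` would receive a loss signal only from those channels. Both are readings of the abstract under the assumptions stated above.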

DC-AE 1.5 can reconstruct images from partial latent channels. The front channels focus on object structures and the latter channels add details, forming a structured latent space.

DC-AE Using 16 / 128 Channels

DC-AE 1.5 Using 16 / 128 Channels
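
To make the model names and the "16 / 128 channels" label concrete, the following short sketch computes latent shapes. It assumes the naming convention `f{X}c{Y}` means a spatial compression ratio of X per side and Y latent channels, and that images are square; the helper `latent_shape` is hypothetical.

```python
def latent_shape(image_size: int, spatial_ratio: int, channels: int) -> tuple[int, int, int]:
    """(C, H, W) of the latent for a square image, assuming f{ratio}c{channels} naming."""
    if image_size % spatial_ratio != 0:
        raise ValueError("image_size must be divisible by spatial_ratio")
    side = image_size // spatial_ratio
    return (channels, side, side)


for name, ratio, channels in [("f32c32", 32, 32), ("f64c128", 64, 128)]:
    c, h, w = latent_shape(512, ratio, channels)
    print(f"{name}: shape={(c, h, w)}, spatial positions={h * w}, values={c * h * w}")

# Fraction of an f64c128 latent kept when only the front 16 channels are used.
print(f"front 16 of 128 channels = {16 / 128:.1%} of the latent")
```

Under these assumptions, a 512x512 image maps to a 32x16x16 latent with f32c32 and a 128x8x8 latent with f64c128. Both hold 8192 values, but f64c128 has 64 spatial positions instead of 256, which is 4x fewer. Using the front 16 of 128 channels keeps 12.5% of the latent values.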

Citation

@article{chen2025dc,
 title={DC-AE 1.5: Accelerating Diffusion Model Convergence with Structured Latent Space},
 author={Chen, Junyu and Zou, Dongyun and He, Wenkun and Chen, Junsong and Xie, Enze and Han, Song and Cai, Han},
 journal={arXiv preprint arXiv:2508.00413},
 year={2025}
}
