- [2025-10] We have released the official codebase!
- [2025-10] The technical report is released!
- [2025-09] This work has been accepted to NeurIPS 2025!
We revisit the decoding problem in autoregressive (AR) image generation and address sampling inefficiency by leveraging the spatial entropy of token distributions.
Built on existing AR models (LlamaGen, Lumina-mGPT, Meissonic & STAR), we improve image fidelity and alignment across multiple benchmarks without increasing computational cost.
Furthermore, we achieve ~15% faster generation on top of an existing speculative decoding method (SJD).
The effectiveness and generalization of the approach are further validated on diverse AR architectures, demonstrating significant gains in both quality and efficiency for mask-based and scale-wise models.
CLICK for Detailed Introduction
In this work, we first revisit the sampling issues in current autoregressive (AR) image generation models and identify that image tokens, unlike text tokens, exhibit lower information density and non-uniform spatial distribution. Accordingly, we present an entropy-informed decoding strategy that facilitates higher autoregressive generation quality with faster synthesis speed. Specifically, the proposed method introduces two main innovations:
- Dynamic temperature control guided by spatial entropy of token distributions, enhancing the balance between content diversity, alignment accuracy, and structural coherence in both mask-based and scale-wise models without extra computational overhead.
- Entropy-aware acceptance rules in speculative decoding, achieving near-lossless generation at about 85% of the inference cost of conventional acceleration methods.
Extensive experiments across multiple benchmarks using diverse AR image generation models demonstrate the effectiveness and generalizability of our approach in enhancing both generation quality and sampling speed.
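To make the two ideas above more concrete, here is a minimal, illustrative sketch in plain Python. It is not the code from this repository: the function names, the linear entropy-to-temperature map, its endpoint values, the direction of the mapping, and the exponent-based relaxation of the acceptance probability are all assumptions chosen for illustration. Only the Shannon entropy, the softmax, and the standard speculative acceptance ratio `min(1, p_target / p_draft)` are standard definitions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of a list of logits at a given temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def normalized_entropy(logits):
    """Entropy of softmax(logits) divided by its maximum, log(vocab size)."""
    h = entropy(softmax(logits))
    return min(max(h / math.log(len(logits)), 0.0), 1.0)


def entropy_to_temperature(h_norm, t_at_low_entropy=0.5, t_at_high_entropy=1.0):
    """Hypothetical linear map from normalized entropy to a temperature.

    The endpoints (and therefore the direction of the map) are placeholders,
    not values taken from the paper.
    """
    return t_at_low_entropy + (t_at_high_entropy - t_at_low_entropy) * h_norm


def dynamic_temperature_probs(logits, t_at_low_entropy=0.5, t_at_high_entropy=1.0):
    """Re-scale one token position's logits with an entropy-dependent temperature."""
    t = entropy_to_temperature(
        normalized_entropy(logits), t_at_low_entropy, t_at_high_entropy
    )
    return softmax(logits, t), t


def entropy_aware_accept_prob(p_target, p_draft, h_norm, relax=0.5):
    """Hypothetical entropy-aware acceptance probability for a drafted token.

    With h_norm == 0 this is the standard rule min(1, p_target / p_draft).
    As h_norm grows, the ratio is raised to a smaller exponent, which makes
    acceptance more lenient (an assumption made for this sketch only).
    """
    if p_draft <= 0.0:
        return 1.0
    ratio = p_target / p_draft
    if ratio >= 1.0:
        return 1.0
    return ratio ** (1.0 - relax * h_norm)


# Usage: a peaked position and a flat position receive different temperatures.
peaked_probs, peaked_t = dynamic_temperature_probs([10.0, 0.0, 0.0, 0.0])
flat_probs, flat_t = dynamic_temperature_probs([0.0, 0.0, 0.0, 0.0])
```

Because the temperature is computed from the logits the model already produces at each position, a scheme of this shape adds only a softmax and an entropy evaluation per token, which is in line with the "no extra computational overhead" claim above.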
Clone the repository (currently only the inference code for LlamaGen is provided; code for the other models will be released soon!):
git clone https://github.com/krennic999/ARsample.git
cd LlamaGen
Download the checkpoints as described in the LlamaGen repository, then cd into ./autoregressive/sample and run:
torchrun --nnodes=1 --nproc_per_node=8 --node_rank=0 --master_port=29500 sample_t2i_coco.py --save_root your-save-root
Set --enable_entropy_filtering=True during inference for entropy-aware sampling.
We thank LlamaGen, Lumina-mGPT, Meissonic & STAR for their great work and trained models, and SJD for the algorithm.
@misc{ma2025betterfasterautoregressive,
title={Towards Better & Faster Autoregressive Image Generation: From the Perspective of Entropy},
author={Xiaoxiao Ma and Feng Zhao and Pengyang Ling and Haibo Qiu and Zhixiang Wei and Hu Yu and Jie Huang and Zhixiong Zeng and Lin Ma},
year={2025},
eprint={2510.09012},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2510.09012},
}


