DAWN is a training-free, dependency-aware decoding method for fast dLLM inference.
DAWN leverages a dependency graph to select more reliable unmasking positions at each iteration, achieving high parallelism with negligible loss in generation quality.
- Mitigates non-independent predictions by modeling inter-position dependencies.
- Training-free and plug-and-play, improving the quality-speed trade-off.
- Fast inference support for the Dream and LLaDA models.
- Implementations of multiple baseline methods.
- Full evaluation suite provided.
DAWN is composed of three main modules: Dependency Graph Construction, Anchor-Guided Decoding, and Conflict-Based Scheduling.
- **Dependency Graph Construction** extracts a lightweight proxy of token dependencies from the model's attention maps and builds a sparse directed dependency graph. It mitigates attention-sink bias by filtering positions with abnormal incoming attention mass, then retains only salient high-score attention links to capture meaningful couplings between positions for downstream scheduling.
- **Anchor-Guided Decoding** first selects high-confidence masked positions that are likely safe to unmask in parallel, then uses previously committed high-confidence positions as anchors to relax the confidence requirement for their dependent (induced) positions. This expands safe parallelism beyond conservative thresholding by leveraging the reliable context provided by anchors.
- **Conflict-Based Scheduling** prevents error-prone joint updates by explicitly avoiding strongly coupled positions among the remaining candidates under a lower confidence threshold. Using the dependency graph to define conflicts, it greedily constructs a large non-conflicting update set (an independent set), enabling additional parallel unmasking while reducing inconsistencies caused by non-independent position predictions.
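The three stages above can be sketched in a few dozen lines. This is an illustrative NumPy toy, not the repository's actual implementation: the function names, the thresholds (`tau_hi`, `tau_anchor`, `tau_lo`), and the assumption of a single head-averaged `[L, L]` attention matrix are all simplifications made here for clarity.

```python
import numpy as np

def build_dependency_graph(attn, top_k=2, sink_quantile=0.95):
    """Build a sparse directed dependency graph from an attention map (sketch).

    `attn` is an [L, L] matrix of attention weights, assumed already averaged
    over heads/layers. Positions whose incoming attention mass is abnormally
    high are treated as attention sinks and filtered out; for each position,
    only its top-k strongest remaining links survive.
    """
    incoming = attn.sum(axis=0)
    sink = incoming > np.quantile(incoming, sink_quantile)  # attention-sink filter
    edges = set()
    for i in range(attn.shape[0]):
        for j in np.argsort(attn[i])[::-1][:top_k]:  # keep only salient links
            if j != i and not sink[j]:
                edges.add((i, int(j)))  # edge (i, j): position i depends on j
    return edges

def select_update_set(conf, masked, edges, committed,
                      tau_hi=0.9, tau_anchor=0.7, tau_lo=0.5):
    """Choose which masked positions to unmask in parallel at one step."""
    # 1) High-confidence positions are safe to unmask directly.
    selected = {i for i in masked if conf[i] >= tau_hi}
    # 2) Anchor-guided relaxation: a position that depends on a reliable
    #    anchor may be unmasked under a lower confidence threshold.
    anchors = selected | committed
    for i in masked - selected:
        if conf[i] >= tau_anchor and any((i, a) in edges for a in anchors):
            selected.add(i)
    # 3) Conflict-based scheduling: greedily grow a non-conflicting set from
    #    the remaining candidates -- no dependency edge, in either direction,
    #    to anything already selected.
    for i in sorted(masked - selected, key=lambda p: -conf[p]):
        if conf[i] >= tau_lo and all((i, j) not in edges and (j, i) not in edges
                                     for j in selected):
            selected.add(i)
    return selected
```

On a toy attention map where position 2 is coupled to an anchor through the graph, step 3 defers position 2 even if its confidence clears the low threshold, which is exactly the kind of error-prone joint update the scheduler is meant to avoid.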
Install the dependencies:

```bash
pip install -r requirements.txt
pip install -r requirements-lock.txt
```

We provide evaluation scripts for the main experiments, so the results can be reproduced directly. For example:

```bash
cd llada
bash eval_instruct.sh
```

The main experiments are conducted on an NVIDIA H100 80GB GPU. DAWN exhibits strong efficiency across multiple models and benchmarks.
If DAWN helps your research, please consider citing it:
```bibtex
@misc{dawn,
  title={DAWN: Dependency-Aware Fast Inference for Diffusion LLMs},
  author={Lizhuo Luo and Zhuoran Shi and Jiajun Luo and Zhi Wang and Shen Ren and Wenya Wang and Tianwei Zhang},
  year={2026},
  eprint={2602.06953},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2602.06953},
}
```

We would like to thank the authors of LLaDA, Dream, and Fast-dLLM for their excellent work and open-source contributions.


