Yicun Yang1 *, Cong Wang1, Shaobo Wang1, Zichen Wen1, Biqing Qi2, Hanlin Xu3, Linfeng Zhang1 †
1Shanghai Jiao Tong University, 2Shanghai AI Lab, 3Huawei
- Native Variable-Length Generation: Guided by the [EOS] token, dLLM-Var produces arbitrary-length outputs without a fixed generation-length hyperparameter.
- High Parallelism: Inherits the bidirectional attention of diffusion LLMs (dLLMs), supporting blockwise diffusion inference.
- KV Cache Compatible: Seamlessly reuses the KV cache, avoiding complex designs and improving efficiency.
Figure 1: The evolution of probabilistic modeling paradigms for text generation. From autoregressive (AR) to diffusion-based methods. dLLM-Var achieves variable-length generation while maintaining high parallelism.
- Python 3.12
- PyTorch 2.5+ (FP8 mixed precision is supported on H-series GPUs)
```bash
# Clone the repository
git clone https://github.com/maomaocun/dLLM-Var.git
cd dLLM-Var

# Install dependencies
bash install.sh
```

Run the demo:

```bash
python demo_dLLM-var.py
```

To run evaluation, first adjust the environment variables in the script, then:

```bash
cd ./evaluation
bash run_batch.sh
```
To prepare training data, tokenize your raw JSONL files:

```bash
cd datset
python transfer_text2token.py \
    --input_dir "path/to/your/input/jsonl/folder" \
    --output_file "path/to/your/output/tokenized.jsonl" \
    --tokenizer_model "path/to/your/LLaDA-8B-Base"
```

For the detailed dataset format, see `./sft_training/data/dataset.py`.
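As a hypothetical illustration of what this conversion step does, the sketch below tokenizes one JSONL record. The field names (`"text"`, `"input_ids"`) and the tokenizer interface are assumptions; the authoritative schema is in `./sft_training/data/dataset.py`:

```python
# Hypothetical sketch of the text -> token JSONL conversion step.
# Field names and the tokenizer interface are assumptions.
import json
from typing import Callable, List

def tokenize_jsonl_line(line: str, encode: Callable[[str], List[int]]) -> str:
    """Convert one raw-text JSONL record into a tokenized record."""
    record = json.loads(line)
    return json.dumps({"input_ids": encode(record["text"])})

# Toy "tokenizer": assigns one id per distinct whitespace-separated word.
vocab: dict = {}
def toy_encode(text: str) -> List[int]:
    return [vocab.setdefault(w, len(vocab)) for w in text.split()]

sample = json.dumps({"text": "hello diffusion world"})
result = tokenize_jsonl_line(sample, toy_encode)
print(result)  # {"input_ids": [0, 1, 2]}
```

In the real pipeline the toy tokenizer would be replaced by the LLaDA-8B-Base tokenizer passed via `--tokenizer_model`.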
Training uses DeepSpeed ZeRO-2 and supports multi-GPU setups. Example:

```bash
cd ./sft_training
bash run_gpus_fp8.sh
```

For the detailed training configuration, see `./sft_training/config/sft/default_config.yaml`.
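For orientation, a minimal DeepSpeed ZeRO-2 config fragment might look like the following. The values here are illustrative placeholders; the actual settings live in `./sft_training/config/sft/default_config.yaml`:

```json
{
  "train_micro_batch_size_per_gpu": 4,
  "gradient_accumulation_steps": 8,
  "zero_optimization": {
    "stage": 2,
    "overlap_comm": true,
    "contiguous_gradients": true
  },
  "bf16": { "enabled": true }
}
```

Stage 2 shards optimizer states and gradients across GPUs while keeping a full copy of the parameters on each rank, which fits multi-GPU fine-tuning of an 8B-scale model.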
If you find this work useful, please cite:
```bibtex
@misc{yang2025diffusionllmnativevariable,
  title={Diffusion LLM with Native Variable Generation Lengths: Let [EOS] Lead the Way},
  author={Yicun Yang and Cong Wang and Shaobo Wang and Zichen Wen and Biqing Qi and Hanlin Xu and Linfeng Zhang},
  year={2025},
  eprint={2510.24605},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2510.24605},
}
```
MIT License.
- Project Lead: [email protected]
- Corresponding Author: [email protected]
