SOAR: Confidence-Switched Position Beam Search for Diffusion Language Models

University of Surrey, Qualcomm AI Research, ELLIS Institute Tübingen, Max Planck Institute for Intelligent Systems, Tübingen AI Center
arXiv

Indicates corresponding author

Overview

SOAR is a confidence-switched position beam search decoding strategy for diffusion language models. The core idea is:

  • When high-confidence tokens are present in the sequence, SOAR decodes them directly in parallel.
  • Otherwise, it employs position beam search (PBS) to expand the search space in pursuit of higher-confidence sequences.

Motivation

Accuracy vs Average Confidence

We computed the average confidence of all samples during decoding on Dream-7B and divided the samples into 6 groups, revealing a positive correlation between confidence and accuracy (p = 0.000). This leads us to ask: Can we obtain sequences with higher decoding confidence—and thus better decoding results—by expanding the search space?
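The grouping described above can be reproduced with a simple binning routine. This is an illustrative sketch only; the paper's exact grouping scheme (equal-width vs. equal-count bins) is an assumption here.

```python
from statistics import mean

def accuracy_by_confidence(confidences, correct, n_bins=6):
    """Split samples into equal-width confidence bins and report per-bin
    accuracy. `confidences` holds each sample's average decoding
    confidence; `correct` holds 0/1 correctness labels."""
    lo, hi = min(confidences), max(confidences)
    width = (hi - lo) / n_bins or 1.0
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        # Clamp the top edge into the last bin.
        i = min(int((c - lo) / width), n_bins - 1)
        bins[i].append(ok)
    return [mean(b) if b else None for b in bins]
```

A positive confidence–accuracy correlation would show up as per-bin accuracy rising across the returned list.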

Method Illustration

SOAR Method Illustration

(1) We propose Position Beam Search (PBS), which searches along the position dimension to obtain sequences with higher confidence. We find that while PBS improves generation quality, it slows down inference.
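Searching along the position dimension can be sketched as follows: each beam chooses which masked position to decode next, and beams are ranked by cumulative log-confidence. The callback `score_fn` and all hyperparameters are hypothetical stand-ins, not the paper's interface.

```python
import math

def position_beam_search(score_fn, masked, beam_width=2, steps=3):
    """Minimal position beam search sketch. `score_fn(decoded, pos)` is a
    hypothetical callback returning (token, confidence) for decoding
    position `pos` given the partial assignment `decoded`."""
    beams = [({}, 0.0)]  # (decoded {pos: token}, cumulative log-confidence)
    for _ in range(steps):
        candidates = []
        for decoded, logp in beams:
            for pos in masked:
                if pos in decoded:
                    continue  # this beam already decoded this position
                token, conf = score_fn(decoded, pos)
                nxt = dict(decoded)
                nxt[pos] = token
                candidates.append((nxt, logp + math.log(conf)))
        # Keep only the most confident partial sequences.
        beams = sorted(candidates, key=lambda b: -b[1])[:beam_width]
    return beams[0]
```

Branching over positions multiplies the work per step by roughly the number of candidate positions, which is consistent with the slowdown noted above.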

(2) We further propose SOAR (illustrated above), which performs search only when the model cannot generate high-confidence tokens, and directly employs parallel decoding to accelerate inference when the model is sufficiently confident.

Main Results

Decoding Strategy Impact
  • PBS (single token) achieves quality improvement but slows down decoding by 50%.
  • When combining PBS with parallel decoding (decoding two tokens per step, corresponding to PBS (parallel) in the figure), the quality gains from PBS are offset by the quality degradation caused by parallel decoding.
  • In contrast, SOAR (Ours) dynamically switches between the two modes based on confidence, achieving improved decoding quality without sacrificing decoding speed.

Detail Results

Detailed Results on LLaDA-8B-Base and Dream-7B-Base

Upper: Results on LLaDA-8B-Base; Lower: Results on Dream-7B-Base

Further Analysis

Confidence and AR-ness vs Accuracy
  • We again plot the confidence and accuracy of different methods (left figure): SOAR improves accuracy by searching for sequences with higher average confidence.
  • Interestingly, when we plot the relationship between AR-ness and accuracy (right figure), we find that AR-ness correlates negatively with accuracy. This suggests that SOAR, by expanding the search space, prevents the model from always decoding the tokens closest to the prompt, making the generation order more flexible.

AR-ness: proposed by DiffuCoder, this metric measures whether the generation process follows a left-to-right order; the larger the value, the closer the process is to autoregressive generation.
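One simple proxy for this metric is the fraction of consecutive decoding steps that move left-to-right. This is an illustration in the spirit of DiffuCoder's AR-ness, not its exact definition.

```python
def ar_ness_proxy(decode_order):
    """Fraction of consecutive decoding steps where the next decoded
    position lies to the right of the previous one. 1.0 means fully
    left-to-right (autoregressive-like); 0.0 means fully right-to-left."""
    if len(decode_order) < 2:
        return 1.0
    forward = sum(b > a for a, b in zip(decode_order, decode_order[1:]))
    return forward / (len(decode_order) - 1)
```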

Decoding Path Visualization

Decoding Paths Comparison

Left: Greedy Decoding; Right: SOAR (using the same GSM8K sample). SOAR explores a more confident and flexible decoding path in fewer decoding steps.

BibTeX

@misc{cao2026searchaccelerateconfidenceswitchedposition,
      title={Search or Accelerate: Confidence-Switched Position Beam Search for Diffusion Language Models}, 
      author={Mingyu Cao and Alvaro Correia and Christos Louizos and Shiwei Liu and Lu Yin},
      year={2026},
      eprint={2602.10953},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2602.10953}, 
}