I aligned with the upstream mlx-lm defaults in this PR, which means both parameters must be specified for the feature to work. Note that the original author's implementation (oobabooga/text-generation-webui#6335) uses a threshold of 0.1.

@jundot I personally think the principle of least astonishment favors the 0.1 default (don't silently fail when the user wants the feature enabled but doesn't set a threshold), but I'll defer to your preference on this one. Feel free to tweak that value yourself if you accept the PR.
On second thought, it doesn't make sense to model this after upstream. An xtc_threshold of 0.0 when xtc_probability > 0 doesn't actually disable the sampler! I had Claude Code verify my analysis:

As such, I consider upstream's default a bug. I'll change our own default and log that bug separately.
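To make the failure mode concrete, here's a plain-Python sketch of XTC's semantics. This is not mlx-lm's implementation; the function name and shape are illustrative, but it follows the rule XTC is built on: when triggered, exclude every token at or above the threshold except the least probable one among them.

```python
import random

def xtc_filter(probs, threshold, probability, rng=random.random):
    """Illustrative XTC (eXclude Top Choices) filter over a token
    probability list. With chance `probability`, remove every token
    whose probability meets `threshold` EXCEPT the least probable
    qualifying one, steering sampling away from the top choices."""
    if probability <= 0.0 or rng() >= probability:
        return probs  # sampler not triggered this step

    # Indices of tokens at or above the threshold, most -> least probable.
    above = sorted(
        (i for i, p in enumerate(probs) if p >= threshold),
        key=lambda i: probs[i],
        reverse=True,
    )
    if len(above) < 2:
        return probs  # nothing to exclude

    filtered = list(probs)
    for i in above[:-1]:  # keep only the least probable qualifying token
        filtered[i] = 0.0
    total = sum(filtered)
    return [p / total for p in filtered]

# With threshold=0.0 every token qualifies, so only the single least
# probable token survives -- the destructive case described above.
probs = [0.6, 0.3, 0.1]
print(xtc_filter(probs, threshold=0.0, probability=1.0, rng=lambda: 0.0))
# -> [0.0, 0.0, 1.0]
```

With any realistic threshold (e.g. 0.1) most of the vocabulary sits below it and survives; with 0.0, nothing does.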
Thread mlx-lm's XTC (eXclude Top Choices) sampling parameters through the full request pipeline. XTC was the only mlx-lm sampler missing from the omlx API surface.

- Add xtc_probability and xtc_threshold fields to SamplingParams dataclass (defaults 0.0 and 0.1 respectively)
- Default xtc_threshold to 0.1 instead of upstream's 0.0 to prevent destructive sampling when only probability is set (upstream threshold=0.0 excludes all tokens except the least probable one)
- Add optional xtc_probability and xtc_threshold to both ChatCompletionRequest and CompletionRequest API models
- Extend get_sampling_params() to resolve XTC values with the same request > default priority as other sampling params
- Thread XTC params through chat_kwargs dicts and direct engine calls across all API endpoints (chat, completion, anthropic messages, responses)
- Extract XTC params from kwargs in BatchedEngine and VLMBatchedEngine SamplingParams construction
- Pass xtc_probability, xtc_threshold, and xtc_special_tokens to both make_sampler() call sites in the scheduler
- Add _get_xtc_special_tokens() helper to Scheduler, delegating to _get_stop_tokens() for EOS coverage and caching the result at init time
- Add 10 new tests covering defaults, passthrough, API model acceptance, and special token derivation

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Signed-off-by: Blightbow <[email protected]>
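As a hedged illustration of the "request > default" resolution the commit message describes (names are assumptions, not omlx's actual signatures, and only the XTC fields are shown):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class SamplingParams:
    # Only the XTC fields are sketched; the real dataclass has more.
    xtc_probability: float = 0.0  # disabled unless explicitly enabled
    xtc_threshold: float = 0.1    # safe default instead of upstream's 0.0

def resolve_xtc(
    request_probability: Optional[float],
    request_threshold: Optional[float],
    defaults: SamplingParams,
) -> Tuple[float, float]:
    """Per-request values win; unset (None) fields fall back to defaults."""
    prob = defaults.xtc_probability if request_probability is None else request_probability
    thresh = defaults.xtc_threshold if request_threshold is None else request_threshold
    return prob, thresh

# A request that enables XTC but omits the threshold still gets 0.1:
print(resolve_xtc(0.5, None, SamplingParams()))  # -> (0.5, 0.1)
```

Checking `is None` rather than truthiness matters here: an explicit request value of 0.0 must be honored, not silently replaced by the default.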
force-pushed from 3a99fbd to 5e71737
force-pushed from 475d3bf to a47ef55
force-pushed from bef2aeb to 86720d8
Reviewed the full diff. Looking good. The safe default for xtc_threshold (0.1 instead of upstream's 0.0) is a nice touch; upstream's 0.0 would nuke sampling if someone set only the probability without thinking about the threshold. _get_xtc_special_tokens() reusing _get_stop_tokens() keeps things clean. All 5 get_sampling_params() call sites are updated, and the tests cover the important paths.

The growing tuple return from get_sampling_params() (now 10 elements) is getting unwieldy, but that's pre-existing tech debt, not something to hold this up for. I'll clean it up separately.

No regression concerns: XTC defaults to disabled (probability=0.0), so existing behavior is untouched.
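For what it's worth, the tuple cleanup could look something like this sketch: replace the positional return with a frozen dataclass so call sites read named fields instead of unpacking ten values. Field names here are assumptions, not omlx's actual API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ResolvedSampling:
    # Illustrative subset; the real return currently has ten elements.
    temperature: float = 1.0
    top_p: float = 1.0
    xtc_probability: float = 0.0
    xtc_threshold: float = 0.1

# Call sites name what they need rather than depending on tuple order,
# so adding an eleventh field later doesn't touch existing unpack sites.
params = ResolvedSampling(xtc_probability=0.5)
print(params.xtc_threshold)  # -> 0.1
```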
@blightbow v0.3.0rc1 is out with your XTC sampler included: https://github.com/jundot/omlx/releases/tag/v0.3.0rc1. If you get a chance, please give it a test and let me know if anything looks off. Thanks!