
feat(stt): WhisperModel logit bias for custom vocabulary#436

Closed
vyomshah05 wants to merge 1 commit into cactus-compute:main from
vyomshah05:feat/whisper-vocab-bias

Conversation

@vyomshah05
Contributor

What

Adds vocab_bias_ infrastructure to WhisperModel in model.h and
implements the bias application inside decode_with_audio.

Why

Part of the custom vocabulary / hotword biasing feature. Closes #396.

When a caller sets a token → boost map via set_vocab_bias(), the decode
loop executes the graph first, converts logits to FP32, applies the
clamped boost values, then samples on CPU. When no bias is set, the
original gb->sample() fast path is completely unchanged.
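The biased path described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function name apply_vocab_bias, the constant kMaxBoost, and the clamp range are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Assumed clamp range; the PR only says boosts are "clamped".
constexpr float kMaxBoost = 10.0f;

// Hypothetical sketch: add a clamped per-token boost to FP32 logits
// after graph execution, before CPU sampling.
void apply_vocab_bias(std::vector<float>& logits,
                      const std::unordered_map<uint32_t, float>& vocab_bias) {
    for (const auto& [token, boost] : vocab_bias) {
        if (token < logits.size()) {
            // Clamp so an extreme boost cannot fully dominate sampling.
            logits[token] += std::clamp(boost, -kMaxBoost, kMaxBoost);
        }
    }
}
```

An empty map would leave the logits untouched, which is why the no-bias case can skip this step entirely and keep the gb->sample() fast path.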

Changes

  • cactus/models/model.h: added vocab_bias_ field and set_vocab_bias()
    public method to WhisperModel
  • decode_with_audio: split the execute+sample step into a fast path (no
    bias) and a manual sampling path (with bias), reusing the existing FP32
    conversion pattern
  • tests/test_stt.cpp: added three new tests:
    • vocab_bias_transcription — verifies no crash and valid output when
      custom_vocabulary is passed through options_json
    • vocab_bias_affects_output — runs the same audio with and without an
      extreme bias and logs both outputs for comparison
    • vocab_bias_direct — bypasses FFI entirely, calls set_vocab_bias()
      directly on the model to prove the engine path works in isolation
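The model.h additions described above might look roughly like this. A sketch only: the field and method names follow the PR description, but the surrounding class members, the base class, and the has_vocab_bias() helper are assumptions.

```cpp
#include <cstdint>
#include <unordered_map>
#include <utility>

// Hypothetical sketch of the WhisperModel additions in cactus/models/model.h.
class WhisperModel /* : public Model */ {
public:
    // An empty map means "no bias": decode_with_audio keeps the fast path.
    void set_vocab_bias(std::unordered_map<uint32_t, float> bias) {
        vocab_bias_ = std::move(bias);
    }

    // Illustrative helper (not named in the PR) for choosing the decode path.
    bool has_vocab_bias() const { return !vocab_bias_.empty(); }

private:
    std::unordered_map<uint32_t, float> vocab_bias_;  // token id -> logit boost
};
```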

Not in scope for this PR

  • MoonshineModel
  • FFI/options_json parsing (needed to connect custom_vocabulary JSON → set_vocab_bias())
  • SDK docs

Testing

cli/cactus test --only stt passes. No regression on existing transcription.

Note: vocab_bias_affects_output currently logs identical outputs because
the FFI JSON parsing layer is not merged yet — vocab_bias_ is only
populated when called directly via set_vocab_bias(). The test is
structured to catch regressions once the FFI wiring lands.

@HenryNdubuaku
Collaborator

HenryNdubuaku commented Feb 24, 2026

@vyomshah05 thanks so much. We want to make sure the solution generalises to all speech models, not just Whisper. @ammesatyajit will dive into this as he's leading the effort at Cactus. The modifications should be here:

uint32_t Model::decode(const std::vector<uint32_t>& tokens, float temperature, float top_p,

- Add vocab_bias_ field and set_vocab_bias() to base Model class in engine.h
- Apply vocab bias in Model::decode() by merging with tool_constrainer bias
- Parse custom_vocabulary and vocabulary_boost from options_json in
  cactus_transcribe.cpp and cactus_stream.cpp
- Tokenize vocabulary words and build token->boost map at decode time
- Add test_vocab_bias_base_class to test_stt.cpp verifying full chain
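The decode-time step in the list above (tokenize vocabulary words, build a token -> boost map) could be wired up roughly as follows. A minimal sketch: build_vocab_bias is a hypothetical name, and the tokenizer is injected as a std::function, whereas in the engine it would be the model's own tokenizer.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Assumed tokenizer interface for this sketch.
using Tokenize = std::function<std::vector<uint32_t>(const std::string&)>;

// Hypothetical sketch: tokenize each custom vocabulary word and
// accumulate a token -> boost map for Model::decode() to apply.
std::unordered_map<uint32_t, float> build_vocab_bias(
        const std::vector<std::string>& words, float boost,
        const Tokenize& tokenize) {
    std::unordered_map<uint32_t, float> bias;
    for (const auto& word : words) {
        for (uint32_t tok : tokenize(word)) {
            // Keep the strongest boost when words share a token.
            auto [it, inserted] = bias.emplace(tok, boost);
            if (!inserted && boost > it->second) it->second = boost;
        }
    }
    return bias;
}
```

Merging with the tool_constrainer bias could then be a plain additive combine over the two maps before sampling.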

Signed-off-by: vyomshah05 <[email protected]>


Development

Successfully merging this pull request may close these issues.

Add custom vocabulary / hotword biasing for transcription

2 participants