feat: add Canary 1B v2 ONNX engine #55

Merged: cjpais merged 3 commits into cjpais:main from intech:feat/canary-engine on Mar 13, 2026
Conversation

@intech (Contributor) commented Mar 12, 2026

Summary

Closes #54

  • Adds NVIDIA Canary 1B v2 as a new ONNX speech model in src/onnx/canary/
  • Supports 27 languages with transcription and translation to English
  • Three ONNX sessions: preprocessor (mel), encoder, decoder (autoregressive with KV-cache)
  • Follows all v0.3.0 patterns: SpeechModel trait, Quantization, shared session.rs utilities
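
The autoregressive decode with a KV-cache mentioned above can be illustrated with a minimal, dependency-free sketch. This is not the code in src/onnx/canary/decoder.rs: the `step` closure is a hypothetical stand-in for one run of the decoder ONNX session, and `greedy_decode`, `eos_id`, and `max_new_tokens` are illustrative names. The point it shows is that the first step consumes the whole prompt, while later steps feed only the newest token because the cache already holds the earlier state.

```rust
/// Index of the largest logit. Ties resolve to the first occurrence,
/// NaN values are skipped, and an empty slice yields None.
fn argmax(logits: &[f32]) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (i, &v) in logits.iter().enumerate() {
        if v.is_nan() {
            continue;
        }
        let better = match best {
            Some((_, b)) => v > b,
            None => true,
        };
        if better {
            best = Some((i, v));
        }
    }
    best.map(|(i, _)| i)
}

/// Greedy decoding loop. `step` is a hypothetical stand-in for one decoder
/// run: it receives the tokens to feed and returns next-token logits.
/// A real KV-cache decoder would keep its cache inside `step`.
fn greedy_decode<F>(prompt: &[u32], eos_id: u32, max_new_tokens: usize, mut step: F) -> Vec<u32>
where
    F: FnMut(&[u32]) -> Vec<f32>,
{
    let mut generated = Vec::new();
    if prompt.is_empty() {
        return generated;
    }
    // First iteration feeds the full prompt; later ones feed one token.
    let mut input: Vec<u32> = prompt.to_vec();
    for _ in 0..max_new_tokens {
        let logits = step(&input);
        let next = match argmax(&logits) {
            Some(i) => i as u32,
            None => break,
        };
        if next == eos_id {
            break;
        }
        generated.push(next);
        input = vec![next];
    }
    generated
}
```

Passing the model step as a closure keeps the loop testable without a model file; a real implementation would also bound `max_new_tokens` by the decoder's maximum sequence length.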

What's included

| File | Description |
| --- | --- |
| src/onnx/canary/mod.rs | CanaryModel, CanaryParams, SpeechModel impl, CAPABILITIES |
| src/onnx/canary/decoder.rs | Autoregressive KV-cache decode loop, greedy argmax |
| src/onnx/canary/vocab.rs | Vocabulary loading, 9-token prompt building, SentencePiece decoding |
| src/onnx/mod.rs | Added pub mod canary |
| tests/canary.rs | Integration tests (SpeechModel trait + transcribe_with) |
| examples/canary.rs | CLI example with timing, quantization, translation flags |
| Cargo.toml | Added test/example entries, no new dependencies |
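
For the SentencePiece decoding step listed for vocab.rs, a rough sketch of the general technique looks like the following. It is not the PR's implementation: the `decode_tokens` signature here is hypothetical, and treating any piece of the form `<|...|>` as a special token to drop is an assumption. What is standard is that SentencePiece marks word boundaries with U+2581 ("▁"), which is turned back into a space after the pieces are concatenated.

```rust
/// Turn token ids back into text, SentencePiece style.
/// `vocab[id]` is the piece for that id; unknown ids are ignored.
/// Assumption: special tokens look like `<|...|>` and are dropped.
fn decode_tokens(ids: &[u32], vocab: &[String]) -> String {
    let mut text = String::new();
    for &id in ids {
        let piece = match vocab.get(id as usize) {
            Some(p) => p,
            None => continue,
        };
        if piece.starts_with("<|") && piece.ends_with("|>") {
            continue;
        }
        text.push_str(piece);
    }
    // U+2581 is the SentencePiece word-boundary marker.
    text.replace('\u{2581}', " ").trim().to_string()
}
```

With a toy vocabulary of `<|endoftext|>`, `▁Hello`, `,`, `▁wor`, `ld`, the ids `[1, 2, 3, 4, 0]` decode to `Hello, world`.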

Key design decisions

  • Preprocessor always FP32: nemo128.onnx doesn't have quantized variants
  • Encoder/decoder respect Quantization: via resolve_model_path()
  • Translation mapping: TranscribeOptions.translate: true → target_language = "en"
  • No new feature flags: Canary compiles under existing onnx feature
  • Ported from standalone canary-engine crate used in Handy, re-architected for v0.3.0 API
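
The translation mapping in the list above can be sketched as follows. The types are hypothetical stand-ins, not the crate's real definitions: `OptionsSketch` mirrors only the `translate` flag named in the PR plus an assumed optional `language` field, and `LanguagePair` and the `"en"` fallback for a missing source language are assumptions made for illustration.

```rust
/// Hypothetical stand-in for the crate's TranscribeOptions.
#[derive(Debug, Clone, Default)]
struct OptionsSketch {
    /// Source language code, e.g. "de". Assumed field.
    language: Option<String>,
    /// When true, output should be English.
    translate: bool,
}

/// Hypothetical holder for the languages placed in the decoder prompt.
#[derive(Debug, Clone, PartialEq, Eq)]
struct LanguagePair {
    source: String,
    target: String,
}

/// translate == true forces the target to "en"; otherwise the target
/// equals the source, which means plain transcription.
fn resolve_language_pair(opts: &OptionsSketch) -> LanguagePair {
    // Assumption: fall back to English when no source language is given.
    let source = opts.language.clone().unwrap_or_else(|| "en".to_string());
    let target = if opts.translate {
        "en".to_string()
    } else {
        source.clone()
    };
    LanguagePair { source, target }
}
```

Deriving the target from a single boolean keeps the generic options struct free of Canary-specific fields, at the cost of only exposing translation into English.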

Test plan

  • cargo check --features onnx — compiles
  • cargo test --features onnx --lib — 3 unit tests pass (argmax, vocab, decode_tokens)
  • cargo clippy --features onnx — no warnings in canary module
  • cargo fmt --check — formatted
  • Tested with real Canary 1B v2 model in local Handy build (daily use)

🤖 Generated with Claude Code

Port NVIDIA Canary 1B v2 speech model as a new ONNX engine, supporting
27 languages with transcription and translation capabilities.

New files:
- src/onnx/canary/mod.rs — CanaryModel, CanaryParams, SpeechModel impl
- src/onnx/canary/decoder.rs — autoregressive KV-cache decode loop
- src/onnx/canary/vocab.rs — vocabulary loading and prompt building
- tests/canary.rs — integration tests
- examples/canary.rs — CLI usage example

Co-Authored-By: Claude Opus 4.6 <[email protected]>

@cjpais (Owner) commented Mar 13, 2026

I tested this and it works. Going to review it a bit further and pull in. Thank you.

@cjpais (Owner) commented Mar 13, 2026

Thank you. I tested this and added some code and cleaned some things up a bit, but I think it's good to go.

@cjpais cjpais merged commit 4704c0e into cjpais:main Mar 13, 2026
Development

Successfully merging this pull request may close these issues.

Add NVIDIA Canary 1B v2 as ONNX engine
