
feat: add GigaAM v3 for Russian speech recognition #913

Merged
cjpais merged 3 commits into cjpais:main from pantafive:feat/gigaam on Mar 1, 2026


@pantafive
Contributor

Adds the GigaAM v3 e2e_ctc engine: Russian speech recognition with punctuation, Latin characters, and digits. It uses an int8-quantized ONNX model (225 MB) and a BPE tokenizer with 257 subword tokens.
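
For readers unfamiliar with CTC models, here is a minimal sketch of how greedy CTC decoding over subword tokens typically works. It is not the transcribe-rs implementation: the function names, the `blank_id` parameter, and the SentencePiece-style `▁` word-boundary marker are assumptions made for illustration only.

```rust
/// Greedy CTC decoding (illustrative sketch, not the transcribe-rs code).
/// For each frame take the highest-scoring token, collapse consecutive
/// repeats, and drop the blank token. `blank_id` is a hypothetical
/// parameter; the real model's blank index is not stated in this PR.
fn ctc_greedy_decode(logits: &[Vec<f32>], blank_id: usize) -> Vec<usize> {
    let mut out = Vec::new();
    let mut prev: Option<usize> = None;
    for frame in logits {
        // Argmax over the vocabulary dimension of this frame.
        let best = frame
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.total_cmp(b.1))
            .map(|(i, _)| i);
        if let Some(best) = best {
            // Emit only when the token changes and is not the blank.
            if Some(best) != prev && best != blank_id {
                out.push(best);
            }
            prev = Some(best);
        }
    }
    out
}

/// Joins subword pieces into text. Assumes a SentencePiece-style
/// vocabulary where '▁' marks the start of a word (an assumption).
fn tokens_to_text(ids: &[usize], vocab: &[&str]) -> String {
    let joined: String = ids
        .iter()
        .filter_map(|&i| vocab.get(i).copied())
        .collect();
    joined.replace('▁', " ").trim().to_string()
}
```

Because the output is a single argmax pass with no language model or beam search, decoding cost stays small relative to running the acoustic model itself. A token repeated across frames is emitted twice only when a blank separates the two runs.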

The model is currently downloaded from HuggingFace (istupakov/gigaam-v3-onnx). It needs to be mirrored to blob.handy.computer to be consistent with other models.

@cjpais
Owner

cjpais commented Feb 28, 2026

Can you add GigaAM to transcribe-rs first? Then I will pull that in.

@pantafive
Contributor Author

Added GigaAM as a proper engine in transcribe-rs: cjpais/transcribe-rs#45

Once that's merged and published, I'll update this PR to use the crate feature instead of the standalone module.

@cjpais
Owner

cjpais commented Mar 1, 2026

Thank you so much @pantafive, amazing to see how much support we have for this in just one day! I've released transcribe-rs 0.2.7 with support :) I've also uploaded the model to the Handy blob site at https://blob.handy.computer/giga-am-v3.int8.onnx

My only comment is that the description section should make it very clear the model is for Russian. Also, if you have an opinion on testing, it would be helpful to hear it. Is it the best model for Russian speech you've tested?

@pantafive
Contributor Author

Thanks for the release and the CDN upload!

Updated the PR — now uses transcribe-rs 0.2.7 with the gigaam feature, the standalone module is removed. Model URL points to blob.handy.computer. Tested locally, everything works.

Regarding the description — could you clarify which description you'd like updated? The model description in the app already says "Russian speech recognition", but happy to adjust wherever you think it needs to be clearer.

As for testing — GigaAM v3 is the best Russian speech model I've tested. It outperforms Whisper-large-v3 on Russian benchmarks (9.2% vs 25.1% avg WER) and handles punctuation natively.
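
For context on the metric quoted above, word error rate (WER) is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below shows one common way to compute it. The function name is hypothetical, and this is not the evaluation code behind the figures in this comment, which may apply text normalization not shown here.

```rust
/// Word error rate: (substitutions + deletions + insertions) divided by
/// the number of reference words, computed with a word-level
/// Levenshtein distance. Returns None when the reference is empty,
/// since the ratio is undefined in that case.
/// Illustrative sketch only; the name and signature are assumptions.
fn word_error_rate(reference: &str, hypothesis: &str) -> Option<f64> {
    let r: Vec<&str> = reference.split_whitespace().collect();
    let h: Vec<&str> = hypothesis.split_whitespace().collect();
    if r.is_empty() {
        return None;
    }
    // prev[j] = edit distance between the reference words processed so
    // far and the first j hypothesis words.
    let mut prev: Vec<usize> = (0..=h.len()).collect();
    for (i, rw) in r.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, hw) in h.iter().enumerate() {
            let substitute = prev[j] + usize::from(rw != hw);
            let delete = prev[j + 1] + 1;
            let insert = cur[j] + 1;
            cur.push(substitute.min(delete).min(insert));
        }
        prev = cur;
    }
    Some(prev[h.len()] as f64 / r.len() as f64)
}
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words, and that results depend heavily on how casing and punctuation are normalized before comparison.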

Add GigaAM v3 e2e_ctc as a new transcription engine using
transcribe-rs 0.2.7 gigaam feature. Russian speech recognition
with punctuation, Latin characters and digit support.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@cjpais
Owner

cjpais commented Mar 1, 2026

Thank you! Mainly I was thinking of something a bit stronger for the description, like "Best model for Russian speakers" or similar.

@pantafive
Contributor Author

I think "best" might be risky in a description — it's subjective, and things move fast in this space, so it could become misleading quickly. "Russian speech recognition. Fast and accurate." states what it does without overpromising. But it's your project — happy to go with whatever you think works best!

@eboyko

eboyko commented Mar 1, 2026

"Best model for Russian speakers. Great for bilingual Russian/English use — especially developers mixing both languages."

@cjpais
Owner

cjpais commented Mar 1, 2026

I'm good with whatever and will defer to you since I don't speak Russian haha. Things do sure move fast

@cjpais cjpais merged commit ff86122 into cjpais:main Mar 1, 2026
@gtubolcev

This is not a streaming model, or am I wrong?
Great work anyway, thank you!
It would be great to have a real streaming model, like the latest Vosk 0.54: https://huggingface.co/alphacep/vosk-model-ru

MaxITService pushed a commit to MaxITService/AIVORelay that referenced this pull request Mar 5, 2026
* feat: add GigaAM v3 model for Russian speech recognition

Add GigaAM v3 e2e_ctc as a new transcription engine using
transcribe-rs 0.2.7 gigaam feature. Russian speech recognition
with punctuation, Latin characters and digit support.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* fix: cargo fmt formatting

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* Keep the file name of the model download the same as the file on the
blob website.

---------

Co-authored-by: Claude Opus 4.6 <[email protected]>
Co-authored-by: CJ Pais <[email protected]>
(cherry picked from commit ff86122)

# Conflicts:
#	src-tauri/Cargo.lock
#	src-tauri/Cargo.toml
#	src/bindings.ts
#	src/i18n/locales/ar/translation.json
#	src/i18n/locales/cs/translation.json
#	src/i18n/locales/de/translation.json
#	src/i18n/locales/es/translation.json
#	src/i18n/locales/fr/translation.json
#	src/i18n/locales/it/translation.json
#	src/i18n/locales/ja/translation.json
#	src/i18n/locales/ko/translation.json
#	src/i18n/locales/pl/translation.json
#	src/i18n/locales/pt/translation.json
#	src/i18n/locales/ru/translation.json
#	src/i18n/locales/tr/translation.json
#	src/i18n/locales/uk/translation.json
#	src/i18n/locales/vi/translation.json
#	src/i18n/locales/zh-TW/translation.json
#	src/i18n/locales/zh/translation.json
