whisper-rs 0.16.0 and ort.rc12 #1041

Merged
cjpais merged 31 commits into main from whisper-rs-0.16.0 on Mar 19, 2026

@cjpais
Owner

@cjpais cjpais commented Mar 14, 2026

No description provided.

@github-actions

🧪 Test Build Ready

Build artifacts for PR #1041 are available for testing.

Download artifacts from workflow run

Artifacts expire after 30 days.

@cjpais
Owner Author

cjpais commented Mar 14, 2026

It's hard to imagine, but this actually compiles

0.8.0 will have this change.

@cjpais changed the title from "test build for whisper-rs 0.16.0" to "whisper-rs 0.16.0 and ort.rc12" on Mar 14, 2026

@VirenMohindra
Contributor

Tested on macOS, looks good.

ps: after a brief stint, i am BACK

@cjpais
Owner Author

cjpais commented Mar 16, 2026

@VirenMohindra <3

@github-actions

🧪 Test Build Ready

Build artifacts for PR #1041 are available for testing.

Download artifacts from workflow run

Artifacts expire after 30 days.

@cjpais
Owner Author

cjpais commented Mar 16, 2026

pinging nix folks to help me fix the build. this PR will come in soonish. would also love your help testing the pre-release version here.

@y0usaf @xilec @CaptainSpof @pomarec @kakapt

@xilec
Contributor

xilec commented Mar 16, 2026

@cjpais I created PR #1057 that fixes the Nix build for this branch. Should I target main instead?

@cjpais
Owner Author

cjpais commented Mar 17, 2026

Thank you @xilec, I forgot I made this breaking change. I will try to make a patch for this case.

@cjpais
Owner Author

cjpais commented Mar 17, 2026

Also a special note: really need testers of the 22.04 based .deb, and Intel Mac.

If you are a user of either of these, don't just thumbs up but please comment. We need confirmation these builds work before shipping this

@mawnir

mawnir commented Mar 17, 2026

I have an Intel Mac.
[Screenshot 2026-03-17 at 6:38:21 AM]

Exception Type:        EXC_CRASH (SIGABRT)
Exception Codes:       0x0000000000000000, 0x0000000000000000

Termination Reason:    Namespace DYLD, Code 1 Library missing
Library not loaded: @rpath/libonnxruntime.1.24.2.dylib
Referenced from: <2331CF1A-E001-3D97-98A8-49C7A14E1F5E> /Applications/Handy.app/Contents/MacOS/handy
Reason: tried: '/usr/lib/swift/libonnxruntime.1.24.2.dylib' (no such file, not in dyld cache), '/System/Volumes/Preboot/Cryptexes/OS/usr/lib/swift/libonnxruntime.1.24.2.dylib' (no such file), '/Applications/Handy.app/Contents/Frameworks/libonnxruntime.1.24.2.dylib' (no such file), '/usr/lib/swift/libonnxruntime.1.24.2.dylib' (no such file, not in dyld cache), '/System/Volumes/Preboot/Cryptexes/OS/usr/lib/swift/libonnxruntime.1.24.2.dylib' (no such file), '/Applications/Handy.app/Contents/Frameworks/libonnxruntime.1.24.2.dylib' (no such file)
(terminated at launch; ignore backtrace)

@cjpais
Owner Author

cjpais commented Mar 17, 2026

Beautiful thank you @mawnir i was wondering if this would happen. Let me see if I can resolve. Thank you

@cjpais
Owner Author

cjpais commented Mar 17, 2026

pinging some folks who might be running ubuntu 22.04 or generally ubuntu, need confirmation the .deb is working properly. can you try the test build above

@kernelwhisperer, @mpgon, @joshribakoff, @camlafit, @nikolayhg, @luleyleo, @pumello, @hanpham32

@SoumyaRanjanPatnaik
Contributor

I'm on ubuntu. Will give the test build a try and report back.

@NourEldin-Osama
Contributor

I tested handy-pr-1041-x86_64-pc-windows-msvc on my windows 11 device and it worked

@oddrationale
Contributor

Tested on Windows 11 ARM64. Parakeet V3 works flawlessly but Whisper Small crashes Handy when trying to transcribe audio. This appears to be a known issue noted on the README and not a regression.

I haven't had much motivation to figure out why Whisper crashes on Windows ARM64, since the Parakeet models work so well.

@tanshkoul

tanshkoul commented Mar 18, 2026

Tested .deb on Ubuntu 25.04 Wayland. Parakeet works flawlessly with direct paste via ydotool. Whisper Small almost kills my PC while trying to transcribe but does paste successfully after a bit.

@github-actions

🧪 Test Build Ready

Build artifacts for PR #1041 are available for testing.

Download artifacts from workflow run

Artifacts expire after 30 days.

@cjpais
Owner Author

cjpais commented Mar 18, 2026

Thanks all for the info so far! We are getting close

@mawnir are you able to test the latest build on your x86 Mac and see if it helps?

@mawnir

mawnir commented Mar 18, 2026

> the info so far! We are

Yes, I'm able to use it on an Intel Mac, and it works great with the Parakeet and Moonshine models. With Whisper Small, however, it's very slow. When I switched back to Parakeet, it stayed slow, almost as if it hadn't switched. After switching to Moonshine, it started working normally again.

@cjpais
Owner Author

cjpais commented Mar 18, 2026

@mawnir if possible, can you share logs? Mostly happy it's working. I would expect Whisper to be slow. Were you using Whisper before, and is it slower now than it was then?

@mawnir

mawnir commented Mar 18, 2026

I think it's the same speed. (I didn't use Whisper for long.)

[2026-03-18][11:36:16][handy_app_lib::managers::transcription][DEBUG] Starting to load model: small
[2026-03-18][11:36:17][handy_app_lib::managers::transcription][DEBUG] Successfully loaded transcription model: small (took 1176ms)
[2026-03-18][11:36:28][handy_app_lib::shortcut::handy_keys][DEBUG] handy-keys event: binding=transcribe, hotkey=fn, state=Pressed
[2026-03-18][11:36:28][handy_app_lib::actions][DEBUG] TranscribeAction::start called for binding: transcribe
[2026-03-18][11:36:28][enigo::platform::macos_impl][DEBUG] location()
[2026-03-18][11:36:28][handy_app_lib::actions][DEBUG] Microphone mode - always_on: false
[2026-03-18][11:36:28][handy_app_lib::actions][DEBUG] On-demand mode: Starting recording first, then audio feedback
[2026-03-18][11:36:28][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("Built-in Microphone")
Sample rate: 44100
Channels: 2
Format: F32
[2026-03-18][11:36:29][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 1.137899805s
[2026-03-18][11:36:29][handy_app_lib::managers::audio][DEBUG] Recording started for binding transcribe
[2026-03-18][11:36:29][handy_app_lib::actions][DEBUG] Recording started in 1.138070274s
[2026-03-18][11:36:29][handy_app_lib::actions][DEBUG] TranscribeAction::start completed in 1.145436952s
[2026-03-18][11:36:29][handy_app_lib::shortcut::handy_keys][DEBUG] Registered handy-keys shortcut: cancel -> Hotkey { modifiers: Modifiers(0x0), key: Some(Escape) }
[2026-03-18][11:36:29][handy_app_lib::actions][DEBUG] Handling delayed audio feedback/mute sequence
[2026-03-18][11:36:29][handy_app_lib::audio_feedback][DEBUG] Using default device
[2026-03-18][11:36:29][symphonia_core::probe][DEBUG] found a possible format marker within [52, 49, 46, 46, 56, a5, 1, 0, 57, 41, 56, 45, 66, 6d, 74, 20] @ 0+2 bytes.
[2026-03-18][11:36:29][symphonia_core::probe][DEBUG] found the format marker [52, 49, 46, 46] @ 0+2 bytes.
[2026-03-18][11:36:32][handy_app_lib::shortcut::handy_keys][DEBUG] handy-keys event: binding=transcribe, hotkey=fn, state=Released
[2026-03-18][11:36:32][handy_app_lib::actions][DEBUG] TranscribeAction::stop called for binding: transcribe
[2026-03-18][11:36:32][handy_app_lib::shortcut::handy_keys][DEBUG] Unregistered handy-keys shortcut: cancel
[2026-03-18][11:36:32][enigo::platform::macos_impl][DEBUG] location()
[2026-03-18][11:36:32][handy_app_lib::actions][DEBUG] TranscribeAction::stop completed in 9.27891ms
[2026-03-18][11:36:32][handy_app_lib::actions][DEBUG] Starting async transcription task for binding: transcribe
[2026-03-18][11:36:32][handy_app_lib::audio_feedback][DEBUG] Using default device
[2026-03-18][11:36:32][symphonia_core::probe][DEBUG] found a possible format marker within [52, 49, 46, 46, ca, 8c, 1, 0, 57, 41, 56, 45, 66, 6d, 74, 20] @ 0+2 bytes.
[2026-03-18][11:36:32][symphonia_core::probe][DEBUG] found the format marker [52, 49, 46, 46] @ 0+2 bytes.
[2026-03-18][11:36:32][handy_app_lib::managers::audio][DEBUG] Microphone stream stopped
[2026-03-18][11:36:32][handy_app_lib::actions][DEBUG] Recording stopped and samples retrieved in 70.819121ms, sample count: 32160
[2026-03-18][11:36:32][handy_app_lib::managers::transcription][DEBUG] Audio vector length: 32160
[2026-03-18][11:37:27][handy_app_lib::managers::transcription][INFO] Transcription completed in 54863ms
[2026-03-18][11:37:27][handy_app_lib::managers::transcription][INFO] Transcription result: Hello everyone
[2026-03-18][11:37:27][handy_app_lib::actions][DEBUG] Transcription completed in 54.864225232s: 'Hello everyone'
[2026-03-18][11:37:27][handy_app_lib::actions][DEBUG] selected_language is not Simplified or Traditional Chinese; skipping translation
[2026-03-18][11:37:27][handy_app_lib::clipboard][INFO] Using paste method: CtrlV, delay: 60ms
[2026-03-18][11:37:27][handy_app_lib::audio_toolkit::audio::utils][DEBUG] Saved WAV file: "/Users/mawnirfas/Library/Application Support/com.pais.handy/recordings/handy-1773833847.wav"
[2026-03-18][11:37:27][handy_app_lib::managers::history][DEBUG] Saved transcription to database
[2026-03-18][11:37:27][enigo::platform::macos_impl][DEBUG] key(key: Meta, direction: Press)
[2026-03-18][11:37:27][enigo::platform::macos_impl][DEBUG] raw(keycode: 55, direction: Press)
[2026-03-18][11:37:27][enigo::platform::macos_impl][DEBUG] added the keycode 55 to the held keys
[2026-03-18][11:37:27][enigo::platform::macos_impl][DEBUG] added the key Meta to the held keys
[2026-03-18][11:37:27][enigo::platform::macos_impl][DEBUG] key(key: Other(9), direction: Click)
[2026-03-18][11:37:27][enigo::platform::macos_impl][DEBUG] raw(keycode: 9, direction: Click)
[2026-03-18][11:37:27][enigo::platform::macos_impl][DEBUG] key(key: Meta, direction: Release)
[2026-03-18][11:37:27][enigo::platform::macos_impl][DEBUG] raw(keycode: 55, direction: Release)
[2026-03-18][11:37:27][enigo::platform::macos_impl][DEBUG] removed the keycode 55 from the held keys
[2026-03-18][11:37:27][enigo::platform::macos_impl][DEBUG] removed the key Meta from the held keys
[2026-03-18][11:37:27][handy_app_lib::actions][DEBUG] Text pasted successfully in 215.343916ms
[2026-03-18][11:37:33][handy_app_lib::managers::transcription][DEBUG] Starting to load model: parakeet-tdt-0.6b-v3
[2026-03-18][11:37:33][transcribe_rs::onnx::session][INFO] Loading int8 model: /Users/mawnirfas/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v3-int8/encoder-model.int8.onnx
[2026-03-18][11:37:33][transcribe_rs::onnx::session][INFO] Loading int8 model: /Users/mawnirfas/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v3-int8/decoder_joint-model.int8.onnx
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model input: name=audio_signal, type=Tensor { ty: Float32, shape: [-1, 128, -1], dimension_symbols: SymbolicDimensions(["audio_signal_dynamic_axes_1", "", "audio_signal_dynamic_axes_2"]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model input: name=length, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model output: name=outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["Transposeoutputs_dim_0", "", "Transposeoutputs_dim_2"]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model output: name=encoded_lengths, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model input: name=encoder_outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["encoder_outputs_dynamic_axes_1", "", "encoder_outputs_dynamic_axes_2"]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model input: name=targets, type=Tensor { ty: Int32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["targets_dynamic_axes_1", "targets_dynamic_axes_2"]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model input: name=target_length, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model input: name=input_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_1_dynamic_axes_1", ""]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model input: name=input_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_2_dynamic_axes_1", ""]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model output: name=outputs, type=Tensor { ty: Float32, shape: [-1, -1, -1, 8198], dimension_symbols: SymbolicDimensions(["Addoutputs_dim_0", "Addoutputs_dim_1", "Addoutputs_dim_2", ""]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model output: name=prednet_lengths, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model output: name=output_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "Concatoutput_states_1_dim_1", ""]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model output: name=output_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "Concatoutput_states_2_dim_1", ""]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model input: name=waveforms, type=Tensor { ty: Float32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["batch_size", "N"]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model input: name=waveforms_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model output: name=features, type=Tensor { ty: Float32, shape: [-1, 128, -1], dimension_symbols: SymbolicDimensions(["batch_size", "", "T"]) }
[2026-03-18][11:37:36][transcribe_rs::onnx::session][INFO] Model output: name=features_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) }
[2026-03-18][11:37:36][transcribe_rs::decode::tokens][INFO] Loaded 8193 vocab tokens from "/Users/mawnirfas/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v3-int8/vocab.txt"
[2026-03-18][11:37:36][transcribe_rs::onnx::parakeet][INFO] Loaded vocabulary with 8193 tokens, blank_idx=8192
[2026-03-18][11:37:36][handy_app_lib::managers::transcription][DEBUG] Successfully loaded transcription model: parakeet-tdt-0.6b-v3 (took 2428ms)
[2026-03-18][11:37:42][handy_app_lib::shortcut::handy_keys][DEBUG] handy-keys event: binding=transcribe, hotkey=fn, state=Pressed
[2026-03-18][11:37:42][handy_app_lib::actions][DEBUG] TranscribeAction::start called for binding: transcribe
[2026-03-18][11:37:42][enigo::platform::macos_impl][DEBUG] location()
[2026-03-18][11:37:42][handy_app_lib::actions][DEBUG] Microphone mode - always_on: false
[2026-03-18][11:37:42][handy_app_lib::actions][DEBUG] On-demand mode: Starting recording first, then audio feedback
[2026-03-18][11:37:42][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("Built-in Microphone")
Sample rate: 44100
Channels: 2
Format: F32
[2026-03-18][11:37:43][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 1.13672598s
[2026-03-18][11:37:43][handy_app_lib::managers::audio][DEBUG] Recording started for binding transcribe
[2026-03-18][11:37:43][handy_app_lib::actions][DEBUG] Recording started in 1.136844406s
[2026-03-18][11:37:43][handy_app_lib::actions][DEBUG] TranscribeAction::start completed in 1.14676615s
[2026-03-18][11:37:43][handy_app_lib::shortcut::handy_keys][DEBUG] Registered handy-keys shortcut: cancel -> Hotkey { modifiers: Modifiers(0x0), key: Some(Escape) }
[2026-03-18][11:37:43][handy_app_lib::actions][DEBUG] Handling delayed audio feedback/mute sequence
[2026-03-18][11:37:43][handy_app_lib::audio_feedback][DEBUG] Using default device
[2026-03-18][11:37:43][symphonia_core::probe][DEBUG] found a possible format marker within [52, 49, 46, 46, 56, a5, 1, 0, 57, 41, 56, 45, 66, 6d, 74, 20] @ 0+2 bytes.
[2026-03-18][11:37:43][symphonia_core::probe][DEBUG] found the format marker [52, 49, 46, 46] @ 0+2 bytes.
[2026-03-18][11:37:45][handy_app_lib::shortcut::handy_keys][DEBUG] handy-keys event: binding=transcribe, hotkey=fn, state=Released
[2026-03-18][11:37:45][handy_app_lib::actions][DEBUG] TranscribeAction::stop called for binding: transcribe
[2026-03-18][11:37:45][handy_app_lib::shortcut::handy_keys][DEBUG] Unregistered handy-keys shortcut: cancel
[2026-03-18][11:37:45][enigo::platform::macos_impl][DEBUG] location()
[2026-03-18][11:37:45][handy_app_lib::actions][DEBUG] TranscribeAction::stop completed in 9.158151ms
[2026-03-18][11:37:45][handy_app_lib::actions][DEBUG] Starting async transcription task for binding: transcribe
[2026-03-18][11:37:45][handy_app_lib::audio_feedback][DEBUG] Using default device
[2026-03-18][11:37:45][symphonia_core::probe][DEBUG] found a possible format marker within [52, 49, 46, 46, ca, 8c, 1, 0, 57, 41, 56, 45, 66, 6d, 74, 20] @ 0+2 bytes.
[2026-03-18][11:37:45][symphonia_core::probe][DEBUG] found the format marker [52, 49, 46, 46] @ 0+2 bytes.
[2026-03-18][11:37:45][handy_app_lib::managers::audio][DEBUG] Microphone stream stopped
[2026-03-18][11:37:45][handy_app_lib::actions][DEBUG] Recording stopped and samples retrieved in 61.283832ms, sample count: 42240
[2026-03-18][11:37:45][handy_app_lib::managers::transcription][DEBUG] Audio vector length: 42240
[2026-03-18][11:37:46][handy_app_lib::managers::transcription][INFO] Transcription completed in 424ms
[2026-03-18][11:37:46][handy_app_lib::managers::transcription][INFO] Transcription result: This is Barack.
[2026-03-18][11:37:46][handy_app_lib::actions][DEBUG] Transcription completed in 425.075353ms: 'This is Barack.'
[2026-03-18][11:37:46][handy_app_lib::actions][DEBUG] selected_language is not Simplified or Traditional Chinese; skipping translation
[2026-03-18][11:37:46][handy_app_lib::clipboard][INFO] Using paste method: CtrlV, delay: 60ms
[2026-03-18][11:37:46][handy_app_lib::audio_toolkit::audio::utils][DEBUG] Saved WAV file: "/Users/mawnirfas/Library/Application Support/com.pais.handy/recordings/handy-1773833866.wav"
[2026-03-18][11:37:46][handy_app_lib::managers::history][DEBUG] Saved transcription to database
[2026-03-18][11:37:46][enigo::platform::macos_impl][DEBUG] key(key: Meta, direction: Press)
[2026-03-18][11:37:46][enigo::platform::macos_impl][DEBUG] raw(keycode: 55, direction: Press)
[2026-03-18][11:37:46][enigo::platform::macos_impl][DEBUG] added the keycode 55 to the held keys
[2026-03-18][11:37:46][enigo::platform::macos_impl][DEBUG] added the key Meta to the held keys
[2026-03-18][11:37:46][enigo::platform::macos_impl][DEBUG] key(key: Other(9), direction: Click)
[2026-03-18][11:37:46][enigo::platform::macos_impl][DEBUG] raw(keycode: 9, direction: Click)
[2026-03-18][11:37:46][enigo::platform::macos_impl][DEBUG] key(key: Meta, direction: Release)
[2026-03-18][11:37:46][enigo::platform::macos_impl][DEBUG] raw(keycode: 55, direction: Release)
[2026-03-18][11:37:46][enigo::platform::macos_impl][DEBUG] removed the keycode 55 from the held keys
[2026-03-18][11:37:46][enigo::platform::macos_impl][DEBUG] removed the key Meta from the held keys
[2026-03-18][11:37:46][handy_app_lib::actions][DEBUG] Text pasted successfully in 214.188455ms
[2026-03-18][11:37:49][handy_app_lib::shortcut::handy_keys][DEBUG] handy-keys event: binding=transcribe, hotkey=fn, state=Pressed
[2026-03-18][11:37:49][handy_app_lib::actions][DEBUG] TranscribeAction::start called for binding: transcribe
[2026-03-18][11:37:49][enigo::platform::macos_impl][DEBUG] location()
[2026-03-18][11:37:49][handy_app_lib::actions][DEBUG] Microphone mode - always_on: false
[2026-03-18][11:37:49][handy_app_lib::actions][DEBUG] On-demand mode: Starting recording first, then audio feedback
[2026-03-18][11:37:49][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("Built-in Microphone")
Sample rate: 44100
Channels: 2
Format: F32
[2026-03-18][11:37:49][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 504.840404ms
[2026-03-18][11:37:49][handy_app_lib::managers::audio][DEBUG] Recording started for binding transcribe
[2026-03-18][11:37:49][handy_app_lib::actions][DEBUG] Recording started in 505.059208ms
[2026-03-18][11:37:49][handy_app_lib::actions][DEBUG] TranscribeAction::start completed in 513.406977ms
[2026-03-18][11:37:49][handy_app_lib::shortcut::handy_keys][DEBUG] Registered handy-keys shortcut: cancel -> Hotkey { modifiers: Modifiers(0x0), key: Some(Escape) }
[2026-03-18][11:37:49][handy_app_lib::actions][DEBUG] Handling delayed audio feedback/mute sequence
[2026-03-18][11:37:49][handy_app_lib::audio_feedback][DEBUG] Using default device
[2026-03-18][11:37:49][symphonia_core::probe][DEBUG] found a possible format marker within [52, 49, 46, 46, 56, a5, 1, 0, 57, 41, 56, 45, 66, 6d, 74, 20] @ 0+2 bytes.
[2026-03-18][11:37:49][symphonia_core::probe][DEBUG] found the format marker [52, 49, 46, 46] @ 0+2 bytes.
[2026-03-18][11:37:51][handy_app_lib::shortcut::handy_keys][DEBUG] handy-keys event: binding=transcribe, hotkey=fn, state=Released
[2026-03-18][11:37:51][handy_app_lib::actions][DEBUG] TranscribeAction::stop called for binding: transcribe
[2026-03-18][11:37:51][handy_app_lib::shortcut::handy_keys][DEBUG] Unregistered handy-keys shortcut: cancel
[2026-03-18][11:37:51][enigo::platform::macos_impl][DEBUG] location()
[2026-03-18][11:37:51][handy_app_lib::actions][DEBUG] TranscribeAction::stop completed in 8.384502ms
[2026-03-18][11:37:51][handy_app_lib::actions][DEBUG] Starting async transcription task for binding: transcribe
[2026-03-18][11:37:51][handy_app_lib::audio_feedback][DEBUG] Using default device
[2026-03-18][11:37:51][symphonia_core::probe][DEBUG] found a possible format marker within [52, 49, 46, 46, ca, 8c, 1, 0, 57, 41, 56, 45, 66, 6d, 74, 20] @ 0+2 bytes.
[2026-03-18][11:37:51][symphonia_core::probe][DEBUG] found the format marker [52, 49, 46, 46] @ 0+2 bytes.
[2026-03-18][11:37:51][handy_app_lib::managers::audio][DEBUG] Microphone stream stopped
[2026-03-18][11:37:51][handy_app_lib::actions][DEBUG] Recording stopped and samples retrieved in 67.080682ms, sample count: 31680
[2026-03-18][11:37:51][handy_app_lib::managers::transcription][DEBUG] Audio vector length: 31680
[2026-03-18][11:37:52][handy_app_lib::managers::transcription][INFO] Transcription completed in 327ms
[2026-03-18][11:37:52][handy_app_lib::managers::transcription][INFO] Transcription result is empty
[2026-03-18][11:37:52][handy_app_lib::actions][DEBUG] Transcription completed in 327.559821ms: ''
[2026-03-18][11:37:55][handy_app_lib::shortcut::handy_keys][DEBUG] handy-keys event: binding=transcribe, hotkey=fn, state=Pressed
[2026-03-18][11:37:55][handy_app_lib::actions][DEBUG] TranscribeAction::start called for binding: transcribe
[2026-03-18][11:37:55][enigo::platform::macos_impl][DEBUG] location()
[2026-03-18][11:37:55][handy_app_lib::actions][DEBUG] Microphone mode - always_on: false
[2026-03-18][11:37:55][handy_app_lib::actions][DEBUG] On-demand mode: Starting recording first, then audio feedback
[2026-03-18][11:37:56][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("Built-in Microphone")
Sample rate: 44100
Channels: 2
Format: F32
[2026-03-18][11:37:56][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 514.607513ms
[2026-03-18][11:37:56][handy_app_lib::managers::audio][DEBUG] Recording started for binding transcribe
[2026-03-18][11:37:56][handy_app_lib::actions][DEBUG] Recording started in 514.74856ms
[2026-03-18][11:37:56][handy_app_lib::actions][DEBUG] TranscribeAction::start completed in 524.020301ms
[2026-03-18][11:37:56][handy_app_lib::shortcut::handy_keys][DEBUG] Registered handy-keys shortcut: cancel -> Hotkey { modifiers: Modifiers(0x0), key: Some(Escape) }
[2026-03-18][11:37:56][handy_app_lib::actions][DEBUG] Handling delayed audio feedback/mute sequence
[2026-03-18][11:37:56][handy_app_lib::audio_feedback][DEBUG] Using default device
[2026-03-18][11:37:56][symphonia_core::probe][DEBUG] found a possible format marker within [52, 49, 46, 46, 56, a5, 1, 0, 57, 41, 56, 45, 66, 6d, 74, 20] @ 0+2 bytes.
[2026-03-18][11:37:56][symphonia_core::probe][DEBUG] found the format marker [52, 49, 46, 46] @ 0+2 bytes.
[2026-03-18][11:37:58][handy_app_lib::shortcut::handy_keys][DEBUG] handy-keys event: binding=transcribe, hotkey=fn, state=Released
[2026-03-18][11:37:58][handy_app_lib::actions][DEBUG] TranscribeAction::stop called for binding: transcribe
[2026-03-18][11:37:58][handy_app_lib::shortcut::handy_keys][DEBUG] Unregistered handy-keys shortcut: cancel
[2026-03-18][11:37:58][enigo::platform::macos_impl][DEBUG] location()
[2026-03-18][11:37:58][handy_app_lib::actions][DEBUG] TranscribeAction::stop completed in 7.05365ms
[2026-03-18][11:37:58][handy_app_lib::actions][DEBUG] Starting async transcription task for binding: transcribe
[2026-03-18][11:37:58][handy_app_lib::audio_feedback][DEBUG] Using default device
[2026-03-18][11:37:58][symphonia_core::probe][DEBUG] found a possible format marker within [52, 49, 46, 46, ca, 8c, 1, 0, 57, 41, 56, 45, 66, 6d, 74, 20] @ 0+2 bytes.
[2026-03-18][11:37:58][symphonia_core::probe][DEBUG] found the format marker [52, 49, 46, 46] @ 0+2 bytes.
[2026-03-18][11:37:58][handy_app_lib::managers::audio][DEBUG] Microphone stream stopped
[2026-03-18][11:37:58][handy_app_lib::actions][DEBUG] Recording stopped and samples retrieved in 54.650244ms, sample count: 37920
[2026-03-18][11:37:58][handy_app_lib::managers::transcription][DEBUG] Audio vector length: 37920
[2026-03-18][11:37:59][handy_app_lib::managers::transcription][INFO] Transcription completed in 388ms
[2026-03-18][11:37:59][handy_app_lib::managers::transcription][INFO] Transcription result: Now it's working.
[2026-03-18][11:37:59][handy_app_lib::actions][DEBUG] Transcription completed in 388.515239ms: 'Now it's working.'
[2026-03-18][11:37:59][handy_app_lib::actions][DEBUG] selected_language is not Simplified or Traditional Chinese; skipping translation
[2026-03-18][11:37:59][handy_app_lib::clipboard][INFO] Using paste method: CtrlV, delay: 60ms
[2026-03-18][11:37:59][handy_app_lib::audio_toolkit::audio::utils][DEBUG] Saved WAV file: "/Users/mawnirfas/Library/Application Support/com.pais.handy/recordings/handy-1773833879.wav"
[2026-03-18][11:37:59][handy_app_lib::managers::history][DEBUG] Saved transcription to database
[2026-03-18][11:37:59][enigo::platform::macos_impl][DEBUG] key(key: Meta, direction: Press)
[2026-03-18][11:37:59][enigo::platform::macos_impl][DEBUG] raw(keycode: 55, direction: Press)
[2026-03-18][11:37:59][enigo::platform::macos_impl][DEBUG] added the keycode 55 to the held keys
[2026-03-18][11:37:59][enigo::platform::macos_impl][DEBUG] added the key Meta to the held keys
[2026-03-18][11:37:59][enigo::platform::macos_impl][DEBUG] key(key: Other(9), direction: Click)
[2026-03-18][11:37:59][enigo::platform::macos_impl][DEBUG] raw(keycode: 9, direction: Click)
[2026-03-18][11:37:59][enigo::platform::macos_impl][DEBUG] key(key: Meta, direction: Release)
[2026-03-18][11:37:59][enigo::platform::macos_impl][DEBUG] raw(keycode: 55, direction: Release)
[2026-03-18][11:37:59][enigo::platform::macos_impl][DEBUG] removed the keycode 55 from the held keys
[2026-03-18][11:37:59][enigo::platform::macos_impl][DEBUG] removed the key Meta from the held keys
[2026-03-18][11:37:59][handy_app_lib::actions][DEBUG] Text pasted successfully in 215.356312ms
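
For readers skimming the log, the per-model timings can be pulled out mechanically. The following is a minimal sketch, not part of Handy: the helper name `timings_by_model` and the two regular expressions are assumptions written against the message formats visible above, and the embedded excerpt copies only the `managers::transcription` lines from this log.

```python
import re

# Excerpt of the managers::transcription lines from the log above.
LOG = """\
[2026-03-18][11:36:17][handy_app_lib::managers::transcription][DEBUG] Successfully loaded transcription model: small (took 1176ms)
[2026-03-18][11:37:27][handy_app_lib::managers::transcription][INFO] Transcription completed in 54863ms
[2026-03-18][11:37:36][handy_app_lib::managers::transcription][DEBUG] Successfully loaded transcription model: parakeet-tdt-0.6b-v3 (took 2428ms)
[2026-03-18][11:37:46][handy_app_lib::managers::transcription][INFO] Transcription completed in 424ms
[2026-03-18][11:37:52][handy_app_lib::managers::transcription][INFO] Transcription completed in 327ms
[2026-03-18][11:37:59][handy_app_lib::managers::transcription][INFO] Transcription completed in 388ms
"""

# Patterns are assumptions based on the message text seen in this thread.
LOADED = re.compile(r"Successfully loaded transcription model: (\S+) \(took (\d+)ms\)")
DONE = re.compile(r"Transcription completed in (\d+)ms")


def timings_by_model(log_text):
    """Group transcription durations (ms) under the most recently loaded model."""
    current = None
    result = {}
    for line in log_text.splitlines():
        loaded = LOADED.search(line)
        if loaded:
            current = loaded.group(1)
            result.setdefault(current, [])
            continue
        done = DONE.search(line)
        if done and current is not None:
            result[current].append(int(done.group(1)))
    return result


timings = timings_by_model(LOG)
for model, durations in timings.items():
    print(model, durations)
```

On this excerpt the Whisper `small` transcription took 54863 ms, while the three Parakeet transcriptions that followed each finished in under half a second.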

@cjpais
Owner Author

cjpais commented Mar 18, 2026

As far as I can tell, all of this looks fairly good. At least there is nothing we should block on. I think one of your transcriptions may have failed.

From the feedback on this thread and everything we have now, I think we can ship this. It will likely get pushed out tomorrow.

Thank you, everyone, so much for testing. It was an absolutely massive help. I'm pretty excited for this release: performance should improve across the board, and it should fix some outstanding issues.

@jacksongoode
Contributor

jacksongoode commented Mar 19, 2026

I'm not sure if this is a bug with this version, but I am seeing the model unload and the transcript abandoned if "unload immediately" is set. I haven't verified it, but this seems to be the culprit:

[handy_app_lib::managers::transcription][INFO] Model idle for 1s (limit: 0s), unloading
[2026-03-19][01:54:54][handy_app_lib::managers::transcription][INFO] Model unloaded due to inactivity (took 66ms)
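
To make the suspicion concrete, here is a purely illustrative sketch of how an idle watcher with a 0s limit could unload a model while a transcription is still running. It is not Handy's implementation: `ModelState`, `should_unload`, and the `guard_in_flight` flag are hypothetical names invented for this example, and whether the real code has such a guard is not established in this thread.

```python
from dataclasses import dataclass


@dataclass
class ModelState:
    """Hypothetical stand-in for a transcription manager's state."""
    loaded: bool = True
    last_activity_s: float = 0.0  # time of the last load or transcription start
    in_flight: int = 0            # transcriptions currently running


def should_unload(state, now_s, limit_s, guard_in_flight):
    """Decide whether an idle watcher would unload the model at time now_s."""
    if not state.loaded:
        return False
    if guard_in_flight and state.in_flight > 0:
        # With the guard, a running transcription always blocks the unload.
        return False
    return (now_s - state.last_activity_s) > limit_s


# A transcription started at t=0 and is still running at t=1, with a 0s limit
# (matching "Model idle for 1s (limit: 0s)" in the log line above).
busy = ModelState(loaded=True, last_activity_s=0.0, in_flight=1)
print(should_unload(busy, now_s=1.0, limit_s=0.0, guard_in_flight=False))  # True
print(should_unload(busy, now_s=1.0, limit_s=0.0, guard_in_flight=True))   # False
```

Under these assumptions, a limit of 0s means any measurable idle time exceeds the limit, so only the in-flight check keeps a running transcription from losing its model.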
Logs

```
[2026-03-19][01:54:24][handy_app_lib::managers::history][INFO] Initializing database at "/Users/jackson/Library/Application Support/com.pais.handy/history.db"
[2026-03-19][01:54:24][handy_app_lib::managers::transcription][INFO] Whisper accelerator set to: auto
[2026-03-19][01:54:24][handy_app_lib::managers::transcription][INFO] ORT accelerator set to: auto
[2026-03-19][01:54:24][enigo::platform::macos_impl][INFO] The application has the permission to simulate input
[2026-03-19][01:54:24][handy_app_lib::commands][INFO] Enigo initialized successfully after permission grant
[2026-03-19][01:54:24][handy_app_lib::shortcut::handy_keys][INFO] handy-keys manager thread started
[2026-03-19][01:54:24][handy_app_lib::shortcut::handy_keys][INFO] handy-keys shortcuts initialized
[2026-03-19][01:54:24][handy_app_lib::commands][INFO] Shortcuts initialized successfully
[2026-03-19][01:54:36][transcribe_rs::onnx::session][INFO] Loading int8 model: /Users/jackson/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v2-int8/encoder-model.int8.onnx
[2026-03-19][01:54:36][transcribe_rs::onnx::session][INFO] Loading int8 model: /Users/jackson/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v2-int8/decoder_joint-model.int8.onnx
[2026-03-19][01:54:36][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("MacBook Pro Microphone")
Sample rate: 48000
Channels: 1
Format: F32
[2026-03-19][01:54:36][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 226.746792ms
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model input: name=audio_signal, type=Tensor { ty: Float32, shape: [-1, 128, -1], dimension_symbols: SymbolicDimensions(["audio_signal_dynamic_axes_1", "", "audio_signal_dynamic_axes_2"]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model input: name=length, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model output: name=outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["Transposeoutputs_dim_0", "", "Transposeoutputs_dim_2"]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model output: name=encoded_lengths, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model input: name=encoder_outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["encoder_outputs_dynamic_axes_1", "", "encoder_outputs_dynamic_axes_2"]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model input: name=targets, type=Tensor { ty: Int32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["targets_dynamic_axes_1", "targets_dynamic_axes_2"]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model input: name=target_length, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model input: name=input_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_1_dynamic_axes_1", ""]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model input: name=input_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_2_dynamic_axes_1", ""]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model output: name=outputs, type=Tensor { ty: Float32, shape: [-1, -1, -1, 1030], dimension_symbols: SymbolicDimensions(["Addoutputs_dim_0", "Addoutputs_dim_1", "Addoutputs_dim_2", ""]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model output: name=prednet_lengths, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model output: name=output_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "Concatoutput_states_1_dim_1", ""]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model output: name=output_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "Concatoutput_states_2_dim_1", ""]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model input: name=waveforms, type=Tensor { ty: Float32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["batch_size", "N"]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model input: name=waveforms_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model output: name=features, type=Tensor { ty: Float32, shape: [-1, 128, -1], dimension_symbols: SymbolicDimensions(["batch_size", "", "T"]) }
[2026-03-19][01:54:37][transcribe_rs::onnx::session][INFO] Model output: name=features_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) }
[2026-03-19][01:54:37][transcribe_rs::decode::tokens][INFO] Loaded 1025 vocab tokens from "/Users/jackson/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v2-int8/vocab.txt"
[2026-03-19][01:54:37][transcribe_rs::onnx::parakeet][INFO] Loaded vocabulary with 1025 tokens, blank_idx=1024
[2026-03-19][01:54:40][handy_app_lib::managers::transcription][INFO] Transcription completed in 170ms
[2026-03-19][01:54:40][handy_app_lib::managers::transcription][INFO] Transcription result: Testing, testing, testing. Hello, hello, hello.
[2026-03-19][01:54:40][handy_app_lib::managers::transcription][INFO] Immediately unloading model after transcription
[2026-03-19][01:54:40][handy_app_lib::clipboard][INFO] Using paste method: CtrlV, delay: 60ms
[2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Loading int8 model: /Users/jackson/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v2-int8/encoder-model.int8.onnx
[2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Loading int8 model: /Users/jackson/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v2-int8/decoder_joint-model.int8.onnx
[2026-03-19][01:54:42][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("MacBook Pro Microphone")
Sample rate: 48000
Channels: 1
Format: F32
[2026-03-19][01:54:42][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 133.645709ms
[2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model input: name=audio_signal, type=Tensor { ty: Float32, shape: [-1, 128, -1], dimension_symbols: SymbolicDimensions(["audio_signal_dynamic_axes_1", "", "audio_signal_dynamic_axes_2"]) }
[2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model input: name=length, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) }
[2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model output: name=outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["Transposeoutputs_dim_0", "", "Transposeoutputs_dim_2"]) }
[2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model output: name=encoded_lengths, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) }
[2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model input: name=encoder_outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["encoder_outputs_dynamic_axes_1", "", 
"encoder_outputs_dynamic_axes_2"]) } [2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model input: name=targets, type=Tensor { ty: Int32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["targets_dynamic_axes_1", "targets_dynamic_axes_2"]) } [2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model input: name=target_length, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) } [2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model input: name=input_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_1_dynamic_axes_1", ""]) } [2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model input: name=input_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_2_dynamic_axes_1", ""]) } [2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model output: name=outputs, type=Tensor { ty: Float32, shape: [-1, -1, -1, 1030], dimension_symbols: SymbolicDimensions(["Addoutputs_dim_0", "Addoutputs_dim_1", "Addoutputs_dim_2", ""]) } [2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model output: name=prednet_lengths, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) } [2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model output: name=output_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "Concatoutput_states_1_dim_1", ""]) } [2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model output: name=output_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "Concatoutput_states_2_dim_1", ""]) } [2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model input: name=waveforms, type=Tensor { ty: Float32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["batch_size", 
"N"]) } [2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model input: name=waveforms_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) } [2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model output: name=features, type=Tensor { ty: Float32, shape: [-1, 128, -1], dimension_symbols: SymbolicDimensions(["batch_size", "", "T"]) } [2026-03-19][01:54:42][transcribe_rs::onnx::session][INFO] Model output: name=features_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) } [2026-03-19][01:54:42][transcribe_rs::decode::tokens][INFO] Loaded 1025 vocab tokens from "/Users/jackson/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v2-int8/vocab.txt" [2026-03-19][01:54:42][transcribe_rs::onnx::parakeet][INFO] Loaded vocabulary with 1025 tokens, blank_idx=1024 [2026-03-19][01:54:44][handy_app_lib::managers::transcription][INFO] Model idle for 1s (limit: 0s), unloading [2026-03-19][01:54:44][handy_app_lib::managers::transcription][INFO] Model unloaded due to inactivity (took 55ms) [2026-03-19][01:54:46][transcribe_rs::onnx::session][INFO] Loading int8 model: /Users/jackson/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v2-int8/encoder-model.int8.onnx [2026-03-19][01:54:46][transcribe_rs::onnx::session][INFO] Loading int8 model: /Users/jackson/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v2-int8/decoder_joint-model.int8.onnx [2026-03-19][01:54:46][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("MacBook Pro Microphone") Sample rate: 48000 Channels: 1 Format: F32 [2026-03-19][01:54:46][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 127.720709ms [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model input: name=audio_signal, type=Tensor { ty: Float32, shape: [-1, 128, -1], dimension_symbols: SymbolicDimensions(["audio_signal_dynamic_axes_1", "", 
"audio_signal_dynamic_axes_2"]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model input: name=length, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model output: name=outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["Transposeoutputs_dim_0", "", "Transposeoutputs_dim_2"]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model output: name=encoded_lengths, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model input: name=encoder_outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["encoder_outputs_dynamic_axes_1", "", "encoder_outputs_dynamic_axes_2"]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model input: name=targets, type=Tensor { ty: Int32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["targets_dynamic_axes_1", "targets_dynamic_axes_2"]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model input: name=target_length, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model input: name=input_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_1_dynamic_axes_1", ""]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model input: name=input_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_2_dynamic_axes_1", ""]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model output: name=outputs, type=Tensor { ty: Float32, shape: [-1, -1, -1, 1030], dimension_symbols: SymbolicDimensions(["Addoutputs_dim_0", 
"Addoutputs_dim_1", "Addoutputs_dim_2", ""]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model output: name=prednet_lengths, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model output: name=output_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "Concatoutput_states_1_dim_1", ""]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model output: name=output_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "Concatoutput_states_2_dim_1", ""]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model input: name=waveforms, type=Tensor { ty: Float32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["batch_size", "N"]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model input: name=waveforms_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model output: name=features, type=Tensor { ty: Float32, shape: [-1, 128, -1], dimension_symbols: SymbolicDimensions(["batch_size", "", "T"]) } [2026-03-19][01:54:47][transcribe_rs::onnx::session][INFO] Model output: name=features_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) } [2026-03-19][01:54:47][transcribe_rs::decode::tokens][INFO] Loaded 1025 vocab tokens from "/Users/jackson/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v2-int8/vocab.txt" [2026-03-19][01:54:47][transcribe_rs::onnx::parakeet][INFO] Loaded vocabulary with 1025 tokens, blank_idx=1024 [2026-03-19][01:54:51][handy_app_lib::managers::transcription][INFO] Transcription completed in 218ms [2026-03-19][01:54:51][handy_app_lib::managers::transcription][INFO] Transcription result: Testing, testing, testing, 
hello, hello, hello, hello, testing, hello. [2026-03-19][01:54:51][handy_app_lib::managers::transcription][INFO] Immediately unloading model after transcription [2026-03-19][01:54:51][handy_app_lib::clipboard][INFO] Using paste method: CtrlV, delay: 60ms [2026-03-19][01:54:52][transcribe_rs::onnx::session][INFO] Loading int8 model: /Users/jackson/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v2-int8/encoder-model.int8.onnx [2026-03-19][01:54:52][transcribe_rs::onnx::session][INFO] Loading int8 model: /Users/jackson/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v2-int8/decoder_joint-model.int8.onnx [2026-03-19][01:54:52][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("MacBook Pro Microphone") Sample rate: 48000 Channels: 1 Format: F32 [2026-03-19][01:54:52][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 132.621667ms [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model input: name=audio_signal, type=Tensor { ty: Float32, shape: [-1, 128, -1], dimension_symbols: SymbolicDimensions(["audio_signal_dynamic_axes_1", "", "audio_signal_dynamic_axes_2"]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model input: name=length, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model output: name=outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["Transposeoutputs_dim_0", "", "Transposeoutputs_dim_2"]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model output: name=encoded_lengths, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model input: name=encoder_outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: 
SymbolicDimensions(["encoder_outputs_dynamic_axes_1", "", "encoder_outputs_dynamic_axes_2"]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model input: name=targets, type=Tensor { ty: Int32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["targets_dynamic_axes_1", "targets_dynamic_axes_2"]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model input: name=target_length, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model input: name=input_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_1_dynamic_axes_1", ""]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model input: name=input_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_2_dynamic_axes_1", ""]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model output: name=outputs, type=Tensor { ty: Float32, shape: [-1, -1, -1, 1030], dimension_symbols: SymbolicDimensions(["Addoutputs_dim_0", "Addoutputs_dim_1", "Addoutputs_dim_2", ""]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model output: name=prednet_lengths, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model output: name=output_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "Concatoutput_states_1_dim_1", ""]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model output: name=output_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "Concatoutput_states_2_dim_1", ""]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model input: name=waveforms, type=Tensor { ty: Float32, shape: [-1, 
-1], dimension_symbols: SymbolicDimensions(["batch_size", "N"]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model input: name=waveforms_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model output: name=features, type=Tensor { ty: Float32, shape: [-1, 128, -1], dimension_symbols: SymbolicDimensions(["batch_size", "", "T"]) } [2026-03-19][01:54:53][transcribe_rs::onnx::session][INFO] Model output: name=features_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) } [2026-03-19][01:54:53][transcribe_rs::decode::tokens][INFO] Loaded 1025 vocab tokens from "/Users/jackson/Library/Application Support/com.pais.handy/models/parakeet-tdt-0.6b-v2-int8/vocab.txt" [2026-03-19][01:54:53][transcribe_rs::onnx::parakeet][INFO] Loaded vocabulary with 1025 tokens, blank_idx=1024 [2026-03-19][01:54:54][handy_app_lib::managers::transcription][INFO] Model idle for 1s (limit: 0s), unloading [2026-03-19][01:54:54][handy_app_lib::managers::transcription][INFO] Model unloaded due to inactivity (took 66ms) ```

@cjpais
Owner Author

cjpais commented Mar 19, 2026

There's another PR, #1085, that should fix this. Would love review/testing there, @jacksongoode.

@cjpais
Owner Author

cjpais commented Mar 19, 2026

I think we can say this is good to go! Merging, thanks all for testing

@cjpais cjpais merged commit a301502 into main Mar 19, 2026
3 checks passed
@cjpais
Owner Author

cjpais commented Mar 20, 2026

@mawnir

mawnir commented Mar 20, 2026 via email

@cjpais
Owner Author

cjpais commented Mar 20, 2026

no worries, happy Eid!

@camlafit

Hello,

I’m on Debian Trixie with Wayland and GNOME.

I tried the .deb from the releases page and the latest build from this PR, but without success.
I can’t start any audio capture. The shortcut doesn’t seem to work. I tried with the default shortcut and also configured another one, but in all cases the audio capture does not start.

With the CLI command handy --toggle-transcription, I get the sound notification and the capture starts. From the tray icon I can paste the transcription into my notepad.
If I’ve understood correctly, it should paste directly into the active window, since the paste mode is set to “Direct”.

laptop ~/ handy 

[2026-03-20][10:36:02][arboard::platform::linux][WARN] Tried to initialize the wayland data control protocol clipboard, but failed. Falling back to the X11 clipboard protocol. The error was: Unknown error while interacting with the clipboard: A required Wayland protocol (ext-data-control, or wlr-data-control version 1) is not supported by the compositor
[2026-03-20][10:36:02][handy_app_lib::managers::history][INFO] Initializing database at "~/.local/share/com.pais.handy/history.db"
[2026-03-20][10:36:02][handy_app_lib::managers::transcription][INFO] Whisper accelerator set to: auto
[2026-03-20][10:36:02][handy_app_lib::managers::transcription][INFO] ORT accelerator set to: auto

(handy:2095059): libayatana-appindicator-WARNING **: 11:36:02.440: libayatana-appindicator is deprecated. Please use libayatana-appindicator-glib in newly written code.

** (handy:2095059): WARNING **: 11:36:02.448: It appears your Wayland compositor does not support the Layer Shell protocol
[2026-03-20][10:36:02][handy_app_lib::commands][INFO] Enigo initialized successfully after permission grant
[2026-03-20][10:36:02][handy_app_lib::commands][INFO] Shortcuts initialized successfully
ALSA lib pcm_oss.c:404:(_snd_pcm_oss_open) Cannot open device /dev/dsp
[...] Alsa error in loop 
[2026-03-20][10:36:22][handy_app_lib::tray][WARN] No transcription history entries available for tray copy.

During handy --toggle-transcription:

[...]
** (handy:2095582): CRITICAL **: 11:47:33.659: GtkWindow is not a layer surface. Make sure you called gtk_layer_init_for_window ()

** (handy:2095582): CRITICAL **: 11:47:33.659: GtkWindow is not a layer surface. Make sure you called gtk_layer_init_for_window ()
[2026-03-20][10:47:33][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("default")
Sample rate: 44100
Channels: 1
Format: F32
[2026-03-20][10:47:33][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 20.166933ms
[2026-03-20][10:47:50][handy_app_lib::tray][INFO] Copied last transcript to clipboard via tray.

** (handy:2095582): CRITICAL **: 11:47:58.491: GtkWindow is not a layer surface. Make sure you called gtk_layer_init_for_window ()

** (handy:2095582): CRITICAL **: 11:47:58.491: GtkWindow is not a layer surface. Make sure you called gtk_layer_init_for_window ()
[2026-03-20][10:47:59][handy_app_lib::managers::transcription][INFO] Transcription completed in 814ms
[2026-03-20][10:47:59][handy_app_lib::managers::transcription][INFO] Transcription result: Je parle et je continue à parler pour voir si cela fonctionne. Et forcer in Italiano, permet de dire, c'est capiché tout au Babenego.
[2026-03-20][10:47:59][handy_app_lib::clipboard][INFO] Using paste method: Direct, delay: 60ms
[2026-03-20][10:47:59][handy_app_lib::clipboard][INFO] Falling back to enigo for direct text input
[2026-03-20][10:47:59][enigo::platform::x11][WARN] fast text entry is not possible on X11
[2026-03-20][10:48:06][handy_app_lib::tray][INFO] Copied last transcript to clipboard via tray.
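
For triaging output like the log above, here is a minimal Rust sketch that splits lines of the form `[date][time][target][LEVEL] message` and keeps only the warnings. It is not part of Handy: `LogEntry` and `parse_line` are hypothetical names, and the line format is assumed from the excerpts in this thread. Lines that do not follow that shape (GTK, ALSA, or elided `[...]` output) are skipped.

```rust
/// One parsed log entry. The field layout is an assumption based on the
/// log excerpts in this thread: `[date][time][target][LEVEL] message`.
#[derive(Debug, PartialEq)]
struct LogEntry<'a> {
    date: &'a str,
    time: &'a str,
    target: &'a str,
    level: &'a str,
    message: &'a str,
}

/// Parse a single line; returns `None` when the line does not start with
/// four bracketed fields (for example GTK or ALSA output).
fn parse_line(line: &str) -> Option<LogEntry<'_>> {
    let mut rest = line;
    let mut fields = [""; 4];
    for field in fields.iter_mut() {
        // Each field must open with '[' and close with the next ']'.
        rest = rest.strip_prefix('[')?;
        let end = rest.find(']')?;
        *field = &rest[..end];
        rest = &rest[end + 1..];
    }
    Some(LogEntry {
        date: fields[0],
        time: fields[1],
        target: fields[2],
        level: fields[3],
        message: rest.trim_start(),
    })
}

fn main() {
    // Sample lines copied from the report above.
    let log = "\
[2026-03-20][10:47:59][handy_app_lib::clipboard][INFO] Using paste method: Direct, delay: 60ms
[2026-03-20][10:47:59][enigo::platform::x11][WARN] fast text entry is not possible on X11
ALSA lib pcm_oss.c:404:(_snd_pcm_oss_open) Cannot open device /dev/dsp";

    // Print only the WARN entries.
    for entry in log.lines().filter_map(parse_line).filter(|e| e.level == "WARN") {
        println!("{} {} {}: {}", entry.date, entry.time, entry.target, entry.message);
    }
}
```

Filtering on `level` this way makes it easier to spot entries such as the arboard and enigo X11 fallbacks without reading the full output.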

@cjpais
Owner Author

cjpais commented Mar 20, 2026

Hi @camlafit

Is this happening only with the new version or also in the older versions?

Can you also let me know the model being used? Does it happen with whisper and parakeet?

@mawnir

mawnir commented Mar 21, 2026

test this build as well and let me

Yes it works well on Intel Mac
