Add general stdin/stdout command filter for transcription post-processing#739
NightMachinery wants to merge 2 commits into cjpais:main
Conversation
hey @NightMachinery, appreciate the detailed writeup. the summary is thorough and it's clear a lot of work went into this. however, we'd need this to follow the PR template in specifically what's missing~
the technical summary you have is great and can stay, just needs the template sections wrapped around it

@VirenMohindra Hi! Thanks for the heads up. I added the template before the previous text.

Can you provide screenshots? This is a large change, I would generally prefer community support for something like this before a PR is submitted.

Uhhhhhh okay, this is not going to be merged anytime soon. Mainly it's far too advanced and specific to be generally distributed I feel. You need to collect community support if you want this. Maybe in v2 we will have something like this but it will likely be a more agentic flow

FWIW, I would be in favor of a feature like this. One use case I can think of is making the output more casual in an instant messaging context. This is trivial and instantaneous for a scripting language but slower and more brittle for LLM post processing.
Add support for a transcription hook: an executable script in the app's data directory. If a `hooks/transcription` file exists, Handy runs it, passing the transcription text via stdin, and uses the script's stdout as the transcription result. This approach is a flexible extension point for advanced users (which nowadays means anyone with access to a coding LLM), akin to git hooks.

Here are some possible scenarios:
* simple transcription modifications
* a pipeline involving LLM processing, language detection and translation
* custom paste method (as Handy does nothing if the transcription is empty)
* conditional processing based on the active application waiting for the input

See related:
* cjpais#168
* cjpais#162
* cjpais#916
* cjpais#911
* cjpais#834
* cjpais#847
* cjpais#833
* cjpais#662
* cjpais#601
* cjpais#335
* cjpais#739
* cjpais#638
* cjpais#455
* cjpais#157


Before Submitting This PR
Please confirm you have done the following:
Human Written Description
This PR lets any program act as the post-processor. Handy feeds the transcribed text to the program's stdin and uses whatever the program writes to stdout as the final output.
The advantage is that any kind of post-processing becomes possible. The disadvantage is that users must write the post-processing themselves.
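To make the stdin/stdout contract concrete, here is a minimal sketch of what such a post-processor could look like. It is not part of this PR: the filler-word list and the names `clean` and `filter_stream` are made up for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical post-processor: transcript in, cleaned transcript out."""
import re
import sys
from typing import TextIO

# Illustrative filler words (an assumption, not something Handy defines).
FILLERS = re.compile(r"\b(?:um+|uh+|erm)\b[,.]?\s*", re.IGNORECASE)


def clean(text: str) -> str:
    """Drop filler words, then collapse runs of whitespace."""
    without_fillers = FILLERS.sub("", text)
    return re.sub(r"\s+", " ", without_fillers).strip()


def filter_stream(src: TextIO, dst: TextIO) -> None:
    """Read the whole transcript from src and write the cleaned text to dst."""
    dst.write(clean(src.read()))
```

Saved as an executable script, the file would end with `filter_stream(sys.stdin, sys.stdout)` so that it reads the transcript from stdin and writes the result to stdout.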
I use it to wrap the transcribed text when interacting with LLM agents such as Codex:
My post-processor even detects which app is in focus and adapts its post-processing logic accordingly.
The code was written entirely by the Codex App 5.3 Extra-High, but I drove the UX design choices. I haven't looked at the code diff yet. I am not familiar with Rust or the rest of the stack. I did manually test the feature, and it works.
Related Issues/Discussions
Fixes #
Discussion:
Community Feedback
Testing
I manually tested it.
Screenshots/Videos (if applicable)
AI Assistance
If AI was used:
The rest of the PR is written by Codex:
Summary
This PR adds a general local command filter pipeline to Handy so users can run any executable that:
The filter is configurable from the Post Process page and can be applied to the normal hotkey, the post-process hotkey, or both.
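A rough sketch of how the scope and order options could interact, using the values listed under What changed (`transcribe|post_process|both` and `before_llm|after_llm`). This is not the PR's Rust code. The function names are hypothetical, and it assumes the LLM step only runs on the post-process hotkey.

```python
from typing import Callable, List

Step = Callable[[str], str]


def build_pipeline(hotkey: str, scope: str, order: str,
                   command_filter: Step, llm: Step) -> List[Step]:
    """Return the ordered post-processing steps for one transcription.

    hotkey: "transcribe" or "post_process" (which shortcut fired)
    scope:  "transcribe", "post_process" or "both"
    order:  "before_llm" or "after_llm"
    """
    steps: List[Step] = []
    # Assumption: only the post-process hotkey runs the LLM step.
    if hotkey == "post_process":
        steps.append(llm)
    # The filter applies when its scope covers the hotkey that fired.
    if scope == "both" or scope == hotkey:
        if order == "before_llm":
            steps.insert(0, command_filter)
        else:
            steps.append(command_filter)
    return steps


def run_pipeline(text: str, steps: List[Step]) -> str:
    """Feed the text through each step in order."""
    for step in steps:
        text = step(text)
    return text
```

With a filter that wraps text in brackets and a stand-in LLM step that appends `?`, `before_llm` yields `[hi]?` and `after_llm` yields `[hi?]` for the input `hi`.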
Why this is valuable
This makes Handy a composable STT tool, not just a fixed pipeline:
In practice, this unlocks lightweight automation that sits between raw dictation and paste.
Real example (from my usage)
I use a wrapper filter script that:
When I’m focused in ChatGPT/Gemini/Codex contexts, it outputs:
Otherwise, it passes text through unchanged.
That lets downstream instructions treat STT input differently (e.g. typo-aware cleanup) while keeping normal app usage unaffected.
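The focus-dependent behavior described here could look roughly like the sketch below. Everything specific in it is hypothetical: the `<stt>` tag is a placeholder rather than the wrapper format actually used, the app names are examples, and how the focused app is detected is left to the caller because it is platform-specific.

```python
# Hypothetical set of app names that should receive wrapped text.
LLM_APPS = {"chatgpt", "gemini", "codex"}


def wrap_for_app(text: str, focused_app: str) -> str:
    """Wrap the transcript for LLM apps; pass it through unchanged otherwise."""
    if focused_app.strip().lower() in LLM_APPS:
        # Placeholder tag (an assumption, not the author's real format).
        return "<stt>" + text.strip() + "</stt>"
    return text
```

A filter script built on this would read the transcript from stdin, look up the focused app by whatever means the platform offers, and write `wrap_for_app(...)` to stdout.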
What changed
Backend
* `command_filter_enabled`
* `command_filter_scope` (`transcribe|post_process|both`)
* `command_filter_order` (`before_llm|after_llm`)
* `command_filter_executable`
* `command_filter_args`
* `command_filter_timeout_ms`
* `src-tauri/src/command_filter.rs`:
  * Runs `executable + args` directly (no shell string execution)
  * Expands `~` / `~/...` in executable and args to home dir before execution
* Hooked into `actions.rs` with configurable order relative to LLM.

Shortcut registration
`transcribe_with_post_process` now registers if either: ... `post_process`.

Commands + bindings
Added Tauri commands:
* `change_command_filter_enabled_setting`
* `change_command_filter_scope_setting`
* `change_command_filter_order_setting`
* `change_command_filter_executable_setting`
* `change_command_filter_args_setting`
* `change_command_filter_timeout_setting`

Wired in `lib.rs` and updated `src/bindings.ts`.

Frontend/UI

i18n
* `settings.postProcessing.modes.*`
* `settings.postProcessing.commandFilter.*`

Documentation
* `docs/PRs/general_postprocess.md`

Behavior details
Validation
All passed:
* `cargo check -q`
* `cargo test -q`
* `bun run lint`
* `bun run check:translations`
* `bun run build`
* `bun run format:check`
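For readers who want a feel for the execution model in the Backend notes (executable plus args run directly with no shell, `~` expansion, a timeout in milliseconds), here is a small Python sketch of the same idea. It is not the Rust code in `command_filter.rs`. The function names are hypothetical, and falling back to the original text on failure is an assumption, since the behavior details are not shown here.

```python
import os
import subprocess
import sys
from typing import List


def expand_home(value: str) -> str:
    """Expand a leading `~` or `~/...` to the home dir; leave other values alone."""
    if value == "~" or value.startswith("~/"):
        return os.path.expanduser(value)
    return value


def run_filter(executable: str, args: List[str], text: str, timeout_ms: int) -> str:
    """Pipe text through an external command and return its stdout."""
    # An argv list with no shell involved: arguments are never re-parsed as shell text.
    argv = [expand_home(executable)] + [expand_home(arg) for arg in args]
    try:
        result = subprocess.run(
            argv,
            input=text,
            capture_output=True,
            text=True,
            timeout=timeout_ms / 1000,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Assumption: on a missing binary, non-zero exit or timeout,
        # keep the unfiltered transcript.
        return text
    return result.stdout
```

A call such as `run_filter("~/bin/my-filter", ["--mode", "chat"], transcript, 2000)` (with a made-up path and flag) would expand the home directory, run the script with a two-second limit, and hand back whatever it printed.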