Add general stdin/stdout command filter for transcription post-processing #739

Closed
NightMachinery wants to merge 2 commits into cjpais:main from NightMachinery:codex/command-filter-stdin-stdout

Conversation

@NightMachinery

@NightMachinery NightMachinery commented Feb 8, 2026

Before Submitting This PR

Please confirm you have done the following:

Human Written Description

This PR allows setting any general program as the post-processor. It feeds the transcribed text into the given program's stdin and uses whatever the program writes to stdout as the final output.

The advantage is that this allows any kind of post-processing. The disadvantage is that users must write the post-processing logic entirely themselves.
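
To make the stdin/stdout contract concrete, here is a minimal sketch of what such a post-processor could look like. It is not part of this PR: the `tidy` transform and the `run_filter` helper are hypothetical names, and Python is used only for illustration, since any executable that reads stdin and writes stdout would do.

```python
import io
import sys


def tidy(text: str) -> str:
    """Hypothetical transform: collapse runs of whitespace and trim the ends."""
    return " ".join(text.split())


def run_filter(stdin, stdout) -> None:
    # Read the whole transcription, transform it, and emit the result.
    stdout.write(tidy(stdin.read()))


# As a standalone script, finish with: run_filter(sys.stdin, sys.stdout)
# Here in-memory streams stand in for the pipes, so the sketch runs anywhere.
demo_out = io.StringIO()
run_filter(io.StringIO("  hello   world \n"), demo_out)
print(demo_out.getvalue())  # hello world
```

Taking the streams as parameters keeps the transform testable without a real pipe.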

I use it to wrap the transcribed text when interacting with LLM agents such as Codex:

```speech-to-text
...dictated text...
```

My post-processor even detects which app is in focus and adapts its post-processing logic accordingly.

The code was written entirely by the Codex App 5.3 Extra-High, but I drove the UX design choices. I haven't looked at the code diff yet. I am not familiar with Rust or the rest of the stack. I did manually test the feature, and it works.

Related Issues/Discussions

Fixes #
Discussion:

Community Feedback

Testing

I manually tested it.

Screenshots/Videos (if applicable)

AI Assistance

  • [ ] No AI was used in this PR
  • [x] AI was used (please describe below)

If AI was used:

  • Tools used: Codex App 5.3 Extra-High
  • How extensively: All code changes

The rest of this PR description was written by Codex:

Summary

This PR adds a general local command filter pipeline to Handy so users can run any executable that:

  • reads transcription text from stdin
  • writes transformed text to stdout

The filter is configurable from the Post Process page and can be applied to normal hotkey, post-process hotkey, or both.

Why this is valuable

This makes Handy a composable STT tool, not just a fixed pipeline:

  • integrate with existing scripts/tools you already trust
  • keep transformations local and fast
  • support highly custom workflows without shipping app-specific logic for each one

In practice, this unlocks lightweight automation that sits between raw dictation and paste.

Real example (from my usage)

I use a wrapper filter script that:

  1. reads stdin from Handy
  2. checks current app/window focus
  3. conditionally wraps dictated text in a fenced block for LLM tools

When I’m focused in ChatGPT/Gemini/Codex contexts, it outputs:

```speech-to-text
...dictated text...
```

Otherwise, it passes text through unchanged.

That lets downstream instructions treat STT input differently (e.g. typo-aware cleanup) while keeping normal app usage unaffected.
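
The wrapper described above might be sketched roughly as follows. The actual script is not shown in this PR, so everything here is an assumption: the `wrap_for_app` name, the `LLM_APPS` set, and the idea that the focused app's name arrives as a plain string. Detecting the focused app is platform-specific and is left out.

```python
FENCE = "`" * 3  # three backticks, built this way so the example nests cleanly

# Hypothetical set of app names that should receive the fenced form.
LLM_APPS = {"ChatGPT", "Gemini", "Codex"}


def wrap_for_app(text: str, focused_app: str) -> str:
    """Wrap dictated text in a speech-to-text fence for LLM apps only."""
    if focused_app in LLM_APPS:
        return f"{FENCE}speech-to-text\n{text}\n{FENCE}"
    # Every other app gets the transcription unchanged.
    return text


print(wrap_for_app("fix the login bug", "Codex"))
print(wrap_for_app("see you at noon", "Mail"))
```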

What changed

Backend

  • Added new settings fields:
    • command_filter_enabled
    • command_filter_scope (transcribe | post_process | both)
    • command_filter_order (before_llm | after_llm)
    • command_filter_executable
    • command_filter_args
    • command_filter_timeout_ms
  • Added helper logic for when secondary shortcut registration is needed.
  • Added src-tauri/src/command_filter.rs:
    • executes executable + args directly (no shell string execution)
    • writes exact transcription bytes to stdin
    • captures stdout/stderr
    • enforces timeout
    • returns applied / failed / empty-cancel states
    • expands ~ / ~/... in executable and args to home dir before execution
  • Integrated filter in actions.rs with configurable order relative to LLM.
  • Trimmed-empty stdout cancels paste while preserving original transcription in history.
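
The behavior listed for command_filter.rs can be approximated in a few lines. This is a Python sketch of the described semantics, not the Rust code from the PR. The names `FilterResult` and `run_command_filter` are hypothetical, and treating a non-zero exit code as a failure is an assumption the description does not confirm.

```python
import os
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class FilterResult:
    status: str  # "applied", "failed", or "empty_cancel"
    text: str    # text to carry forward (the input again unless applied)


def run_command_filter(executable, args, text, timeout_ms):
    # Expand a leading ~ in the executable and in each argument.
    # (os.path.expanduser also expands ~user, which the PR does not mention.)
    argv = [os.path.expanduser(p) for p in [executable, *args]]
    try:
        # A list argv with the default shell=False runs the program directly.
        proc = subprocess.run(
            argv,
            input=text.encode("utf-8"),
            capture_output=True,
            timeout=timeout_ms / 1000,
        )
    except (OSError, subprocess.TimeoutExpired):
        return FilterResult("failed", text)
    if proc.returncode != 0:  # assumption: non-zero exit counts as failure
        return FilterResult("failed", text)
    out = proc.stdout.decode("utf-8", errors="replace")
    if not out.strip():
        return FilterResult("empty_cancel", text)
    return FilterResult("applied", out)


# Demo: use the running Python interpreter as the "filter" executable.
upper = ["-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
result = run_command_filter(sys.executable, upper, "hello", 5000)
print(result.status, result.text)  # applied HELLO
```

Passing the arguments as a list rather than one command string is what avoids shell interpretation of the transcription or the args.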

Shortcut registration

transcribe_with_post_process now registers if either:

  • AI post-processing is enabled, or
  • command filter is enabled and scope includes post_process.
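
The registration rule above reduces to a small predicate. A sketch, with a hypothetical function name and assuming that the `both` scope counts as including post_process:

```python
def needs_post_process_shortcut(ai_enabled: bool, filter_enabled: bool, scope: str) -> bool:
    """Decide whether the secondary (post-process) shortcut should be registered."""
    # The filter only matters here when its scope covers the post-process hotkey.
    filter_applies = filter_enabled and scope in ("post_process", "both")
    return ai_enabled or filter_applies


print(needs_post_process_shortcut(False, True, "transcribe"))  # False
print(needs_post_process_shortcut(False, True, "both"))        # True
```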

Commands + bindings

Added Tauri commands:

  • change_command_filter_enabled_setting
  • change_command_filter_scope_setting
  • change_command_filter_order_setting
  • change_command_filter_executable_setting
  • change_command_filter_args_setting
  • change_command_filter_timeout_setting

Wired in lib.rs and updated src/bindings.ts.

Frontend/UI

  • Post Process sidebar section is always visible.
  • Removed old Post Processing toggle from Advanced > Experimental.
  • Added Post Process > Modes controls:
    • AI Post-Processing toggle
    • Command Filter toggle
  • Added Post Process > Command Filter controls:
    • scope
    • order
    • executable
    • args (one per line)
    • timeout (ms)
  • Updated settings store mapping for new settings.

i18n

  • Added settings.postProcessing.modes.*
  • Added settings.postProcessing.commandFilter.*
  • Added keys across all locale files for consistency.
  • Updated secondary hotkey description text.

Documentation

  • Added: docs/PRs/general_postprocess.md

Behavior details

  • Filter failures: fall back to the previous text.
  • Trimmed-empty stdout: cancel paste.
  • History on empty-cancel: still stores original transcription.
  • Chinese conversion remains before filter stage(s).
  • LLM stage still only runs when AI post-processing is enabled and secondary hotkey is used.
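
One way to read these rules together is as a small staged pipeline. The sketch below is an interpretation, not the code in actions.py or actions.rs: `post_process` is a hypothetical name, the filter and LLM are modeled as plain callables, a filter failure is modeled as an exception, and Chinese conversion is assumed to have already run on the input. The caller would keep the original transcription for history.

```python
from typing import Callable, Optional


def post_process(
    transcription: str,
    command_filter: Callable[[str], str],
    llm: Optional[Callable[[str], str]],
    order: str,  # "before_llm" or "after_llm"
) -> Optional[str]:
    """Return the text to paste, or None when the paste is cancelled."""
    stages = [command_filter]
    if llm is not None:
        # Place the LLM stage on the correct side of the filter.
        stages = [command_filter, llm] if order == "before_llm" else [llm, command_filter]
    text = transcription
    for stage in stages:
        if stage is command_filter:
            try:
                result = stage(text)
            except Exception:
                continue  # filter failure: fall back to the previous text
            if not result.strip():
                return None  # trimmed-empty output cancels the paste
            text = result
        else:
            text = stage(text)
    return text


shout = lambda t: t.upper()
tag = lambda t: f"[llm] {t}"
print(post_process("hello", shout, tag, "before_llm"))  # [llm] HELLO
print(post_process("hello", shout, tag, "after_llm"))   # [LLM] HELLO
```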

Validation

All passed:

  • cargo check -q
  • cargo test -q
  • bun run lint
  • bun run check:translations
  • bun run build
  • bun run format:check

@VirenMohindra
Contributor

hey @NightMachinery, appreciate the detailed writeup. the summary is thorough and it's clear a lot of work went into this. however, we'd need this to follow the PR template in .github/PULL_REQUEST_TEMPLATE.md before we can review. the project has a lot of open PRs and inflight work right now so following the template helps us triage consistently

specifically what's missing:

  • the checklist confirmations (searched existing issues/PRs, read CONTRIBUTING.md)
  • a human written description section — a few sentences in your own words about what problem you noticed and why this matters (separate from the technical summary)
  • related issues / discussions
  • the AI assistance disclosure checkbox
  • screenshots (if needed)

the technical summary you have is great and can stay, just needs the template sections wrapped around it

@NightMachinery
Author

NightMachinery commented Feb 8, 2026

@VirenMohindra Hi! Thanks for the heads up. I added the template before the previous text.

@cjpais
Owner

cjpais commented Feb 9, 2026

Can you provide screenshots?

This is a large change, I would generally prefer community support for something like this before a PR is submitted.

@NightMachinery
Author

> Can you provide screenshots?
>
> This is a large change, I would generally prefer community support for something like this before a PR is submitted.

The only visible changes are these additions to the post-processing panel. This panel is now always shown.

[two screenshots of the post-processing panel]

@cjpais
Owner

cjpais commented Feb 9, 2026

Uhhhhhh okay, this is not going to be merged anytime soon. Mainly it's far too advanced and specific to be generally distributed I feel. You need to collect community support if you want this

Maybe in v2 we will have something like this but it will likely be a more agentic flow

@cjpais cjpais closed this Feb 9, 2026
@danlamanna

FWIW, I would be in favor of a feature like this. One use case I can think of is making the output more casual in an instant messaging context. This is trivial and instantaneous for a scripting language but slower and more brittle for LLM post processing.
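
As a rough illustration of how small such a "casual" filter could be, here is a sketch. The specific rules (drop one trailing period, lowercase the first letter) and the name `casualize` are hypothetical choices, not something proposed in this thread.

```python
def casualize(text: str) -> str:
    """Hypothetical 'casual chat' rule: lowercase the first letter, drop one trailing period."""
    text = text.strip()
    # Remove a single final period, but leave an ellipsis alone.
    if text.endswith(".") and not text.endswith(".."):
        text = text[:-1]
    # Only lowercase when the second character is lowercase, so "I" and acronyms survive.
    if len(text) > 1 and text[0].isupper() and text[1].islower():
        text = text[0].lower() + text[1:]
    return text


print(casualize("Sounds good, see you at five."))  # sounds good, see you at five
print(casualize("I will be late."))                # I will be late
```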

AlexanderYastrebov added a commit to AlexanderYastrebov/Handy that referenced this pull request Mar 1, 2026
Add support for transcription hook - an executable script in app's data directory.

If `transcription_hook` file exists, Handy runs it passing transcription text via stdin and
uses script stdout as a transcription result.

This approach is a flexible extension point for advanced users
(which nowadays means with access to coding LLM) akin to git hooks.

Here are some possible scenarios:
* simple transcription modifications
* a pipeline involving LLM processing, language detection and translation
* custom paste method (as Handy does nothing if transcription is empty)
* conditional processing based on the active application waiting for the input

See related:
* cjpais#168
* cjpais#739
* cjpais#638
* cjpais#455
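
The hook mechanism in this commit message could be sketched as below. This is a Python approximation under stated assumptions, not the commit's code: `apply_transcription_hook` is a hypothetical name, the executable-bit check is an assumption (the message only says the file must exist), and error handling is not modeled. Later revisions of the message name the file `hooks/transcription` instead.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def apply_transcription_hook(data_dir: Path, text: str) -> str:
    """Run the hook if present and return its stdout, else return the text as-is."""
    hook = data_dir / "transcription_hook"
    # Assumption: a missing or non-executable hook leaves the text untouched.
    if not hook.is_file() or not os.access(hook, os.X_OK):
        return text
    # Feed the transcription on stdin and take stdout as the result.
    proc = subprocess.run([str(hook)], input=text, capture_output=True, text=True)
    return proc.stdout


# Demo with an empty data directory: no hook, so the text passes through.
with tempfile.TemporaryDirectory() as d:
    unchanged = apply_transcription_hook(Path(d), "hello")
print(unchanged)  # hello
```

Using file presence as the on/off switch means no settings UI is needed, which is the main contrast with the settings-driven filter in this PR.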
@AlexanderYastrebov AlexanderYastrebov mentioned this pull request Mar 1, 2026
6 tasks
AlexanderYastrebov added a commit to AlexanderYastrebov/Handy that referenced this pull request Mar 1, 2026
Add support for transcription hook - an executable script in app's data directory.

If `transcription_hook` file exists, Handy runs it passing transcription text via stdin and
uses script stdout as a transcription result.

This approach is a flexible extension point for advanced users
(which nowadays means with access to coding LLM) akin to git hooks.

Here are some possible scenarios:
* simple transcription modifications
* a pipeline involving LLM processing, language detection and translation
* custom paste method (as Handy does nothing if transcription is empty)
* conditional processing based on the active application waiting for the input

See related:
* cjpais#168
* cjpais#162
* cjpais#916
* cjpais#911
* cjpais#834
* cjpais#847
* cjpais#833
* cjpais#662
* cjpais#601
* cjpais#335
* cjpais#162
* cjpais#739
* cjpais#638
* cjpais#455
* cjpais#157
AlexanderYastrebov added a commit to AlexanderYastrebov/Handy that referenced this pull request Mar 1, 2026
Add support for transcription hook - an executable script in app's data directory.

If `transcription_hook` file exists, Handy runs it passing transcription text via stdin and
uses script stdout as a transcription result.

This approach is a flexible extension point for advanced users
(which nowadays means with access to coding LLM) akin to git hooks.

Here are some possible scenarios:
* simple transcription modifications
* a pipeline involving LLM processing, language detection and translation
* custom paste method (as Handy does nothing if transcription is empty)
* conditional processing based on the active application waiting for the input

See related:
* cjpais#168
* cjpais#162
* cjpais#916
* cjpais#911
* cjpais#834
* cjpais#847
* cjpais#833
* cjpais#662
* cjpais#601
* cjpais#335
* cjpais#739
* cjpais#638
* cjpais#455
* cjpais#157
AlexanderYastrebov added a commit to AlexanderYastrebov/Handy that referenced this pull request Mar 3, 2026
AlexanderYastrebov added a commit to AlexanderYastrebov/Handy that referenced this pull request Mar 5, 2026
Add support for transcription hook - an executable script in app's data directory.

If `hooks/transcription` file exists, Handy runs it passing transcription text via stdin and
uses script stdout as a transcription result.

This approach is a flexible extension point for advanced users
(which nowadays means with access to coding LLM) akin to git hooks.

Here are some possible scenarios:
* simple transcription modifications
* a pipeline involving LLM processing, language detection and translation
* custom paste method (as Handy does nothing if transcription is empty)
* conditional processing based on the active application waiting for the input

See related:
* cjpais#168
* cjpais#162
* cjpais#916
* cjpais#911
* cjpais#834
* cjpais#847
* cjpais#833
* cjpais#662
* cjpais#601
* cjpais#335
* cjpais#739
* cjpais#638
* cjpais#455
* cjpais#157
AlexanderYastrebov added a commit to AlexanderYastrebov/Handy that referenced this pull request Mar 26, 2026

4 participants