Voice Plugin

Audio feedback for Claude Code using pocket-tts.

When the Claude Code agent completes a task, the plugin speaks a short summary of what was accomplished.

Recommended: Speech-to-Text Companion

For a complete voice workflow, pair this TTS plugin with Handy (open-source) using the Parakeet V3 model for speech-to-text. It's stunningly fast with near-instant transcription.

The slight accuracy drop compared to larger models is immaterial when talking to an AI. Pro tip: ask the agent to restate what it understood. This confirms that the transcription came through correctly and helps keep the CLI agent on track.

Requirements

uv (for running pocket-tts via uvx)
macOS (with afplay) or Linux (with aplay or paplay)
Recommended: FFmpeg (provides ffplay for lower-latency streaming audio)

Installation

Install from the cctools-plugins marketplace:

claude plugin add voice

How It Works

Architecture Overview

The plugin uses a multi-hook strategy to get fast, reliable voice summaries:

UserPromptSubmit hook     →  Injects full voice instructions each turn
         ↓
PostToolUse hook          →  Short reminder after each tool call
         ↓
Agent generates 📢 marker →  "📢 Done, fixed the auth bug!"
         ↓
Stop hook extracts it     →  Instant playback (no API call!)
         ↓
[Fallback: headless Claude if agent forgets the marker]

The Hooks

UserPromptSubmit hook — Silently injects voice instructions at the start of each turn, telling Claude to end longer responses with a 📢 spoken summary. Uses additionalContext for silent injection (no terminal noise).
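
As a rough illustration of silent injection, the sketch below builds the JSON a UserPromptSubmit hook could print to stdout. It assumes Claude Code's hook output shape (a hookSpecificOutput object carrying additionalContext); the instruction text and the build_hook_output helper are hypothetical and are not taken from the plugin's source.

```python
import json

# Hypothetical instruction text; the plugin's real wording is not shown in this README.
VOICE_INSTRUCTIONS = (
    "End longer responses with a line starting with 📢 "
    "followed by a short spoken summary."
)


def build_hook_output(instructions: str) -> str:
    """Build the JSON string a UserPromptSubmit hook prints to add context silently.

    Assumption: the hook runner reads hookSpecificOutput.additionalContext from stdout.
    """
    payload = {
        "hookSpecificOutput": {
            "hookEventName": "UserPromptSubmit",
            "additionalContext": instructions,
        }
    }
    # ensure_ascii=False keeps the 📢 marker readable in the emitted JSON.
    return json.dumps(payload, ensure_ascii=False)


if __name__ == "__main__":
    print(build_hook_output(VOICE_INSTRUCTIONS))
```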

PostToolUse hook — Injects a brief reminder after each tool call to keep the voice instructions fresh during long tool chains where Claude might forget.

Stop hook — When the agent stops, this hook:

Checks if voice is enabled (via ~/.claude/voice.local.md)
Looks for a 📢 marker in the last assistant message (instant extraction)
If no marker but response is short (≤25 words), speaks it directly
Falls back to headless Claude summarization only if needed
Plays the audio via pocket-tts
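
The decision order above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: extract_marker_summary, choose_spoken_text, and the summarize callback (standing in for the headless Claude fallback) are hypothetical names.

```python
from typing import Callable, Optional

MAX_SPOKEN_WORDS = 25  # limit named in this README
MARKER = "📢"


def extract_marker_summary(message: str) -> Optional[str]:
    """Return the text after the marker on the last line that contains it, or None."""
    for line in reversed(message.splitlines()):
        if MARKER in line:
            summary = line.split(MARKER, 1)[1].strip()
            return summary or None
    return None


def choose_spoken_text(message: str, summarize: Callable[[str], str]) -> str:
    """Pick what to speak: marker first, then short responses, then the fallback."""
    summary = extract_marker_summary(message)
    if summary:
        return summary  # instant extraction, no extra API call
    if len(message.split()) <= MAX_SPOKEN_WORDS:
        return message.strip()  # short enough to speak directly
    return summarize(message)  # hypothetical stand-in for headless summarization
```

For example, choose_spoken_text("Patched login.\n📢 Done, fixed the auth bug!", summarize) returns the marker text without ever calling summarize.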

Word Limits

Short responses (≤25 words): Spoken directly, no summary needed
Explicit summaries (📢 marker or headless Claude): Flexible 1.5× limit (37 words)
Last resort truncation: Strict limit (25 words)

The base limit (25 words by default) is configurable via MAX_SPOKEN_WORDS in hooks/voice_common.py.
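
The arithmetic behind these limits can be sketched as below. How the flexible and strict limits interact is one reading of the list above, not confirmed behavior: here an explicit summary passes untouched up to the flexible limit, and anything else is cut to the strict limit. The helper names are hypothetical.

```python
MAX_SPOKEN_WORDS = 25                         # strict limit
FLEXIBLE_LIMIT = int(MAX_SPOKEN_WORDS * 1.5)  # 1.5 x 25 = 37.5, rounded down to 37


def truncate_words(text: str, limit: int) -> str:
    """Keep at most `limit` whitespace-separated words."""
    words = text.split()
    if len(words) <= limit:
        return text.strip()
    return " ".join(words[:limit])


def apply_word_limit(text: str, explicit_summary: bool) -> str:
    """Apply the flexible limit to explicit summaries, the strict limit otherwise.

    Assumption: a summary longer than the flexible limit falls back to
    strict truncation (the "last resort" case).
    """
    if explicit_summary and len(text.split()) <= FLEXIBLE_LIMIT:
        return text.strip()
    return truncate_words(text, MAX_SPOKEN_WORDS)
```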

The `/voice:speak` Command

Control voice feedback with the slash command:

/voice:speak - Enable voice feedback
/voice:speak <voice> - Set voice (e.g., azure, alba) and enable
/voice:speak stop - Disable voice feedback
/voice:speak prompt <text> - Set custom instruction for summaries
/voice:speak prompt - Clear custom prompt

Config is stored in ~/.claude/voice.local.md.

Custom Prompts

Use custom prompts to personalize how summaries are delivered:

# Be more enthusiastic
/voice:speak prompt "be upbeat and encouraging"

# Keep it ultra-brief
/voice:speak prompt "use 5 words or less"

# Add a sign-off
/voice:speak prompt "always end with 'back to you, boss'"

The custom prompt is appended as an additional instruction to the summarizer.
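
One way to picture "appended as an additional instruction" is the sketch below. The base prompt wording and the build_summarizer_prompt helper are hypothetical; the plugin's real summarizer prompt is not shown in this README.

```python
from typing import Optional

# Hypothetical base instruction, used only for illustration.
BASE_PROMPT = "Summarize the assistant's last message in one short spoken sentence."


def build_summarizer_prompt(custom_prompt: Optional[str] = None) -> str:
    """Append the user's custom prompt, if any, to the base instruction."""
    custom = (custom_prompt or "").strip()
    if not custom:
        return BASE_PROMPT  # a cleared or empty custom prompt changes nothing
    return f"{BASE_PROMPT}\nAdditional instruction: {custom}"
```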

The `say` Script

The scripts/say script is a standalone TTS utility that:

Checks if the pocket-tts server is running
Starts the server if needed (first run may take ~30-60 seconds)
Sends text to the TTS endpoint
Plays the generated audio

Standalone Usage

You can use the say script directly from the command line:

# Basic usage
./scripts/say "Hello, world!"

# With a specific voice
./scripts/say --voice azure "Hello, world!"

# Show help
./scripts/say --help

Environment Variables

TTS_HOST: TTS server host (default: localhost)
TTS_PORT: TTS server port (default: 8000)
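
To check from Python whether anything is listening at the configured address, a sketch along these lines may help. It only reads the two variables documented above and attempts a TCP connection; it does not call any pocket-tts endpoint, and the function names are hypothetical.

```python
import os
import socket
from typing import Mapping, Optional, Tuple


def tts_address(env: Optional[Mapping[str, str]] = None) -> Tuple[str, int]:
    """Resolve host and port from TTS_HOST / TTS_PORT with the documented defaults."""
    env = os.environ if env is None else env
    host = env.get("TTS_HOST", "localhost")
    port = int(env.get("TTS_PORT", "8000"))
    return host, port


def server_is_listening(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host, port = tts_address()
    state = "up" if server_is_listening(host, port) else "not reachable"
    print(f"TTS server at {host}:{port} is {state}")
```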

Disabling

Disable voice feedback temporarily:

/voice:speak stop

Or uninstall the plugin entirely:

claude plugin remove voice

Troubleshooting

Server won't start

Check the server log:

cat /tmp/pocket-tts-server.log

No audio playing

macOS: Ensure afplay is available (built-in)
Linux: Ensure aplay or paplay is installed
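
A quick way to see which of these players is actually on your PATH is sketched below. The function names are hypothetical, and the platform check is a simplification that treats every non-macOS system as Linux.

```python
import shutil
import sys
from typing import Callable, List, Optional


def candidate_players(platform: Optional[str] = None) -> List[str]:
    """Return the player commands this README names for the given platform."""
    platform = sys.platform if platform is None else platform
    if platform == "darwin":
        return ["afplay"]
    return ["aplay", "paplay"]  # simplification: anything else is treated as Linux


def available_players(
    platform: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Filter the candidates down to those found on PATH."""
    return [name for name in candidate_players(platform) if which(name)]


if __name__ == "__main__":
    found = available_players()
    print("Audio players found:", ", ".join(found) if found else "none")
```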

Slow audio playback

If there's a noticeable delay before audio starts, install FFmpeg to enable streaming mode:

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg

With FFmpeg installed, audio streams directly to ffplay as it's generated, reducing latency. Without it, the script waits for the full audio file before playing.
