A foundational AI system prompt designed for cleaning and improving transcribed voice notes while preserving essential content and detail.
This repository contains a carefully crafted system prompt for processing speech-to-text (STT) transcriptions. The agent serves as a foundational text cleanup tool that can be used standalone or as the first step in chained AI workflows.
The Voice Note Redaction Agent addresses a critical challenge in voice-to-text processing: creating cleaner, more coherent text while preserving all important details. This balance is essential for workflows where content loss would be a dealbreaker.
- Text Cleanup: Removes speech artifacts ("uhm", "ehm", etc.) and transcription errors
- Structure Enhancement: Adds proper formatting, headers, and markdown structure
- Content Preservation: Maintains all important details and context
- Metadata Generation: Creates titles and summaries alongside cleaned text
- Chain-Ready Output: Produces structured output suitable for downstream AI agents
Processing voice notes presents a unique AI challenge:
- Too Aggressive: Risk losing important details or context
- Too Conservative: Leave in distracting artifacts and poor structure
- Balance Required: Clean up imperfections while preserving meaning
This system prompt is specifically tuned to err on the side of preservation while still providing meaningful improvements.
- Voice Note Processing: Initial cleanup of STT output from tools like Whisper
- Workflow Foundation: First step in multi-agent processing chains
- Content Preparation: Preparing voice notes for further specialized processing
The agent is designed to work well in sequences:
- Voice Note Redaction Agent (this) - Initial cleanup
- Specialized Agents - Format conversion, analysis, etc.
- Output Agents - Final formatting and delivery
system-prompt/
├── iterations/ # Version history of the prompt
│ ├── v1.md # Initial version
│ └── v2.md # Current version with summary/title generation
├── personalised/ # Customized versions
│ └── prompt.md # Personalized variant
└── scehma/ # JSON schema definitions
└── scehma.json # Output format specification
The agent produces structured JSON output with three components:
{
"note_output": "Cleaned markdown-formatted content",
"summary": "40-word plain text summary",
"title": "Descriptive title in plain text"
}- No Detail Loss: Explicitly instructed to preserve all information
- Context Maintenance: Keeps important nuances and specifics
- Conservative Editing: Focuses on structure over content changes
- Artifact Removal: Cleans up "uhm", "ehm", false starts
- Structure Addition: Creates logical headers and formatting
- Markdown Output: Well-formatted, readable final text
- Smart Titles: Generates appropriate titles from content
- Concise Summaries: 40-word summaries for quick reference
- Plain Text: Metadata in plain text for maximum compatibility
The agent expects raw STT transcription text and handles:
- Mistranscribed words
- Speech artifacts and filler words
- Embedded voice commands ("scratch that!", etc.)
- Poor punctuation and structure
Produces:
- Clean, readable markdown text
- Proper document structure with headers
- Preserved lists and emphasis
- Consistent formatting
- Load the system prompt from
system-prompt/iterations/v2.md - Configure your AI model with the prompt
- Send raw transcription as user input
- Receive structured JSON with cleaned content
- v1: Basic text cleanup and markdown formatting
- v2: Added title and summary generation capabilities
This agent is part of a larger text transformation ecosystem and works well with:
- Specialized formatting agents
- Content analysis tools
- Multi-step processing pipelines
When modifying the system prompt:
- Test thoroughly with various voice note types
- Ensure content preservation remains paramount
- Update schema files to match prompt changes
- Document changes in the iterations folder
This foundational agent prioritizes content preservation above all else, making it suitable for critical workflows where information loss is unacceptable.
