
Voice Training Data Creator

A desktop GUI application for creating voice training datasets with AI-powered synthetic text generation. Supports voice cloning, TTS model training, and STT fine-tuning workflows.

Dataset System

Each sample is saved as a numbered folder (starting from 001) with:

  • Recording (as .wav)
  • Text as .txt, incorporating any user edits so AI artifacts are scrubbed and the file remains the source of truth for what was spoken
  • JSON for metadata
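
A minimal sketch of this saving logic (the function and metadata field names are illustrative assumptions, not the app's actual code):

import json
import shutil
from datetime import datetime, timezone
from pathlib import Path

def save_sample(base_path: str, wav_src: str, text: str, params: dict) -> Path:
    """Write one sample folder in the layout described above."""
    samples_dir = Path(base_path).expanduser() / "samples"
    samples_dir.mkdir(parents=True, exist_ok=True)

    # Next number is one past the highest existing numbered folder (001, 002, ...)
    existing = [int(p.name) for p in samples_dir.iterdir() if p.name.isdigit()]
    num = max(existing, default=0) + 1

    folder = samples_dir / f"{num:03d}"
    folder.mkdir()
    shutil.copy(wav_src, folder / f"{num}.wav")
    (folder / f"{num}.txt").write_text(text, encoding="utf-8")
    (folder / f"{num}_metadata.json").write_text(json.dumps(
        {"created": datetime.now(timezone.utc).isoformat(), **params}, indent=2))
    return folder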

Features

Audio Recording

  • High-quality WAV recording (44.1kHz/48kHz, 16-bit)
  • Preferred microphone, auto-selected when recording starts
  • Microphone selection and testing
  • Real-time duration tracking
  • Pause/resume functionality during recording
  • Retake button: One-click to discard and restart recording
  • Delete recording option
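
As a rough illustration of the capture path, here is a minimal fixed-length recording with sounddevice and soundfile (the app itself records until Stop is pressed, and the device index is a placeholder):

import sounddevice as sd
import soundfile as sf

SAMPLE_RATE = 44100  # or 48000, per the audio quality setting
DURATION = 5         # seconds; this sketch records a fixed length

# sd.default.device = 2  # optionally pin the preferred input device index
audio = sd.rec(int(DURATION * SAMPLE_RATE), samplerate=SAMPLE_RATE,
               channels=1, dtype="int16")  # mono, 16-bit PCM
sd.wait()  # block until the buffer is full
sf.write("sample.wav", audio, SAMPLE_RATE, subtype="PCM_16")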

Text Generation

  • Synthetic text generation using OpenAI GPT models
  • Multiple style options:
    • General Purpose
    • Colloquial
    • Voice Note
    • Technical
    • Prose
  • Custom vocabulary dictionary support
  • Configurable duration and speaking rate (WPM)
  • Post-generation text editing
  • Regenerate option (keep audio, new text)
  • New Sample button: Clear state and start completely fresh
  • Narration view with adjustable font size
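
A hedged sketch of the generation step with the openai client; the prompt wording and word-count arithmetic are assumptions, not the app's actual prompt_builder logic:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_text(minutes: float, wpm: int, style: str, vocab: list[str]) -> str:
    target_words = int(minutes * wpm)  # duration x speaking rate
    prompt = (f"Write about {target_words} words of {style} text "
              f"suitable for reading aloud as voice-training material.")
    if vocab:
        prompt += " Naturally work in these terms: " + ", ".join(vocab)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content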

Sample Management

  • Organized directory structure (numbered folders)
  • Automatic sample numbering
  • Metadata tracking (generation parameters, timestamps)
  • Session statistics
  • Total duration estimation
  • Personalized goal tracking: Set target duration and track progress
  • Quick access to dataset folder
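
Total duration can be estimated cheaply from the WAV headers without decoding audio; a sketch (paths and goal handling are assumptions):

import soundfile as sf
from pathlib import Path

def total_duration_seconds(base_path: str) -> float:
    """Sum the duration of every saved recording under samples/."""
    total = 0.0
    for wav in (Path(base_path).expanduser() / "samples").rglob("*.wav"):
        info = sf.info(str(wav))
        total += info.frames / info.samplerate  # header-only, no decode
    return total

# e.g. progress toward a 60-minute training goal
progress = total_duration_seconds("~/voice-dataset") / (60 * 60)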

Configuration and Settings

  • Dedicated Settings Tab for easy access
  • Secure API key storage (system keyring)
  • API connection testing
  • Preferred microphone configuration
  • Configurable audio quality settings (sample rate)
  • Multiple OpenAI model support (gpt-4o-mini, gpt-4o, gpt-4.1-mini)
  • Persistent base path configuration
  • Training goal duration (personalizes progress tracking)
  • Auto-generate next sample preference
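
Key storage and the connection test can be as simple as the following sketch (the service and username strings are assumptions):

import keyring
from openai import OpenAI

SERVICE = "VoiceTrainingDataCreator"  # assumed keyring service name

keyring.set_password(SERVICE, "openai_api_key", "sk-...")   # on Save Settings
api_key = keyring.get_password(SERVICE, "openai_api_key")   # later retrieval

# A minimal "Test Connection" check: any authenticated call will do
OpenAI(api_key=api_key).models.list()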

User Experience

  • Three-tab interface (Record & Generate, Dataset, Settings)
  • Light/dark theme toggle
  • Helpful tooltips and descriptions
  • Real-time statistics updates
  • Clear workflow with New Sample and Retake buttons

Installation

Option 1: Install .deb Package (Recommended)

Standard installation method for Ubuntu/Debian systems:

  1. Download or build the package:

    # Option A: Build from source
    git clone https://github.com/danielrosehill/Voice-Training-Data-Creator.git
    cd Voice-Training-Data-Creator
    ./build-deb.sh
    
    # Option B: Download release (when available)
    # Download voice-training-data-creator_1.0.0_all.deb from releases
  2. Install the package:

    sudo dpkg -i voice-training-data-creator_1.0.0_all.deb
    sudo apt-get install -f  # Install dependencies
  3. Validate installation (optional):

    ./validate-package.sh
  4. Launch the application:

    • From application menu: Search for "Voice Training Data Creator"
    • From terminal: voice-training-data-creator

See INSTALL.md for detailed installation instructions and troubleshooting.

Option 2: Run from Source

For development or if you prefer not to install system-wide:

Prerequisites

  • Python 3.10 or higher
  • Ubuntu Desktop (tested on Ubuntu 25.04 with KDE)
  • System packages: libportaudio2
  • OpenAI API key (for text generation)

Setup

  1. Clone the repository:

    git clone https://github.com/danielrosehill/Voice-Training-Data-Creator.git
    cd Voice-Training-Data-Creator
  2. The application uses uv for virtual environment management. If you don't have it installed:

    pip install uv
  3. Create and activate the virtual environment, then install dependencies:

    uv venv .venv
    source .venv/bin/activate
    uv pip install -r requirements.txt

Usage

Running the Application

Use the provided launcher script:

./run.sh

Or manually:

source .venv/bin/activate
python src/main.py

First-Time Setup

  1. Open Settings Tab: Click on the "Settings" tab
  2. Configure Base Path: Click "Browse" to set a base directory for storing samples
  3. Add API Key: Enter your OpenAI API key in the API Configuration section
  4. Test Connection: Click "Test Connection" to verify your API key works
  5. Set Preferred Microphone (optional): Choose your preferred microphone device
  6. Set Training Goal (optional): Enter target duration in minutes (e.g., 60 for 1 hour)
  7. Click "Save Settings"

Creating Samples

  1. Generate Text (or use existing):

    • Set target duration (minutes)
    • Choose words per minute (WPM)
    • Select a style
    • Optionally add custom vocabulary
    • Click "Generate Text"
    • Edit if needed, or click "Regenerate" for a new version
  2. Record Audio:

    • Your preferred microphone is automatically selected
    • Click "Record" to start
    • Use "Pause/Resume" if you need a break
    • Click "Retake" to discard and restart immediately
    • Click "Stop" when finished
  3. Review and Save:

    • Edit the text if needed
    • Click "View Text" for a large narration view
    • Click "Save Sample" to store your recording
  4. Start Next Sample:

    • Click "New Sample" to clear everything and start fresh
    • Or enable "Auto-generate next sample" in Settings for automatic workflow

Workflow Tips

  • New Sample Button: Clears text and audio, optionally auto-generates new text
  • Retake Button: While recording, instantly discard and restart (no stop → delete needed)
  • Regenerate Button: Keep your audio but generate new text with same parameters
  • Auto-generate: Enable in Settings to automatically generate text after saving
  • Narration View: Use "View Text" for a large, readable display while recording

Directory Structure

Samples are organized as follows:

{base_path}/
└── samples/
    ├── 001/
    │   ├── 1.wav              # Audio recording
    │   ├── 1.txt              # Source text
    │   └── 1_metadata.json    # Generation parameters
    ├── 002/
    │   ├── 2.wav
    │   ├── 2.txt
    │   └── 2_metadata.json
    └── ...

Configuration

Configuration is stored in ~/.config/VoiceTrainingDataCreator/config.json.

API keys are stored securely in the system keyring.
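
For illustration, reading the stored configuration might look like this (the field names are assumptions, not the app's actual schema):

import json
from pathlib import Path

config_path = Path.home() / ".config" / "VoiceTrainingDataCreator" / "config.json"
config = json.loads(config_path.read_text())

base_path = config.get("base_path")             # where samples are saved
sample_rate = config.get("sample_rate", 44100)  # audio quality setting
# Note: the API key is deliberately absent here; it lives in the keyring.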

Troubleshooting

No Audio Devices Found

  • Ensure your microphone is connected
  • Check system audio settings
  • Verify permissions for microphone access

API Connection Fails

  • Verify your API key is correct
  • Check internet connectivity
  • Ensure you have sufficient OpenAI credits

Audio Clipping Warning

  • Reduce microphone input volume in system settings
  • Move further from the microphone
  • Adjust microphone gain if available
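
The warning itself is easy to reproduce by hand; a sketch of a clipping check (the threshold choice is an assumption):

import numpy as np
import soundfile as sf

audio, rate = sf.read("sample.wav", dtype="int16")
peak = int(np.max(np.abs(audio.astype(np.int32))))  # int32 avoids -32768 overflow
if peak >= 32767:  # samples pinned at int16 full scale
    print("Clipping detected: lower the input gain and re-record.")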

Long Silence Detected

  • Re-record the sample
  • Check microphone is not muted
  • Verify microphone is selected correctly
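
Similarly, a long-silence check can be approximated with windowed RMS (the window size and thresholds here are assumptions, and the app's actual heuristic may differ):

import numpy as np
import soundfile as sf

audio, rate = sf.read("sample.wav", dtype="float32")  # mono recording assumed
window = rate // 2  # 0.5-second windows
rms = [float(np.sqrt(np.mean(audio[i:i + window] ** 2)))
       for i in range(0, len(audio), window)]

# Longest consecutive run of quiet windows (~ -40 dBFS threshold)
longest = run = 0
for quiet in (r < 0.01 for r in rms):
    run = run + 1 if quiet else 0
    longest = max(longest, run)
if longest * 0.5 > 3.0:  # more than ~3 s of near-silence
    print("Long silence detected: check that the microphone is not muted.")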

Development

Project Structure

Voice-Training-Data-Creator/
├── src/
│   ├── main.py              # Entry point and Flet UI (single-file architecture)
│   ├── audio/               # Audio recording modules
│   │   ├── recorder.py
│   │   └── device_manager.py
│   ├── llm/                 # LLM integration
│   │   ├── generator.py
│   │   └── prompt_builder.py
│   ├── storage/             # Configuration and data storage
│   │   ├── config.py
│   │   └── sample_manager.py
│   └── utils/               # Validation utilities
│       └── validators.py
├── requirements.txt
├── run.sh                   # Launcher script
└── README.md

Dependencies

  • Flet: Modern GUI framework (Flutter-based)
  • sounddevice: Audio recording
  • soundfile: WAV file handling
  • numpy: Audio processing
  • openai: LLM text generation
  • keyring: Secure credential storage

UI Framework

This application uses Flet (Flutter-based) for the GUI, providing:

  • Flutter-based rendering with predictable layouts
  • Declarative UI architecture
  • Better cross-platform support
  • Material Design by default
  • Three-tab interface for organized workflow

The app was migrated from PyQt6 to Flet to resolve persistent layout issues and provide a more modern user experience.
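
For a feel of the framework, a minimal three-tab Flet app looks roughly like this (the real UI is considerably richer):

import flet as ft

def main(page: ft.Page):
    page.title = "Voice Training Data Creator"
    page.add(ft.Tabs(expand=True, tabs=[
        ft.Tab(text="Record & Generate", content=ft.Text("Recording controls")),
        ft.Tab(text="Dataset", content=ft.Text("Sample browser and statistics")),
        ft.Tab(text="Settings", content=ft.Text("API key, paths, devices")),
    ]))

ft.app(target=main)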

License

MIT
