Voice.ai TTS Service for Pipecat

An official integration for Voice.ai text-to-speech (TTS) with Pipecat.

Overview

This integration provides streaming text-to-speech capabilities using Voice.ai's WebSocket API.

Features

  • Streaming text-to-speech via WebSocket API
  • Support for 11 languages (English, Catalan, Swedish, Spanish, French, German, Italian, Portuguese, Polish, Russian, Dutch)
  • Custom voice cloning support

Installation

Prerequisites

  • Python 3.10 or higher
  • Voice.ai API key (get one at voice.ai)

Install

Clone the repository and install in editable mode:

git clone https://github.com/voice-ai/voice-ai-pipecat-tts.git
cd voice-ai-pipecat-tts
pip install -e .
pip install -e ".[examples]"  # Optional: for running examples

Quick Start

from pipecat_voice_ai import VoiceAiTTSService
from pipecat.transcriptions.language import Language

# Initialize the service
tts = VoiceAiTTSService(
    api_key="vk_your-api-key",
    voice_id="your-voice-id",  # Optional: uses default if not provided
    params=VoiceAiTTSService.InputParams(
        language=Language.EN,
        temperature=1.0,
        top_p=0.8
    )
)

# Use in a Pipecat pipeline
from pipecat.pipeline.pipeline import Pipeline

pipeline = Pipeline([
    # ... other processors ...
    tts,
    transport.output(),  # `transport` is your already-configured Pipecat transport (not shown)
])

Configuration

Supported Languages

Use Pipecat's Language enum:

Language.EN  # English
Language.CA  # Catalan
Language.SV  # Swedish
Language.ES  # Spanish
Language.FR  # French
Language.DE  # German
Language.IT  # Italian
Language.PT  # Portuguese
Language.PL  # Polish
Language.RU  # Russian
Language.NL  # Dutch

Parameters

params = VoiceAiTTSService.InputParams(
    language=Language.EN,           # Target language
    model="voiceai-tts-v1-latest",  # TTS model (auto-selected if not specified)
    audio_format="pcm",             # Audio format (raw PCM)
    temperature=1.0,                # Creativity (0.0-2.0, default: 1.0)
    top_p=0.8,                      # Diversity (0.0-1.0, default: 0.8)
)
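The documented ranges for temperature and top_p can be checked before the values are handed to InputParams. The following is a minimal sketch, not part of the package: check_sampling_params is a hypothetical helper, and whether the service performs its own validation is not stated here.

```python
def check_sampling_params(temperature: float = 1.0, top_p: float = 0.8) -> dict:
    """Hypothetical helper: validate values against the ranges documented above."""
    # temperature is documented as 0.0-2.0 (default 1.0)
    if not 0.0 <= temperature <= 2.0:
        raise ValueError(f"temperature must be in [0.0, 2.0], got {temperature}")
    # top_p is documented as 0.0-1.0 (default 0.8)
    if not 0.0 <= top_p <= 1.0:
        raise ValueError(f"top_p must be in [0.0, 1.0], got {top_p}")
    return {"temperature": temperature, "top_p": top_p}
```

The returned dict can be unpacked into the constructor, for example VoiceAiTTSService.InputParams(language=Language.EN, **check_sampling_params(0.7, 0.9)).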

Running the Examples

Setup

  1. Copy .env.example to .env and add your API key:
VOICEAI_API_KEY=vk_your-api-key-here
VOICEAI_VOICE_ID=your-voice-id-here  # Optional
  2. Run the examples:

Basic Example

Tests the service by synthesizing audio and writing it to a file:

python examples/simple_tts.py

Interactive Example

Full conversational bot with microphone input:

pip install -e ".[examples]"  # Install example dependencies
python examples/microphone_example.py

Requires OPENAI_API_KEY in .env for speech-to-text and LLM services.

Troubleshooting

Authentication Errors: Verify that your API key starts with vk_ and is set in .env.

No Audio Generated: Omit voice_id so that the default voice is used.

Import Errors: Ensure the package is installed with pip install -e .
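The authentication check above can be done at startup so that a missing or malformed key fails early. This is a sketch under assumptions: load_voiceai_config is a hypothetical helper, and it expects the variables from .env to already be present in the process environment (for example exported in the shell or loaded with a tool such as python-dotenv).

```python
import os


def load_voiceai_config(env=None):
    """Hypothetical helper: read Voice.ai settings and sanity-check the API key."""
    # Allow a plain dict to be passed in for testing; default to the real environment.
    env = os.environ if env is None else env

    api_key = env.get("VOICEAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("VOICEAI_API_KEY is not set; add it to .env")
    if not api_key.startswith("vk_"):
        raise RuntimeError("VOICEAI_API_KEY should start with 'vk_'")

    # The voice ID is optional; None means the default voice is used.
    voice_id = env.get("VOICEAI_VOICE_ID", "").strip() or None
    return api_key, voice_id
```

The two returned values map directly onto the api_key and voice_id arguments shown in the Quick Start.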

For more help, open an issue or join the Pipecat Discord #community-integrations channel.

Technical Details

  • Base Class: AudioContextTTSService from Pipecat
  • Connection: Persistent WebSocket with automatic reconnection
  • Audio Format: Raw PCM at 32 kHz mono (no decoding libraries needed)
  • Tested with: Pipecat v0.0.100+
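Because the audio is raw PCM, it can be wrapped in a WAV container with the Python standard library alone. The sketch below uses the 32 kHz mono format listed above; the 16-bit sample width is an assumption, since the sample width is not stated here, and both function names are hypothetical.

```python
import io
import wave

SAMPLE_RATE = 32000  # 32 kHz, from the technical details above
CHANNELS = 1         # mono
SAMPLE_WIDTH = 2     # ASSUMPTION: 16-bit samples (2 bytes); not confirmed by this README


def pcm_to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw PCM bytes in a WAV header and return the resulting file contents."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buf.getvalue()


def pcm_duration_seconds(pcm: bytes) -> float:
    """Duration of a raw PCM buffer under the format assumed above."""
    return len(pcm) / (SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH)
```

Under these assumptions, one second of audio is 64,000 bytes. If playback sounds too fast, too slow, or noisy, the sample width is the first value to recheck.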

Contributing

Contributions are welcome! Please ensure:

  • Code follows Pipecat's conventions
  • Changes are tested with the examples
  • README is updated for new features

Maintainer

This integration is officially maintained by Voice.ai.

License

BSD 2-Clause License (same as Pipecat)

See CHANGELOG.md for version history.
