Batteries-included voice pipeline framework for Go. This package provides a unified interface for speech-to-text (STT) and text-to-speech (TTS) with all providers included.
For a minimal dependency footprint, use omnivoice-core instead.
- Unified Interface: Single API for all STT and TTS providers
- Provider Registry: Get providers by name, with no need to import individual provider packages
- Multiple Providers: OpenAI, Deepgram, ElevenLabs, Twilio, Telnyx
- Streaming Support: Real-time transcription and synthesis
- Easy Integration: Import and use with minimal configuration
go get github.com/plexusone/omnivoice

OmniVoice includes a command-line tool for transcription.

go install github.com/plexusone/omnivoice/cmd/omnivoice@latest

# Set your API key
export DEEPGRAM_API_KEY="your-api-key"
# Basic transcription (stdout)
omnivoice transcribe podcast.mp3
# Save to file
omnivoice transcribe -p deepgram -o transcript.txt podcast.mp3
# JSON output with full metadata (OmniVoice Transcript format)
omnivoice transcribe -p deepgram --diarize --timestamps -f json -o transcript.json podcast.mp3
# Generate SRT subtitles
omnivoice transcribe -p deepgram -f srt -o subtitles.srt podcast.mp3
# Generate WebVTT subtitles
omnivoice transcribe -p deepgram -f vtt -o subtitles.vtt podcast.mp3
# List available providers
omnivoice providers list

| Format | Description |
|---|---|
| `text` | Plain transcript text (default) |
| `json` | OmniVoice Transcript format with full metadata |
| `srt` | SubRip subtitles |
| `vtt` | WebVTT subtitles |
| Variable | Provider |
|---|---|
| `DEEPGRAM_API_KEY` | Deepgram |
| `OPENAI_API_KEY` | OpenAI |
| `ELEVENLABS_API_KEY` | ElevenLabs |
import (
	"github.com/plexusone/omnivoice"
	_ "github.com/plexusone/omnivoice/providers/all" // Register all providers
)

package main

import (
	"context"
	"log"
	"os"

	"github.com/plexusone/omnivoice"
	_ "github.com/plexusone/omnivoice/providers/all"
)

func main() {
	ctx := context.Background()

	// Get providers by name using the registry
	sttProvider, err := omnivoice.GetSTTProvider("deepgram",
		omnivoice.WithAPIKey(os.Getenv("DEEPGRAM_API_KEY")))
	if err != nil {
		log.Fatal(err)
	}
	ttsProvider, err := omnivoice.GetTTSProvider("elevenlabs",
		omnivoice.WithAPIKey(os.Getenv("ELEVENLABS_API_KEY")))
	if err != nil {
		log.Fatal(err)
	}

	// Transcribe audio
	result, err := sttProvider.TranscribeFile(ctx, "audio.mp3", omnivoice.TranscriptionConfig{
		Language:             "en",
		EnableWordTimestamps: true,
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("Transcription: %s", result.Text)

	// Synthesize speech
	audio, err := ttsProvider.Synthesize(ctx, "Hello, world!", omnivoice.SynthesisConfig{
		VoiceID: "pNInz6obpgDQGcFmaJgB", // Adam
	})
	if err != nil {
		log.Fatal(err)
	}

	// audio.Audio contains the audio bytes; using it here also keeps the
	// compiler from rejecting audio as declared and not used.
	log.Printf("Synthesized %d bytes of audio", len(audio.Audio))
}

Get providers by name at runtime, with no need to import individual provider packages:
// Available providers: "openai", "elevenlabs", "deepgram", "twilio"
ttsProvider, _ := omnivoice.GetTTSProvider("elevenlabs", omnivoice.WithAPIKey(key))
sttProvider, _ := omnivoice.GetSTTProvider("deepgram", omnivoice.WithAPIKey(key))
// List registered providers
fmt.Println(omnivoice.ListTTSProviders()) // [openai elevenlabs deepgram twilio]
fmt.Println(omnivoice.ListSTTProviders()) // [openai elevenlabs deepgram twilio]

OmniVoice accepts language codes in BCP-47 format, which includes ISO 639-1 two-letter codes and regional variants.
Common codes:
| Code | Language |
|---|---|
| `en` | English |
| `en-US` | English (US) |
| `en-GB` | English (UK) |
| `es` | Spanish |
| `es-MX` | Spanish (Mexico) |
| `fr` | French |
| `de` | German |
| `it` | Italian |
| `pt` | Portuguese |
| `pt-BR` | Portuguese (Brazil) |
| `ja` | Japanese |
| `ko` | Korean |
| `zh` | Chinese |
| `zh-CN` | Chinese (Simplified) |
| `zh-TW` | Chinese (Traditional) |
| `ar` | Arabic |
| `hi` | Hindi |
| `ru` | Russian |
Notes:
- Use simple codes (`en`) for broad compatibility across providers
- Use regional variants (`en-US`) when accent/dialect matters for TTS
- Provider support varies; see provider documentation for full language lists
- STT providers generally support automatic language detection when no code is specified
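
The notes above suggest a simple fallback strategy: try the regional code, fall back to the base language, and otherwise leave the language unset so the provider can auto-detect. A minimal sketch of that idea follows. `baseLanguage`, `pickLanguage`, and the supported-language list are all hypothetical and exist only for this illustration; OmniVoice may handle language selection differently.

```go
package main

import (
	"fmt"
	"strings"
)

// baseLanguage returns the primary language subtag of a BCP-47 tag,
// lowercased: "en-US" -> "en", "pt-BR" -> "pt", "fr" -> "fr".
func baseLanguage(tag string) string {
	base, _, _ := strings.Cut(tag, "-")
	return strings.ToLower(base)
}

// pickLanguage is a hypothetical helper. It returns the supported entry that
// matches tag exactly (ignoring case), else the entry that matches the base
// language, else "" to signal that no language should be specified.
func pickLanguage(tag string, supported []string) string {
	for _, s := range supported {
		if strings.EqualFold(s, tag) {
			return s
		}
	}
	base := baseLanguage(tag)
	for _, s := range supported {
		if strings.EqualFold(s, base) {
			return s
		}
	}
	return ""
}

func main() {
	supported := []string{"en", "es", "pt-BR"} // made-up provider language list
	for _, tag := range []string{"en-GB", "pt-BR", "ko"} {
		fmt.Printf("%s -> %q\n", tag, pickLanguage(tag, supported))
	}
}
```

With the made-up list in `main`, `en-GB` falls back to `en`, `pt-BR` matches exactly, and `ko` yields an empty string.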
| Provider | STT | TTS | Registry Name |
|---|---|---|---|
| OpenAI | Whisper | TTS-1/TTS-1-HD | "openai" |
| ElevenLabs | Scribe | Multilingual v2 | "elevenlabs" |
| Deepgram | Nova-2 | Aura | "deepgram" |
| Twilio | Media Streams | Media Streams | "twilio" |
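
The registry names in the table are the strings passed to `GetSTTProvider` and `GetTTSProvider`. As a rough illustration of how name-based lookup combined with blank imports can work in Go, the sketch below has each provider register a factory from an `init` function. Every identifier in it (`STTProvider`, `Factory`, `Register`, `Get`, `List`, `fakeProvider`) is a stand-in for this example and is not OmniVoice's actual API or internals.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// STTProvider is a stand-in interface for this sketch.
type STTProvider interface {
	Name() string
}

// Factory builds a provider from an API key.
type Factory func(apiKey string) (STTProvider, error)

var (
	mu        sync.RWMutex
	factories = map[string]Factory{}
)

// Register is what a provider package would call from its init function,
// which is why a blank import can be enough to make a provider available.
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	factories[name] = f
}

// Get looks up a factory by registry name and builds the provider.
func Get(name, apiKey string) (STTProvider, error) {
	mu.RLock()
	f, ok := factories[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown provider %q", name)
	}
	return f(apiKey)
}

// List returns the registered names in sorted order.
func List() []string {
	mu.RLock()
	defer mu.RUnlock()
	names := make([]string, 0, len(factories))
	for n := range factories {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

type fakeProvider struct{ name string }

func (p fakeProvider) Name() string { return p.name }

func init() {
	// In a real layout this call would live in the provider's own package.
	Register("deepgram", func(apiKey string) (STTProvider, error) {
		return fakeProvider{name: "deepgram"}, nil
	})
}

func main() {
	p, err := Get("deepgram", "your-api-key")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(p.Name(), List())
}
```

The design trade-off is that lookups fail at runtime rather than at compile time when a name is misspelled or a provider package was never imported, so checking the returned error matters.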
- omnivoice-core - Core interfaces (minimal dependencies)
- omni-openai - OpenAI provider
- omni-deepgram - Deepgram provider
- omni-telnyx - Telnyx provider
- omni-twilio - Twilio provider
- elevenlabs-go - ElevenLabs SDK
MIT License - see LICENSE for details.