omnivoice

package module
v0.8.0 Latest
Published: May 2, 2026 License: MIT Imports: 6 Imported by: 4

README

OmniVoice

Batteries-included voice pipeline framework for Go. This package provides a unified interface for speech-to-text (STT) and text-to-speech (TTS) with all providers included.

For a minimal dependency footprint, use omnivoice-core instead.

Features

  • 🎯 Unified Interface: Single API for all STT and TTS providers
  • 🗂️ Provider Registry: Get providers by name - no need to import individual provider packages
  • 🔌 Multiple Providers: OpenAI, Deepgram, ElevenLabs, Twilio, Telnyx
  • Streaming Support: Real-time transcription and synthesis
  • 🚀 Easy Integration: Import and use with minimal configuration

Installation

go get github.com/plexusone/omnivoice

CLI

OmniVoice includes a command-line tool for transcription.

Install CLI

go install github.com/plexusone/omnivoice/cmd/omnivoice@latest

Usage

# Set your API key
export DEEPGRAM_API_KEY="your-api-key"

# Basic transcription (stdout)
omnivoice transcribe podcast.mp3

# Save to file
omnivoice transcribe -p deepgram -o transcript.txt podcast.mp3

# JSON output with full metadata (OmniVoice Transcript format)
omnivoice transcribe -p deepgram --diarize --timestamps -f json -o transcript.json podcast.mp3

# Generate SRT subtitles
omnivoice transcribe -p deepgram -f srt -o subtitles.srt podcast.mp3

# Generate WebVTT subtitles
omnivoice transcribe -p deepgram -f vtt -o subtitles.vtt podcast.mp3

# List available providers
omnivoice providers list

Output Formats

Format  Description
text    Plain transcript text (default)
json    OmniVoice Transcript format with full metadata
srt     SubRip subtitles
vtt     WebVTT subtitles

Environment Variables

Variable            Provider
DEEPGRAM_API_KEY    Deepgram
OPENAI_API_KEY      OpenAI
ELEVENLABS_API_KEY  ElevenLabs

Quick Start (Library)

import (
    "github.com/plexusone/omnivoice"
    _ "github.com/plexusone/omnivoice/providers/all" // Register all providers
)

Usage

package main

import (
    "context"
    "log"
    "os"

    "github.com/plexusone/omnivoice"
    _ "github.com/plexusone/omnivoice/providers/all"
)

func main() {
    ctx := context.Background()

    // Get providers by name using the registry
    sttProvider, err := omnivoice.GetSTTProvider("deepgram",
        omnivoice.WithAPIKey(os.Getenv("DEEPGRAM_API_KEY")))
    if err != nil {
        log.Fatal(err)
    }

    ttsProvider, err := omnivoice.GetTTSProvider("elevenlabs",
        omnivoice.WithAPIKey(os.Getenv("ELEVENLABS_API_KEY")))
    if err != nil {
        log.Fatal(err)
    }

    // Transcribe audio
    result, err := sttProvider.TranscribeFile(ctx, "audio.mp3", omnivoice.TranscriptionConfig{
        Language:             "en",
        EnableWordTimestamps: true,
    })
    if err != nil {
        log.Fatal(err)
    }
    log.Printf("Transcription: %s", result.Text)

    // Synthesize speech
    audio, err := ttsProvider.Synthesize(ctx, "Hello, world!", omnivoice.SynthesisConfig{
        VoiceID: "pNInz6obpgDQGcFmaJgB", // Adam
    })
    if err != nil {
        log.Fatal(err)
    }
    // audio.Audio contains the audio bytes
    log.Printf("Synthesized %d bytes of audio", len(audio.Audio))
}

Provider Registry

Get providers by name at runtime - no need to import individual provider packages:

// Available providers: "openai", "elevenlabs", "deepgram", "twilio"
ttsProvider, _ := omnivoice.GetTTSProvider("elevenlabs", omnivoice.WithAPIKey(key))
sttProvider, _ := omnivoice.GetSTTProvider("deepgram", omnivoice.WithAPIKey(key))

// List registered providers
fmt.Println(omnivoice.ListTTSProviders()) // [openai elevenlabs deepgram twilio]
fmt.Println(omnivoice.ListSTTProviders()) // [openai elevenlabs deepgram twilio]
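
For readers unfamiliar with the pattern, a name-based registry is usually a map from names to factory functions. The sketch below is a generic, standard-library-only illustration of that idea, not OmniVoice's actual implementation. Provider, Factory, registry, and fake are hypothetical names, and the sorted order returned by List is a choice made here, since the package does not document its listing order.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Provider stands in for a real STT or TTS provider interface.
type Provider interface{ Name() string }

// Factory builds a Provider from an API key.
type Factory func(apiKey string) (Provider, error)

// registry maps provider names to factories and is safe for concurrent use.
type registry struct {
	mu        sync.RWMutex
	factories map[string]Factory
}

func newRegistry() *registry {
	return &registry{factories: map[string]Factory{}}
}

// Register stores a factory under name, replacing any earlier entry.
func (r *registry) Register(name string, f Factory) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.factories[name] = f
}

// Get creates a provider by name or returns an error if none is registered.
func (r *registry) Get(name, apiKey string) (Provider, error) {
	r.mu.RLock()
	f, ok := r.factories[name]
	r.mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown provider %q", name)
	}
	return f(apiKey)
}

// List returns the registered names in sorted order.
func (r *registry) List() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.factories))
	for n := range r.factories {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// fake is a trivial Provider used only to exercise the registry.
type fake struct{ name string }

func (f fake) Name() string { return f.name }

func main() {
	reg := newRegistry()
	reg.Register("openai", func(string) (Provider, error) { return fake{"openai"}, nil })
	reg.Register("deepgram", func(string) (Provider, error) { return fake{"deepgram"}, nil })

	fmt.Println(reg.List())
	if p, err := reg.Get("deepgram", "example-key"); err == nil {
		fmt.Println("created:", p.Name())
	}
	if _, err := reg.Get("missing", "example-key"); err != nil {
		fmt.Println("error:", err)
	}
}
```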

Language Codes

OmniVoice accepts language codes in BCP-47 format, which includes ISO 639-1 two-letter codes and regional variants.

Common codes:

Code   Language
en     English
en-US  English (US)
en-GB  English (UK)
es     Spanish
es-MX  Spanish (Mexico)
fr     French
de     German
it     Italian
pt     Portuguese
pt-BR  Portuguese (Brazil)
ja     Japanese
ko     Korean
zh     Chinese
zh-CN  Chinese (Simplified)
zh-TW  Chinese (Traditional)
ar     Arabic
hi     Hindi
ru     Russian

Notes:

  • Use simple codes (en) for broad compatibility across providers
  • Use regional variants (en-US) when accent/dialect matters for TTS
  • Provider support varies; see provider documentation for full language lists
  • STT providers generally support automatic language detection when no code is specified
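
The notes above suggest a simple fallback order: try the regional code, then the base language, then leave the code empty so the provider can auto-detect. A minimal sketch of that policy follows. baseLanguage and pickLanguage are hypothetical helpers, and the supported set would have to come from each provider's documentation.

```go
package main

import (
	"fmt"
	"strings"
)

// baseLanguage returns the primary language subtag of a BCP-47 code,
// lowercased, for example "en-US" becomes "en".
func baseLanguage(code string) string {
	if i := strings.IndexByte(code, '-'); i >= 0 {
		code = code[:i]
	}
	return strings.ToLower(code)
}

// pickLanguage returns code if it is supported, otherwise its base
// language if that is supported, otherwise "" so that the caller can
// fall back to automatic detection.
func pickLanguage(code string, supported map[string]bool) string {
	if supported[code] {
		return code
	}
	if base := baseLanguage(code); supported[base] {
		return base
	}
	return ""
}

func main() {
	// Illustrative set only; real support varies by provider.
	supported := map[string]bool{"en": true, "pt-BR": true}
	for _, code := range []string{"en-US", "pt-BR", "ja"} {
		fmt.Printf("%q -> %q\n", code, pickLanguage(code, supported))
	}
}
```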

Included Providers

Provider    STT            TTS              Registry Name
OpenAI      Whisper        TTS-1/TTS-1-HD   "openai"
ElevenLabs  Scribe         Multilingual v2  "elevenlabs"
Deepgram    Nova-2         Aura             "deepgram"
Twilio      Media Streams  Media Streams    "twilio"

License

MIT License - see LICENSE for details.

Documentation

Overview

Package omnivoice provides a unified interface for speech-to-text and text-to-speech. This is the batteries-included package that imports all providers. For a minimal dependency footprint, use github.com/plexusone/omnivoice-core instead.

Quick Start

Import the package with all providers:

import (
    "github.com/plexusone/omnivoice"
    _ "github.com/plexusone/omnivoice/providers/all"
)

Or import specific providers:

import (
    "github.com/plexusone/omnivoice"
    openai "github.com/plexusone/omni-openai/omnivoice"
)

Creating Providers

// OpenAI provider
sttProvider := openai.NewSTTProvider(apiKey)
ttsProvider := openai.NewTTSProvider(apiKey)

// Create multi-provider client
sttClient := omnivoice.NewSTTClient(sttProvider)
ttsClient := omnivoice.NewTTSClient(ttsProvider)

Constants

const (
	StatusRinging  = callsystem.StatusRinging
	StatusAnswered = callsystem.StatusAnswered
	StatusEnded    = callsystem.StatusEnded
	StatusFailed   = callsystem.StatusFailed
	StatusBusy     = callsystem.StatusBusy
	StatusNoAnswer = callsystem.StatusNoAnswer
	CallInbound    = callsystem.Inbound
	CallOutbound   = callsystem.Outbound
)

Re-exported CallStatus and CallDirection constants.

const (
	SubtitleFormatSRT = subtitle.FormatSRT
	SubtitleFormatVTT = subtitle.FormatVTT
)

Subtitle format constants.

const TranscriptFormatVersion = stt.TranscriptFormatVersion

TranscriptFormatVersion is the current version of the OmniVoice transcript format.

const TranscriptSchemaURL = stt.TranscriptSchemaURL

TranscriptSchemaURL is the JSON Schema URL for the transcript format.

const Version = "0.6.0"

Version is the current version of the omnivoice package.

Variables

var (
	WithFrom             = callsystem.WithFrom
	WithTimeout          = callsystem.WithTimeout
	WithMachineDetection = callsystem.WithMachineDetection
	WithRecording        = callsystem.WithRecording
	WithWhisper          = callsystem.WithWhisper
	WithAgent            = callsystem.WithAgent
	WithStatusCallback   = callsystem.WithStatusCallback
)

Re-export CallOption functions

var (
	ErrNoAvailableProvider   = stt.ErrNoAvailableProvider
	ErrStreamingNotSupported = stt.ErrStreamingNotSupported
	ErrInvalidAudio          = stt.ErrInvalidAudio
	ErrInvalidConfig         = stt.ErrInvalidConfig
	ErrAudioTooLong          = stt.ErrAudioTooLong
	ErrAudioTooShort         = stt.ErrAudioTooShort
	ErrRateLimited           = stt.ErrRateLimited
	ErrQuotaExceeded         = stt.ErrQuotaExceeded
	ErrUnsupportedLanguage   = stt.ErrUnsupportedLanguage
	ErrUnsupportedFormat     = stt.ErrUnsupportedFormat
	ErrStreamClosed          = stt.ErrStreamClosed
)

Re-export STT errors

var (
	// DefaultSubtitleOptions returns sensible defaults for subtitle generation.
	DefaultSubtitleOptions = subtitle.DefaultOptions

	// GenerateSRT generates SRT subtitles from a transcription result.
	GenerateSRT = subtitle.GenerateSRT

	// GenerateVTT generates WebVTT subtitles from a transcription result.
	GenerateVTT = subtitle.GenerateVTT

	// SaveSRT generates and saves SRT to a file.
	SaveSRT = subtitle.SaveSRT

	// SaveVTT generates and saves WebVTT to a file.
	SaveVTT = subtitle.SaveVTT
)

Re-export subtitle functions.

var (
	ErrTTSNoAvailableProvider = tts.ErrNoAvailableProvider
	ErrVoiceNotFound          = tts.ErrVoiceNotFound
	ErrTTSInvalidConfig       = tts.ErrInvalidConfig
	ErrTTSRateLimited         = tts.ErrRateLimited
	ErrTTSQuotaExceeded       = tts.ErrQuotaExceeded
	ErrTTSStreamClosed        = tts.ErrStreamClosed
)

Re-export TTS errors

var NewCallSystemClient = callsystem.NewClient

NewCallSystemClient creates a new CallSystem client with the given providers. The first provider becomes the primary by default.

var NewSTTClient = stt.NewClient

Re-export STT functions

var NewTTSClient = tts.NewClient

Re-export TTS functions

Functions

func GetCallSystemProvider added in v0.7.0

func GetCallSystemProvider(name string, opts ...ProviderOption) (callsystem.CallSystem, error)

GetCallSystemProvider creates a CallSystem provider by name with the given options.

Example:

cs, err := omnivoice.GetCallSystemProvider("twilio",
    omnivoice.WithAccountSID(accountSID),
    omnivoice.WithAuthToken(authToken),
    omnivoice.WithPhoneNumber(phoneNumber),
    omnivoice.WithWebhookURL(webhookURL),
)

func GetSTTProvider

func GetSTTProvider(name string, opts ...ProviderOption) (stt.Provider, error)

GetSTTProvider creates an STT provider by name with the given options.

Example:

provider, err := omnivoice.GetSTTProvider("deepgram", omnivoice.WithAPIKey(apiKey))

func GetTTSProvider

func GetTTSProvider(name string, opts ...ProviderOption) (tts.Provider, error)

GetTTSProvider creates a TTS provider by name with the given options.

Example:

provider, err := omnivoice.GetTTSProvider("elevenlabs", omnivoice.WithAPIKey(apiKey))

func HasCallSystemProvider added in v0.7.0

func HasCallSystemProvider(name string) bool

HasCallSystemProvider checks if a CallSystem provider is registered.

func HasSTTProvider

func HasSTTProvider(name string) bool

HasSTTProvider checks if an STT provider is registered.

func HasTTSProvider

func HasTTSProvider(name string) bool

HasTTSProvider checks if a TTS provider is registered.

func ListCallSystemProviders added in v0.7.0

func ListCallSystemProviders() []string

ListCallSystemProviders returns the names of all registered CallSystem providers.

func ListSTTProviders

func ListSTTProviders() []string

ListSTTProviders returns the names of all registered STT providers.

func ListTTSProviders

func ListTTSProviders() []string

ListTTSProviders returns the names of all registered TTS providers.

func RegisterCallSystemProvider added in v0.7.0

func RegisterCallSystemProvider(name string, factory CallSystemProviderFactory)

RegisterCallSystemProvider registers a CallSystem provider factory by name. This is typically called from provider init() functions.

func RegisterSTTProvider

func RegisterSTTProvider(name string, factory STTProviderFactory)

RegisterSTTProvider registers an STT provider factory by name. This is typically called from provider init() functions.

func RegisterTTSProvider

func RegisterTTSProvider(name string, factory TTSProviderFactory)

RegisterTTSProvider registers a TTS provider factory by name. This is typically called from provider init() functions.

Types

type Call added in v0.6.0

type Call = callsystem.Call

Call represents a phone or video call.

type CallDirection added in v0.6.0

type CallDirection = callsystem.CallDirection

CallDirection indicates whether a call is inbound or outbound.

type CallHandler added in v0.6.0

type CallHandler = callsystem.CallHandler

CallHandler is called when a new call arrives.

type CallOption added in v0.6.0

type CallOption = callsystem.CallOption

CallOption configures an outbound call.

type CallOptions added in v0.6.0

type CallOptions = callsystem.CallOptions

CallOptions holds parsed options for MakeCall.

type CallStatus added in v0.6.0

type CallStatus = callsystem.CallStatus

CallStatus represents the call state.

type CallSystem added in v0.6.0

type CallSystem = callsystem.CallSystem

CallSystem defines the interface for telephony/meeting integrations.

type CallSystemClient added in v0.7.0

type CallSystemClient = callsystem.Client

CallSystemClient manages multiple CallSystem providers with fallback support.

type CallSystemConfig added in v0.6.0

type CallSystemConfig = callsystem.CallSystemConfig

CallSystemConfig configures a call system integration.

type CallSystemProviderFactory added in v0.7.0

type CallSystemProviderFactory func(config ProviderConfig) (callsystem.CallSystem, error)

CallSystemProviderFactory creates a CallSystem provider with the given configuration.

type ProviderConfig

type ProviderConfig struct {
	// APIKey is the authentication key for the provider.
	APIKey string //nolint:gosec // G117: This is a config struct, not storing secrets

	// BaseURL is an optional custom API endpoint.
	BaseURL string

	// Extensions holds provider-specific configuration.
	Extensions map[string]any
}

ProviderConfig holds common configuration options for creating providers.

type ProviderOption

type ProviderOption func(*ProviderConfig)

ProviderOption configures a ProviderConfig.

func WithAPIKey

func WithAPIKey(apiKey string) ProviderOption

WithAPIKey sets the API key for the provider.

func WithAccountSID added in v0.7.0

func WithAccountSID(sid string) ProviderOption

WithAccountSID sets the account SID (Twilio).

func WithAuthToken added in v0.7.0

func WithAuthToken(token string) ProviderOption

WithAuthToken sets the auth token (Twilio).

func WithBaseURL

func WithBaseURL(baseURL string) ProviderOption

WithBaseURL sets a custom base URL for the provider.

func WithExtension

func WithExtension(key string, value any) ProviderOption

WithExtension sets a provider-specific configuration value.

func WithPhoneNumber added in v0.7.0

func WithPhoneNumber(number string) ProviderOption

WithPhoneNumber sets the default outbound phone number.

func WithRegion added in v0.7.0

func WithRegion(region string) ProviderOption

WithRegion sets the service region.

func WithWebhookURL added in v0.7.0

func WithWebhookURL(url string) ProviderOption

WithWebhookURL sets the webhook URL for incoming calls.
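
The With* functions follow Go's functional options pattern: each returns a closure that mutates a ProviderConfig, and the constructor applies them in order. The sketch below copies the ProviderConfig and ProviderOption declarations shown above and fills in option bodies as an assumption about how such options are typically written. buildConfig is a hypothetical helper, and the real package may apply options differently.

```go
package main

import "fmt"

// ProviderConfig and ProviderOption repeat the declarations from this page.
type ProviderConfig struct {
	APIKey     string
	BaseURL    string
	Extensions map[string]any
}

type ProviderOption func(*ProviderConfig)

// The option bodies below are assumed, not taken from the package source.
func WithAPIKey(apiKey string) ProviderOption {
	return func(c *ProviderConfig) { c.APIKey = apiKey }
}

func WithBaseURL(baseURL string) ProviderOption {
	return func(c *ProviderConfig) { c.BaseURL = baseURL }
}

func WithExtension(key string, value any) ProviderOption {
	return func(c *ProviderConfig) {
		if c.Extensions == nil {
			c.Extensions = make(map[string]any)
		}
		c.Extensions[key] = value
	}
}

// buildConfig applies options in order, so a later option overrides
// an earlier one that sets the same field.
func buildConfig(opts ...ProviderOption) ProviderConfig {
	var cfg ProviderConfig
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	cfg := buildConfig(
		WithAPIKey("example-key"),
		WithBaseURL("https://api.example.test"),
		WithExtension("model", "example-model"),
	)
	fmt.Printf("%+v\n", cfg)
}
```

The pattern keeps GetSTTProvider and GetTTSProvider signatures stable as new settings are added, since each new setting is just another option function.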

type STTClient

type STTClient = stt.Client

STTClient is the multi-provider STT client.

type STTProvider

type STTProvider = stt.Provider

STTProvider defines the interface for STT providers.

type STTProviderFactory

type STTProviderFactory func(config ProviderConfig) (stt.Provider, error)

STTProviderFactory creates an STT provider with the given configuration.

type STTStreamingProvider

type STTStreamingProvider = stt.StreamingProvider

STTStreamingProvider extends Provider with streaming support.

type Segment

type Segment = stt.Segment

Segment represents a transcription segment.

type StreamEvent

type StreamEvent = stt.StreamEvent

StreamEvent represents a streaming transcription event.

type SubtitleFormat

type SubtitleFormat = subtitle.Format

SubtitleFormat represents the output format for subtitles.

type SubtitleOptions

type SubtitleOptions = subtitle.Options

SubtitleOptions configures subtitle generation.

type SynthesisConfig

type SynthesisConfig = tts.SynthesisConfig

SynthesisConfig configures a TTS synthesis request.

type SynthesisResult

type SynthesisResult = tts.SynthesisResult

SynthesisResult contains the result of a TTS synthesis.

type TTSClient

type TTSClient = tts.Client

TTSClient is the multi-provider TTS client.

type TTSProvider

type TTSProvider = tts.Provider

TTSProvider defines the interface for TTS providers.

type TTSProviderFactory

type TTSProviderFactory func(config ProviderConfig) (tts.Provider, error)

TTSProviderFactory creates a TTS provider with the given configuration.

type TTSStreamChunk

type TTSStreamChunk = tts.StreamChunk

TTSStreamChunk represents a chunk of streaming audio.

type TTSStreamingProvider

type TTSStreamingProvider = tts.StreamingProvider

TTSStreamingProvider extends Provider with input streaming support.

type Transcript added in v0.8.0

type Transcript = stt.Transcript

Transcript is an alias for stt.Transcript, provided for backwards compatibility.

func LoadTranscript added in v0.8.0

func LoadTranscript(filePath string) (*Transcript, error)

LoadTranscript reads a transcript from a JSON file.

func NewTranscript added in v0.8.0

func NewTranscript(result *TranscriptionResult, provider, model, audioFile string, config *TranscriptionConfig) *Transcript

NewTranscript creates a Transcript from a TranscriptionResult. This is a convenience wrapper around stt.NewTranscript.

type TranscriptMetadata added in v0.8.0

type TranscriptMetadata = stt.TranscriptMetadata

TranscriptMetadata is an alias for stt.TranscriptMetadata, provided for backwards compatibility.

type TranscriptOptions added in v0.8.0

type TranscriptOptions = stt.TranscriptOptions

TranscriptOptions is an alias for stt.TranscriptOptions, provided for backwards compatibility.

type TranscriptSegment added in v0.8.0

type TranscriptSegment = stt.TranscriptSegment

TranscriptSegment is an alias for stt.TranscriptSegment, provided for backwards compatibility.

type TranscriptWord added in v0.8.0

type TranscriptWord = stt.TranscriptWord

TranscriptWord is an alias for stt.TranscriptWord, provided for backwards compatibility.

type TranscriptionConfig

type TranscriptionConfig = stt.TranscriptionConfig

TranscriptionConfig configures an STT transcription request.

type TranscriptionResult

type TranscriptionResult = stt.TranscriptionResult

TranscriptionResult contains the result of an STT transcription.

type Voice

type Voice = tts.Voice

Voice represents a voice configuration for TTS.

type Word

type Word = stt.Word

Word represents a word with timing information.

Directories

Path                     Synopsis
cmd/omnivoice (command)  Package main provides the entry point for the omnivoice CLI.
internal/cli             Package cli provides the command-line interface for omnivoice.
providers/all            Package all imports and registers all omnivoice providers.
                         Package schema re-exports JSON Schema definitions from omnivoice-core.