Picovoice

Build private, offline voice features with wake words, STT, local LLM, and TTS
Visit Website
picovoice.ai

Start by installing the SDK on your target platform and wiring the microphone stream into the runtime. Pick a wake phrase with the builder, export the tiny model, and tune sensitivity: higher values miss fewer utterances of the keyword but raise the false-alarm rate, so adjust until it triggers reliably without false alarms. Next, define a small set of intents and slots for core actions, such as “create task,” “open note,” or “start timer.” Connect command callbacks to UI updates, and add quick tests with the console tools to validate accuracy in your actual environment. The SDK is initialized with an AccessKey from the Picovoice Console, but audio is processed on the device, so no voice data leaves it and latency does not depend on network conditions.
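The callback wiring described above can be sketched in plain Python. This is a minimal illustration rather than the Picovoice SDK itself: the `Inference` dataclass, the `IntentRouter` class, and the intent and slot names (`createTask`, `startTimer`, `title`, `minutes`) are hypothetical stand-ins for whatever your speech-to-intent engine returns.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Inference:
    """Hypothetical result shape: was the command understood, plus intent and slots."""
    is_understood: bool
    intent: str = ""
    slots: Dict[str, str] = field(default_factory=dict)


class IntentRouter:
    """Maps intent names to handler callbacks (illustrative, not an SDK class)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Dict[str, str]], str]] = {}

    def on(self, intent: str, handler: Callable[[Dict[str, str]], str]) -> None:
        # Register one callback per intent name.
        self._handlers[intent] = handler

    def dispatch(self, inference: Inference) -> str:
        # Return the message the UI should show for this inference.
        if not inference.is_understood:
            return "Sorry, I didn't catch that."
        handler = self._handlers.get(inference.intent)
        if handler is None:
            return f"No handler for intent '{inference.intent}'."
        return handler(inference.slots)


router = IntentRouter()
router.on("createTask", lambda slots: f"Task created: {slots.get('title', 'untitled')}")
router.on("startTimer", lambda slots: f"Timer started for {slots.get('minutes', '1')} min")
```

Keeping the router separate from the audio loop means the same handlers can be exercised in tests with hand-built `Inference` objects, without a microphone.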

For productivity and content creation, combine three flows: command-and-control for navigation, streaming dictation for text entry, and lightweight local reasoning to clean up results. Use Cheetah for live transcription with punctuation and timestamps, then pass segments to the compact LLM to summarize, extract to-dos, or convert instructions into structured JSON. In a notes app, for example, you can say the wake phrase, issue a command to start a note, dictate freely, and finish with “summarize and email.” The system detects the intent, formats the transcript, generates a brief, and drafts the message, all without server round-trips. Creators can also generate scratch voice-overs with TTS to storyboard timing before recording final takes.
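The dictation-to-structured-note flow above might look roughly like the following. This sketch makes several assumptions: transcript segments arrive as plain strings, and the local LLM step is replaced by simple rule-based placeholders (`extract_todos`, `first_sentence`) so the example runs on its own. All function names and the JSON field names are hypothetical.

```python
import json
import re
from typing import Callable, Dict, List

# Splits on whitespace that follows sentence-ending punctuation.
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def extract_todos(text: str) -> List[str]:
    """Rule-based stand-in for the LLM step: keep sentences that open with an action cue."""
    cues = ("todo", "remember to", "need to", "follow up")
    sentences = [s.strip() for s in SENTENCE_BREAK.split(text) if s.strip()]
    return [s for s in sentences if s.lower().startswith(cues)]


def first_sentence(text: str) -> str:
    """Trivial placeholder summarizer; a real app would call its on-device model here."""
    return SENTENCE_BREAK.split(text, maxsplit=1)[0] if text else ""


def build_note(segments: List[str], summarize: Callable[[str], str]) -> str:
    """Join streamed transcript segments and emit a structured JSON note."""
    transcript = " ".join(seg.strip() for seg in segments if seg.strip())
    note: Dict[str, object] = {
        "transcript": transcript,
        "summary": summarize(transcript),
        "todos": extract_todos(transcript),
    }
    return json.dumps(note, indent=2)


segments = [
    "Met with the vendor today.",
    "Need to send the revised quote.",
    "Pricing looks fine.",
]
note_json = build_note(segments, first_sentence)
```

Passing the summarizer in as a callable keeps the pipeline shape fixed while the placeholder is swapped for a call to the on-device model.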

Features

  • Offline wake phrase detection with tunable sensitivity
  • Streaming speech recognition with punctuation and timestamps
  • On-device LLM for summarization, extraction, and intent parsing
  • Natural-sounding text-to-speech and speech-to-speech
  • Noise robustness tools and VAD controls
  • Speaker verification and diarization options
  • Custom intents, slots, and grammars
  • Cross-platform SDKs for embedded, mobile, web, and server
  • Tiny models optimized for low CPU and memory
  • Over-the-air model updates without app rebuilds
  • Local analytics with privacy-preserving aggregation
  • Regulatory alignment (HIPAA/GDPR) by keeping data on device

How It’s Used

  • Hands-free controls in mobile productivity apps
  • Field service note-taking with instant summaries
  • Warehouse or factory terminals with robust voice commands
  • Point-of-care dictation for healthcare staff
  • In-vehicle assistants for navigation and media
  • Private voice input for web applications via WASM
  • On-prem kiosks with multilingual prompts and replies
  • Live meeting captions and action-item extraction
  • Creators drafting scripts and temp voice-overs
  • Robotics voice teleoperation with local safety gating

Plans & Pricing

Free

Free

  • picoLLM (Inference) - 1M Tokens/Month
  • Orca (Streaming Text-to-Speech) - 100K Characters/Month
  • Cheetah (Streaming Speech-to-Text) - 250 Minutes/Month
  • Leopard (Speech-to-Text) - 250 Minutes/Month
  • Falcon (Speaker Diarization) - 250 Minutes/Month
  • Eagle (Speaker Recognition) - 100 Minutes/Month
  • Koala (Noise Suppression) - 100 Minutes/Month
  • Porcupine (Wake Word) - 1 Monthly Active User
  • Rhino (Speech-to-Intent) - 1 Monthly Active User
  • Cobra (Voice Activity Detection) - 1 Monthly Active User

Model Training

  • Compressed picoLLM Models - 10 GB/Month*
  • Speech-to-Text Models - 1 Model/Month
  • Voice Command Contexts - 1 Model/Month
  • Wake Word Models - 1 Model/Month

Support

  • GitHub Issues for Bugs

Terms

  • Usage Terms - Standard Terms of Use (with Non-Commercial Usage Rights)

Foundation

$6,000.00 per year

  • picoLLM (Inference) - 100M Tokens/Month
  • Orca (Streaming Text-to-Speech) - 10M Characters/Month
  • Cheetah (Streaming Speech-to-Text) - 25K Minutes/Month
  • Leopard (Speech-to-Text) - 25K Minutes/Month
  • Falcon (Speaker Diarization) - 25K Minutes/Month
  • Eagle (Speaker Recognition) - 10K Minutes/Month
  • Koala (Noise Suppression) - 10K Minutes/Month
  • Porcupine (Wake Word) - 100 Monthly Active Users
  • Rhino (Speech-to-Intent) - 100 Monthly Active Users
  • Cobra (Voice Activity Detection) - 100 Monthly Active Users

Model Training

  • Compressed picoLLM Models - 1 TB/Month*
  • Speech-to-Text Models - 10 Models/Month
  • Voice Command Contexts - 10 Models/Month
  • Wake Word Models - 10 Models/Month

Support

  • GitHub Issues for Bugs
  • Dedicated Support - 6 Hours (Email)
  • SLA - 3 Business Days

Terms

  • Usage Terms - Standard Terms of Use (with Commercial Usage Rights)
  • Payment Terms - Credit Card (Upon the Receipt of Invoice)

Enterprise

$30,000.00 per year

  • picoLLM (Inference) - 100M Tokens/Month
  • Orca (Streaming Text-to-Speech) - 10M Characters/Month
  • Cheetah (Streaming Speech-to-Text) - 25K Minutes/Month
  • Leopard (Speech-to-Text) - 25K Minutes/Month
  • Falcon (Speaker Diarization) - 25K Minutes/Month
  • Eagle (Speaker Recognition) - 10K Minutes/Month
  • Koala (Noise Suppression) - 10K Minutes/Month
  • Porcupine (Wake Word) - 100 Monthly Active Users
  • Rhino (Speech-to-Intent) - 100 Monthly Active Users
  • Cobra (Voice Activity Detection) - 100 Monthly Active Users

Model Training

  • Compressed picoLLM Models - 1 TB/Month
  • Speech-to-Text Models - 10 Models/Month
  • Voice Command Contexts - 10 Models/Month
  • Wake Word Models - 10 Models/Month

Support

  • GitHub Issues for Bugs
  • Dedicated Support - Custom
  • SLA - Custom
  • Custom Development

Terms

  • Usage Terms - Custom Terms of Use (with Commercial Usage Rights)
  • Payment Terms - Custom
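To check which plan covers an expected monthly workload, the quotas listed above can be compared programmatically. The sketch below copies a few of the listed limits for the Free and Foundation plans. The dictionary keys and the `cheapest_plan` helper are hypothetical names chosen for illustration. Enterprise is left out because its listed quotas match Foundation and its terms are custom. Note that the Free plan carries non-commercial usage rights only, which this check does not model.

```python
from typing import Dict, Optional

# Monthly quotas copied from the plan listing (a subset of engines only).
PLANS: Dict[str, Dict[str, int]] = {
    "Free": {
        "price_usd_per_year": 0,
        "cheetah_minutes": 250,
        "orca_characters": 100_000,
        "porcupine_mau": 1,
    },
    "Foundation": {
        "price_usd_per_year": 6000,
        "cheetah_minutes": 25_000,
        "orca_characters": 10_000_000,
        "porcupine_mau": 100,
    },
}


def cheapest_plan(usage: Dict[str, int]) -> Optional[str]:
    """Return the lowest-priced plan whose listed quotas cover `usage`, else None."""
    by_price = sorted(PLANS.items(), key=lambda item: item[1]["price_usd_per_year"])
    for name, plan in by_price:
        if all(plan[key] >= amount for key, amount in usage.items()):
            return name
    return None
```

For example, `cheapest_plan({"cheetah_minutes": 600, "porcupine_mau": 50})` selects Foundation, because the Free plan's 250 transcription minutes and single monthly active user are both exceeded.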
