Feature Request: Add Google Cloud Speech-to-Text and Text-to-Speech support #9738

@TeigenZhang

Summary

Add support for Google Cloud Speech-to-Text and Text-to-Speech APIs as providers for audio transcription and TTS.

Motivation

  • Many users with GCP accounts already have these APIs enabled and would like to spend their $300 free-trial credits on them
  • GCP offers generous free tiers:
    • Speech-to-Text: 60 minutes/month free
    • Text-to-Speech: 4 million characters/month free for Standard voices (WaveNet voices have a smaller free allowance)
  • GCP Speech-to-Text supports 125+ languages with excellent quality
  • GCP Text-to-Speech offers 220+ voices in 40+ languages
  • Authentication can reuse the existing Application Default Credentials (ADC) created by running gcloud auth application-default login

Proposed Implementation

Speech-to-Text (Audio Transcription)

Add google-cloud as a provider option in tools.media.audio:

{
  "tools": {
    "media": {
      "audio": {
        "models": [
          {
            "provider": "google-cloud",
            "model": "latest_long",
            "capabilities": ["audio"],
            "language": "zh-CN"
          }
        ]
      }
    }
  }
}
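
To make the mapping concrete, here is a minimal sketch of how such a config entry might be translated into a request body for the Speech-to-Text v1 REST method speech:recognize. The helper name buildRecognizeRequest, the AudioModelEntry type, and the "en-US" fallback are assumptions for illustration only and are not part of OpenClaw; the audio is assumed to arrive already base64-encoded.

```typescript
// Shape of one entry in tools.media.audio.models, as proposed in this issue.
interface AudioModelEntry {
  provider: string;
  model: string;
  capabilities: string[];
  language?: string;
}

// Request body for the Speech-to-Text v1 REST method speech:recognize.
interface RecognizeRequest {
  config: { languageCode: string; model: string };
  audio: { content: string }; // base64-encoded audio bytes
}

const SPEECH_RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize";

// Hypothetical helper: translate a config entry into a recognize request body.
function buildRecognizeRequest(
  entry: AudioModelEntry,
  audioBase64: string,
): RecognizeRequest {
  if (entry.provider !== "google-cloud") {
    throw new Error(`unsupported provider: ${entry.provider}`);
  }
  return {
    config: {
      // Assumed fallback when the entry omits "language".
      languageCode: entry.language ?? "en-US",
      model: entry.model,
      // "encoding" is left out; the API can infer it for FLAC and WAV input.
    },
    audio: { content: audioBase64 },
  };
}
```

The body would be sent as JSON in a POST to SPEECH_RECOGNIZE_URL with a bearer token. Synchronous recognition only accepts short clips (roughly a minute), so longer recordings would need the long-running variant of the API instead.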

Text-to-Speech

Add google-cloud as a provider option in messages.tts:

{
  "messages": {
    "tts": {
      "provider": "google-cloud",
      "googleCloud": {
        "voice": "cmn-CN-Wavenet-A",
        "languageCode": "cmn-CN"
      }
    }
  }
}
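
As a rough illustration, the googleCloud block above could map onto the Text-to-Speech v1 REST method text:synthesize as sketched below. The helper buildSynthesizeRequest, the GoogleCloudTtsConfig type, and the MP3 default are hypothetical choices made for this example, not existing OpenClaw code.

```typescript
// Shape of messages.tts.googleCloud, as proposed in this issue.
interface GoogleCloudTtsConfig {
  voice: string;
  languageCode: string;
}

type AudioEncoding = "MP3" | "OGG_OPUS" | "LINEAR16";

// Request body for the Text-to-Speech v1 REST method text:synthesize.
interface SynthesizeRequest {
  input: { text: string };
  voice: { languageCode: string; name: string };
  audioConfig: { audioEncoding: AudioEncoding };
}

const TTS_SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize";

// Hypothetical helper: build the synthesize request body for a message.
function buildSynthesizeRequest(
  cfg: GoogleCloudTtsConfig,
  text: string,
  audioEncoding: AudioEncoding = "MP3", // assumed default output format
): SynthesizeRequest {
  if (text.trim().length === 0) {
    throw new Error("cannot synthesize empty text");
  }
  return {
    input: { text },
    voice: { languageCode: cfg.languageCode, name: cfg.voice },
    audioConfig: { audioEncoding },
  };
}
```

The API responds with a JSON object whose audioContent field holds the base64-encoded audio, which the caller would decode before sending it as a voice message.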

Authentication

Should support:

  1. ADC (Application Default Credentials) - already used by many GCP users
  2. Service account JSON key file (optional)
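
One possible resolution order for the two options above is sketched here: an explicitly configured key file wins, otherwise the usual ADC lookup applies (the GOOGLE_APPLICATION_CREDENTIALS variable, then the file written by gcloud, then the metadata server on GCP-hosted machines). The keyFile option and the resolveCredentialSource function are hypothetical names, and the gcloud path shown is the Linux/macOS location only. In practice an auth library such as google-auth-library would likely perform the ADC lookup itself.

```typescript
type CredentialSource =
  | { kind: "service-account-key"; path: string }
  | { kind: "adc-env"; path: string }
  | { kind: "adc-gcloud-file"; path: string }
  | { kind: "adc-metadata-server" };

interface ResolveInput {
  keyFile?: string; // hypothetical config option for an explicit JSON key
  env: Record<string, string | undefined>;
  homeDir: string;
  fileExists: (path: string) => boolean;
}

// Hypothetical helper: decide where credentials should come from.
function resolveCredentialSource(input: ResolveInput): CredentialSource {
  // 1. An explicitly configured service account key takes precedence.
  if (input.keyFile) {
    return { kind: "service-account-key", path: input.keyFile };
  }
  // 2. ADC step one: the GOOGLE_APPLICATION_CREDENTIALS environment variable.
  const envPath = input.env["GOOGLE_APPLICATION_CREDENTIALS"];
  if (envPath) {
    return { kind: "adc-env", path: envPath };
  }
  // 3. ADC step two: the file written by `gcloud auth application-default login`
  //    (Linux/macOS location; Windows keeps it under %APPDATA%\gcloud).
  const gcloudPath = `${input.homeDir}/.config/gcloud/application_default_credentials.json`;
  if (input.fileExists(gcloudPath)) {
    return { kind: "adc-gcloud-file", path: gcloudPath };
  }
  // 4. ADC step three: the metadata server of the attached service account.
  return { kind: "adc-metadata-server" };
}
```

Passing env, homeDir, and fileExists in as parameters keeps the function pure, so the precedence rules can be unit-tested without touching the real file system.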

Related

This complements #9729 (Vertex AI Gemini support) - together they would allow full utilization of GCP free credits within OpenClaw.

Additional Context

Thanks for considering!

Labels: enhancement
