
Configuration Guide

This guide covers all configuration options for Esperanto, including environment variables, parameters, and best practices.

Environment Variables

Esperanto uses environment variables for API keys and provider configuration. The complete reference is in .env.example at the project root.

Quick Setup

# Copy example file
cp .env.example .env

# Edit with your credentials
nano .env

Using Environment Variables

Option 1: Export in shell

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

Option 2: .env file with python-dotenv

from dotenv import load_dotenv
load_dotenv()

# Now Esperanto can access the variables
from esperanto.factory import AIFactory
model = AIFactory.create_language("openai", "gpt-4")

Option 3: Direct configuration (not recommended for production)

model = AIFactory.create_language(
    "openai", "gpt-4",
    config={"api_key": "sk-..."}
)

Provider Configuration

Cloud API Providers

OpenAI

OPENAI_API_KEY=sk-...
model = AIFactory.create_language("openai", "gpt-4", config={
    "api_key": "sk-...",  # Or from env var
    "organization": "org-...",  # Optional
    "base_url": "https://api.openai.com/v1",  # Optional custom endpoint
    "temperature": 0.7,
    "max_tokens": 1000,
    "timeout": 60.0
})

Full OpenAI Setup Guide

Anthropic

ANTHROPIC_API_KEY=sk-ant-...
model = AIFactory.create_language("anthropic", "claude-3-5-sonnet-20241022", config={
    "api_key": "sk-ant-...",  # Or from env var
    "temperature": 0.7,
    "max_tokens": 1000,
    "timeout": 60.0
})

Full Anthropic Setup Guide

Google (GenAI)

GOOGLE_API_KEY=...
# Optional: Override base URL
GEMINI_API_BASE_URL=https://generativelanguage.googleapis.com
model = AIFactory.create_language("google", "gemini-pro", config={
    "api_key": "...",  # Or from env var
    "timeout": 60.0
})

Full Google Setup Guide

Groq

GROQ_API_KEY=...

Full Groq Setup Guide

Mistral

MISTRAL_API_KEY=...

Full Mistral Setup Guide

DeepSeek

DEEPSEEK_API_KEY=...

Full DeepSeek Setup Guide

Perplexity

PERPLEXITY_API_KEY=...

Full Perplexity Setup Guide

xAI

XAI_API_KEY=...

Full xAI Setup Guide

DashScope (Qwen)

DASHSCOPE_API_KEY=...

Full DashScope Setup Guide

MiniMax

MINIMAX_API_KEY=...

Full MiniMax Setup Guide

OpenRouter

OPENROUTER_API_KEY=...

Full OpenRouter Setup Guide

Jina

JINA_API_KEY=...

Full Jina Setup Guide

Voyage

VOYAGE_API_KEY=...

Full Voyage Setup Guide

ElevenLabs

ELEVENLABS_API_KEY=...

Full ElevenLabs Setup Guide

Enterprise Providers

Azure OpenAI

Generic configuration (works for all modalities):

AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_API_VERSION=2024-02-01

Modality-specific configuration (takes precedence):

# For LLM
AZURE_OPENAI_API_KEY_LLM=...
AZURE_OPENAI_ENDPOINT_LLM=...
AZURE_OPENAI_API_VERSION_LLM=...

# For Embeddings
AZURE_OPENAI_API_KEY_EMBEDDING=...
AZURE_OPENAI_ENDPOINT_EMBEDDING=...
AZURE_OPENAI_API_VERSION_EMBEDDING=...

# For Speech-to-Text
AZURE_OPENAI_API_KEY_STT=...
AZURE_OPENAI_ENDPOINT_STT=...
AZURE_OPENAI_API_VERSION_STT=...

# For Text-to-Speech
AZURE_OPENAI_API_KEY_TTS=...
AZURE_OPENAI_ENDPOINT_TTS=...
AZURE_OPENAI_API_VERSION_TTS=...

Priority order:

  1. Modality-specific variables (highest)
  2. Generic variables
  3. Legacy variables (backward compatibility)

Full Azure Setup Guide

Vertex AI

VERTEX_PROJECT=your-gcp-project-id
VERTEX_LOCATION=us-east5
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

Full Vertex AI Setup Guide

Local/Self-Hosted Providers

Ollama

OLLAMA_BASE_URL=http://localhost:11434

No API key needed. Requires Ollama installed locally.

Full Ollama Setup Guide

Transformers

# Optional: HuggingFace token for private/gated models
HF_TOKEN=...

No API key needed. Models downloaded from HuggingFace.

Full Transformers Setup Guide

OpenAI-Compatible

Generic configuration (works for all modalities):

OPENAI_COMPATIBLE_BASE_URL=http://localhost:1234/v1
OPENAI_COMPATIBLE_API_KEY=...  # Optional, depends on endpoint

Modality-specific configuration:

# For LLM
OPENAI_COMPATIBLE_BASE_URL_LLM=http://localhost:1234/v1
OPENAI_COMPATIBLE_API_KEY_LLM=...

# For Embeddings
OPENAI_COMPATIBLE_BASE_URL_EMBEDDING=http://localhost:8080/v1
OPENAI_COMPATIBLE_API_KEY_EMBEDDING=...

# For Speech-to-Text
OPENAI_COMPATIBLE_BASE_URL_STT=http://localhost:9000/v1
OPENAI_COMPATIBLE_API_KEY_STT=...

# For Text-to-Speech
OPENAI_COMPATIBLE_BASE_URL_TTS=http://localhost:7000/v1
OPENAI_COMPATIBLE_API_KEY_TTS=...

Use cases:

  • LM Studio (local LLM server)
  • vLLM (high-performance inference)
  • Local Ollama with OpenAI compatibility
  • Custom OpenAI-compatible endpoints

Full OpenAI-Compatible Setup Guide

Timeout Configuration

Control request timeouts globally or per provider type.

Default Timeouts

  • LLM, Embedding, Reranking: 60 seconds
  • Speech-to-Text, Text-to-Speech: 300 seconds (5 minutes)

Global Timeout Configuration

Set via environment variables:

# Override defaults for all providers
ESPERANTO_LLM_TIMEOUT=90           # 90 seconds for LLMs
ESPERANTO_EMBEDDING_TIMEOUT=120    # 2 minutes for embeddings
ESPERANTO_RERANKER_TIMEOUT=75      # 75 seconds for rerankers
ESPERANTO_STT_TIMEOUT=600          # 10 minutes for STT
ESPERANTO_TTS_TIMEOUT=400          # ~6.7 minutes for TTS

Per-Instance Configuration

Override via config parameter:

# Via config dictionary (highest priority)
model = AIFactory.create_language(
    "openai", "gpt-4",
    config={"timeout": 120.0}  # 2 minutes
)

# For STT/TTS, also via direct parameter
transcriber = AIFactory.create_speech_to_text(
    "openai",
    timeout=600.0  # 10 minutes
)

Priority Order

  1. Config parameter (highest priority)
  2. Environment variable (ESPERANTO_*_TIMEOUT)
  3. Provider type default (60s or 300s)

Full Timeout Configuration Guide

SSL Verification Configuration

Configure SSL certificate verification for providers using HTTPS connections. This is useful when connecting to local services with self-signed certificates.

Default Behavior

SSL verification is enabled by default for security. All HTTPS connections verify SSL certificates using the system's certificate store.

Disabling SSL Verification

Security Warning: Disabling SSL verification exposes you to man-in-the-middle attacks. Only disable for development/testing with local services.

Via environment variable:

ESPERANTO_SSL_VERIFY=false

Via config parameter:

model = AIFactory.create_language(
    "ollama", "llama3",
    config={"verify_ssl": False}
)

Using Custom CA Certificates

For self-signed certificates, the recommended approach is to specify a custom CA bundle instead of disabling verification:

Via environment variable:

ESPERANTO_SSL_CA_BUNDLE=/path/to/ca-bundle.pem

Via config parameter:

model = AIFactory.create_language(
    "ollama", "llama3",
    config={"ssl_ca_bundle": "/path/to/ca-bundle.pem"}
)

Priority Order

  1. Config parameter ssl_ca_bundle (highest priority)
  2. Config parameter verify_ssl
  3. Environment variable ESPERANTO_SSL_CA_BUNDLE
  4. Environment variable ESPERANTO_SSL_VERIFY
  5. Default True (SSL verification enabled)

Common Use Cases

Local Ollama with reverse proxy (self-signed cert):

# Option 1: Disable verification (development only)
model = AIFactory.create_language(
    "ollama", "llama3",
    config={
        "base_url": "https://localhost:8443",
        "verify_ssl": False
    }
)

# Option 2: Use custom CA bundle (recommended)
model = AIFactory.create_language(
    "ollama", "llama3",
    config={
        "base_url": "https://localhost:8443",
        "ssl_ca_bundle": "/etc/ssl/certs/my-ca.pem"
    }
)

LM Studio behind Caddy proxy:

# In .env
OPENAI_COMPATIBLE_BASE_URL=https://lmstudio.local
ESPERANTO_SSL_CA_BUNDLE=/path/to/caddy-ca.pem
model = AIFactory.create_language("openai-compatible", "my-model")

SSL Configuration Applies To

All provider types that use HTTP clients:

  • Language Models (LLM)
  • Embedding Models
  • Speech-to-Text (STT)
  • Text-to-Speech (TTS)
  • Rerankers

Proxy Configuration

Esperanto uses the standard HTTP proxy environment variables supported by most tools and libraries. Proxy configuration is handled automatically by the underlying httpx library.

Environment Variables

# HTTP proxy (for http:// requests)
HTTP_PROXY=http://proxy.example.com:8080
http_proxy=http://proxy.example.com:8080

# HTTPS proxy (for https:// requests)
HTTPS_PROXY=http://proxy.example.com:8080
https_proxy=http://proxy.example.com:8080

# Hosts to bypass proxy (comma-separated)
NO_PROXY=localhost,127.0.0.1,.internal.com
no_proxy=localhost,127.0.0.1,.internal.com

Both uppercase and lowercase versions are supported.
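Because these are the standard variables, you can check what will be picked up using Python's standard library, which reads the same environment variables that httpx honors:

```python
import os
import urllib.request

# Simulate a proxied environment
os.environ["HTTPS_PROXY"] = "http://proxy.example.com:8080"
os.environ["NO_PROXY"] = "localhost,127.0.0.1"

# The stdlib resolves the same variables httpx uses
proxies = urllib.request.getproxies()
print(proxies["https"])  # http://proxy.example.com:8080

# Hosts listed in NO_PROXY bypass the proxy
print(bool(urllib.request.proxy_bypass("localhost")))  # True
```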

Proxy URL Formats

# HTTP proxy
HTTP_PROXY=http://proxy.example.com:8080

# HTTPS proxy (note: proxy URL is usually http://, not https://)
HTTPS_PROXY=http://proxy.example.com:8080

# Proxy with authentication
HTTP_PROXY=http://username:[email protected]:8080

Common Use Cases

Corporate network with proxy:

# In .env
HTTP_PROXY=http://corporate-proxy.internal:3128
HTTPS_PROXY=http://corporate-proxy.internal:3128
NO_PROXY=localhost,127.0.0.1,.internal.com
# All providers automatically use the proxy
model = AIFactory.create_language("openai", "gpt-4")
embedder = AIFactory.create_embedding("openai", "text-embedding-3-small")

Bypass proxy for local services:

# In .env
HTTP_PROXY=http://proxy.example.com:8080
HTTPS_PROXY=http://proxy.example.com:8080
NO_PROXY=localhost,127.0.0.1,ollama.local
# External APIs go through proxy
model = AIFactory.create_language("openai", "gpt-4")

# Local Ollama bypasses proxy (if in NO_PROXY)
local_model = AIFactory.create_language("ollama", "llama3")

Proxy Configuration Applies To

All provider types that use HTTP clients:

  • Language Models (LLM)
  • Embedding Models
  • Speech-to-Text (STT)
  • Text-to-Speech (TTS)
  • Rerankers

Common Parameters

Language Models (LLM)

model = AIFactory.create_language(
    provider="openai",
    model_name="gpt-4",
    config={
        # Sampling parameters
        "temperature": 0.7,      # 0.0-2.0, creativity
        "top_p": 0.9,           # Nucleus sampling
        "max_tokens": 1000,     # Response length limit

        # Output format
        "streaming": False,      # Enable token-by-token streaming
        "structured": {"type": "json"},  # JSON output mode

        # Performance
        "timeout": 60.0,        # Request timeout in seconds

        # Authentication (if not using env vars)
        "api_key": "...",

        # Provider-specific
        "organization": "...",  # OpenAI only
        "base_url": "...",      # Custom endpoints
    }
)

Embeddings

embedder = AIFactory.create_embedding(
    provider="openai",
    model_name="text-embedding-3-small",
    config={
        # Performance
        "timeout": 60.0,
        "batch_size": 32,       # Texts per request

        # Advanced (provider-specific)
        "task_type": EmbeddingTaskType.RETRIEVAL_QUERY,  # Jina, Google
        "late_chunking": True,  # Jina only
        "output_dimensions": 512,  # Jina, OpenAI (some models)

        # Authentication
        "api_key": "...",
    }
)

Reranking

reranker = AIFactory.create_reranker(
    provider="jina",
    model_name="jina-reranker-v2-base-multilingual",
    config={
        "timeout": 60.0,
        "api_key": "...",
    }
)

Speech-to-Text

transcriber = AIFactory.create_speech_to_text(
    provider="openai",
    model_name="whisper-1",
    config={
        "timeout": 300.0,       # Longer for audio processing
        "language": "en",       # ISO-639-1 code
        "response_format": "json",  # "json", "text", "srt", "vtt", "verbose_json"
        "temperature": 0.0,     # Sampling (0.0 = deterministic)
        "api_key": "...",
    }
)

Text-to-Speech

speaker = AIFactory.create_text_to_speech(
    provider="openai",
    model_name="tts-1",
    config={
        "timeout": 300.0,       # Longer for audio generation
        "voice": "nova",        # Default voice
        "speed": 1.0,           # 0.25-4.0, speech rate
        "response_format": "mp3",  # "mp3", "opus", "aac", "flac", "wav", "pcm"
        "api_key": "...",
    }
)

Configuration Patterns

Development vs Production

Development:

# .env.development
OPENAI_API_KEY=...
ESPERANTO_LLM_TIMEOUT=30  # Faster timeouts for testing

Production:

# .env.production
OPENAI_API_KEY=...
ESPERANTO_LLM_TIMEOUT=120  # Longer timeouts for reliability

Load environment-specific config:

import os
from dotenv import load_dotenv

env = os.getenv("ENV", "development")
load_dotenv(f".env.{env}")

Multi-Environment Setup

import os

from esperanto.factory import AIFactory

def get_model():
    """Get model based on environment."""
    if os.getenv("ENV") == "production":
        # Production: OpenAI for quality
        return AIFactory.create_language("openai", "gpt-4")
    else:
        # Development: Ollama for cost savings
        return AIFactory.create_language("ollama", "llama3.2")

model = get_model()

Provider Fallback

def create_llm_with_fallback():
    """Try primary provider, fall back to secondary."""
    try:
        return AIFactory.create_language("openai", "gpt-4", config={"timeout": 30.0})
    except Exception as e:
        print(f"Primary failed: {e}, falling back to Groq")
        return AIFactory.create_language("groq", "mixtral-8x7b-32768")

model = create_llm_with_fallback()

Multi-Provider Configuration

# .env
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
JINA_API_KEY=...
ELEVENLABS_API_KEY=...

# Use best provider for each task
llm = AIFactory.create_language("anthropic", "claude-3-5-sonnet-20241022")
embedder = AIFactory.create_embedding("jina", "jina-embeddings-v3")
speaker = AIFactory.create_text_to_speech("elevenlabs", "eleven_multilingual_v2")

Best Practices

Security

DO:

  • ✅ Use environment variables for API keys
  • ✅ Use .env file in development (add to .gitignore)
  • ✅ Use secret management in production (AWS Secrets Manager, etc.)
  • ✅ Rotate API keys regularly
  • ✅ Use least-privilege keys when possible

DON'T:

  • ❌ Hard-code API keys in source code
  • ❌ Commit .env files to version control
  • ❌ Share API keys in logs or error messages
  • ❌ Use production keys in development

Performance

  • Timeouts: Set appropriate timeouts based on expected operation duration
  • Batch Size: Increase for embeddings when processing many texts
  • Async: Use async methods for concurrent requests
  • Caching: Cache model instances when possible (AIFactory does this automatically)

Error Handling

from esperanto.factory import AIFactory

messages = [{"role": "user", "content": "Hello"}]  # OpenAI-style message list

try:
    model = AIFactory.create_language("openai", "gpt-4")
    response = model.chat_complete(messages)
except ValueError as e:
    # Configuration errors (invalid parameters, missing API key)
    print(f"Configuration error: {e}")
except TimeoutError as e:
    # Request timeout
    print(f"Request timed out: {e}")
except Exception as e:
    # Other errors (network, API errors, etc.)
    print(f"Error: {e}")

Testing

Use different providers for test vs production:

# conftest.py
import os

import pytest

from esperanto.factory import AIFactory

@pytest.fixture
def llm():
    if os.getenv("CI"):
        # CI: use a mock or free provider
        # (MockLLM stands in for your own test double)
        return MockLLM()
    else:
        # Local: use a real provider
        return AIFactory.create_language("openai", "gpt-4")

Validation

Esperanto validates configuration parameters:

  • Timeout: Must be 1-3600 seconds
  • Temperature: Must be 0.0-2.0 (LLM)
  • API Keys: Must be provided (via env var or config)
  • Model Names: Validated against available models (where possible)

Invalid configuration raises ValueError with descriptive message.
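A sketch of what checks like these look like in practice. `validate_config` is hypothetical and simply mirrors the documented ranges:

```python
def validate_config(config: dict) -> None:
    """Hypothetical checks mirroring the documented validation ranges."""
    timeout = config.get("timeout")
    if timeout is not None and not 1 <= timeout <= 3600:
        raise ValueError(f"timeout must be 1-3600 seconds, got {timeout}")
    temperature = config.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError(f"temperature must be 0.0-2.0, got {temperature}")

validate_config({"temperature": 0.7, "timeout": 60})  # passes silently

try:
    validate_config({"temperature": 3.5})
except ValueError as e:
    print(f"Configuration error: {e}")
```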

Complete .env Example

See .env.example in project root for the complete, up-to-date reference with all providers and options.

# Copy to get started
cp .env.example .env

See Also


Need help with a specific provider? → Check Provider Setup Guides