Add OpenRouter cache_control support for provider-side prompt caching #9600

@edutj-t

Description

OpenRouter supports server‑side prompt caching via the cache_control parameter (docs: https://openrouter.ai/docs/guides/best-practices/prompt-caching).

This can significantly reduce token costs by caching the static prefix (system prompt + injected workspace files) across requests.

Currently, OpenClaw already implements this for direct Anthropic access via cacheRetention, but OpenRouter requests - for some providers - don't include cache_control, missing potential savings of ~10‑12k tokens per turn. I recognise that some providers cache automatically, but others don't.
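For reference, the OpenRouter docs use Anthropic-style breakpoints: a cache_control object attached to the content part that ends the static prefix. Roughly like this (model name and prompt text illustrative):

```json
{
  "model": "anthropic/claude-3.5-sonnet",
  "messages": [
    {
      "role": "system",
      "content": [
        { "type": "text", "text": "You are a coding agent." },
        {
          "type": "text",
          "text": "<injected workspace files>",
          "cache_control": { "type": "ephemeral" }
        }
      ]
    },
    { "role": "user", "content": "Refactor the parser." }
  ]
}
```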

Proposed solution:

  1. Add cache_control parameter to OpenRouter provider configuration
  2. Compute hash of static prompt prefix (system prompt + injected files)
  3. Insert cache_control breakpoint after static prefix
  4. Track and reuse cache IDs across requests with same prefix
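Steps 2 and 3 above can be sketched as standalone helpers (function and type names are hypothetical; the message shape assumes OpenRouter's Anthropic-style content parts):

```typescript
import { createHash } from 'node:crypto';

interface ContentPart {
  type: string;
  text: string;
  cache_control?: { type: 'ephemeral'; ttl?: '1h' };
}

interface Message {
  role: string;
  content: ContentPart[];
}

// Step 2: hash the static prefix (system messages) so a changed prompt
// or changed injected files produce a different key.
function computePrefixHash(messages: Message[]): string {
  const staticText = messages
    .filter((m) => m.role === 'system')
    .map((m) => m.content.map((c) => c.text).join(''))
    .join('\n');
  return createHash('sha256').update(staticText).digest('hex');
}

// Step 3: attach a cache_control breakpoint to the last text part of the
// last system message, marking the end of the static prefix.
function insertCacheControl(messages: Message[]): void {
  const lastSystem = [...messages].reverse().find((m) => m.role === 'system');
  const lastPart = lastSystem?.content.at(-1);
  if (lastPart?.type === 'text') {
    lastPart.cache_control = { type: 'ephemeral' };
  }
}
```

Keeping these pure (no provider state) makes them easy to unit-test independently of the request path.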

Configuration example:
{
  agents: {
    defaults: {
      models: {
        "openrouter/openai/chatgpt-4o": {
          params: {
            cache_control: { type: "ephemeral", ttl: "1h" }
          }
        }
      }
    }
  }
}

Benefits:

  • Reduces token burn for identical prefix across turns

  • Works across sessions if prefix unchanged

  • Compatible with OpenRouter providers that support prompt caching (Anthropic, OpenAI, Gemini, DeepSeek, etc.)

  • Follows existing pattern from Anthropic cacheRetention implementation

Implementation complexity:
Low‑medium (~100‑200 LoC)
Code changes sketch
Here's a minimal sketch for packages/gateway/src/providers/openrouter.ts (based on inferred structure):

// Hypothetical implementation - needs actual code inspection
import { createHash } from 'node:crypto';

interface OpenRouterCacheControl {
  type: 'ephemeral';
  ttl?: '1h';
}

interface OpenRouterRequest {
  messages: Array<{
    role: string;
    content: Array<{
      type: string;
      text: string;
      cache_control?: OpenRouterCacheControl;
    }>;
  }>;
  cache_control?: OpenRouterCacheControl; // Top-level for some providers
  cache_id?: string; // Hypothetical handle for reusing a provider cache entry
}

class OpenRouterProvider {
  private prefixCache = new Map<string, string>(); // hash -> cache_id
  
  async createChatCompletion(
    request: OpenRouterRequest,
    config: { params?: { cache_control?: OpenRouterCacheControl } },
  ) {
    const { cache_control } = config.params ?? {};
    
    if (cache_control) {
      // 1. Compute hash of static prefix (system prompt + injected files)
      const prefixHash = this.computePrefixHash(request.messages);
      
      // 2. Check for existing cache ID
      const cacheId = this.prefixCache.get(prefixHash);
      if (cacheId) {
        request.cache_id = cacheId;
      } else {
        // 3. Insert cache_control breakpoint after static prefix
        this.insertCacheControl(request.messages, cache_control);
      }
    }
    
    const response = await this.sendToOpenRouter(request);
    
    // 4. Store new cache ID from response
    if (cache_control && response.cache_id && !request.cache_id) {
      const prefixHash = this.computePrefixHash(request.messages);
      this.prefixCache.set(prefixHash, response.cache_id);
    }
    
    return response;
  }
  
  private computePrefixHash(messages: OpenRouterRequest['messages']): string {
    // Identify static parts: system messages + injected file content
    // Hash them for change detection
    const staticText = messages
      .filter(msg => msg.role === 'system')
      .map(msg => msg.content.map(c => c.text).join(''))
      .join('');
    return createHash('sha256').update(staticText).digest('hex');
  }
  
  private insertCacheControl(
    messages: OpenRouterRequest['messages'],
    cache_control: OpenRouterCacheControl,
  ) {
    // Find the last system message or first user message
    // Insert cache_control in the appropriate text part
    for (const msg of messages) {
      if (msg.role === 'system' && msg.content?.length) {
        const lastContent = msg.content[msg.content.length - 1];
        if (lastContent.type === 'text') {
          lastContent.cache_control = cache_control;
          break;
        }
      }
    }
  }
}

Additional considerations:

  • Need to check minimum token requirements per provider
  • Handle multiple cache_control breakpoints for Anthropic (max 4)
  • Clear cache when workspace files change
  • Add metrics to track cache hits/savings

Labels: enhancement (New feature or request)
