Skip to content

V3 Spec Proposal: Token Usage Normalization for Vercel AI SDK #9921

@jonaslalin

Description

@jonaslalin

Description

Token Usage Normalization Proposal for Vercel AI SDK

Problem Statement

The Vercel AI SDK currently maps token usage inconsistently across providers, leading to incorrect totals and missing information. Since the SDK normalizes language model messages across providers, it should also normalize usage statistics.

Current Provider Implementations

Anthropic's Token Reporting

Anthropic reports tokens as three separate, non-overlapping values:

// First request (cache WRITE)
{
  "input_tokens": 18,                      // New tokens (not cached)
  "cache_creation_input_tokens": 2284,     // Tokens written to cache
  "cache_read_input_tokens": 0,            // Tokens read from cache
  "output_tokens": 158
}

// Second request (cache READ)
{
  "input_tokens": 18,                      // New tokens (not cached)
  "cache_creation_input_tokens": 0,
  "cache_read_input_tokens": 2284,         // Tokens read from cache
  "output_tokens": 141
}

Anthropic's calculation:

Total Input = input_tokens + cache_creation_input_tokens + cache_read_input_tokens
            = 18 + 2284 + 0 = 2302  (first request)
            = 18 + 0 + 2284 = 2302  (second request)

OpenAI's Token Reporting

OpenAI reports tokens as total with cached subset:

// First request (no cache)
{
  "input_tokens": 1543,                    // Total input tokens
  "input_tokens_details": {
    "cached_tokens": 0                     // Subset that was cached
  },
  "output_tokens": 600,                    // Total output tokens
  "output_tokens_details": {
    "reasoning_tokens": 512                // Subset for reasoning
  }
}

// Second request (with cache)
{
  "input_tokens": 1543,                    // Total input tokens (same)
  "input_tokens_details": {
    "cached_tokens": 1408                  // Subset that was cached
  },
  "output_tokens": 624,
  "output_tokens_details": {
    "reasoning_tokens": 512
  }
}

OpenAI's calculation:

Total Input = input_tokens (already includes cached)
            = 1543 (both requests report same total)

Non-cached Input = input_tokens - cached_tokens
                 = 1543 - 0 = 1543     (first request)
                 = 1543 - 1408 = 135   (second request)

The Mismatch

Provider Total Input Non-cached Cache Write Cache Read
Anthropic ❌ Must calculate input_tokens cache_creation_input_tokens cache_read_input_tokens
OpenAI input_tokens ❌ Must calculate ❌ Missing ⚠️ cached_tokens (doesn't distinguish read vs write)
Vercel AI SDK (current) ❌ Wrong for Anthropic ❌ Missing ⚠️ Only has cachedInputTokens

Problems with Current Vercel AI SDK Mapping

  1. Incorrect Anthropic totals: If Vercel maps Anthropic's input_tokens directly to a "total", it's wrong (18 ≠ 2302)
  2. Missing cache write tracking: No way to represent the cost of writing to cache
  3. Ambiguous cached tokens: cachedInputTokens doesn't distinguish between reading from cache vs writing to cache

Proposed Normalized Schema

I propose something similar to:

interface ModelUsage {
  // Always present: total counts
  inputTokens: number;          // Total input tokens
  outputTokens: number;         // Total output tokens

  // Breakdown of input tokens (all optional, should sum to inputTokens)
  inputTokensDetails?: {
    regularTokens?: number;     // Tokens not involved in caching
    cacheWriteTokens?: number;  // Tokens written to cache (creation cost)
    cacheReadTokens?: number;   // Tokens read from cache (reduced cost)
  };

  // Breakdown of output tokens (all optional, should sum to outputTokens)
  outputTokensDetails?: {
    regularTokens?: number;     // Normal output tokens
    reasoningTokens?: number;   // Reasoning/thinking tokens (OpenAI)
  };
}

Mapping Examples

Anthropic → Normalized

// Anthropic response (cache write)
{
  "input_tokens": 18,
  "cache_creation_input_tokens": 2284,
  "cache_read_input_tokens": 0,
  "output_tokens": 158
}

// Normalized
{
  "inputTokens": 2302,              // 18 + 2284 + 0
  "outputTokens": 158,
  "inputTokensDetails": {
    "regularTokens": 18,
    "cacheWriteTokens": 2284,
    "cacheReadTokens": 0
  }
}

// Anthropic response (cache read)
{
  "input_tokens": 18,
  "cache_creation_input_tokens": 0,
  "cache_read_input_tokens": 2284,
  "output_tokens": 141
}

// Normalized
{
  "inputTokens": 2302,              // 18 + 0 + 2284
  "outputTokens": 141,
  "inputTokensDetails": {
    "regularTokens": 18,
    "cacheWriteTokens": 0,
    "cacheReadTokens": 2284
  }
}

OpenAI → Normalized

// OpenAI response (no cache)
{
  "input_tokens": 1543,
  "input_tokens_details": { "cached_tokens": 0 },
  "output_tokens": 600,
  "output_tokens_details": { "reasoning_tokens": 512 }
}

// Normalized
{
  "inputTokens": 1543,
  "outputTokens": 600,
  "inputTokensDetails": {
    "regularTokens": 1543,          // 1543 - 0
    "cacheWriteTokens": 0,          // OpenAI doesn't report this separately
    "cacheReadTokens": 0
  },
  "outputTokensDetails": {
    "regularTokens": 88,            // 600 - 512
    "reasoningTokens": 512
  }
}

// OpenAI response (with cache)
{
  "input_tokens": 1543,
  "input_tokens_details": { "cached_tokens": 1408 },
  "output_tokens": 624,
  "output_tokens_details": { "reasoning_tokens": 512 }
}

// Normalized
{
  "inputTokens": 1543,
  "outputTokens": 624,
  "inputTokensDetails": {
    "regularTokens": 135,           // 1543 - 1408
    "cacheReadTokens": 1408,        // Assuming read (OpenAI doesn't distinguish)
    "cacheWriteTokens": 0
  },
  "outputTokensDetails": {
    "regularTokens": 112,           // 624 - 512
    "reasoningTokens": 512
  }
}

Benefits

  1. Consistent totals: inputTokens and outputTokens always represent the actual total
  2. Complete breakdown: All constituent parts are represented
  3. Cost calculation: Can accurately calculate costs with different cache write/read rates
  4. Provider agnostic: Works regardless of how the provider reports tokens
  5. Backward compatible: Can deprecate cachedInputTokens in favor of inputTokensDetails.cacheReadTokens

Implementation Notes

  • The sum of all values in inputTokensDetails should equal inputTokens
  • The sum of all values in outputTokensDetails should equal outputTokens
  • Providers that don't support certain breakdowns can omit those fields
  • OpenAI doesn't distinguish cache reads from writes, so we may need to assume cached tokens are reads unless it's the first request with that cache key, but in general we have no way of knowing that...

Output:

{
  "model": "claude-sonnet-4-5-20250929",
  "id": "msg_01FUwdy8MzL3Nq6buDyhXADu",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "The text you've provided is **Lorem Ipsum** - placeholder text commonly used in design and publishing to demonstrate visual layout without meaningful content.\n\n**Main characteristics:**\n\n- **No actual themes or meaning** - Lorem Ipsum is pseudo-Latin filler text derived from Cicero's writings but scrambled to be nonsensical\n- **Purpose**: Used as a placeholder to show how text will look in a design before final content is available\n- **Structure**: Formatted in paragraphs to mimic real document layout\n\nSince this is dummy text, there are no actual themes, arguments, or ideas to summarize. If you have a real text you'd like me to analyze and summarize, I'd be happy to help with that instead!"
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 18,
    "cache_creation_input_tokens": 2284,
    "cache_read_input_tokens": 0,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 2284,
      "ephemeral_1h_input_tokens": 0
    },
    "output_tokens": 158,
    "service_tier": "standard"
  }
}

{
  "model": "claude-sonnet-4-5-20250929",
  "id": "msg_018dXs99JsACXSzqmQum2FDA",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "The text you've provided is **Lorem Ipsum** - placeholder text commonly used in design and publishing. It's essentially meaningless Latin-like filler text that has no actual themes or content to summarize.\n\nLorem Ipsum is used by designers and developers to:\n- Fill space in mockups and templates\n- Show how text will look in a layout\n- Avoid distracting readers with readable content during the design phase\n\nSince this is dummy text without real meaning, there are no themes, arguments, or ideas to extract. If you intended to share a different text for analysis, please feel free to paste the actual content you'd like me to summarize!"
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 18,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 2284,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 0,
      "ephemeral_1h_input_tokens": 0
    },
    "output_tokens": 141,
    "service_tier": "standard"
  }
}

{
  "id": "resp_0af01b0fdf503cae01690371a2909481a3adac438e5aee8d88",
  "object": "response",
  "created_at": 1761833378,
  "status": "completed",
  "background": false,
  "billing": {
    "payer": "developer"
  },
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "max_output_tokens": null,
  "max_tool_calls": null,
  "model": "gpt-5-2025-08-07",
  "output": [
    {
      "id": "rs_0af01b0fdf503cae01690371a2ee1081a39448467a2af02c0b",
      "type": "reasoning",
      "summary": []
    },
    {
      "id": "msg_0af01b0fdf503cae01690371b3018c81a3a7017183c9d9c01f",
      "type": "message",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "annotations": [],
          "logprobs": [],
          "text": "- The passage is standard “Lorem ipsum” placeholder text. It’s pseudo-Latin used to fill space and does not convey specific meaning or arguments.\n- There are no real themes or narrative. The recurring terms relate generically to structure, flow, spacing, and formatting—typical of boilerplate used to visualize layout.\n- If you intended a content summary, please provide the actual text you want summarized."
        }
      ],
      "role": "assistant"
    }
  ],
  "parallel_tool_calls": true,
  "previous_response_id": null,
  "prompt_cache_key": "42",
  "reasoning": {
    "effort": "medium",
    "summary": null
  },
  "safety_identifier": null,
  "service_tier": "default",
  "store": false,
  "temperature": 1.0,
  "text": {
    "format": {
      "type": "text"
    },
    "verbosity": "medium"
  },
  "tool_choice": "auto",
  "tools": [],
  "top_logprobs": 0,
  "top_p": 1.0,
  "truncation": "disabled",
  "usage": {
    "input_tokens": 1543,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens": 600,
    "output_tokens_details": {
      "reasoning_tokens": 512
    },
    "total_tokens": 2143
  },
  "user": null,
  "metadata": {}
}

{
  "id": "resp_003aae3a9336707201690371b52c2481978f0bebcac709994e",
  "object": "response",
  "created_at": 1761833397,
  "status": "completed",
  "background": false,
  "billing": {
    "payer": "developer"
  },
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "max_output_tokens": null,
  "max_tool_calls": null,
  "model": "gpt-5-2025-08-07",
  "output": [
    {
      "id": "rs_003aae3a9336707201690371b59fb08197bda808956c7495a8",
      "type": "reasoning",
      "summary": []
    },
    {
      "id": "msg_003aae3a9336707201690371c6c61881979d248dd6b3000ecf",
      "type": "message",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "annotations": [],
          "logprobs": [],
          "text": "- The passage is largely “Lorem ipsum”–style placeholder text, intended to show typography and layout rather than convey concrete information.\n- It mimics formal exposition with recurring generic ideas about structure, balance, motion/change, and organization (e.g., references to vestibulum, elementum, varius, congue).\n- Repeated motifs include transitions and flexibility, order and efficiency, and generalized mentions of positioning and spacing.\n- Overall, it has no coherent subject or argument; its primary “theme” is demonstrating paragraph flow and rhythm."
        }
      ],
      "role": "assistant"
    }
  ],
  "parallel_tool_calls": true,
  "previous_response_id": null,
  "prompt_cache_key": "42",
  "reasoning": {
    "effort": "medium",
    "summary": null
  },
  "safety_identifier": null,
  "service_tier": "default",
  "store": false,
  "temperature": 1.0,
  "text": {
    "format": {
      "type": "text"
    },
    "verbosity": "medium"
  },
  "tool_choice": "auto",
  "tools": [],
  "top_logprobs": 0,
  "top_p": 1.0,
  "truncation": "disabled",
  "usage": {
    "input_tokens": 1543,
    "input_tokens_details": {
      "cached_tokens": 1408
    },
    "output_tokens": 624,
    "output_tokens_details": {
      "reasoning_tokens": 512
    },
    "total_tokens": 2167
  },
  "user": null,
  "metadata": {}
}

Test script:

CONTENT=$(cat <<EOF
Proin metus nibh, luctus eleifend tempor id, lobortis et ex. Mauris nisl mi, euismod at ligula id, tristique efficitur velit. Suspendisse potenti. Mauris imperdiet erat sit amet mattis elementum. Nullam id dui sed mi convallis mattis. Sed pellentesque tincidunt ante nec vestibulum. Sed tristique risus nunc. Morbi ante quam, molestie a nibh nec, auctor laoreet nisl. Proin cursus erat eget sollicitudin hendrerit. Duis vestibulum neque leo, et pulvinar tellus placerat quis. Morbi sagittis lacus in justo porttitor, sit amet facilisis odio consequat. Sed vitae erat a urna pretium dictum et eget enim. Morbi porttitor ultrices dignissim. In in mollis metus. Etiam scelerisque est at risus ultrices, vel maximus dui pulvinar. Phasellus quis nisi vestibulum, tincidunt massa vel, egestas justo.

Lorem ipsum dolor sit amet, consectetur adipiscing elit. In faucibus ipsum erat. Nunc massa nulla, venenatis id tellus ac, tempus malesuada odio. Integer imperdiet sagittis ipsum, nec iaculis tortor condimentum vitae. Aliquam consectetur urna et enim porta, vitae tempor leo tempor. Nam porttitor blandit ante, lacinia commodo felis. Aliquam erat volutpat. Ut mattis cursus diam et interdum. Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Nullam fermentum, nibh id imperdiet scelerisque, massa elit convallis tortor, sit amet scelerisque lectus ligula non mauris. Vivamus id porta ante. Praesent fringilla ex nec risus vulputate, vitae tempor tortor pharetra. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris facilisis arcu risus, non blandit ex imperdiet nec. Quisque vitae nunc et elit egestas ornare. Quisque lobortis, mauris vitae dapibus euismod, odio turpis tempor dolor, a sagittis mauris leo eget mi.

Praesent orci augue, sagittis et est egestas, mattis posuere quam. Donec semper pulvinar vulputate. Nam ac rhoncus ligula, id molestie lacus. Curabitur quis purus sit amet massa efficitur pellentesque. Quisque ornare lectus molestie, tincidunt nisi non, blandit risus. Integer congue elementum vestibulum. Donec vulputate auctor risus, a rhoncus dui aliquam ut. Praesent ultrices gravida rutrum.

Mauris id pharetra diam, in finibus eros. Nullam dapibus urna accumsan leo suscipit euismod. Duis elementum nulla vitae risus accumsan, quis facilisis urna vehicula. Maecenas vel massa eu dui accumsan interdum. In iaculis et nibh vitae malesuada. Ut massa nibh, aliquam eu nunc at, pretium venenatis quam. Vivamus et facilisis nulla. Vestibulum sit amet orci ac justo mattis venenatis in sed sapien. Donec rutrum mi vitae diam accumsan, ut facilisis leo consectetur. Etiam dapibus, nulla quis egestas accumsan, sapien libero sodales sem, eget tempus lectus felis sed est. Maecenas at volutpat dolor, rutrum semper lorem. Nam vulputate nibh id orci aliquam, et eleifend ipsum pulvinar. Aliquam laoreet neque ultricies nisi rutrum bibendum. Vivamus aliquam ipsum at justo maximus, nec dictum purus vestibulum. Proin blandit efficitur dolor eget consequat.

Ut tristique, ipsum vel pulvinar sollicitudin, ligula tortor tincidunt ligula, vitae luctus velit eros semper sem. Ut aliquet varius orci vitae porta. Aenean feugiat elementum dui ut pharetra. Vivamus vehicula nec tellus vel tristique. Proin vitae pretium lectus, et rhoncus lacus. Pellentesque a mattis enim. Aenean lectus elit, suscipit id ultrices eu, pretium nec tortor. Etiam aliquet neque vitae massa sodales maximus. Mauris ut imperdiet velit, sed bibendum arcu. Etiam dui lectus, maximus sed mauris ac, lobortis blandit erat. Fusce condimentum, nibh ac eleifend interdum, est ipsum elementum nulla, nec cursus orci tortor ut enim. Maecenas pretium hendrerit interdum. Etiam mollis, eros quis hendrerit volutpat, ipsum lectus interdum nibh, vel ullamcorper nibh elit sed nisi.

Ut et odio magna. Nunc dictum libero erat, non placerat elit porta ut. Nulla convallis orci at aliquet tincidunt. In convallis massa interdum purus tristique tempor. Etiam malesuada fermentum ex condimentum porttitor. Nam ac aliquam sapien, sed elementum nisl. Pellentesque et lorem rhoncus, porttitor augue ac, condimentum nisl. Sed id porta risus. Suspendisse eu faucibus justo. In et risus ut arcu sollicitudin venenatis et sit amet quam. Sed mollis turpis a dapibus blandit. Curabitur lacinia leo vitae molestie efficitur. Vivamus non dui lectus. Aliquam in neque dapibus, ullamcorper nisi et, tempor enim. In interdum arcu nec tellus bibendum, eu maximus nisl viverra. Duis egestas tellus at quam vehicula pharetra.

Suspendisse fermentum orci in arcu elementum, eu finibus quam varius. Ut purus libero, vehicula quis neque egestas, porta ultricies ex. Integer cursus lacinia mauris, sed dictum magna consectetur sit amet. Proin in dui neque. Nulla facilisi. Nulla finibus, enim porta ornare porta, ligula nulla tristique elit, vel ullamcorper dui lacus vitae elit. Proin vitae ante posuere, posuere libero et, dignissim nisi. Mauris at orci eu libero commodo tempor. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean quis erat ut lectus dictum imperdiet. Donec nisi dui, venenatis commodo sapien sodales, vulputate consectetur dui. Proin commodo cursus tortor, at bibendum neque feugiat faucibus. Sed condimentum consequat ligula sit amet ornare. Sed in ex vel orci porttitor luctus. Etiam at consectetur velit, posuere blandit dolor.

Etiam non nibh nisi. Quisque tincidunt risus in est sagittis, quis cursus arcu ullamcorper. Nulla at mi auctor, rhoncus neque vel, accumsan felis. Proin tristique vehicula fringilla. Aliquam malesuada accumsan justo. Integer convallis, ligula non varius dictum, enim diam ornare justo, vitae placerat nisl est pretium est. Nulla facilisi. Mauris efficitur vel eros ac mattis. Suspendisse potenti. Nullam vel neque malesuada, sagittis ante ac, finibus quam. Proin sollicitudin eget libero vitae fringilla.

Integer eu maximus orci. Ut in lectus vel purus congue mattis. Fusce fringilla elit quis purus congue porttitor. Duis tempor, sapien in maximus mollis, purus quam fringilla libero, ultricies scelerisque orci augue nec neque. Vivamus ultrices eget velit sed mattis. Nullam lobortis, ex at convallis facilisis, metus urna tempus ex, sed pellentesque lacus ipsum non nulla. Cras erat dolor, lacinia vitae lorem id, pulvinar malesuada quam. Proin convallis tempor leo id varius. Sed pharetra, odio eu viverra tincidunt, lorem eros sodales tellus, consectetur tempus leo magna sit amet tellus. Morbi eu mauris lobortis, malesuada risus non, euismod nisi. Integer vel sollicitudin augue, sed hendrerit mauris. Quisque convallis purus eget odio auctor efficitur. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Nam ut sagittis justo, ac tempor enim.
EOF
)

for i in {1..2}; do
curl https://api.anthropic.com/v1/messages \
     --header "x-api-key: $ANTHROPIC_API_KEY" \
     --header "anthropic-version: 2023-06-01" \
     --header "content-type: application/json" \
     --data "{
    \"model\": \"claude-sonnet-4-5-20250929\",
    \"max_tokens\": 4096,
    \"messages\": [
        {
            \"role\": \"user\", 
            \"content\": [
                {
                    \"type\": \"text\",
                    \"text\": $(printf '%s' "$CONTENT" | jq -Rs .),
                    \"cache_control\": {\"type\": \"ephemeral\"}
                }
            ]
        },
        {
            \"role\": \"user\",
            \"content\": [
                {
                    \"type\": \"text\",
                    \"text\": \"$((RANDOM % 100000)) Please summarize the main themes in the text above.\"
                }
            ]
        }
    ]
}" | jq
done

for i in {1..2}; do
curl https://api.openai.com/v1/responses \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  --data "{
  \"model\": \"gpt-5-2025-08-07\",
  \"prompt_cache_key\": \"42\",
  \"input\": [
    {
      \"role\": \"user\",
      \"content\": $(printf '%s' "$CONTENT" | jq -Rs .)
    },
    {
      \"role\": \"user\",
      \"content\": \"$((RANDOM % 100000)) Please summarize the main themes in the text above.\"
    }
  ]
}" | jq
done

AI SDK Version

Latest everything

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Metadata

Assignees

Labels

ai/corecore functions like generateText, streamText, etc. Provider utils, and provider spec.ai/providerrelated to a provider package. Must be assigned together with at least one `provider/*` labelprovider/anthropicIssues related to the @ai-sdk/anthropic providerprovider/openaiIssues related to the @ai-sdk/openai providersupport

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions