@tkattkat tkattkat commented Nov 14, 2025

why

Currently, we do not expose reasoning tokens or cached input tokens in our usage metrics.

what changed

Reasoning tokens and cached input tokens are now exposed through Stagehand metrics.

test plan

tested locally
tested on api

changeset-bot bot commented Nov 14, 2025

🦋 Changeset detected

Latest commit: e91d7c6

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages:

| Name | Type |
| --- | --- |
| @browserbasehq/stagehand | Patch |
| @browserbasehq/stagehand-evals | Patch |

greptile-apps bot commented Nov 14, 2025

Greptile Overview

Greptile Summary

This PR adds comprehensive support for tracking reasoning tokens across all Stagehand operations (act, extract, observe, agent). Reasoning tokens are now exposed through the StagehandMetrics interface and propagated from LLM responses through handlers to the metrics tracking system.

Key Changes:

  • Added reasoning_tokens field (optional) to LLMUsage interface and all operation metrics
  • Updated all handlers to extract and report reasoning tokens through the metrics callback chain
  • Modified AI SDK client implementations to map reasoningTokens from AI SDK to internal reasoning_tokens format
  • Updated documentation with new field definitions and example outputs
  • Initialized reasoning token counters to 0 in cache and metrics objects

Implementation approach:

  • Made reasoning_tokens optional throughout to maintain backward compatibility with models that don't support reasoning tokens
  • Used fallback to 0 where reasoning tokens might be undefined
  • Maintained consistent naming: reasoning_tokens in internal APIs, reasoningTokens when mapping from AI SDK
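
The mapping and fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `AiSdkUsage` shape and the `toLLMUsage` helper are hypothetical names, and only the field names (`reasoningTokens`, `reasoning_tokens`, `prompt_tokens`, `completion_tokens`) are taken from the summary.

```typescript
// Hypothetical shapes for illustration only. Field names follow the PR
// summary, but these are not the actual Stagehand or AI SDK type definitions.
interface AiSdkUsage {
  inputTokens?: number;
  outputTokens?: number;
  reasoningTokens?: number; // may be absent for models without reasoning
}

interface LLMUsage {
  prompt_tokens: number;
  completion_tokens: number;
  reasoning_tokens?: number; // optional, so existing callers keep compiling
}

// Hypothetical helper: map the camelCase AI SDK usage to the snake_case
// internal format, falling back to 0 when a count is undefined.
function toLLMUsage(usage: AiSdkUsage): LLMUsage {
  return {
    prompt_tokens: usage.inputTokens ?? 0,
    completion_tokens: usage.outputTokens ?? 0,
    reasoning_tokens: usage.reasoningTokens ?? 0,
  };
}

// A model that reports reasoning tokens, and one that does not.
const withReasoning = toLLMUsage({
  inputTokens: 120,
  outputTokens: 40,
  reasoningTokens: 16,
});
const withoutReasoning = toLLMUsage({ inputTokens: 120, outputTokens: 40 });
```

Using `??` rather than `||` keeps the intent explicit: only a missing count is replaced, and the sum stays numeric when the value is added to a running total.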

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The changes are purely additive, introducing new optional fields without modifying existing behavior. All reasoning token fields default to 0 when undefined, ensuring backward compatibility. The implementation is consistent across all operation types and properly propagated through the entire metrics pipeline. Documentation is comprehensive and accurately reflects the code changes.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| packages/core/lib/v3/types/public/metrics.ts | 5/5 | Added reasoningTokens fields to all operation categories (act, extract, observe, agent) and totals |
| packages/core/lib/inference.ts | 5/5 | Added optional reasoning_tokens to LLMUsage interface and propagated through extract, observe, and act functions |
| packages/core/lib/v3/v3.ts | 5/5 | Updated stagehandMetrics initialization and updateMetrics/updateTotalMetrics methods to include reasoning tokens |
| packages/core/lib/v3/handlers/actHandler.ts | 5/5 | Updated callback signature and all metric reporting calls to include reasoning tokens |
| packages/core/lib/v3/handlers/extractHandler.ts | 5/5 | Added reasoning_tokens with default value 0 to extraction response destructuring and metric callbacks |
| packages/core/lib/v3/handlers/observeHandler.ts | 5/5 | Added reasoning_tokens extraction and propagation through metric callbacks |
| packages/core/lib/v3/handlers/v3AgentHandler.ts | 5/5 | Added reasoningTokens from AI SDK result usage to both metrics tracking and result object |
| packages/core/lib/v3/llm/aisdk.ts | 5/5 | Added reasoning_tokens mapping from AI SDK's reasoningTokens in both object and text generation responses |
| packages/core/lib/v3/api.ts | 5/5 | Added reasoningTokens to replay metrics response interface and metrics aggregation logic |

Sequence Diagram

sequenceDiagram
    participant User
    participant V3
    participant Handler as Act/Extract/Observe Handler
    participant LLM as LLMClient (AISdkClient)
    participant Model as AI Model
    
    User->>V3: call act/extract/observe/agent
    V3->>Handler: execute operation
    Handler->>LLM: createChatCompletion()
    LLM->>Model: API request
    Model-->>LLM: response with usage (inputTokens, outputTokens, reasoningTokens)
    LLM-->>Handler: {data, usage: {prompt_tokens, completion_tokens, reasoning_tokens}}
    Handler->>Handler: extract reasoning_tokens from usage
    Handler->>V3: onMetrics(functionName, promptTokens, completionTokens, reasoningTokens, inferenceTimeMs)
    V3->>V3: updateMetrics() - add to operation-specific counters
    V3->>V3: updateTotalMetrics() - add to total counters
    V3-->>User: return result
    User->>V3: stagehand.stagehandMetrics
    V3-->>User: metrics object with reasoning tokens for all operations
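
The accumulation step in the diagram (a handler calls `onMetrics`, then operation-specific and total counters are updated) could look something like the sketch below. It is written under stated assumptions: `MetricsTracker`, `TokenCounters`, and `zeroCounters` are hypothetical names, and the real `updateMetrics`/`updateTotalMetrics` methods in `v3.ts` may be structured differently.

```typescript
// Hypothetical, simplified stand-in for the metrics bookkeeping in the
// diagram. It is not the actual StagehandMetrics interface or V3 class.
type Operation = "act" | "extract" | "observe" | "agent";

interface TokenCounters {
  promptTokens: number;
  completionTokens: number;
  reasoningTokens: number;
  inferenceTimeMs: number;
}

// All counters, including reasoning tokens, start at 0.
const zeroCounters = (): TokenCounters => ({
  promptTokens: 0,
  completionTokens: 0,
  reasoningTokens: 0,
  inferenceTimeMs: 0,
});

class MetricsTracker {
  readonly perOperation: Record<Operation, TokenCounters> = {
    act: zeroCounters(),
    extract: zeroCounters(),
    observe: zeroCounters(),
    agent: zeroCounters(),
  };
  readonly total: TokenCounters = zeroCounters();

  // Argument order mirrors the onMetrics call in the diagram.
  onMetrics(
    operation: Operation,
    promptTokens: number,
    completionTokens: number,
    reasoningTokens: number | undefined,
    inferenceTimeMs: number,
  ): void {
    const sample: TokenCounters = {
      promptTokens,
      completionTokens,
      reasoningTokens: reasoningTokens ?? 0, // models without reasoning report nothing
      inferenceTimeMs,
    };
    MetricsTracker.add(this.perOperation[operation], sample); // operation-specific counters
    MetricsTracker.add(this.total, sample); // total counters
  }

  private static add(target: TokenCounters, sample: TokenCounters): void {
    target.promptTokens += sample.promptTokens;
    target.completionTokens += sample.completionTokens;
    target.reasoningTokens += sample.reasoningTokens;
    target.inferenceTimeMs += sample.inferenceTimeMs;
  }
}

// One call that reports reasoning tokens and one that does not.
const tracker = new MetricsTracker();
tracker.onMetrics("act", 100, 20, 7, 300);
tracker.onMetrics("extract", 50, 10, undefined, 120);
```

Because every counter starts at 0 and a missing reasoning count is treated as 0, the totals stay well defined when only some calls report reasoning tokens.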

@greptile-apps greptile-apps bot left a comment

19 files reviewed, no comments

@tkattkat tkattkat merged commit c76ade0 into main Nov 14, 2025
15 checks passed
seanmcguire12 pushed a commit that referenced this pull request Nov 16, 2025
This PR was opened by the [Changesets
release](https://github.com/changesets/action) GitHub action. When
you're ready to do a release, you can merge this and the packages will
be published to npm automatically. If you're not ready to do a release
yet, that's fine, whenever you add more changesets to main, this PR will
be updated.


# Releases
## @browserbasehq/[email protected]

### Patch Changes

- [#1273](#1273)
[`ab51232`](ab51232)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix:
trigger shadow root rerender in OOPIFs by cloning & replacing instead of
reloading

- [#1268](#1268)
[`c76ade0`](c76ade0)
Thanks [@tkattkat](https://github.com/tkattkat)! - Expose reasoning, and
cached input tokens in stagehand metrics

- [#1267](#1267)
[`ffb5e5d`](ffb5e5d)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: file
uploads failing on Browserbase

- [#1269](#1269)
[`772e735`](772e735)
Thanks [@tkattkat](https://github.com/tkattkat)! - Add example using
playwright screen recording

## @browserbasehq/[email protected]

### Patch Changes

- Updated dependencies
\[[`ab51232`](ab51232),
[`c76ade0`](c76ade0),
[`ffb5e5d`](ffb5e5d),
[`772e735`](772e735)]:
    -   @browserbasehq/[email protected]

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
michaelfp930-WB added a commit to michaelfp930-WB/stagehand that referenced this pull request Jan 12, 2026