@miguelg719 (Collaborator) commented Nov 24, 2025

why

Adds support for Microsoft's new Fara-7B model.

what changed

test plan

changeset-bot bot commented Nov 24, 2025

🦋 Changeset detected

Latest commit: de165e1

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages

| Name | Type |
| --- | --- |
| @browserbasehq/stagehand | Patch |
| @browserbasehq/stagehand-evals | Patch |

@miguelg719 miguelg719 marked this pull request as ready for review November 24, 2025 17:51
greptile-apps bot (Contributor) commented Nov 24, 2025

Greptile Overview

Greptile Summary

Added support for Microsoft's Fara-7B model through a new MicrosoftCUAClient that talks to an OpenAI-compatible API and uses XML-based tool calling. The implementation includes viewport resizing logic, dual conversation history tracking, and comprehensive action mapping.
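
To make the XML-based tool calling concrete, here is a minimal sketch of how a response containing a `<tool_call>` block might be split into free-form thoughts and a structured call. It assumes the payload between the tags is a JSON object with `name` and `arguments` fields; `ParsedToolCall` and `parseToolCall` are hypothetical names for illustration, not the actual `MicrosoftCUAClient` code.

```typescript
// Hypothetical result shape; the real client types may differ.
interface ParsedToolCall {
  thoughts: string;
  name: string;
  arguments: Record<string, unknown>;
}

// Assumption: the model emits reasoning text followed by a single
// <tool_call>{"name": ..., "arguments": {...}}</tool_call> block.
function parseToolCall(response: string): ParsedToolCall | null {
  const match = /<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/.exec(response);
  if (!match) return null;

  let payload: unknown;
  try {
    payload = JSON.parse(match[1] ?? "");
  } catch {
    return null; // malformed JSON between the tags
  }
  if (typeof payload !== "object" || payload === null) return null;

  const { name, arguments: args } = payload as {
    name?: unknown;
    arguments?: unknown;
  };
  if (typeof name !== "string") return null;

  const isPlainObject =
    typeof args === "object" && args !== null && !Array.isArray(args);

  return {
    // Everything before the tag is treated as the model's thoughts.
    thoughts: response.slice(0, match.index).trim(),
    name,
    arguments: isPlainObject ? (args as Record<string, unknown>) : {},
  };
}
```

Returning `null` rather than throwing lets the caller decide whether to retry the step or surface the raw response.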

Key Changes:

  • Added MicrosoftCUAClient implementing FARA's XML-based tool calling format
  • Extended modelUtils and AgentProvider to support explicit provider override
  • Added microsoft/fara-7b to available CUA models
  • Included example demonstrating usage with Azure/Fireworks deployments
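
The explicit provider override can be pictured roughly as follows. This is a sketch under assumptions: the `Provider` union, the `MODEL_TO_PROVIDER` table, and `resolveProvider` are hypothetical names, and only the `fara-7b` to `microsoft` mapping is taken from this PR.

```typescript
type Provider = "openai" | "anthropic" | "google" | "microsoft";

// Hypothetical lookup table; only the fara-7b entry comes from the PR summary.
const MODEL_TO_PROVIDER: Record<string, Provider> = {
  "fara-7b": "microsoft",
};

// An explicit provider wins; otherwise fall back to the model-name mapping.
function resolveProvider(
  modelName: string,
  explicitProvider?: Provider
): Provider {
  if (explicitProvider) return explicitProvider;

  // Accept "provider/model" names such as "microsoft/fara-7b".
  const slash = modelName.indexOf("/");
  const bare = slash === -1 ? modelName : modelName.slice(slash + 1);

  const provider = MODEL_TO_PROVIDER[bare.toLowerCase()];
  if (!provider) {
    throw new Error(`Unknown CUA model: ${modelName}`);
  }
  return provider;
}
```

With an override in place, a custom deployment whose model name is not in the table can still be routed, for example `resolveProvider("my-azure-deployment", "microsoft")`.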

Issues Found:

  • wait action uses duration field but handler expects timeMs - will cause wait actions to fail
  • Missing handler for pause_and_memorize_fact action - will fall through to default case and fail
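
One possible shape for fixing both issues is sketched below. It is illustrative only: the `AgentAction` type, `normalizeWait`, the `fact` field, and the in-memory `memorizedFacts` list are assumptions, and the sketch assumes `duration` is expressed in seconds.

```typescript
// Hypothetical action shape. Only `duration` and `timeMs` appear in the review.
interface AgentAction {
  type: string;
  [key: string]: unknown;
}

// Convert a model-side wait action into the field the handler reads.
// Assumption: `duration` is in seconds; adjust if it is already milliseconds.
function normalizeWait(action: AgentAction): AgentAction {
  if (action.type !== "wait" || typeof action["timeMs"] === "number") {
    return action;
  }
  const duration = action["duration"];
  const seconds = typeof duration === "number" ? duration : 1;
  return { type: "wait", timeMs: Math.round(seconds * 1000) };
}

const memorizedFacts: string[] = [];

async function executeAction(raw: AgentAction): Promise<string> {
  const action = normalizeWait(raw);
  switch (action.type) {
    case "wait": {
      const timeMs = action["timeMs"] as number;
      await new Promise<void>((resolve) => setTimeout(resolve, timeMs));
      return "waited";
    }
    case "pause_and_memorize_fact": {
      // No browser interaction is needed; record the fact and continue.
      const fact = action["fact"];
      if (typeof fact === "string") memorizedFacts.push(fact);
      return "memorized";
    }
    default:
      throw new Error(`Unsupported action type: ${action.type}`);
  }
}
```

Normalizing at the boundary keeps the handler reading a single field name, whichever side ends up being changed in the actual fix.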

Confidence Score: 3/5

  • This PR has critical bugs that will cause runtime failures for wait and pause_and_memorize_fact actions
  • Two logic errors will cause the model's actions to fail: (1) wait action field name mismatch between client and handler, (2) missing handler case for pause_and_memorize_fact. These are straightforward fixes but will break functionality until resolved.
  • Pay close attention to packages/core/lib/v3/agent/MicrosoftCUAClient.ts:483-490 and packages/core/lib/v3/handlers/v3CuaAgentHandler.ts:414-423 - both contain critical bugs

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| packages/core/lib/modelUtils.ts | 5/5 | Enhanced to support explicit provider override for custom deployments |
| packages/core/lib/v3/agent/AgentProvider.ts | 5/5 | Added Microsoft provider routing and fara-7b model mapping |
| packages/core/lib/v3/agent/MicrosoftCUAClient.ts | 3/5 | New client implementation with XML-based tool calling; wait action field name issue found |
| packages/core/lib/v3/handlers/v3CuaAgentHandler.ts | 3/5 | Missing handler for pause_and_memorize_fact action will cause errors |

Sequence Diagram

sequenceDiagram
    participant User
    participant Stagehand
    participant AgentProvider
    participant MicrosoftCUAClient
    participant OpenAI API
    participant V3CuaAgentHandler
    participant Browser

    User->>Stagehand: agent({cua: true, model: "microsoft/fara-7b"})
    Stagehand->>AgentProvider: getClient(modelName, clientOptions)
    AgentProvider->>AgentProvider: Check explicit provider or map "fara-7b" to "microsoft"
    AgentProvider->>MicrosoftCUAClient: new MicrosoftCUAClient(type, modelName, instructions, options)
    MicrosoftCUAClient->>MicrosoftCUAClient: Initialize OpenAI client with baseURL/apiKey
    MicrosoftCUAClient->>MicrosoftCUAClient: Calculate resized viewport using smart_resize
    MicrosoftCUAClient-->>Stagehand: Return agent instance

    User->>Stagehand: agent.execute({instruction, maxSteps})
    Stagehand->>V3CuaAgentHandler: execute(options)
    
    loop For each step (up to maxSteps)
        V3CuaAgentHandler->>MicrosoftCUAClient: executeStep(logger, isFirstRound)
        MicrosoftCUAClient->>Browser: captureScreenshot()
        Browser-->>MicrosoftCUAClient: base64 screenshot
        MicrosoftCUAClient->>MicrosoftCUAClient: Add screenshot to conversation history
        MicrosoftCUAClient->>MicrosoftCUAClient: reconstructHistory() + generateSystemPrompt()
        MicrosoftCUAClient->>OpenAI API: chat.completions.create(messages)
        OpenAI API-->>MicrosoftCUAClient: XML response with <tool_call>
        MicrosoftCUAClient->>MicrosoftCUAClient: parseThoughtsAndAction(response)
        MicrosoftCUAClient->>MicrosoftCUAClient: convertFunctionCallToAction(functionCall)
        MicrosoftCUAClient->>MicrosoftCUAClient: Transform coordinates (resized→original viewport)
        
        alt action is "type" with coordinates
            MicrosoftCUAClient->>MicrosoftCUAClient: Expand to [click, type, press_enter]
        else action is "terminate"
            MicrosoftCUAClient->>MicrosoftCUAClient: Mark as completed
        else other actions
            MicrosoftCUAClient->>MicrosoftCUAClient: Pass through as-is
        end
        
        MicrosoftCUAClient->>V3CuaAgentHandler: Return actions & completion status
        
        loop For each action
            V3CuaAgentHandler->>V3CuaAgentHandler: executeAction(action)
            V3CuaAgentHandler->>Browser: Perform action (click, type, scroll, etc.)
            Browser-->>V3CuaAgentHandler: Action completed
        end
        
        alt completed or maxSteps reached
            V3CuaAgentHandler-->>User: Return AgentResult
        end
    end
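
Two steps in the diagram, the coordinate transform from the resized screenshot back to the original viewport and the expansion of a coordinate-bearing "type" action into click, type, and Enter, can be sketched together. The `Size` and `Action` types and both function names are hypothetical, and simple proportional scaling is assumed; the actual client may round or clamp differently.

```typescript
interface Size {
  width: number;
  height: number;
}

// Hypothetical action union for illustration only.
type Action =
  | { type: "click"; x: number; y: number }
  | { type: "type"; text: string }
  | { type: "keypress"; keys: string[] };

// Map a point from the resized screenshot the model saw to the real viewport.
function toOriginal(
  x: number,
  y: number,
  resized: Size,
  original: Size
): { x: number; y: number } {
  return {
    x: Math.round((x * original.width) / resized.width),
    y: Math.round((y * original.height) / resized.height),
  };
}

// Expand a "type" call that carries coordinates into click -> type -> Enter.
// Without coordinates, the text is typed into whatever is already focused.
function expandType(
  text: string,
  coordinate: [number, number] | undefined,
  resized: Size,
  original: Size
): Action[] {
  if (!coordinate) return [{ type: "type", text }];
  const { x, y } = toOriginal(coordinate[0], coordinate[1], resized, original);
  return [
    { type: "click", x, y },
    { type: "type", text },
    { type: "keypress", keys: ["Enter"] },
  ];
}
```

Scaling before expansion means every downstream action is already in real viewport coordinates, so the handler does not need to know that the model saw a resized image.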

@greptile-apps greptile-apps bot left a comment

Additional Comments (1)

  1. packages/core/lib/v3/handlers/v3CuaAgentHandler.ts, lines 414-423

    logic: missing handler for pause_and_memorize_fact action - will fall through to default and fail

8 files reviewed, 2 comments

seanmcguire12 and others added 2 commits November 24, 2025 10:01
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@seanmcguire12 seanmcguire12 merged commit d5e119b into main Nov 24, 2025
14 of 15 checks passed
miguelg719 pushed a commit that referenced this pull request Dec 13, 2025
This PR was opened by the [Changesets
release](https://github.com/changesets/action) GitHub action. When
you're ready to do a release, you can merge this and the packages will
be published to npm automatically. If you're not ready to do a release
yet, that's fine, whenever you add more changesets to main, this PR will
be updated.


# Releases
## @browserbasehq/[email protected]

### Patch Changes

- [#1388](#1388)
[`605ed6b`](605ed6b)
Thanks [@miguelg719](https://github.com/miguelg719)! - Fix multiple
click event dispatches on CDP and Anthropic CUA handling (double clicks)

- [#1400](#1400)
[`34e7e5b`](34e7e5b)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - don't write
base64 encoded screenshots to disk when caching agent actions

- [#1345](#1345)
[`943d2d7`](943d2d7)
Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for
aborting / stopping an agent run & continuing an agent run using
messages from prior runs

- [#1334](#1334)
[`0e95cd2`](0e95cd2)
Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for
google vertex provider

- [#1410](#1410)
[`d4237e4`](d4237e4)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix:
include extract in stagehand.history()

- [#1315](#1315)
[`86975e7`](86975e7)
Thanks [@tkattkat](https://github.com/tkattkat)! - Add streaming support
to agent through stream:true in the agent config

- [#1304](#1304)
[`d5e119b`](d5e119b)
Thanks [@miguelg719](https://github.com/miguelg719)! - Add support for
Microsoft's Fara-7B

- [#1346](#1346)
[`4e051b2`](4e051b2)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: don't
attach to targets twice

- [#1327](#1327)
[`6b5a3c9`](6b5a3c9)
Thanks [@miguelg719](https://github.com/miguelg719)! - Informed error
parsing from api

- [#1335](#1335)
[`bb85ad9`](bb85ad9)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - add support
for page.addInitScript()

- [#1331](#1331)
[`88d28cc`](88d28cc)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix:
page.evaluate() now works with scripts injected via
context.addInitScript()

- [#1316](#1316)
[`45bcef0`](45bcef0)
Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for
callbacks in stagehand agent

- [#1374](#1374)
[`6aa9d45`](6aa9d45)
Thanks [@miguelg719](https://github.com/miguelg719)! - Fix key action
mapping in Anthropic CUA

- [#1330](#1330)
[`d382084`](d382084)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: make
act, extract, and observe respect user defined timeout param

- [#1336](#1336)
[`1df08cc`](1df08cc)
Thanks [@tkattkat](https://github.com/tkattkat)! - Patch agent on api

- [#1358](#1358)
[`2b56600`](2b56600)
Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for 4.5
opus in cua agent

## @browserbasehq/[email protected]

### Patch Changes

- [#1364](#1364)
[`ca0630e`](ca0630e)
Thanks [@tkattkat](https://github.com/tkattkat)! - Update model handling
in agent evals cli

- Updated dependencies
\[[`605ed6b`](605ed6b),
[`34e7e5b`](34e7e5b),
[`943d2d7`](943d2d7),
[`0e95cd2`](0e95cd2),
[`d4237e4`](d4237e4),
[`86975e7`](86975e7),
[`d5e119b`](d5e119b),
[`4e051b2`](4e051b2),
[`6b5a3c9`](6b5a3c9),
[`bb85ad9`](bb85ad9),
[`88d28cc`](88d28cc),
[`45bcef0`](45bcef0),
[`6aa9d45`](6aa9d45),
[`d382084`](d382084),
[`1df08cc`](1df08cc),
[`2b56600`](2b56600)]:
    -   @browserbasehq/[email protected]

## @browserbasehq/[email protected]

### Patch Changes

- Updated dependencies
\[[`605ed6b`](605ed6b),
[`34e7e5b`](34e7e5b),
[`943d2d7`](943d2d7),
[`0e95cd2`](0e95cd2),
[`d4237e4`](d4237e4),
[`86975e7`](86975e7),
[`d5e119b`](d5e119b),
[`4e051b2`](4e051b2),
[`6b5a3c9`](6b5a3c9),
[`bb85ad9`](bb85ad9),
[`88d28cc`](88d28cc),
[`45bcef0`](45bcef0),
[`6aa9d45`](6aa9d45),
[`d382084`](d382084),
[`1df08cc`](1df08cc),
[`2b56600`](2b56600)]:
    -   @browserbasehq/[email protected]

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
michaelfp930-WB added a commit to michaelfp930-WB/stagehand that referenced this pull request Jan 12, 2026