Releases: browserwing/browserwing

v1.0.1-beta.2

06 Mar 02:21

v1.0.1-beta.2 Pre-release

New Features

  • LLM BaseURL Support: LLM client now supports custom BaseURL for integrating with OpenAI-compatible services
  • Version Endpoint: Added /version endpoint to check current version info
  • NoSandbox Configuration: Browser supports NoSandbox mode for containerized deployment
  • Immediate Task Execution: Scheduler supports immediate task execution with result saving
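
A quick way to confirm which build is running is to query the new endpoint. The sketch below is only an illustration: it assumes the server listens on http://localhost:8080 (the address used in the examples elsewhere in these notes) and that /version is served at the root path. The response format is not described here, so the body is returned as raw text. `version_url` and `fetch_version` are hypothetical helper names.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: same host and port as the other examples in these notes.
DEFAULT_BASE_URL = "http://localhost:8080"


def version_url(base_url: str = DEFAULT_BASE_URL) -> str:
    """Build the URL of the /version endpoint for a given server address."""
    # Normalise the base so a trailing slash does not change the result.
    return urljoin(base_url.rstrip("/") + "/", "version")


def fetch_version(base_url: str = DEFAULT_BASE_URL, timeout: float = 5.0) -> str:
    """Return the raw response body of /version (no format is assumed)."""
    with urlopen(version_url(base_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With a server running, `print(fetch_version())` should show whatever version info the endpoint returns.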

Improvements

  • Panic Recovery: Added panic recovery for iframe script injection and navigation operations

Installation

npm install -g browserwing@beta

v1.0.1-beta.1

03 Mar 01:16

v1.0.1-beta.1 Pre-release

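New Features

  • AI Control Mode: Added AI-driven browser control with support for temporary sessions and an adapter interface
  • Multi-Browser Instance Management: Create and manage multiple independent browser instances
  • Scheduled Task System: Scheduled script execution and task management
  • XHR/Fetch Capture: Captures network requests during recording and playback
  • Cookie Management: Added cookie viewing, single deletion, and batch deletion
  • Internationalization (i18n): Switch the interface between Chinese and English
  • MCP Service Management: CRUD and tool discovery for external MCP services
  • Script Variable System: Script parameterization and external variable overrides
  • Conditional Execution: Conditional execution based on variables
  • Keyboard Actions: Recording and playback of keyboard input
  • Scroll Actions: Recording and playback of page scrolling
  • Screenshot Actions: Viewport, full-page, and region screenshots
  • AI Explorer: AI-driven browser exploration and script generation
  • Custom AI Prompts: Customizable prompt system for AI operations
  • RefID System: Semantic element selection for more stable automation
  • Proxy Support: Browser configuration supports HTTP/SOCKS proxies
  • Authentication System: JWT and API Key authentication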
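
To make the script variable and conditional execution items more concrete, here is a rough sketch of how parameterization with external overrides could behave. It is not BrowserWing's implementation: the `${name}` placeholder syntax and the helpers `resolve_variables`, `render_step`, and `should_run` are assumptions chosen for illustration.

```python
from string import Template
from typing import Optional


def resolve_variables(defaults: dict, overrides: Optional[dict] = None) -> dict:
    """Merge script defaults with externally supplied values; overrides win."""
    return {**defaults, **(overrides or {})}


def render_step(template: str, variables: dict) -> str:
    """Fill ${name} placeholders (assumed syntax) in a step parameter."""
    return Template(template).substitute(variables)


def should_run(condition_var: str, expected: str, variables: dict) -> bool:
    """Variable-based condition: run a step only if the variable matches."""
    return str(variables.get(condition_var)) == expected


# Defaults stored with the script, then overridden from outside at run time.
script_defaults = {"query": "browser automation", "env": "staging"}
variables = resolve_variables(script_defaults, {"env": "production"})
url = render_step("https://example.com/search?q=${query}", variables)
```

Here the external value replaces `env` while `query` keeps its default, and a step guarded by `should_run("env", "production", variables)` would execute.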

See CHANGELOG.md for details.

Release v1.0.0

25 Jan 15:07

BrowserWing 1.0.0 Release Notes 🎉

Release Date: 2026-01-25
Version: 1.0.0
License: MIT

🌟 Overview

BrowserWing 1.0.0 is the first official release, delivering a complete browser automation platform with deep AI integration. It is aimed at developers, QA engineers, data analysts, and AI application developers, and is designed to make browser automation simple, intelligent, and efficient.

✨ Core Features

1. 🤖 Built-in AI Agent

Conversational Browser Control

  • Multi-LLM Support: Compatible with OpenAI, Claude, DeepSeek, Gemini, and more
  • Natural Language Control: Describe tasks in plain language and the AI automates the browser operations
  • Intelligent Task Planning: Automatically evaluates task complexity and selects optimal execution strategy
  • Real-time Streaming: See execution progress as it happens
  • Session Management: Supports multiple parallel sessions, each with independent model configuration
  • Performance Optimized:
    • Startup time reduced by 89% (4.5s → 0.5s)
    • Memory usage reduced by 97% (800MB → 24MB)
    • Simple query response time improved by 56% (4.5s → 2s)
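
The percentages above follow from the before and after figures. The short check below recomputes them as a relative reduction, rounded to the nearest whole percent; `percent_reduction` is a helper name introduced only for this illustration.

```python
def percent_reduction(before: float, after: float) -> int:
    """Percentage by which `after` is smaller than `before`, rounded."""
    return round(100 * (before - after) / before)


startup = percent_reduction(4.5, 0.5)       # seconds
memory = percent_reduction(800, 24)         # MB
simple_query = percent_reduction(4.5, 2.0)  # seconds
```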

Example Tasks:

"Open GitHub, search for 'browser automation', extract names and star counts of top 10 projects"
"Log in to Twitter, post a tweet: 'Hello from BrowserWing!'"
"Monitor product price on Amazon, notify me when price drops below $50"

2. 🔌 Universal AI Tool Integration

Three Integration Methods for All AI Tools

Method 1: MCP Server (Recommended)

  • Standard Protocol: Full Model Context Protocol (MCP) implementation
  • Zero Configuration: One-line JSON config to integrate with any MCP-compatible AI tool
  • Rich Tool Set: 26+ browser control tools including navigation, interaction, extraction, screenshots
{
  "mcpServers": {
    "browserwing": {
      "url": "http://localhost:8080/api/v1/mcp/message"
    }
  }
}
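
For clients that do not read an `mcpServers` config, the same endpoint can in principle be addressed directly. MCP messages are JSON-RPC 2.0, and `tools/list` and `tools/call` are standard MCP methods. The sketch below only builds the requests. Whether the server expects an `initialize` handshake or extra headers first is not stated in these notes, and the tool name `browser_navigate` is a made-up placeholder, not a confirmed BrowserWing tool.

```python
import json
from itertools import count
from typing import Optional
from urllib.request import Request

MCP_URL = "http://localhost:8080/api/v1/mcp/message"  # URL from the config above
_ids = count(1)  # JSON-RPC request ids: 1, 2, 3, ...


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request object for an MCP method."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def as_http_request(message: dict, url: str = MCP_URL) -> Request:
    """Wrap a JSON-RPC message in an HTTP POST (built here, not sent)."""
    body = json.dumps(message).encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"}, method="POST")


list_tools = jsonrpc_request("tools/list")
call_tool = jsonrpc_request(
    "tools/call",
    # Hypothetical tool name and arguments, for illustration only.
    {"name": "browser_navigate", "arguments": {"url": "https://example.com"}},
)
```

Passing `as_http_request(list_tools)` to `urllib.request.urlopen` would send the request to a running server.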

Method 2: Skills File

  • Plug and Play: Download SKILL.md, import into Cursor, Windsurf, and other Skills-compatible tools
  • Auto Discovery: AI tools automatically recognize available browser control capabilities
  • Custom Export: Export recorded scripts as custom Skills files

Method 3: HTTP API

  • 26+ RESTful Endpoints: Complete programmatic browser control
  • OpenAPI Documentation: Standardized API specs for easy integration
  • Batch Operations: Multi-step atomic execution support

3. 🎬 Visual Script Recording & Playback

WYSIWYG Automation Workflows

Recording Features

  • One-Click Recording: Automatically captures clicks, inputs, selections, navigation
  • Semantic Recording: Stable element location based on ARIA roles and accessibility tree
  • Smart Waits: Auto-detects wait conditions such as page loading and element appearance
  • Cross-Tab Support: Records operations across multiple tabs

Editing Features

  • Visual Editor: Drag to reorder, delete, modify recorded steps
  • Parameter Adjustment: Modify URLs, selectors, input text
  • Logic Addition: Insert waits, conditionals, loops
  • Variable System: Support script variables and extraction variables

Playback Features

  • Precise Reproduction: High-fidelity replay of recorded sequences
  • Step Debugging: Execute step-by-step, observe each operation's effect
  • Error Recovery: Auto-retry or skip on failures
  • Batch Execution: Sequential or parallel execution of multiple scripts
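
The "auto-retry or skip" behaviour can be pictured as a small policy wrapped around each step. The sketch below is a generic illustration of that idea, not BrowserWing's playback engine; `run_with_recovery`, its parameters, and the two sample steps are hypothetical.

```python
from typing import Callable


def run_with_recovery(step: Callable[[], None], retries: int = 2,
                      skip_on_failure: bool = True) -> str:
    """Run a step, retrying on failure; afterwards skip it or re-raise."""
    last_error: Exception = RuntimeError("step was never attempted")
    for _ in range(max(retries, 0) + 1):  # first attempt plus retries
        try:
            step()
            return "ok"
        except Exception as exc:
            last_error = exc
    if skip_on_failure:
        return "skipped"
    raise last_error


attempts = {"n": 0}


def flaky_click() -> None:
    """Fails twice, then succeeds (simulates an element that loads late)."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("element not ready")


def always_fails() -> None:
    """Simulates a step whose selector never matches."""
    raise RuntimeError("selector not found")


results = [run_with_recovery(flaky_click), run_with_recovery(always_fails)]
```

With the defaults, the flaky step succeeds on its third attempt and the failing step is skipped; passing `skip_on_failure=False` surfaces the last error instead.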

Export Features

  • Export as MCP Commands: Convert to MCP tool call sequences for AI tools
  • Export as Skills: Generate custom SKILL.md files
  • Export as Code: Generate Python, JavaScript code (planned)

4. 🎯 Intelligent Data Extraction

LLM-Driven Semantic Extraction

  • Natural Language Description: Describe data to extract in plain language, no selectors needed
  • Structured Output: Auto-convert unstructured pages to JSON data
  • Multiple Extraction Methods:
    • CSS/XPath selector extraction
    • LLM semantic understanding extraction
    • Batch multi-field extraction
    • Auto-pagination extraction
  • Screenshot Annotation: Pair extraction with screenshots for visual reference

Example:

# Traditional: precise CSS selectors required
# (Content-Type is set explicitly because curl -d defaults to form encoding)
curl -X POST 'http://localhost:8080/api/v1/executor/extract' \
  -H 'Content-Type: application/json' \
  -d '{"selector": "div.product > h2.title", "multiple": true}'

# LLM: natural-language description
curl -X POST 'http://localhost:8080/api/v1/executor/extract-semantic' \
  -H 'Content-Type: application/json' \
  -d '{"instructions": "Extract all product names, prices, and ratings"}'
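
The same two calls can be made from a script. The sketch below mirrors the curl commands using only the Python standard library. The URLs and payload fields are taken from the example above, while the helper names `build_extract_request` and `send` are hypothetical. The response shape is not documented here, so `send` returns the raw body.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8080/api/v1/executor"


def build_extract_request(path: str, payload: dict) -> Request:
    """Build a JSON POST request for one of the extraction endpoints."""
    return Request(
        f"{BASE_URL}/{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: Request, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body."""
    with urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Selector-based extraction: the caller must know the page structure.
selector_req = build_extract_request(
    "extract", {"selector": "div.product > h2.title", "multiple": True}
)

# Semantic extraction: the caller describes the data in plain language.
semantic_req = build_extract_request(
    "extract-semantic",
    {"instructions": "Extract all product names, prices, and ratings"},
)
```

Calling `send(selector_req)` or `send(semantic_req)` against a running instance performs the extraction.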

5. 🔐 Session Management & Authentication

Stable and Reliable Browser Sessions

  • Multiple Browser Instances: Manage multiple independent browser instances simultaneously
  • User Data Persistence: Cookies, LocalStorage, SessionStorage auto-saved
  • Login State Retention: Login state automatically restored after reopening
  • Proxy Support: Each instance can be configured with its own proxy
  • Custom Configuration: User-Agent, window size, language, etc.
  • Headless Mode: Support headless/headed mode switching for different scenarios

6. 📸 Screenshots & Debugging

Powerful Debugging and Monitoring

  • Full Page Screenshots: Capture complete page including scroll areas
  • Element Screenshots: Precisely capture specific elements
  • Multiple Formats: PNG, JPEG, etc.
  • Auto Save: Screenshots are saved automatically to a specified directory, and the file path is returned
  • Accessibility Snapshot: Get page accessibility tree, quickly locate elements
  • RefID System: Assigns stable reference IDs to each interactive element (@e1, @e2...)
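
As a rough mental model of the RefID idea, the sketch below walks a flattened accessibility snapshot and hands out `@e1`, `@e2`, ... to interactive elements in document order. How BrowserWing actually builds the snapshot or keeps IDs stable is not described in these notes. The node format, the set of roles, and `assign_ref_ids` are all assumptions for illustration.

```python
from typing import Dict, List

# Assumption: a small set of ARIA roles treated as "interactive" here.
INTERACTIVE_ROLES = {"button", "link", "textbox", "checkbox", "combobox"}


def assign_ref_ids(nodes: List[dict]) -> Dict[str, dict]:
    """Map @eN reference IDs to interactive nodes, in document order."""
    refs: Dict[str, dict] = {}
    for node in nodes:
        if node.get("role") in INTERACTIVE_ROLES:
            refs[f"@e{len(refs) + 1}"] = node
    return refs


# Hypothetical snapshot of a login page (role + accessible name per node).
snapshot = [
    {"role": "heading", "name": "Sign in"},
    {"role": "textbox", "name": "Email"},
    {"role": "textbox", "name": "Password"},
    {"role": "button", "name": "Log in"},
]
refs = assign_ref_ids(snapshot)
```

An agent could then refer to the login button as `@e3` instead of constructing a CSS selector for it.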

7. 🚀 High-Performance Architecture

Optimized for Production

  • Go Backend: High performance, low latency, concurrency-friendly
  • React + TypeScript Frontend: Modern, responsive UI
  • Lazy Loading: Create Agent instances on-demand, save resources
  • Connection Pooling: Efficient browser connection reuse
  • Error Recovery: Auto-handle Chrome crashes, connection drops
  • Graceful Shutdown: Cleans up resources and closes browsers automatically on Ctrl+C

📋 Complete Feature List

Browser Control

  • ✅ Navigation: Forward, back, refresh, jump to URL
  • ✅ Tab Management: New, close, switch, list
  • ✅ Window Management: Maximize, minimize, resize
  • ✅ Element Interaction: Click, type, select, hover, drag
  • ✅ Keyboard Operations: Keys, shortcuts, combinations
  • ✅ File Operations: Upload files, monitor downloads
  • ✅ Scroll Control: Scroll to element, scroll page
  • ✅ JavaScript Execution: Run custom JS code
  • ✅ Cookie Management: Get, set, delete cookies
  • ✅ Storage Management: LocalStorage, SessionStorage operations

Data Extraction

  • ✅ Text Extraction: innerText, textContent
  • ✅ HTML Extraction: innerHTML, outerHTML
  • ✅ Attribute Extraction: href, src, data-*, etc.
  • ✅ Multi-Element Extraction: Batch extract list data
  • ✅ LLM Semantic Extraction: Natural language extraction requirements
  • ✅ Table Extraction: Auto-recognize table structure
  • ✅ Pagination Extraction: Auto-paginate and aggregate data

AI Capabilities

  • ✅ Multi-LLM Support: OpenAI, Claude, DeepSeek, etc.
  • ✅ Streaming Response: Real-time view of AI execution
  • ✅ Task Evaluation: Intelligently determine if tool calls needed
  • ✅ Direct Response: Fast response for simple questions (56% faster)
  • ✅ Tool Calling: Auto-select appropriate browser tools
  • ✅ Error Handling: Auto-retry or report failures
  • ✅ Session Management: Parallel sessions, saved history

Scripts & Automation

  • ✅ One-Click Recording: Auto-capture all operations
  • ✅ Visual Editing: Drag to adjust step order
  • ✅ Variable Support: Script variables and extraction variables
  • ✅ Conditional Execution: Conditions based on extracted data
  • ✅ Batch Playback: Sequential execution of multiple scripts
  • ✅ Export as Skills: Generate SKILL.md files
  • ✅ Export as MCP: Generate MCP tool call sequences

Integration & Extension

  • ✅ MCP Server: Standard MCP protocol implementation
  • ✅ Skills File: Cursor, Windsurf compatible
  • ✅ HTTP API: 26+ RESTful endpoints
  • ✅ OpenAPI Spec: Standardized API documentation
  • ✅ Webhooks: Event notifications (planned)
  • ✅ Plugin System: Extend with custom features (planned)

🎯 Use Cases

1. RPA (Robotic Process Automation)

  • Data Entry: Auto-fill forms, submit orders
  • Report Generation: Periodically scrape data, generate Excel reports
  • Email Processing: Auto-login to email, categorize and process
  • Invoice Processing: Batch download invoices, extract information

2. Data Collection & Monitoring

  • Price Monitoring: Monitor e-commerce price changes, price alerts
  • Competitor Analysis: Collect competitor information, comparative analysis
  • Sentiment Monitoring: Monitor social media, analyze user feedback
  • Property Monitoring: Real-time property listings, immediate notifications

3. Automated Testing

  • E2E Testing: End-to-end functional testing
  • Regression Testing: Batch replay test cases
  • UI Testing: Screenshot comparison, visual regression testing
  • Performance Testing: Page load time monitoring

4. AI Agent Development

  • Intelligent Assistant: Provide browser operation capabilities for AI assistants
  • Information Retrieval: AI auto-search and extract web information
  • Task Execution: AI-driven complex task automation
  • Knowledge Base Building: Auto-collect and organize knowledge content

5. Content Creation & Management

  • Auto Publishing: Batch publish blog, social media content
  • Content Collection: Collect quality content, assist creati...

v0.0.1 — Initial Public Release

16 Dec 14:53

Overview

BrowserWing enables AI agents to control browsers via MCP commands instead of slow, token-heavy step-by-step interactions.

By moving browser execution behind a command boundary, agents can operate faster with significantly fewer LLM calls.

Features

  • Define browser automation as reusable MCP commands
  • Execute browser actions with minimal LLM interaction
  • Local browser control for agent-driven workflows
  • Simple demo showcasing command-based automation
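
The command idea can be sketched as a registry of named, predefined step sequences: the agent picks a command and supplies arguments once, and the steps run without further model calls. This is a conceptual illustration only. The registry, `define_command`, `run_command`, and the `search_site` command are hypothetical, and the steps return descriptive strings instead of driving a real browser.

```python
from typing import Callable, Dict, List

Step = Callable[[dict], str]
COMMANDS: Dict[str, List[Step]] = {}


def define_command(name: str, steps: List[Step]) -> None:
    """Register a reusable command as a fixed sequence of steps."""
    COMMANDS[name] = steps


def run_command(name: str, args: dict) -> List[str]:
    """Run every step of a command with the same arguments, in order."""
    return [step(args) for step in COMMANDS[name]]


# Each step only describes the action it stands for (no real browser here).
define_command("search_site", [
    lambda a: f"navigate {a['url']}",
    lambda a: f"type #q {a['query']}",
    lambda a: "press Enter",
])

log = run_command(
    "search_site",
    {"url": "https://example.com", "query": "browser automation"},
)
```

One command invocation here stands in for three browser actions that a step-by-step agent would otherwise reason about individually.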

Motivation

Current browser agents rely on frequent LLM interactions for DOM inspection and reasoning, resulting in high latency and token usage.

BrowserWing reduces this overhead by letting LLMs focus on intent, while execution is handled by predefined commands.

Notes

  • Early-stage release; APIs may change
  • Limited command set in this version
  • Focused on local experimentation

Next Steps

  • Command composition and chaining
  • Improved authoring tools
  • More real-world automation examples

Feedback

Issues, discussions, and early feedback are highly appreciated.