Overview

Purpose and Scope

This document provides a high-level introduction to MaaFramework, explaining its design goals, architectural principles, and core capabilities. It serves as an entry point for understanding the framework's purpose, component structure, and ecosystem.

For specific integration instructions, see Quick Start. For detailed technical specifications, see Core Architecture and Pipeline Protocol.

What is MaaFramework

MaaFramework is a cross-platform framework for automated black-box testing, built on image recognition. It is a complete rewrite that distills the lessons learned from the MAA (MaaAssistantArknights) project, and it aims to give developers powerful automation capabilities through a low-code approach that remains highly extensible.

The framework's core objective is to enable developers to create sophisticated automation workflows using primarily declarative JSON configurations (the Pipeline Protocol), while seamlessly integrating custom logic in any programming language when needed through the Agent System.

Primary Goals:

Low-Code Accessibility: Enable rapid development through JSON-based pipeline definitions with zero coding prerequisites
High Extensibility: Support custom recognition algorithms and actions without framework recompilation
Cross-Platform: Unified API across Windows, Linux, macOS, and Android platforms
Multi-Language: Native bindings for Python, C#, NodeJS, Go, Rust, and Java
Production-Ready: Industrial-grade architecture suitable for long-running, complex automation tasks

Sources: README.md:42-46, README_en.md:44-48

Architecture Philosophy

MaaFramework's design is built on three foundational principles:

1. Declarative Low-Code with Imperative Escape Hatches

The framework prioritizes Pipeline JSON definitions for workflow logic, allowing users to describe tasks declaratively without writing code. Complex recognition logic (template matching, OCR, neural networks) and common actions (clicks, swipes, text input) are exposed as JSON properties. When declarative patterns prove insufficient, the framework provides Custom Recognition and Custom Action extension points that execute in separate processes via the Agent System.

2. Agent-Based Process Separation

The Agent Architecture decouples the core C++ engine from user-defined logic. Custom recognition and action implementations run in independent processes connected via ZeroMQ IPC. This separation removes the language barrier: a C# GUI can orchestrate tasks while Python code handles custom vision algorithms, with no shared memory or FFI layer between them.
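The process boundary described above can be illustrated with a small sketch. It does not use ZeroMQ or any MaaFramework API: a child Python process and JSON over stdin/stdout stand in for the Agent and its IPC channel, and the message fields (`name`, `payload`, `hit`) are invented for illustration.

```python
import json
import subprocess
import sys

# Source for the child process: a stand-in "agent" that reads one JSON request
# from stdin and writes one JSON reply to stdout. Not the real Agent protocol.
AGENT_SOURCE = r"""
import json, sys
request = json.loads(sys.stdin.readline())
# Hypothetical custom recognition: a hit when the payload mentions "Start".
hit = "Start" in request["payload"]
print(json.dumps({"name": request["name"], "hit": hit}))
"""


def call_agent(name: str, payload: str) -> dict:
    """Send one request to a freshly spawned agent process and return its reply."""
    request = json.dumps({"name": name, "payload": payload}) + "\n"
    result = subprocess.run(
        [sys.executable, "-c", AGENT_SOURCE],
        input=request,
        capture_output=True,
        text=True,
        check=True,  # raise if the agent process exits with an error
    )
    return json.loads(result.stdout)


reply = call_agent("MyReco", "Start Game")
print(reply)
```

Because the two sides exchange only serialized messages, the agent could be written in any language that can read and write the same format, which is the property the Agent System relies on.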

3. Platform Abstraction with Optimized Implementations

The Controller abstraction provides a unified device interaction API (screencap, click, swipe, etc.) while each platform implementation (AdbController, Win32Controller, etc.) selects optimal methods at runtime. For example, AdbController tests seven screencap methods on initialization and caches the fastest, achieving 60+ FPS on high-end devices.
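The "test every method, keep the fastest" idea can be sketched as follows. This is a minimal illustration, not the AdbController implementation: the capture functions are hypothetical stand-ins that sleep instead of grabbing a frame, and `pick_fastest` is an invented helper name.

```python
import time
from typing import Callable, Dict, Optional


def pick_fastest(methods: Dict[str, Callable[[], bytes]], trials: int = 3) -> Optional[str]:
    """Time each capture method and return the name of the fastest working one."""
    best_name, best_cost = None, float("inf")
    for name, capture in methods.items():
        try:
            start = time.perf_counter()
            for _ in range(trials):
                capture()
            cost = (time.perf_counter() - start) / trials
        except Exception:
            continue  # method not supported in this environment; skip it
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name


# Hypothetical stand-ins for real screencap methods.
def slow_capture() -> bytes:
    time.sleep(0.02)
    return b"frame"


def fast_capture() -> bytes:
    time.sleep(0.001)
    return b"frame"


def broken_capture() -> bytes:
    raise RuntimeError("not supported")


chosen = pick_fastest({"slow": slow_capture, "fast": fast_capture, "broken": broken_capture})
print(chosen)
```

A controller would run such a probe once at initialization and reuse the chosen method for every later capture.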

Sources: README.md:44-46, docs/en_us/1.1-QuickStarted.md:4-95, docs/en_us/4.2-StandardizedInterfaceDesign.md:1-31

Core Component Architecture

Diagram: MaaFramework Core Component Architecture

MaaTasker

The central orchestration component. MaaTasker loads pipeline definitions from MaaResource, executes recognition-action loops via MaaController, and manages execution context through MaaContext. It exposes the post_task API for asynchronous job submission and provides query interfaces for task status and results.
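A recognition-action loop of this kind can be sketched in a few lines. The sketch assumes that each node lists its successors in `next` and that the first successor whose recognition succeeds is executed; screenshots are plain strings, the `Text` recognition type is invented, and `run_pipeline` is a hypothetical helper rather than the MaaTasker API.

```python
from typing import Callable, Dict, List, Optional

Screen = str  # stand-in for a screenshot; a real frame would be an image


def run_pipeline(
    pipeline: Dict[str, dict],
    entry: str,
    capture: Callable[[], Screen],
    recognizers: Dict[str, Callable[[Screen, dict], bool]],
    act: Callable[[str, dict], None],
    max_steps: int = 20,
) -> List[str]:
    """Run nodes starting at `entry` and return the names of the nodes executed."""
    executed: List[str] = []
    candidates = [entry]
    for _ in range(max_steps):
        if not candidates:
            break  # the last node had no successors
        screen = capture()
        hit: Optional[str] = None
        for name in candidates:  # first candidate that recognizes wins
            node = pipeline[name]
            if recognizers[node["recognition"]](screen, node):
                hit = name
                break
        if hit is None:
            break  # a real engine would keep retrying until a timeout
        act(hit, pipeline[hit])
        executed.append(hit)
        candidates = pipeline[hit].get("next", [])
    return executed


pipeline = {
    "Start": {"recognition": "Text", "expected": "Start", "next": ["Cancel", "Confirm"]},
    "Cancel": {"recognition": "Text", "expected": "Cancel"},
    "Confirm": {"recognition": "Text", "expected": "OK"},
}
screens = iter(["Start menu", "OK dialog"])
actions: List[str] = []
executed = run_pipeline(
    pipeline,
    "Start",
    capture=lambda: next(screens),
    recognizers={"Text": lambda screen, node: node["expected"] in screen},
    act=lambda name, node: actions.append(name),
)
print(executed)
```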

MaaResource

Asset management system responsible for loading and validating pipeline JSON files, caching image templates, managing OCR/ONNX models, and providing runtime access to these resources. Supports multi-bundle loading with priority-based override semantics.
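Priority-based override can be pictured as a merge over node dictionaries. The sketch below assumes that bundles loaded later take priority and that overrides apply field by field within a node; both are assumptions for illustration, and `merge_bundles` is a hypothetical helper, not a MaaResource function.

```python
from typing import Dict, Iterable


def merge_bundles(bundles: Iterable[Dict[str, dict]]) -> Dict[str, dict]:
    """Merge pipeline bundles in load order; later bundles override earlier ones."""
    merged: Dict[str, dict] = {}
    for bundle in bundles:
        for name, node in bundle.items():
            # Field-level merge: untouched fields of an earlier node survive.
            merged[name] = {**merged.get(name, {}), **node}
    return merged


base = {
    "Start": {"recognition": "OCR", "expected": "Start", "action": "Click"},
    "Confirm": {"recognition": "OCR", "expected": "OK", "action": "Click"},
}
# Hypothetical override bundle that changes only the expected text of one node.
patch = {"Start": {"expected": "Begin"}}

merged = merge_bundles([base, patch])
print(merged["Start"])
```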

MaaController

Platform-agnostic device interface. Implementations (AdbController, Win32Controller, etc.) provide screencap and input capabilities optimized for each platform. The controller handles connection management, performance testing, and device feature reporting.

MaaContext

Execution environment for pipeline tasks. Maintains the current image, recognition results, and runtime state. Provides context objects to custom recognition/action implementations and manages IPC communication with Agent processes.

MaaToolkit

Utility collection providing device/window discovery (find_adb_devices, find_desktop_windows), configuration management (init_option), and ProjectInterface protocol support.

Sources: README.md:42-58, docs/en_us/1.2-ExplanationOfTerms.md:1-43

Multi-Platform Support

Diagram: Multi-Platform Controller Architecture

MaaFramework achieves cross-platform support through the Controller abstraction layer. Each platform implementation provides optimized screencap and input methods selected via runtime testing or priority fallback:

Platform	Controller Class	Target Devices	Key Features
Android	`AdbController`	Physical devices, emulators	7 screencap methods, 4 input methods, MaaAgentBinary deployment
Windows	`Win32Controller`	Desktop applications, games	DXGI Desktop Duplication, message-based input, pseudo-minimize
macOS	`PlayCoverController`	iOS apps via PlayCover	iOS app automation on macOS
Linux	`WlRootsController`	Wayland compositors	Native Wayland support
Virtual	`GamepadController`	ViGEm virtual gamepads	Xbox 360/DualShock 4 emulation
Custom	`CustomController`	User-defined platforms	Extensibility interface

The DbgController enables testing with pre-recorded screenshots, eliminating device requirements during development.

Sources: README.md:19-20, README_en.md:19-20

Language Bindings Ecosystem

Diagram: Language Bindings Architecture

MaaFramework employs a two-layer binding strategy to ensure API consistency across languages while maintaining idiomatic patterns:

Layer 1: Standardized OOP Abstraction

All bindings implement a common object model defined in the Standardized Interface Design:

Object Wrappers: MaaTasker, MaaController, MaaResource become Tasker, Controller, Resource classes with automatic handle management
Job Abstractions: Asynchronous operation IDs (MaaTaskId, MaaCtrlId) are wrapped in Job/Future objects with wait(), status(), get() methods
Detail Structures: JSON query results parsed into typed structures (RecoDetail, NodeDetail, TaskDetail)
Callback Agents: C function pointers wrapped in interfaces/base classes for custom implementations
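The Job abstraction from the list above can be sketched with the standard library. This is an illustration of the pattern only: `Job` and `Poster` are hypothetical classes built on `concurrent.futures`, and the status strings are invented, so none of this should be read as the signature of any actual binding.

```python
import itertools
from concurrent.futures import Future, ThreadPoolExecutor
from concurrent.futures import wait as wait_for


class Job:
    """Wraps an operation ID and its pending result behind wait/status/get."""

    def __init__(self, job_id: int, future: Future) -> None:
        self.job_id = job_id
        self._future = future

    def wait(self) -> "Job":
        wait_for([self._future])  # block until finished, without raising
        return self  # allows chaining: job.wait().get()

    def status(self) -> str:
        if not self._future.done():
            return "pending"
        return "failed" if self._future.exception() is not None else "succeeded"

    def get(self):
        return self._future.result()  # re-raises if the operation failed


class Poster:
    """Hands out increasing IDs and runs posted work on a background thread."""

    def __init__(self) -> None:
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._ids = itertools.count(1)

    def post(self, fn, *args) -> Job:
        return Job(next(self._ids), self._pool.submit(fn, *args))


poster = Poster()
job = poster.post(lambda entry: f"{entry}: done", "StartUp")
print(job.job_id, job.wait().status(), job.get())
```

Returning `self` from `wait()` is a design choice that keeps the common "submit, block, read result" path to a single expression.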

Layer 2: Language-Specific Idioms

Each binding adapts the standardized layer to language conventions:

Language	Package Name	Distribution	Idiomatic Patterns
Python	`maafw`	PyPI	Type hints, context managers (`with`), property decorators
C#	`Maa.Framework.Runtimes`	NuGet	async/await, LINQ, IDisposable
NodeJS	`@maaxyz/maa-node`	npm	Promises, async/await, EventEmitter
Go	`maa-framework-go`	pkg.go.dev	Interfaces, defer, error handling
Rust	`maa-framework`	crates.io	Result types, RAII, trait objects

Sources: README.md:25-29, README_en.md:25-28, docs/en_us/4.2-StandardizedInterfaceDesign.md:10-25

Development Paradigms

MaaFramework supports three integration approaches, allowing developers to choose based on complexity requirements:

Approach 1: Pure JSON Low-Code

Applicable Scenarios: Simple automation tasks, quick prototyping, non-programmers

Zero code required. Entire workflows are defined in JSON using the Pipeline Protocol.

Visual editors such as MaaPipelineEditor support drag-and-drop development of these pipelines.
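A pure-JSON workflow of this kind might look like the two-node pipeline embedded below. The node names and the template file name are invented, and the field names (`recognition`, `expected`, `template`, `action`, `next`) are believed to match the Pipeline Protocol but should be checked against its reference. The surrounding Python only parses the JSON and checks that every `next` target exists; it is a hypothetical sanity check, not PipelineChecker.

```python
import json
from typing import List, Tuple

# Hypothetical two-node pipeline; node names and template file are invented.
PIPELINE_JSON = """
{
    "ClickStartButton": {
        "recognition": "OCR",
        "expected": "Start",
        "action": "Click",
        "next": ["ClickConfirmIcon"]
    },
    "ClickConfirmIcon": {
        "recognition": "TemplateMatch",
        "template": "confirm.png",
        "action": "Click"
    }
}
"""


def dangling_next_targets(pipeline: dict) -> List[Tuple[str, str]]:
    """Return (node, target) pairs whose `next` target is not a defined node."""
    missing = []
    for name, node in pipeline.items():
        # This sketch assumes `next` is always written as a list.
        for target in node.get("next", []):
            if target not in pipeline:
                missing.append((name, target))
    return missing


pipeline = json.loads(PIPELINE_JSON)
print(dangling_next_targets(pipeline))
```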

Approach 2: JSON + Custom Extensions (Recommended)

Applicable Scenarios: Most production use cases requiring advanced logic

The JSON pipeline structure is retained, while complex logic is delegated to custom implementations running in Agent processes.

This approach preserves ecosystem tool compatibility (visual debugger, General UI) while enabling arbitrary code complexity.
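The split between a JSON node and a named custom implementation can be sketched as a registry lookup. Everything here is illustrative: the decorator, the registry, and the stamina example are hypothetical, the "screenshot" is a text string, and the `custom_recognition` field name is an assumption to be checked against the Pipeline Protocol. In the framework itself the registered function would run in a separate Agent process.

```python
import json
from typing import Callable, Dict

# Hypothetical registry mapping names used in JSON to Python callables.
CUSTOM_RECOGNITIONS: Dict[str, Callable[[str], bool]] = {}


def custom_recognition(name: str):
    """Decorator that registers a function under the name referenced in JSON."""
    def register(fn: Callable[[str], bool]) -> Callable[[str], bool]:
        CUSTOM_RECOGNITIONS[name] = fn
        return fn
    return register


@custom_recognition("HasEnoughStamina")
def has_enough_stamina(screen_text: str) -> bool:
    # Parses text such as "Stamina 42/120" and requires at least 30 points.
    current = int(screen_text.split()[1].split("/")[0])
    return current >= 30


# The pipeline stays declarative and refers to the custom logic only by name.
NODES = json.loads(
    '{"Farm": {"recognition": "Custom",'
    ' "custom_recognition": "HasEnoughStamina", "action": "Click"}}'
)


def recognize(node: dict, screen_text: str) -> bool:
    """Dispatch a Custom node to its registered implementation."""
    return CUSTOM_RECOGNITIONS[node["custom_recognition"]](screen_text)


print(recognize(NODES["Farm"], "Stamina 42/120"))
```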

Approach 3: Full-Code Development

Applicable Scenarios: Maximum flexibility, deep customization, non-standard workflows

The application calls the binding API directly without Pipeline JSON, trading ecosystem tool support for complete control.
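In a full-code design the control flow lives in ordinary program code. The sketch below shows the shape of such code against a hypothetical controller interface: `Controller`, `FakeController`, `dismiss_popups`, and the click coordinates are all invented for illustration and are not the API of any MaaFramework binding. The fake controller replays canned screens, similar in spirit to testing with pre-recorded screenshots.

```python
from typing import List, Protocol, Tuple


class Controller(Protocol):
    """Hypothetical minimal device interface assumed by this sketch."""

    def screencap(self) -> str: ...
    def click(self, x: int, y: int) -> None: ...


class FakeController:
    """Replays canned screens (as text) and records every click."""

    def __init__(self, screens: List[str]) -> None:
        self._screens = list(screens)
        self.clicks: List[Tuple[int, int]] = []

    def screencap(self) -> str:
        # Advance through the recording, then keep returning the last screen.
        return self._screens.pop(0) if len(self._screens) > 1 else self._screens[0]

    def click(self, x: int, y: int) -> None:
        self.clicks.append((x, y))


def dismiss_popups(ctrl: Controller, max_rounds: int = 5) -> int:
    """Click the (assumed) close button while a popup is visible; return the count."""
    closed = 0
    for _ in range(max_rounds):
        if "popup" not in ctrl.screencap():
            break
        ctrl.click(900, 40)  # invented coordinates of a close button
        closed += 1
    return closed


ctrl = FakeController(["popup A", "popup B", "home"])
closed = dismiss_popups(ctrl)
print(closed, ctrl.clicks)
```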

Sources: docs/en_us/1.1-QuickStarted.md:4-113

Key Features Summary

Low-Code Pipeline Protocol

Declarative Workflows: Define tasks using JSON with 10 recognition types (OCR, TemplateMatch, FeatureMatch, ColorMatch, Neural Networks, etc.) and 22 action types
Visual Tooling: MaaPipelineEditor for no-code editing, VSCode extension for IDE integration
Schema Validation: JSON Schema-based validation with PipelineChecker

Agent-Based Extensibility

Language Barrier Breaking: Custom recognition/action logic runs in separate processes via ZeroMQ IPC
Multi-Language Support: Python, JavaScript, or any language with ZeroMQ bindings
Process Isolation: Framework crashes don't affect custom logic and vice versa

Cross-Platform Controllers

Unified API: Single codebase deploys to Android, Windows, macOS, Linux
Optimized Implementations: Platform-specific screencap/input methods selected via runtime testing
Custom Controller Extension: Implement the Controller interface for new platforms

Resource Management

Multi-Bundle Loading: Combine multiple resource bundles with priority-based override semantics
Asset Caching: Images, OCR models, ONNX models cached for performance
Hot Reloading: Runtime resource updates without process restart (via reload API)

Hardware Acceleration

Multi-Backend Inference: CPU, CUDA (NVIDIA), DirectML (Windows), CoreML (macOS)
OCR Optimization: FastDeploy-based PaddleOCR with batch processing
Neural Network Support: ONNX Runtime for custom classification/detection models

General UI Integration

ProjectInterface Protocol: Standardized interface.json for resource discovery and configuration
Zero UI Development: Existing General UIs (MFAAvalonia, MXU) run any MaaFramework project
Configuration Persistence: Automatic save/load of user preferences

Production-Grade Architecture

Asynchronous Operations: Non-blocking APIs with job/future abstractions
Event Notification System: Real-time callbacks for task progress, recognition results, errors
Robust Error Handling: Timeout mechanisms, retry logic, fallback strategies in Pipeline Protocol
Comprehensive Logging: Configurable log levels, visual debug output (save_draw), error screenshots

Multi-Language API Consistency

Standardized Interface: All bindings follow the Standardized Interface Design
Idiomatic Bindings: Language-specific patterns (async/await, context managers, etc.)
Feature Parity: Equivalent functionality across all supported languages

Sources: README.md:42-58, docs/en_us/1.1-QuickStarted.md:1-227, docs/en_us/1.2-ExplanationOfTerms.md:1-43

Ecosystem and Community

MaaFramework has spawned a rich ecosystem of tools and applications:

General UI Applications: MFAAvalonia, MXU, MWU

Development Tools: MaaDebugger, MaaPipelineEditor, VSCode Extension, MFAToolsPlus

Production Applications: 30+ game assistants and automation tools (see README.md, lines 104-187)

The framework is actively maintained with monthly releases, comprehensive CI/CD (8 platform-architecture combinations), and multi-channel distribution (PyPI, NuGet, npm).

Sources: README.md:59-187, README_en.md:61-189
