This document provides a high-level introduction to MaaFramework, explaining its design goals, architectural principles, and core capabilities. It serves as an entry point for understanding the framework's purpose, component structure, and ecosystem.
For specific integration instructions, see Quick Start. For detailed technical specifications, see Core Architecture and Pipeline Protocol.
MaaFramework is a cross-platform, black-box automation testing framework built on image recognition. It is a complete rewrite that applies the lessons learned from the MAA (MaaAssistantArknights) project, and it aims to give developers powerful automation capabilities through a low-code approach that remains highly extensible.
The framework's core objective is to enable developers to create sophisticated automation workflows using primarily declarative JSON configurations (the Pipeline Protocol), while seamlessly integrating custom logic in any programming language when needed through the Agent System.
Primary Goals:
Sources: README.md:42-46, README_en.md:44-48
MaaFramework's design is built on three foundational principles:
The framework prioritizes Pipeline JSON definitions for workflow logic, allowing users to describe tasks declaratively without writing code. Complex recognition logic (template matching, OCR, neural networks) and common actions (clicks, swipes, text input) are exposed as JSON properties. When declarative patterns prove insufficient, the framework provides Custom Recognition and Custom Action extension points that execute in separate processes via the Agent System.
The Agent Architecture decouples the core C++ engine from user-defined logic. Custom recognition and action implementations run in independent processes connected via ZeroMQ IPC. This separation removes the language barrier: a C# GUI can orchestrate tasks while Python code handles custom vision algorithms, with no shared memory or FFI involved.
The Controller abstraction provides a unified device interaction API (screencap, click, swipe, etc.) while each platform implementation (AdbController, Win32Controller, etc.) selects optimal methods at runtime. For example, AdbController tests seven screencap methods on initialization and caches the fastest, achieving 60+ FPS on high-end devices.
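The "test every method, keep the fastest" idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not AdbController's implementation: `pick_fastest` and the three stand-in capture functions are hypothetical, and a real controller would query the device instead of sleeping.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Hypothetical type: a capture method returns raw image bytes, or raises on failure.
CaptureFn = Callable[[], bytes]


def pick_fastest(methods: Dict[str, CaptureFn], trials: int = 3) -> Tuple[str, float]:
    """Time each candidate and return (name, average seconds) of the fastest one.

    Methods that raise or return empty data are treated as unsupported and skipped.
    """
    best: Optional[Tuple[str, float]] = None
    for name, capture in methods.items():
        try:
            start = time.perf_counter()
            for _ in range(trials):
                if not capture():
                    raise RuntimeError("empty frame")
            average = (time.perf_counter() - start) / trials
        except Exception:
            continue  # this method does not work on the current device
        if best is None or average < best[1]:
            best = (name, average)
    if best is None:
        raise RuntimeError("no working capture method")
    return best


# Stand-in capture methods (assumptions for the demo only).
def slow_capture() -> bytes:
    time.sleep(0.05)
    return b"frame"


def fast_capture() -> bytes:
    time.sleep(0.001)
    return b"frame"


def broken_capture() -> bytes:
    raise OSError("not supported on this device")


chosen, avg_seconds = pick_fastest(
    {"slow": slow_capture, "fast": fast_capture, "broken": broken_capture}
)
```

Benchmarking once at initialization and caching the winner keeps the per-frame path free of any selection logic.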
Sources: README.md:44-46, docs/en_us/1.1-QuickStarted.md:4-95, docs/en_us/4.2-StandardizedInterfaceDesign.md:1-31
MaaFramework Core Component Architecture
MaaTasker is the central orchestration component. It loads pipeline definitions from MaaResource, executes recognition-action loops via MaaController, and manages execution context through MaaContext. It exposes the post_task API for asynchronous job submission and provides query interfaces for task status and results.
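A simplified model may help picture the recognition-action loop. The sketch below is not the MaaTasker implementation: `Node`, `run`, and the string-based "screens" are hypothetical stand-ins, and it assumes that the first candidate node whose recognition hits is executed, after which its successors become the new candidates.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    recognize: Callable[[str], bool]  # hypothetical: does this node match the screen?
    act: Callable[[], None]           # hypothetical: action to perform on a hit
    next_nodes: List[str] = field(default_factory=list)


def run(nodes: Dict[str, Node], entry: str,
        screencap: Callable[[], str], max_steps: int = 20) -> List[str]:
    """Walk the graph, executing the first candidate whose recognition hits."""
    executed: List[str] = []
    candidates = [entry]
    for _ in range(max_steps):
        if not candidates:
            break  # the last node had no successors: the task is complete
        screen = screencap()
        hit = next((n for n in candidates if nodes[n].recognize(screen)), None)
        if hit is None:
            continue  # nothing recognized yet; take a new screenshot and retry
        nodes[hit].act()
        executed.append(hit)
        candidates = nodes[hit].next_nodes
    return executed


# Demo with fake screens; a real controller would return images.
clicks: List[str] = []
screens = iter(["home", "loading", "battle"])
nodes = {
    "Start": Node(lambda s: s == "home", lambda: clicks.append("start"), ["EnterBattle"]),
    "EnterBattle": Node(lambda s: s == "battle", lambda: clicks.append("enter")),
}
executed = run(nodes, "Start", lambda: next(screens))
```

The "loading" screen matches no candidate, so the loop simply captures again, which is the behaviour that lets a pipeline wait for a UI transition.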
MaaResource is the asset management system, responsible for loading and validating pipeline JSON files, caching image templates, managing OCR/ONNX models, and providing runtime access to these resources. It supports multi-bundle loading with priority-based override semantics.
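Priority-based override across bundles can be illustrated with plain dictionaries. This is a sketch under an assumption the text does not confirm: bundles loaded later take priority, and a same-named node replaces the earlier one wholesale. `merge_bundles` and the node contents are hypothetical.

```python
from typing import Any, Dict, List

Pipeline = Dict[str, Dict[str, Any]]


def merge_bundles(bundles: List[Pipeline]) -> Pipeline:
    """Merge bundles in load order; later bundles override earlier ones."""
    merged: Pipeline = {}
    for bundle in bundles:
        # Assumption: a same-named node in a later bundle replaces the whole node.
        merged.update(bundle)
    return merged


base = {
    "Start": {"recognition": "OCR", "expected": "Start"},
    "Exit": {"recognition": "OCR", "expected": "Exit"},
}
overlay = {
    "Start": {"recognition": "OCR", "expected": "Play"},
}
merged = merge_bundles([base, overlay])
```

Under this model, an overlay bundle only needs to contain the nodes that differ from the base bundle.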
MaaController is the platform-agnostic device interface. Implementations (AdbController, Win32Controller, etc.) provide screencap and input capabilities optimized for each platform. The controller handles connection management, performance testing, and device feature reporting.
MaaContext is the execution environment for pipeline tasks. It maintains the current image, recognition results, and runtime state, provides context objects to custom recognition/action implementations, and manages IPC communication with Agent processes.
MaaToolkit is a utility collection providing device/window discovery (find_adb_devices, find_desktop_windows), configuration management (init_option), and ProjectInterface protocol support.
Sources: README.md:42-58, docs/en_us/1.2-ExplanationOfTerms.md:1-43
Multi-Platform Controller Architecture
MaaFramework achieves cross-platform support through the Controller abstraction layer. Each platform implementation provides optimized screencap and input methods selected via runtime testing or priority fallback:
| Platform | Controller Class | Target Devices | Key Features |
|---|---|---|---|
| Android | AdbController | Physical devices, emulators | 7 screencap methods, 4 input methods, MaaAgentBinary deployment |
| Windows | Win32Controller | Desktop applications, games | DXGI Desktop Duplication, message-based input, pseudo-minimize |
| macOS | PlayCoverController | iOS apps via PlayCover | iOS app automation on macOS |
| Linux | WlRootsController | Wayland compositors | Native Wayland support |
| Virtual | GamepadController | ViGEm virtual gamepads | Xbox 360/DualShock 4 emulation |
| Custom | CustomController | User-defined platforms | Extensibility interface |
The DbgController enables testing with pre-recorded screenshots, eliminating device requirements during development.
Sources: README.md:19-20, README_en.md:19-20
Language Bindings Architecture
MaaFramework employs a two-layer binding strategy to ensure API consistency across languages while maintaining idiomatic patterns:
Layer 1: Standardized OOP Abstraction
All bindings implement a common object model defined in the Standardized Interface Design:
- MaaTasker, MaaController, MaaResource become Tasker, Controller, Resource classes with automatic handle management
- Asynchronous operation IDs (MaaTaskId, MaaCtrlId) are wrapped in Job/Future objects with wait(), status(), get() methods
- Results are returned as structured detail objects (RecoDetail, NodeDetail, TaskDetail)

Layer 2: Language-Specific Idioms
Each binding adapts the standardized layer to language conventions:
| Language | Package Name | Distribution | Idiomatic Patterns |
|---|---|---|---|
| Python | maafw | PyPI | Type hints, context managers (with), property decorators |
| C# | Maa.Framework.Runtimes | NuGet | async/await, LINQ, IDisposable |
| NodeJS | @maaxyz/maa-node | npm | Promises, async/await, EventEmitter |
| Go | maa-framework-go | pkg.go.dev | Interfaces, defer, error handling |
| Rust | maa-framework | crates.io | Result types, RAII, trait objects |
Sources: README.md:25-29, README_en.md:25-28, docs/en_us/4.2-StandardizedInterfaceDesign.md:10-25
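The Job/Future wrapping mentioned under Layer 1 can be sketched with the standard library. This `Job` class is hypothetical and only mirrors the wait(), status(), get() shape described above. It runs the work on a thread, whereas a real binding would poll the framework using the underlying ID.

```python
import threading
from enum import Enum
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class Status(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


class Job(Generic[T]):
    """Hypothetical wrapper around one asynchronous operation."""

    def __init__(self, work: Callable[[], T]) -> None:
        self._status = Status.PENDING
        self._result: Optional[T] = None
        self._done = threading.Event()
        threading.Thread(target=self._run, args=(work,), daemon=True).start()

    def _run(self, work: Callable[[], T]) -> None:
        self._status = Status.RUNNING
        try:
            self._result = work()
            self._status = Status.SUCCEEDED
        except Exception:
            self._status = Status.FAILED
        finally:
            self._done.set()  # wake up anyone blocked in wait()

    def wait(self) -> "Job[T]":
        """Block until the operation finishes; return self so calls can be chained."""
        self._done.wait()
        return self

    def status(self) -> Status:
        return self._status

    def get(self) -> Optional[T]:
        """Wait for completion and return the result (None if the work failed)."""
        return self.wait()._result


ok_job = Job(lambda: 6 * 7)
failed_job = Job(lambda: 1 / 0)
```

Returning `self` from `wait()` allows the chained style `job.wait().status()`, which keeps call sites short in languages without async/await.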
MaaFramework supports three integration approaches, allowing developers to choose based on complexity requirements:
Applicable Scenarios: Simple automation tasks, quick prototyping, non-programmers
Zero code required. Define entire workflows in JSON using the Pipeline Protocol:
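As an illustration, the string below holds the kind of JSON a user might author. The node names and the image file are hypothetical, and the field names follow my reading of the Pipeline Protocol, so verify them against the protocol documentation. The surrounding Python is not needed to use the framework. It only parses the text and checks that every `next` target is defined, assuming `next` is written as a list.

```python
import json

# The JSON a user would author. Node names and the image file are hypothetical.
PIPELINE_TEXT = """
{
    "StartGame": {
        "recognition": "TemplateMatch",
        "template": "start_button.png",
        "action": "Click",
        "next": ["ConfirmDialog"]
    },
    "ConfirmDialog": {
        "recognition": "OCR",
        "expected": "OK",
        "action": "Click"
    }
}
"""


def dangling_targets(pipeline: dict) -> list:
    """Return (node, target) pairs where `next` names a node that is not defined."""
    return [
        (name, target)
        for name, node in pipeline.items()
        for target in node.get("next", [])
        if target not in pipeline
    ]


pipeline = json.loads(PIPELINE_TEXT)
problems = dangling_targets(pipeline)
```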
Supported by visual editors like MaaPipelineEditor for drag-and-drop development.
Applicable Scenarios: Most production use cases requiring advanced logic
Retains JSON's visual pipeline structure while delegating complex logic to custom implementations running in Agent processes:
This approach preserves ecosystem tool compatibility (visual debugger, General UI) while enabling arbitrary code complexity.
Applicable Scenarios: Maximum flexibility, deep customization, non-standard workflows
Direct API usage without Pipeline JSON, trading ecosystem tools for complete control:
Sources: docs/en_us/1.1-QuickStarted.md:4-113
- [...] (reload API)
- interface.json for resource discovery and configuration
- [...] (save_draw), error screenshots

Sources: README.md:42-58, docs/en_us/1.1-QuickStarted.md:1-227, docs/en_us/1.2-ExplanationOfTerms.md:1-43
MaaFramework has spawned a rich ecosystem of tools and applications:
General UI Applications: MFAAvalonia, MXU, MWU
Development Tools: MaaDebugger, MaaPipelineEditor, VSCode Extension, MFAToolsPlus
Production Applications: 30+ game assistants and automation tools (see README.md:104-187)
The framework is actively maintained with monthly releases, comprehensive CI/CD (8 platform-architecture combinations), and multi-channel distribution (PyPI, NuGet, npm).
Sources: README.md:59-187, README_en.md:61-189