Add Actions: event-driven timed agent behavior by EwoutH · Pull Request #3308 · mesa/mesa

EwoutH · 2026-02-15T08:19:28Z

Summary

Adds a minimal Action system to mesa.experimental.actions, enabling agents to perform actions that take time, can be interrupted, and give proportional reward based on a reward curve.

Motive

Mesa 3.5 introduced event scheduling (schedule_event, run_for, etc.), but agents still lack a built-in concept of doing something over time. There's no way to know if an agent is busy, interrupt what it's doing, or get partial credit for incomplete work. This has been discussed extensively in #2526 (Tasks), #2529 (Continuous States), #2538 (Behavioral Framework), and #2858 (ActionSpace).

This PR introduces the minimal foundation that those proposals build on: timed actions with interruption and partial completion.

Implementation

Two classes in mesa/experimental/actions/__init__.py:

Action: Defines an action with name, duration, priority, reward curve, and on_effect callback. Tracks its own runtime state (progress, scheduled event).
ActionAgent: Agent subclass with current_action, start_action(), interrupt_for(), and cancel_action(). Integrates with model.schedule_event for completion timing.

Reward curves map progress [0,1] to effective completion [0,1]. Two built-ins: linear (default, proportional) and step (all-or-nothing). Any float → float callable works.

Key design decisions (mainly for simplicity, now):

Single on_effect(agent, completion) callback fires on both completion and interruption, scaled by the reward curve. No separate complete/interrupt reward logic.
interrupt_for() silently ignores if current action is non-interruptible.
start_action() raises if agent is already busy (explicit over implicit).
Agent.remove() cancels any scheduled action event.

Usage Examples

from mesa import Model
from mesa.experimental.actions import Action, ActionAgent, step


class Sheep(ActionAgent):
    def __init__(self, model):
        super().__init__(model)
        self.energy = 50.0
        self.alive = True

        # Start the first action
        self.start_action(self.forage())

    def forage(self):
        """Forage for food. Linear reward: partial grazing still gives some energy."""
        return Action(
            "forage", duration=5.0, priority=1,
            on_effect=self.gain_energy,
        )

    def flee(self):
        """Flee from predator. All-or-nothing: must complete to survive."""
        return Action(
            "flee", duration=2.0, priority=10,
            interruptible=False, reward_curve=step,
            on_effect=self.resolve_flee,
        )

    def gain_energy(self, agent, completion):
        agent.energy = min(100, agent.energy + 30 * completion)
        # Done foraging (or interrupted), start again
        if not agent.is_busy:
            agent.start_action(agent.forage())

    def resolve_flee(self, agent, completion):
        if completion < 1.0:
            agent.alive = False  # Didn't escape
        elif not agent.is_busy:
            agent.start_action(agent.forage())  # Safe, back to grazing

    def spot_predator(self):
        """Called externally (e.g. by a Wolf or the model). Triggers flee."""
        self.interrupt_for(self.flee())


# Run it
model = Model()
sheep = Sheep(model)
model.run_for(3)          # Sheep forages for 3 time units
sheep.spot_predator()     # Wolf! Forage interrupted (3/5 = 60% energy gained), flee starts
model.run_for(2)          # Flee completes, sheep goes back to foraging

Additional Notes

This is deliberately minimal. Future work (not in this PR):

Callable duration/priority for stochastic/state-dependent values
Preconditions and available_actions repertoire
evaluate() / select_action() behavioral loop
Action queue and resumability
Integration into the base Agent class (pending stabilization)

See the #3304 for the full roadmap.

Introduce mesa.experimental.actions with Action and ActionAgent classes. Actions have duration, priority, reward curves, and integrate with Mesa's event scheduling for precise timing. Agents can start, interrupt, and cancel actions, receiving proportional reward based on progress.

Cover action lifecycle (start, complete, interrupt, cancel), reward curves (linear, step, custom), non-interruptible actions, agent removal cleanup, and multi-interrupt integration sequences.

quaquel · 2026-02-15T09:52:55Z