Add Actions: event-driven timed agent behavior #3308
Introduce mesa.experimental.actions with Action and ActionAgent classes. Actions have duration, priority, reward curves, and integrate with Mesa's event scheduling for precise timing. Agents can start, interrupt, and cancel actions, receiving proportional reward based on progress.
Cover action lifecycle (start, complete, interrupt, cancel), reward curves (linear, step, custom), non-interruptible actions, agent removal cleanup, and multi-interrupt integration sequences.
        duration: float = 1.0,
        priority: float = 0.0,
        reward_curve: Callable[[float], float] = linear,
        on_effect: Callable[[ActionAgent, float], None] | None = None,
I would leave this out of the class and instead make Action an ABC and have users implement this themselves.
I am even wondering whether duration should be user specified rather than being a float parameter. Basically, I see an action as something that is started and at start_time it schedules its time of completion.
This time of completion is a number, but it might be sampled from a distribution.
Next, there should be an on_completed() abstract method and an on_interrupted() abstract method.
I would leave priority out of it completely because I don't yet see the value of integrating ranking and other decision logic inside the Action primitive.
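A minimal sketch of the shape suggested in this comment, not code from the PR. The names `sample_duration`, `start`, and `Graze`, and the use of a plain dict as the agent, are assumptions made for illustration; only `on_completed()` and `on_interrupted()` come from the comment itself.

```python
import random
from abc import ABC, abstractmethod


class Action(ABC):
    """Hypothetical base class: subclasses supply the duration and both outcomes."""

    def __init__(self, agent):
        self.agent = agent
        self.start_time = None
        self.completion_time = None

    @abstractmethod
    def sample_duration(self) -> float:
        """Return how long the action takes; may be drawn from a distribution."""

    @abstractmethod
    def on_completed(self) -> None:
        """Called when the action reaches its completion time."""

    @abstractmethod
    def on_interrupted(self, time: float) -> None:
        """Called when the action is stopped before its completion time."""

    def start(self, now: float) -> float:
        """Record the start time and fix the completion time for scheduling."""
        self.start_time = now
        self.completion_time = now + self.sample_duration()
        return self.completion_time


class Graze(Action):
    """Illustrative subclass: grazing yields energy in proportion to time spent."""

    def __init__(self, agent, rng):
        super().__init__(agent)
        self.rng = rng

    def sample_duration(self) -> float:
        # Duration is sampled rather than passed in as a fixed float.
        return self.rng.uniform(1.0, 3.0)

    def on_completed(self) -> None:
        self.agent["energy"] += 10.0

    def on_interrupted(self, time: float) -> None:
        # Partial credit is a modelling choice made here, not by the base class.
        fraction = (time - self.start_time) / (self.completion_time - self.start_time)
        self.agent["energy"] += 10.0 * fraction


sheep = {"energy": 0.0}
graze = Graze(sheep, random.Random(0))
done_at = graze.start(now=5.0)
graze.on_interrupted(time=5.0 + (done_at - 5.0) / 2)  # stopped halfway through
```

In this arrangement priority and reward curves do not appear in the primitive at all; any ranking of actions would live in whatever code decides which action to start.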
        # Runtime state
        self.progress: float = 0.0
        self._started_at: float | None = None
        self._event: Event | None = None
I would make it the responsibility of the action to schedule this event, potentially via a start_action() method. I think this is much cleaner than doing this from the agent side as you do in this PR.
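To illustrate the idea of the action owning its completion event, here is a small self-contained sketch. `MiniScheduler` is a hypothetical stand-in written for this example and is not Mesa's event scheduling API; `TimedAction`, `start()`, and `interrupt()` are likewise assumed names.

```python
import heapq
import itertools


class MiniScheduler:
    """Stand-in event queue for illustration only; not Mesa's scheduler."""

    def __init__(self):
        self.time = 0.0
        self._queue = []
        self._counter = itertools.count()  # tie-breaker for events at equal times

    def schedule(self, time, callback):
        event = {"time": time, "callback": callback, "cancelled": False}
        heapq.heappush(self._queue, (time, next(self._counter), event))
        return event

    def run_until(self, end_time):
        # Fire every non-cancelled event due at or before end_time, in time order.
        while self._queue and self._queue[0][0] <= end_time:
            time, _, event = heapq.heappop(self._queue)
            if not event["cancelled"]:
                self.time = time
                event["callback"]()
        self.time = end_time


class TimedAction:
    """The action, not the agent, schedules and cancels its completion event."""

    def __init__(self, scheduler, duration):
        self.scheduler = scheduler
        self.duration = duration
        self.state = "idle"
        self._event = None

    def start(self):
        self.state = "running"
        self._event = self.scheduler.schedule(
            self.scheduler.time + self.duration, self._complete
        )

    def interrupt(self):
        # Cancel the pending completion so it never fires.
        if self._event is not None:
            self._event["cancelled"] = True
            self._event = None
        self.state = "interrupted"

    def _complete(self):
        self._event = None
        self.state = "completed"


scheduler = MiniScheduler()
short, long_ = TimedAction(scheduler, 2.0), TimedAction(scheduler, 5.0)
short.start()
long_.start()
scheduler.run_until(3.0)   # short has completed, long_ is still running
state_at_3 = (short.state, long_.state)
long_.interrupt()
scheduler.run_until(10.0)  # the cancelled completion event is skipped
```

With this split, an agent only needs to hold a reference to its current action and call `start()` or `interrupt()` on it.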
    @property
    def effective_completion(self) -> float:
        """Reward earned at current progress, based on reward curve."""
        return self.reward_curve(self.progress)
I would leave this out completely. This does not generalize at all. For something like a grazing model, this makes sense. But in, say, a call center model, it makes no sense at all. I would replace it with on_interrupt and on_completed methods, both of which the user has to specify.
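As a hedged illustration of the call center case, the sketch below shows an action where "proportional reward" has no meaning: an interrupted call is simply requeued. The class `HandleCall` and its arguments are hypothetical, invented for this example.

```python
from collections import deque


class HandleCall:
    """Hypothetical call-center action: a call is either resolved or requeued."""

    def __init__(self, call_id, waiting, resolved):
        self.call_id = call_id
        self.waiting = waiting    # deque of calls still waiting for an operator
        self.resolved = resolved  # list of calls that were fully handled

    def on_completed(self):
        self.resolved.append(self.call_id)

    def on_interrupt(self):
        # No partial credit: the caller goes back to the front of the queue.
        self.waiting.appendleft(self.call_id)


waiting, resolved = deque(["call-2"]), []
action = HandleCall("call-1", waiting, resolved)
action.on_interrupt()
```

A grazing action could implement the same two hooks with a progress-scaled reward, so the user-defined hooks cover both models without a reward curve in the base class.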
        return f"Action({self.name!r}, progress={self.progress:.0%})"


class ActionAgent(Agent):
I am not sure whether we need a new Agent subclass or whether we can just have a more developed, subclassable Action primitive.
Thanks for everyone's input. I used it to create a second version, which succeeds this PR:
Summary
Adds a minimal Action system to mesa.experimental.actions, enabling agents to perform actions that take time, can be interrupted, and give proportional reward based on a reward curve.
Motive
Mesa 3.5 introduced event scheduling (schedule_event, run_for, etc.), but agents still lack a built-in concept of doing something over time. There's no way to know if an agent is busy, interrupt what it's doing, or get partial credit for incomplete work. This has been discussed extensively in #2526 (Tasks), #2529 (Continuous States), #2538 (Behavioral Framework), and #2858 (ActionSpace).
This PR introduces the minimal foundation that those proposals build on: timed actions with interruption and partial completion.
Implementation
Two classes in mesa/experimental/actions/__init__.py:
- Action: Defines an action with name, duration, priority, reward curve, and on_effect callback. Tracks its own runtime state (progress, scheduled event).
- ActionAgent: Agent subclass with current_action, start_action(), interrupt_for(), and cancel_action(). Integrates with model.schedule_event for completion timing.
Reward curves map progress [0,1] to effective completion [0,1]. Two built-ins: linear (default, proportional) and step (all-or-nothing). Any float → float callable works.
Key design decisions (mainly for simplicity, now):
- on_effect(agent, completion) callback fires on both completion and interruption, scaled by the reward curve. No separate complete/interrupt reward logic.
- interrupt_for() silently ignores if current action is non-interruptible.
- start_action() raises if agent is already busy (explicit over implicit).
- Agent.remove() cancels any scheduled action event.
Usage Examples
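The following is a standalone sketch of how the reward curves described above could behave; it does not import or reproduce the PR's module. The threshold used in `step`, the custom `diminishing` curve, the helper `apply_effect`, and the dict-based agent are all assumptions made for illustration.

```python
import math
from typing import Callable


def linear(progress: float) -> float:
    """Proportional: half the work gives half the effect."""
    return progress


def step(progress: float) -> float:
    """All-or-nothing: assumed here to pay out only at full progress."""
    return 1.0 if progress >= 1.0 else 0.0


def diminishing(progress: float) -> float:
    """Hypothetical custom curve: most of the effect arrives early."""
    return math.sqrt(progress)


def apply_effect(
    on_effect: Callable[[dict, float], None],
    agent: dict,
    reward_curve: Callable[[float], float],
    elapsed: float,
    duration: float,
) -> float:
    """Map elapsed time to progress in [0, 1], then fire on_effect once."""
    progress = min(max(elapsed / duration, 0.0), 1.0)
    completion = reward_curve(progress)
    on_effect(agent, completion)
    return completion


def gain_energy(agent: dict, completion: float) -> None:
    # The same callback handles completion (1.0) and interruption (< 1.0).
    agent["energy"] += 10.0 * completion


sheep = {"energy": 0.0}
# Interrupted after 1 of 4 time units under each curve.
linear_result = apply_effect(gain_energy, sheep, linear, elapsed=1.0, duration=4.0)
step_result = apply_effect(gain_energy, sheep, step, elapsed=1.0, duration=4.0)
custom_result = apply_effect(gain_energy, sheep, diminishing, elapsed=1.0, duration=4.0)
```

Because the callback receives only the scaled completion value, switching between proportional and all-or-nothing behavior is a matter of swapping the curve rather than writing separate completion and interruption handlers.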
Additional Notes
This is deliberately minimal. Future work (not in this PR):
- available_actions repertoire
- evaluate() / select_action() behavioral loop
- Agent class (pending stabilization)
See #3304 for the full roadmap.