-
-
Notifications
You must be signed in to change notification settings - Fork 69.4k
Feature: heartbeat.skipIfRunActive — skip heartbeat when an agent run is in progress #29537
Description
Problem
Heartbeats fire on a fixed schedule regardless of whether the agent is mid-execution. The current queue check in runHeartbeatOnce only catches pending incoming messages:
if (getQueueSize(CommandLane.Main) > 0) return { status: "skipped", reason: "requests-in-flight" };But when a session is actively running — tool calls, subagents, extended agentic work — the queue is empty. All work is in-flight, not pending. So the heartbeat fires mid-run anyway, either competing as a concurrent session, queuing up to fire the moment the run finishes, or in some queue modes steering directly into the active session.
The Gap
isEmbeddedPiRunActive() already exists in the codebase and is used throughout the queue mode logic (reply-*.js). It correctly detects an active agent run. It just isn't called in runHeartbeatOnce before deciding whether to fire.
Proposed Fix
Add an opt-in config key:
{
agents: {
defaults: {
heartbeat: {
every: "30m",
skipIfRunActive: true // skip this heartbeat tick if isEmbeddedPiRunActive() is true
}
}
}
}And in runHeartbeatOnce, after the requests-in-flight check:
if (opts.skipIfRunActive && isEmbeddedPiRunActive(sessionId)) {
return { status: "skipped", reason: "run-active" };
}Why This Is Different From #19382 and #11393
#19382 proposes skipIfRecentActivity — a time-based heuristic on when the user last sent a message. That doesn't help when the agent has been running a long autonomous task for 10+ minutes with no new user message in the window.
#11393 is about heartbeat responses ending up in the wrong conversation thread — a routing problem, not a scheduling one.
This is about run-state awareness. The heartbeat scheduler should know when the agent is actively executing and yield. The state it needs (isEmbeddedPiRunActive) is already tracked — it just needs to reach the heartbeat runner.
Use Case
On gateways running long agentic tasks — coding sessions, research chains, multi-tool workflows — there's currently no way to prevent heartbeats from firing mid-run short of disabling them entirely (every: "0m"). That's fine for gateways that are always busy, but it removes a genuinely useful background health check for gateways that are idle most of the time.