Core: Global task idempotency/concurrency guard to prevent duplicate runs under session-lane backpressure #18760
Closed as not planned
Labels
stale (marked as stale due to inactivity)
Description
Summary
In multi-channel + multi-session production use, we repeatedly hit session-lane backpressure and duplicate sub-task dispatches (especially when long-running tasks and sessions_send/process poll overlap). This causes:
- "ACK visible, body delayed/missing" behavior in chat channels
- cascading timeouts on sessions_send/tool calls
- repeated task launches that look like "infinite retry"
- lock conflicts in downstream scripts due to duplicated dispatch
There are adjacent issues (#14214, #13682, #17569, #7108, #16583), but we still lack a first-class global task idempotency/concurrency guard at platform level.
Current behavior
- A channel session starts a long task (or nested tool loop).
- User sends follow-up messages and/or automation sends additional commands.
- Session lane queue grows; later messages wait behind active runs.
- New runs can still be spawned for the same logical task (no built-in task key guard).
- Downstream task runners observe lock conflicts / duplicate work.
Why this is a core platform gap
This is not business-script specific: any non-trivial orchestration with long tasks + multiple message sources can reproduce it.
Proposal
Add a built-in Task Guard abstraction in the OpenClaw runtime/tools layer:
- taskKey convention: <sessionKey>::<taskName> (or explicit task namespace)
- Atomic acquire(taskKey) / release(taskKey, status)
- Standard conflict response: already_running + current runId/startedAt
- Failed-task retry policy: require explicit approve-retry before next acquire
- Optional TTL + stale-lock healing
- Surface guard state in session/task diagnostics (queue_status-like visibility)
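To make the proposal concrete, here is a minimal in-memory sketch of what such a guard could look like. It is illustrative only: TaskGuard, approveRetry, snapshot, the retry_not_approved reason, and the extra runId argument on release are all assumptions, not existing OpenClaw APIs. It also assumes a single process, where a synchronous acquire is effectively atomic; a multi-process deployment would need a shared store with compare-and-set semantics instead.

```typescript
// Hypothetical sketch: none of these names are an existing OpenClaw API.
type ReleaseStatus = "succeeded" | "failed";

interface GuardEntry {
  taskKey: string;
  runId: string;
  startedAt: number; // epoch milliseconds
  state: "running" | "failed";
}

type AcquireResult =
  | { ok: true; runId: string }
  | { ok: false; reason: "already_running"; runId: string; startedAt: number }
  | { ok: false; reason: "retry_not_approved"; runId: string };

class TaskGuard {
  private entries = new Map<string, GuardEntry>();
  private approved = new Set<string>();
  private nextRun = 1;
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number = 15 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Key convention from the proposal: <sessionKey>::<taskName>
  static key(sessionKey: string, taskName: string): string {
    return `${sessionKey}::${taskName}`;
  }

  private isStale(entry: GuardEntry): boolean {
    return entry.state === "running" && this.now() - entry.startedAt > this.ttlMs;
  }

  // Synchronous check-and-set: no await between the lookup and the write.
  acquire(taskKey: string): AcquireResult {
    const existing = this.entries.get(taskKey);
    if (existing) {
      if (existing.state === "running") {
        if (!this.isStale(existing)) {
          return {
            ok: false,
            reason: "already_running",
            runId: existing.runId,
            startedAt: existing.startedAt,
          };
        }
        // Stale-lock healing: the old entry is replaced below.
      } else if (!this.approved.delete(taskKey)) {
        // A failed task needs an explicit approval before the next acquire.
        return { ok: false, reason: "retry_not_approved", runId: existing.runId };
      }
    }
    const entry: GuardEntry = {
      taskKey,
      runId: `run-${this.nextRun++}`,
      startedAt: this.now(),
      state: "running",
    };
    this.entries.set(taskKey, entry);
    return { ok: true, runId: entry.runId };
  }

  // runId is an added assumption so a healed (stale) run cannot release its successor.
  release(taskKey: string, runId: string, status: ReleaseStatus): boolean {
    const existing = this.entries.get(taskKey);
    if (!existing || existing.runId !== runId) return false;
    if (status === "succeeded") this.entries.delete(taskKey);
    else existing.state = "failed";
    return true;
  }

  // Records approval only if the task is currently in the failed state.
  approveRetry(taskKey: string): boolean {
    if (this.entries.get(taskKey)?.state !== "failed") return false;
    this.approved.add(taskKey);
    return true;
  }

  // Diagnostics view: every guard entry plus a stale flag.
  snapshot(): Array<GuardEntry & { stale: boolean }> {
    return [...this.entries.values()].map((e) => ({ ...e, stale: this.isStale(e) }));
  }
}
```

In this sketch a duplicate trigger never creates a run: the second acquire for the same key returns already_running with the original runId and startedAt, and a failed task stays blocked until approveRetry is called for its key.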
Suggested integration points
- sessions_spawn
- high-risk tool pipelines (long exec/poll loops)
- optional guard hook in auto-reply dispatch path before launching taskful flows
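As a rough illustration of the guard hook idea, the wrapper below shows how a dispatch path might consult a guard before launching a task. withTaskGuard, activeRuns, and the launch callback are hypothetical names, not part of OpenClaw; the sketch assumes a single process and leaves out the failed-task approval and TTL handling for brevity.

```typescript
// Hypothetical names throughout; this is not an existing OpenClaw API.
type GuardedResult<T> =
  | { status: "completed"; value: T }
  | { status: "already_running"; runId: string; startedAt: number };

// In-memory registry of running tasks, keyed by <sessionKey>::<taskName>.
const activeRuns = new Map<string, { runId: string; startedAt: number }>();
let runCounter = 0;

async function withTaskGuard<T>(
  taskKey: string,
  launch: (runId: string) => Promise<T>,
): Promise<GuardedResult<T>> {
  const active = activeRuns.get(taskKey);
  if (active) {
    // Duplicate trigger: report the existing run instead of spawning a new one.
    return { status: "already_running", runId: active.runId, startedAt: active.startedAt };
  }
  // There is no await between the lookup above and the set below, so another
  // caller in the same process cannot interleave between check and claim.
  runCounter += 1;
  const runId = `run-${runCounter}`;
  activeRuns.set(taskKey, { runId, startedAt: Date.now() });
  try {
    return { status: "completed", value: await launch(runId) };
  } finally {
    // Release on success and on error; the error itself still propagates.
    activeRuns.delete(taskKey);
  }
}
```

A dispatch path could then call withTaskGuard(taskKey, (runId) => startTask(runId)), where startTask stands in for whatever actually launches the flow, and send the already_running result back to the channel instead of queuing another run.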
Acceptance criteria
- Concurrent same-key launches produce exactly one running task.
- Duplicate triggers return deterministic already_running response (no new run).
- Failed tasks cannot silently auto-retry without explicit approval.
- Queue pressure scenarios no longer amplify duplicate task creation.
- Observability: operators can inspect active guards and stale guards.
Reproduction hints
- Multi-channel Discord setup
- Start long-running task in one channel session
- Send repeated trigger messages/commands for same logical task while first run is active
- Observe current behavior: queue delay + duplicate launches + lock contention