fix: avoid manual cron deadlock on reentrant lanes#42161
fix: avoid manual cron deadlock on reentrant lanes#42161siewcapital wants to merge 1 commit intoopenclaw:mainfrom
Conversation
Greptile SummaryThis PR fixes a deadlock that occurs when The fix is well-scoped, properly tested with focused regression tests (direct re-entry and cross-lane re-entry), and includes sensible draining semantics: re-entrant work is allowed to complete even during gateway drain, which is correct because such work is part of an already-committed outer task. The code is safe to merge. Confidence Score: 5/5
Last reviewed commit: f2e70dc |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f2e70dcd4d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| state.activeTaskIds.add(taskId); | ||
| void (async () => { | ||
| const startTime = Date.now(); | ||
| const parentStack = laneExecutionScope.getStore() ?? []; |
There was a problem hiding this comment.
Preserve caller lane context at enqueue time
This captures lane ancestry at dequeue time (laneExecutionScope.getStore() inside drainLane) instead of when the task is enqueued, so queued middle-lane work can lose the outer lane that is still waiting on it. In a cron -> session -> cron chain where session is already busy, the delayed session callback may start without cron in scope, making the inner enqueueCommandInLane("cron", ...) queue behind the still-active outer cron task and deadlock again under load.
Useful? React with 👍 / 👎.
| const parentStack = laneExecutionScope.getStore() ?? []; | ||
| try { | ||
| const result = await entry.task(); | ||
| const result = await laneExecutionScope.run([...parentStack, lane], async () => { |
There was a problem hiding this comment.
Avoid duplicating lane markers on every dequeue
Building each task scope as run([...parentStack, lane], ...) causes the same lane to be appended repeatedly because pump() is invoked from the previous task’s context; with a long queue this grows the stack every task and repeatedly copies larger arrays, and stale ancestor lanes can persist longer than intended, increasing false-positive reentrancy matches and bypassing normal queueing semantics.
Useful? React with 👍 / 👎.
Summary
openclaw cron run <jobId>can enqueue successfully but never execute because the samecronlane is re-entered through the embedded runner path.Change Type
Scope
Linked Issue/PR
User-visible / Behavior Changes
Security Impact
NoNoNoNoNoYes, explain risk + mitigation:Repro + Verification
Environment
Steps
Expected
cron -> session -> cronre-entry.Actual
Evidence
Human Verification
What you personally verified (not just CI), and how:
pnpm vitest run src/process/command-queue.test.ts src/cron/service.issue-regressions.test.ts,pnpm exec oxlint src/process/command-queue.ts src/process/command-queue.test.ts, andpnpm buildcron -> session -> cronre-entryReview Conversations
Compatibility / Migration
YesNoNoFailure Recovery (if this breaks)
src/process/command-queue.tssrc/process/command-queue.ts,src/process/command-queue.test.tsRisks and Mitigations