Message desync after long agent output (responses shifted by one) #52982

@alex-blocklab

Description

Bug Description

After a long agent output, subsequent messages receive responses intended for the previous message. Sending another message then returns the response that should have been delivered to the first one. Responses are effectively shifted by one. Reproducible on both Discord and Telegram channels.

Version: openclaw 2026.3.13

Steps to Reproduce

  1. Send a message that triggers a long agent output (e.g., research task, code review)
  2. Wait for the response to be fully delivered
  3. Send a new message immediately after
  4. Observe: the response received is stale/old (from a previous context)
  5. Send another message — now receive the response that should have been delivered in step 4

Root Cause Analysis

Three interacting defects in the message processing pipeline create a race condition where response N gets delivered to message N+1:

Defect 1 (Primary): Telegram debouncer breaks sequentialize serialization

Location: discord-CcCLMjHw.js lines ~125364-125394, ~154182

The grammY sequentialize middleware serializes all updates per chat by holding a lock until the handler returns. However, the inbound debouncer's enqueue() method returns immediately when it decides to buffer a message, releasing the sequentialize lock before actual processing begins. Real processing happens later via setTimeout.

Result: Two messages for the same chat process concurrently, destroying ordering guarantees.

Sequence:

  1. Message A arrives → sequentialize acquires lock
  2. Message A enters debouncer → debouncer returns immediately (buffering) → lock released
  3. Message B arrives → sequentialize acquires lock (it's free now) → B enters processing
  4. Debouncer fires for A → A processes concurrently with B
  5. Responses may be swapped depending on completion order
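The fix suggested later (make enqueue() hold the caller until processing finishes) can be sketched as follows. This is a hypothetical minimal debouncer, not the actual openclaw/grammY internals; InboundDebouncer and the process callback are illustrative names:

```javascript
"use strict";

// Minimal sketch: enqueue() returns a promise that resolves only after
// the buffered batch is processed, so a sequentialize-style per-chat
// lock held by the caller is not released until handling has finished.
class InboundDebouncer {
  constructor(delayMs) {
    this.delayMs = delayMs;
    this.buffer = [];
    this.waiters = [];
    this.timer = null;
  }

  enqueue(message, process) {
    this.buffer.push(message);
    // Hold the caller until the batch is actually processed.
    const done = new Promise((resolve) => this.waiters.push(resolve));
    if (this.timer) clearTimeout(this.timer);
    this.timer = setTimeout(async () => {
      const batch = this.buffer.splice(0);
      const waiters = this.waiters.splice(0);
      await process(batch);                     // real handling happens here
      for (const release of waiters) release(); // only now release the callers
    }, this.delayMs);
    return done;
  }
}
```

Because the returned promise stays pending until process() completes, a sequentialize middleware awaiting the handler would keep the chat lock held across the debounce window, and step 3 of the sequence above could not start early.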

Defect 2: Stale FOLLOWUP_RUN_CALLBACKS cause cross-message delivery

Location: lines ~78452-78460 (kickFollowupDrainIfIdle), ~78483 (scheduleFollowupDrain)

FOLLOWUP_RUN_CALLBACKS is a global Map keyed by session key. When message A's run finishes, finalizeWithFollowup stores A's runFollowupTurn callback — which closes over A's opts including opts.onBlockReply (A's reply dispatcher).

When a later message triggers kickFollowupDrainIfIdle, it retrieves the stale callback from A's context and uses it to drain the queue, routing responses through the wrong delivery pipeline.
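One way to remove the stale-lookup hazard is to carry each run's callback on the queue item itself, as the suggested fixes propose. The sketch below uses illustrative names (followupQueues, enqueueFollowup, drainFollowups), not the actual openclaw internals:

```javascript
"use strict";

// Sketch: instead of a global map keyed by session key, the followup
// callback travels with the queued item, so a later drain can never
// pick up a previous message's reply dispatcher.
const followupQueues = new Map(); // sessionKey -> [{ payload, runFollowupTurn }]

function enqueueFollowup(sessionKey, payload, runFollowupTurn) {
  if (!followupQueues.has(sessionKey)) followupQueues.set(sessionKey, []);
  followupQueues.get(sessionKey).push({ payload, runFollowupTurn });
}

async function drainFollowups(sessionKey) {
  const queue = followupQueues.get(sessionKey) || [];
  while (queue.length > 0) {
    const item = queue.shift();
    // The callback closes over the right opts/onBlockReply for this
    // item; there is no stale global lookup to go wrong.
    await item.runFollowupTurn(item.payload);
  }
}
```

Each drained item is delivered through the dispatcher that enqueued it, regardless of which message's run triggers the drain.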

Defect 3: finalizeWithFollowup starts drain before delivery completes

Location: lines ~120315-120317

const finalizeWithFollowup = (value, queueKey, runFollowupTurn) => {
    // Drain is scheduled while delivery of `value` is still in flight.
    scheduleFollowupDrain(queueKey, runFollowupTurn);
    return value;
};

The drain is scheduled simultaneously with returning the payload. The next run begins before withReplyDispatcher has finished flushing the current run's delivery chain, creating a race between current delivery and followup processing.
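The reordering described in the fixes can be sketched like this. The drain stub and the deliveryDone promise are simplified stand-ins for the real delivery machinery, assumed here for illustration:

```javascript
"use strict";

// Sketch: finalizeWithFollowup awaits the current run's delivery chain
// before the followup drain is scheduled, removing the race between
// current delivery and followup processing.
const events = [];

// Stub that records ordering; the real scheduleFollowupDrain is async.
function scheduleFollowupDrain(queueKey, runFollowupTurn) {
  events.push(`drain:${queueKey}`);
  runFollowupTurn();
}

async function finalizeWithFollowup(value, queueKey, runFollowupTurn, deliveryDone) {
  await deliveryDone; // wait until reply-dispatcher flushing has finished
  scheduleFollowupDrain(queueKey, runFollowupTurn);
  return value;
}
```

With this ordering, the drain event can only ever be observed after the delivery promise has settled.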

Contributing: Command lane pump() ordering

Location: lines ~49546-49548

The next queued task starts executing before the current task's promise resolves, meaning the followup PI run can begin before the original message's delivery pipeline has been notified of completion.
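The defensive fix (settle the current entry before starting the next) can be sketched with a hypothetical serial lane. createLane, push, and pump are illustrative names, not the actual internals:

```javascript
"use strict";

// Sketch: pump() settles the current entry's promise, and yields one
// microtask so its awaiters observe completion, before the next queued
// task is started.
function createLane() {
  const queue = [];
  let running = false;

  async function pump() {
    if (running) return;
    running = true;
    while (queue.length > 0) {
      const { task, resolve, reject } = queue.shift();
      try {
        resolve(await task()); // settle the current entry first
      } catch (err) {
        reject(err);
      }
      await Promise.resolve(); // let its awaiters run before the next task
    }
    running = false;
  }

  return {
    push(task) {
      return new Promise((resolve, reject) => {
        queue.push({ task, resolve, reject });
        pump();
      });
    },
  };
}
```

The single microtask yield after resolve() is what guarantees that code awaiting task N runs before task N+1 begins.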

Why Long Outputs Trigger It

  • Longer active-run window = higher probability the next user message arrives during response delivery
  • More block reply chunks in flight = more interleaving opportunities when followup drain starts concurrently
  • Debounce timing alignment = 1-second debounce window + serialization bypass makes the race near-certain after long outputs
  • Stale callback window grows = more time between setting and using FOLLOWUP_RUN_CALLBACKS

Suggested Fixes

  1. Fix debouncer serialization (critical): Make enqueue() return a promise that resolves after processing completes, not after buffering. This restores the sequentialize guarantee. Alternatively, move the debouncer inside the sequentialize-protected handler.

  2. Fix stale callbacks (critical): Store the runFollowupTurn callback on the followup queue item itself rather than in a separate global map. Or update FOLLOWUP_RUN_CALLBACKS at the start of each new run.

  3. Fix drain timing (important): Move scheduleFollowupDrain to execute after withReplyDispatcher completes all pending deliveries, not inside finalizeWithFollowup.

  4. Fix command lane ordering (defensive): Resolve the current entry's promise before calling pump() to start the next task.

All fixes are localized to the queuing and delivery infrastructure. Fix 1 alone would likely eliminate the bug for Telegram. Fix 2 addresses remaining edge cases on both platforms.

Environment

  • OpenClaw version: 2026.3.13
  • Platforms affected: Discord, Telegram
  • Node.js: v22.22.1
  • OS: Linux 6.8.0-100-generic (x64)
