
fix(discord): retry chunked message sends on rate-limit mid-sequence#32904

Closed
Sid-Qin wants to merge 1 commit into openclaw:main from Sid-Qin:fix/discord-chunk-retry-ratelimit-32887

Conversation


@Sid-Qin (Contributor) commented Mar 3, 2026

Summary

  • Problem: When a long agent response is split into multiple Discord message chunks (>2000 chars), if any chunk hits a 429 rate-limit or 5xx error mid-sequence, the error propagates from sendMessageDiscord and all remaining chunks are silently dropped.
  • Why it matters: Users see partial message delivery — the first N chunks appear but the rest never arrive. This is common with long responses that produce 5+ chunks in rapid succession.
  • What changed: Added a sendWithRetry() wrapper around the bot sender path in sendDiscordChunkWithFallback with up to 2 retry attempts, exponential backoff, and retry-after header support. Non-retryable errors (4xx except 429) are thrown immediately.
  • What did NOT change (scope boundary): Webhook delivery path already has its own fallback (catch → bot sender). Voice and media paths are unchanged.
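
For readers skimming the PR, the wrapper described above can be sketched as follows. This is a hedged reconstruction from the PR description and the review comments (which quote the `Math.max(retryAfterMs, RETRY_BASE_DELAY_MS * (attempt + 1))` delay formula), not the actual `reply-delivery.ts` code; `MAX_RETRIES`, `RETRY_BASE_DELAY_MS`, `SendError`, and `isRetryable` are assumed names:

```typescript
// Hypothetical sketch of the sendWithRetry() wrapper; constant and type names
// are assumptions, not the exact implementation in reply-delivery.ts.
const MAX_RETRIES = 2;
const RETRY_BASE_DELAY_MS = 1_000;

interface SendError extends Error {
  status?: number;
  headers?: Record<string, string>;
}

function isRetryable(err: SendError): boolean {
  // 429 and all 5xx are retryable; other 4xx errors are not.
  return err.status === 429 || (err.status !== undefined && err.status >= 500);
}

async function sendWithRetry<T>(send: () => Promise<T>): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await send();
    } catch (err) {
      const e = err as SendError;
      if (attempt >= MAX_RETRIES || !isRetryable(e)) throw e;
      // Honor Discord's retry-after header (seconds) when present, but never
      // wait less than the exponential base delay for this attempt.
      const retryAfterMs = Number(e.headers?.["retry-after"]) * 1_000;
      const delayMs = Math.max(
        Number.isNaN(retryAfterMs) ? 0 : retryAfterMs,
        RETRY_BASE_DELAY_MS * (attempt + 1),
      );
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

A caller wraps each send in the helper, e.g. `await sendWithRetry(() => sendMessageDiscord(...))`, so a transient 429/5xx on one chunk no longer aborts the remaining chunks.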

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

Long agent responses split into multiple Discord messages will no longer silently drop remaining chunks when a mid-sequence rate-limit or server error occurs. Chunks are retried up to 2 times with backoff.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No — same Discord API calls, just retried on failure
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: Any
  • Runtime: Node.js 22+

Steps

  1. Configure an agent producing long responses (>10,000 chars)
  2. Response gets chunked into 5+ Discord messages
  3. Simulate Discord rate-limit on chunk 3

Expected

  • All chunks are delivered (with retry delay on chunk 3)

Actual

  • Before: Chunks 4+ silently dropped
  • After: Chunk 3 retries with backoff, chunks 4+ delivered normally

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

4 new vitest unit tests:

  • retries bot sender on 429 rate-limit and succeeds
  • retries bot sender on 500 server error and succeeds
  • throws immediately on non-retryable 4xx errors
  • delivers remaining chunks after a mid-sequence retry

Human Verification (required)

  • Verified scenarios: 429 rate-limit retry, 500 server error retry, 403 immediate throw, multi-chunk delivery with mid-sequence failure
  • Edge cases checked: retry-after header parsing (NaN, 0, valid), all retries exhausted (throws), non-retryable status codes
  • What you did not verify: Live Discord API rate-limit behavior (tested with mocks only)

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Failure Recovery (if this breaks)

  • How to disable/revert: Revert this commit; the original direct call to sendMessageDiscord is restored
  • Files/config to restore: src/discord/monitor/reply-delivery.ts

Risks and Mitigations

  • Risk: Retry delay (up to 2s per attempt) could slow down chunk delivery when Discord is heavily rate-limiting
    • Mitigation: Maximum 2 retries with bounded delay; non-retryable errors still throw immediately. The alternative (dropping chunks) is strictly worse.

The bot sender path in sendDiscordChunkWithFallback had no error handling,
so a 429 or 5xx error on any chunk would drop all remaining chunks. Add
sendWithRetry wrapper with up to 2 retries and retry-after header support.

Closes openclaw#32887

Made-with: Cursor

aisle-research-bot bot commented Mar 3, 2026

🤖 We're reviewing this PR with Aisle

We're running a security check on the changes in this PR now. This usually takes a few minutes. ⌛
We'll post the results here as soon as they're ready.

Progress:

  • Analysis
  • Triage
  • Finalization

Latest run failed. Keeping previous successful results. Trace ID: 019cb348ff30e11fe1c1862d3bbb051a.

Last updated on: 2026-03-03T10:40:42Z


greptile-apps bot commented Mar 3, 2026

Greptile Summary

This PR fixes a real bug where mid-sequence Discord rate-limit or server errors silently dropped all remaining message chunks. The solution — a sendWithRetry() wrapper with exponential backoff and retry-after header support — is well-scoped and correctly handles the retryable vs. non-retryable distinction.

One non-blocking issue worth addressing:

  • Test timers: The three new retry tests do not use vi.useFakeTimers(), so each invokes the real setTimeout with a 1,000 ms base delay, adding ~3 s of wall-clock time to every CI run. The "retry-after": "0.01" header on the 429 test was presumably intended to keep the test fast, but Math.max(retryAfterMs, RETRY_BASE_DELAY_MS * (attempt + 1)) overrides it with the base delay. The existing "touches bound-thread activity" test in this file shows the correct pattern using vi.useFakeTimers().

Confidence Score: 4/5

  • Safe to merge; the core retry logic is correct and the test timing issue is a non-blocking CI performance concern.
  • The retry logic itself is sound: loop bounds, retryable-status detection, and throw-on-non-retryable behavior are all correct. Tests cover the main scenarios (429, 500, 403, multi-chunk). The flagged issue (slow tests from missing fake timers) is a style/performance concern for the test suite, not a functional bug in the implementation.
  • reply-delivery.test.ts — three new retry tests need vi.useFakeTimers() wrappers to eliminate CI slowdown.

Last reviewed commit: e9c287e


@greptile-apps bot left a comment


2 files reviewed, 1 comment


Comment on lines +262 to +332
```typescript
it("retries bot sender on 429 rate-limit and succeeds (#32887)", async () => {
  const rateLimitErr = Object.assign(new Error("rate limited"), {
    status: 429,
    headers: { "retry-after": "0.01" },
  });
  sendMessageDiscordMock
    .mockRejectedValueOnce(rateLimitErr)
    .mockResolvedValueOnce({ messageId: "msg-retry" });

  await deliverDiscordReply({
    replies: [{ text: "hello" }],
    target: "channel:123",
    token: "token",
    runtime,
    textLimit: 2000,
  });

  expect(sendMessageDiscordMock).toHaveBeenCalledTimes(2);
});

it("retries bot sender on 500 server error and succeeds (#32887)", async () => {
  const serverErr = Object.assign(new Error("internal"), { status: 500 });
  sendMessageDiscordMock
    .mockRejectedValueOnce(serverErr)
    .mockResolvedValueOnce({ messageId: "msg-ok" });

  await deliverDiscordReply({
    replies: [{ text: "hello" }],
    target: "channel:123",
    token: "token",
    runtime,
    textLimit: 2000,
  });

  expect(sendMessageDiscordMock).toHaveBeenCalledTimes(2);
});

it("throws immediately on non-retryable 4xx errors (#32887)", async () => {
  const clientErr = Object.assign(new Error("forbidden"), { status: 403 });
  sendMessageDiscordMock.mockRejectedValueOnce(clientErr);

  await expect(
    deliverDiscordReply({
      replies: [{ text: "hello" }],
      target: "channel:123",
      token: "token",
      runtime,
      textLimit: 2000,
    }),
  ).rejects.toThrow("forbidden");

  expect(sendMessageDiscordMock).toHaveBeenCalledTimes(1);
});

it("delivers remaining chunks after a mid-sequence retry (#32887)", async () => {
  sendMessageDiscordMock
    .mockResolvedValueOnce({ messageId: "c1" })
    .mockRejectedValueOnce(Object.assign(new Error("rate limited"), { status: 429 }))
    .mockResolvedValueOnce({ messageId: "c2-retry" })
    .mockResolvedValueOnce({ messageId: "c3" });

  await deliverDiscordReply({
    replies: [{ text: "A".repeat(6) }],
    target: "channel:123",
    token: "token",
    runtime,
    textLimit: 2,
  });

  expect(sendMessageDiscordMock).toHaveBeenCalledTimes(4);
});
```

Three of the four new retry tests invoke real setTimeout delays, each adding ~1 second of wall-clock time to the test suite.

Tests 1, 2, and 4 trigger the retry logic in sendWithRetry() (line 141 of reply-delivery.ts), which calls setTimeout(resolve, delayMs). The computed delay is:

  • 429 test (retry-after: "0.01"): delayMs = Math.max(10, 1_000 * 1) = 1_000 ms
  • 500 test (no header): delayMs = Math.max(0, 1_000) = 1_000 ms
  • Mid-sequence test (no header): delayMs = Math.max(0, 1_000) = 1_000 ms

The existing "touches bound-thread activity" test in this file already demonstrates the pattern: wrap timer-sensitive code in vi.useFakeTimers() / vi.useRealTimers(). Applying the same pattern to the three affected tests eliminates the ~3s slowdown without changing test coverage:
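
The override is easy to reproduce in isolation. This sketch mirrors the delay formula quoted above; `computeDelayMs` is a hypothetical standalone name, not a function in the PR:

```typescript
// Mirrors the delay computation the review quotes from sendWithRetry();
// computeDelayMs is a hypothetical name used only for this illustration.
const RETRY_BASE_DELAY_MS = 1_000;

function computeDelayMs(retryAfterHeader: string | undefined, attempt: number): number {
  // Discord's retry-after header is in seconds; convert to milliseconds.
  const retryAfterMs = Number(retryAfterHeader) * 1_000;
  return Math.max(
    Number.isNaN(retryAfterMs) ? 0 : retryAfterMs,
    RETRY_BASE_DELAY_MS * (attempt + 1),
  );
}

// "retry-after": "0.01" yields 10 ms, but Math.max lets the base delay win:
// computeDelayMs("0.01", 0) === 1_000
```

So the tiny `retry-after` value in the test never shortens the wait; only fake timers avoid the real 1 s delay.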

Suggested change
```typescript
  it("retries bot sender on 429 rate-limit and succeeds (#32887)", async () => {
    vi.useFakeTimers();
    try {
      const rateLimitErr = Object.assign(new Error("rate limited"), {
        status: 429,
        headers: { "retry-after": "0.01" },
      });
      sendMessageDiscordMock
        .mockRejectedValueOnce(rateLimitErr)
        .mockResolvedValueOnce({ messageId: "msg-retry" });

      await deliverDiscordReply({
        replies: [{ text: "hello" }],
        target: "channel:123",
        token: "token",
        runtime,
        textLimit: 2000,
      });

      expect(sendMessageDiscordMock).toHaveBeenCalledTimes(2);
    } finally {
      vi.useRealTimers();
    }
  });

  it("retries bot sender on 500 server error and succeeds (#32887)", async () => {
    vi.useFakeTimers();
    try {
      const serverErr = Object.assign(new Error("internal"), { status: 500 });
      sendMessageDiscordMock
        .mockRejectedValueOnce(serverErr)
        .mockResolvedValueOnce({ messageId: "msg-ok" });

      await deliverDiscordReply({
        replies: [{ text: "hello" }],
        target: "channel:123",
        token: "token",
        runtime,
        textLimit: 2000,
      });

      expect(sendMessageDiscordMock).toHaveBeenCalledTimes(2);
    } finally {
      vi.useRealTimers();
    }
  });

  it("delivers remaining chunks after a mid-sequence retry (#32887)", async () => {
    vi.useFakeTimers();
    try {
      sendMessageDiscordMock
        .mockResolvedValueOnce({ messageId: "c1" })
        .mockRejectedValueOnce(Object.assign(new Error("rate limited"), { status: 429 }))
        .mockResolvedValueOnce({ messageId: "c2-retry" })
        .mockResolvedValueOnce({ messageId: "c3" });

      await deliverDiscordReply({
        replies: [{ text: "A".repeat(6) }],
        target: "channel:123",
        token: "token",
        runtime,
        textLimit: 2,
      });

      expect(sendMessageDiscordMock).toHaveBeenCalledTimes(4);
    } finally {
      vi.useRealTimers();
    }
  });
```


@chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e9c287ed01


Comment on lines +108 to +110
```typescript
await sendWithRetry(() =>
  sendMessageDiscord(params.target, text, {
    token: params.token,
```

P2: Retry per physical send instead of the whole sendMessageDiscord call

Retrying the entire sendMessageDiscord call here is not idempotent and can duplicate output when that function has already sent part of the payload before failing. sendMessageDiscord can emit multiple Discord messages in one call (for example via sendDiscordText chunk loops when maxLinesPerMessage or forum-thread follow-up chunks apply), so if a later internal POST gets a 429/5xx, this wrapper replays earlier successful chunks on the next attempt and users receive duplicate content.
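
The duplication hazard can be illustrated with a minimal sketch (all names hypothetical; `sendAll` stands in for sendMessageDiscord's internal chunk loop, `post` for one physical Discord POST):

```typescript
// Illustrates the non-idempotency flagged above: retrying a whole multi-send
// call replays chunks that were already delivered before the failure.
async function sendAll(
  chunks: string[],
  post: (chunk: string) => Promise<void>,
): Promise<void> {
  for (const chunk of chunks) await post(chunk);
}

async function demo(): Promise<string[]> {
  const delivered: string[] = [];
  let failedOnce = false;
  const post = async (chunk: string) => {
    // Simulate a 429 on the third physical send, first attempt only.
    if (chunk === "c" && !failedOnce) {
      failedOnce = true;
      throw Object.assign(new Error("rate limited"), { status: 429 });
    }
    delivered.push(chunk);
  };

  try {
    await sendAll(["a", "b", "c"], post);
  } catch {
    // Naive whole-call retry: "a" and "b" are sent again.
    await sendAll(["a", "b", "c"], post);
  }
  return delivered; // ["a", "b", "a", "b", "c"] — duplicates
}
```

Retrying inside the loop, around each `post(chunk)` call, would resend only the failed chunk and avoid the duplicates.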


@thewilloftheshadow (Member) commented:

Superseded by #33226 (combined Discord chunk ordering + retry fixes).


Labels

channel: discord · size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(discord): chunked message sends silently drop on rate-limit mid-sequence

2 participants