Skip to content

fix(discord): retry chunk delivery#33226

Merged
thewilloftheshadow merged 1 commit intomainfrom
pick/discord-chunk-retry-order
Mar 3, 2026
Merged

fix(discord): retry chunk delivery#33226
thewilloftheshadow merged 1 commit intomainfrom
pick/discord-chunk-retry-order

Conversation

@thewilloftheshadow
Copy link
Copy Markdown
Member

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: Discord chunk delivery can mis-order or abort on rate-limit/server errors during multi-chunk sends.
  • Why it matters: Long replies can arrive out of order or stop mid-sequence, reducing message reliability.
  • What changed: Pre-resolve channel/retry runner, send chunks via sendDiscordText when possible, and add retry handling for 429/5xx using account-level retry config.
  • What did NOT change (scope boundary): No changes to non-Discord channels or Telegram draft behavior.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

Discord chunked message delivery now retries on 429/5xx errors (using account retry config) and preserves chunk ordering when using a REST client.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: N/A
  • Runtime/container: N/A
  • Model/provider: N/A
  • Integration/channel (if any): Discord
  • Relevant config (redacted): N/A

Steps

  1. N/A

Expected

  • N/A

Actual

  • N/A

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: Not run (not requested).
  • Edge cases checked: Not run.
  • What you did not verify: All runtime behavior.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: Revert this PR.
  • Files/config to restore: src/discord/monitor/reply-delivery.ts, src/discord/monitor/reply-delivery.test.ts.
  • Known bad symptoms reviewers should watch for: Discord chunked replies retrying or arriving out of order.

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

  • Risk: Additional retries could delay delivery under persistent 429/5xx responses.
    • Mitigation: Uses account-level retry config (attempts/delays) to control behavior.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 3, 2026

Greptile Summary

This PR improves Discord chunk delivery reliability by pre-resolving channel ID and RetryRunner once (instead of per-chunk), routing text chunks through sendDiscordText when a RequestClient is available, and wrapping sends in sendWithRetry for 429/5xx handling using account-level retry config.

However, the implementation introduces two logic issues that should be addressed:

  1. Nested retry loops on the fast path (lines 169-175): When channelId, request, and rest are all present, sendDiscordText is wrapped in sendWithRetry, but sendDiscordText itself already uses the request RetryRunner to wrap each rest.post() call. With defaults of 3 attempts each, a persistent 429 could result in up to 9 total API calls rather than the configured 3, violating the account-level retry config.

  2. Inconsistent retry coverage (lines 349-357): The primary (first) media attachment send is not wrapped in sendWithRetry, while sendAdditionalDiscordMedia wraps subsequent attachments. Both perform semantically identical operations but receive different error handling, which could lead to unpredictable behavior under transient failures.

Confidence Score: 3/5

  • Mostly safe to merge with low risk to existing behavior, but nested retry and inconsistent retry coverage are architectural concerns that could cause unexpected delays or asymmetric error handling under rate-limit or server-error conditions.
  • The core improvements (pre-resolving channel ID and retry runner, ordering guarantees) are sound and well-tested on the fallback path. However, the fast path with nested retry loops can silently exceed configured retry limits under persistent 429 conditions, and the primary media send lacks consistent retry handling compared to additional media items. These issues only manifest under error conditions, so impact may be limited in practice, but they prevent a higher confidence without resolution.
  • src/discord/monitor/reply-delivery.ts — specifically the sendDiscordChunkWithFallback fast path (lines 169–175) and the primary media send (lines 349–357)

Last reviewed commit: badf91c

Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +169 to +175
if (params.channelId && params.request && params.rest) {
await sendWithRetry(
() => sendDiscordText(params.rest, params.channelId, text, params.replyTo, params.request),
params.retryConfig,
);
return;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nested retry loops can multiply attempts beyond configured limits

When using the fast path (channelId && request && rest), sendDiscordText is wrapped in sendWithRetry, but sendDiscordText also internally uses the request RetryRunner (created via createDiscordRetryRunner) to wrap each rest.post() call. The inner request runner retries on RateLimitError (status 429), and when it exhausts its attempts and throws, the outer sendWithRetry sees a 429-status error and retries the entire sendDiscordText call again.

With defaults of 3 attempts each, a persistent 429 could result in 3 × 3 = 9 total REST calls to Discord, rather than the intended 3. This violates the account-level retry config and can cause unexpectedly long delivery delays.

Since sendDiscordText already handles chunking internally and accepts a request runner for retrying at the REST-call level, consider either:

  • Dropping the outer sendWithRetry on the fast path (since request already retries)
  • Or stripping the inner request runner and relying solely on the outer sendWithRetry
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/discord/monitor/reply-delivery.ts
Line: 169-175

Comment:
Nested retry loops can multiply attempts beyond configured limits

When using the fast path (`channelId && request && rest`), `sendDiscordText` is wrapped in `sendWithRetry`, but `sendDiscordText` also internally uses the `request` `RetryRunner` (created via `createDiscordRetryRunner`) to wrap each `rest.post()` call. The inner `request` runner retries on `RateLimitError` (status 429), and when it exhausts its attempts and throws, the outer `sendWithRetry` sees a 429-status error and retries the entire `sendDiscordText` call again.

With defaults of 3 attempts each, a persistent 429 could result in `3 × 3 = 9` total REST calls to Discord, rather than the intended 3. This violates the account-level retry config and can cause unexpectedly long delivery delays.

Since `sendDiscordText` already handles chunking internally and accepts a `request` runner for retrying at the REST-call level, consider either:
- Dropping the outer `sendWithRetry` on the fast path (since `request` already retries)
- Or stripping the inner `request` runner and relying solely on the outer `sendWithRetry`

How can I resolve this? If you propose a fix, please make it concise.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 3, 2026

Additional Comments (1)

src/discord/monitor/reply-delivery.ts
Primary media send missing retry coverage

sendAdditionalDiscordMedia (subsequent media items) is now wrapped in sendWithRetry via the updated helper, but this primary (first) media attachment send is not. A transient 429 or 5xx here will fail immediately without retrying, while the same error on subsequent attachments would be retried. For consistency, this call should also be wrapped in sendWithRetry(…, params.retryConfig).

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/discord/monitor/reply-delivery.ts
Line: 349-357

Comment:
Primary media send missing retry coverage

`sendAdditionalDiscordMedia` (subsequent media items) is now wrapped in `sendWithRetry` via the updated helper, but this primary (first) media attachment send is not. A transient 429 or 5xx here will fail immediately without retrying, while the same error on subsequent attachments would be retried. For consistency, this call should also be wrapped in `sendWithRetry(…, params.retryConfig)`.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: badf91cc47

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +169 to +173
if (params.channelId && params.request && params.rest) {
await sendWithRetry(
() => sendDiscordText(params.rest, params.channelId, text, params.replyTo, params.request),
params.retryConfig,
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve outbound activity tracking on direct chunk sends

This fast-path returns after sendDiscordText when rest and a channelId are present, bypassing sendMessageDiscord, which is currently where Discord outbound activity is recorded. In the monitor flow (which passes rest and channel targets), text-only replies will no longer update lastOutboundAt, so channel status endpoints can report stale outbound health despite successful sends. Please record outbound activity in this branch or route through a shared sender that preserves that side effect.

Useful? React with 👍 / 👎.

@thewilloftheshadow thewilloftheshadow force-pushed the pick/discord-chunk-retry-order branch from badf91c to e847319 Compare March 3, 2026 16:17
@thewilloftheshadow thewilloftheshadow merged commit 6593a57 into main Mar 3, 2026
3 checks passed
@thewilloftheshadow thewilloftheshadow deleted the pick/discord-chunk-retry-order branch March 3, 2026 16:17
@thewilloftheshadow
Copy link
Copy Markdown
Member Author

Landed via temp rebase onto main.

  • Gate: pnpm lint && pnpm build && pnpm test src/discord/monitor/reply-delivery.test.ts
  • Land commit: e847319
  • Merge commit: 6593a57

Thanks @thewilloftheshadow!