Skip to content

feat(mattermost): add retry logic and timeout handling for DM channel creation#42398

Merged
mukhtharcm merged 8 commits intoopenclaw:mainfrom
JonathanJing:feat/mattermost-dm-retry-logic
Mar 17, 2026
Merged

feat(mattermost): add retry logic and timeout handling for DM channel creation#42398
mukhtharcm merged 8 commits intoopenclaw:mainfrom
JonathanJing:feat/mattermost-dm-retry-logic

Conversation

@JonathanJing
Copy link
Copy Markdown
Contributor

@JonathanJing JonathanJing commented Mar 10, 2026

Summary

Adds exponential backoff retry mechanism for Mattermost DM channel creation to handle transient failures (429, 5xx, network errors).

Problem

PR #29925 introduced user-first resolution for DM media uploads, but the initial DM channel creation has no retry mechanism. If the first attempt fails due to transient errors (rate limiting, temporary server issues), the send operation fails immediately.

Solution

  • Add createMattermostDirectChannelWithRetry() with configurable retry options
  • Support maxRetries, initialDelayMs, maxDelayMs, timeoutMs parameters
  • Add dmChannelRetry config option to Mattermost account schema
  • Allow per-send override via dmRetryOptions in MattermostSendOpts
  • Add comprehensive tests for retry logic and error handling

Retry Behavior

  • Retry on: 429 rate limits, 5xx server errors, network/transient errors
  • Don't retry on: 4xx client errors (except 429)
  • Backoff: Exponential with jitter, capped at maxDelayMs
  • Timeout: Per-request timeout via AbortController

Configuration Example

channels:
  mattermost:
    dmChannelRetry:
      maxRetries: 5
      initialDelayMs: 1000
      maxDelayMs: 10000
      timeoutMs: 30000

Testing

  • 10 new unit tests for retry logic
  • 3 new integration tests for config/options handling
  • All 223 Mattermost tests pass

Review Status

Greptile Review: All 4 issues identified in the initial review have been addressed in commit 8a23919:

  • timeoutMs now properly passes AbortSignal to fetch
  • isRetryableError no longer has false positives on 4xx errors
  • onRetry callback is preserved and called alongside logging
  • ✅ Zod schema validates initialDelayMs <= maxDelayMs

Closes the gap identified in PR #29925.

@openclaw-barnacle openclaw-barnacle bot added channel: mattermost Channel integration: mattermost size: L labels Mar 10, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 10, 2026

Greptile Summary

This PR adds createMattermostDirectChannelWithRetry() with exponential backoff (full-jitter), per-request AbortController-based timeouts, and configurable retry options at both the account config level (dmChannelRetry in YAML) and per-call level (dmRetryOptions in MattermostSendOpts). It addresses the gap noted in PR #29925 where DM channel creation had no retry mechanism.

All nine issues raised in the previous review rounds have been resolved:

  • AbortSignal is now threaded through createMattermostDirectChannel to the underlying fetch
  • onRetry callback is now preserved via a wrapper that delegates to the caller's callback before logging
  • initialDelayMs <= maxDelayMs cross-field validation is enforced by Zod .refine()
  • isRetryableError checks 5xx before 4xx and anchors patterns to "mattermost api" prefix, preventing port numbers and IP octets from triggering false classifications
  • Jitter is now proportional to exponentialDelay rather than a fixed 1000 ms
  • MattermostAccountConfig in types.ts now includes dmChannelRetry
  • The timeout test was rewritten to actually exercise the AbortController abort path

One design point from the previous round (full replacement via ?? rather than field-level merging of opts.dmRetryOptions with accountRetryConfig) was noted but appears to be an intentional design decision — callers who provide any dmRetryOptions are expected to supply all relevant fields they want to control.

Confidence Score: 4/5

  • This PR is safe to merge — all previously flagged issues have been addressed and the retry/timeout logic is correctly implemented.
  • All nine issues from prior review rounds are resolved: AbortSignal is forwarded, onRetry is preserved, jitter is proportional, isRetryableError ordering and anchoring are correct, cross-field Zod validation is present, and the type declaration is in sync. Tests are comprehensive and cover the corrected edge cases. The one remaining design point (full replacement via ?? rather than field-level merge) was already flagged in a previous thread and appears intentional. No new critical issues found.
  • No files require special attention — the key logic in client.ts and send.ts looks correct after all the fixes.

Last reviewed commit: eaaf224

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c177c1dee8

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@JonathanJing
Copy link
Copy Markdown
Contributor Author

@greptile-apps Thanks for the thorough review! All four issues have been addressed in commit 8a23919:

  1. timeoutMs non-functional ✅ Fixed - controller.signal is now passed through createMattermostDirectChannel to the underlying fetch

  2. isRetryableError false positives ✅ Fixed - Added explicit 4xx check before network pattern matching. Any message containing a 4xx status code (except 429) returns false before reaching the pattern list

  3. onRetry silently dropped ✅ Fixed - Now calls both user's callback and logger (if verbose): params.dmRetryOptions?.onRetry?.(attempt, delayMs, error) then logger

  4. initialDelayMs > maxDelayMs ✅ Fixed - Added Zod .refine() with clear error message

Also added 3 new tests to verify the 4xx false positive fixes and AbortSignal propagation.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 10, 2026

Perfect! All four issues are properly fixed:

1. timeoutMs ✅ - AbortSignal is now correctly passed through the entire chain (client.ts:188-193, 169-173)

2. isRetryableError false positives ✅ - Two layers of protection:

  • Explicit 4xx check before pattern matching (client.ts:217-225)
  • Additional safeguard that rejects any message with a 3-digit HTTP status before checking network patterns (client.ts:232-236)

3. onRetry callback ✅ - User callback is preserved and called before logging (send.ts:176-177)

4. initialDelayMs validation ✅ - Zod refinement with clear error message (config-schema.ts:17-25)

The new tests comprehensively cover the edge cases. Nice work!

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8a2391953e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@JonathanJing
Copy link
Copy Markdown
Contributor Author

Hi maintainers, Greptile's review was on an older commit (c177c1d). All issues have been fixed in the latest commit (8a23919). Could someone please re-trigger Greptile or do a manual review? Thanks!

@aerelune
Copy link
Copy Markdown
Contributor

关闭原因:发现多个严重 bug 需要重新设计

  1. timeoutMs 功能完全不工作 - AbortController signal 未传递给 fetch
  2. isRetryableError 逻辑错误 - 会错误重试包含关键词的 4xx 错误
  3. onRetry 回调在非 verbose 模式下被静默丢弃
  4. Schema 允许无效配置(initialDelayMs > maxDelayMs)

请修复后重新提交。

@aerelune
Copy link
Copy Markdown
Contributor

🔴 发现严重问题,建议关闭此 PR

关键 Bug:

  1. timeoutMs 功能完全不工作

    • AbortController 的 signal 没有传递给 fetch 调用
  2. isRetryableError 逻辑错误

    • 会错误地重试包含 "timeout" 或 "abort" 关键词的 4xx 错误
  3. onRetry 回调被静默丢弃

    • 关闭 verbose 日志后,回调不会执行
  4. Schema 验证缺失

    • 允许 initialDelayMs > maxDelayMs 这种无效配置

建议重新设计后再提交。

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a85603cbee

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9479396f7b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@JonathanJing
Copy link
Copy Markdown
Contributor Author

@greptile review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eaaf224cd6

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: df9cf76f92

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@mukhtharcm mukhtharcm force-pushed the feat/mattermost-dm-retry-logic branch from df9cf76 to a3149db Compare March 17, 2026 06:43
@openclaw-barnacle openclaw-barnacle bot added the docs Improvements or additions to documentation label Mar 17, 2026
@mukhtharcm mukhtharcm force-pushed the feat/mattermost-dm-retry-logic branch from a3149db to 540a380 Compare March 17, 2026 16:45
JonathanJing and others added 8 commits March 17, 2026 16:45
… creation

Adds exponential backoff retry mechanism for Mattermost DM channel creation
to handle transient failures (429, 5xx, network errors).

Changes:
- Add createMattermostDirectChannelWithRetry() with configurable retry options
- Support maxRetries, initialDelayMs, maxDelayMs, timeoutMs parameters
- Add dmChannelRetry config option to Mattermost account schema
- Allow per-send override via dmRetryOptions in MattermostSendOpts
- Add comprehensive tests for retry logic and error handling
- Retry on: 429 rate limits, 5xx server errors, network/transient errors
- Don't retry on: 4xx client errors (except 429)

Fixes the gap identified in PR openclaw#29925 where initial DM channel creation
failures had no retry mechanism.
- Fix timeoutMs: pass AbortSignal to createMattermostDirectChannel and fetch
- Fix isRetryableError false positives: check for explicit 4xx status codes
  before falling back to network pattern matching
- Fix onRetry callback: preserve user's callback while adding logging
- Add Zod schema refinement: validate initialDelayMs <= maxDelayMs
- Add tests for 4xx false positive cases and AbortSignal propagation
…and jitter

- Fix isRetryableError: check 5xx BEFORE 4xx to prevent misclassification
  when 5xx error detail contains 4xx substring (e.g., '503: upstream 404')
- Fix jitter: use proportional jitter (full-jitter pattern) instead of
  hardcoded 1000ms. Jitter is now 0-100% of exponential delay
- Update tests to reflect new jitter behavior
- Add test for 5xx with 4xx substring in error message
Add missing dmChannelRetry field to TypeScript type definition in types.ts
to match the Zod schema in config-schema.ts. Fixes type checking error
when accessing account.config.dmChannelRetry.
- Fix isRetryableError: use 'mattermost api NNN' prefix pattern instead of
  \b\d{3}\b to avoid matching port numbers (e.g., :443) and IP octets
- Fix timeout test: properly verify AbortController aborts the fetch by
  listening to the abort event on the signal
- Add test for port 443 connection errors to verify no false 4xx classification
- Update error messages in tests to use 'Mattermost API NNN' format
- Fix isRetryableError: check 'mattermost api 429' status code BEFORE
  generic '429' text match to avoid false positives
- Error messages containing '429' in detail (e.g., 'Invalid ID: 4294967295')
  are no longer incorrectly treated as rate limiting
- Add test for 400 error containing '429' text to verify no false retry
@mukhtharcm mukhtharcm force-pushed the feat/mattermost-dm-retry-logic branch from 540a380 to 3db47be Compare March 17, 2026 16:46
@mukhtharcm mukhtharcm merged commit 2145eb5 into openclaw:main Mar 17, 2026
7 checks passed
@mukhtharcm
Copy link
Copy Markdown
Member

Merged via squash.

Thanks @JonathanJing!

mrosmarin added a commit to mrosmarin/openclaw that referenced this pull request Mar 17, 2026
* main: (142 commits)
  fix(zalouser): fix setup-only onboarding flow (openclaw#49219)
  CI: add built plugin singleton smoke (openclaw#48710)
  update contributing focus areas
  docs(providers): clarify provider capabilities vs public capability model
  docs(refactor): align plugin SDK plan with public capability model
  docs(cli): add plugins inspect command reference
  docs(plugins): document public capability model, plugin shapes, and inspection
  Plugins: internalize diagnostics OTel imports
  Plugins: internalize diffs SDK imports
  Plugins: internalize more extension SDK imports
  Plugins: add local extension API barrels
  Plugins: add inspect matrix and trim export
  Plugins: add inspect command and capability report
  fix(telegram): unify transport fallback chain (openclaw#49148)
  Plugins: add binding resolution callbacks (openclaw#48678)
  fix(gateway): clear trusted-proxy control ui scopes
  refactor: narrow extension public seams
  test: stabilize memory async search close
  docs(hooks): clarify trust model and audit guidance
  feat(mattermost): add retry logic and timeout handling for DM channel creation (openclaw#42398)
  ...
nikolaisid pushed a commit to nikolaisid/openclaw that referenced this pull request Mar 18, 2026
… creation (openclaw#42398)

Merged via squash.

Prepared head SHA: 3db47be
Co-authored-by: JonathanJing <[email protected]>
Co-authored-by: mukhtharcm <[email protected]>
Reviewed-by: @mukhtharcm
brandontyler pushed a commit to brandontyler/clawdbot that referenced this pull request Mar 19, 2026
… creation (openclaw#42398)

Merged via squash.

Prepared head SHA: 3db47be
Co-authored-by: JonathanJing <[email protected]>
Co-authored-by: mukhtharcm <[email protected]>
Reviewed-by: @mukhtharcm
pholpaphankorn pushed a commit to pholpaphankorn/openclaw that referenced this pull request Mar 22, 2026
… creation (openclaw#42398)

Merged via squash.

Prepared head SHA: 3db47be
Co-authored-by: JonathanJing <[email protected]>
Co-authored-by: mukhtharcm <[email protected]>
Reviewed-by: @mukhtharcm
alexey-pelykh pushed a commit to remoteclaw/remoteclaw that referenced this pull request Mar 23, 2026
… creation (openclaw#42398)

Merged via squash.

Prepared head SHA: 3db47be
Co-authored-by: JonathanJing <[email protected]>
Co-authored-by: mukhtharcm <[email protected]>
Reviewed-by: @mukhtharcm

(cherry picked from commit 2145eb5)
alexey-pelykh pushed a commit to remoteclaw/remoteclaw that referenced this pull request Mar 23, 2026
… creation (openclaw#42398)

Merged via squash.

Prepared head SHA: 3db47be
Co-authored-by: JonathanJing <[email protected]>
Co-authored-by: mukhtharcm <[email protected]>
Reviewed-by: @mukhtharcm

(cherry picked from commit 2145eb5)
Interstellar-code pushed a commit to Interstellar-code/operator1 that referenced this pull request Mar 24, 2026
… creation (openclaw#42398)

Merged via squash.

Prepared head SHA: 3db47be
Co-authored-by: JonathanJing <[email protected]>
Co-authored-by: mukhtharcm <[email protected]>
Reviewed-by: @mukhtharcm

(cherry picked from commit 2145eb5)
alexey-pelykh pushed a commit to remoteclaw/remoteclaw that referenced this pull request Mar 26, 2026
… creation (openclaw#42398)

Merged via squash.

Prepared head SHA: 3db47be
Co-authored-by: JonathanJing <[email protected]>
Co-authored-by: mukhtharcm <[email protected]>
Reviewed-by: @mukhtharcm

(cherry picked from commit 2145eb5)
alexey-pelykh pushed a commit to remoteclaw/remoteclaw that referenced this pull request Mar 26, 2026
… creation (openclaw#42398)

Merged via squash.

Prepared head SHA: 3db47be
Co-authored-by: JonathanJing <[email protected]>
Co-authored-by: mukhtharcm <[email protected]>
Reviewed-by: @mukhtharcm

(cherry picked from commit 2145eb5)
alexey-pelykh pushed a commit to remoteclaw/remoteclaw that referenced this pull request Mar 26, 2026
… creation (openclaw#42398)

Merged via squash.

Prepared head SHA: 3db47be
Co-authored-by: JonathanJing <[email protected]>
Co-authored-by: mukhtharcm <[email protected]>
Reviewed-by: @mukhtharcm

(cherry picked from commit 2145eb5)
ralyodio pushed a commit to ralyodio/openclaw that referenced this pull request Apr 3, 2026
… creation (openclaw#42398)

Merged via squash.

Prepared head SHA: 3db47be
Co-authored-by: JonathanJing <[email protected]>
Co-authored-by: mukhtharcm <[email protected]>
Reviewed-by: @mukhtharcm
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

channel: mattermost Channel integration: mattermost docs Improvements or additions to documentation size: L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants