Skip to content

fix: register BlueBubbles webhook route before server-info fetch#45664

Open
techfitmaster wants to merge 4 commits intoopenclaw:mainfrom
techfitmaster:fix/issue-45620-v2
Open

fix: register BlueBubbles webhook route before server-info fetch#45664
techfitmaster wants to merge 4 commits intoopenclaw:mainfrom
techfitmaster:fix/issue-45620-v2

Conversation

@techfitmaster
Copy link
Copy Markdown

Summary

monitorBlueBubblesProvider was calling fetchBlueBubblesServerInfo (up to 5 s timeout) before registering the webhook HTTP route via registerBlueBubblesWebhookTarget. This created a gap on every startup and after every crash-loop restart where the plugin HTTP registry had no entry for /bluebubbles-webhook, so the request fell through all gateway stages and returned 404 Not Found.

Changes

  • extensions/bluebubbles/src/monitor.ts — Move registerBlueBubblesWebhookTarget to the top of monitorBlueBubblesProvider, before the async fetchBlueBubblesServerInfo call. The webhook route is now reachable immediately when the provider starts or restarts; server info is still fetched and cached for macOS version detection.

  • src/gateway/server-channels.ts — Fix off-by-one in crash-loop restart accounting: when the loop hit MAX_RESTART_ATTEMPTS + 1 it was setting reconnectAttempts = MAX + 1 (e.g. 11) instead of MAX (10). Now caps the reported value at MAX_RESTART_ATTEMPTS.

Testing

  • extensions/bluebubbles/src/monitor.test.ts — 60 tests pass (no regressions in webhook parsing, auth, and routing)
  • extensions/bluebubbles/src/monitor.webhook-route.test.ts — 1 test passes
  • extensions/bluebubbles/src/monitor.webhook-auth.test.ts — 16 tests pass
  • src/gateway/server-channels.test.ts — all 4 tests pass (including the previously-failing "caps crash-loop restarts after max attempts")
  • src/gateway/server.plugin-http-auth.test.ts — 21 tests pass
  • src/gateway/control-ui-routing.test.ts — 16 tests pass

Fixes #45620

Dresing and others added 4 commits March 8, 2026 12:37
…cation

Slack sends url_verification challenges WITHOUT a signature (by design).
The HTTPReceiver was rejecting these requests because it verifies
signatures on ALL requests, but url_verification is never signed.

This fix intercepts incoming HTTP requests BEFORE they reach the
receiver's signature verification:
1. Read the request body first
2. Check if it's a url_verification request
3. If yes, respond directly with the challenge (no signature needed)
4. If no, replay the body to the receiver using a PassThrough stream

This allows HTTP webhook mode to complete the Slack handshake and
activate properly, fixing the regression in v2026.3.2.

Closes openclaw#39432
Handle file lock errors gracefully when .openclaw directory is synced
via cloud storage services (OneDrive, Dropbox, Google Drive, etc.).
Retry up to 3 times with exponential backoff before failing.

Fixes openclaw#39446
fix: add retry logic for file lock errors (EBUSY, EACCES, EPERM)
Previously, monitorBlueBubblesProvider called fetchBlueBubblesServerInfo
(up to 5 s timeout) *before* registering the webhook HTTP route.  This
created a window on every startup and after every crash-loop restart
where POST /bluebubbles-webhook returned 404: the plugin HTTP registry
had no matching entry yet so the request fell all the way through the
gateway's stage pipeline.

Fix: move registerBlueBubblesWebhookTarget to the top of the function,
before the async server-info fetch, so the route is reachable as soon
as the provider (re)starts.

Also fix an off-by-one in server-channels restart accounting: when the
crash-loop hit MAX_RESTART_ATTEMPTS + 1 it was reporting
reconnectAttempts = MAX + 1 instead of MAX, causing the snapshot to
show one extra attempt than the configured cap.

Fixes openclaw#45620
@openclaw-barnacle openclaw-barnacle bot added channel: bluebubbles Channel integration: bluebubbles channel: slack Channel integration: slack gateway Gateway runtime size: M labels Mar 14, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c35ee4233e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

// Slack sends url_verification WITHOUT a signature (by design).
// We must intercept this BEFORE the receiver's signature verification.
// Read the body first to check the request type.
const body = await readRequestBody(req);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Enforce Slack body limits before reading request payload

The handler now calls readRequestBody(req) before installRequestBodyLimitGuard(...), so the webhook body is fully buffered without any size or timeout protection. A client can send a large/slow chunked request without Content-Length and bypass SLACK_WEBHOOK_MAX_BODY_BYTES and SLACK_WEBHOOK_BODY_TIMEOUT_MS, which means the limit guard only runs after the expensive read has already happened.

Useful? React with 👍 / 👎.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 14, 2026

Greptile Summary

This PR fixes a startup race condition in BlueBubbles (webhook route registered before server-info fetch), an off-by-one in crash-loop restart accounting, adds file-write retry logic for lock errors, and intercepts Slack's url_verification challenge before Bolt's signature-verification middleware.

The first three changes are clean and correct. The Slack url_verification intercept is the right approach conceptually, but the implementation introduces two issues worth addressing before merging:

  • Body read without limits (src/slack/monitor/provider.ts line 236): readRequestBody(req) buffers the entire request body from the network with no byte-size cap and no timeout. The installRequestBodyLimitGuard is only applied to the PassThrough replay (data already in memory), so the 1 MB cap and 30 s timeout offer no protection against oversized or slow-drip payloads. readRequestBodyWithLimit (already exported from src/infra/http-body.ts) should be used here instead.
  • Incomplete stream proxy (src/slack/monitor/provider.ts lines 258–268): mockReq inherits _readableState and derived getters (readableEnded, readable, readableLength) from the already-consumed original req. If @slack/bolt's HTTPReceiver inspects these properties before attaching its 'data' listener it may skip body reading silently.
  • Unreachable dead code (src/infra/json-files.ts line 80): throw lastError after the retry loop is never reached; the loop always exits via return or an in-catch throw.

Confidence Score: 2/5

  • Merging as-is exposes the Slack HTTP endpoint to memory exhaustion and slow-client DoS due to the unbounded body read.
  • The BlueBubbles, server-channels, and json-files changes are solid. However, the new readRequestBody helper in the Slack provider reads the full network body without any size or timeout guard, making the existing SLACK_WEBHOOK_MAX_BODY_BYTES / SLACK_WEBHOOK_BODY_TIMEOUT_MS limits ineffective during the actual network read. Additionally, the mockReq object inherits stream-state properties from the consumed original request, which is fragile and could silently fail with future Bolt versions.
  • src/slack/monitor/provider.ts requires attention for the unbounded body read and incomplete stream proxy.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/slack/monitor/provider.ts
Line: 236

Comment:
**`readRequestBody` bypasses all size and timeout guards**

`readRequestBody(req)` buffers the entire network body into memory with no byte-size limit and no timeout. The `SLACK_WEBHOOK_MAX_BODY_BYTES` (1 MB) and `SLACK_WEBHOOK_BODY_TIMEOUT_MS` (30 s) protections are only applied to the `PassThrough` replay on `mockReq`, which already holds the pre-buffered data by the time `installRequestBodyLimitGuard` runs. This means:

1. **Memory exhaustion**: An attacker can send an arbitrarily large body that is fully buffered before any size check occurs. The guard will eventually trip and send a 413, but the damage (heap spike) has already happened.
2. **Timeout bypass**: A slow/stalling client can hold the HTTP connection open indefinitely during `readRequestBody`. The 30 s timeout on `mockReq` fires almost immediately (the PassThrough stream drains instantly), so it offers no protection against sluggish senders.

The pre-read is necessary to inspect the `type` field before Bolt's signature verification runs. The fix is to apply the limit guard *during* the initial body read instead of re-applying it to the replay stream:

```typescript
// Read with the same limits as the guard, then decide what to do
let rawBody: string;
try {
  rawBody = await readRequestBodyWithLimit(req, {
    maxBytes: SLACK_WEBHOOK_MAX_BODY_BYTES,
    timeoutMs: SLACK_WEBHOOK_BODY_TIMEOUT_MS,
  });
} catch (err) {
  if (isRequestBodyLimitError(err)) {
    // send the appropriate 4xx and return
    ...
  }
  throw err;
}
```

(Both `readRequestBodyWithLimit` and `isRequestBodyLimitError` are already exported from `../../infra/http-body.js`.)

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/slack/monitor/provider.ts
Line: 258-268

Comment:
**Incomplete stream proxy may silently break with Bolt upgrades**

`mockReq` overrides `on`, `read`, `removeListener`, and `emit` to delegate to `passThrough`, but all other `Readable`/`IncomingMessage` stream properties and methods still resolve via prototype to the original, already-consumed `req`. Notably:

- **`mockReq.readableEnded`** (a getter that reads `this._readableState.endEmitted`) will return `true` because `req._readableState.endEmitted` is `true` after `readRequestBody` drained the socket. If `@slack/bolt`'s `HTTPReceiver` or a middleware (e.g., `raw-body`) checks `req.readableEnded` before attaching a `'data'` listener, it may treat the body as already consumed and skip reading — even though `passThrough` still has the buffered data.
- `readable`, `readableLength`, `readableHighWaterMark`, `_readableState`, `resume()`, `pause()`, and `destroy()` all inherit from `req`, not from `passThrough`.

A safer approach is to construct `mockReq` from `passThrough` directly and copy only the HTTP-level properties from `req`, or use an explicit wrapper class. This avoids cross-stream property leakage and makes the proxy contract explicit.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/infra/json-files.ts
Line: 80

Comment:
**Unreachable `throw lastError` after retry loop**

The `for` loop always exits via either `return` (on success) or `throw err` (inside the `catch` block, on non-retryable errors or the final attempt). It never exhausts the iteration counter without throwing, so this `throw lastError` is dead code. Additionally, `lastError` is typed as `Error | undefined`, so if this line were somehow reached it would throw `undefined`.

Consider removing the trailing `throw` or replacing the loop with a pattern that makes the termination condition more explicit:

```suggestion
  }
  // loop always returns or throws above; this line is unreachable
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: c35ee42

// Slack sends url_verification WITHOUT a signature (by design).
// We must intercept this BEFORE the receiver's signature verification.
// Read the body first to check the request type.
const body = await readRequestBody(req);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readRequestBody bypasses all size and timeout guards

readRequestBody(req) buffers the entire network body into memory with no byte-size limit and no timeout. The SLACK_WEBHOOK_MAX_BODY_BYTES (1 MB) and SLACK_WEBHOOK_BODY_TIMEOUT_MS (30 s) protections are only applied to the PassThrough replay on mockReq, which already holds the pre-buffered data by the time installRequestBodyLimitGuard runs. This means:

  1. Memory exhaustion: An attacker can send an arbitrarily large body that is fully buffered before any size check occurs. The guard will eventually trip and send a 413, but the damage (heap spike) has already happened.
  2. Timeout bypass: A slow/stalling client can hold the HTTP connection open indefinitely during readRequestBody. The 30 s timeout on mockReq fires almost immediately (the PassThrough stream drains instantly), so it offers no protection against sluggish senders.

The pre-read is necessary to inspect the type field before Bolt's signature verification runs. The fix is to apply the limit guard during the initial body read instead of re-applying it to the replay stream:

// Read with the same limits as the guard, then decide what to do
let rawBody: string;
try {
  rawBody = await readRequestBodyWithLimit(req, {
    maxBytes: SLACK_WEBHOOK_MAX_BODY_BYTES,
    timeoutMs: SLACK_WEBHOOK_BODY_TIMEOUT_MS,
  });
} catch (err) {
  if (isRequestBodyLimitError(err)) {
    // send the appropriate 4xx and return
    ...
  }
  throw err;
}

(Both readRequestBodyWithLimit and isRequestBodyLimitError are already exported from ../../infra/http-body.js.)

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/slack/monitor/provider.ts
Line: 236

Comment:
**`readRequestBody` bypasses all size and timeout guards**

`readRequestBody(req)` buffers the entire network body into memory with no byte-size limit and no timeout. The `SLACK_WEBHOOK_MAX_BODY_BYTES` (1 MB) and `SLACK_WEBHOOK_BODY_TIMEOUT_MS` (30 s) protections are only applied to the `PassThrough` replay on `mockReq`, which already holds the pre-buffered data by the time `installRequestBodyLimitGuard` runs. This means:

1. **Memory exhaustion**: An attacker can send an arbitrarily large body that is fully buffered before any size check occurs. The guard will eventually trip and send a 413, but the damage (heap spike) has already happened.
2. **Timeout bypass**: A slow/stalling client can hold the HTTP connection open indefinitely during `readRequestBody`. The 30 s timeout on `mockReq` fires almost immediately (the PassThrough stream drains instantly), so it offers no protection against sluggish senders.

The pre-read is necessary to inspect the `type` field before Bolt's signature verification runs. The fix is to apply the limit guard *during* the initial body read instead of re-applying it to the replay stream:

```typescript
// Read with the same limits as the guard, then decide what to do
let rawBody: string;
try {
  rawBody = await readRequestBodyWithLimit(req, {
    maxBytes: SLACK_WEBHOOK_MAX_BODY_BYTES,
    timeoutMs: SLACK_WEBHOOK_BODY_TIMEOUT_MS,
  });
} catch (err) {
  if (isRequestBodyLimitError(err)) {
    // send the appropriate 4xx and return
    ...
  }
  throw err;
}
```

(Both `readRequestBodyWithLimit` and `isRequestBodyLimitError` are already exported from `../../infra/http-body.js`.)

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +258 to +268
const mockReq = Object.create(req);
mockReq.headers = { ...req.headers };
mockReq.method = req.method;
mockReq.url = req.url;
mockReq.socket = req.socket;

// Replace the stream with our pre-loaded one
mockReq.read = passThrough.read.bind(passThrough);
mockReq.on = passThrough.on.bind(passThrough);
mockReq.removeListener = passThrough.removeListener.bind(passThrough);
mockReq.emit = passThrough.emit.bind(passThrough);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete stream proxy may silently break with Bolt upgrades

mockReq overrides on, read, removeListener, and emit to delegate to passThrough, but all other Readable/IncomingMessage stream properties and methods still resolve via prototype to the original, already-consumed req. Notably:

  • mockReq.readableEnded (a getter that reads this._readableState.endEmitted) will return true because req._readableState.endEmitted is true after readRequestBody drained the socket. If @slack/bolt's HTTPReceiver or a middleware (e.g., raw-body) checks req.readableEnded before attaching a 'data' listener, it may treat the body as already consumed and skip reading — even though passThrough still has the buffered data.
  • readable, readableLength, readableHighWaterMark, _readableState, resume(), pause(), and destroy() all inherit from req, not from passThrough.

A safer approach is to construct mockReq from passThrough directly and copy only the HTTP-level properties from req, or use an explicit wrapper class. This avoids cross-stream property leakage and makes the proxy contract explicit.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/slack/monitor/provider.ts
Line: 258-268

Comment:
**Incomplete stream proxy may silently break with Bolt upgrades**

`mockReq` overrides `on`, `read`, `removeListener`, and `emit` to delegate to `passThrough`, but all other `Readable`/`IncomingMessage` stream properties and methods still resolve via prototype to the original, already-consumed `req`. Notably:

- **`mockReq.readableEnded`** (a getter that reads `this._readableState.endEmitted`) will return `true` because `req._readableState.endEmitted` is `true` after `readRequestBody` drained the socket. If `@slack/bolt`'s `HTTPReceiver` or a middleware (e.g., `raw-body`) checks `req.readableEnded` before attaching a `'data'` listener, it may treat the body as already consumed and skip reading — even though `passThrough` still has the buffered data.
- `readable`, `readableLength`, `readableHighWaterMark`, `_readableState`, `resume()`, `pause()`, and `destroy()` all inherit from `req`, not from `passThrough`.

A safer approach is to construct `mockReq` from `passThrough` directly and copy only the HTTP-level properties from `req`, or use an explicit wrapper class. This avoids cross-stream property leakage and makes the proxy contract explicit.

How can I resolve this? If you propose a fix, please make it concise.

} finally {
await fs.rm(tmp, { force: true }).catch(() => undefined);
}
throw lastError;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unreachable throw lastError after retry loop

The for loop always exits via either return (on success) or throw err (inside the catch block, on non-retryable errors or the final attempt). It never exhausts the iteration counter without throwing, so this throw lastError is dead code. Additionally, lastError is typed as Error | undefined, so if this line were somehow reached it would throw undefined.

Consider removing the trailing throw or replacing the loop with a pattern that makes the termination condition more explicit:

Suggested change
throw lastError;
}
// loop always returns or throws above; this line is unreachable
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/infra/json-files.ts
Line: 80

Comment:
**Unreachable `throw lastError` after retry loop**

The `for` loop always exits via either `return` (on success) or `throw err` (inside the `catch` block, on non-retryable errors or the final attempt). It never exhausts the iteration counter without throwing, so this `throw lastError` is dead code. Additionally, `lastError` is typed as `Error | undefined`, so if this line were somehow reached it would throw `undefined`.

Consider removing the trailing `throw` or replacing the loop with a pattern that makes the termination condition more explicit:

```suggestion
  }
  // loop always returns or throws above; this line is unreachable
```

How can I resolve this? If you propose a fix, please make it concise.

@davidemanuelDEV
Copy link
Copy Markdown
Contributor

Review: ⚠️ Changes Requested

The core fix is correct — registering the webhook route before the async server-info fetch eliminates the 404 race window during restarts. Good reasoning in the comments.

Issues:

  1. Committed log file: openclaw-2026-03-07.log (129 lines of WS connect failure logs) is included in this diff. This should not be committed — please remove it from the PR.
  2. The Slack url_verification handler and json-files.ts retry logic appear unrelated to the BlueBubbles webhook fix. Consider splitting into separate PRs for cleaner review.

The BlueBubbles webhook reordering and the reconnectAttempts cap fix are both solid. The json-files retry-on-lock is a nice defensive improvement. Just clean up the log file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

channel: bluebubbles Channel integration: bluebubbles channel: slack Channel integration: slack gateway Gateway runtime size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: BlueBubbles webhook appears to be listening, but POST /bluebubbles-webhook returns 404 in OpenClaw 2026.3.12

2 participants