Skip to content

fix(msteams): handle invalid JSON escape sequences in Bot Framework activities#34581

Open
hddevteam wants to merge 5 commits intoopenclaw:mainfrom
hddevteam:fix/msteams-json-escape-sequences
Open

fix(msteams): handle invalid JSON escape sequences in Bot Framework activities#34581
hddevteam wants to merge 5 commits intoopenclaw:mainfrom
hddevteam:fix/msteams-json-escape-sequences

Conversation

@hddevteam
Copy link
Copy Markdown

Problem

Some Bot Framework clients (e.g. certain MS Teams desktop/mobile clients) send
activity payloads that contain invalid JSON escape sequences — bare backslashes
followed by characters not defined in RFC 8259
(e.g. \p, \q, \c).

The existing express.json() middleware uses body-parser's strict JSON parser
which throws:

SyntaxError: Bad escaped character in JSON at position 388

This causes the webhook handler to return a non-200 response to Azure Bot
Service. Azure Bot Service interprets non-200 responses as delivery failures and
enters exponential backoff — dropping all subsequent user messages until the
backoff window expires (which can be minutes to hours).

Observed Symptom

  • First message (e.g. conversationUpdate on channel create) triggers the
    SyntaxError, msteams provider replies with 4xx/5xx
  • Azure Bot Service backoff begins
  • All subsequent user messages are silently dropped by Azure (not delivered to
    the local endpoint)
  • Users send messages in Teams but the bot never responds

Fix

Replace express.json() with a two-pass raw body parsing approach:

  1. Accept body as raw Buffer via express.raw({ type: 'application/json' })
  2. First pass: try standard JSON.parse() on the raw body
  3. Second pass (if first fails with SyntaxError): attempt to repair the
    payload by double-escaping bare backslashes via regex:
    /\\([^"\\/bfnrtu])/g → \\\\$1
  4. If the repaired payload parses successfully → log a warning and continue
  5. If the payload still cannot be parsed → respond HTTP 200 to prevent
    Azure Bot Service backoff, and log the error for investigation

Testing

Observed and confirmed with a real Teams bot deployment:

  1. Azure Bot Service delivered a conversationUpdate activity with an invalid
    escape sequence at position 388 → previously caused SyntaxError + Azure
    backoff for subsequent messages
  2. With this fix, activities with minor JSON escaping issues are repaired and
    processed normally
  3. Completely unrecoverable payloads are acknowledged with 200 to prevent
    backoff loops

Notes

  • express.raw() is already part of Express's built-in middleware (same as
    express.json()), no new dependencies needed
  • The regex fix targets only actual bare backslashes before non-escape chars;
    valid JSON escape sequences (\", \\, \/, \b, \f, \n, \r, \t,
    \uXXXX) are preserved
  • This is a conservative fix that preserves all existing behaviour for
    well-formed payloads

…ctivities

Some Bot Framework clients (e.g. certain MS Teams clients) send activity
payloads that contain invalid JSON escape sequences such as bare backslashes
followed by characters not defined in RFC 8259 (e.g. \p, \q, \c).

The previous `express.json()` middleware uses body-parser's strict JSON
parser which throws `SyntaxError: Bad escaped character in JSON` on such
payloads. This causes the webhook to return a non-200 response, putting
Azure Bot Service into exponential backoff and dropping subsequent messages.

This fix replaces `express.json()` with `express.raw()` (raw Buffer body),
then adds a two-pass JSON parse middleware:

1. First tries standard JSON.parse on the raw body.
2. If that fails (SyntaxError), attempts to repair the payload by
   double-escaping bare backslashes: /\\([^"\\/bfnrtu])/g → \\\\$1
3. If the repaired payload parses successfully, logs a warning and
   continues normally so the activity is processed.
4. If the payload cannot be repaired, responds with HTTP 200 to prevent
   Azure Bot Service backoff, and logs the error for investigation.

Fixes: SyntaxError: Bad escaped character in JSON at position N
Tested: conversationUpdate and message activities from MS Teams clients
Copilot AI review requested due to automatic review settings March 4, 2026 15:09
@openclaw-barnacle openclaw-barnacle bot added channel: msteams Channel integration: msteams size: XS labels Mar 4, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5b2a2fb988

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

// Attempt to repair invalid escape sequences by double-escaping bare
// backslashes (e.g. \p → \\p). This recovers payloads sent by certain
// Teams clients that violate the JSON spec.
const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid rewriting escaped backslashes during JSON repair

The fallback replacement pattern also matches the second slash in already-valid escapes like \\p, so if a payload contains both one invalid escape (for example \q) and another field with a valid escaped backslash, this transform can turn the valid field into \\\p and keep the whole payload unparsable. In that mixed case we end up acknowledging with HTTP 200 and dropping an activity that was actually recoverable, which undermines the intended resilience improvement.

Useful? React with 👍 / 👎.

…iddleware

Covers the bug where express.json()'s strict parser throws SyntaxError on
Bot Framework activities containing invalid escape sequences (e.g. \\p, \\q),
causing Azure Bot Service to enter exponential backoff and drop messages.

Tests verify:
1. Original JSON.parse() fails on bad-escape payloads (proving the bug)
2. Two-pass repair middleware recovers such payloads
3. Normal well-formed payloads still work (no regression)
4. Valid escape sequences (\\n \\t \\\\ etc.) are not double-escaped
5. Completely unrecoverable JSON is re-thrown (caller returns HTTP 200)"
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 4, 2026

Greptile Summary

This PR replaces express.json() with a two-pass raw-body parsing strategy (express.raw() + a custom JSON-repair middleware) to gracefully handle Bot Framework activity payloads that contain invalid JSON escape sequences (e.g. bare \p, \q). The fix correctly prevents Azure Bot Service exponential backoff by ensuring the endpoint always returns HTTP 200, even for unrecoverable payloads.

Key changes:

  • express.raw({ type: "application/json" }) is used instead of express.json() so the raw bytes are available for repair
  • A custom middleware attempts JSON.parse first, then falls back to a regex-based bare-backslash repair pass before re-parsing
  • The error handler is extended to catch SyntaxError from irreparable payloads and respond with HTTP 200
  • A TypeScript type assertion (err as { status: number }).status is added to satisfy strict typing on the existing 413 check

Potential issue:
The repair regex /\\([^"\\/bfnrtu])/g does not account for already-escaped backslashes (\\). In a payload that mixes a valid \\x sequence with a separate invalid escape, the regex can misread the second \ of a \\ pair and corrupt the originally valid segment. In most cases this still results in a failed parse (HTTP 200 fallback), but for specific inputs (odd numbers of backslashes before an invalid escape character) the repaired payload can parse successfully with a silently wrong field value.

Confidence Score: 4/5

  • This PR is safe to merge with minor caveats — the fix correctly addresses the Azure Bot Service backoff problem and degrades gracefully for edge cases.
  • The middleware structure, error-handler ordering, and content-type handling are all correct. The two-pass approach is sound and matches the described behavior for well-formed, lightly-malformed, and unrecoverable payloads. The only real concern is the repair regex edge case with consecutive backslashes, which in most real-world scenarios still falls back to HTTP 200 gracefully. The TypeScript type-cast fix is also a strictly correct improvement.
  • Line 289 of extensions/msteams/src/monitor.ts — the repair regex should be reviewed for the consecutive-backslash edge case.

Last reviewed commit: 5b2a2fb

// Attempt to repair invalid escape sequences by double-escaping bare
// backslashes (e.g. \p → \\p). This recovers payloads sent by certain
// Teams clients that violate the JSON spec.
const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regex corrupts valid \\x sequences in mixed-validity payloads

The repair regex /\\([^"\\/bfnrtu])/g scans left-to-right without tracking whether a backslash is already part of a valid \\ escape. When a payload contains a valid \\ followed by a non-escape character (e.g. \\path) alongside an invalid sequence that forces a repair pass, the regex can corrupt the valid part.

Concrete example:

  • Raw JSON: {"text":"test\\path","other":"\q"}
  • JSON.parse fails on \q
  • The regex scans \\path:
    • Position n (\): next char is \, which IS in the exclusion set → no match at position n
    • Position n+1 (\): next char is p, which is NOT in the exclusion set → match, replaces \p with \\p
  • Repaired text: {"text":"test\\\path","other":"\\q"}
  • \\\path still contains \pJSON.parse fails again → falls through to HTTP 200

In most mixed-validity cases the corruption just re-causes a parse failure (still returns 200), but for certain \\\q-style inputs (odd count of backslashes before an invalid escape character) the repaired string can parse successfully with a silently wrong value.

A conservative improvement is to use a lookbehind to skip escaped backslashes:

Suggested change
const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1");
const fixed = rawText.replace(/(?<!\\)\\([^"\\/bfnrtu])/g, "\\\\$1");

Note: a negative lookbehind (?<!\\) skips a \ that is itself already escaped, preventing the regex from "reaching into" a valid \\ pair. For complete correctness a character-by-character state machine is ideal, but the lookbehind substantially improves the common cases.

Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/msteams/src/monitor.ts
Line: 289

Comment:
**Regex corrupts valid `\\x` sequences in mixed-validity payloads**

The repair regex `/\\([^"\\/bfnrtu])/g` scans left-to-right without tracking whether a backslash is already part of a valid `\\` escape. When a payload contains a valid `\\` followed by a non-escape character (e.g. `\\path`) *alongside* an invalid sequence that forces a repair pass, the regex can corrupt the valid part.

**Concrete example:**
- Raw JSON: `{"text":"test\\path","other":"\q"}`
- `JSON.parse` fails on `\q`
- The regex scans `\\path`:
  - Position n (`\`): next char is `\`, which IS in the exclusion set → no match at position n
  - Position n+1 (`\`): next char is `p`, which is NOT in the exclusion set → **match, replaces `\p` with `\\p`**
- Repaired text: `{"text":"test\\\path","other":"\\q"}`
- `\\\path` still contains `\p``JSON.parse` fails again → falls through to HTTP 200

In *most* mixed-validity cases the corruption just re-causes a parse failure (still returns 200), but for certain `\\\q`-style inputs (odd count of backslashes before an invalid escape character) the repaired string can parse successfully with a silently wrong value.

A conservative improvement is to use a lookbehind to skip escaped backslashes:

```suggestion
        const fixed = rawText.replace(/(?<!\\)\\([^"\\/bfnrtu])/g, "\\\\$1");
```

Note: a negative lookbehind `(?<!\\)` skips a `\` that is itself already escaped, preventing the regex from "reaching into" a valid `\\` pair. For complete correctness a character-by-character state machine is ideal, but the lookbehind substantially improves the common cases.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the MS Teams provider’s Express webhook handling to tolerate Bot Framework activity payloads containing invalid JSON escape sequences (e.g., \p) so the endpoint avoids returning non-200 responses that trigger Azure Bot Service exponential backoff.

Changes:

  • Replace express.json() with express.raw() and a custom two-pass JSON parse/repair step.
  • Add error handling to acknowledge unrecoverable JSON parse errors with HTTP 200 while logging details.
  • Tighten the 413 (“Payload too large”) error guard type check.

Comment on lines +279 to +280
expressApp.use(express.raw({ type: "application/json", limit: MSTEAMS_WEBHOOK_MAX_BODY_BYTES }));
expressApp.use((req: Request, _res: Response, next: (err?: unknown) => void) => {
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

express.raw(...) is newly used here, but the MS Teams monitor lifecycle test mocks the express module without providing a raw export. This will cause the test suite to throw at startup when express.raw is undefined. Update the mock in extensions/msteams/src/monitor.lifecycle.test.ts to include a raw() middleware factory (similar to how json() is mocked) so tests continue to run.

Copilot uses AI. Check for mistakes.
// Attempt to repair invalid escape sequences by double-escaping bare
// backslashes (e.g. \p → \\p). This recovers payloads sent by certain
// Teams clients that violate the JSON spec.
const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1");
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repair regex can accidentally corrupt already-valid escaped backslashes when the payload contains both valid sequences like \\p and a separate invalid escape that triggers the repair path. Example: if parsing fails elsewhere, \\p (valid) can become \\\p (invalid), preventing recovery. Consider only rewriting backslashes that are not themselves escaped (e.g. using a negative lookbehind like (?<!\\)\\([^"\\/bfnrtu])).

Suggested change
const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1");
const fixed = rawText.replace(/(?<!\\)\\([^"\\/bfnrtu])/g, "\\\\$1");

Copilot uses AI. Check for mistakes.
Comment on lines +279 to +314
expressApp.use(express.raw({ type: "application/json", limit: MSTEAMS_WEBHOOK_MAX_BODY_BYTES }));
expressApp.use((req: Request, _res: Response, next: (err?: unknown) => void) => {
if (Buffer.isBuffer(req.body)) {
const rawText = req.body.toString("utf-8");
try {
req.body = JSON.parse(rawText);
} catch {
// Attempt to repair invalid escape sequences by double-escaping bare
// backslashes (e.g. \p → \\p). This recovers payloads sent by certain
// Teams clients that violate the JSON spec.
const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1");
try {
req.body = JSON.parse(fixed);
log.warn("msteams: repaired invalid JSON escape sequences in Bot Framework activity");
} catch (parseErr) {
next(parseErr);
return;
}
}
}
next();
});
expressApp.use((err: unknown, _req: Request, res: Response, next: (err?: unknown) => void) => {
if (err && typeof err === "object" && "status" in err && err.status === 413) {
if (err && typeof err === "object" && "status" in err && (err as { status: number }).status === 413) {
res.status(413).json({ error: "Payload too large" });
return;
}
if (err instanceof SyntaxError) {
// JSON could not be repaired; acknowledge with HTTP 200 to prevent Azure
// Bot Service from entering exponential backoff for an unrecoverable payload.
log.error("msteams: unrecoverable JSON parse error in Bot Framework activity", {
error: String(err),
});
res.status(200).end();
return;
}
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change introduces several new behaviors (raw body parsing, escape-sequence repair, and returning HTTP 200 on unrecoverable JSON) but there are no unit tests covering them. Since this module already has lifecycle/timeouts tests, please add coverage for: (1) a payload with an invalid escape like \p being repaired and passed to the adapter, and (2) an unrecoverable JSON parse returning 200 (and not invoking the adapter) to prevent Azure backoff.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 88468fab95

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

// Attempt to repair invalid escape sequences by double-escaping bare
// backslashes (e.g. \p → \\p). This recovers payloads sent by certain
// Teams clients that violate the JSON spec.
const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restrict escape repair to truly bare backslashes

The repair regex also matches the second slash of already-valid escaped backslashes (for example \\p), so when a payload contains one invalid escape (\q) plus another field with a valid escaped backslash, rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1") rewrites the valid field into an invalid sequence and the second parse still fails. In that mixed-input case the handler returns HTTP 200 and drops an activity that should have been recoverable, which defeats the resilience improvement for real Teams payloads containing both kinds of strings.

Useful? React with 👍 / 👎.

Greptile code review (PR comment) identified an edge case in the
repair regex: /\\([^\"\\\/bfnrtu])/g incorrectly matches the second \
of a valid \\\\ pair followed by an invalid-escape char.

Example of the bug:
  Input JSON text: \"C:\\\\q\" (valid: literal string C:\\q)
  Old regex transforms \\q → \\\\q producing \"C:\\\\\\\\q\" (C:\\\\q) — wrong!

Fix: use alternation so valid \\\\ pairs are consumed first before
the invalid-escape branch can inspect their constituent backslashes.

  /(\\\\\\\\)+|\\\\([^\"\\\\\\\/bfnrtu])/g
    ^^^^^^^^^ consume valid \\\\ pairs (leave unchanged)
              ^^^^^^^^^^^^^^^^^^^^^^^^^ only then repair lone bare escapes

This ensures that:
  \\\\q  → unchanged   (valid: single literal backslash + q)
  \\q    → \\\\q       (repaired: invalid bare escape)
  \\\\\\q → \\\\\\\\q  (repaired: two-backslash pair OK, trailing lone \\q repaired)

Also adds a regression test for the consecutive-backslash edge case."
Copilot Reviewer identified that the express mock in monitor.lifecycle.test.ts
exported json() but not raw(). Since monitor.ts now uses express.raw() instead
of express.json(), the test suite would throw 'express.raw is not a function'
at startup.

Add raw() as a no-op middleware factory alongside json() in the mock, matching
the pattern already used for json().
@hddevteam
Copy link
Copy Markdown
Author

Thanks for the thorough reviews! All three issues have been addressed in the latest commits:

Regex edge case (Greptile + chatgpt-codex-connector feedback)
The original /\\([^"\\/bfnrtu])/g could corrupt valid \\q sequences. Fixed by extracting repairJsonEscapes() as a named function using the alternation pattern /(\\\\)+|\\([^"\\/bfnrtu])/g — valid \\ pairs are consumed first by the left branch and left unchanged; only the right branch (lone bare escapes) triggers the repair.

A dedicated regression test (monitor-json-repair.test.ts) now covers the mixed-validity edge case identified by Greptile: a payload with both a valid \\q and an invalid \P in different fields.

express.raw missing from lifecycle test mock (copilot-pull-request-reviewer feedback)
Added raw: makeNoopMiddleware alongside json in the vi.mock("express", ...) factory in monitor.lifecycle.test.ts, preventing the express.raw is not a function startup crash.

All changes are in commit ef2f5fb.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ef2f5fbdeb

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

* "\q" (invalid JSON: bare \q) → "\\q"
* "\\\q" (valid \\ + invalid \q) → "\\\\q"
*/
const REPAIR_BARE_ESCAPE_RE = /(\\.)+|\\([^"\\/bfnrtu])/g;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restrict first regex branch to escaped backslashes only

The new repair regex uses (\\.)+ as its first alternation branch, which matches any backslash escape (including invalid ones like \q), so the replacement callback returns those matches unchanged instead of repairing them. In practice, payloads containing bare escapes (for example "Test\Project") remain invalid after repairJsonEscapes, the second JSON.parse still fails, and the middleware drops otherwise recoverable activities via the SyntaxError 200-ack path, which defeats the stated resilience fix.

Useful? React with 👍 / 👎.

CI 'Check types and lint and oxfmt' step failed because our new files
were not formatted with oxfmt. Run oxfmt --write on both files to fix.
@hddevteam
Copy link
Copy Markdown
Author

Fixed CI formatting failure in commit 37b4a2c.

The Check types and lint and oxfmt step was failing because monitor.ts and monitor-json-repair.test.ts were not formatted with oxfmt. Ran oxfmt --write on both files locally and pushed the result (2 files changed, 7 insertions).

@BradGroux BradGroux self-assigned this Mar 10, 2026
@BradGroux
Copy link
Copy Markdown
Contributor

Hi @hddevteam — thanks for the submission. I’m the new Microsoft Teams maintainer for OpenClaw. Please give me a day or two to work through the open Teams backlog. Also, join the Twitter community for daily MS Teams feedback + updates: https://x.com/i/communities/2031170403607974228

1 similar comment
@BradGroux
Copy link
Copy Markdown
Contributor

Hi @hddevteam — thanks for the submission. I’m the new Microsoft Teams maintainer for OpenClaw. Please give me a day or two to work through the open Teams backlog. Also, join the Twitter community for daily MS Teams feedback + updates: https://x.com/i/communities/2031170403607974228

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

channel: msteams Channel integration: msteams size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants