fix(msteams): handle invalid JSON escape sequences in Bot Framework activities#34581
fix(msteams): handle invalid JSON escape sequences in Bot Framework activities#34581hddevteam wants to merge 5 commits intoopenclaw:mainfrom
Conversation
…ctivities Some Bot Framework clients (e.g. certain MS Teams clients) send activity payloads that contain invalid JSON escape sequences such as bare backslashes followed by characters not defined in RFC 8259 (e.g. \p, \q, \c). The previous `express.json()` middleware uses body-parser's strict JSON parser which throws `SyntaxError: Bad escaped character in JSON` on such payloads. This causes the webhook to return a non-200 response, putting Azure Bot Service into exponential backoff and dropping subsequent messages. This fix replaces `express.json()` with `express.raw()` (raw Buffer body), then adds a two-pass JSON parse middleware: 1. First tries standard JSON.parse on the raw body. 2. If that fails (SyntaxError), attempts to repair the payload by double-escaping bare backslashes: /\\([^"\\/bfnrtu])/g → \\\\$1 3. If the repaired payload parses successfully, logs a warning and continues normally so the activity is processed. 4. If the payload cannot be repaired, responds with HTTP 200 to prevent Azure Bot Service backoff, and logs the error for investigation. Fixes: SyntaxError: Bad escaped character in JSON at position N Tested: conversationUpdate and message activities from MS Teams clients
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5b2a2fb988
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
extensions/msteams/src/monitor.ts
Outdated
| // Attempt to repair invalid escape sequences by double-escaping bare | ||
| // backslashes (e.g. \p → \\p). This recovers payloads sent by certain | ||
| // Teams clients that violate the JSON spec. | ||
| const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1"); |
There was a problem hiding this comment.
Avoid rewriting escaped backslashes during JSON repair
The fallback replacement pattern also matches the second slash in already-valid escapes like \\p, so if a payload contains both one invalid escape (for example \q) and another field with a valid escaped backslash, this transform can turn the valid field into \\\p and keep the whole payload unparsable. In that mixed case we end up acknowledging with HTTP 200 and dropping an activity that was actually recoverable, which undermines the intended resilience improvement.
Useful? React with 👍 / 👎.
…iddleware Covers the bug where express.json()'s strict parser throws SyntaxError on Bot Framework activities containing invalid escape sequences (e.g. \\p, \\q), causing Azure Bot Service to enter exponential backoff and drop messages. Tests verify: 1. Original JSON.parse() fails on bad-escape payloads (proving the bug) 2. Two-pass repair middleware recovers such payloads 3. Normal well-formed payloads still work (no regression) 4. Valid escape sequences (\\n \\t \\\\ etc.) are not double-escaped 5. Completely unrecoverable JSON is re-thrown (caller returns HTTP 200)"
Greptile SummaryThis PR replaces Key changes:
Potential issue: Confidence Score: 4/5
Last reviewed commit: 5b2a2fb |
extensions/msteams/src/monitor.ts
Outdated
| // Attempt to repair invalid escape sequences by double-escaping bare | ||
| // backslashes (e.g. \p → \\p). This recovers payloads sent by certain | ||
| // Teams clients that violate the JSON spec. | ||
| const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1"); |
There was a problem hiding this comment.
Regex corrupts valid \\x sequences in mixed-validity payloads
The repair regex /\\([^"\\/bfnrtu])/g scans left-to-right without tracking whether a backslash is already part of a valid \\ escape. When a payload contains a valid \\ followed by a non-escape character (e.g. \\path) alongside an invalid sequence that forces a repair pass, the regex can corrupt the valid part.
Concrete example:
- Raw JSON:
{"text":"test\\path","other":"\q"} JSON.parsefails on\q- The regex scans
\\path:- Position n (
\): next char is\, which IS in the exclusion set → no match at position n - Position n+1 (
\): next char isp, which is NOT in the exclusion set → match, replaces\pwith\\p
- Position n (
- Repaired text:
{"text":"test\\\path","other":"\\q"} \\\pathstill contains\p→JSON.parsefails again → falls through to HTTP 200
In most mixed-validity cases the corruption just re-causes a parse failure (still returns 200), but for certain \\\q-style inputs (odd count of backslashes before an invalid escape character) the repaired string can parse successfully with a silently wrong value.
A conservative improvement is to use a lookbehind to skip escaped backslashes:
| const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1"); | |
| const fixed = rawText.replace(/(?<!\\)\\([^"\\/bfnrtu])/g, "\\\\$1"); |
Note: a negative lookbehind (?<!\\) skips a \ that is itself already escaped, preventing the regex from "reaching into" a valid \\ pair. For complete correctness a character-by-character state machine is ideal, but the lookbehind substantially improves the common cases.
Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/msteams/src/monitor.ts
Line: 289
Comment:
**Regex corrupts valid `\\x` sequences in mixed-validity payloads**
The repair regex `/\\([^"\\/bfnrtu])/g` scans left-to-right without tracking whether a backslash is already part of a valid `\\` escape. When a payload contains a valid `\\` followed by a non-escape character (e.g. `\\path`) *alongside* an invalid sequence that forces a repair pass, the regex can corrupt the valid part.
**Concrete example:**
- Raw JSON: `{"text":"test\\path","other":"\q"}`
- `JSON.parse` fails on `\q`
- The regex scans `\\path`:
- Position n (`\`): next char is `\`, which IS in the exclusion set → no match at position n
- Position n+1 (`\`): next char is `p`, which is NOT in the exclusion set → **match, replaces `\p` with `\\p`**
- Repaired text: `{"text":"test\\\path","other":"\\q"}`
- `\\\path` still contains `\p` → `JSON.parse` fails again → falls through to HTTP 200
In *most* mixed-validity cases the corruption just re-causes a parse failure (still returns 200), but for certain `\\\q`-style inputs (odd count of backslashes before an invalid escape character) the repaired string can parse successfully with a silently wrong value.
A conservative improvement is to use a lookbehind to skip escaped backslashes:
```suggestion
const fixed = rawText.replace(/(?<!\\)\\([^"\\/bfnrtu])/g, "\\\\$1");
```
Note: a negative lookbehind `(?<!\\)` skips a `\` that is itself already escaped, preventing the regex from "reaching into" a valid `\\` pair. For complete correctness a character-by-character state machine is ideal, but the lookbehind substantially improves the common cases.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Pull request overview
Updates the MS Teams provider’s Express webhook handling to tolerate Bot Framework activity payloads containing invalid JSON escape sequences (e.g., \p) so the endpoint avoids returning non-200 responses that trigger Azure Bot Service exponential backoff.
Changes:
- Replace
express.json()withexpress.raw()and a custom two-pass JSON parse/repair step. - Add error handling to acknowledge unrecoverable JSON parse errors with HTTP 200 while logging details.
- Tighten the 413 (“Payload too large”) error guard type check.
| expressApp.use(express.raw({ type: "application/json", limit: MSTEAMS_WEBHOOK_MAX_BODY_BYTES })); | ||
| expressApp.use((req: Request, _res: Response, next: (err?: unknown) => void) => { |
There was a problem hiding this comment.
express.raw(...) is newly used here, but the MS Teams monitor lifecycle test mocks the express module without providing a raw export. This will cause the test suite to throw at startup when express.raw is undefined. Update the mock in extensions/msteams/src/monitor.lifecycle.test.ts to include a raw() middleware factory (similar to how json() is mocked) so tests continue to run.
extensions/msteams/src/monitor.ts
Outdated
| // Attempt to repair invalid escape sequences by double-escaping bare | ||
| // backslashes (e.g. \p → \\p). This recovers payloads sent by certain | ||
| // Teams clients that violate the JSON spec. | ||
| const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1"); |
There was a problem hiding this comment.
The repair regex can accidentally corrupt already-valid escaped backslashes when the payload contains both valid sequences like \\p and a separate invalid escape that triggers the repair path. Example: if parsing fails elsewhere, \\p (valid) can become \\\p (invalid), preventing recovery. Consider only rewriting backslashes that are not themselves escaped (e.g. using a negative lookbehind like (?<!\\)\\([^"\\/bfnrtu])).
| const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1"); | |
| const fixed = rawText.replace(/(?<!\\)\\([^"\\/bfnrtu])/g, "\\\\$1"); |
| expressApp.use(express.raw({ type: "application/json", limit: MSTEAMS_WEBHOOK_MAX_BODY_BYTES })); | ||
| expressApp.use((req: Request, _res: Response, next: (err?: unknown) => void) => { | ||
| if (Buffer.isBuffer(req.body)) { | ||
| const rawText = req.body.toString("utf-8"); | ||
| try { | ||
| req.body = JSON.parse(rawText); | ||
| } catch { | ||
| // Attempt to repair invalid escape sequences by double-escaping bare | ||
| // backslashes (e.g. \p → \\p). This recovers payloads sent by certain | ||
| // Teams clients that violate the JSON spec. | ||
| const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1"); | ||
| try { | ||
| req.body = JSON.parse(fixed); | ||
| log.warn("msteams: repaired invalid JSON escape sequences in Bot Framework activity"); | ||
| } catch (parseErr) { | ||
| next(parseErr); | ||
| return; | ||
| } | ||
| } | ||
| } | ||
| next(); | ||
| }); | ||
| expressApp.use((err: unknown, _req: Request, res: Response, next: (err?: unknown) => void) => { | ||
| if (err && typeof err === "object" && "status" in err && err.status === 413) { | ||
| if (err && typeof err === "object" && "status" in err && (err as { status: number }).status === 413) { | ||
| res.status(413).json({ error: "Payload too large" }); | ||
| return; | ||
| } | ||
| if (err instanceof SyntaxError) { | ||
| // JSON could not be repaired; acknowledge with HTTP 200 to prevent Azure | ||
| // Bot Service from entering exponential backoff for an unrecoverable payload. | ||
| log.error("msteams: unrecoverable JSON parse error in Bot Framework activity", { | ||
| error: String(err), | ||
| }); | ||
| res.status(200).end(); | ||
| return; | ||
| } |
There was a problem hiding this comment.
This change introduces several new behaviors (raw body parsing, escape-sequence repair, and returning HTTP 200 on unrecoverable JSON) but there are no unit tests covering them. Since this module already has lifecycle/timeouts tests, please add coverage for: (1) a payload with an invalid escape like \p being repaired and passed to the adapter, and (2) an unrecoverable JSON parse returning 200 (and not invoking the adapter) to prevent Azure backoff.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 88468fab95
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
extensions/msteams/src/monitor.ts
Outdated
| // Attempt to repair invalid escape sequences by double-escaping bare | ||
| // backslashes (e.g. \p → \\p). This recovers payloads sent by certain | ||
| // Teams clients that violate the JSON spec. | ||
| const fixed = rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1"); |
There was a problem hiding this comment.
Restrict escape repair to truly bare backslashes
The repair regex also matches the second slash of already-valid escaped backslashes (for example \\p), so when a payload contains one invalid escape (\q) plus another field with a valid escaped backslash, rawText.replace(/\\([^"\\/bfnrtu])/g, "\\\\$1") rewrites the valid field into an invalid sequence and the second parse still fails. In that mixed-input case the handler returns HTTP 200 and drops an activity that should have been recoverable, which defeats the resilience improvement for real Teams payloads containing both kinds of strings.
Useful? React with 👍 / 👎.
Greptile code review (PR comment) identified an edge case in the
repair regex: /\\([^\"\\\/bfnrtu])/g incorrectly matches the second \
of a valid \\\\ pair followed by an invalid-escape char.
Example of the bug:
Input JSON text: \"C:\\\\q\" (valid: literal string C:\\q)
Old regex transforms \\q → \\\\q producing \"C:\\\\\\\\q\" (C:\\\\q) — wrong!
Fix: use alternation so valid \\\\ pairs are consumed first before
the invalid-escape branch can inspect their constituent backslashes.
/(\\\\\\\\)+|\\\\([^\"\\\\\\\/bfnrtu])/g
^^^^^^^^^ consume valid \\\\ pairs (leave unchanged)
^^^^^^^^^^^^^^^^^^^^^^^^^ only then repair lone bare escapes
This ensures that:
\\\\q → unchanged (valid: single literal backslash + q)
\\q → \\\\q (repaired: invalid bare escape)
\\\\\\q → \\\\\\\\q (repaired: two-backslash pair OK, trailing lone \\q repaired)
Also adds a regression test for the consecutive-backslash edge case."
Copilot Reviewer identified that the express mock in monitor.lifecycle.test.ts exported json() but not raw(). Since monitor.ts now uses express.raw() instead of express.json(), the test suite would throw 'express.raw is not a function' at startup. Add raw() as a no-op middleware factory alongside json() in the mock, matching the pattern already used for json().
|
Thanks for the thorough reviews! All three issues have been addressed in the latest commits: Regex edge case (Greptile + chatgpt-codex-connector feedback) A dedicated regression test (
All changes are in commit |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ef2f5fbdeb
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| * "\q" (invalid JSON: bare \q) → "\\q" | ||
| * "\\\q" (valid \\ + invalid \q) → "\\\\q" | ||
| */ | ||
| const REPAIR_BARE_ESCAPE_RE = /(\\.)+|\\([^"\\/bfnrtu])/g; |
There was a problem hiding this comment.
Restrict first regex branch to escaped backslashes only
The new repair regex uses (\\.)+ as its first alternation branch, which matches any backslash escape (including invalid ones like \q), so the replacement callback returns those matches unchanged instead of repairing them. In practice, payloads containing bare escapes (for example "Test\Project") remain invalid after repairJsonEscapes, the second JSON.parse still fails, and the middleware drops otherwise recoverable activities via the SyntaxError 200-ack path, which defeats the stated resilience fix.
Useful? React with 👍 / 👎.
CI 'Check types and lint and oxfmt' step failed because our new files were not formatted with oxfmt. Run oxfmt --write on both files to fix.
|
Fixed CI formatting failure in commit The |
|
Hi @hddevteam — thanks for the submission. I’m the new Microsoft Teams maintainer for OpenClaw. Please give me a day or two to work through the open Teams backlog. Also, join the Twitter community for daily MS Teams feedback + updates: https://x.com/i/communities/2031170403607974228 |
1 similar comment
|
Hi @hddevteam — thanks for the submission. I’m the new Microsoft Teams maintainer for OpenClaw. Please give me a day or two to work through the open Teams backlog. Also, join the Twitter community for daily MS Teams feedback + updates: https://x.com/i/communities/2031170403607974228 |
Problem
Some Bot Framework clients (e.g. certain MS Teams desktop/mobile clients) send
activity payloads that contain invalid JSON escape sequences — bare backslashes
followed by characters not defined in RFC 8259
(e.g.
\p,\q,\c).The existing
express.json()middleware uses body-parser's strict JSON parserwhich throws:
This causes the webhook handler to return a non-200 response to Azure Bot
Service. Azure Bot Service interprets non-200 responses as delivery failures and
enters exponential backoff — dropping all subsequent user messages until the
backoff window expires (which can be minutes to hours).
Observed Symptom
conversationUpdateon channel create) triggers theSyntaxError, msteams provider replies with 4xx/5xxthe local endpoint)
Fix
Replace
express.json()with a two-pass raw body parsing approach:Bufferviaexpress.raw({ type: 'application/json' })JSON.parse()on the raw bodySyntaxError): attempt to repair thepayload by double-escaping bare backslashes via regex:
/\\([^"\\/bfnrtu])/g → \\\\$1Azure Bot Service backoff, and log the error for investigation
Testing
Observed and confirmed with a real Teams bot deployment:
conversationUpdateactivity with an invalidescape sequence at position 388 → previously caused SyntaxError + Azure
backoff for subsequent messages
processed normally
backoff loops
Notes
express.raw()is already part of Express's built-in middleware (same asexpress.json()), no new dependencies neededvalid JSON escape sequences (
\",\\,\/,\b,\f,\n,\r,\t,\uXXXX) are preservedwell-formed payloads