fix #1154: prevent fenced code content from leaking into markdown passes#1190
fix #1154: prevent fenced code content from leaking into markdown passes#1190bergeouss wants to merge 1 commit intonesquena:masterfrom
Conversation
…down passes Root cause: In renderMd(), fenced code blocks were rendered to <pre><code> HTML early (right after the fence stash restore), then exposed to subsequent markdown passes (lists, headings, tables, etc.). When a fenced block contained list-like markers (e.g. diff hunks with - removed lines, + added lines, or bash prompts starting with -), the list regex would inject <ul><li> tags INSIDE the <pre><code> block, breaking the </pre> closing tag and corrupting all subsequent message rendering. Fix: Split the fence stash into two phases: - Inline backticks (\x00F): restored before bold/italic (unchanged behavior) - Fenced blocks (\x00P): rendered to <pre><code> inside the stash callback, kept stashed until AFTER all markdown passes, then restored just before _pre_stash. This ensures diff/patch/list content inside code blocks is never misinterpreted as markdown. Also preserves compatibility with upstream's rawPreStash (\x00R) which protects existing <pre> blocks from streaming HTML. Tests: 14 new tests covering diff blocks, bash prompts, list markers, inline backticks, and headings after fenced blocks. All 2667 tests pass. Fixes nesquena#1154
nesquena
left a comment
There was a problem hiding this comment.
Review — end-to-end ✅ (clean approve)
What this ships
@bergeouss's fix for #1154 — large tool outputs containing fenced diff/patch/log content corrupting subsequent message rendering. Stray </pre>, <ul><li> injected inside <pre><code>, multiple copy buttons, malformed layout.
The bug
renderMd() rendered fenced code blocks to <pre><code> HTML EARLY in the pipeline (right after the fence stash restore), then exposed that HTML to the list/heading/table regexes. Lines inside code blocks that look like markdown — diff hunks (- removed, + added), bash globs (* foo), markdown headings (# title) — got matched by those regexes and had <ul>/<li>/<h1> injected inside <pre>. The injected tags broke the </pre> closure, the SAFE_TAGS escape pass turned the malformed </pre> into </pre>, and the rest of the message rendered inside a broken block.
The fix
Split the fence stash into two phases (static/ui.js:825-841):
- Fenced blocks (
\x00P): converted to<pre><code>HTML inside the stash callback (content escaped viaesc()), then kept stashed as\x00P<n>\x00tokens through ALL markdown passes. Restored at ui.js:1002, AFTER lists/headings/tables/links — so those regexes can never see the rendered HTML. - Inline backticks (
\x00F): converted to<code>tags in the callback, restored before bold/italic so**`code`**still works.
Mermaid handling moved from a separate post-pass into the fence stash callback (detects mermaid language tag, emits <div class="mermaid-block"> instead of <pre>).
Traced against upstream hermes-agent
Pure WebUI renderer change. The agent emits text/markdown, never raw HTML — so the rendering pipeline is downstream of the agent. Zero cross-tool coupling.
Bug-confirmed-on-master harness
Built a behavioural harness covering the exact issue scenarios. 8/10 pass on master, 10/10 pass on the PR. The two master failures show the bug shape directly:
FAIL [Diff -line with leading space inside code]:
out: <pre><code class="language-diff">- removed line<br><ul><li>added line</code></pre></li></ul>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FAIL [Heading inside fenced block stays literal]:
out: <pre><code class="language-markdown"># fake heading<br><h2>also fake</code></pre></h2>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Exactly the issue #1154 symptom: "stray literal </pre> rendered in the page" — the SAFE_TAGS pass escapes the now-malformed closing tag because <ul> was injected into the <pre>.
Post-fix output is correct: <pre><code>- removed line\n+ added line</code></pre> with no <ul>/<li> and balanced <pre>/</pre> count.
End-to-end trace
Token namespace check
Existing tokens: \x00F (inline code, repurposed), \x00M (math), \x00R (raw <pre>), \x00C (inline code stash inside inlineMd), \x00G (img), \x00L (link), \x00O (outer <code>), \x00A (anchor), \x00B (block-element-line), \x00E (rendered pre stash for paragraph splitting). New: \x00P (rendered fenced block, late-restored). No collision. ✅
Stash/restore ordering
| # | Op | Status |
|---|---|---|
| 1 | \x00P stash (fenced → rendered HTML) |
new |
| 2 | \x00F stash (inline `→ <code>) |
new (was raw text) |
| 3 | \x00M stash (math) |
unchanged |
| 4 | \x00R stash (raw <pre>) |
unchanged (#1172) |
| 5 | HTML→md rewrites (<strong> etc.) |
unchanged |
| 6 | \x00R restore |
unchanged |
| 7 | \x00F restore (<code> ready for bold/italic) |
new — correct ordering |
| 8 | bold/italic, headings, lists, tables, autolinks | unchanged |
| 9 | \x00M restore (katex spans) |
unchanged |
| 10 | \x00P restore (fenced HTML) ← new |
placed AFTER all md passes |
| 11 | \x00E stash (pre/mermaid/katex blocks for paragraph protection) |
unchanged |
The critical move is step 10: fenced HTML is reintroduced AFTER lists/headings/tables run. The downstream \x00E stash at line 1008 then captures the now-restored <pre> blocks before paragraph splitting (so newlines inside code don't get <br>'d) — that pipeline is unchanged.
Mermaid integration
Pre-fix: separate post-pass s.replace(/```mermaid\n?(.*?)```/g, ...) after fence restore.
Post-fix: detected inside the fence stash callback via m[1].trim().toLowerCase() === 'mermaid'. Emits <div class="mermaid-block" data-mermaid-id="...">...</div> directly into _preBlock_stash. The id is generated the same way (Math.random().toString(36).slice(2,10)).
The downstream \x00E stash at line 1008 already includes mermaid via the regex <div class="(mermaid-block|katex-block)".... So mermaid blocks survive paragraph splitting just like before. ✅
Edge-case trace
| Scenario | Expected | Actual |
|---|---|---|
Diff block with +/- lines |
content literal, no <ul> inside <pre> |
✅ harness 1 |
Bash block with * foo lines |
no list interpretation | ✅ PR test |
Markdown block with # heading |
literal #, no <h1> |
✅ harness 10 |
Text block with **not bold** |
literal **, no <strong> |
✅ harness 9 |
Empty fenced block ```\n``` |
empty <pre> |
✅ harness 2 |
| Fenced at very start (no blank line) | renders correctly | ✅ harness 3 |
| Two adjacent fenced blocks | both render, count = 2 | ✅ harness 4 |
| Mermaid block | <div class="mermaid-block"> |
✅ harness 8 |
Inline `**not bold**` outside fence |
<code> wraps literal |
✅ harness 5 |
Fenced block containing <pre> text |
text escaped, balanced | ✅ harness 7 |
| Heading after fenced block | <h1> renders normally |
✅ harness 1 |
| List after fenced block | <ul> renders normally |
✅ harness 1 |
<pre>/</pre> count balanced |
always | ✅ all 10 cases |
Tests
- PR's 14 new tests in
tests/test_issue1154_fenced_code_leak.py: pass. - PR's targeted updates in
test_issue347.py(variable-rename:fence_stash.push(m)→fence_stash.push() andtest_issue_code_syntax_highlight.py(normalizedLang→lang): both correct, same intent preserved. - Local full suite: 2651 passed, 47 skipped, 1 PR-unrelated pre-existing failure (
test_sprint3.py::test_workspace_add_rejects_system_paths— the macOS-only quirk that PR #1186 fixes; this PR is on top of master before #1186 lands). - CI on PR: ✅ test (3.11), ✅ test (3.12), ✅ test (3.13).
- My behavioural harness: 10/10 on PR, 8/10 on master (2 failures match issue symptoms exactly).
- All 25
test_renderer_js_behaviourtests: pass — the existing renderer test suite that locks blockquote/list/heading/etc. invariants still passes, confirming no regression in the core rendering paths.
Other audit — confirmed correct
- JS syntax:
node --checkpasses. - Language detection: new regex
^(\w[\w+-]*)\n?([\s\S]*)$— same coverage as the original[\w+-]*\n?(zero-or-more word chars + optional newline). Empty-language case handled at lines 833-834 (mmay be null →lang='',code=raw.replace(/^\n?/,'')). ✅ esc()on code content: still escapes<,>,&,",'via the sameesc()function — no XSS surface change.- Compatible with
rawPreStash(#1172): the rawPreStash captures pre-existing<pre>HTML in the input; this PR's fence stash captures markdown fenced blocks. Different sources, different tokens (\x00Rvs\x00P), no overlap. - Inline backtick semantics: new pass
s.replace(/\([^\`\n]+)`/g, ...)` is identical regex to the inline part of the original combined pattern. Same behaviour. ✅
Minor observations (non-blocking)
- The new
\x00Pstash adds one more token to an already token-heavy renderer. The stash family (F, M, R, P, C, G, L, O, A, B, E) is now 11 tokens — worth a future cleanup pass that consolidates them, but the current per-token comment + restore order is well-documented and traceable. - The mermaid id generator (
Math.random().toString(36).slice(2,10)) is the same as before, so existing mermaid post-processing keyed on that id format keeps working. - The PR description's "All 2667 tests pass" — local count for me was 2651 passed (different test selection because of macOS skips). Same effective signal.
Recommendation
Approved. Principled fix at the right layer: the rendering pipeline now keeps fenced HTML stashed past every dangerous markdown pass, eliminating the entire class of "code block content matches markdown regex and corrupts the page" bugs. Behavioural harness directly confirms the master bug and the post-fix correctness. CI green on 3.11/3.12/3.13. Existing 71 renderer tests still pass. Parked at approval — ready for the release agent's merge/tag pipeline.
…code leak (#1194) Batch release v0.50.231 — 3 fixes. ## PRs included | PR | Author | Fix | |---|---|---| | #1186 | @nesquena (Claude Code) | macOS `/etc` symlink bypass in workspace blocked-roots | | #1187 | @nesquena-hermes | Workspace panel stuck closed after empty-session reload | | #1190 | @bergeouss | Fenced code content leaking into markdown passes (#1154) | All three PRs were independently reviewed and approved by @nesquena. ## Test results **2729 passed, 2 skipped** (2 macOS-only tests correctly skipped on Linux). Browser QA: **21/21**. ## Key fix notes **#1186:** `_workspace_blocked_roots()` now returns both literal and `Path.resolve()` forms of each blocked root. macOS symlinks (`/etc → /private/etc`) previously let a resolved candidate slip past the literal check. New `_is_blocked_system_path()` helper with `/var/folders` and `/var/tmp` carve-outs for pytest temp dirs. **#1187:** Regression from #1182 — `syncWorkspacePanelState()` force-closed on any no-session state. Now only closes in `'preview'` mode. Both boot paths restore localStorage panel pref before sync. **#1190:** Fenced code blocks are now stashed as `\x00P<n>\x00` tokens through ALL markdown passes (list/heading/table regexes), restored at the very end. Previously, diff hunks and markdown headings inside code blocks triggered those regexes, injecting `<ul>/<li>/<h>` tags that broke `</pre>` closure.
|
Merged as v0.50.231 via #1194. Thank you @bergeouss! |
…code leak (nesquena#1194) Batch release v0.50.231 — 3 fixes. ## PRs included | PR | Author | Fix | |---|---|---| | nesquena#1186 | @nesquena (Claude Code) | macOS `/etc` symlink bypass in workspace blocked-roots | | nesquena#1187 | @nesquena-hermes | Workspace panel stuck closed after empty-session reload | | nesquena#1190 | @bergeouss | Fenced code content leaking into markdown passes (nesquena#1154) | All three PRs were independently reviewed and approved by @nesquena. ## Test results **2729 passed, 2 skipped** (2 macOS-only tests correctly skipped on Linux). Browser QA: **21/21**. ## Key fix notes **nesquena#1186:** `_workspace_blocked_roots()` now returns both literal and `Path.resolve()` forms of each blocked root. macOS symlinks (`/etc → /private/etc`) previously let a resolved candidate slip past the literal check. New `_is_blocked_system_path()` helper with `/var/folders` and `/var/tmp` carve-outs for pytest temp dirs. **nesquena#1187:** Regression from nesquena#1182 — `syncWorkspacePanelState()` force-closed on any no-session state. Now only closes in `'preview'` mode. Both boot paths restore localStorage panel pref before sync. **nesquena#1190:** Fenced code blocks are now stashed as `\x00P<n>\x00` tokens through ALL markdown passes (list/heading/table regexes), restored at the very end. Previously, diff hunks and markdown headings inside code blocks triggered those regexes, injecting `<ul>/<li>/<h>` tags that broke `</pre>` closure.
Thinking Path
Problem: When the LLM produces large tool outputs containing fenced code blocks (especially
difforbashblocks), the WebUI rendering corrupts subsequent messages. Diff hunks (-/+lines) are misinterpreted as markdown list items, injecting<ul><li>tags inside<pre><code>blocks, which breaks the</pre>closing tag and corrupts all following message content.Root cause: In
renderMd(), fenced code blocks were rendered to<pre><code>HTML early — right after the fence stash restore — then exposed to subsequent markdown passes (lists, headings, tables, etc.). The list regex^(?: )?[-*+] .+(multiline flag) would match lines inside<pre><code>blocks that happen to start with-or+(diff hunks, bash prompts), injecting HTML inside the code block.Fix: Split the fence stash into two phases:
\x00F): transformed to<code>tags in the stash callback, restored before bold/italic passes (unchanged behavior)\x00P): transformed to<pre><code>HTML inside the stash callback, kept stashed until after ALL markdown passes, then restored just before_pre_stashThis ensures diff/patch/list content inside fenced code blocks is never misinterpreted as markdown, regardless of code block size or content.
Compatible with upstream's
rawPreStash(\x00R) which protects existing<pre>blocks from streaming HTML.What Changed
static/ui.js: Split fence stash into fenced blocks (\x00P) and inline backticks (\x00F); fenced blocks stay stashed through all markdown passestests/test_issue1154_fenced_code_leak.py: 14 new regression tests covering diff blocks, bash prompts, list markers, inline backticks, and headings after fenced blockstests/test_issue347.py: Updated static analysis to match refactored stash code structuretests/test_issue_code_syntax_highlight.py: Updated static analysis for renamed variables in stash callbackAll 2667 tests pass (1 known upstream failure:
test_sprint31)Fixes #1154