fix #1154: prevent fenced code content from leaking into markdown passes by bergeouss · Pull Request #1190 · nesquena/hermes-webui

bergeouss · 2026-04-28T00:21:45Z

Thinking Path

Problem: When the LLM produces large tool outputs containing fenced code blocks (especially diff or bash blocks), the WebUI rendering corrupts subsequent messages. Diff hunks (- / + lines) are misinterpreted as markdown list items, injecting <ul><li> tags inside <pre><code> blocks, which breaks the </pre> closing tag and corrupts all following message content.

Root cause: In renderMd(), fenced code blocks were rendered to <pre><code> HTML early — right after the fence stash restore — then exposed to subsequent markdown passes (lists, headings, tables, etc.). The list regex ^(?: )?[-*+] .+ (multiline flag) would match lines inside <pre><code> blocks that happen to start with - or + (diff hunks, bash prompts), injecting HTML inside the code block.

Fix: Split the fence stash into two phases:

Inline backticks (\x00F): transformed to <code> tags in the stash callback, restored before bold/italic passes (unchanged behavior)
Fenced blocks (\x00P): transformed to <pre><code> HTML inside the stash callback, kept stashed until after ALL markdown passes, then restored just before _pre_stash

This ensures diff/patch/list content inside fenced code blocks is never misinterpreted as markdown, regardless of code block size or content.

Compatible with upstream's rawPreStash (\x00R) which protects existing <pre> blocks from streaming HTML.

What Changed

static/ui.js: Split fence stash into fenced blocks (\x00P) and inline backticks (\x00F); fenced blocks stay stashed through all markdown passes
tests/test_issue1154_fenced_code_leak.py: 14 new regression tests covering diff blocks, bash prompts, list markers, inline backticks, and headings after fenced blocks
tests/test_issue347.py: Updated static analysis to match refactored stash code structure
tests/test_issue_code_syntax_highlight.py: Updated static analysis for renamed variables in stash callback

All 2667 tests pass (1 known upstream failure: test_sprint31)

Fixes #1154

…down passes Root cause: In renderMd(), fenced code blocks were rendered to <pre><code> HTML early (right after the fence stash restore), then exposed to subsequent markdown passes (lists, headings, tables, etc.). When a fenced block contained list-like markers (e.g. diff hunks with - removed lines, + added lines, or bash prompts starting with -), the list regex would inject <ul><li> tags INSIDE the <pre><code> block, breaking the </pre> closing tag and corrupting all subsequent message rendering. Fix: Split the fence stash into two phases: - Inline backticks (\x00F): restored before bold/italic (unchanged behavior) - Fenced blocks (\x00P): rendered to <pre><code> inside the stash callback, kept stashed until AFTER all markdown passes, then restored just before _pre_stash. This ensures diff/patch/list content inside code blocks is never misinterpreted as markdown. Also preserves compatibility with upstream's rawPreStash (\x00R) which protects existing <pre> blocks from streaming HTML. Tests: 14 new tests covering diff blocks, bash prompts, list markers, inline backticks, and headings after fenced blocks. All 2667 tests pass. Fixes nesquena#1154

nesquena

Review — end-to-end ✅ (clean approve)

What this ships

@bergeouss's fix for #1154 — large tool outputs containing fenced diff/patch/log content corrupting subsequent message rendering. Stray </pre>, <ul><li> injected inside <pre><code>, multiple copy buttons, malformed layout.

The bug

renderMd() rendered fenced code blocks to <pre><code> HTML EARLY in the pipeline (right after the fence stash restore), then exposed that HTML to the list/heading/table regexes. Lines inside code blocks that look like markdown — diff hunks (- removed, + added), bash globs (* foo), markdown headings (# title) — got matched by those regexes and had <ul>/<li>/<h1> injected inside <pre>. The injected tags broke the </pre> closure, the SAFE_TAGS escape pass turned the malformed </pre> into </pre>, and the rest of the message rendered inside a broken block.

The fix

Split the fence stash into two phases (static/ui.js:825-841):

Fenced blocks (\x00P): converted to <pre><code> HTML inside the stash callback (content escaped via esc()), then kept stashed as \x00P<n>\x00 tokens through ALL markdown passes. Restored at ui.js:1002, AFTER lists/headings/tables/links — so those regexes can never see the rendered HTML.
Inline backticks (\x00F): converted to <code> tags in the callback, restored before bold/italic so ** `code` ** still works.

Mermaid handling moved from a separate post-pass into the fence stash callback (detects mermaid language tag, emits <div class="mermaid-block"> instead of <pre>).

Traced against upstream hermes-agent

Pure WebUI renderer change. The agent emits text/markdown, never raw HTML — so the rendering pipeline is downstream of the agent. Zero cross-tool coupling.

Bug-confirmed-on-master harness

Built a behavioural harness covering the exact issue scenarios. 8/10 pass on master, 10/10 pass on the PR. The two master failures show the bug shape directly:

FAIL [Diff -line with leading space inside code]:
  out: <pre><code class="language-diff">- removed line<br><ul><li>added line</code>&lt;/pre&gt;</li></ul>
                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

FAIL [Heading inside fenced block stays literal]:
  out: <pre><code class="language-markdown"># fake heading<br><h2>also fake</code>&lt;/pre&gt;</h2>
                                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Exactly the issue #1154 symptom: "stray literal </pre> rendered in the page" — the SAFE_TAGS pass escapes the now-malformed closing tag because <ul> was injected into the <pre>.

Post-fix output is correct: <pre><code>- removed line\n+ added line</code></pre> with no <ul>/<li> and balanced <pre>/</pre> count.

End-to-end trace

Token namespace check

Existing tokens: \x00F (inline code, repurposed), \x00M (math), \x00R (raw <pre>), \x00C (inline code stash inside inlineMd), \x00G (img), \x00L (link), \x00O (outer <code>), \x00A (anchor), \x00B (block-element-line), \x00E (rendered pre stash for paragraph splitting). New: \x00P (rendered fenced block, late-restored). No collision. ✅

Stash/restore ordering

#	Op	Status
1	`\x00P` stash (fenced → rendered HTML)	new
2	`\x00F` stash (inline `→ `<code>`)	new (was raw text)
3	`\x00M` stash (math)	unchanged
4	`\x00R` stash (raw `<pre>`)	unchanged (#1172)
5	HTML→md rewrites (`<strong>` etc.)	unchanged
6	`\x00R` restore	unchanged
7	`\x00F` restore (`<code>` ready for bold/italic)	new — correct ordering
8	bold/italic, headings, lists, tables, autolinks	unchanged
9	`\x00M` restore (katex spans)	unchanged
10	`\x00P` restore (fenced HTML) ← new	placed AFTER all md passes
11	`\x00E` stash (pre/mermaid/katex blocks for paragraph protection)	unchanged

The critical move is step 10: fenced HTML is reintroduced AFTER lists/headings/tables run. The downstream \x00E stash at line 1008 then captures the now-restored <pre> blocks before paragraph splitting (so newlines inside code don't get <br>'d) — that pipeline is unchanged.

Mermaid integration

Pre-fix: separate post-pass s.replace(/```mermaid\n?(.*?)```/g, ...) after fence restore.

Post-fix: detected inside the fence stash callback via m[1].trim().toLowerCase() === 'mermaid'. Emits <div class="mermaid-block" data-mermaid-id="...">...</div> directly into _preBlock_stash. The id is generated the same way (Math.random().toString(36).slice(2,10)).

The downstream \x00E stash at line 1008 already includes mermaid via the regex <div class="(mermaid-block|katex-block)".... So mermaid blocks survive paragraph splitting just like before. ✅

Edge-case trace

Scenario	Expected	Actual
Diff block with `+`/`-` lines	content literal, no `<ul>` inside `<pre>`	✅ harness 1
Bash block with `* foo` lines	no list interpretation	✅ PR test
Markdown block with `# heading`	literal `#`, no `<h1>`	✅ harness 10
Text block with `not bold`	literal `**`, no `<strong>`	✅ harness 9
Empty fenced block ```\n```	empty `<pre>`	✅ harness 2
Fenced at very start (no blank line)	renders correctly	✅ harness 3
Two adjacent fenced blocks	both render, count = 2	✅ harness 4
Mermaid block	`<div class="mermaid-block">`	✅ harness 8
Inline `not bold` outside fence	`<code>` wraps literal	✅ harness 5
Fenced block containing `<pre>` text	text escaped, balanced	✅ harness 7
Heading after fenced block	`<h1>` renders normally	✅ harness 1
List after fenced block	`<ul>` renders normally	✅ harness 1
`<pre>`/`</pre>` count balanced	always	✅ all 10 cases

Tests

PR's 14 new tests in tests/test_issue1154_fenced_code_leak.py: pass.
PR's targeted updates in test_issue347.py (variable-rename: fence_stash.push(m) → fence_stash.push() and test_issue_code_syntax_highlight.py (normalizedLang → lang): both correct, same intent preserved.
Local full suite: 2651 passed, 47 skipped, 1 PR-unrelated pre-existing failure (test_sprint3.py::test_workspace_add_rejects_system_paths — the macOS-only quirk that PR #1186 fixes; this PR is on top of master before #1186 lands).
CI on PR: ✅ test (3.11), ✅ test (3.12), ✅ test (3.13).
My behavioural harness: 10/10 on PR, 8/10 on master (2 failures match issue symptoms exactly).
All 25 test_renderer_js_behaviour tests: pass — the existing renderer test suite that locks blockquote/list/heading/etc. invariants still passes, confirming no regression in the core rendering paths.

Other audit — confirmed correct

JS syntax: node --check passes.
Language detection: new regex ^(\w[\w+-]*)\n?([\s\S]*)$ — same coverage as the original [\w+-]*\n? (zero-or-more word chars + optional newline). Empty-language case handled at lines 833-834 (m may be null → lang='', code=raw.replace(/^\n?/,'')). ✅
esc() on code content: still escapes <, >, &, ", ' via the same esc() function — no XSS surface change.
Compatible with rawPreStash (#1172): the rawPreStash captures pre-existing <pre> HTML in the input; this PR's fence stash captures markdown fenced blocks. Different sources, different tokens (\x00R vs \x00P), no overlap.
Inline backtick semantics: new pass s.replace(/\([^\`\n]+)`/g, ...)` is identical regex to the inline part of the original combined pattern. Same behaviour. ✅

Minor observations (non-blocking)

The new \x00P stash adds one more token to an already token-heavy renderer. The stash family (F, M, R, P, C, G, L, O, A, B, E) is now 11 tokens — worth a future cleanup pass that consolidates them, but the current per-token comment + restore order is well-documented and traceable.
The mermaid id generator (Math.random().toString(36).slice(2,10)) is the same as before, so existing mermaid post-processing keyed on that id format keeps working.
The PR description's "All 2667 tests pass" — local count for me was 2651 passed (different test selection because of macOS skips). Same effective signal.

Recommendation

Approved. Principled fix at the right layer: the rendering pipeline now keeps fenced HTML stashed past every dangerous markdown pass, eliminating the entire class of "code block content matches markdown regex and corrupts the page" bugs. Behavioural harness directly confirms the master bug and the post-fix correctness. CI green on 3.11/3.12/3.13. Existing 71 renderer tests still pass. Parked at approval — ready for the release agent's merge/tag pipeline.

@nesquena

…code leak (#1194) Batch release v0.50.231 — 3 fixes. ## PRs included | PR | Author | Fix | |---|---|---| | #1186 | @nesquena (Claude Code) | macOS `/etc` symlink bypass in workspace blocked-roots | | #1187 | @nesquena-hermes | Workspace panel stuck closed after empty-session reload | | #1190 | @bergeouss | Fenced code content leaking into markdown passes (#1154) | All three PRs were independently reviewed and approved by @nesquena. ## Test results **2729 passed, 2 skipped** (2 macOS-only tests correctly skipped on Linux). Browser QA: **21/21**. ## Key fix notes **#1186:** `_workspace_blocked_roots()` now returns both literal and `Path.resolve()` forms of each blocked root. macOS symlinks (`/etc → /private/etc`) previously let a resolved candidate slip past the literal check. New `_is_blocked_system_path()` helper with `/var/folders` and `/var/tmp` carve-outs for pytest temp dirs. **#1187:** Regression from #1182 — `syncWorkspacePanelState()` force-closed on any no-session state. Now only closes in `'preview'` mode. Both boot paths restore localStorage panel pref before sync. **#1190:** Fenced code blocks are now stashed as `\x00P<n>\x00` tokens through ALL markdown passes (list/heading/table regexes), restored at the very end. Previously, diff hunks and markdown headings inside code blocks triggered those regexes, injecting `<ul>/<li>/<h>` tags that broke `</pre>` closure.

nesquena-hermes · 2026-04-28T00:44:10Z

Merged as v0.50.231 via #1194. Thank you @bergeouss!

@nesquena

…code leak (nesquena#1194) Batch release v0.50.231 — 3 fixes. ## PRs included | PR | Author | Fix | |---|---|---| | nesquena#1186 | @nesquena (Claude Code) | macOS `/etc` symlink bypass in workspace blocked-roots | | nesquena#1187 | @nesquena-hermes | Workspace panel stuck closed after empty-session reload | | nesquena#1190 | @bergeouss | Fenced code content leaking into markdown passes (nesquena#1154) | All three PRs were independently reviewed and approved by @nesquena. ## Test results **2729 passed, 2 skipped** (2 macOS-only tests correctly skipped on Linux). Browser QA: **21/21**. ## Key fix notes **nesquena#1186:** `_workspace_blocked_roots()` now returns both literal and `Path.resolve()` forms of each blocked root. macOS symlinks (`/etc → /private/etc`) previously let a resolved candidate slip past the literal check. New `_is_blocked_system_path()` helper with `/var/folders` and `/var/tmp` carve-outs for pytest temp dirs. **nesquena#1187:** Regression from nesquena#1182 — `syncWorkspacePanelState()` force-closed on any no-session state. Now only closes in `'preview'` mode. Both boot paths restore localStorage panel pref before sync. **nesquena#1190:** Fenced code blocks are now stashed as `\x00P<n>\x00` tokens through ALL markdown passes (list/heading/table regexes), restored at the very end. Previously, diff hunks and markdown headings inside code blocks triggered those regexes, injecting `<ul>/<li>/<h>` tags that broke `</pre>` closure.

nesquena approved these changes Apr 28, 2026

View reviewed changes

nesquena-hermes mentioned this pull request Apr 28, 2026

fix: batch v0.50.231 — macOS symlink bypass, workspace panel, fenced code leak #1194

Merged

nesquena-hermes closed this Apr 28, 2026

nesquena-hermes mentioned this pull request Apr 28, 2026

Transcript rendering breaks after large tool outputs + diff/log content, showing stray </pre> and corrupting later layout #1154

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix #1154: prevent fenced code content from leaking into markdown passes#1190

fix #1154: prevent fenced code content from leaking into markdown passes#1190
bergeouss wants to merge 1 commit intonesquena:masterfrom
bergeouss:fix/issue-1154-fenced-code-leak

bergeouss commented Apr 28, 2026

Uh oh!

nesquena left a comment

Uh oh!

nesquena-hermes commented Apr 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

bergeouss commented Apr 28, 2026

Thinking Path

What Changed

Uh oh!

nesquena left a comment

Choose a reason for hiding this comment

Review — end-to-end ✅ (clean approve)

What this ships

The bug

The fix

Traced against upstream hermes-agent

Bug-confirmed-on-master harness

End-to-end trace

Token namespace check

Stash/restore ordering

Mermaid integration

Edge-case trace

Tests

Other audit — confirmed correct

Minor observations (non-blocking)

Recommendation

Uh oh!

nesquena-hermes commented Apr 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants