Skip to content

fix #1154: prevent fenced code content from leaking into markdown passes#1190

Closed
bergeouss wants to merge 1 commit intonesquena:masterfrom
bergeouss:fix/issue-1154-fenced-code-leak
Closed

fix #1154: prevent fenced code content from leaking into markdown passes#1190
bergeouss wants to merge 1 commit intonesquena:masterfrom
bergeouss:fix/issue-1154-fenced-code-leak

Conversation

@bergeouss
Copy link
Copy Markdown
Contributor

Thinking Path

Problem: When the LLM produces large tool outputs containing fenced code blocks (especially diff or bash blocks), the WebUI rendering corrupts subsequent messages. Diff hunks (- / + lines) are misinterpreted as markdown list items, injecting <ul><li> tags inside <pre><code> blocks, which breaks the </pre> closing tag and corrupts all following message content.

Root cause: In renderMd(), fenced code blocks were rendered to <pre><code> HTML early — right after the fence stash restore — then exposed to subsequent markdown passes (lists, headings, tables, etc.). The list regex ^(?: )?[-*+] .+ (multiline flag) would match lines inside <pre><code> blocks that happen to start with - or + (diff hunks, bash prompts), injecting HTML inside the code block.

Fix: Split the fence stash into two phases:

  1. Inline backticks (\x00F): transformed to <code> tags in the stash callback, restored before bold/italic passes (unchanged behavior)
  2. Fenced blocks (\x00P): transformed to <pre><code> HTML inside the stash callback, kept stashed until after ALL markdown passes, then restored just before _pre_stash

This ensures diff/patch/list content inside fenced code blocks is never misinterpreted as markdown, regardless of code block size or content.

Compatible with upstream's rawPreStash (\x00R) which protects existing <pre> blocks from streaming HTML.

What Changed

  • static/ui.js: Split fence stash into fenced blocks (\x00P) and inline backticks (\x00F); fenced blocks stay stashed through all markdown passes
  • tests/test_issue1154_fenced_code_leak.py: 14 new regression tests covering diff blocks, bash prompts, list markers, inline backticks, and headings after fenced blocks
  • tests/test_issue347.py: Updated static analysis to match refactored stash code structure
  • tests/test_issue_code_syntax_highlight.py: Updated static analysis for renamed variables in stash callback

All 2667 tests pass (1 known upstream failure: test_sprint31)

Fixes #1154

…down passes

Root cause: In renderMd(), fenced code blocks were rendered to
<pre><code> HTML early (right after the fence stash restore), then
exposed to subsequent markdown passes (lists, headings, tables, etc.).

When a fenced block contained list-like markers (e.g. diff hunks with
- removed lines, + added lines, or bash prompts starting with -), the
list regex would inject <ul><li> tags INSIDE the <pre><code> block,
breaking the </pre> closing tag and corrupting all subsequent message
rendering.

Fix: Split the fence stash into two phases:
- Inline backticks (\x00F): restored before bold/italic (unchanged behavior)
- Fenced blocks (\x00P): rendered to <pre><code> inside the stash
  callback, kept stashed until AFTER all markdown passes, then restored
  just before _pre_stash. This ensures diff/patch/list content inside
  code blocks is never misinterpreted as markdown.

Also preserves compatibility with upstream's rawPreStash (\x00R) which
protects existing <pre> blocks from streaming HTML.

Tests: 14 new tests covering diff blocks, bash prompts, list markers,
inline backticks, and headings after fenced blocks. All 2667 tests pass.

Fixes nesquena#1154
Copy link
Copy Markdown
Owner

@nesquena nesquena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — end-to-end ✅ (clean approve)

What this ships

@bergeouss's fix for #1154 — large tool outputs containing fenced diff/patch/log content corrupting subsequent message rendering. Stray </pre>, <ul><li> injected inside <pre><code>, multiple copy buttons, malformed layout.

The bug

renderMd() rendered fenced code blocks to <pre><code> HTML EARLY in the pipeline (right after the fence stash restore), then exposed that HTML to the list/heading/table regexes. Lines inside code blocks that look like markdown — diff hunks (- removed, + added), bash globs (* foo), markdown headings (# title) — got matched by those regexes and had <ul>/<li>/<h1> injected inside <pre>. The injected tags broke the </pre> closure, the SAFE_TAGS escape pass turned the malformed </pre> into &lt;/pre&gt;, and the rest of the message rendered inside a broken block.

The fix

Split the fence stash into two phases (static/ui.js:825-841):

  1. Fenced blocks (\x00P): converted to <pre><code> HTML inside the stash callback (content escaped via esc()), then kept stashed as \x00P<n>\x00 tokens through ALL markdown passes. Restored at ui.js:1002, AFTER lists/headings/tables/links — so those regexes can never see the rendered HTML.
  2. Inline backticks (\x00F): converted to <code> tags in the callback, restored before bold/italic so ** `code` ** still works.

Mermaid handling moved from a separate post-pass into the fence stash callback (detects mermaid language tag, emits <div class="mermaid-block"> instead of <pre>).

Traced against upstream hermes-agent

Pure WebUI renderer change. The agent emits text/markdown, never raw HTML — so the rendering pipeline is downstream of the agent. Zero cross-tool coupling.

Bug-confirmed-on-master harness

Built a behavioural harness covering the exact issue scenarios. 8/10 pass on master, 10/10 pass on the PR. The two master failures show the bug shape directly:

FAIL [Diff -line with leading space inside code]:
  out: <pre><code class="language-diff">- removed line<br><ul><li>added line</code>&lt;/pre&gt;</li></ul>
                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

FAIL [Heading inside fenced block stays literal]:
  out: <pre><code class="language-markdown"># fake heading<br><h2>also fake</code>&lt;/pre&gt;</h2>
                                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Exactly the issue #1154 symptom: "stray literal </pre> rendered in the page" — the SAFE_TAGS pass escapes the now-malformed closing tag because <ul> was injected into the <pre>.

Post-fix output is correct: <pre><code>- removed line\n+ added line</code></pre> with no <ul>/<li> and balanced <pre>/</pre> count.

End-to-end trace

Token namespace check

Existing tokens: \x00F (inline code, repurposed), \x00M (math), \x00R (raw <pre>), \x00C (inline code stash inside inlineMd), \x00G (img), \x00L (link), \x00O (outer <code>), \x00A (anchor), \x00B (block-element-line), \x00E (rendered pre stash for paragraph splitting). New: \x00P (rendered fenced block, late-restored). No collision. ✅

Stash/restore ordering

# Op Status
1 \x00P stash (fenced → rendered HTML) new
2 \x00F stash (inline `<code>) new (was raw text)
3 \x00M stash (math) unchanged
4 \x00R stash (raw <pre>) unchanged (#1172)
5 HTML→md rewrites (<strong> etc.) unchanged
6 \x00R restore unchanged
7 \x00F restore (<code> ready for bold/italic) new — correct ordering
8 bold/italic, headings, lists, tables, autolinks unchanged
9 \x00M restore (katex spans) unchanged
10 \x00P restore (fenced HTML) ← new placed AFTER all md passes
11 \x00E stash (pre/mermaid/katex blocks for paragraph protection) unchanged

The critical move is step 10: fenced HTML is reintroduced AFTER lists/headings/tables run. The downstream \x00E stash at line 1008 then captures the now-restored <pre> blocks before paragraph splitting (so newlines inside code don't get <br>'d) — that pipeline is unchanged.

Mermaid integration

Pre-fix: separate post-pass s.replace(/```mermaid\n?(.*?)```/g, ...) after fence restore.

Post-fix: detected inside the fence stash callback via m[1].trim().toLowerCase() === 'mermaid'. Emits <div class="mermaid-block" data-mermaid-id="...">...</div> directly into _preBlock_stash. The id is generated the same way (Math.random().toString(36).slice(2,10)).

The downstream \x00E stash at line 1008 already includes mermaid via the regex <div class="(mermaid-block|katex-block)".... So mermaid blocks survive paragraph splitting just like before. ✅

Edge-case trace

Scenario Expected Actual
Diff block with +/- lines content literal, no <ul> inside <pre> ✅ harness 1
Bash block with * foo lines no list interpretation ✅ PR test
Markdown block with # heading literal #, no <h1> ✅ harness 10
Text block with **not bold** literal **, no <strong> ✅ harness 9
Empty fenced block ```\n``` empty <pre> ✅ harness 2
Fenced at very start (no blank line) renders correctly ✅ harness 3
Two adjacent fenced blocks both render, count = 2 ✅ harness 4
Mermaid block <div class="mermaid-block"> ✅ harness 8
Inline `**not bold**` outside fence <code> wraps literal ✅ harness 5
Fenced block containing <pre> text text escaped, balanced ✅ harness 7
Heading after fenced block <h1> renders normally ✅ harness 1
List after fenced block <ul> renders normally ✅ harness 1
<pre>/</pre> count balanced always ✅ all 10 cases

Tests

  • PR's 14 new tests in tests/test_issue1154_fenced_code_leak.py: pass.
  • PR's targeted updates in test_issue347.py (variable-rename: fence_stash.push(m)fence_stash.push() and test_issue_code_syntax_highlight.py (normalizedLanglang): both correct, same intent preserved.
  • Local full suite: 2651 passed, 47 skipped, 1 PR-unrelated pre-existing failure (test_sprint3.py::test_workspace_add_rejects_system_paths — the macOS-only quirk that PR #1186 fixes; this PR is on top of master before #1186 lands).
  • CI on PR: ✅ test (3.11), ✅ test (3.12), ✅ test (3.13).
  • My behavioural harness: 10/10 on PR, 8/10 on master (2 failures match issue symptoms exactly).
  • All 25 test_renderer_js_behaviour tests: pass — the existing renderer test suite that locks blockquote/list/heading/etc. invariants still passes, confirming no regression in the core rendering paths.

Other audit — confirmed correct

  • JS syntax: node --check passes.
  • Language detection: new regex ^(\w[\w+-]*)\n?([\s\S]*)$ — same coverage as the original [\w+-]*\n? (zero-or-more word chars + optional newline). Empty-language case handled at lines 833-834 (m may be null → lang='', code=raw.replace(/^\n?/,'')). ✅
  • esc() on code content: still escapes <, >, &, ", ' via the same esc() function — no XSS surface change.
  • Compatible with rawPreStash (#1172): the rawPreStash captures pre-existing <pre> HTML in the input; this PR's fence stash captures markdown fenced blocks. Different sources, different tokens (\x00R vs \x00P), no overlap.
  • Inline backtick semantics: new pass s.replace(/\([^\`\n]+)`/g, ...)` is identical regex to the inline part of the original combined pattern. Same behaviour. ✅

Minor observations (non-blocking)

  • The new \x00P stash adds one more token to an already token-heavy renderer. The stash family (F, M, R, P, C, G, L, O, A, B, E) is now 11 tokens — worth a future cleanup pass that consolidates them, but the current per-token comment + restore order is well-documented and traceable.
  • The mermaid id generator (Math.random().toString(36).slice(2,10)) is the same as before, so existing mermaid post-processing keyed on that id format keeps working.
  • The PR description's "All 2667 tests pass" — local count for me was 2651 passed (different test selection because of macOS skips). Same effective signal.

Recommendation

Approved. Principled fix at the right layer: the rendering pipeline now keeps fenced HTML stashed past every dangerous markdown pass, eliminating the entire class of "code block content matches markdown regex and corrupts the page" bugs. Behavioural harness directly confirms the master bug and the post-fix correctness. CI green on 3.11/3.12/3.13. Existing 71 renderer tests still pass. Parked at approval — ready for the release agent's merge/tag pipeline.

nesquena-hermes added a commit that referenced this pull request Apr 28, 2026
…code leak (#1194)

Batch release v0.50.231 — 3 fixes.

## PRs included

| PR | Author | Fix |
|---|---|---|
| #1186 | @nesquena (Claude Code) | macOS `/etc` symlink bypass in workspace blocked-roots |
| #1187 | @nesquena-hermes | Workspace panel stuck closed after empty-session reload |
| #1190 | @bergeouss | Fenced code content leaking into markdown passes (#1154) |

All three PRs were independently reviewed and approved by @nesquena.

## Test results

**2729 passed, 2 skipped** (2 macOS-only tests correctly skipped on Linux). Browser QA: **21/21**.

## Key fix notes

**#1186:** `_workspace_blocked_roots()` now returns both literal and `Path.resolve()` forms of each blocked root. macOS symlinks (`/etc → /private/etc`) previously let a resolved candidate slip past the literal check. New `_is_blocked_system_path()` helper with `/var/folders` and `/var/tmp` carve-outs for pytest temp dirs.

**#1187:** Regression from #1182 — `syncWorkspacePanelState()` force-closed on any no-session state. Now only closes in `'preview'` mode. Both boot paths restore localStorage panel pref before sync.

**#1190:** Fenced code blocks are now stashed as `\x00P<n>\x00` tokens through ALL markdown passes (list/heading/table regexes), restored at the very end. Previously, diff hunks and markdown headings inside code blocks triggered those regexes, injecting `<ul>/<li>/<h>` tags that broke `</pre>` closure.
@nesquena-hermes
Copy link
Copy Markdown
Collaborator

Merged as v0.50.231 via #1194. Thank you @bergeouss!

JKJameson pushed a commit to JKJameson/hermes-webui that referenced this pull request Apr 29, 2026
…code leak (nesquena#1194)

Batch release v0.50.231 — 3 fixes.

## PRs included

| PR | Author | Fix |
|---|---|---|
| nesquena#1186 | @nesquena (Claude Code) | macOS `/etc` symlink bypass in workspace blocked-roots |
| nesquena#1187 | @nesquena-hermes | Workspace panel stuck closed after empty-session reload |
| nesquena#1190 | @bergeouss | Fenced code content leaking into markdown passes (nesquena#1154) |

All three PRs were independently reviewed and approved by @nesquena.

## Test results

**2729 passed, 2 skipped** (2 macOS-only tests correctly skipped on Linux). Browser QA: **21/21**.

## Key fix notes

**nesquena#1186:** `_workspace_blocked_roots()` now returns both literal and `Path.resolve()` forms of each blocked root. macOS symlinks (`/etc → /private/etc`) previously let a resolved candidate slip past the literal check. New `_is_blocked_system_path()` helper with `/var/folders` and `/var/tmp` carve-outs for pytest temp dirs.

**nesquena#1187:** Regression from nesquena#1182 — `syncWorkspacePanelState()` force-closed on any no-session state. Now only closes in `'preview'` mode. Both boot paths restore localStorage panel pref before sync.

**nesquena#1190:** Fenced code blocks are now stashed as `\x00P<n>\x00` tokens through ALL markdown passes (list/heading/table regexes), restored at the very end. Previously, diff hunks and markdown headings inside code blocks triggered those regexes, injecting `<ul>/<li>/<h>` tags that broke `</pre>` closure.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Transcript rendering breaks after large tool outputs + diff/log content, showing stray </pre> and corrupting later layout

3 participants