Skip to content

fix(#1095): image attachment thumbnails in composer + working inline rendering in chat#1135

Closed
nesquena-hermes wants to merge 4 commits intomasterfrom
fix/1095-image-attachment-preview
Closed

fix(#1095): image attachment thumbnails in composer + working inline rendering in chat#1135
nesquena-hermes wants to merge 4 commits intomasterfrom
fix/1095-image-attachment-preview

Conversation

@nesquena-hermes
Copy link
Copy Markdown
Collaborator

Summary

Fixes both open bugs in #1095 — the original PR #1109 only fixed the sent-message rendering path; the composer tray and the image URL were both still broken.

Bug 1 — Composer tray: images showed as paperclip chips instead of thumbnails

renderTray() unconditionally rendered every pending file as a paperclip pill. Since the File object is available before upload, we can generate a blob URL with URL.createObjectURL(f) immediately — no backend roundtrip needed.

Fix: renderTray() now checks _IMAGE_EXTS on each file. Image files get an .attach-chip--image chip with a 56×56px object-fit:cover thumbnail. Blob URL is revoked on remove to avoid memory leaks. Non-image files keep the paperclip chip.

Bug 2 — Chat history: uploaded images rendered as broken <img>

m.attachments stores just the filename (['image.png']), not a full path. The previous code built api/media?path=image.png — but api/media requires a full absolute filesystem path (e.g. /home/hermes/.hermes/webui/sessions/.../image.png). Feeding just a filename returns 404, hence the broken image placeholder.

Fix: Use api/file/raw?session_id=SID&path=FILENAME instead. This endpoint resolves the filename relative to the session's workspace, which is exactly where api/upload stores the file. Confirmed via API test: api/file/raw returns 200; api/media with just a filename returns 404.

Verification

  • API confirmed: api/file/raw?session_id=SID&path=test-upload.png → 200
  • API confirmed: api/media?path=test-upload.png (filename only) → 404 (the old bug)
  • 14 new tests cover both code paths (8 for composer tray, 6 for chat history)
  • Full suite: 2588 passed, 0 failed
  • No backend changes — pure frontend (static/ui.js + static/style.css)

Thinking Path

  • renderTray() has access to the raw File object before upload → URL.createObjectURL() gives a free thumbnail, no API needed
  • m.attachments stores filenames not paths → api/media can never work here → api/file/raw is the right endpoint (workspace-relative resolution)
  • Blob URL revocation is mandatory to avoid accumulating dead object URLs in memory

Why It Matters

Images are the most common attachment type. Users dragging a screenshot into the composer see a filename instead of what they're about to send — and after sending, they see a broken image icon instead of their screenshot. Both are regressions from the expected behavior.

Risks / Follow-ups

  • api/file/raw requires auth when auth is enabled (same as all other workspace endpoints) — consistent behavior
  • Thumbnails in the tray are 56×56px; large images are cropped not stretched (object-fit:cover) — no layout blowout
  • Non-image files unchanged throughout

Model Used

Claude Sonnet (via Hermes agent) — assisted with root cause analysis and fix implementation.

Bug 1 — Composer tray: renderTray() now checks _IMAGE_EXTS on each pending
File object. Image files get a thumbnail preview (URL.createObjectURL) in an
.attach-chip--image chip instead of the generic paperclip pill. Blob URLs are
revoked on removal to avoid memory leaks. Non-image files keep the existing
paperclip chip.

Bug 2 — Chat history: uploaded images were rendered with api/media?path=
using just the filename, which api/media cannot resolve (it needs a full
absolute path). Switched to api/file/raw?session_id=SID&path=FILENAME which
resolves the filename relative to the session's workspace — exactly where
api/upload stores the file. Broken-image placeholder is now a working thumbnail.

Both changes are in static/ui.js and static/style.css only — no backend changes.
14 new tests cover both bugs. Full suite: 2588 passed.
The image-thumbnail PR removed the
  if(!S.pendingFiles.length||!S.session)return[];
guard at the top of uploadPendingFiles. This was an accidental deletion
mixed into the renderTray refactor diff -- the PR description does not
mention removing this guard.

Without it, calling uploadPendingFiles when S.session is null
(e.g. during a transient state between session loads, error recovery,
or a broken upload flow) throws
  TypeError: Cannot read properties of null (reading 'session_id')
at the line
  fd.append('session_id', S.session.session_id);

Restoring the one-line guard is the minimum fix. The guard was at
static/ui.js line 3246 in master before this PR.

Found via PR-review trace:
  git show origin/master:static/ui.js | sed -n '3245,3260p'

No other behaviour change. JS syntax clean. Full suite: 2540 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Copy link
Copy Markdown
Owner

@nesquena nesquena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — end-to-end ✅ (approved after fix pushed)

What this ships

Targeted follow-up to PR #1109 / #1117 — fixes the two remaining bugs in #1095:

  1. Composer tray now shows image thumbnails instead of paperclip chips. renderTray() checks _IMAGE_EXTS per file, builds a URL.createObjectURL(f) blob URL, renders an <img class="attach-thumb"> chip, and revokes the blob on remove. Non-image files keep the paperclip chip. (static/ui.js:3239-3257, static/style.css:644-647)
  2. Chat history images load correctly — switched from api/media?path=FILENAME (404, expects absolute filesystem path) to api/file/raw?session_id=SID&path=FILENAME (resolves filename against the session's workspace, which is exactly where api/upload writes). (static/ui.js:2257-2266)

Traced against upstream hermes-agent

Pulled fresh nousresearch/hermes-agent tarball. Both fixes are pure WebUI-frontend changes (static/ui.js + static/style.css) with no agent-side coupling. The backend api/file/raw handler at api/routes.py:2220-2272 is webui-internal.

What I caught — accidental deletion of uploadPendingFiles early-return guard

The original PR diff removed the line

if(!S.pendingFiles.length||!S.session)return[];

from the top of uploadPendingFiles. Verified against master (git show origin/master:static/ui.js | sed -n '3245,3260p') — the guard was at line 3246 prior to this PR, and the deletion is not mentioned anywhere in the PR description. Mixed into the renderTray refactor diff context, presumably accidental.

Consequence: when uploadPendingFiles is called with S.session === null (transient state during session load, error recovery, or any broken init path), the line fd.append('session_id', S.session.session_id) at static/ui.js:3266 throws TypeError: Cannot read properties of null (reading 'session_id') instead of returning the empty array gracefully.

What I pushed — 8f8b968

Restored the one-line guard at the top of uploadPendingFiles. No other behaviour change. JS syntax clean. Full suite: 2540 passed.

End-to-end trace

api/file/raw endpoint (highest-stakes — change to a file-serving endpoint usage)

Verified at api/routes.py:2220-2272:

  • Requires session_id query parameter (line 2222-2224).
  • Loads session via get_session(sid) — handles missing sessions with 404 (line 2226-2228).
  • Resolves the path parameter via safe_resolve(Path(s.workspace), rel) (line 2231) — path-traversal-safe: safe_resolve rejects relative paths that escape the workspace root.
  • Forces Content-Disposition: attachment for dangerous MIME types (text/html, application/xhtml+xml, image/svg+xml) at line 2246-2256 to prevent XSS via uploaded HTML/SVG.
  • For inline previews via ?inline=1, additionally sets Content-Security-Policy: sandbox allow-scripts at line 2269 — defense-in-depth even if accessed outside the iframe.

The PR's URL construction:

const imgUrl='api/file/raw?session_id='+encodeURIComponent(_attachSid)+'&path='+encodeURIComponent(fname);

Both query params are encodeURIComponentd. The rendered <img src="${esc(imgUrl)}"> then escapes the URL again for HTML attribute context. ✅

Composer tray thumbnail (renderTray)

static/ui.js:3239-3257:

if(_IMAGE_EXTS.test(f.name)){
  const blobUrl=URL.createObjectURL(f);
  chip.className='attach-chip attach-chip--image';
  chip.dataset.blobUrl=blobUrl;
  chip.innerHTML=`<img class="attach-thumb" src="${esc(blobUrl)}" alt="${esc(f.name)}" title="${esc(f.name)}"><button title="${t('remove_title')}">${li('x',12)}</button>`;
} else {
  chip.innerHTML=`${li('paperclip',12)} ${esc(f.name)} <button title="${t('remove_title')}">${li('x',12)}</button>`;
}
chip.querySelector('button').onclick=()=>{
  if(chip.dataset.blobUrl) URL.revokeObjectURL(chip.dataset.blobUrl);
  S.pendingFiles.splice(i,1);renderTray();
};
  • f.name is escaped via esc() for both alt and title. ✅
  • blobUrl (a blob: URI generated by the browser) is esc()d in the src attribute. ✅
  • _IMAGE_EXTS is module-scope at static/ui.js:53, shared with renderMessages and renderMd. Reuse is correct.

XSS audit

  • All user-controlled strings (f.name, fname) flow through esc() before reaching innerHTML. ✅
  • The thumbnail path uses URL.createObjectURL, which produces opaque blob:<origin>/<uuid> URLs — not user-controllable.
  • The chat-history imgUrl parameters use encodeURIComponent AND the result is escaped again. Belt-and-braces.

Other audit — things that are correct already

  • CI green: ✅ test (3.11), ✅ test (3.12), ✅ test (3.13).
  • safe_resolve path-traversal: an attacker submitting ?path=../../../etc/passwd is rejected at the workspace boundary inside _handle_file_raw.
  • AVIF support: _IMAGE_EXTS already includes avif from a prior PR; new tray code inherits it.
  • addFiles deduplication: existing if(!S.pendingFiles.find(p=>p.name===f.name)) prevents duplicate file objects. Combined with the new URL.createObjectURL per-file, dedup also avoids creating duplicate blob URLs for re-dropped files.
  • _IMAGE_EXTS module-scope hoist: was done in v0.50.221 (#1117 absorb). The PR correctly uses it from the new call site.

Edge-case trace

Scenario Expected Actual
Drop image into composer thumbnail chip shows ✅ trace verified
Drop image, click X blob URL revoked, chip removed ✅ revoke called before splice
Drop image, send upload via api/upload, chip cleared on send ✅ (note: blob leak — see below)
Drop non-image file paperclip chip (existing behaviour) ✅ branch preserved
Send and view chat history with uploaded image <img> loads via api/file/raw ✅ endpoint trace verified
Hostile filename with <script> escaped via esc() and encodeURIComponent
Path traversal ?path=../../etc/passwd rejected by safe_resolve ✅ backend trace verified
Very large image (50MB) thumbnail decoded by browser, may slow paint ✅ acceptable (bounded by browser policy)
Session not loaded yet, user drops file uploadPendingFiles returns early (after my fix) ✅ guard restored
Session not loaded yet, user drops file (BEFORE my fix) TypeError on S.session.session_id 🚫 → ✅ after pushed commit

Tests

  • PR's tests (test_issue1095_pasted_images.py): 14 tests across TestComposerTrayThumbnails (8) + TestChatHistoryImageRendering (6). All pass.
  • CI on PR: green on 3.11/3.12/3.13.
  • Local full suite (after my fix): 2540 passed, 47 skipped, 0 PR-related failures.
  • The PR-claimed count of 2588 vs. local 2540 is environment-dependent — onboarding-network integration tests skip on macOS.

Minor observations (non-blocking)

  • Blob URL leak on upload-then-send path: when uploadPendingFiles() resets S.pendingFiles=[] and calls renderTray() (line 3278), the blob URLs that were attached to the now-removed chips are NOT revoked. The browser does GC the underlying File data eventually, but URL.createObjectURL references can accumulate. For a session with many uploads, this leaks. A small fix would be: before clearing pendingFiles, iterate the tray's [data-blob-url] chips and URL.revokeObjectURL() each. Not a blocker — magnitude is small (one per upload), browser GCs the underlying data, and it's bounded by session lifetime. Suggest as a follow-up.
  • The PR diff also removed a blank line between addFiles and uploadPendingFiles — purely cosmetic, harmless.
  • chip.dataset.blobUrl=blobUrl — modern browsers store dataset values as strings; the blob: URL is a string. ✅ no JSON-stringify oddity.

Recommendation

Approved (after fix pushed). Two real bugs in #1095 cleanly fixed, the new api/file/raw endpoint usage is path-traversal-safe and properly escaped, the blob URL handling has correct revocation on the explicit-remove path. The accidental deletion of the uploadPendingFiles guard was caught and restored in commit 8f8b968. No cross-tool agent regression. Parked at approval — ready for the release agent's merge/tag pipeline.

Copy link
Copy Markdown
Owner

@nesquena nesquena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review of new commits ✅ (re-approve, latest state clean)

Two commits landed after my prior approval at 8f8b968:

  • f15406f — feat(ux): small thumbnails in chat + lightbox overlay on click
  • d182deb — fix(tests): update assertions for lightbox

This re-review covers only the delta from 8f8b968..HEAD. The earlier approval scope (composer thumbnails + api/file/raw switch + my pushed guard restore) is unchanged.

What this ships (delta)

  • 120×90 fixed-size chat thumbnails instead of full-bleed max-width:480px — chat rows are tighter, multiple images can sit on one line. (static/style.css:618-619)
  • Lightbox overlay on click — new _openImgLightbox(src, alt) / _closeImgLightbox(lb) pair at static/ui.js:54-81. All 5 image render sites switched from onclick="this.classList.toggle('msg-media-img--full')" (CSS class toggle on the img itself) to onclick="_openImgLightbox(this.src,this.alt)" (full-screen overlay).
  • Three test files updated to assert the new behaviour — test_issue1095_pasted_images.py, test_issues_853_857.py, test_media_inline.py. The .msg-media-img--full class assertions were replaced with _openImgLightbox / .img-lightbox assertions, plus the thumbnail width assertion bumped from max-widthwidth:120px.

Security audit (delta)

Inline onclick handler — no XSS surface

All 5 occurrences write the literal string _openImgLightbox(this.src,this.alt) into the onclick= attribute. Verified by grep:

849:  ... onclick="_openImgLightbox(this.src,this.alt)">`
928:  ... onclick="_openImgLightbox(this.src,this.alt)">`
992:  ... onclick="_openImgLightbox(this.src,this.alt)">`
999:  ... onclick="_openImgLightbox(this.src,this.alt)">`
2296: ... onclick="_openImgLightbox(this.src,this.alt)">`

this.src and this.alt are runtime DOM property reads, not template substitutions. The user-controlled values flow into src= and alt= attributes (both esc()'d in their respective interpolations), and at click time the browser provides their parsed values to the handler. No string concatenation breakout possible.

CSP impact: existing script-src 'unsafe-inline' already permitted inline event handlers, so no new directive is needed. The PR is moving from one inline handler to another — net neutral.

Lightbox internals

static/ui.js:54-81:

function _openImgLightbox(src, alt) {
  const lb = document.createElement('div');
  lb.className = 'img-lightbox';
  lb.setAttribute('role', 'dialog');
  lb.setAttribute('aria-label', alt || 'Image');
  const img = document.createElement('img');
  img.src = src;            // property assignment, not innerHTML
  img.alt = alt || '';      // property assignment, not innerHTML
  img.onclick = e => e.stopPropagation();
  ...
  lb.onclick = () => _closeImgLightbox(lb);
  document.body.appendChild(lb);
  lb._escHandler = e => { if(e.key==='Escape') _closeImgLightbox(lb); };
  document.addEventListener('keydown', lb._escHandler);
}
  • img.src = src and img.alt = alt are property setters, not innerHTML assignments — no HTML parsing path. A javascript: URL in src is not honoured by <img> elements (browsers reject non-fetch schemes). ✅
  • setAttribute('role', 'dialog') and setAttribute('aria-label', alt)setAttribute on these attributes is plain-text. ✅
  • The escape-key handler is stored on lb._escHandler so _closeImgLightbox can removeEventListener the same function reference (avoids the common "anonymous function can't be removed" leak). ✅
  • _closeImgLightbox guards with if(!lb || !lb.parentNode) return; — double-close is safe.

Behavioural harness

11/11 assertions pass with a Node + DOM-shim harness:

Test1 lightbox created: PASS
  role: PASS
  aria-label: PASS
  has img child: PASS
  has close button: PASS
  esc handler registered: PASS
Test2 img click stops propagation: PASS
Test3 escape closes lightbox: PASS
  esc handler removed: PASS
Test4 backdrop click closes: PASS
Test5 no alt → aria-label fallback: PASS
Test6 double-close safe (no throw): PASS

This proves: the dialog is created with proper ARIA, the image-click stops bubbling so the lightbox stays open when interacting with the image, the escape key closes AND deregisters its handler, the backdrop click closes, missing alt falls back to "Image", and double-close is idempotent.

Edge-case trace (delta)

Scenario Expected Actual
Click any chat image lightbox overlay opens ✅ trace verified
Press Escape in lightbox closes, handler removed ✅ harness PASS
Click backdrop closes ✅ harness PASS
Click image inside lightbox stays open (stopPropagation) ✅ harness PASS
Click ✕ close button closes ✅ trace verified
Image with no alt text aria-label="Image" fallback ✅ harness PASS
Open then re-open without close (programmatic) lightboxes stack — last opened wins via Escape ✅ acceptable (not user-reachable since z-index:9999 covers everything)
Double-close guard at if(!lb.parentNode) return; ✅ harness PASS
Animation interrupted by spam-click setTimeout callbacks no-op after first removal ✅ trace verified
User-controlled src / alt containing <script> escaped via esc() in attribute, set via property in lightbox ✅ no XSS path
javascript: URL in src browser rejects (img doesn't honour non-fetch schemes)

Other audit — confirmed correct

  • Animation cleanup: setTimeout(120ms) removes the element after the reverse animation. Multiple clicks of close → multiple setTimeouts, but they all guard via lb.parentNode && before removeChild. ✅
  • Memory: each lightbox has at most one keydown listener registered, removed in _closeImgLightbox. No leak.
  • CSS: new .img-lightbox rule uses position:fixed;inset:0;z-index:9999 — covers the entire viewport, sits above all other UI including modals (other modals use lower z-index per existing code).
  • Existing test compatibility: the old test at test_msg_media_img_full_class_defined was updated to assert .img-lightbox instead of .msg-media-img--full — correct migration.
  • Visual regression: chat images now appear at fixed 120×90 vs. responsive max-width:480px. Some users may notice. The lightbox compensates for inspecting full-size. Pattern matches Slack/Discord image previews.
  • My pushed fix 8f8b968 is still in the PR: verified the if(!S.pendingFiles.length||!S.session)return[]; guard remains at static/ui.js:3260.

Tests (delta)

  • CI on latest commit (d182deb): ✅ test (3.11), ✅ test (3.12), ✅ test (3.13).
  • Local full suite: 2540 passed, 47 skipped, 0 PR-related failures.
  • Three test files migrated to assert lightbox behaviour; assertions are precise (e.g. width:120px in rule, _openImgLightbox in body, .img-lightbox in css).
  • Behavioural harness covers all 6 happy/edge paths for the new lightbox functions.

Minor observations (still non-blocking)

  • The blob-URL leak on the upload-then-send path that I noted in the prior review is unchanged — still a small memory leak per upload, browser GCs eventually. Suggested as a follow-up.
  • Lightbox could trap focus inside the dialog for full keyboard accessibility (screenreader users currently lose focus context when Escape closes). Not a blocker — the role/aria-label is already there, and the close button is keyboard-reachable. Could be tightened later.
  • Programmatic re-open (without close) stacks lightboxes; only the most recently opened gets escape-keyed closed via its own handler, but others still have their handlers attached. Not user-reachable — the lightbox covers the page so the user can't click another image to trigger a second open. Worth noting for future refactors.

Recommendation

Approved (fresh approval on the new commits). Lightbox is a clean UX upgrade — semantic dialog markup, ARIA labelling, property-based DOM construction (no innerHTML parse), idempotent close, escape and backdrop click both close, behavioural harness confirms 11/11 expected behaviours. No new XSS surface (inline onclick handlers are literal strings; user-controlled values reach via DOM properties at runtime). My earlier guard restoration (8f8b968) is still in place. Parked at approval — ready for the release agent's merge/tag pipeline.

@nesquena-hermes nesquena-hermes added the merge soon Reviewed and queued for next release batch label Apr 27, 2026
@nesquena-hermes
Copy link
Copy Markdown
Collaborator Author

Re-review noted — thank you for the thorough second pass including the behavioural harness (11/11). Moving forward to stage and batch release with #1133 and #1136.

nesquena-hermes added a commit that referenced this pull request Apr 27, 2026
* feat: attention state for broken cron jobs + Korean i18n (#1133, @franksong2702)

* fix: pytest state isolation for direct session saves (#1136, @franksong2702)

* fix(#1095): image thumbnails in composer + lightbox in chat (#1135)

* fix(css): restore cron attention + detail-alert rules overwritten by style.css merge (absorb)

* docs: v0.50.225 release notes and version bump

---------

Co-authored-by: nesquena-hermes <[email protected]>
@nesquena-hermes
Copy link
Copy Markdown
Collaborator Author

Merged in v0.50.225 via PR #1137. Thank you for the thorough independent review (including the behavioural harness)! 🎉

@nesquena-hermes nesquena-hermes deleted the fix/1095-image-attachment-preview branch April 27, 2026 23:59
JKJameson pushed a commit to JKJameson/hermes-webui that referenced this pull request Apr 29, 2026
…#1137)

* feat: attention state for broken cron jobs + Korean i18n (nesquena#1133, @franksong2702)

* fix: pytest state isolation for direct session saves (nesquena#1136, @franksong2702)

* fix(nesquena#1095): image thumbnails in composer + lightbox in chat (nesquena#1135)

* fix(css): restore cron attention + detail-alert rules overwritten by style.css merge (absorb)

* docs: v0.50.225 release notes and version bump

---------

Co-authored-by: nesquena-hermes <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge soon Reviewed and queued for next release batch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants