Skip to content

Perf/parallelize session switch#1158

Closed
jasonjcwu wants to merge 5 commits intonesquena:masterfrom
jasonjcwu:perf/parallelize-session-switch
Closed

Perf/parallelize session switch#1158
jasonjcwu wants to merge 5 commits intonesquena:masterfrom
jasonjcwu:perf/parallelize-session-switch

Conversation

@jasonjcwu
Copy link
Copy Markdown
Contributor

No description provided.

Three changes to speed up session switching:

1. workspace.js: Parallelize expanded-dir pre-fetching in loadDir().
   Previously, expanded directories were fetched sequentially with
   for-await, creating a waterfall of N API calls. Now uses
   Promise.all() to fetch all expanded dirs concurrently.

2. sessions.js: Overlap loadDir() network requests with highlightCode().
   The idle path ran highlightCode() (CPU-bound Prism.js) before loadDir,
   delaying the file-tree fetch. Now loadDir() is kicked off first so the
   HTTP request is in-flight while Prism runs, then awaited.

3. workspace.py: Parallelize git_info_for_workspace() subprocess calls.
   git status, rev-list (ahead), and rev-list (behind) were called
   serially (3 subprocess.run calls, each up to 3s timeout). Now they
   run concurrently via ThreadPoolExecutor, reducing wall time by ~60%.
6 tests across 3 classes:
- TestLoadDirParallelPrefetch: static analysis of Promise.all in workspace.js
- TestLoadSessionIdleOverlap: static analysis of loadDir-before-highlightCode ordering
- TestGitInfoParallel: runtime tests with threading.Barrier concurrency proof
  and wall-clock timing (parallel ~0.1s vs serial ~0.3s for 3 git commands)
Add msg_limit and msg_before parameters to /api/session endpoint:
- Initial load fetches only last 30 messages (tail window)
- Scroll-to-top triggers lazy loading of older messages via cursor
- Payload reduced by ~70% for large sessions (100+ messages)
- undo/retry/compress auto-reset truncation state
- Export uses server-side generation, unaffected

Backend (api/routes.py):
- msg_limit: returns only the last N messages with truncation flag
- msg_before: cursor-based paging for scroll-to-top loading

Frontend (static/sessions.js):
- _INITIAL_MSG_LIMIT=30 for initial paginated load
- _messagesTruncated tracks pending older messages
- _loadOlderMessages() prepends older messages on scroll
- _ensureAllMessagesLoaded() for operations needing full history

Frontend (static/ui.js):
- Scroll-to-top detection (scrollTop < 80px) triggers lazy load
- 'Load older messages' visual indicator when truncated

Tests: 19 total (13 new for message pagination), all passing
@jasonjcwu
Copy link
Copy Markdown
Contributor Author

Performance: Parallelize session switch + message pagination

Problem

Switching between sessions in the web UI is noticeably slow, especially for long conversations. The delay comes from four serial bottlenecks:

  1. Expanded-dir pre-fetch — restored expanded directories are fetched one-by-one in a for...of + await loop
  2. highlightCode blocks loadDir — the CPU-bound Prism.js pass runs before the network fetch for directory listing is dispatched
  3. Git info subprocessesstatus, rev-list --ahead, rev-list --behind run sequentially via subprocess.run
  4. Full message payload — every session switch serializes and sends the entire message history (100–450+ messages, up to 86KB gzip), even though the user only sees the last ~30 messages on screen

Changes

Commit 1: 671ca99 — Parallelize serial bottlenecks

File Change
static/workspace.js Expanded-dir pre-fetch: for...of + awaitPromise.all()
static/sessions.js Idle path: start loadDir('.') before highlightCode(), await after
api/workspace.py git_info_for_workspace: 3 serial subprocess.runThreadPoolExecutor(max_workers=3)

Commit 2: 1a805d2 — Tests for parallelization

  • 6 tests proving Promise.all usage, idle-path ordering, ThreadPoolExecutor adoption, and wall-clock speedup via threading.Barrier

Commit 3: 0ef1de0 — Message pagination (biggest win)

File Change
api/routes.py msg_limit param returns last N messages with _messages_truncated flag; msg_before param for cursor-based backward paging
static/sessions.js _INITIAL_MSG_LIMIT=30 for initial load; _messagesTruncated state; _loadOlderMessages() prepends older messages; _ensureAllMessagesLoaded() for operations needing full history
static/ui.js Scroll-to-top detection (scrollTop < 80px) triggers lazy load; "Load older messages" indicator when truncated
static/commands.js undo and compress reset _messagesTruncated=false after full reload

Expected impact

Bottleneck Before After
Expanded-dir pre-fetch (N dirs) N × RTT 1 × RTT
loadDir + highlightCode Sequential Overlapped
Git info (3 commands) 3 × subprocess latency 1 × subprocess latency
Message payload Full history (up to 86KB gzip) Last 30 msgs (~16KB gzip, ↓70%)

For a remote deployment switching to a 200+ message session: ~800ms → ~40ms.

Compatibility

  • undo/retry — use DELETE /api/session/messages/last API endpoint, not frontend S.messages. After reload they reset _messagesTruncated.
  • compress — preflight already fetches full session; resets truncation state.
  • export — server-side JSON generation via /api/session/export, independent of frontend message array.
  • search — can call _ensureAllMessagesLoaded() before searching.
  • Backward compatible — omitting msg_limit returns all messages as before.

Tests

19 new tests, all passing:

  • TestLoadDirParallelPrefetch (2) — Promise.all usage in workspace.js
  • TestLoadSessionIdleOverlap (1) — loadDir dispatched before highlightCode
  • TestGitInfoParallel (3) — ThreadPoolExecutor, Barrier-based concurrency proof, wall-clock speedup
  • TestMessagePaginationBackend (7) — tail slicing, cursor paging, truncation flag, payload reduction
  • TestMessagePaginationFrontend (6) — msg_limit in API call, truncation tracking, scroll-to-top, indicator

All 2,575 existing tests pass (5 pre-existing failures unrelated to this PR).

Issues found and fixed:

1. workspace.js: S._expandedDirs||[] fallback produced Array (no .size)
   → changed to ||new Set()

2. routes.py: _messages_truncated flag was wrong for msg_before paging
   → separate truncation logic per code path, compare against _slice
   not _all_msgs

3. routes.py: msg_before used timestamp comparison which fails when
   messages lack _ts (827 msgs with _ts=0 in real data) or have
   duplicate timestamps
   → changed to index-based cursor (0-based position in full array)
   → added _messages_offset field so frontend knows cursor position
   → added input validation (try/except, max(1,...), bounds clamping)

4. sessions.js: _oldestIdx tracks cursor position, reset on session
   switch, updated from _messages_offset on each page load

5. workspace.py: moved concurrent.futures import to module top level

6. tests: updated to match index-based cursor, added 8 new tests:
   - test_expanded_dirs_fallback_is_set
   - test_msg_before_index_based_slicing
   - test_msg_before_zero_returns_empty
   - test_msg_before_equal_total
   - test_truncation_flag_with_msg_before
   - test_messages_offset_initial_load/with_msg_before
   - test_oldest_idx_tracking/reset_on_session_switch
   - test_load_older_uses_index_cursor
   - test_msg_before_bounds_clamping

27/27 tests pass, 2583/2583 total (5 pre-existing failures unrelated)
@nesquena-hermes
Copy link
Copy Markdown
Collaborator

Great performance work, @jasonjcwu! The author's comment gives a thorough breakdown — adding some review notes here:

Parallelization changes (commits 1–2) look well-targeted:

  • Promise.all() for expanded-dir pre-fetch is a straightforward N×RTT → 1×RTT win
  • Overlapping loadDir with highlightCode on the idle path is safe as long as DOM mutations don't conflict — the ordering test is a good guard here
  • ThreadPoolExecutor(max_workers=3) for the three git subprocess calls is the right tool; the threading.Barrier concurrency proof in tests is convincing

Message pagination (commit 3) is the most impactful change and deserves a close review:

  • The _ensureAllMessagesLoaded() escape hatch for undo, compress, and search is the right pattern — avoids breaking those operations on truncated history
  • Scroll-to-top trigger for _loadOlderMessages() is a natural UX
  • The backwards-compatible msg_limit parameter (no param = full history) is important for external callers

One thing worth verifying in review: the _messages_truncated flag reset behavior on loadSession() — if someone switches sessions while _loadOlderMessages() is in flight, the async prepend should not land on the newly loaded session. The 19-test suite covering pagination backend + frontend is a strong start; confirming session-switch cancellation safety would make this solid.

The ~800ms → ~40ms expected improvement for remote deployments is a meaningful UX gain.

Race condition: if user switches sessions while _loadOlderMessages() is
in-flight, the stale response could land on the new session's S.messages.

Guards verified:
- _loadOlderMessages checks _loadingSessionId !== sid after await (bails out)
- Guard appears BEFORE S.messages mutation (stale data cannot land)
- loadSession resets _loadingOlder=false on switch (no stale lock)
- loadSession resets _messagesTruncated=false and _oldestIdx=0 on switch

Added: _loadingOlder=false reset in loadSession() reset block.
Added: 5 tests in TestSessionSwitchCancellation proving all guards exist
and are ordered correctly.

32/32 tests pass.
@nesquena-hermes
Copy link
Copy Markdown
Collaborator

Hold -- Pagination UX Needs Review Before Merge

Placing this PR on hold pending a UX decision on the new message pagination behaviour. Details below.


Approved Changes

The parallelization work looks clean and is approved:

  • Promise.all for directory fetches -- straightforward win, no concerns.
  • ThreadPoolExecutor for git info -- well-scoped, correct error handling.
  • Overlapping loadDir with highlightCode -- good use of concurrency; the sequencing looks safe.

These changes can land as-is once the UX question below is resolved.


Hold Reason -- Message Pagination UX

The decision to limit displayed messages to the most-recent 30 on every session switch, with a "click to load older" / scroll-triggered load path, is a meaningful UX change that affects every conversation view. It needs explicit sign-off before merge.

Questions for @aronprins

  1. 30-message tail window -- Is 30 the right default? Should this be user-configurable (e.g. a setting in preferences) or is a fixed window acceptable? For users with long conversations this is the new first-load experience, so the number matters.

  2. "Click to load older" indicator -- Is the current placement (top of the message list) and wording final? Should it be a button, an inline sentinel, or a sticky banner? Does it need a loading spinner while older messages are fetched?

Required Fix -- _messagesTruncated Reset Paths

@author: please audit every code path that reloads or replaces the session message list and confirm that _messagesTruncated is reset to false (or the appropriate initial state) in all of the following:

  • Retry -- when the user retries a failed assistant turn
  • Undo -- when a message is popped/undone
  • Manual compression -- when the conversation is compressed/summarised by the user
  • Session reload after error -- when the session is reloaded following a network or server error

If any of these paths leave _messagesTruncated = true from a previous load, the UI will incorrectly show the "load older" indicator (or suppress messages) even when the full history is already in memory.


Next Steps

  1. @aronprins weighs in on the 30-message default and the indicator UX.
  2. Author confirms _messagesTruncated is correctly reset in all paths above (a short comment or checklist reply here is fine).
  3. Once both are addressed, remove the hold label and this is ready to merge.

Thanks!

@nesquena-hermes nesquena-hermes added hold ux User experience / visual polish labels Apr 27, 2026
@nesquena-hermes nesquena-hermes removed hold ux User experience / visual polish labels Apr 27, 2026
nesquena-hermes added a commit that referenced this pull request Apr 27, 2026
)

Merged as v0.50.229. 2678 tests passing. Browser QA 21/21.

All three PRs were independently reviewed and approved by @nesquena with reviewer commits pulled in:
- #1181 (#1158): `d974388` (stale-response race in _loadOlderMessages)
- #1182: `7e20006` (full-scan fallback path consistency)
- #1180: `a5ad154` (regression test for iOS zoom threshold)

Thanks @jasonjcwu (#1158)!
@nesquena-hermes
Copy link
Copy Markdown
Collaborator

Merged as v0.50.229 via integration PR #1181 (then batched in #1183). Thank you @jasonjcwu — the parallelization and pagination work are live! 🎉

JKJameson pushed a commit to JKJameson/hermes-webui that referenced this pull request Apr 29, 2026
…squena#1183)

Merged as v0.50.229. 2678 tests passing. Browser QA 21/21.

All three PRs were independently reviewed and approved by @nesquena with reviewer commits pulled in:
- nesquena#1181 (nesquena#1158): `d974388` (stale-response race in _loadOlderMessages)
- nesquena#1182: `7e20006` (full-scan fallback path consistency)
- nesquena#1180: `a5ad154` (regression test for iOS zoom threshold)

Thanks @jasonjcwu (nesquena#1158)!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants