Skip to content

fix: reuse inflight session stream#1467

Merged
2 commits merged intonesquena:masterfrom
dso2ng:fix/reuse-inflight-session-stream
May 2, 2026
Merged

fix: reuse inflight session stream#1467
2 commits merged intonesquena:masterfrom
dso2ng:fix/reuse-inflight-session-stream

Conversation

@dso2ng
Copy link
Copy Markdown
Contributor

@dso2ng dso2ng commented May 2, 2026

Thinking Path

  • Hermes WebUI supports long-running chat turns.
  • Users may switch away from a running session and later return to it.
  • The active pane and the running session are different concepts.
  • Returning to the same running session should not tear down the existing SSE transport for that same session/stream pair.
  • This PR keeps that transport alive when attachLiveStream() is called again for the same active stream.

What Changed

  • attachLiveStream() now checks LIVE_STREAMS[session_id] before closing an existing stream.
  • If the existing live stream has the same streamId and is not closed, the function reuses it instead of closing/reopening the EventSource.
  • The INFLIGHT[session_id] state is still initialized/updated before the reuse check so session switch-back can restore the active pane's running state.
  • Added a regression test that pins the same-stream reuse invariant and the existing loadSession() reattach path.

Why It Matters

SSE stream events are not a replay log. Tearing down and reopening the same running stream during sidebar navigation creates a small timing window where events can be consumed by the old transport but not reflected back into the current pane/cache.

This is the first small step in improving reliability for in-flight session switching. It follows the direction in #1466: keep the currently viewed pane separate from per-session runtime state.

Verification

node --check static/messages.js static/sessions.js
python -m py_compile api/routes.py api/models.py api/agent_sessions.py
pytest tests/test_inflight_stream_reuse.py tests/test_issue660.py tests/test_session_cross_tab_sync.py -q

Result:

21 passed

Also checked the added diff for accidental non-ASCII text:

non_ascii_added_lines=0

Risks / Follow-ups

Model Used

OpenAI GPT-5.5 via Hermes Agent.

Refs #1466

@nesquena-hermes
Copy link
Copy Markdown
Collaborator

Thanks for the surgical PR. The reuse-existing-stream invariant is the right shape, and pulling the INFLIGHT[activeSid] initialization above the reuse check is the correct ordering — switch-back has to be able to rebuild active-pane state from INFLIGHT even when the SSE transport is preserved.

A few things to consider before merge:

1. EventSource.CLOSED typeof guard reads odd

(typeof EventSource==='undefined'||existingLive.source.readyState!==EventSource.CLOSED)

EventSource is checked because the test environment is Node, but in Node existingLive.source would also be undefined — the test is structural (regex against the source). The guard is cheap so it's fine, but a comment on the line explaining "browsers that report readyState === CLOSED still need the close/reopen path" would save the next reader from puzzling over it.

2. Reuse skips uploaded propagation if INFLIGHT already exists

The reuse path currently early-returns. If attachLiveStream is called a second time for the same (activeSid, streamId) pair with a new uploaded array (e.g. composer added an attachment between switches), the early-return skips the INFLIGHT[activeSid].uploaded=[...uploaded] update on lines 288-289.

In the current call sites I think this can't happen — attachLiveStream is only called from sendMessage() (fresh stream) and loadSession() reattach (no uploaded) — but it's the kind of invariant that bites when someone adds a third call site. Worth either:

  • Hoisting the if(uploaded.length) INFLIGHT[activeSid].uploaded=[...uploaded] above the reuse check, or
  • Adding a comment that reuse intentionally drops uploaded.

3. Test: regex against closeLiveStream(activeSid) is brittle

close_pos = body.find("closeLiveStream(activeSid)")

This will match closeLiveStream(activeSid, streamId) further down at line ~340 (_closeSource) — that body lives inside attachLiveStream's closure, so it's inside the slice you're searching. Is the test picking up the right closeLiveStream call? If a future refactor swaps the argument list, the test could pass for the wrong reason. Anchor on closeLiveStream(activeSid);\n (with the trailing semicolon and newline) or on the surrounding existingLive context.

Also: _closeSource at line ~340 calls closeLiveStream(activeSid, streamId) — that's the in-stream close path, and it's correct. Just calling out so reviewers know there are two call sites in this function.

4. existingLive.source truthiness check

closeLiveStream() deletes LIVE_STREAMS[sessionId] after live.source.close(), so a "closed but undeleted" entry shouldn't normally exist. The existingLive.source truthy check is defensive but reasonable.

5. Worth pinning in a separate test: the case where attachLiveStream is called for a different streamId on the same session (e.g. user retries → new stream id). That should still close the old transport and open a new one. The current diff handles it (the existingLive.streamId===streamId check is strict equality) but a regression test would lock the invariant.

Otherwise this looks like a clean first step in the #1466 sequence. The 21 passing tests + non-ASCII check is the right level of rigor for a small reliability fix. I'll let Nathan make the call on whether (1)–(5) are blocking or follow-up.

@dso2ng
Copy link
Copy Markdown
Contributor Author

dso2ng commented May 2, 2026

Thanks for the detailed review. I pushed 3aafe52 to address the concrete items:

  • Added an inline comment on the EventSource.CLOSED guard explaining that same-stream reuse is only valid while the browser has not marked the transport closed.
  • Kept the uploaded propagation above the reuse check and added test_attach_live_stream_updates_uploads_before_same_stream_reuse so that invariant is explicit.
  • Tightened the brittle close-call test to anchor on the top-level closeLiveStream(activeSid); call, so it no longer accidentally matches the in-stream closeLiveStream(activeSid, streamId) close path.
  • Added test_attach_live_stream_different_stream_still_reopens_transport to pin that same-session/different-stream-id still falls through to close/reopen.

Targeted local verification on the updated head:

  • node --check static/messages.js static/sessions.js
  • python -m py_compile api/routes.py api/models.py api/agent_sessions.py
  • pytest tests/test_inflight_stream_reuse.py tests/test_issue660.py tests/test_session_cross_tab_sync.py -q -> 23 passed
  • non-ASCII diff guard -> non_ascii_added_lines=0

@nesquena-hermes
Copy link
Copy Markdown
Collaborator

Thanks — 3aafe52 cleanly addresses all five points:

  • (1) EventSource.CLOSED comment — the inline note explaining "same-stream reuse is only valid while the browser has not marked the transport closed" is exactly the right framing for the next reader.
  • (2) Uploaded propagation — keeping the propagation above the reuse check + adding test_attach_live_stream_updates_uploads_before_same_stream_reuse is the better fix. The invariant is now executable, not implicit.
  • (3) Brittle close-call test — anchoring on the top-level closeLiveStream(activeSid); call is exactly what I had in mind. The in-stream closeLiveStream(activeSid, streamId) close path (_closeSource at ~340) is now decoupled from the test's expectations.
  • (4) existingLive.source truthiness — defensive check stays, no change needed. Confirmed.
  • (5) Different-streamId fall-through testtest_attach_live_stream_different_stream_still_reopens_transport pins the strict-equality semantics so a future refactor that loosens the comparison would surface as a test failure.

23 passing + non-ASCII guard at 0 is the right rigor bar. Nothing else from me — leaving the merge call to Nathan.

@nesquena-hermes nesquena-hermes closed this pull request by merging all changes into nesquena:master in 5650d11 May 2, 2026
Thanatos-Z pushed a commit to Thanatos-Z/hermes-webui that referenced this pull request May 2, 2026
Thanatos-Z pushed a commit to Thanatos-Z/hermes-webui that referenced this pull request May 2, 2026
…w-up

- CHANGELOG.md: v0.50.267 entry detailing nesquena#1454/nesquena#1474/nesquena#1461/nesquena#1465/nesquena#1467/nesquena#1460/nesquena#1473
  + Opus advisor SHOULD-FIX trailing-empty guard for _norm_model_id
- ROADMAP.md: bump to v0.50.267, 3776 tests collected
- TESTING.md: bump header + total to 3776
- api/config.py: trailing-empty fallback in _norm_model_id (parts[-1] or s)
- static/ui.js: mirror trailing-empty fallback in _normalizeConfiguredModelKey
- tests/test_norm_model_id_trailing_empty_guard.py: 5 regression tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants