Hermes Web UI — Sprints 11-14: multi-provider models, settings, sessi…#2
Merged
nesquena-hermes merged 1 commit into master on Mar 31, 2026
…on QoL, alerts, polish

Sprint 11 (v0.13): multi-provider model support, streaming smoothness
- Dynamic model dropdown populated from configured API keys (OpenAI, Anthropic, Google, DeepSeek, GLM, Kimi, MiniMax, OpenRouter, Nous Portal)
- Scroll pinning during streaming (no forced scroll when user has scrolled up)
- All route handlers extracted to api/routes.py (server.py now ~76 lines)

Sprint 12 (v0.14): settings panel, SSE reconnect, session QoL
- Settings panel (gear icon) -- persist default model and workspace server-side
- SSE auto-reconnect on network blips
- Pin/star sessions to top of sidebar
- Import session from JSON export

Sprint 13 (v0.15): cron alerts, background errors, session duplicate, tab title
- Cron completion alerts: toast per completion + unread badge on Tasks tab
- Background agent error banner when a non-active session errors mid-stream
- Session duplicate button
- Browser tab title reflects active session name

Sprint 14 (v0.16): Mermaid diagrams, file ops, session archive/tags, timestamps
- Mermaid diagram rendering inline (dark theme, lazy CDN load)
- File rename (double-click in file tree) and create folder
- Session archive (hide without deleting, toggle to show)
- Session tags -- #hashtag in title becomes colored chip + click-to-filter
- Message timestamps (HH:MM on hover, full date as tooltip)

Test suite: 224 tests across 14 sprint files + regression gate, 0 failures.
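As a rough illustration of the Sprint 11 idea of populating the model dropdown from configured API keys, here is a minimal Python sketch. The mapping `PROVIDER_KEY_VARS`, the function `configured_providers`, and the environment variable names are assumptions chosen for illustration (they follow common naming conventions); none of them are taken from the Hermes Web UI source.

```python
import os
from typing import Mapping

# Hypothetical mapping from provider name to the environment variable that
# holds its API key. These names are illustrative conventions only.
PROVIDER_KEY_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Google": "GOOGLE_API_KEY",
    "DeepSeek": "DEEPSEEK_API_KEY",
    "OpenRouter": "OPENROUTER_API_KEY",
}


def configured_providers(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the providers whose API key variable is set and non-blank."""
    return [
        name
        for name, var in PROVIDER_KEY_VARS.items()
        if env.get(var, "").strip()
    ]
```

A dropdown built this way would list only providers the server can actually call, which is presumably the point of making it dynamic.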
This was referenced Apr 3, 2026
Ola-Turmo pushed a commit to Ola-Turmo/hermes-webui that referenced this pull request on Apr 9, 2026
Hermes Web UI — Sprints 11-14: multi-provider models, settings, sessi…
nesquena-hermes pushed a commit that referenced this pull request on Apr 12, 2026
- Title/badge/source: 2025 → 2026, sources updated to Chatbot Arena/BenchLM
- Overall: Opus 4.6 (#1, 1504 Arena Elo), Gemini 3.1 Pro (#2), GPT-5.4 (#3), Sonnet 4.6 (#4), DeepSeek V3.2 (#5, replaces DeepSeek R1)
- Coding: Opus 4.6 (#1, powers Claude Code/Cursor), GPT-5.4 (#2, SWE-Pro 57.7%), Sonnet 4.6 (#3), Gemini 3.1 Pro (#4), DeepSeek V3.2 (#5)
  Removed: Sonnet 4.5, Gemini 2.5 Pro, GPT-5.3 Codex
- Writing: Opus 4.6 (#1, Mazur 8.53), Gemini 3.1 Pro (#2, Arena CW 1487), Sonnet 4.6 (#3, EQ-Bench CW 1936), GPT-5.4 (#4), Meta Muse Spark (#5)
  Removed: Llama 4 Maverick (replaced by Meta Muse Spark)
- Search: Gemini 3.1 Pro (#1), Grok 4 (#2, live X data), Sonnet 4.6+tool (#3), GPT-5.4 (#4), Gemini 3 Flash Thinking (#5)
  Added Grok 4 for real-time social/X data
- Reasoning: Gemini 3.1 Pro (#1, GPQA 95.45%), GPT-5.4 (#2), Opus 4.6 (#3), Gemini 3 Flash Thinking (#4, best value), DeepSeek V3.2 (#5)
  Removed: Kimi K2, DeepSeek R1 standalone
- Quick picker: all model names updated to 2026 versions
- Setup boxes: OpenAI gpt-5-4 names, Google gemini-3-1-pro-preview, self-hosted section updated to DeepSeek V3.2 + Muse Spark
nesquena-hermes pushed a commit that referenced this pull request on Apr 12, 2026
Coding section:
- Gemini 3.1 Pro: add Terminal-Bench 78.4% (highest of any frontier model on CLI/DevOps)
  Score badge updated to show Terminal-Bench rather than SWE-bench Verified
- GPT-5.4: note Terminal-Bench 75.1% in description, consolidate pill text
- #5 DeepSeek V3.2 → Qwen 3.6-Plus: leads Terminal-Bench at 61.6%, 88.2% GPQA, 1M context, available now on Alibaba Cloud + OpenRouter

Writing section (reordered based on EQ-Bench CW scores):
- #1 Claude Sonnet 4.6 (1936 EQ-Bench CW: highest, best voice consistency)
- #2 Claude Opus 4.6 (Mazur 8.53, IF Arena #1, 1M context for literary depth)
- #3 Gemini 3.1 Pro (Arena CW #1 1487, AI-tell avoidance, 2M context)
- #4 GPT-5.4 (noted as ~9th on Arena CW, better for structured/commercial writing)
- #5 Meta Muse Spark → Kimi K2.5 ($0.60/.50, ~1700 EQ-Bench CW, live API)
  Muse Spark removed: no commercial API available yet

Reasoning section:
- Gemini 3.1 Pro GPQA: 95.45% → 94.1% (more conservative/recent figure, consistent with both agents' data)
- Added ARC-AGI-2 77.1% for Gemini 3.1 Pro (#1 on visual reasoning too)
- Opus 4.6: added note that Sonnet leads GDPval-AA (1633 Elo #1) for throughput
- #5 DeepSeek V3.2 → Qwen 3.6-Plus (88.2% GPQA, 1M context, same model as coding)

Quick picker:
- Creative writing: Opus → Sonnet 4.6 (EQ-Bench #1, 85% cheaper)
- Hard reasoning: 95.45% → 94.1%, add ARC-AGI-2 mention
- Budget pick: DeepSeek V3.2 → Gemini 3 Flash Thinking ($0.50/1M, 89.8% GPQA)

Setup boxes:
- Self-hosted: Muse Spark → Qwen 3.6-Plus + Gemma 4 26B MoE (Apache 2.0, 82.3% GPQA with 3.8B active params, best edge/self-hosted reasoning)

Overall section: unchanged (top 5 still correct per both agents)
Search section: unchanged (no new data from either agent)
nesquena-hermes pushed a commit that referenced this pull request on Apr 14, 2026
index.html:
- Hero badge: 56k+ → 77k+ GitHub stars (NousResearch/hermes-agent: 76,811)

community/index.html:
- All 24 project star counts updated to live GitHub values (April 13 2026)
  Notable bumps: gbrain 4.8k→7.4k, hermes-webui 1.3k→1.8k, hermes-agent 56k+→77k,
  hermes-workspace 1.1k→1.3k, hudui 571→827, self-evolution 893→1.6k
- Removed awizermann/scarf (repo no longer exists on GitHub)
- Added 3 new high-value projects:
  - vectorize-io/hindsight (9.1k ⭐) → Memory Providers: Official Hermes memory plugin,
    biomimetic 3-tier memory, topped LongMemEval Jan 2026
  - rohitg00/agentmemory (1.3k ⭐) → Memory Providers: #1 persistent memory for agents,
    explicit Hermes integration path, 43 MCP tools
  - builderz-labs/mission-control (4.1k ⭐) → Tools & Orchestration: self-hosted agent
    platform with Hermes gateway integration
- Updated project count: 25 → 27

models/index.html:
- Gemini 3.1 Pro: bumped from #4 Arena (1492 Elo) to #1 (1505 Elo)
- Claude Opus 4.6: updated from #1 (1504) to #2 (1503 Elo), still top tier
- Updated datestamp to April 13, 2026
nesquena-hermes pushed a commit that referenced this pull request on Apr 16, 2026
Factual fixes:
- Hindsight link now points to github.com/vectorize-io/hindsight (hindsight.ai 403)
- Holographic: removed external link (domain for sale), note it is built into Hermes
- RetainDB: flagged domain availability as unverified
- Hindsight benchmarks added: LongMemEval 91-94.6%, BEAM 64.1%, LoCoMo 89.6%
- Holographic hosting corrected to Local only (was Cloud)
- Honcho hosting corrected to Cloud + self-host (was Cloud only)
- ByteRover license kept as Partial OSS (not MIT)
- Supermemory stars footnoted: consumer frontend, not core engine

Content additions:
- Architecture notes for all 8 providers (TEMPR, HRR/SQLite, L0/L1/L2, triple-store, etc.)
- Full pricing detail: paid tiers, per-token cloud rates
- GitHub star counts in table and provider cards
- Tool counts (2-5) in table and provider cards
- Config snippets section with 6 provider examples
- Last verified April 2026 timestamp in hero

Recommendation fixes:
- Coding agents: Hindsight #1 (TEMPR BM25, entity extraction), ByteRover #2
- Knowledge wiki: Hindsight #1 (highest LongMemEval), Supermemory #2
- Privacy: Holographic #1 (zero deps, local SQLite), then Hindsight, OpenViking
- Added Personal AI companion use case (Honcho #1)
- Benchmark winner updated to Hindsight on LongMemEval axis

Table structure:
- Best for column moved to position 2 (was last/rightmost)
- Paid from column added
- Stars and Tools columns added (hidden on narrow screens)
- Benchmark column shows multiple scores for Hindsight/ByteRover
This was referenced Apr 16, 2026
snutp added a commit to snutp/hermes-webui that referenced this pull request on Apr 19, 2026
Codex review blocker #2: storing a single key causes the wrong project to be resumed on multi-tab use or project switch. Change the primary key to 'webui:last:<port>' and keep the legacy global key for compatibility.
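The port-scoped key with a legacy fallback can be sketched in Python as below. This is only an illustration of the lookup order: the real code presumably lives in browser-side storage, and the legacy key name `webui:last`, the functions `scoped_key`, `save_last_project`, and `load_last_project`, and the dict-like `store` are all hypothetical.

```python
from typing import MutableMapping, Optional

# Hypothetical name for the legacy global key; the commit does not state it.
LEGACY_KEY = "webui:last"


def scoped_key(port: int) -> str:
    """Build the per-port primary key, following the 'webui:last:<port>' shape."""
    return f"webui:last:{port}"


def save_last_project(store: MutableMapping[str, str], port: int, project: str) -> None:
    """Record the last project under the port-scoped key only (an assumption)."""
    store[scoped_key(port)] = project


def load_last_project(store: MutableMapping[str, str], port: int) -> Optional[str]:
    """Prefer the port-scoped key; fall back to the legacy key if it is absent."""
    return store.get(scoped_key(port), store.get(LEGACY_KEY))
```

With this shape, two instances on different ports no longer overwrite each other's resume target, while a store written by an older version still resolves through the fallback.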
nesquena-hermes pushed a commit that referenced this pull request on Apr 20, 2026
nesquena-hermes pushed a commit that referenced this pull request on Apr 21, 2026
…6 (58.6%) — April 21, 2026
JKJameson pushed a commit to JKJameson/hermes-webui that referenced this pull request on Apr 25, 2026
Hermes Web UI — Sprints 11-14: multi-provider models, settings, sessi…
zhonghuaY referenced this pull request in zhonghuaY/hermes-webui on Apr 27, 2026
…#2)

Revert to standard /v1 base URL and pass keyword via x-instance-keyword HTTP header in extra_headers. The gw: model prefix prevents Responses API auto-detection; the gateway strips it for strict matching.

Co-authored-by: Albert Yang
Co-authored-by: Copilot
This was referenced Apr 27, 2026
feat: integrate agent-api-gateway + status indicators + full-width chat + session-rename fixes #1132 (Closed)
bergeouss added a commit to bergeouss/hermes-webui that referenced this pull request on Apr 27, 2026
Address reviewer feedback on nesquena#1152 (blocker #2):
- Document page reload behavior: state PERSISTS (server-side in-memory)
- Document cross-tab isolation: state is SHARED per session
- Document server restart: state is LOST (in-memory only)
- Correct misleading 'resets on page reload' comment in messages.js
- Update backend route comment in routes.py
bsgdigital pushed a commit to bsgdigital/hermes-webui that referenced this pull request on Apr 30, 2026
- New CONTRIBUTORS.md: full ranked credit roll for all 66 contributors (5+ tiers), with first/latest release versions, single-PR roll, and attribution methodology. Generated from git log + gh pulls API + CHANGELOG mention parsing.
- README.md: stack-ranked top-10 contributors table at the top of the Contributors section, link to CONTRIBUTORS.md for the full list. Updated test count (1898 → 3309). Refreshed @franksong2702 and @bergeouss entries to reflect their broader bodies of work (now the #1 and #2 external contributors).
- ARCHITECTURE.md: removed stale 'tracks upstream v0.50.36' header; bumped current shipped build to v0.50.245 with current architecture state notes (streaming-markdown vendoring, byte-range streaming, configurable-model-badges).
- ROADMAP.md / SPRINTS.md / TESTING.md: header/last-updated bumps to v0.50.245 and 3309 tests. SPRINTS.md 'Where we are now' section refreshed for current CLI/Claude parity (~95% Claude parity now).

Generated by aggregating CHANGELOG attribution lines, gh PR API authors, and CHANGELOG version-section walks. Internal/bot accounts filtered out.
nesquena-hermes added a commit that referenced this pull request on Apr 30, 2026
…al reconnect

The original PR #1355 frontend health timer at 60s force-reconnected every 60s regardless of activity, with a comment claiming it was a 'no event in 60s' detector. This churned one TCP/SSE setup per minute per active session.

Fix: track lastEventAt on initial+clarify event arrivals; only reconnect when the gap exceeds 60s. The server sends keepalive comments every 30s, but EventSource silently consumes those, so we can only bump lastEventAt on real application events. Under healthy conditions the timer never fires.

Sits on top of nesquena's 604b44a deadlock fix (parallel work: Nathan caught the same Opus MUST-FIX before my push landed).

Opus SHOULD-FIX #2 from the v0.50.249 pre-release review.
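The difference between an unconditional timer and a stale-detector can be sketched as follows. The real code is JavaScript in `static/messages.js`; this Python transliteration is illustrative only, and the class `StaleStreamDetector`, its method names, and the injectable `clock` parameter are assumptions made so the logic can be exercised without waiting in real time.

```python
import time
from typing import Callable


class StaleStreamDetector:
    """Signal a reconnect only when no application event arrived recently."""

    def __init__(
        self,
        threshold_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold_s = threshold_s
        self._clock = clock
        # Treat construction (stream open) as the first "event".
        self._last_event_at = clock()

    def on_event(self) -> None:
        """Call for every real application event (not keepalive comments)."""
        self._last_event_at = self._clock()

    def should_reconnect(self) -> bool:
        """True only if the gap since the last event exceeds the threshold."""
        return self._clock() - self._last_event_at > self._threshold_s
```

A periodic health check would call `should_reconnect()` on each tick instead of reconnecting every time it fires.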
JKJameson pushed a commit to JKJameson/hermes-webui that referenced this pull request on Apr 30, 2026
…al reconnect (nesquena#1367)

Follow-up to v0.50.249 / PR nesquena#1365 absorbing Opus SHOULD-FIX #2. Originally reset out of nesquena#1365 because the reviewer flagged it as out-of-scope; brought back per follow-up guidance that correctness-improving changes should ship even when out of scope.

The clarify SSE health timer at static/messages.js:1715 was an unconditional 60s force-reconnect, not the 'no event in 60s' detector its comment claimed. It is now an actual stale-detector that tracks lastEventAt on initial+clarify event arrivals and only reconnects when the gap exceeds 60s. Under healthy conditions the timer never fires.

Co-authored-by: nesquena-hermes
GeoffBao pushed a commit to GeoffBao/hermes-webui that referenced this pull request on May 1, 2026
- New CONTRIBUTORS.md: full ranked credit roll for all 66 contributors (5+ tiers), with first/latest release versions, single-PR roll, and attribution methodology. Generated from git log + gh pulls API + CHANGELOG mention parsing.
- README.md: stack-ranked top-10 contributors table at the top of the Contributors section, link to CONTRIBUTORS.md for the full list. Updated test count (1898 → 3309). Refreshed @franksong2702 and @bergeouss entries to reflect their broader bodies of work (now the #1 and #2 external contributors).
- ARCHITECTURE.md: removed stale 'tracks upstream v0.50.36' header; bumped current shipped build to v0.50.245 with current architecture state notes (streaming-markdown vendoring, byte-range streaming, configurable-model-badges).
- ROADMAP.md / SPRINTS.md / TESTING.md: header/last-updated bumps to v0.50.245 and 3309 tests. SPRINTS.md 'Where we are now' section refreshed for current CLI/Claude parity (~95% Claude parity now).

Generated by aggregating CHANGELOG attribution lines, gh PR API authors, and CHANGELOG version-section walks. Internal/bot accounts filtered out.
GeoffBao pushed a commit to GeoffBao/hermes-webui that referenced this pull request on May 1, 2026
…al reconnect (nesquena#1367)

Follow-up to v0.50.249 / PR nesquena#1365 absorbing Opus SHOULD-FIX #2. Originally reset out of nesquena#1365 because the reviewer flagged it as out-of-scope; brought back per follow-up guidance that correctness-improving changes should ship even when out of scope.

The clarify SSE health timer at static/messages.js:1715 was an unconditional 60s force-reconnect, not the 'no event in 60s' detector its comment claimed. It is now an actual stale-detector that tracks lastEventAt on initial+clarify event arrivals and only reconnects when the gap exceeds 60s. Under healthy conditions the timer never fires.

Co-authored-by: nesquena-hermes
This was referenced May 1, 2026
This was referenced May 2, 2026
nesquena-hermes pushed a commit to ccqqlo/hermes-webui that referenced this pull request on May 2, 2026
…na#1458 Bug #1)

Issue nesquena#1458 reports persistent-host crashes (≥1/day) when running the WebUI under launchd KeepAlive on macOS.

Root cause: `bootstrap.py` calls `subprocess.Popen([python, "server.py"], start_new_session=True)`, probes /health, then exits 0. Under any process supervisor (launchd, systemd, supervisord, runit, s6), the supervisor sees its tracked PID exit, marks the program as "completed," and respawns it. The new bootstrap fails to bind port 8787 (orphaned server still has it), exits non-zero, and the supervisor respawns again. The loop continues until the orphan crashes for some other reason and the next respawn finds the port free.

This PR addresses Bug #1 of the three failure modes tracked in nesquena#1458: the `bootstrap.py` double-fork breaking process supervisors. Bug #2 (state.db FD leak) and Bug #3 (HTTP-unhealthy wedge) remain open under the same issue; they need diagnosis data before a fix can land.

Changes
-------

1. `bootstrap.py`:
   - New `--foreground` argparse flag with help text mentioning launchd / systemd / supervisord.
   - New `_detect_supervisor()` that returns the env var name for any supervisor it detects: `INVOCATION_ID` / `JOURNAL_STREAM` / `NOTIFY_SOCKET` (systemd, s6), `XPC_SERVICE_NAME` (launchd), `SUPERVISOR_ENABLED` (supervisord), or `HERMES_WEBUI_FOREGROUND` for the explicit user opt-in. Truthy values for the explicit opt-in: `1` / `true` / `yes` / `on` (case-insensitive).
   - `main()` branches on `args.foreground or _detect_supervisor()`:
     - **Foreground path:** chdir to `agent_dir or REPO_ROOT`, then `os.execv(python, [python, server_path])` to replace the bootstrap process image with the server. The supervisor sees the long-lived server as the original child. No `wait_for_health` probe: the supervisor's KeepAlive / Restart=on-failure handles liveness.
     - **Default path:** unchanged. Spawn server as detached child via `Popen + start_new_session=True`, probe /health, return 0. This still works for interactive `bash start.sh` invocations.
   - Resolved env vars (HOST/PORT/STATE_DIR/AGENT_DIR) are now mutated on `os.environ` directly instead of into a local `env` copy so they are inherited across `os.execv`.
2. `docs/supervisor.md` (new): runnable launchd plist, systemd .service, and supervisord conf examples + a diagnostic recipe (`lsof` + ppid chain) for catching the orphan-loop in production.
3. `.gitignore`: allowlist `docs/supervisor.md` (the directory uses an opt-in pattern; matches the existing `!docs/docker.md` precedent).
4. `tests/test_bootstrap_foreground.py` (new): 35 regression tests covering the argparse flag, `_detect_supervisor()` behavior across all five supervisor env vars, the explicit opt-in's truthy/falsy values, and `main()`'s execv-vs-Popen routing decision under each input combination. `os.execv` is monkeypatched in the routing tests: we pin the structural choice (which call is made, with which args, in which cwd, with which env), not the post-exec behavior.

Why this scope and no more
--------------------------

Bug #2 (state.db FD leak) lists 5 candidate paths and asks the reporter for `lsof -p <pid> | sort | uniq -c | sort -rn | head -20` output to disambiguate. Until that data lands, any "fix" would be speculative, which is explicitly out of scope per the contributor-pickup comment on the issue.

Bug #3 (launchd-running, port-listening, HTTP-unhealthy) was added in @stefanpieter's reply comment. Diagnosis is in flight; no concrete fix shape yet. Also out of scope.

Running locally end-to-end verifies the behavior:

```
[bootstrap] Starting Hermes Web UI on http://127.0.0.1:8789 (foreground mode: --foreground)
$ pgrep -af 'server.py'
2997632 /home/.../python /tmp/wt-fix-1458/server.py
$ ps -o ppid -p 2997632
2997581   ← bash that ran bootstrap.py -- same PID as the original bootstrap
$ ps -p 2997581 -o cmd
... bootstrap.py ...   ← but exec'd into server.py
```

The same PID that bash forked for `bootstrap.py` is now `server.py`. A supervisor watching that PID would correctly observe the long-lived server. No double-fork.

Verification
------------

- 3811 tests pass (`pytest tests/`, full suite, +51 from this PR plus master-merge-in)
- All 35 new bootstrap-foreground tests pass
- `bash scripts/run-browser-tests.sh` PASS (HTTP API checks against worktree)
- `bash scripts/webui_qa_agent.sh 8789` PASS (23/23 visual QA)
- Live verified: server starts cleanly under both `--foreground` and `HERMES_WEBUI_FOREGROUND=1`; PID lineage confirms no double-fork

Closes nesquena#1458 (Bug #1 only). Bugs #2 and #3 remain tracked under the issue.
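The supervisor detection and exec hand-off described in the commit above can be sketched roughly as follows. This is a reconstruction from the commit message, not the actual `bootstrap.py`: the names `detect_supervisor` and `run_foreground`, the order in which variables are checked, and treating any non-empty supervisor variable as a match are all assumptions.

```python
import os
from typing import Mapping, Optional

# Environment variables named in the commit message as supervisor markers.
SUPERVISOR_ENV_VARS = (
    "INVOCATION_ID",       # systemd, s6
    "JOURNAL_STREAM",      # systemd, s6
    "NOTIFY_SOCKET",       # systemd, s6
    "XPC_SERVICE_NAME",    # launchd
    "SUPERVISOR_ENABLED",  # supervisord
)
OPT_IN_VAR = "HERMES_WEBUI_FOREGROUND"
TRUTHY = {"1", "true", "yes", "on"}


def detect_supervisor(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the env var that indicates foreground mode, if any."""
    # Explicit opt-in: only truthy values count (case-insensitive).
    if env.get(OPT_IN_VAR, "").strip().lower() in TRUTHY:
        return OPT_IN_VAR
    # Assumption: any non-empty supervisor marker counts as a detection.
    for name in SUPERVISOR_ENV_VARS:
        if env.get(name):
            return name
    return None


def run_foreground(python: str, server_path: str) -> None:
    """Replace this process image with the server; does not return on success."""
    os.execv(python, [python, server_path])
```

Because `os.execv` keeps the PID, the supervisor's tracked process becomes the long-lived server rather than a short-lived launcher.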
This was referenced May 2, 2026
Merged
nesquena-hermes added a commit that referenced this pull request on May 3, 2026
lanehollard pushed a commit to lanehollard/hermes-webui that referenced this pull request on May 3, 2026
… sidebar polling (nesquena#1494)

Production WebUI on macOS launchd reproduced an HTTP-unhealthy wedge after nesquena#1483 closed the bootstrap supervisor double-fork: process alive, port listening, every HTTP request reset by peer before a response. The reporter (@insecurejezza) traced it to FD exhaustion: 366 open FDs on the wedged process, 238 of them `~/.hermes/state.db`, `state.db-wal`, and `state.db-shm`.

Root cause: four sqlite callsites use `with sqlite3.connect(...) as conn:`. Python's sqlite3 connection context manager only commits or rolls back on exit; it does NOT close the connection. `/api/sessions` polling calls these on every sidebar refresh, so each poll leaked one or more open state.db FDs until the process hit macOS's soft FD limit and new sqlite3.connect() calls inside fresh request handlers raised before any response bytes were written.

Fix: wrap each `sqlite3.connect(...)` in `contextlib.closing(...)` so the connection is explicitly closed on scope exit, in addition to the auto-commit / rollback semantics that `Connection.__exit__` already provides.

Callsites patched:
- api/agent_sessions.py:read_importable_agent_session_rows
- api/agent_sessions.py:read_session_lineage_metadata
- api/models.py:get_cli_session_messages
- api/models.py:delete_cli_session

Reporter's verification (post-patch, 100-request stress loop against /api/sessions and /api/projects):

    batch=1 fd=92 state_handles=0
    batch=2 fd=92 state_handles=0
    ...
    batch=5 fd=92 state_handles=0

Pre-patch the same loop made FD count and state.db handle count climb monotonically.

4 regression tests in tests/test_issue1494_state_db_fd_leak.py monkeypatch sqlite3.connect with a tracking wrapper that records .close() calls and assert every connection opened by each of the four functions is explicitly closed. Verified to fail (catching the original bug) when the closing() wrap is reverted: "leaked 5 of 5 sqlite connection(s) -- context-manager-only `with sqlite3.connect()` does not close. Wrap in contextlib.closing()."

This addresses Bug #2 of the umbrella issue nesquena#1458. Bug #3 (HTTP-unhealthy wedge in the absence of FD exhaustion) remains open pending separate diagnostic data (explicit scope discipline).

Closes nesquena#1494
Refs nesquena#1458 (Bug #2 of 3)

Co-authored-by: insecurejezza
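The leak and its fix can be reproduced in a few lines with the standard library alone. The sketch below is illustrative and not the patched code: the helper `is_closed` and the function `count_rows` are hypothetical names, and the table name is passed in by the caller purely for demonstration.

```python
import sqlite3
from contextlib import closing


def is_closed(conn: sqlite3.Connection) -> bool:
    """Probe a connection: sqlite3 raises ProgrammingError once it is closed."""
    try:
        conn.execute("SELECT 1")
        return False
    except sqlite3.ProgrammingError:
        return True


def count_rows(db_path: str, table: str) -> int:
    """Read a row count and guarantee the connection is closed afterwards.

    `closing(...)` calls conn.close() on exit; the inner `with conn:` keeps
    the commit-or-rollback behaviour of the connection's own context manager.
    The table name is trusted input here, for illustration only.
    """
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:
            return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

Using `with sqlite3.connect(path) as conn:` on its own leaves `conn` open after the block, which is exactly the behaviour the commit describes; `is_closed(conn)` returns `False` in that case until `conn.close()` is called.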
lanehollard pushed a commit to lanehollard/hermes-webui that referenced this pull request on May 3, 2026
lanehollard pushed a commit to lanehollard/hermes-webui that referenced this pull request on May 3, 2026
…ix + P0 polish bundle (nesquena#1466 nesquena#1494 nesquena#1469 nesquena#1484 nesquena#1486)

3 PRs in this batch (3866 → 3874 tests, +8):
- nesquena#1493 (@dso2ng): sidebar Stop response cancels row's stream not active pane's (closes nesquena#1466, follow-up to nesquena#1480)
- nesquena#1495 (self-built; reported by @insecurejezza in nesquena#1494): state.db connection FD leak in sidebar polling (closes nesquena#1494, addresses Bug #2 of nesquena#1458)
- nesquena#1492 (@bergeouss): P0 bugfixes bundle: tool-card args readability + CLI rename persistence + scroll pinning + sw.js relative-path regression test (closes nesquena#1469 nesquena#1484 nesquena#1486)

This release closes Bug #2 of the umbrella issue nesquena#1458. Bug #1 was closed by v0.50.269 (nesquena#1483) + v0.50.270 (nesquena#1487). Bug #3 (HTTP-unhealthy without FD exhaustion) is the remaining work item.
waldmanz pushed a commit to waldmanz/hermes-webui that referenced this pull request on May 4, 2026
Both flagged by pre-release Opus advisor; both clearly defensive and small enough to absorb in-release per the reviewer-flagged-fix-in-release-not-followup policy.

SHOULD-FIX #1 (api/routes.py:_clear_stale_stream_state, ~25 LOC):
After the metadata-only reload (nesquena#1559 Layer 2), the local 'session' variable is reassigned to the full-load object but the caller still holds the original metadata-only stub. /api/session then returns the stale active_stream_id at routes.py:1791, causing the frontend to attempt one ghost SSE reconnect before recovering.
Fix: capture original_stub at function entry, then patch its in-memory active_stream_id and pending_* fields to None after both the early-return (full-load already cleared) path AND the successful-mutation path. Now the caller's read returns fresh state, no ghost reconnect.

SHOULD-FIX #2 (api/models.py:Session.save, ~20 LOC):
The .bak write at api/models.py:436 used write_text(), which truncates-then-writes, so a crash mid-write or a concurrent backup-producing save could leave a torn .bak. Recovery defends correctly (JSONDecodeError → returns -1 → 'no_action'), so the failure mode was 'backup lost' not 'spurious restore'.
Fix: tmp + os.replace pattern matching the main file write at line 446-453. Now backup either lands cleanly or doesn't land at all.

4026/4026 tests pass post-absorb.
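The "tmp + os.replace" pattern mentioned for the backup write can be sketched as below. This is a generic illustration under stated assumptions, not the code in `api/models.py`: the function name `write_atomic` and the temp-file naming are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def write_atomic(path: Path, text: str) -> None:
    """Write `text` to `path` so readers see the old file or the new one, never a torn one.

    The temp file is created in the same directory so that os.replace() stays
    on one filesystem, where the rename is atomic.
    """
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if writing or renaming failed, then re-raise.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Compared with `Path.write_text()`, which truncates the target before writing, a crash partway through this function leaves the previous backup untouched.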
nesquena-hermes added a commit that referenced this pull request on May 4, 2026
SHOULD-FIX #1 (renamed-root client cross-alias): drop strict-equality client filter at static/sessions.js:1853. Server-side _profiles_match cross-aliases 'default'-tagged rows to a renamed root 'kinni'; the strict-equality client would reject them, dropping every legacy session for renamed-root users. The server is now solely authoritative for profile scoping.

SHOULD-FIX #2 (messaging-source dedupe ordering): _keep_latest_messaging_session_per_source now runs AFTER the profile filter at api/routes.py:2078. Before, it ran on the merged-cross-profile list with profile-blind keys, discarding the older profile's row across profiles before the scope filter, which left zero rows for any messaging identity the active profile shared with another profile.

NIT #3: _projects_migrated flag now set only AFTER successful save_projects.
NIT #4: cleaned dead test code in test_is_root_profile_invalidation_drops_stale.
NIT #5: _create_profile_fallback's clone_from=='default' literal now routes through _is_root_profile() for parity with the 5 other callsites.

+2 regression tests pin the SHOULD-FIX shapes:
- test_keep_latest_messaging_runs_after_profile_filter (source-string ordering)
- test_static_sessions_js_trusts_server_profile_scoping (no client re-filter)

4173 -> 4175 tests pass. 0 regressions.
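Why the dedupe has to run after the profile filter is easiest to see on a two-row example. The sketch below is a simplified model, not the code in `api/routes.py`: the row shape, the functions `keep_latest_per_source` and `visible_rows`, and the use of strict profile equality (instead of the server's `_profiles_match` aliasing) are all assumptions.

```python
from typing import Dict, List

Row = Dict[str, object]


def keep_latest_per_source(rows: List[Row]) -> List[Row]:
    """Keep only the most recently updated row for each messaging source."""
    latest: Dict[object, Row] = {}
    for row in rows:
        current = latest.get(row["source"])
        if current is None or row["updated_at"] > current["updated_at"]:
            latest[row["source"]] = row
    return list(latest.values())


def visible_rows(rows: List[Row], active_profile: str) -> List[Row]:
    """Scope to the active profile first, then dedupe within that scope."""
    scoped = [row for row in rows if row["profile"] == active_profile]
    return keep_latest_per_source(scoped)
```

If the two steps are swapped, a newer row from another profile wins the dedupe and is then removed by the profile filter, leaving nothing for the active profile.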