Portability #1
Merged
nesquena-hermes merged 11 commits into master on Mar 31, 2026
added 11 commits
March 31, 2026 02:37
- api/config.py: full multi-strategy discovery for agent dir, python, state dir, and workspace. No hardcoded /home/hermes paths. Prints startup config and SSH tunnel command on launch.
- start.sh: completely rewritten. Discovers python and hermes-agent automatically. Creates a local .venv and installs requirements if no suitable python is found. Kills stale instances. Health-checks after launch. Prints SSH tunnel command when running over SSH.
- tests/conftest.py: all paths now discovered dynamically. No more hardcoded ~/webui-mvp or ~/.hermes/hermes-agent. Mirrors the same discovery logic as api/config.py.
- requirements.txt: added minimal dep list (pyyaml).
- README.md: rewritten around auto-discovery. No internal paths, no machine-specific examples. Documents all env vars and override patterns.
- PORTABILITY.md: tracked (was previously untracked).
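The multi-strategy discovery described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `discover_dir`, the variable name `EXAMPLE_AGENT_DIR`, and the marker-file check are hypothetical and are not taken from api/config.py.

```python
import os
from pathlib import Path
from typing import Iterable, Optional


def discover_dir(env_var: str, candidates: Iterable[Path], marker: str) -> Optional[Path]:
    """Return the first directory that contains `marker`, or None.

    Strategy 1: an explicit environment override is used if set.
    Strategy 2: otherwise, well-known candidate locations are tried in order.
    (Hypothetical helper; the real discovery logic may differ.)
    """
    override = os.environ.get(env_var)
    if override:
        path = Path(override).expanduser()
        # A bad override is reported as None instead of being silently replaced.
        return path if (path / marker).exists() else None
    for candidate in candidates:
        path = candidate.expanduser()
        if (path / marker).exists():
            return path
    return None
```

A caller would pass its own candidate list, for example `discover_dir("EXAMPLE_AGENT_DIR", [Path("~/agent"), Path("../agent")], "setup.py")`, where every argument is a placeholder.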
- start.sh now sources .env from the repo root before doing any discovery. Machine-local overrides load first, auto-detection fills in anything missing.
- .env.example added: a commented template listing every variable with explanations. New users copy this to .env and fill in only what their setup needs.
- .gitignore: .env stays excluded, but !.env.example is now tracked so the template ships with the repo.
- .env on this machine pins the exact paths matching the live server: agent_dir, python, state_dir, workspace, host, port, config path.
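The "overrides first, auto-detection fills the gaps" order can be sketched in Python as follows. This is an assumption-laden illustration rather than the start.sh logic itself (which is shell): `parse_env` and `resolve` are hypothetical names, and the parser handles only plain KEY=VALUE lines.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve(overrides: Dict[str, str], detected: Dict[str, str]) -> Dict[str, str]:
    """Non-empty overrides win; detection fills keys that are missing or empty."""
    merged = dict(detected)
    merged.update({key: value for key, value in overrides.items() if value})
    return merged
```

With this shape, a .env that pins only one value leaves every other setting to auto-detection.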
… unused sys import in conftest
- Replace 45.79.100.32 with <your-server-ip> everywhere
- Replace hermes@<host> with <user>@<your-server> everywhere
- Replace all /home/hermes/... absolute paths with generic equivalents
(<repo>/, <agent-dir>/, ~/ as appropriate) across all docs
- Replace 'Nathan' with generic references in ARCHITECTURE.md, TESTING.md
- tests/test_regressions.py: replace hardcoded Path('/home/hermes/webui-mvp/...')
with REPO_ROOT / '...' imported from conftest -- paths now work on any machine
- tests/test_sprint6.py, test_sprint10.py: same REPO_ROOT fix
- tests/test_sprint1.py, test_sprint9.py: fix docstring run commands
- static/panels.js: replace /home/hermes/CodePath placeholder with generic example
- tests/conftest.py: remove last /home/hermes mention from a comment
Files modified: AGENTS.md, ARCHITECTURE.md, CHANGELOG.md, PORTABILITY.md,
ROADMAP.md, TESTING.md, static/panels.js, tests/conftest.py,
tests/test_regressions.py, tests/test_sprint1.py, tests/test_sprint6.py,
tests/test_sprint10.py, tests/test_sprint9.py
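The REPO_ROOT replacement can be pictured with the short sketch below. It assumes conftest.py sits directly under tests/ in the repo root; the helper name `repo_root_from` is hypothetical and only makes the derivation testable.

```python
from pathlib import Path


def repo_root_from(conftest_file: str) -> Path:
    """Derive the repo root from the location of tests/conftest.py.

    Assumes the layout <repo>/tests/conftest.py, so the root is two
    levels above the file itself.
    """
    return Path(conftest_file).resolve().parent.parent


# In tests/conftest.py (sketch):   REPO_ROOT = repo_root_from(__file__)
# In a test module (sketch):       target = REPO_ROOT / "static" / "panels.js"
```

Because the root is computed from the file's own location, the same test resolves correctly wherever the repository is cloned.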
…m write_file pass)
test_regressions.py was mangled -- digits in string literals and variable
names were stripped along with the read_file prefixes. Rebuilt from master
with only the two legitimate portability changes re-applied:
- REPO_ROOT import from conftest
- pathlib.Path('/home/hermes/webui-mvp/...') -> REPO_ROOT / '...'
test_sprint6.py and test_sprint10.py had clean strips but are included
here to confirm all three are verified syntax-valid with zero hardcoded paths.
…0-line limit
Both files were capped at 500 lines during the privacy sweep because the sweep used read_file (default limit=500) then wrote the truncated content back via write_file, silently discarding the rest. Restored both from master and re-applied only the privacy replacements using raw file I/O.
Line counts restored:
- TESTING.md: 500 -> 1455 lines
- ARCHITECTURE.md: 500 -> 1049 lines
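A raw-file-I/O replacement pass of the kind described above might be sketched like this. The helper name `replace_in_file` and the line-count guard are assumptions added for illustration; the commit does not show the script that was actually used.

```python
from pathlib import Path
from typing import Dict


def replace_in_file(path: Path, replacements: Dict[str, str]) -> int:
    """Apply literal replacements to the whole file and return its line count.

    The file is read and written in full, so no tool-imposed line limit
    can truncate it on the way through.
    """
    original = path.read_text(encoding="utf-8")
    updated = original
    for old, new in replacements.items():
        updated = updated.replace(old, new)
    # Guard: these replacements are not expected to add or remove lines.
    if updated.count("\n") != original.count("\n"):
        raise ValueError(f"line count changed while editing {path}")
    path.write_text(updated, encoding="utf-8")
    return len(updated.splitlines())
```

Comparing the returned count with the count on master is a cheap check that nothing was silently dropped.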
…ivacy changes
Previous attempts used git show piped through terminal(), which hit the 50KB output cap and truncated the file, and the digit-strip regex mangled line content. This time: git checkout master -- TESTING.md to get the exact file, then apply only the legitimate privacy replacements in-place with Python.
…ly privacy changes
This was referenced Apr 2, 2026
nesquena-hermes
pushed a commit
that referenced
this pull request
Apr 12, 2026
- Title/badge/source: 2025 → 2026, sources updated to Chatbot Arena/BenchLM
- Overall: Opus 4.6 (#1, 1504 Arena Elo), Gemini 3.1 Pro (#2), GPT-5.4 (#3), Sonnet 4.6 (#4), DeepSeek V3.2 (#5, replaces DeepSeek R1)
- Coding: Opus 4.6 (#1, powers Claude Code/Cursor), GPT-5.4 (#2, SWE-Pro 57.7%), Sonnet 4.6 (#3), Gemini 3.1 Pro (#4), DeepSeek V3.2 (#5)
  Removed: Sonnet 4.5, Gemini 2.5 Pro, GPT-5.3 Codex
- Writing: Opus 4.6 (#1, Mazur 8.53), Gemini 3.1 Pro (#2, Arena CW 1487), Sonnet 4.6 (#3, EQ-Bench CW 1936), GPT-5.4 (#4), Meta Muse Spark (#5)
  Removed: Llama 4 Maverick (replaced by Meta Muse Spark)
- Search: Gemini 3.1 Pro (#1), Grok 4 (#2, live X data), Sonnet 4.6+tool (#3), GPT-5.4 (#4), Gemini 3 Flash Thinking (#5)
  Added Grok 4 for real-time social/X data
- Reasoning: Gemini 3.1 Pro (#1, GPQA 95.45%), GPT-5.4 (#2), Opus 4.6 (#3), Gemini 3 Flash Thinking (#4, best value), DeepSeek V3.2 (#5)
  Removed: Kimi K2, DeepSeek R1 standalone
- Quick picker: all model names updated to 2026 versions
- Setup boxes: OpenAI gpt-5-4 names, Google gemini-3-1-pro-preview, self-hosted section updated to DeepSeek V3.2 + Muse Spark
nesquena-hermes
pushed a commit
that referenced
this pull request
Apr 12, 2026
Coding section:
- Gemini 3.1 Pro: add Terminal-Bench 78.4% (highest of any frontier model on CLI/DevOps)
  Score badge updated to show Terminal-Bench rather than SWE-bench Verified
- GPT-5.4: note Terminal-Bench 75.1% in description, consolidate pill text
- #5 DeepSeek V3.2 → Qwen 3.6-Plus: leads Terminal-Bench at 61.6%, 88.2% GPQA, 1M context, available now on Alibaba Cloud + OpenRouter
Writing section (reordered based on EQ-Bench CW scores):
- #1 Claude Sonnet 4.6 (1936 EQ-Bench CW: highest, best voice consistency)
- #2 Claude Opus 4.6 (Mazur 8.53, IF Arena #1, 1M context for literary depth)
- #3 Gemini 3.1 Pro (Arena CW #1 1487, AI-tell avoidance, 2M context)
- #4 GPT-5.4 (noted as ~9th on Arena CW, better for structured/commercial writing)
- #5 Meta Muse Spark → Kimi K2.5 ($0.60/.50, ~1700 EQ-Bench CW, live API)
  Muse Spark removed: no commercial API available yet
Reasoning section:
- Gemini 3.1 Pro GPQA: 95.45% → 94.1% (more conservative/recent figure, consistent with both agents' data)
- Added ARC-AGI-2 77.1% for Gemini 3.1 Pro (#1 on visual reasoning too)
- Opus 4.6: added note that Sonnet leads GDPval-AA (1633 Elo #1) for throughput
- #5 DeepSeek V3.2 → Qwen 3.6-Plus (88.2% GPQA, 1M context, same model as coding)
Quick picker:
- Creative writing: Opus → Sonnet 4.6 (EQ-Bench #1, 85% cheaper)
- Hard reasoning: 95.45% → 94.1%, add ARC-AGI-2 mention
- Budget pick: DeepSeek V3.2 → Gemini 3 Flash Thinking ($0.50/1M, 89.8% GPQA)
Setup boxes:
- Self-hosted: Muse Spark → Qwen 3.6-Plus + Gemma 4 26B MoE (Apache 2.0, 82.3% GPQA with 3.8B active params, best edge/self-hosted reasoning)
Overall section: unchanged (top 5 still correct per both agents)
Search section: unchanged (no new data from either agent)
nesquena-hermes
pushed a commit
that referenced
this pull request
Apr 14, 2026
index.html:
- Hero badge: 56k+ → 77k+ GitHub stars (NousResearch/hermes-agent: 76,811)
community/index.html:
- All 24 project star counts updated to live GitHub values (April 13 2026)
Notable bumps: gbrain 4.8k→7.4k, hermes-webui 1.3k→1.8k, hermes-agent 56k+→77k,
hermes-workspace 1.1k→1.3k, hudui 571→827, self-evolution 893→1.6k
- Removed awizermann/scarf (repo no longer exists on GitHub)
- Added 3 new high-value projects:
- vectorize-io/hindsight (9.1k ⭐) → Memory Providers: Official Hermes memory plugin,
biomimetic 3-tier memory, topped LongMemEval Jan 2026
- rohitg00/agentmemory (1.3k ⭐) → Memory Providers: #1 persistent memory for agents,
explicit Hermes integration path, 43 MCP tools
- builderz-labs/mission-control (4.1k ⭐) → Tools & Orchestration: self-hosted agent
platform with Hermes gateway integration
- Updated project count: 25 → 27
models/index.html:
- Gemini 3.1 Pro: bumped from #4 Arena (1492 Elo) to #1 (1505 Elo)
- Claude Opus 4.6: updated from #1 (1504) to #2 (1503 Elo) — still top tier
- Updated datestamp to April 13, 2026
nesquena-hermes
pushed a commit
that referenced
this pull request
Apr 16, 2026
Factual fixes:
- Hindsight link now points to github.com/vectorize-io/hindsight (hindsight.ai 403)
- Holographic: removed external link (domain for sale), note it is built into Hermes
- RetainDB: flagged domain availability as unverified
- Hindsight benchmarks added: LongMemEval 91-94.6%, BEAM 64.1%, LoCoMo 89.6%
- Holographic hosting corrected to Local only (was Cloud)
- Honcho hosting corrected to Cloud + self-host (was Cloud only)
- ByteRover license kept as Partial OSS (not MIT)
- Supermemory stars footnoted: consumer frontend, not core engine
Content additions:
- Architecture notes for all 8 providers (TEMPR, HRR/SQLite, L0/L1/L2, triple-store, etc.)
- Full pricing detail: paid tiers, per-token cloud rates
- GitHub star counts in table and provider cards
- Tool counts (2-5) in table and provider cards
- Config snippets section with 6 provider examples
- Last verified April 2026 timestamp in hero
Recommendation fixes:
- Coding agents: Hindsight #1 (TEMPR BM25, entity extraction), ByteRover #2
- Knowledge wiki: Hindsight #1 (highest LongMemEval), Supermemory #2
- Privacy: Holographic #1 (zero deps, local SQLite), then Hindsight, OpenViking
- Added Personal AI companion use case (Honcho #1)
- Benchmark winner updated to Hindsight on LongMemEval axis
Table structure:
- Best for column moved to position 2 (was last/rightmost)
- Paid from column added
- Stars and Tools columns added (hidden on narrow screens)
- Benchmark column shows multiple scores for Hindsight/ByteRover
nesquena-hermes
pushed a commit
that referenced
this pull request
Apr 21, 2026
…coding, writing section updated)
nesquena
added a commit
that referenced
this pull request
Apr 24, 2026
The /background feature was fundamentally non-functional as shipped: two coupled bugs kept results from ever reaching the user.
1. complete_background() was defined but NEVER called. The _handle_background thread ran _run_agent_streaming and then exited; no hook signalled the task tracker that the work was done. Every background task stayed in status="running" forever and get_results() (which filters to done-only) always returned [].
2. get_results() called _BACKGROUND_TASKS.pop(parent_sid, []) which removed the ENTIRE list, including tasks still in flight. Even if bug #1 were fixed, the first frontend poll during a long-running task would drop the task from the tracker, and complete_background()'s loop would iterate over an empty list when the worker eventually finished, so the result would still be lost.
Fix:
- api/background.py::get_results now retains running tasks in the dict; only done ones are popped and returned.
- api/routes.py::_handle_background wraps _run_agent_streaming in an inline worker (_run_bg_and_notify) that, after streaming completes, reloads the hidden bg session, extracts the last non-error assistant message, and calls complete_background(parent_sid, task_id, answer). Worker also best-effort unlinks the hidden bg session file so SESSION_DIR doesn't accumulate debris.
- Exception safety: any failure in _run_agent_streaming or the post-processing path still calls complete_background with a fallback sentinel so the frontend's polling loop doesn't hang forever.
Added 5 regression tests in tests/test_background_tasks.py:
- running tasks survive get_results polls
- done tasks are returned and removed
- poll → complete → poll round-trip surfaces the answer (this is the original bug's reproduction path)
- empty parent is cleaned up
- static check: _handle_background's worker calls complete_background and uses Session.load to extract the answer
Full suite: 2023 passed, 0 failed.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
nesquena-hermes
pushed a commit
that referenced
this pull request
Apr 24, 2026
GeoffBao
pushed a commit
to GeoffBao/hermes-webui
that referenced
this pull request
Apr 29, 2026
When _loadOlderMessages prepends older messages, the viewport snaps
to the bottom instead of staying where the user was.
Two bugs compounding:
1. Wrong scrollable container. Code used `$("msgInner")` for scrollHeight
and scrollTop, but #msgInner has no overflow-y — it is a flex column.
The actual scrollable container is #messages (`.messages{overflow-y:auto}`).
Setting msgInner.scrollTop was silently ignored.
2. renderMessages calls scrollToBottom at the end (ui.js:2552),
which unconditionally scrolls #messages to the bottom and sets
_scrollPinned=true. Since bug #1 made the scroll-restore a no-op,
the page landed at the bottom every time.
Fix:
- Changed scroll restore target from `$("msgInner")` to `$("messages")`.
- Reset _scrollPinned = false after restoring the user position,
so scrollToBottom does not re-fire on next tick.
JKJameson
pushed a commit
to JKJameson/hermes-webui
that referenced
this pull request
Apr 29, 2026
nesquena-hermes
added a commit
that referenced
this pull request
Apr 30, 2026
Per Opus pre-release review (SHOULD-FIX #1): the CHANGELOG claimed both filter sites exempt 'active_stream_id OR pending_user_message', but the index path operates on compact() output which doesn't include pending_user_message. The behavior is correct in both paths because both fields are set/cleared in lockstep during streaming, but the wording was stronger than what the code does. Tightened to describe what each path actually checks.
starship-s
pushed a commit
to starship-s/hermes-webui
that referenced
this pull request
Apr 30, 2026
- New CONTRIBUTORS.md: full ranked credit roll for all 66 contributors (5+ tiers), with first/latest release versions, single-PR roll, and attribution methodology. Generated from git log + gh pulls API + CHANGELOG mention parsing.
- README.md: stack-ranked top-10 contributors table at the top of the Contributors section, link to CONTRIBUTORS.md for the full list. Updated test count (1898 → 3309). Refreshed @franksong2702 and @bergeouss entries to reflect their broader bodies of work (now the #1 and #2 external contributors).
- ARCHITECTURE.md: removed stale 'tracks upstream v0.50.36' header; bumped current shipped build to v0.50.245 with current architecture state notes (streaming-markdown vendoring, byte-range streaming, configurable-model-badges).
- ROADMAP.md / SPRINTS.md / TESTING.md: header/last-updated bumps to v0.50.245 and 3309 tests. SPRINTS.md 'Where we are now' section refreshed for current CLI/Claude parity (~95% Claude parity now).
Generated by aggregating CHANGELOG attribution lines, gh PR API authors, and CHANGELOG version-section walks. Internal/bot accounts filtered out.
GeoffBao
pushed a commit
to GeoffBao/hermes-webui
that referenced
this pull request
May 1, 2026
GeoffBao
pushed a commit
to GeoffBao/hermes-webui
that referenced
this pull request
May 1, 2026
This was referenced May 1, 2026
This was referenced May 2, 2026
nesquena-hermes
pushed a commit
to ccqqlo/hermes-webui
that referenced
this pull request
May 2, 2026
…na#1458 Bug #1)
Issue nesquena#1458 reports persistent-host crashes (≥1/day) when running the WebUI under launchd KeepAlive on macOS. Root cause: `bootstrap.py` calls `subprocess.Popen([python, "server.py"], start_new_session=True)`, probes /health, then exits 0. Under any process supervisor (launchd, systemd, supervisord, runit, s6), the supervisor sees its tracked PID exit, marks the program as "completed," and respawns it. The new bootstrap fails to bind port 8787 (the orphaned server still has it) and exits non-zero, so the supervisor respawns again. The loop continues until the orphan crashes for some other reason and the next respawn finds the port free.
This PR addresses Bug #1 of the three failure modes tracked in nesquena#1458: the `bootstrap.py` double-fork breaking process supervisors. Bug #2 (state.db FD leak) and Bug #3 (HTTP-unhealthy wedge) remain open under the same issue; they need diagnosis data before a fix can land.
Changes
-------
1. `bootstrap.py`:
   - New `--foreground` argparse flag with help text mentioning launchd / systemd / supervisord.
   - New `_detect_supervisor()` that returns the env var name for any supervisor it detects: `INVOCATION_ID` / `JOURNAL_STREAM` / `NOTIFY_SOCKET` (systemd, s6), `XPC_SERVICE_NAME` (launchd), `SUPERVISOR_ENABLED` (supervisord), or `HERMES_WEBUI_FOREGROUND` for the explicit user opt-in. Truthy values for the explicit opt-in: `1` / `true` / `yes` / `on` (case-insensitive).
   - `main()` branches on `args.foreground or _detect_supervisor()`:
     - **Foreground path:** chdir to `agent_dir or REPO_ROOT`, then `os.execv(python, [python, server_path])` to replace the bootstrap process image with the server. The supervisor sees the long-lived server as the original child. No `wait_for_health` probe: the supervisor's KeepAlive / Restart=on-failure handles liveness.
     - **Default path:** unchanged. Spawn server as detached child via `Popen + start_new_session=True`, probe /health, return 0. This still works for interactive `bash start.sh` invocations.
   - Resolved env vars (HOST/PORT/STATE_DIR/AGENT_DIR) are now mutated on `os.environ` directly instead of into a local `env` copy so they are inherited across `os.execv`.
2. `docs/supervisor.md` (new): runnable launchd plist, systemd .service, and supervisord conf examples + a diagnostic recipe (`lsof` + ppid chain) for catching the orphan-loop in production.
3. `.gitignore`: allowlist `docs/supervisor.md` (the directory uses an opt-in pattern; matches the existing `!docs/docker.md` precedent).
4. `tests/test_bootstrap_foreground.py` (new): 35 regression tests covering the argparse flag, `_detect_supervisor()` behavior across all five supervisor env vars, the explicit opt-in's truthy/falsy values, and `main()`'s execv-vs-Popen routing decision under each input combination. `os.execv` is monkeypatched in the routing tests: we pin the structural choice (which call is made, with which args, in which cwd, with which env), not the post-exec behavior.
Why this scope and no more
--------------------------
Bug #2 (state.db FD leak) lists 5 candidate paths and asks the reporter for `lsof -p <pid> | sort | uniq -c | sort -rn | head -20` output to disambiguate. Until that data lands, any "fix" would be speculative, which is explicitly out of scope per the contributor-pickup comment on the issue.
Bug #3 (launchd-running, port-listening, HTTP-unhealthy) was added in @stefanpieter's reply comment. Diagnosis is in flight; no concrete fix shape yet. Also out of scope.
Running locally end-to-end verifies the behavior:
```
[bootstrap] Starting Hermes Web UI on http://127.0.0.1:8789 (foreground mode: --foreground)
$ pgrep -af 'server.py'
2997632 /home/.../python /tmp/wt-fix-1458/server.py
$ ps -o ppid -p 2997632
2997581   ← bash that ran bootstrap.py (same PID as the original bootstrap)
$ ps -p 2997581 -o cmd
... bootstrap.py ...   ← but exec'd into server.py
```
The same PID that bash forked for `bootstrap.py` is now `server.py`. A supervisor watching that PID would correctly observe the long-lived server. No double-fork.
Verification
------------
- 3811 tests pass (`pytest tests/`, full suite, +51 from this PR plus master-merge-in)
- All 35 new bootstrap-foreground tests pass
- `bash scripts/run-browser-tests.sh` PASS (HTTP API checks against worktree)
- `bash scripts/webui_qa_agent.sh 8789` PASS (23/23 visual QA)
- Live verified: server starts cleanly under both `--foreground` and `HERMES_WEBUI_FOREGROUND=1`; PID lineage confirms no double-fork
Closes nesquena#1458 (Bug #1 only). Bugs #2 and #3 remain tracked under the issue.
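The detection and routing decision described in this commit could be sketched roughly as follows. The environment variable names and truthy values come from the commit message; everything else is an assumption made for illustration, including the order of the checks, the `launch_plan` helper, and the absence of any filtering of noisy values. The sketch returns the intended call instead of executing it.

```python
import os
from typing import List, Mapping, Optional, Tuple

# Supervisor-provided variables, as listed in the commit message.
_SUPERVISOR_VARS = ("INVOCATION_ID", "JOURNAL_STREAM", "NOTIFY_SOCKET",
                    "XPC_SERVICE_NAME", "SUPERVISOR_ENABLED")
_TRUTHY = {"1", "true", "yes", "on"}


def detect_supervisor(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the env var that signals supervision, or None."""
    opt_in = env.get("HERMES_WEBUI_FOREGROUND", "").strip().lower()
    if opt_in in _TRUTHY:
        return "HERMES_WEBUI_FOREGROUND"
    for name in _SUPERVISOR_VARS:
        if env.get(name):
            return name
    return None


def launch_plan(foreground: bool, env: Mapping[str, str],
                python: str, server_path: str) -> Tuple[str, List[str]]:
    """Describe which call main() would make (hypothetical helper).

    "execv": replace this process with the server, so the supervisor's
             tracked PID stays alive for as long as the server does.
    "popen": spawn a detached child and let the bootstrap exit.
    """
    argv = [python, server_path]
    if foreground or detect_supervisor(env):
        return ("execv", argv)
    return ("popen", argv)
```

Returning a plan keeps the decision testable without replacing the test process, which mirrors the reason the commit monkeypatches `os.execv` in its routing tests.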
nesquena-hermes
pushed a commit
to ccqqlo/hermes-webui
that referenced
this pull request
May 2, 2026
…267 follow-ups
- CHANGELOG.md: v0.50.269 entry detailing nesquena#1478 nesquena#1479 nesquena#1480
- ROADMAP.md: bump to v0.50.269, 3847 tests collected
- TESTING.md: bump header + total to 3847
nesquena#1478: nesquena APPROVED self-built bootstrap.py --foreground mode (closes nesquena#1458 Bug #1, +Opus follow-ups: XPC noise filter, executability guard)
nesquena#1479: surgical follow-up to nesquena#1473; Session.compact() now includes pending_user_message
nesquena#1480: bfcache pageshow restores active session via loadSession + checkInflightOnBoot
3847 tests pass (+47 net). Opus advisor on stage diff: no blockers.
This was referenced May 2, 2026
lanehollard
pushed a commit
to lanehollard/hermes-webui
that referenced
this pull request
May 3, 2026
…ix + P0 polish bundle (nesquena#1466 nesquena#1494 nesquena#1469 nesquena#1484 nesquena#1486)
3 PRs in this batch (3866 → 3874 tests, +8):
- nesquena#1493 (@dso2ng): sidebar Stop response cancels row's stream not active pane's (closes nesquena#1466, follow-up to nesquena#1480)
- nesquena#1495 (self-built; reported by @insecurejezza in nesquena#1494): state.db connection FD leak in sidebar polling (closes nesquena#1494, addresses Bug #2 of nesquena#1458)
- nesquena#1492 (@bergeouss): P0 bugfixes bundle: tool-card args readability + CLI rename persistence + scroll pinning + sw.js relative-path regression test (closes nesquena#1469 nesquena#1484 nesquena#1486)
This release closes Bug #2 of the umbrella issue nesquena#1458. Bug #1 was closed by v0.50.269 (nesquena#1483) + v0.50.270 (nesquena#1487). Bug #3 (HTTP-unhealthy without FD exhaustion) is the remaining work item.
nesquena-hermes
pushed a commit
that referenced
this pull request
May 3, 2026
…n guard)
CHANGELOG, ROADMAP, TESTING bumped (3925 → 3929 tests collected).
Opus SHOULD-FIX absorbed in-release: tests #1-3 documented the dedup contract via direct construction but did not invoke get_models_grouped(). Test #4 (test_get_models_grouped_unconfigured_providers_get_independent_dicts) inspects the live source for the literal copy.deepcopy(auto_detected_models) call AND runs an end-to-end smoke of the fixed assignment loop. A future refactor that removes the deepcopy at api/config.py:2078 will fail this test immediately.
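Why the deepcopy matters can be shown with a small sketch. The two helper functions and the sample data are hypothetical; only the `copy.deepcopy(auto_detected_models)` idea is taken from the commit message, and the real assignment loop in api/config.py may look different.

```python
import copy
from typing import Dict, List


def assign_shared(models: List[dict], providers: List[str]) -> Dict[str, List[dict]]:
    """Buggy shape: every provider points at the same list and dicts."""
    return {provider: models for provider in providers}


def assign_independent(models: List[dict], providers: List[str]) -> Dict[str, List[dict]]:
    """Fixed shape: each provider gets its own deep copy."""
    return {provider: copy.deepcopy(models) for provider in providers}


auto_detected_models = [{"id": "model-a", "label": "Model A"}]  # hypothetical data
providers = ["provider-x", "provider-y"]                          # hypothetical names

shared = assign_shared(auto_detected_models, providers)
independent = assign_independent(auto_detected_models, providers)

# Mutating one provider's entry in the independent mapping leaves the others alone.
independent["provider-x"][0]["label"] = "Edited for X"
```

In the shared mapping the same edit would show up under every provider and in the source list, which is the cross-contamination the regression test guards against.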
waldmanz
pushed a commit
to waldmanz/hermes-webui
that referenced
this pull request
May 4, 2026
Both flagged by pre-release Opus advisor; both clearly defensive and small enough to absorb in-release per the reviewer-flagged-fix-in-release-not-followup policy.
SHOULD-FIX #1 (api/routes.py:_clear_stale_stream_state, ~25 LOC): After the metadata-only reload (nesquena#1559 Layer 2), the local 'session' variable is reassigned to the full-load object but the caller still holds the original metadata-only stub. /api/session then returns the stale active_stream_id at routes.py:1791, causing the frontend to attempt one ghost SSE reconnect before recovering. Fix: capture original_stub at function entry, then patch its in-memory active_stream_id and pending_* fields to None after both the early-return (full-load already cleared) path AND the successful-mutation path. Now the caller's read returns fresh state, no ghost reconnect.
SHOULD-FIX #2 (api/models.py:Session.save, ~20 LOC): The .bak write at api/models.py:436 used write_text(), which truncates then writes, so a crash mid-write or a concurrent backup-producing save could leave a torn .bak. Recovery defends correctly (JSONDecodeError → returns -1 → 'no_action'), so the failure mode was 'backup lost' not 'spurious restore'. Fix: tmp + os.replace pattern matching the main file write at line 446-453. Now backup either lands cleanly or doesn't land at all.
4026/4026 tests pass post-absorb.
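The tmp + os.replace pattern named in SHOULD-FIX #2 can be sketched as below. This is a generic illustration of the technique, not the code in api/models.py; the function name `atomic_write_text` and the fsync step are assumptions.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Write `text` so that `path` holds either the old or the new content.

    The data goes to a temporary file in the same directory first, and
    os.replace then swaps it into place in a single step.
    """
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent),
                                    prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the swap
        os.replace(tmp_name, path)
    except BaseException:
        # Leave no temporary debris behind if anything failed.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

The temporary file lives next to the target so the replace stays on one filesystem, which is what allows the swap to be atomic.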
nesquena-hermes
added a commit
that referenced
this pull request
May 4, 2026
SHOULD-FIX #1 (renamed-root client cross-alias): drop strict-equality client filter at static/sessions.js:1853. Server-side _profiles_match cross-aliases 'default'-tagged rows to a renamed root 'kinni'; the strict-equality client would reject them, dropping every legacy session for renamed-root users. The server is now solely authoritative for profile scoping.
SHOULD-FIX #2 (messaging-source dedupe ordering): _keep_latest_messaging_session_per_source now runs AFTER the profile filter at api/routes.py:2078. Before, it ran on the merged-cross-profile list with profile-blind keys, discarding the older profile's row across profiles before the scope filter, which left zero rows for any messaging identity the active profile shared with another profile.
NIT #3: _projects_migrated flag now set only AFTER successful save_projects.
NIT #4: cleaned dead test code in test_is_root_profile_invalidation_drops_stale.
NIT #5: _create_profile_fallback's clone_from=='default' literal now routes through _is_root_profile() for parity with the 5 other callsites.
+2 regression tests pin the SHOULD-FIX shapes:
- test_keep_latest_messaging_runs_after_profile_filter (source-string ordering)
- test_static_sessions_js_trusts_server_profile_scoping (no client re-filter)
4173 -> 4175 tests pass. 0 regressions.
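The ordering issue in SHOULD-FIX #2 is easiest to see on a tiny example. The sketch below is an assumption-heavy simplification: rows are plain dicts with hypothetical `source`, `profile`, and `updated` keys, and profile scoping is a plain equality check instead of the server's _profiles_match.

```python
from typing import Dict, List


def keep_latest_per_source(rows: List[dict]) -> List[dict]:
    """Keep only the most recently updated row for each messaging source."""
    latest: Dict[str, dict] = {}
    for row in rows:
        key = row["source"]  # profile-blind key, as in the commit message
        if key not in latest or row["updated"] > latest[key]["updated"]:
            latest[key] = row
    return list(latest.values())


def visible_sessions_fixed(rows: List[dict], active_profile: str) -> List[dict]:
    """Fixed order: scope to the active profile first, then dedupe."""
    scoped = [row for row in rows if row["profile"] == active_profile]
    return keep_latest_per_source(scoped)


def visible_sessions_buggy(rows: List[dict], active_profile: str) -> List[dict]:
    """Old order: dedupe across profiles first, then scope."""
    deduped = keep_latest_per_source(rows)
    return [row for row in deduped if row["profile"] == active_profile]
```

When two profiles share a messaging identity and the active profile's row is the older one, the old order discards it before scoping and returns nothing, while the fixed order keeps it.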
This was referenced May 4, 2026
long12310225
pushed a commit
to long12310225/hermes-webui
that referenced
this pull request
May 6, 2026
…p + test-isolation fix
Constituent PRs:
- nesquena#1725 (@Michaelyklam): simplify compact Activity row summary
- nesquena#1726 (@Michaelyklam): delegate generic provider catalogs to Hermes CLI (slice of nesquena#1240)
- nesquena#1727 (@Michaelyklam): link Claude Code OAuth in onboarding (closes nesquena#1362)
- nesquena#1728 (@starship-s): preserve profile context when starting chats
- nesquena#1729 (@Michaelyklam): persist compact Activity disclosure state
- nesquena#1730 (@Michaelyklam): prevent sticky sidebar hover drag state
- nesquena#1732 (@Sanjays2402): unpin scroll on small upward motion during streaming (closes nesquena#1731)
Plus 2 in-stage absorbed fixes:
- test-isolation fix: monkeypatch.setattr(config, 'cfg', X) survives PR nesquena#1728's path/mtime-aware get_config() reload. Mandatory before tag (Opus stage-302).
- Opus SHOULD-FIX #1: _lastScrollTop reset on session switch (nesquena#1732 follow-up).
Tests: 4537 → 4584 passing (+47). 0 regressions. Full suite ~128s. Stably green.
Pre-release verification:
- All 7 PRs CI-green individually + rebased onto master
- pytest 4584 passed, 0 failed (multiple runs)
- node -c clean on all 4 modified .js files
- 11/11 browser API endpoints PASS on isolated port 8789
- 20 QA tests via webui_qa_agent.sh PASS
- Opus advisor: SHIP, 5/5 verification clean, 0 MUST-FIX, 1 SHOULD-FIX absorbed (_lastScrollTop reset), 1 SHOULD-FIX deferred (nesquena#1736: _clear_anthropic_env_values race, onboarding-time-only)
Closes nesquena#1362, nesquena#1731.
No description provided.