Portability #1

Merged
nesquena-hermes merged 11 commits into master from portability
Mar 31, 2026

Conversation

@nesquena-hermes
Collaborator

No description provided.

Hermes added 11 commits March 31, 2026 02:37
- api/config.py: full multi-strategy discovery for agent dir, python,
  state dir, and workspace. No hardcoded /home/hermes paths. Prints
  startup config and SSH tunnel command on launch.

- start.sh: completely rewritten. Discovers python and hermes-agent
  automatically. Creates local .venv + installs requirements if no
  suitable python found. Kills stale instances. Health-checks after
  launch. Prints SSH tunnel command when running over SSH.

- tests/conftest.py: all paths now discovered dynamically. No more
  hardcoded ~/webui-mvp or ~/.hermes/hermes-agent. Mirrors the same
  discovery logic as api/config.py.

- requirements.txt: added minimal dep list (pyyaml).

- README.md: rewritten around auto-discovery. No internal paths,
  no machine-specific examples. Documents all env vars and override
  patterns.

- PORTABILITY.md: tracked (was previously untracked).
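
The multi-strategy discovery described above for api/config.py can be sketched roughly as follows. This is a minimal illustration under assumptions, not the project's code: the function name `discover_dir`, the variable name used in the usage note, and the candidate paths are all hypothetical.

```python
import os
from pathlib import Path
from typing import Iterable, Mapping, Optional


def discover_dir(
    env_var: str,
    candidates: Iterable[Path],
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[Path]:
    """Return the first usable directory: explicit override first, then candidates."""
    environ = os.environ if environ is None else environ

    # Strategy 1: an explicit override wins if it points at a real directory.
    override = environ.get(env_var)
    if override:
        path = Path(override).expanduser()
        if path.is_dir():
            return path

    # Strategy 2: fall back to well-known locations, in priority order.
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_dir():
            return path

    # Nothing found: the caller decides how to report it.
    return None
```

A caller might write `discover_dir("AGENT_DIR_OVERRIDE", [Path("~/hermes-agent")])`, where both the variable name and the path are placeholders. Returning `None` rather than raising lets the launcher print a single consolidated startup report.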
- start.sh now sources .env from the repo root before doing any
  discovery. Machine-local overrides load first, auto-detection
  fills in anything missing.

- .env.example added: a commented template listing every variable
  with explanations. New users copy this to .env and fill in only
  what their setup needs.

- .gitignore: .env stays excluded, but !.env.example is now tracked
  so the template ships with the repo.

- .env on this machine pins the exact paths matching the live server:
  agent_dir, python, state_dir, workspace, host, port, config path.
- Replace 45.79.100.32 with <your-server-ip> everywhere
- Replace hermes@<host> with <user>@<your-server> everywhere
- Replace all /home/hermes/... absolute paths with generic equivalents
  (<repo>/, <agent-dir>/, ~/  as appropriate) across all docs
- Replace 'Nathan' with generic references in ARCHITECTURE.md, TESTING.md
- tests/test_regressions.py: replace hardcoded Path('/home/hermes/webui-mvp/...')
  with REPO_ROOT / '...' imported from conftest -- paths now work on any machine
- tests/test_sprint6.py, test_sprint10.py: same REPO_ROOT fix
- tests/test_sprint1.py, test_sprint9.py: fix docstring run commands
- static/panels.js: replace /home/hermes/CodePath placeholder with generic example
- tests/conftest.py: remove last /home/hermes mention from a comment

Files modified: AGENTS.md, ARCHITECTURE.md, CHANGELOG.md, PORTABILITY.md,
ROADMAP.md, TESTING.md, static/panels.js, tests/conftest.py,
tests/test_regressions.py, tests/test_sprint1.py, tests/test_sprint6.py,
tests/test_sprint10.py, tests/test_sprint9.py
test_regressions.py was mangled -- digits in string literals and variable
names were stripped along with the read_file prefixes. Rebuilt from master
with only the two legitimate portability changes re-applied:
  - REPO_ROOT import from conftest
  - pathlib.Path('/home/hermes/webui-mvp/...') -> REPO_ROOT / '...'

test_sprint6.py and test_sprint10.py had clean strips but are included
here to confirm all three are verified syntax-valid with zero hardcoded paths.
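
The REPO_ROOT replacement above might look like the sketch below. How conftest actually computes REPO_ROOT is not shown in this PR, so the helper `repo_root_from` and the assumption that conftest.py sits directly in `<repo>/tests/` are illustrative only.

```python
from pathlib import Path


def repo_root_from(conftest_file: str) -> Path:
    """Derive the repository root from the location of tests/conftest.py."""
    # Assumes conftest.py lives in <repo>/tests/, so the root is two levels up.
    return Path(conftest_file).resolve().parent.parent


# In tests/conftest.py this would be used as:
#     REPO_ROOT = repo_root_from(__file__)
# and in a test module:
#     from conftest import REPO_ROOT
#     source = (REPO_ROOT / "static" / "panels.js").read_text()
```

Because every path is built from the location of the file itself, the tests no longer depend on the checkout living under a particular home directory.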
…0-line limit

Both files were capped at 500 lines during the privacy sweep because the
sweep used read_file (default limit=500) then wrote the truncated content
back via write_file, silently discarding the rest.

Restored both from master and re-applied only the privacy replacements
using raw file I/O. Line counts restored:
  TESTING.md:      500 -> 1455 lines
  ARCHITECTURE.md: 500 -> 1049 lines
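
The "raw file I/O" approach can be sketched as below: read the whole file, apply literal replacements, and refuse to write if the line count changes. The helper name `replace_in_place` and the line-count guard are assumptions added for illustration, not the script that was actually run.

```python
from pathlib import Path
from typing import Mapping


def replace_in_place(path: Path, replacements: Mapping[str, str]) -> int:
    """Apply literal replacements to a whole file and return its line count."""
    # Read the entire file; there is no line limit, so nothing is silently dropped.
    original = path.read_text(encoding="utf-8")
    updated = original
    for old, new in replacements.items():
        updated = updated.replace(old, new)

    # Guard: single-line replacements must never change the number of lines.
    before = original.count("\n")
    after = updated.count("\n")
    if before != after:
        raise ValueError(f"line count changed: {before} -> {after}")

    path.write_text(updated, encoding="utf-8")
    return len(updated.splitlines())
```

A guard of this kind would have caught the 1455 -> 500 truncation before the file was written back.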
…ivacy changes

Previous attempts used git show piped through terminal() which hit the 50KB
output cap and truncated the file, and the digit-strip regex mangled line
content. This time: git checkout master -- TESTING.md to get the exact file,
then apply only the legitimate privacy replacements in-place with Python.
nesquena-hermes merged commit fd1a865 into master Mar 31, 2026
nesquena-hermes deleted the portability branch March 31, 2026 02:59
nesquena-hermes pushed a commit that referenced this pull request Apr 12, 2026
- Title/badge/source: 2025 → 2026, sources updated to Chatbot Arena/BenchLM
- Overall: Opus 4.6 (#1, 1504 Arena Elo), Gemini 3.1 Pro (#2), GPT-5.4 (#3),
  Sonnet 4.6 (#4), DeepSeek V3.2 (#5, replaces DeepSeek R1)
- Coding: Opus 4.6 (#1, powers Claude Code/Cursor), GPT-5.4 (#2, SWE-Pro 57.7%),
  Sonnet 4.6 (#3), Gemini 3.1 Pro (#4), DeepSeek V3.2 (#5)
  Removed: Sonnet 4.5, Gemini 2.5 Pro, GPT-5.3 Codex
- Writing: Opus 4.6 (#1, Mazur 8.53), Gemini 3.1 Pro (#2, Arena CW 1487),
  Sonnet 4.6 (#3, EQ-Bench CW 1936), GPT-5.4 (#4), Meta Muse Spark (#5)
  Removed: Llama 4 Maverick (replaced by Meta Muse Spark)
- Search: Gemini 3.1 Pro (#1), Grok 4 (#2, live X data), Sonnet 4.6+tool (#3),
  GPT-5.4 (#4), Gemini 3 Flash Thinking (#5)
  Added Grok 4 for real-time social/X data
- Reasoning: Gemini 3.1 Pro (#1, GPQA 95.45%), GPT-5.4 (#2), Opus 4.6 (#3),
  Gemini 3 Flash Thinking (#4, best value), DeepSeek V3.2 (#5)
  Removed: Kimi K2, DeepSeek R1 standalone
- Quick picker: all model names updated to 2026 versions
- Setup boxes: OpenAI gpt-5-4 names, Google gemini-3-1-pro-preview,
  self-hosted section updated to DeepSeek V3.2 + Muse Spark
nesquena-hermes pushed a commit that referenced this pull request Apr 12, 2026
Coding section:
- Gemini 3.1 Pro: add Terminal-Bench 78.4% (highest of any frontier model on CLI/DevOps)
  Score badge updated to show Terminal-Bench rather than SWE-bench Verified
- GPT-5.4: note Terminal-Bench 75.1% in description, consolidate pill text
- #5 DeepSeek V3.2 → Qwen 3.6-Plus: leads Terminal-Bench at 61.6%, 88.2% GPQA,
  1M context, available now on Alibaba Cloud + OpenRouter

Writing section (reordered based on EQ-Bench CW scores):
- #1 Claude Sonnet 4.6 (1936 EQ-Bench CW — highest, best voice consistency)
- #2 Claude Opus 4.6 (Mazur 8.53, IF Arena #1, 1M context for literary depth)
- #3 Gemini 3.1 Pro (Arena CW #1 1487, AI-tell avoidance, 2M context)
- #4 GPT-5.4 (noted as ~9th on Arena CW, better for structured/commercial writing)
- #5 Meta Muse Spark → Kimi K2.5 ($0.60/.50, ~1700 EQ-Bench CW, live API)
  Muse Spark removed — no commercial API available yet

Reasoning section:
- Gemini 3.1 Pro GPQA: 95.45% → 94.1% (more conservative/recent figure, consistent
  with both agents' data)
- Added ARC-AGI-2 77.1% for Gemini 3.1 Pro (#1 on visual reasoning too)
- Opus 4.6: added note that Sonnet leads GDPval-AA (1633 Elo #1) for throughput
- #5 DeepSeek V3.2 → Qwen 3.6-Plus (88.2% GPQA, 1M context, same model as coding)

Quick picker:
- Creative writing: Opus → Sonnet 4.6 (EQ-Bench #1, 85% cheaper)
- Hard reasoning: 95.45% → 94.1%, add ARC-AGI-2 mention
- Budget pick: DeepSeek V3.2 → Gemini 3 Flash Thinking ($0.50/1M, 89.8% GPQA)

Setup boxes:
- Self-hosted: Muse Spark → Qwen 3.6-Plus + Gemma 4 26B MoE (Apache 2.0,
  82.3% GPQA with 3.8B active params, best edge/self-hosted reasoning)

Overall section: unchanged (top 5 still correct per both agents)
Search section: unchanged (no new data from either agent)
nesquena-hermes pushed a commit that referenced this pull request Apr 14, 2026
index.html:
- Hero badge: 56k+ → 77k+ GitHub stars (NousResearch/hermes-agent: 76,811)

community/index.html:
- All 24 project star counts updated to live GitHub values (April 13 2026)
  Notable bumps: gbrain 4.8k→7.4k, hermes-webui 1.3k→1.8k, hermes-agent 56k+→77k,
  hermes-workspace 1.1k→1.3k, hudui 571→827, self-evolution 893→1.6k
- Removed awizermann/scarf (repo no longer exists on GitHub)
- Added 3 new high-value projects:
  - vectorize-io/hindsight (9.1k ⭐) → Memory Providers: Official Hermes memory plugin,
    biomimetic 3-tier memory, topped LongMemEval Jan 2026
  - rohitg00/agentmemory (1.3k ⭐) → Memory Providers: #1 persistent memory for agents,
    explicit Hermes integration path, 43 MCP tools
  - builderz-labs/mission-control (4.1k ⭐) → Tools & Orchestration: self-hosted agent
    platform with Hermes gateway integration
- Updated project count: 25 → 27

models/index.html:
- Gemini 3.1 Pro: bumped from #4 Arena (1492 Elo) to #1 (1505 Elo)
- Claude Opus 4.6: updated from #1 (1504) to #2 (1503 Elo) — still top tier
- Updated datestamp to April 13, 2026
nesquena-hermes pushed a commit that referenced this pull request Apr 16, 2026
Factual fixes:
- Hindsight link now points to github.com/vectorize-io/hindsight (hindsight.ai 403)
- Holographic: removed external link (domain for sale), note it is built into Hermes
- RetainDB: flagged domain availability as unverified
- Hindsight benchmarks added: LongMemEval 91-94.6%, BEAM 64.1%, LoCoMo 89.6%
- Holographic hosting corrected to Local only (was Cloud)
- Honcho hosting corrected to Cloud + self-host (was Cloud only)
- ByteRover license kept as Partial OSS (not MIT)
- Supermemory stars footnoted: consumer frontend, not core engine

Content additions:
- Architecture notes for all 8 providers (TEMPR, HRR/SQLite, L0/L1/L2, triple-store, etc.)
- Full pricing detail: paid tiers, per-token cloud rates
- GitHub star counts in table and provider cards
- Tool counts (2-5) in table and provider cards
- Config snippets section with 6 provider examples
- Last verified April 2026 timestamp in hero

Recommendation fixes:
- Coding agents: Hindsight #1 (TEMPR BM25, entity extraction), ByteRover #2
- Knowledge wiki: Hindsight #1 (highest LongMemEval), Supermemory #2
- Privacy: Holographic #1 (zero deps, local SQLite), then Hindsight, OpenViking
- Added Personal AI companion use case (Honcho #1)
- Benchmark winner updated to Hindsight on LongMemEval axis

Table structure:
- Best for column moved to position 2 (was last/rightmost)
- Paid from column added
- Stars and Tools columns added (hidden on narrow screens)
- Benchmark column shows multiple scores for Hindsight/ByteRover
nesquena-hermes pushed a commit that referenced this pull request Apr 21, 2026
nesquena added a commit that referenced this pull request Apr 24, 2026
The /background feature was fundamentally non-functional as shipped —
two coupled bugs kept results from ever reaching the user:

1. complete_background() was defined but NEVER called.  The
   _handle_background thread ran _run_agent_streaming and then exited;
   no hook signalled the task tracker that the work was done.  Every
   background task stayed in status="running" forever and
   get_results() (which filters to done-only) always returned [].

2. get_results() called _BACKGROUND_TASKS.pop(parent_sid, []) which
   removed the ENTIRE list — including tasks still in flight.  Even if
   bug #1 were fixed, the first frontend poll during a long-running
   task would drop the task from the tracker, and
   complete_background()'s loop would iterate over an empty list when
   the worker eventually finished — the result would still be lost.

Fix:

- api/background.py::get_results now retains running tasks in the
  dict; only done ones are popped and returned.
- api/routes.py::_handle_background wraps _run_agent_streaming in an
  inline worker (_run_bg_and_notify) that, after streaming completes,
  reloads the hidden bg session, extracts the last non-error assistant
  message, and calls complete_background(parent_sid, task_id, answer).
  Worker also best-effort unlinks the hidden bg session file so
  SESSION_DIR doesn't accumulate debris.
- Exception safety: any failure in _run_agent_streaming or the
  post-processing path still calls complete_background with a fallback
  sentinel so the frontend's polling loop doesn't hang forever.

Added 5 regression tests in tests/test_background_tasks.py:
- running tasks survive get_results polls
- done tasks are returned and removed
- poll → complete → poll round-trip surfaces the answer (this is the
  original bug's reproduction path)
- empty parent is cleaned up
- static check: _handle_background's worker calls complete_background
  and uses Session.load to extract the answer

Full suite: 2023 passed, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
nesquena-hermes pushed a commit that referenced this pull request Apr 24, 2026
GeoffBao pushed a commit to GeoffBao/hermes-webui that referenced this pull request Apr 29, 2026
When _loadOlderMessages prepends older messages, the viewport snaps
to the bottom instead of staying where the user was.

Two bugs compounding:
1. Wrong scrollable container. Code used `$("msgInner")` for scrollHeight
   and scrollTop, but #msgInner has no overflow-y — it is a flex column.
   The actual scrollable container is #messages (`.messages{overflow-y:auto}`).
   Setting msgInner.scrollTop was silently ignored.
2. renderMessages calls scrollToBottom at the end (ui.js:2552),
   which unconditionally scrolls #messages to the bottom and sets
   _scrollPinned=true. Since bug #1 made the scroll-restore a no-op,
   the page landed at the bottom every time.

Fix:
- Changed scroll restore target from `$("msgInner")` to `$("messages")`.
- Reset _scrollPinned = false after restoring the user position,
  so scrollToBottom does not re-fire on next tick.
JKJameson pushed a commit to JKJameson/hermes-webui that referenced this pull request Apr 29, 2026
nesquena-hermes added a commit that referenced this pull request Apr 30, 2026
Per Opus pre-release review (SHOULD-FIX #1): the CHANGELOG claimed both
filter sites exempt 'active_stream_id OR pending_user_message', but the
index path operates on compact() output which doesn't include
pending_user_message. The behavior is correct in both paths because both
fields are set/cleared in lockstep during streaming, but the wording was
stronger than what the code does. Tightened to describe what each path
actually checks.
starship-s pushed a commit to starship-s/hermes-webui that referenced this pull request Apr 30, 2026
- New CONTRIBUTORS.md: full ranked credit roll for all 66 contributors
  (5+ tiers), with first/latest release versions, single-PR roll, and
  attribution methodology. Generated from git log + gh pulls API +
  CHANGELOG mention parsing.

- README.md: stack-ranked top-10 contributors table at the top of the
  Contributors section, link to CONTRIBUTORS.md for the full list.
  Updated test count (1898 → 3309). Refreshed @franksong2702 and
  @bergeouss entries to reflect their broader bodies of work (now
  the #1 and #2 external contributors).

- ARCHITECTURE.md: removed stale 'tracks upstream v0.50.36' header;
  bumped current shipped build to v0.50.245 with current architecture
  state notes (streaming-markdown vendoring, byte-range streaming,
  configurable-model-badges).

- ROADMAP.md / SPRINTS.md / TESTING.md: header/last-updated bumps to
  v0.50.245 and 3309 tests. SPRINTS.md 'Where we are now' section
  refreshed for current CLI/Claude parity (~95% Claude parity now).

Generated by aggregating CHANGELOG attribution lines, gh PR API
authors, and CHANGELOG version-section walks. Internal/bot accounts
filtered out.
GeoffBao pushed a commit to GeoffBao/hermes-webui that referenced this pull request May 1, 2026
GeoffBao pushed a commit to GeoffBao/hermes-webui that referenced this pull request May 1, 2026
nesquena-hermes pushed a commit to ccqqlo/hermes-webui that referenced this pull request May 2, 2026
…na#1458 Bug nesquena#1)

Issue nesquena#1458 reports persistent-host crashes (≥1/day) when running the WebUI
under launchd KeepAlive on macOS. Root cause: `bootstrap.py` calls
`subprocess.Popen([python, "server.py"], start_new_session=True)`, probes
/health, then exits 0. Under any process supervisor (launchd, systemd,
supervisord, runit, s6), the supervisor sees its tracked PID exit, marks
the program as "completed," and respawns it. The new bootstrap fails to
bind port 8787 (orphaned server still has it), exits non-zero, supervisor
respawns again — loop until the orphan crashes for some other reason and
the next respawn finds the port free.

This PR addresses Bug #1 of the three failure modes tracked in nesquena#1458:
the `bootstrap.py` double-fork breaking process supervisors. Bug #2
(state.db FD leak) and Bug #3 (HTTP-unhealthy wedge) remain open under
the same issue — they need diagnosis data before a fix can land.

Changes
-------

1. `bootstrap.py`:
   - New `--foreground` argparse flag with help text mentioning launchd /
     systemd / supervisord.
   - New `_detect_supervisor()` that returns the env var name for any
     supervisor it detects: `INVOCATION_ID` / `JOURNAL_STREAM` /
     `NOTIFY_SOCKET` (systemd, s6), `XPC_SERVICE_NAME` (launchd),
     `SUPERVISOR_ENABLED` (supervisord), or `HERMES_WEBUI_FOREGROUND` for
     the explicit user opt-in. Truthy values for the explicit opt-in:
     `1` / `true` / `yes` / `on` (case-insensitive).
   - `main()` branches on `args.foreground or _detect_supervisor()`:
     - **Foreground path:** chdir to `agent_dir or REPO_ROOT`, then
       `os.execv(python, [python, server_path])` to replace the bootstrap
       process image with the server. The supervisor sees the long-lived
       server as the original child. No `wait_for_health` probe — the
       supervisor's KeepAlive / Restart=on-failure handles liveness.
     - **Default path:** unchanged. Spawn server as detached child via
       `Popen + start_new_session=True`, probe /health, return 0. This
       still works for interactive `bash start.sh` invocations.
   - Resolved env vars (HOST/PORT/STATE_DIR/AGENT_DIR) are now mutated on
     `os.environ` directly instead of into a local `env` copy so they
     are inherited across `os.execv`.

2. `docs/supervisor.md` (new): runnable launchd plist, systemd .service,
   and supervisord conf examples + a diagnostic recipe (`lsof` + ppid
   chain) for catching the orphan-loop in production.

3. `.gitignore`: allowlist `docs/supervisor.md` (the directory uses an
   opt-in pattern; matches the existing `!docs/docker.md` precedent).

4. `tests/test_bootstrap_foreground.py` (new): 35 regression tests
   covering the argparse flag, `_detect_supervisor()` behavior across all
   five supervisor env vars, the explicit opt-in's truthy/falsy values,
   and `main()`'s execv-vs-Popen routing decision under each input
   combination. `os.execv` is monkeypatched in the routing tests — we
   pin the structural choice (which call is made, with which args, in
   which cwd, with which env) not the post-exec behavior.
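
The detection and routing described in item 1 can be sketched as follows. The env var names and truthy values come from the commit message; the function names, the order of the checks, and treating any non-empty marker as a signal are assumptions, and the real `_detect_supervisor()` may differ (a later commit mentions an XPC noise filter, which this sketch omits).

```python
import os
from typing import Mapping, Optional

# Env vars named in the commit message as supervisor markers.
_SUPERVISOR_VARS = (
    "INVOCATION_ID",
    "JOURNAL_STREAM",
    "NOTIFY_SOCKET",
    "XPC_SERVICE_NAME",
    "SUPERVISOR_ENABLED",
)
_TRUTHY = {"1", "true", "yes", "on"}


def detect_supervisor(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the env var that signals supervision, or None."""
    environ = os.environ if environ is None else environ

    # Explicit user opt-in: only truthy values count (case-insensitive).
    opt_in = environ.get("HERMES_WEBUI_FOREGROUND", "")
    if opt_in.strip().lower() in _TRUTHY:
        return "HERMES_WEBUI_FOREGROUND"

    # Otherwise any non-empty supervisor marker is taken as a signal.
    for name in _SUPERVISOR_VARS:
        if environ.get(name):
            return name
    return None


def run_server(python: str, server_path: str, foreground: bool) -> None:
    """Replace this process with the server when supervised (sketch only)."""
    if foreground or detect_supervisor():
        # execv does not return on success: the tracked PID becomes the server.
        os.execv(python, [python, server_path])
    # Default path (detached Popen plus /health probe) is omitted here.
```

Passing the environment as a parameter keeps the detection logic testable without monkeypatching `os.environ`.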

Why this scope and no more
--------------------------

Bug #2 (state.db FD leak) lists 5 candidate paths and asks the reporter
for `lsof -p <pid> | sort | uniq -c | sort -rn | head -20` output to
disambiguate. Until that data lands, any "fix" would be speculative —
explicitly out of scope per the contributor-pickup comment on the issue.

Bug #3 (launchd-running, port-listening, HTTP-unhealthy) was added in
@stefanpieter's reply comment. Diagnosis is in flight; no concrete fix
shape yet. Also out of scope.

Running locally end-to-end verifies the behavior:

```
[bootstrap] Starting Hermes Web UI on http://127.0.0.1:8789 (foreground mode: --foreground)
$ pgrep -af 'server.py'
2997632 /home/.../python /tmp/wt-fix-1458/server.py
$ ps -o ppid -p 2997632
2997581   ← bash that ran bootstrap.py — same PID as the original bootstrap
$ ps -p 2997581 -o cmd
... bootstrap.py ...   ← but exec'd into server.py
```

The same PID that bash forked for `bootstrap.py` is now `server.py`.
A supervisor watching that PID would correctly observe the long-lived
server. No double-fork.

Verification
------------

- 3811 tests pass (`pytest tests/` — full suite, +51 from this PR plus
  master-merge-in)
- All 35 new bootstrap-foreground tests pass
- `bash scripts/run-browser-tests.sh` PASS (HTTP API checks against worktree)
- `bash scripts/webui_qa_agent.sh 8789` PASS (23/23 visual QA)
- Live verified: server starts cleanly under both `--foreground` and
  `HERMES_WEBUI_FOREGROUND=1`; PID lineage confirms no double-fork

Closes nesquena#1458 (Bug #1 only). Bugs #2 and #3 remain tracked under the
issue.
nesquena-hermes pushed a commit to ccqqlo/hermes-webui that referenced this pull request May 2, 2026
…267 follow-ups

- CHANGELOG.md: v0.50.269 entry detailing nesquena#1478 nesquena#1479 nesquena#1480
- ROADMAP.md: bump to v0.50.269, 3847 tests collected
- TESTING.md: bump header + total to 3847

nesquena#1478: nesquena APPROVED self-built bootstrap.py --foreground mode
       (closes nesquena#1458 Bug #1, +Opus follow-ups: XPC noise filter, executability guard)
nesquena#1479: surgical follow-up to nesquena#1473 — Session.compact() now includes pending_user_message
nesquena#1480: bfcache pageshow restores active session via loadSession + checkInflightOnBoot

3847 tests pass (+47 net). Opus advisor on stage diff: no blockers.
lanehollard pushed a commit to lanehollard/hermes-webui that referenced this pull request May 3, 2026
…ix + P0 polish bundle (nesquena#1466 nesquena#1494 nesquena#1469 nesquena#1484 nesquena#1486)

3 PRs in this batch (3866 → 3874 tests, +8):

- nesquena#1493 (@dso2ng) — sidebar Stop response cancels row's stream not active pane's (closes nesquena#1466, follow-up to nesquena#1480)
- nesquena#1495 (self-built; reported by @insecurejezza in nesquena#1494) — state.db connection FD leak in sidebar polling (closes nesquena#1494, addresses Bug #2 of nesquena#1458)
- nesquena#1492 (@bergeouss) — P0 bugfixes bundle: tool-card args readability + CLI rename persistence + scroll pinning + sw.js relative-path regression test (closes nesquena#1469 nesquena#1484 nesquena#1486)

This release closes Bug #2 of the umbrella issue nesquena#1458. Bug #1 was closed by v0.50.269 (nesquena#1483) + v0.50.270 (nesquena#1487). Bug #3 (HTTP-unhealthy without FD exhaustion) is the remaining work item.
nesquena-hermes pushed a commit that referenced this pull request May 3, 2026
…n guard)

CHANGELOG, ROADMAP, TESTING bumped (3925 → 3929 tests collected).

Opus SHOULD-FIX absorbed in-release: tests #1-3 documented the dedup
contract via direct construction but did not invoke get_models_grouped().
Test #4 (test_get_models_grouped_unconfigured_providers_get_independent_dicts)
inspects the live source for the literal copy.deepcopy(auto_detected_models)
call AND runs an end-to-end smoke of the fixed assignment loop.

A future refactor that removes the deepcopy at api/config.py:2078 will
fail this test immediately.
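
Why the deepcopy matters can be shown with a toy example. The variable name `auto_detected_models` is from the commit message; the provider names, the model entries, and the assumed pre-fix shape (all providers sharing one list) are illustrative.

```python
import copy

auto_detected_models = [{"id": "model-a", "label": "Model A"}]
providers = ["provider-x", "provider-y"]

# Assumed buggy shape: every provider shares the same list and dict objects.
shared = {name: auto_detected_models for name in providers}

# Fixed shape: each provider gets an independent deep copy.
independent = {name: copy.deepcopy(auto_detected_models) for name in providers}

# Mutating one provider's entry leaks to the other in the shared case only.
shared["provider-x"][0]["label"] = "renamed"
leaked = shared["provider-y"][0]["label"] == "renamed"

independent["provider-x"][0]["label"] = "renamed again"
isolated = independent["provider-y"][0]["label"] != "renamed again"
```

A shallow `list(...)` copy would not be enough here, because the inner dicts would still be shared between providers.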
waldmanz pushed a commit to waldmanz/hermes-webui that referenced this pull request May 4, 2026
Both flagged by pre-release Opus advisor; both clearly defensive and small
enough to absorb in-release per the reviewer-flagged-fix-in-release-not-followup
policy.

SHOULD-FIX #1 (api/routes.py:_clear_stale_stream_state, ~25 LOC):
After the metadata-only reload (nesquena#1559 Layer 2), the local 'session'
variable is reassigned to the full-load object but the caller still holds
the original metadata-only stub. /api/session then returns the stale
active_stream_id at routes.py:1791, causing the frontend to attempt one
ghost SSE reconnect before recovering. Fix: capture original_stub at
function entry, then patch its in-memory active_stream_id and pending_*
fields to None after both the early-return (full-load already cleared)
path AND the successful-mutation path. Now the caller's read returns
fresh state, no ghost reconnect.

SHOULD-FIX #2 (api/models.py:Session.save, ~20 LOC):
The .bak write at api/models.py:436 used write_text() which truncates-
then-writes — a crash mid-write or concurrent backup-producing save
could leave a torn .bak. Recovery defends correctly (JSONDecodeError →
returns -1 → 'no_action'), so the failure mode was 'backup lost' not
'spurious restore'. Fix: tmp + os.replace pattern matching the main file
write at line 446-453. Now backup either lands cleanly or doesn't land
at all.
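
The tmp + os.replace pattern mentioned above can be sketched like this. The helper name `atomic_write_text` and the fsync call are assumptions; the project's own code at the cited lines may be structured differently.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Write text so readers see the old file or the new one, never a torn one."""
    # Create the temp file in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        # Atomic rename over the destination.
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A crash before `os.replace` leaves the previous backup untouched, which matches the "lands cleanly or doesn't land at all" behavior described above.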

4026/4026 tests pass post-absorb.
nesquena-hermes added a commit that referenced this pull request May 4, 2026
SHOULD-FIX #1 (renamed-root client cross-alias): drop strict-equality client
filter at static/sessions.js:1853. Server-side _profiles_match cross-aliases
'default'-tagged rows to a renamed root 'kinni'; the strict-equality client
would reject them, dropping every legacy session for renamed-root users. The
server is now solely authoritative for profile scoping.

SHOULD-FIX #2 (messaging-source dedupe ordering): _keep_latest_messaging_session_per_source
now runs AFTER the profile filter at api/routes.py:2078. Before, it ran on
the merged-cross-profile list with profile-blind keys, discarding the older
profile's row across profiles before the scope filter — leaving zero rows for
any messaging identity the active profile shared with another profile.
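
The ordering problem can be reproduced with two rows that share a messaging source across profiles. The row shape, the helper names, and the simple equality profile check are assumptions for illustration; the real code uses `_keep_latest_messaging_session_per_source` and `_profiles_match`.

```python
from typing import Dict, List


def keep_latest_per_source(rows: List[dict]) -> List[dict]:
    """Keep only the most recently updated row for each messaging source."""
    latest: Dict[str, dict] = {}
    for row in rows:
        current = latest.get(row["source"])
        if current is None or row["updated"] > current["updated"]:
            latest[row["source"]] = row
    return list(latest.values())


def visible_sessions(rows: List[dict], active_profile: str) -> List[dict]:
    """Scope to the active profile first, then dedupe within that scope."""
    scoped = [row for row in rows if row["profile"] == active_profile]
    return keep_latest_per_source(scoped)


rows = [
    {"id": "old", "profile": "work", "source": "telegram:42", "updated": 1},
    {"id": "new", "profile": "home", "source": "telegram:42", "updated": 2},
]

# Dedupe-then-filter (the old order) loses the "work" row entirely.
old_order = [r for r in keep_latest_per_source(rows) if r["profile"] == "work"]
# Filter-then-dedupe (the fixed order) keeps it.
new_order = visible_sessions(rows, "work")
```

With the old order the "work" profile sees no row for that source, because the newer "home" row wins the dedupe and is then filtered out.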

NIT #3: _projects_migrated flag now set only AFTER successful save_projects.
NIT #4: cleaned dead test code in test_is_root_profile_invalidation_drops_stale.
NIT #5: _create_profile_fallback's clone_from=='default' literal now routes
through _is_root_profile() for parity with the 5 other callsites.

+2 regression tests pin the SHOULD-FIX shapes:
- test_keep_latest_messaging_runs_after_profile_filter (source-string ordering)
- test_static_sessions_js_trusts_server_profile_scoping (no client re-filter)

4173 -> 4175 tests pass. 0 regressions.
long12310225 pushed a commit to long12310225/hermes-webui that referenced this pull request May 6, 2026
…p + test-isolation fix

Constituent PRs:
- nesquena#1725 (@Michaelyklam) — simplify compact Activity row summary
- nesquena#1726 (@Michaelyklam) — delegate generic provider catalogs to Hermes CLI (slice of nesquena#1240)
- nesquena#1727 (@Michaelyklam) — link Claude Code OAuth in onboarding (closes nesquena#1362)
- nesquena#1728 (@starship-s) — preserve profile context when starting chats
- nesquena#1729 (@Michaelyklam) — persist compact Activity disclosure state
- nesquena#1730 (@Michaelyklam) — prevent sticky sidebar hover drag state
- nesquena#1732 (@Sanjays2402) — unpin scroll on small upward motion during streaming (closes nesquena#1731)

Plus 2 in-stage absorbed fixes:
- test-isolation fix: monkeypatch.setattr(config, 'cfg', X) survives PR nesquena#1728's
  path/mtime-aware get_config() reload. Mandatory before tag (Opus stage-302).
- Opus SHOULD-FIX #1: _lastScrollTop reset on session switch (nesquena#1732 follow-up).

Tests: 4537 → 4584 passing (+47). 0 regressions. Full suite ~128s. Stably green.

Pre-release verification:
- All 7 PRs CI-green individually + rebased onto master
- pytest 4584 passed, 0 failed (multiple runs)
- node -c clean on all 4 modified .js files
- 11/11 browser API endpoints PASS on isolated port 8789
- 20 QA tests via webui_qa_agent.sh PASS
- Opus advisor: SHIP, 5/5 verification clean, 0 MUST-FIX, 1 SHOULD-FIX absorbed
  (_lastScrollTop reset), 1 SHOULD-FIX deferred (nesquena#1736 — _clear_anthropic_env_values
  race, onboarding-time-only)

Closes nesquena#1362, nesquena#1731.