Portability by nesquena-hermes · Pull Request #1 · nesquena/hermes-webui

nesquena-hermes · 2026-03-31T02:42:41Z

No description provided.

- api/config.py: full multi-strategy discovery for agent dir, python, state dir, and workspace. No hardcoded /home/hermes paths. Prints startup config and SSH tunnel command on launch.
- start.sh: completely rewritten. Discovers python and hermes-agent automatically. Creates a local .venv and installs requirements if no suitable python is found. Kills stale instances. Health-checks after launch. Prints SSH tunnel command when running over SSH.
- tests/conftest.py: all paths now discovered dynamically. No more hardcoded ~/webui-mvp or ~/.hermes/hermes-agent. Mirrors the same discovery logic as api/config.py.
- requirements.txt: added minimal dep list (pyyaml).
- README.md: rewritten around auto-discovery. No internal paths, no machine-specific examples. Documents all env vars and override patterns.
- PORTABILITY.md: tracked (was previously untracked).
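The multi-strategy discovery described above could look roughly like the sketch below. It is not the code in api/config.py: the function name `discover_dir`, the environment variable `HERMES_WEBUI_AGENT_DIR`, and the candidate list are assumptions made for illustration.

```python
import os
from pathlib import Path
from typing import Iterable, Optional


def discover_dir(env_var: str, candidates: Iterable[Path]) -> Optional[Path]:
    """Return the first usable directory: explicit override first, then guesses."""
    override = os.environ.get(env_var)
    if override:
        path = Path(override).expanduser()
        if path.is_dir():
            return path
    for candidate in candidates:
        path = candidate.expanduser()
        if path.is_dir():
            return path
    return None


# Hypothetical variable name and candidate locations, for illustration only.
candidates = [
    Path.home() / ".hermes" / "hermes-agent",
    Path.cwd().parent / "hermes-agent",
]
agent_dir = discover_dir("HERMES_WEBUI_AGENT_DIR", candidates)
print(agent_dir or "agent dir not found; set HERMES_WEBUI_AGENT_DIR")
```

Checking the override before any guess keeps a machine-specific setup in control while still letting a fresh clone start without configuration.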

- start.sh now sources .env from the repo root before doing any discovery. Machine-local overrides load first, auto-detection fills in anything missing.
- .env.example added: a commented template listing every variable with explanations. New users copy this to .env and fill in only what their setup needs.
- .gitignore: .env stays excluded, but !.env.example is now tracked so the template ships with the repo.
- .env on this machine pins the exact paths matching the live server: agent_dir, python, state_dir, workspace, host, port, config path.
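The "overrides first, auto-detection fills the gaps" ordering can be sketched in Python as follows. start.sh does this by sourcing .env in the shell; the parser below is a simplified stand-in that ignores `export` prefixes and shell quoting rules, and the names `parse_env_text` and `resolve` are hypothetical.

```python
from typing import Callable, Dict, Optional


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve(
    overrides: Dict[str, str],
    detectors: Dict[str, Callable[[], Optional[str]]],
) -> Dict[str, str]:
    """Keep every override; run a detector only for keys that are still unset."""
    resolved = dict(overrides)
    for key, detect in detectors.items():
        if resolved.get(key):
            continue
        found = detect()
        if found is not None:
            resolved[key] = found
    return resolved


env = parse_env_text("# machine-local overrides\nPORT=9000\n")
config = resolve(env, {"PORT": lambda: "8787", "HOST": lambda: "127.0.0.1"})
print(config)
```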

… unused sys import in conftest

- Replace 45.79.100.32 with <your-server-ip> everywhere
- Replace hermes@<host> with <user>@<your-server> everywhere
- Replace all /home/hermes/... absolute paths with generic equivalents (<repo>/, <agent-dir>/, ~/ as appropriate) across all docs
- Replace 'Nathan' with generic references in ARCHITECTURE.md, TESTING.md
- tests/test_regressions.py: replace hardcoded Path('/home/hermes/webui-mvp/...') with REPO_ROOT / '...' imported from conftest -- paths now work on any machine
- tests/test_sprint6.py, test_sprint10.py: same REPO_ROOT fix
- tests/test_sprint1.py, test_sprint9.py: fix docstring run commands
- static/panels.js: replace /home/hermes/CodePath placeholder with generic example
- tests/conftest.py: remove last /home/hermes mention from a comment

Files modified: AGENTS.md, ARCHITECTURE.md, CHANGELOG.md, PORTABILITY.md, ROADMAP.md, TESTING.md, static/panels.js, tests/conftest.py, tests/test_regressions.py, tests/test_sprint1.py, tests/test_sprint6.py, tests/test_sprint10.py, tests/test_sprint9.py
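One common way to derive a REPO_ROOT like the one imported from conftest is to walk up from the test file's own location. The sketch below assumes conftest.py sits at <repo>/tests/conftest.py; the helper name `repo_root_from` is hypothetical and the actual conftest may compute it differently.

```python
from pathlib import Path


def repo_root_from(test_file: str) -> Path:
    """Return the repository root for a file located at <repo>/tests/<name>.py."""
    return Path(test_file).resolve().parent.parent


# Inside conftest.py this would typically be called with __file__:
#   REPO_ROOT = repo_root_from(__file__)
# and tests would then build paths relative to it, for example
#   REPO_ROOT / "static" / "panels.js"
```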

…ning name refs

…m write_file pass)

test_regressions.py was mangled -- digits in string literals and variable names were stripped along with the read_file prefixes. Rebuilt from master with only the two legitimate portability changes re-applied:
- REPO_ROOT import from conftest
- pathlib.Path('/home/hermes/webui-mvp/...') -> REPO_ROOT / '...'

test_sprint6.py and test_sprint10.py had clean strips but are included here to confirm all three are verified syntax-valid with zero hardcoded paths.

…0-line limit

Both files were capped at 500 lines during the privacy sweep because the sweep used read_file (default limit=500) then wrote the truncated content back via write_file, silently discarding the rest. Restored both from master and re-applied only the privacy replacements using raw file I/O.

Line counts restored:
- TESTING.md: 500 -> 1455 lines
- ARCHITECTURE.md: 500 -> 1049 lines

…ivacy changes

Previous attempts used git show piped through terminal(), which hit the 50KB output cap and truncated the file, and the digit-strip regex mangled line content. This time: git checkout master -- TESTING.md to get the exact file, then apply only the legitimate privacy replacements in-place with Python.
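The "raw file I/O" approach described in these commits can be sketched as below: read the whole file, apply literal replacements, and refuse to write if the line count changed. This is an illustration rather than the script that was actually run; `replace_in_file` is a hypothetical helper, and the line-count guard assumes the replacement strings contain no newlines.

```python
from pathlib import Path
from typing import Mapping


def replace_in_file(path: Path, replacements: Mapping[str, str]) -> int:
    """Apply literal replacements in place and return the resulting line count."""
    original = path.read_text(encoding="utf-8")  # whole file, no read limit
    updated = original
    for old, new in replacements.items():
        updated = updated.replace(old, new)
    # Newline-free replacements cannot change the number of lines, so any
    # difference here means content was lost or mangled along the way.
    if updated.count("\n") != original.count("\n"):
        raise ValueError(f"line count changed while rewriting {path}")
    if updated != original:
        path.write_text(updated, encoding="utf-8")
    return updated.count("\n")
```

A call such as `replace_in_file(Path("TESTING.md"), {"/home/hermes/webui-mvp": "<repo>"})` would then either keep every line or fail loudly, which is the property the truncating read_file/write_file round trip lacked.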

…ly privacy changes

- Title/badge/source: 2025 → 2026, sources updated to Chatbot Arena/BenchLM - Overall: Opus 4.6 (#1, 1504 Arena Elo), Gemini 3.1 Pro (#2), GPT-5.4 (#3), Sonnet 4.6 (#4), DeepSeek V3.2 (#5, replaces DeepSeek R1) - Coding: Opus 4.6 (#1, powers Claude Code/Cursor), GPT-5.4 (#2, SWE-Pro 57.7%), Sonnet 4.6 (#3), Gemini 3.1 Pro (#4), DeepSeek V3.2 (#5) Removed: Sonnet 4.5, Gemini 2.5 Pro, GPT-5.3 Codex - Writing: Opus 4.6 (#1, Mazur 8.53), Gemini 3.1 Pro (#2, Arena CW 1487), Sonnet 4.6 (#3, EQ-Bench CW 1936), GPT-5.4 (#4), Meta Muse Spark (#5) Removed: Llama 4 Maverick (replaced by Meta Muse Spark) - Search: Gemini 3.1 Pro (#1), Grok 4 (#2, live X data), Sonnet 4.6+tool (#3), GPT-5.4 (#4), Gemini 3 Flash Thinking (#5) Added Grok 4 for real-time social/X data - Reasoning: Gemini 3.1 Pro (#1, GPQA 95.45%), GPT-5.4 (#2), Opus 4.6 (#3), Gemini 3 Flash Thinking (#4, best value), DeepSeek V3.2 (#5) Removed: Kimi K2, DeepSeek R1 standalone - Quick picker: all model names updated to 2026 versions - Setup boxes: OpenAI gpt-5-4 names, Google gemini-3-1-pro-preview, self-hosted section updated to DeepSeek V3.2 + Muse Spark

Coding section:
- Gemini 3.1 Pro: add Terminal-Bench 78.4% (highest of any frontier model on CLI/DevOps). Score badge updated to show Terminal-Bench rather than SWE-bench Verified
- GPT-5.4: note Terminal-Bench 75.1% in description, consolidate pill text
- #5 DeepSeek V3.2 → Qwen 3.6-Plus: leads Terminal-Bench at 61.6%, 88.2% GPQA, 1M context, available now on Alibaba Cloud + OpenRouter

Writing section (reordered based on EQ-Bench CW scores):
- #1 Claude Sonnet 4.6 (1936 EQ-Bench CW, the highest; best voice consistency)
- #2 Claude Opus 4.6 (Mazur 8.53, IF Arena #1, 1M context for literary depth)
- #3 Gemini 3.1 Pro (Arena CW #1 1487, AI-tell avoidance, 2M context)
- #4 GPT-5.4 (noted as ~9th on Arena CW, better for structured/commercial writing)
- #5 Meta Muse Spark → Kimi K2.5 ($0.60/$….50, ~1700 EQ-Bench CW, live API). Muse Spark removed: no commercial API available yet

Reasoning section:
- Gemini 3.1 Pro GPQA: 95.45% → 94.1% (more conservative/recent figure, consistent with both agents' data)
- Added ARC-AGI-2 77.1% for Gemini 3.1 Pro (#1 on visual reasoning too)
- Opus 4.6: added note that Sonnet leads GDPval-AA (1633 Elo #1) for throughput
- #5 DeepSeek V3.2 → Qwen 3.6-Plus (88.2% GPQA, 1M context, same model as coding)

Quick picker:
- Creative writing: Opus → Sonnet 4.6 (EQ-Bench #1, 85% cheaper)
- Hard reasoning: 95.45% → 94.1%, add ARC-AGI-2 mention
- Budget pick: DeepSeek V3.2 → Gemini 3 Flash Thinking ($0.50/1M, 89.8% GPQA)

Setup boxes:
- Self-hosted: Muse Spark → Qwen 3.6-Plus + Gemma 4 26B MoE (Apache 2.0, 82.3% GPQA with 3.8B active params, best edge/self-hosted reasoning)

Overall section: unchanged (top 5 still correct per both agents)

Search section: unchanged (no new data from either agent)

index.html: - Hero badge: 56k+ → 77k+ GitHub stars (NousResearch/hermes-agent: 76,811) community/index.html: - All 24 project star counts updated to live GitHub values (April 13 2026) Notable bumps: gbrain 4.8k→7.4k, hermes-webui 1.3k→1.8k, hermes-agent 56k+→77k, hermes-workspace 1.1k→1.3k, hudui 571→827, self-evolution 893→1.6k - Removed awizermann/scarf (repo no longer exists on GitHub) - Added 3 new high-value projects: - vectorize-io/hindsight (9.1k ⭐) → Memory Providers: Official Hermes memory plugin, biomimetic 3-tier memory, topped LongMemEval Jan 2026 - rohitg00/agentmemory (1.3k ⭐) → Memory Providers: #1 persistent memory for agents, explicit Hermes integration path, 43 MCP tools - builderz-labs/mission-control (4.1k ⭐) → Tools & Orchestration: self-hosted agent platform with Hermes gateway integration - Updated project count: 25 → 27 models/index.html: - Gemini 3.1 Pro: bumped from #4 Arena (1492 Elo) to #1 (1505 Elo) - Claude Opus 4.6: updated from #1 (1504) to #2 (1503 Elo) — still top tier - Updated datestamp to April 13, 2026

Factual fixes: - Hindsight link now points to github.com/vectorize-io/hindsight (hindsight.ai 403) - Holographic: removed external link (domain for sale), note it is built into Hermes - RetainDB: flagged domain availability as unverified - Hindsight benchmarks added: LongMemEval 91-94.6%, BEAM 64.1%, LoCoMo 89.6% - Holographic hosting corrected to Local only (was Cloud) - Honcho hosting corrected to Cloud + self-host (was Cloud only) - ByteRover license kept as Partial OSS (not MIT) - Supermemory stars footnoted: consumer frontend, not core engine Content additions: - Architecture notes for all 8 providers (TEMPR, HRR/SQLite, L0/L1/L2, triple-store, etc.) - Full pricing detail: paid tiers, per-token cloud rates - GitHub star counts in table and provider cards - Tool counts (2-5) in table and provider cards - Config snippets section with 6 provider examples - Last verified April 2026 timestamp in hero Recommendation fixes: - Coding agents: Hindsight #1 (TEMPR BM25, entity extraction), ByteRover #2 - Knowledge wiki: Hindsight #1 (highest LongMemEval), Supermemory #2 - Privacy: Holographic #1 (zero deps, local SQLite), then Hindsight, OpenViking - Added Personal AI companion use case (Honcho #1) - Benchmark winner updated to Hindsight on LongMemEval axis Table structure: - Best for column moved to position 2 (was last/rightmost) - Paid from column added - Stars and Tools columns added (hidden on narrow screens) - Benchmark column shows multiple scores for Hindsight/ByteRover

…coding, writing section updated)

The /background feature was fundamentally non-functional as shipped: two coupled bugs kept results from ever reaching the user.

1. complete_background() was defined but NEVER called. The _handle_background thread ran _run_agent_streaming and then exited; no hook signalled the task tracker that the work was done. Every background task stayed in status="running" forever and get_results() (which filters to done-only) always returned [].
2. get_results() called _BACKGROUND_TASKS.pop(parent_sid, []) which removed the ENTIRE list, including tasks still in flight. Even if bug #1 were fixed, the first frontend poll during a long-running task would drop the task from the tracker, and complete_background()'s loop would iterate over an empty list when the worker eventually finished, so the result would still be lost.

Fix:
- api/background.py::get_results now retains running tasks in the dict; only done ones are popped and returned.
- api/routes.py::_handle_background wraps _run_agent_streaming in an inline worker (_run_bg_and_notify) that, after streaming completes, reloads the hidden bg session, extracts the last non-error assistant message, and calls complete_background(parent_sid, task_id, answer). Worker also best-effort unlinks the hidden bg session file so SESSION_DIR doesn't accumulate debris.
- Exception safety: any failure in _run_agent_streaming or the post-processing path still calls complete_background with a fallback sentinel so the frontend's polling loop doesn't hang forever.

Added 5 regression tests in tests/test_background_tasks.py:
- running tasks survive get_results polls
- done tasks are returned and removed
- poll → complete → poll round-trip surfaces the answer (this is the original bug's reproduction path)
- empty parent is cleaned up
- static check: _handle_background's worker calls complete_background and uses Session.load to extract the answer

Full suite: 2023 passed, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

When _loadOlderMessages prepends older messages, the viewport snaps to the bottom instead of staying where the user was. Two bugs compound:

1. Wrong scrollable container. Code used `$("msgInner")` for scrollHeight and scrollTop, but #msgInner has no overflow-y; it is a flex column. The actual scrollable container is #messages (`.messages{overflow-y:auto}`). Setting msgInner.scrollTop was silently ignored.
2. renderMessages calls scrollToBottom at the end (ui.js:2552), which unconditionally scrolls #messages to the bottom and sets _scrollPinned=true. Since bug #1 made the scroll-restore a no-op, the page landed at the bottom every time.

Fix:
- Changed scroll restore target from `$("msgInner")` to `$("messages")`.
- Reset _scrollPinned = false after restoring the user position, so scrollToBottom does not re-fire on next tick.

Per Opus pre-release review (SHOULD-FIX #1): the CHANGELOG claimed both filter sites exempt 'active_stream_id OR pending_user_message', but the index path operates on compact() output which doesn't include pending_user_message. The behavior is correct in both paths because both fields are set/cleared in lockstep during streaming, but the wording was stronger than what the code does. Tightened to describe what each path actually checks.

@franksong2702

- New CONTRIBUTORS.md: full ranked credit roll for all 66 contributors (5+ tiers), with first/latest release versions, single-PR roll, and attribution methodology. Generated from git log + gh pulls API + CHANGELOG mention parsing.
- README.md: stack-ranked top-10 contributors table at the top of the Contributors section, link to CONTRIBUTORS.md for the full list. Updated test count (1898 → 3309). Refreshed @franksong2702 and @bergeouss entries to reflect their broader bodies of work (now the #1 and #2 external contributors).
- ARCHITECTURE.md: removed stale 'tracks upstream v0.50.36' header; bumped current shipped build to v0.50.245 with current architecture state notes (streaming-markdown vendoring, byte-range streaming, configurable-model-badges).
- ROADMAP.md / SPRINTS.md / TESTING.md: header/last-updated bumps to v0.50.245 and 3309 tests. SPRINTS.md 'Where we are now' section refreshed for current CLI/Claude parity (~95% Claude parity now).

Generated by aggregating CHANGELOG attribution lines, gh PR API authors, and CHANGELOG version-section walks. Internal/bot accounts filtered out.


@stefanpieter

…na#1458 Bug #1)

Issue nesquena#1458 reports persistent-host crashes (≥1/day) when running the WebUI under launchd KeepAlive on macOS. Root cause: `bootstrap.py` calls `subprocess.Popen([python, "server.py"], start_new_session=True)`, probes /health, then exits 0. Under any process supervisor (launchd, systemd, supervisord, runit, s6), the supervisor sees its tracked PID exit, marks the program as "completed," and respawns it. The new bootstrap fails to bind port 8787 (the orphaned server still has it), exits non-zero, and the supervisor respawns again. The loop continues until the orphan crashes for some other reason and the next respawn finds the port free.

This PR addresses Bug #1 of the three failure modes tracked in nesquena#1458: the `bootstrap.py` double-fork breaking process supervisors. Bug #2 (state.db FD leak) and Bug #3 (HTTP-unhealthy wedge) remain open under the same issue; they need diagnosis data before a fix can land.

Changes
-------

1. `bootstrap.py`:
   - New `--foreground` argparse flag with help text mentioning launchd / systemd / supervisord.
   - New `_detect_supervisor()` that returns the env var name for any supervisor it detects: `INVOCATION_ID` / `JOURNAL_STREAM` / `NOTIFY_SOCKET` (systemd, s6), `XPC_SERVICE_NAME` (launchd), `SUPERVISOR_ENABLED` (supervisord), or `HERMES_WEBUI_FOREGROUND` for the explicit user opt-in. Truthy values for the explicit opt-in: `1` / `true` / `yes` / `on` (case-insensitive).
   - `main()` branches on `args.foreground or _detect_supervisor()`:
     - **Foreground path:** chdir to `agent_dir or REPO_ROOT`, then `os.execv(python, [python, server_path])` to replace the bootstrap process image with the server. The supervisor sees the long-lived server as the original child. No `wait_for_health` probe; the supervisor's KeepAlive / Restart=on-failure handles liveness.
     - **Default path:** unchanged. Spawn server as detached child via `Popen + start_new_session=True`, probe /health, return 0. This still works for interactive `bash start.sh` invocations.
   - Resolved env vars (HOST/PORT/STATE_DIR/AGENT_DIR) are now mutated on `os.environ` directly instead of into a local `env` copy so they are inherited across `os.execv`.
2. `docs/supervisor.md` (new): runnable launchd plist, systemd .service, and supervisord conf examples + a diagnostic recipe (`lsof` + ppid chain) for catching the orphan-loop in production.
3. `.gitignore`: allowlist `docs/supervisor.md` (the directory uses an opt-in pattern; matches the existing `!docs/docker.md` precedent).
4. `tests/test_bootstrap_foreground.py` (new): 35 regression tests covering the argparse flag, `_detect_supervisor()` behavior across all five supervisor env vars, the explicit opt-in's truthy/falsy values, and `main()`'s execv-vs-Popen routing decision under each input combination. `os.execv` is monkeypatched in the routing tests: we pin the structural choice (which call is made, with which args, in which cwd, with which env), not the post-exec behavior.

Why this scope and no more
--------------------------

Bug #2 (state.db FD leak) lists 5 candidate paths and asks the reporter for `lsof -p <pid> | sort | uniq -c | sort -rn | head -20` output to disambiguate. Until that data lands, any "fix" would be speculative, and it is explicitly out of scope per the contributor-pickup comment on the issue.

Bug #3 (launchd-running, port-listening, HTTP-unhealthy) was added in @stefanpieter's reply comment. Diagnosis is in flight; no concrete fix shape yet. Also out of scope.

Running locally end-to-end verifies the behavior:

```
[bootstrap] Starting Hermes Web UI on http://127.0.0.1:8789 (foreground mode: --foreground)
$ pgrep -af 'server.py'
2997632 /home/.../python /tmp/wt-fix-1458/server.py
$ ps -o ppid -p 2997632
2997581   ← bash that ran bootstrap.py, same PID as the original bootstrap
$ ps -p 2997581 -o cmd
... bootstrap.py ...   ← but exec'd into server.py
```

The same PID that bash forked for `bootstrap.py` is now `server.py`. A supervisor watching that PID would correctly observe the long-lived server. No double-fork.

Verification
------------

- 3811 tests pass (`pytest tests/`, full suite; +51 from this PR plus master-merge-in)
- All 35 new bootstrap-foreground tests pass
- `bash scripts/run-browser-tests.sh` PASS (HTTP API checks against worktree)
- `bash scripts/webui_qa_agent.sh 8789` PASS (23/23 visual QA)
- Live verified: server starts cleanly under both `--foreground` and `HERMES_WEBUI_FOREGROUND=1`; PID lineage confirms no double-fork

Closes nesquena#1458 (Bug #1 only). Bugs #2 and #3 remain tracked under the issue.
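The detection and exec steps described above can be sketched as follows. The environment variable names and truthy values come from the commit message; the check order, the function bodies, and the helper `exec_server` are assumptions, and the shipped `_detect_supervisor()` may behave differently (a later release mentions an additional XPC noise filter, not modelled here).

```python
import os
from typing import Mapping, Optional

_SUPERVISOR_VARS = (
    "INVOCATION_ID",       # systemd
    "JOURNAL_STREAM",      # systemd
    "NOTIFY_SOCKET",       # systemd, s6
    "XPC_SERVICE_NAME",    # launchd
    "SUPERVISOR_ENABLED",  # supervisord
)
_TRUTHY = {"1", "true", "yes", "on"}


def detect_supervisor(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the env var that signals supervision, or None."""
    env = os.environ if env is None else env
    # Explicit opt-in is checked first here; that ordering is an assumption.
    if env.get("HERMES_WEBUI_FOREGROUND", "").strip().lower() in _TRUTHY:
        return "HERMES_WEBUI_FOREGROUND"
    for name in _SUPERVISOR_VARS:
        if env.get(name):
            return name
    return None


def exec_server(python: str, server_path: str, cwd: str) -> None:
    """Replace the current process with the server so the PID stays the same."""
    os.chdir(cwd)
    # os.execv never returns on success: the supervisor keeps tracking this PID.
    os.execv(python, [python, server_path])
```

Because `os.execv` inherits `os.environ` as it stands at call time, any resolved HOST/PORT values need to be written to `os.environ` before the exec, which matches the change noted in the commit.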

…267 follow-ups

- CHANGELOG.md: v0.50.269 entry detailing nesquena#1478 nesquena#1479 nesquena#1480
- ROADMAP.md: bump to v0.50.269, 3847 tests collected
- TESTING.md: bump header + total to 3847

nesquena#1478: nesquena APPROVED self-built bootstrap.py --foreground mode (closes nesquena#1458 Bug #1, +Opus follow-ups: XPC noise filter, executability guard)
nesquena#1479: surgical follow-up to nesquena#1473; Session.compact() now includes pending_user_message
nesquena#1480: bfcache pageshow restores active session via loadSession + checkInflightOnBoot

3847 tests pass (+47 net). Opus advisor on stage diff: no blockers.

@dso2ng

…ix + P0 polish bundle (nesquena#1466 nesquena#1494 nesquena#1469 nesquena#1484 nesquena#1486)

3 PRs in this batch (3866 → 3874 tests, +8):
- nesquena#1493 (@dso2ng): sidebar Stop response cancels row's stream not active pane's (closes nesquena#1466, follow-up to nesquena#1480)
- nesquena#1495 (self-built; reported by @insecurejezza in nesquena#1494): state.db connection FD leak in sidebar polling (closes nesquena#1494, addresses Bug #2 of nesquena#1458)
- nesquena#1492 (@bergeouss): P0 bugfixes bundle: tool-card args readability + CLI rename persistence + scroll pinning + sw.js relative-path regression test (closes nesquena#1469 nesquena#1484 nesquena#1486)

This release closes Bug #2 of the umbrella issue nesquena#1458. Bug #1 was closed by v0.50.269 (nesquena#1483) + v0.50.270 (nesquena#1487). Bug #3 (HTTP-unhealthy without FD exhaustion) is the remaining work item.

…n guard) CHANGELOG, ROADMAP, TESTING bumped (3925 → 3929 tests collected). Opus SHOULD-FIX absorbed in-release: tests #1-3 documented the dedup contract via direct construction but did not invoke get_models_grouped(). Test #4 (test_get_models_grouped_unconfigured_providers_get_independent_dicts) inspects the live source for the literal copy.deepcopy(auto_detected_models) call AND runs an end-to-end smoke of the fixed assignment loop. A future refactor that removes the deepcopy at api/config.py:2078 will fail this test immediately.
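The aliasing problem that the `copy.deepcopy(auto_detected_models)` call guards against can be shown in a few lines. The provider names and model fields below are made up for illustration; only the deepcopy call itself is taken from the commit message.

```python
import copy

# Hypothetical data: one auto-detected model list shared by several providers.
auto_detected_models = [{"id": "model-a", "label": "Model A"}]
providers = ["provider-x", "provider-y"]

# Each provider gets its own deep copy, so editing one group cannot leak
# into another group or back into the shared source list.
grouped = {name: copy.deepcopy(auto_detected_models) for name in providers}
grouped["provider-x"][0]["label"] = "Renamed"

print(grouped["provider-y"][0]["label"])  # still "Model A"
```

Assigning the same list object to every provider instead would make the rename above visible under every provider at once.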

Both flagged by pre-release Opus advisor; both clearly defensive and small enough to absorb in-release per the reviewer-flagged-fix-in-release-not-followup policy.

SHOULD-FIX #1 (api/routes.py:_clear_stale_stream_state, ~25 LOC): After the metadata-only reload (nesquena#1559 Layer 2), the local 'session' variable is reassigned to the full-load object but the caller still holds the original metadata-only stub. /api/session then returns the stale active_stream_id at routes.py:1791, causing the frontend to attempt one ghost SSE reconnect before recovering. Fix: capture original_stub at function entry, then patch its in-memory active_stream_id and pending_* fields to None after both the early-return (full-load already cleared) path AND the successful-mutation path. Now the caller's read returns fresh state, no ghost reconnect.

SHOULD-FIX #2 (api/models.py:Session.save, ~20 LOC): The .bak write at api/models.py:436 used write_text(), which truncates-then-writes, so a crash mid-write or a concurrent backup-producing save could leave a torn .bak. Recovery defends correctly (JSONDecodeError → returns -1 → 'no_action'), so the failure mode was 'backup lost' not 'spurious restore'. Fix: tmp + os.replace pattern matching the main file write at line 446-453. Now backup either lands cleanly or doesn't land at all.

4026/4026 tests pass post-absorb.
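The tmp + os.replace pattern named in SHOULD-FIX #2 generally looks like the sketch below. It is a generic illustration, not the code in api/models.py; `atomic_write_text` is a hypothetical helper name.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Write text so readers see either the old file or the complete new one."""
    # The temp file lives in the target directory so os.replace stays on one
    # filesystem, which is what makes the final rename atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        # Best-effort cleanup of the partial temp file, then re-raise.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

A crash before `os.replace` leaves only a stray temp file behind, so an existing backup is never truncated halfway.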

SHOULD-FIX #1 (renamed-root client cross-alias): drop strict-equality client filter at static/sessions.js:1853. Server-side _profiles_match cross-aliases 'default'-tagged rows to a renamed root 'kinni'; the strict-equality client would reject them, dropping every legacy session for renamed-root users. The server is now solely authoritative for profile scoping.

SHOULD-FIX #2 (messaging-source dedupe ordering): _keep_latest_messaging_session_per_source now runs AFTER the profile filter at api/routes.py:2078. Before, it ran on the merged-cross-profile list with profile-blind keys, discarding the older profile's row across profiles before the scope filter, leaving zero rows for any messaging identity the active profile shared with another profile.

NIT #3: _projects_migrated flag now set only AFTER successful save_projects.
NIT #4: cleaned dead test code in test_is_root_profile_invalidation_drops_stale.
NIT #5: _create_profile_fallback's clone_from=='default' literal now routes through _is_root_profile() for parity with the 5 other callsites.

+2 regression tests pin the SHOULD-FIX shapes:
- test_keep_latest_messaging_runs_after_profile_filter (source-string ordering)
- test_static_sessions_js_trusts_server_profile_scoping (no client re-filter)

4173 -> 4175 tests pass. 0 regressions.
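Why the ordering in SHOULD-FIX #2 matters can be seen with two rows that share a messaging source across profiles. The row fields (`source`, `profile`, `updated_at`) and both function names are assumptions for illustration, and plain equality stands in for the server's _profiles_match logic.

```python
from typing import Dict, List


def keep_latest_per_source(rows: List[dict]) -> List[dict]:
    """Keep only the most recently updated row for each messaging source."""
    latest: Dict[str, dict] = {}
    for row in rows:
        key = row["source"]
        if key not in latest or row["updated_at"] > latest[key]["updated_at"]:
            latest[key] = row
    return list(latest.values())


def visible_sessions(rows: List[dict], active_profile: str) -> List[dict]:
    """Scope to the active profile first, then dedupe within that scope."""
    scoped = [r for r in rows if r["profile"] == active_profile]
    return keep_latest_per_source(scoped)


rows = [
    {"source": "telegram:1", "profile": "a", "updated_at": 1},
    {"source": "telegram:1", "profile": "b", "updated_at": 2},
]
fixed = visible_sessions(rows, "a")
# Old order: dedupe across profiles first, then filter by profile.
buggy = [r for r in keep_latest_per_source(rows) if r["profile"] == "a"]
print(len(fixed), len(buggy))
```

In the old order the newer row from profile "b" wins the dedupe and is then filtered out, so profile "a" ends up with nothing for that source.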

@Michaelyklam

…p + test-isolation fix

Constituent PRs:
- nesquena#1725 (@Michaelyklam): simplify compact Activity row summary
- nesquena#1726 (@Michaelyklam): delegate generic provider catalogs to Hermes CLI (slice of nesquena#1240)
- nesquena#1727 (@Michaelyklam): link Claude Code OAuth in onboarding (closes nesquena#1362)
- nesquena#1728 (@starship-s): preserve profile context when starting chats
- nesquena#1729 (@Michaelyklam): persist compact Activity disclosure state
- nesquena#1730 (@Michaelyklam): prevent sticky sidebar hover drag state
- nesquena#1732 (@Sanjays2402): unpin scroll on small upward motion during streaming (closes nesquena#1731)

Plus 2 in-stage absorbed fixes:
- test-isolation fix: monkeypatch.setattr(config, 'cfg', X) survives PR nesquena#1728's path/mtime-aware get_config() reload. Mandatory before tag (Opus stage-302).
- Opus SHOULD-FIX #1: _lastScrollTop reset on session switch (nesquena#1732 follow-up).

Tests: 4537 → 4584 passing (+47). 0 regressions. Full suite ~128s. Stably green.

Pre-release verification:
- All 7 PRs CI-green individually + rebased onto master
- pytest 4584 passed, 0 failed (multiple runs)
- node -c clean on all 4 modified .js files
- 11/11 browser API endpoints PASS on isolated port 8789
- 20 QA tests via webui_qa_agent.sh PASS
- Opus advisor: SHIP, 5/5 verification clean, 0 MUST-FIX, 1 SHOULD-FIX absorbed (_lastScrollTop reset), 1 SHOULD-FIX deferred (nesquena#1736: _clear_anthropic_env_values race, onboarding-time-only)

Closes nesquena#1362, nesquena#1731.

Hermes added 11 commits March 31, 2026 02:37

- be210b2: fix: health check grep handles pretty-printed JSON response
- 4b0738f: cleanup: remove unused optional list in verify_hermes_imports, remove unused sys import in conftest
- e65052b: privacy: final sweep -- genericize repo URLs, CodePath example, remaining name refs
- f1d1586: fix: strip read_file line-number prefixes from all docs (artifact from write_file pass)
- 47231e0: fix: fully restore ARCHITECTURE.md from master (1588 lines), apply only privacy changes

nesquena-hermes merged commit fd1a865 into master Mar 31, 2026

nesquena-hermes deleted the portability branch March 31, 2026 02:59

This was referenced Apr 2, 2026

fix(api): resolve model provider from config to prevent misrouting #4

Merged

MiniMax newest models not appearing in WebUI dropdown #6

Closed

nesquena-hermes mentioned this pull request Apr 8, 2026

feat: add support for displaying thinking/reasoning blocks in chat #181

Closed

nesquena-hermes mentioned this pull request Apr 12, 2026

feat(onboarding): add one-shot bootstrap and first-run setup wizard #285

Closed

nesquena mentioned this pull request Apr 14, 2026

Microphone auto-denies regardless of connection security or browser permissions #358

Closed

nesquena mentioned this pull request Apr 18, 2026

Replace color scheme system with light/dark theme + accent skins #627

Merged

8 tasks

nesquena-hermes mentioned this pull request Apr 19, 2026

fix: robust mic toggle, MediaRecorder fallback for Tailscale, localStorage state persistence #683

Closed

nesquena mentioned this pull request Apr 21, 2026

fix: bootstrap.py loads REPO_ROOT/.env so direct invocation matches start.sh (#730) #791

Merged

nesquena-hermes pushed a commit that referenced this pull request Apr 21, 2026

docs: update models page — Kimi K2.6 replaces K2.5 (58.6% SWE-Pro #1 …

d33a5d7

…coding, writing section updated)

nesquena-hermes mentioned this pull request Apr 22, 2026

bug(chat): image_generate inline rendering fails — generated image URL shown as placeholder instead of <img> #853

Closed

nesquena mentioned this pull request Apr 24, 2026

fix(streaming): respect auxiliary.title_generation config for session titles #931

Merged

nesquena mentioned this pull request Apr 30, 2026

release: v0.50.247 — Cron Jobs project auto-assignment (#1345) #1347

Merged

qxxaa mentioned this pull request May 1, 2026

frontend renderer for code blocks returns "parse error" #1397

Closed

dutchaiagency mentioned this pull request May 2, 2026

fix: add bootstrap foreground mode for supervisors #1477

Closed

This was referenced May 2, 2026

fix(bootstrap): add --foreground mode for process supervisors (closes #1458 Bug #1) #1478

Closed

v0.50.269 — Bootstrap supervisor fix (#1478) + #1473 follow-ups (#1479, #1480) #1483

Merged

Garllee mentioned this pull request May 4, 2026

Fix: Cron system improvements - logging, truncation, pagination #1588

Open

This was referenced May 4, 2026

fix(profiles): scope sessions, projects, and root-profile resolution to active profile (#1611, #1612, #1614) #1629

Merged

Update failed: Failed to Fetch #1321

Closed

nesquena-hermes merged 11 commits into master from portability
