
fix: update MiniMax/Z.AI model lists, pass base_url to AIAgent #9

Merged
nesquena merged 1 commit into master from fix/minimax-models-and-base-url
Apr 2, 2026

Conversation

@nesquena (Owner) commented Apr 2, 2026

Summary

  • Update _PROVIDER_MODELS['minimax'] from stale ABAB 6.5 models to current MiniMax-M2.7/M2.5/M2.1 lineup (matching hermes-agent upstream)
  • Update _PROVIDER_MODELS['zai'] from GLM-4 series to current GLM-5/4.7/4.5 lineup (matching hermes-agent upstream)
  • Extend resolve_model_provider() to return base_url from config.yaml so providers with custom endpoints (MiniMax, Z.AI) are routed correctly
  • Pass base_url to AIAgent in both streaming and sync chat paths

Fixes #6
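The extended 3-tuple return shape can be sketched as below. The function and table names `resolve_model_provider` / `_PROVIDER_MODELS` come from this PR; the base-URL table and the default-provider fallback are placeholders, since the real values are read from config.yaml:

```python
# Hypothetical sketch of the (model, provider, base_url) 3-tuple that
# resolve_model_provider() now returns. The URL values and the default
# provider below are assumptions, not the real config.yaml contents.
from typing import Optional, Tuple

_PROVIDER_MODELS = {
    "minimax": ["MiniMax-M2.7", "MiniMax-M2.5", "MiniMax-M2.1"],
    "zai": ["GLM-5", "GLM-4.7", "GLM-4.5"],
}

# Assumed stand-in for base URLs loaded from config.yaml.
_PROVIDER_BASE_URLS = {
    "minimax": "https://api.minimax.example/v1",
    "zai": "https://api.z.ai.example/v1",
}

def resolve_model_provider(model: str) -> Tuple[str, str, Optional[str]]:
    """Return (model, provider, base_url); base_url is None for
    providers that use their SDK's default endpoint."""
    for provider, models in _PROVIDER_MODELS.items():
        if model in models:
            return model, provider, _PROVIDER_BASE_URLS.get(provider)
    return model, "openai", None  # assumed default provider
```

Both chat paths would then unpack the third element and hand it to AIAgent, so custom-endpoint providers route correctly.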

Test plan

  • Syntax check all modified Python files
  • Full test suite: 201 passed, 23 failed (all pre-existing, no regressions)
  • Verified model IDs match hermes-agent upstream (hermes_cli/models.py)
  • Verified AIAgent accepts base_url parameter (upstream run_agent.py line ~465)

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@nesquena nesquena merged commit 4a1bf20 into master Apr 2, 2026
@nesquena nesquena deleted the fix/minimax-models-and-base-url branch April 2, 2026 07:40
nesquena-hermes pushed a commit that referenced this pull request Apr 4, 2026
Sync: apply all public repo changes to private master.

This brings private in line with public versioning and branding:
- Version scheme: now v0.x to match public (was v1.x)
- Branding: 'Hermes Web UI' throughout (Co-Work branding dropped)
- PORTABILITY.md removed (portability applied, no longer needed)

Changes from public PRs #3-5, #9, security PR, Sprint 15:
- api/routes.py: _serve_static() path traversal sandbox, skill category
  validation, inject_test loopback gate, project CRUD (5 endpoints),
  resolve_model_provider 3-tuple, cron import cleanup
- api/config.py: resolve_model_provider(), MiniMax/Z.AI model lists,
  PROJECTS_FILE constant
- api/models.py: Session.project_id, load_projects(), save_projects()
- api/streaming.py: resolve_model_provider 3-tuple unpack
- static/ui.js: renderMd esc() on all captures, Mermaid pinned @10.9.3
  with SRI hash, code copy button, tool card expand/collapse
- static/sessions.js: project filter bar, project chips and picker
- static/index.html: PrismJS SRI hashes on all 3 CDN tags
- static/style.css: mobile responsive rules, project/copy button styles
- static/workspace.js: new URL() + credentials:include
- static/boot.js: new URL() + credentials:include
- static/messages.js: new URL() + withCredentials on EventSource
- tests/test_sprint15.py: 13 new tests (237 total)

Additional fixes on branch (review pass):
- Project name capped at 128 chars on create/rename
- Project color validated as hex #RRGGBB on create/rename
- Removed 2 redundant sys.path.insert() calls in cron handlers

Test results: 237 passed, 0 failed (1 known timing flake excluded)
Ola-Turmo pushed a commit to Ola-Turmo/hermes-webui that referenced this pull request Apr 9, 2026
…se-url

fix: update MiniMax/Z.AI model lists, pass base_url to AIAgent
JKJameson pushed a commit to JKJameson/hermes-webui that referenced this pull request Apr 25, 2026
…se-url

fix: update MiniMax/Z.AI model lists, pass base_url to AIAgent
nesquena-hermes pushed a commit that referenced this pull request May 2, 2026
Three changes from the pre-merge Opus review:

**MUST-FIX** — XPC_SERVICE_NAME false-positive on macOS Terminal

macOS launchd sets `XPC_SERVICE_NAME` in EVERY Terminal-spawned shell, not
just real services. Typical noise values: `"0"` (truthy in Python!) and
`"application.com.apple.Terminal.<UUID>"`. A bare `os.environ.get(name)`
existence check would auto-promote interactive `./start.sh` runs to
foreground mode on every Mac dev machine — silently breaking the most
common installation path (no /health probe, no browser open, no log file,
hanging shell).

Fix: new `_is_real_supervisor_value()` helper that filters noise. For
`XPC_SERVICE_NAME` specifically, reject `"0"` and any `"application.*"`
prefix. Real launchd plists use reverse-DNS Label form (`com.<rdns>.<svc>`)
which still triggers correctly.

7 new tests in `TestXPCServiceNameNoiseFilter`:
- 4 noise values (`0`, Terminal.app, iTerm2, VSCode) → no detection
- 3 real Label forms → correct detection
- Mixed env with XPC noise + real INVOCATION_ID → falls through to systemd

**SHOULD-FIX 1** — Test env leakage

The original `clean_env` fixture stripped supervisor-detection env vars
but not the resolved bootstrap vars (HERMES_WEBUI_HOST/PORT/AGENT_DIR)
that `main()` mutates onto `os.environ`. After
`test_foreground_exports_resolved_env_vars` ran, later tests would import
bootstrap with polluted defaults (DEFAULT_HOST="0.0.0.0" instead of
"127.0.0.1"). Existing assertions still passed (tautological vs DEFAULT_*),
but it was a footgun for future tests.

Fix: extend `clean_env` to also `delenv` the three resolved vars before
each test.
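A minimal sketch of the leak fix, as a plain helper the `clean_env` fixture could call via `monkeypatch.delenv`. The three resolved vars are the ones named above; the supervisor-var list is illustrative:

```python
# The resolved bootstrap vars main() writes back onto os.environ
# (names from the commit message above).
RESOLVED_VARS = ("HERMES_WEBUI_HOST", "HERMES_WEBUI_PORT", "HERMES_WEBUI_AGENT_DIR")
# Illustrative subset of supervisor-detection vars.
SUPERVISOR_VARS = ("XPC_SERVICE_NAME", "INVOCATION_ID")

def strip_leaky_vars(env: dict) -> None:
    # Remove detection vars AND the resolved vars, so no later test
    # imports bootstrap with polluted defaults like
    # DEFAULT_HOST="0.0.0.0".
    for var in SUPERVISOR_VARS + RESOLVED_VARS:
        env.pop(var, None)
```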

**SHOULD-FIX 2** — Pre-execv executability guard

If `discover_launcher_python` returns a path that doesn't exist or isn't
executable, `os.execv` raises OSError → wrapper catches → SystemExit(1)
→ supervisor restarts → loop forever. That's exactly the failure mode
this PR is supposed to eliminate.

Fix: `os.access(python_exe, os.X_OK)` check before execv. Converts
infinite supervisor loop into a single visible RuntimeError.
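The guard can be sketched like this; the wrapper function name and argv shape are assumptions, only the `os.access(..., os.X_OK)` check before `os.execv` is from the commit:

```python
import os

def exec_foreground(python_exe: str, argv: list[str]) -> None:
    """Sketch of the pre-execv guard (function name assumed).

    Without this check, a missing or non-executable interpreter makes
    os.execv raise OSError, the wrapper exits 1, and the supervisor
    restarts it forever. Failing loudly first breaks that loop.
    """
    if not os.access(python_exe, os.X_OK):
        raise RuntimeError(f"launcher python is not executable: {python_exe}")
    os.execv(python_exe, [python_exe, *argv])
```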

1 new test in `TestForegroundExecutabilityGuard` pinning that the guard
fires before execv when the python path is non-executable.

**Docs** — supervisor.md updates

- New section explaining the XPC_SERVICE_NAME noise filter and what
  values trigger / don't trigger detection
- New section listing supervisors that are NOT auto-detected (runit,
  daemontools, PM2, Foreman/Honcho, custom shell-script supervisors)
  with explicit recommendation to set HERMES_WEBUI_FOREGROUND=1

Verification

- 3820 tests pass (+9 from this commit's new tests vs the original PR
  push of 3811)
- Filter manually verified end-to-end with the live os.environ:
  XPC=0 → None, XPC=application.* → None, XPC=com.example.foo → triggers
- run-browser-tests.sh ALL CHECKS PASSED on the worktree

Items deferred from the Opus review

- #4 chdir target may not exist: REPO_ROOT comes from __file__.resolve()
  so it's stable; not a real concern in practice
- #6 two startup messages in foreground mode: cosmetic, useful for
  diagnostics
- #7 stricter explicit-only mode: leaves user the override of just not
  passing --foreground (current behavior)
- #8 test stub return value: trivial, can fix later if regression surface
- #9 argparse positional-after-option ordering: test reads fine

These can be follow-up issues if anyone hits them.
roadhero referenced this pull request in fox-in-the-box-ai/hermes-webui May 5, 2026
Issue nesquena#11 Option B — augments the existing 3-step button wizard
without replacing it.

- Skip CTA on every step. POST /api/setup/skip marks onboarding
  complete without collecting an API key. Provides an exit hatch
  for users who'll configure providers later from Settings, or
  who'll rely on local Ollama (nesquena#66).
- Externalized welcome text. Read from /data/config/onboarding.md
  via GET /api/setup/welcome; default copy ships from the
  default-configs sweep. Edit the file in place to customize the
  greeting per install.
- State alignment. /api/setup/{complete,skip} now write both
  onboarding.json (the redirect middleware's flag) AND
  settings.json:onboarding_completed (the hermes-webui settings
  surface). onboarding_complete() reads either, so future code
  paths flipping settings.json directly are honoured.
- Local Ollama fast-path on Welcome (depends on nesquena#66). When a
  host-side Ollama daemon is detected with at least one model
  installed, the Welcome step surfaces a "Use <model name>" CTA
  that activates the model server-side and drops the user
  straight into chat — no API key step. Falls back gracefully
  to the existing button wizard when Ollama isn't detected.
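The state-alignment bullet above can be sketched as a dual-write / either-read pair; the file names and keys come from the commit message, but the data-dir layout and function bodies are assumptions:

```python
import json
from pathlib import Path

def mark_onboarding_complete(data_dir: Path) -> None:
    # Write BOTH flags: the redirect middleware's file and the
    # settings surface, so they never disagree after completion.
    (data_dir / "onboarding.json").write_text(json.dumps({"completed": True}))
    settings_path = data_dir / "settings.json"
    settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}
    settings["onboarding_completed"] = True
    settings_path.write_text(json.dumps(settings))

def onboarding_complete(data_dir: Path) -> bool:
    # Either flag counts, so code flipping settings.json directly
    # is honoured.
    for path, key in ((data_dir / "onboarding.json", "completed"),
                      (data_dir / "settings.json", "onboarding_completed")):
        if path.exists() and json.loads(path.read_text()).get(key):
            return True
    return False
```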

Conversational onboarding "regardless of provider status" is
tracked separately as nesquena#69 (depends on #9 — llama.cpp fallback).
roadhero referenced this pull request in fox-in-the-box-ai/hermes-webui May 5, 2026
Bridges the v0.4.0 download manager (#10), the bundled llama.cpp
binary (Dockerfile change), and the existing streaming.py
fallback_model machinery. When the user opts into Settings →
Providers → "Local fallback" and the GGUF is on disk, transient
remote-provider failures (5xx, 429, connection errors) silently
retry through the on-device model — no chat interruption.

api/local_fallback.py (new):
- is_enabled() / set_enabled() — single bool in settings.json,
  mirrors the existing onboarding_completed pattern (#11 alignment).
- supervisor_status() / start_llama_server() / stop_llama_server() —
  thin wrappers around `supervisorctl`. Soft-fail with status
  NO_SUPERVISOR when the binary isn't on PATH (upstream Hermes,
  dev environments) so the rest of the module still works.
- get_fallback_endpoint() — returns the routing dict the streaming.py
  hook reads, or None if the user is opted out OR the server isn't
  RUNNING. Two-condition gate prevents silent failover to a
  non-functional local model.
- should_failover() classifier — 5xx + 429 + connection-error
  substrings are eligible; 401/403/404 + auth/quota/billing/
  model-not-found substrings are NEVER eligible (config errors must
  not be masked by a local model). NEVER list takes precedence.
- enable() / disable() — orchestration. enable() persists the flag,
  triggers #10's download_start, and either starts llama-server
  immediately (model already on disk) or spawns a background watcher
  that starts it once the file lands (handles 2.5 GB long downloads
  without blocking the toggle-on POST). disable() persists the flag
  and stops llama-server; preserves the downloaded model file.
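The `should_failover()` classifier described above can be sketched as follows; the precedence rule is from the commit message, while the exact substring lists and the signature are assumptions:

```python
from typing import Optional

NEVER_STATUSES = {401, 403, 404}
NEVER_SUBSTRINGS = ("auth", "quota", "billing", "model not found", "model-not-found")
CONNECTION_SUBSTRINGS = ("connection", "timed out", "timeout")

def should_failover(status: Optional[int], message: str) -> bool:
    """Sketch: 5xx + 429 + connection errors are eligible; the NEVER
    list takes precedence, because config errors must not be masked
    by silently answering from a local model."""
    msg = message.lower()
    if status in NEVER_STATUSES or any(s in msg for s in NEVER_SUBSTRINGS):
        return False
    if status == 429 or (status is not None and 500 <= status < 600):
        return True
    return any(s in msg for s in CONNECTION_SUBSTRINGS)
```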

api/streaming.py:
- Surgical 12-line addition just after the existing fallback_model
  resolution at line 1768. If local_fallback enabled AND the live
  endpoint is up, override _fallback_resolved with the on-device
  config. The agent's existing rate-limit-recovery / 5xx-retry path
  uses fallback_model, so we get failover for free without touching
  the exception classifier or retry orchestration. Failure modes are
  silent (try/except, falls through to original config-driven
  fallback or none).
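The shape of that surgical hook might look like the sketch below; `get_fallback_endpoint` is the real module function named above, while the wrapper name and the exact resolution variable are assumptions about the surrounding streaming.py code:

```python
def apply_local_fallback(fallback_resolved, get_fallback_endpoint):
    """Sketch of the post-resolution override (wrapper name assumed)."""
    try:
        local = get_fallback_endpoint()  # None unless opted in AND server RUNNING
        if local is not None:
            return local  # override the config-driven fallback
    except Exception:
        pass  # failure modes are silent: fall through to original config
    return fallback_resolved
```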

api/config.py:
- Adds `local_fallback_enabled` to _SETTINGS_DEFAULTS (default False)
  AND _SETTINGS_BOOL_KEYS (so save_settings preserves it). Without
  these the toggle would silently drop on save — caught during
  Phase 4 testing.
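Why both tables matter can be shown with a minimal sketch; `save_settings` internals here are an assumption, illustrating only that non-allowlisted keys drop on save:

```python
_SETTINGS_DEFAULTS = {"local_fallback_enabled": False}  # plus existing keys
_SETTINGS_BOOL_KEYS = {"local_fallback_enabled"}

def save_settings(current: dict, incoming: dict) -> dict:
    """Minimal sketch (assumed): only allowlisted bool keys survive a
    save, which is why the new key must appear in BOTH tables."""
    merged = dict(_SETTINGS_DEFAULTS)
    merged.update(current)
    for key in _SETTINGS_BOOL_KEYS:
        if key in incoming:
            merged[key] = bool(incoming[key])
    return merged
```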

api/routes.py:
- GET /api/local-fallback/status (full snapshot)
- POST /api/local-fallback/enable
- POST /api/local-fallback/disable

UI (static/index.html + panels.js):
- New "Local fallback" tile in Settings → Providers, below the Local
  Ollama tile. Status dot (gray=off, green=running/downloading),
  one-line status text, native checkbox toggle.
- _refreshLocalFallback() polls /api/local-fallback/status every 2.5s
  while Settings is open — catches downloading→ready state transitions
  without forcing the user to refresh the page.
- ui_state-driven copy:
    disabled         → "Off — toggle on to download (~2.5 GB) and …"
    needs-download   → "Will start downloading (~2.5 GB) …"
    downloading      → "Downloading 1.2 GB / 2.5 GB (48%)" (live)
    starting/warming → "Starting llama-server…"
    ready            → "Ready — your provider failures will silently…"
    no-supervisor    → "Supervisor not available — only works in Fox"

Verified e2e against a freshly built test container with a deliberately
broken MODEL_URL_PHI4MINI:
- Toggle OFF → ON: enabled flips, model download starts (status
  reaches 'failed' because of the bogus URL — that's the test signal),
  ui_state correctly transitions disabled → downloading → needs-download
- Toggle ON → OFF: enabled flips back, settings.json persisted,
  supervisorctl status confirms llama-server STOPPED
- Settings.json contains "local_fallback_enabled": false after the
  off-toggle (persistence verified)

What's NOT in this commit (deferred to a follow-up):
- Reactive modal for users who haven't opted in when a chat fails.
  Today they see the existing remote-provider error; a small UX
  improvement would offer "Try local model? (one-time download)".
  Documented in #9's Phase 1 plan as a v0.4.2 polish item.
- Recovery banner ("Remote provider is back — switch?") — same
  reasoning, polish for v0.4.2.
roadhero referenced this pull request in fox-in-the-box-ai/hermes-webui May 5, 2026
…l + banner) (#3)

* feat(tailscale): power-user fields in Settings advanced accordion (nesquena#96 phase 2)

Surfaces the 6 power-user Tailscale flags that nesquena#96 phase 1's argv
builder already accepted but the UI didn't expose:

  - Login server (custom control plane URL — headscale, on-prem)
  - Advertise routes (subnet-router CIDRs)
  - Advertise tags (ACL identity)
  - Exit node (route all traffic via a peer)
  - Accept routes (consume peers' subnet routes)
  - Accept DNS (MagicDNS, default on)

Closes Persona 3 of nesquena#96.

Backend:
- api/config.py: 6 new keys in _SETTINGS_DEFAULTS, the two booleans
  added to _SETTINGS_BOOL_KEYS allowlist
- api/tailscale.py: _load_persisted_opts() reads them as the
  body-shape that _build_up_argv() expects; get_status() exposes
  current values under config{}; start_up() merges body opts on
  top of persisted (body wins per-key, empty string falls through)
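The per-key merge in `start_up()` can be sketched like this; the helper name is an assumption, the semantics (body wins per-key, empty string falls through to persisted) are from the bullet above:

```python
def merge_up_opts(persisted: dict, body: dict) -> dict:
    """Sketch (helper name assumed): request-body opts win per-key,
    but an empty string in the body falls through to the persisted
    settings.json value instead of clobbering it."""
    merged = dict(persisted)
    for key, value in body.items():
        if value == "":
            continue  # empty string falls through to persisted
        merged[key] = value
    return merged
```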

Frontend:
- static/index.html: expand Advanced accordion in the Tailscale
  tile with 4 text inputs, 2 checkboxes, Save button, status line
- static/panels.js: _tsPopulateAdvanced() pre-fills from
  state.config on Settings load; tsSaveAdvanced() POSTs all 6
  values to the existing /api/settings endpoint (which already
  validates types via _SETTINGS_BOOL_KEYS / allowlist)

Persistence is via settings.json — same store as everything else.
Power user can edit, Save, Connect; values survive container
restarts and pre-populate next session.

Phase 1 of v0.4.6.

* feat(fallback): reactive modal + recovery banner (#9 polish)

Two pieces of UX layered on top of v0.4.1's silent failover:

1. Reactive modal — when a chat stream fails on a remote provider AND
   the user has NOT opted into local fallback, surface a one-time
   modal offering to enable it. Fires only on error types where local
   would actually help (stream_interrupted, rate_limit, no_response,
   unknown). Skips eligibility-killers like auth_mismatch /
   model_not_found / quota_exhausted that local can't fix.

2. Recovery banner — when local fallback IS enabled, periodically poll
   /api/local-fallback/remote-health (new endpoint that probes
   openrouter.ai/api/v1/models with a 5s timeout, 30s in-process
   cache). When remote is reachable again, show a top banner offering
   to switch off local fallback.

Architecture:
- Surgical one-line dispatch in messages.js apperror handler:
  window.dispatchEvent(new CustomEvent('fitb:stream-error', {detail}))
- New fallback-polish.js IIFE listens for the event, talks to
  /api/local-fallback/{status,enable,disable,remote-health}
- sessionStorage flags for "modal seen" / "banner dismissed" so the
  UI doesn't pester. Reset on page reload.
- New CSS in fox-in-the-box.css styles the modal + banner using
  existing CSS vars (theme-aware).

Backend:
- local_fallback.get_remote_health() — urllib.request, 5s timeout,
  30s in-process cache to keep multi-tab polling cheap. Returns
  {remote_healthy, tested_url, error}. OpenRouter is the FITB default
  provider; its public /models endpoint is the canonical "internet
  works and primary provider is up" probe.
- routes.py — wires GET /api/local-fallback/remote-health.
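The probe-with-cache can be sketched as below; the return shape, the 5s timeout, the 30s cache, and the OpenRouter /models URL are from this commit, while the cache structure is an assumption (and this sketch caches one result process-wide rather than per-URL):

```python
import time
import urllib.request

_CACHE = {"ts": 0.0, "result": None}
_CACHE_TTL = 30.0  # seconds; keeps multi-tab polling cheap

def get_remote_health(url="https://openrouter.ai/api/v1/models", timeout=5):
    """Sketch of local_fallback.get_remote_health(); internals assumed."""
    now = time.time()
    if _CACHE["result"] is not None and now - _CACHE["ts"] < _CACHE_TTL:
        return _CACHE["result"]
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            result = {"remote_healthy": resp.status == 200,
                      "tested_url": url, "error": None}
    except Exception as exc:
        result = {"remote_healthy": False, "tested_url": url, "error": str(exc)}
    _CACHE.update(ts=now, result=result)
    return result
```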

Phase 2 of v0.4.6.

---------

Co-authored-by: Dennis <[email protected]>


Development

Successfully merging this pull request may close these issues.

MiniMax newest models not appearing in WebUI dropdown

1 participant