fix: update MiniMax/Z.AI model lists, pass base_url to AIAgent#9
Merged
Conversation
- Update `_PROVIDER_MODELS['minimax']` from the stale ABAB 6.5 models to the current MiniMax-M2.7/M2.5/M2.1 lineup (matching hermes-agent upstream)
- Update `_PROVIDER_MODELS['zai']` from GLM-4 to the current GLM-5/4.7/4.5 lineup (matching hermes-agent upstream)
- Extend `resolve_model_provider()` to also return `base_url` from config.yaml, so providers with custom endpoints (MiniMax, Z.AI) are routed correctly
- Pass `base_url` to AIAgent in both the streaming and sync chat paths

Fixes #6

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
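A minimal sketch of the 3-tuple resolution described above. The real `resolve_model_provider()` lives in `api/config.py` and reads base URLs from config.yaml; the URL values and the `"default"` fallback below are placeholders, not the project's actual endpoints.

```python
from typing import Optional, Tuple

_PROVIDER_MODELS = {
    "minimax": ["MiniMax-M2.7", "MiniMax-M2.5", "MiniMax-M2.1"],
    "zai": ["GLM-5", "GLM-4.7", "GLM-4.5"],
}

# Stand-in for per-provider endpoints parsed from config.yaml (hypothetical values).
_PROVIDER_BASE_URLS = {
    "minimax": "https://minimax.example/v1",
    "zai": "https://zai.example/v1",
}

def resolve_model_provider(model: str) -> Tuple[str, str, Optional[str]]:
    """Return (model, provider, base_url); base_url is None for default endpoints."""
    for provider, models in _PROVIDER_MODELS.items():
        if model in models:
            return model, provider, _PROVIDER_BASE_URLS.get(provider)
    # Unknown models fall through to the default provider with no custom endpoint.
    return model, "default", None
```

Both the streaming and sync chat paths would then unpack the 3-tuple and forward `base_url` to AIAgent alongside the model name.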
This was referenced Apr 2, 2026
nesquena-hermes pushed a commit that referenced this pull request on Apr 4, 2026
Sync: apply all public repo changes to private master.

This brings private in line with public versioning and branding:
- Version scheme: now v0.x to match public (was v1.x)
- Branding: 'Hermes Web UI' throughout (Co-Work branding dropped)
- PORTABILITY.md removed (portability applied, no longer needed)

Changes from public PRs #3-5, #9, security PR, Sprint 15:
- api/routes.py: _serve_static() path traversal sandbox, skill category validation, inject_test loopback gate, project CRUD (5 endpoints), resolve_model_provider 3-tuple, cron import cleanup
- api/config.py: resolve_model_provider(), MiniMax/Z.AI model lists, PROJECTS_FILE constant
- api/models.py: Session.project_id, load_projects(), save_projects()
- api/streaming.py: resolve_model_provider 3-tuple unpack
- static/ui.js: renderMd esc() on all captures, Mermaid pinned @10.9.3 with SRI hash, code copy button, tool card expand/collapse
- static/sessions.js: project filter bar, project chips and picker
- static/index.html: PrismJS SRI hashes on all 3 CDN tags
- static/style.css: mobile responsive rules, project/copy button styles
- static/workspace.js: new URL() + credentials:include
- static/boot.js: new URL() + credentials:include
- static/messages.js: new URL() + withCredentials on EventSource
- tests/test_sprint15.py: 13 new tests (237 total)

Additional fixes on branch (review pass):
- Project name capped at 128 chars on create/rename
- Project color validated as hex #RRGGBB on create/rename
- Removed 2 redundant sys.path.insert() calls in cron handlers

Test results: 237 passed, 0 failed (1 known timing flake excluded)
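The `_serve_static()` path traversal sandbox mentioned above can be sketched as a resolve-then-containment check; the helper name and `STATIC_ROOT` location here are assumptions, not the verified upstream code.

```python
from pathlib import Path
from typing import Optional

# Hypothetical static root; the real value depends on the app's layout.
STATIC_ROOT = Path("static").resolve()

def safe_static_path(requested: str) -> Optional[Path]:
    """Map a URL path to a file inside STATIC_ROOT, or None if it escapes."""
    candidate = (STATIC_ROOT / requested.lstrip("/")).resolve()
    # Reject anything that resolves outside the sandbox, e.g. "../api/config.py".
    if not candidate.is_relative_to(STATIC_ROOT):
        return None
    return candidate
```

Resolving before comparing is the key step: it normalizes `..` segments so a prefix check on the resolved path cannot be bypassed.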
Ola-Turmo pushed a commit to Ola-Turmo/hermes-webui that referenced this pull request on Apr 9, 2026
…se-url fix: update MiniMax/Z.AI model lists, pass base_url to AIAgent
JKJameson pushed a commit to JKJameson/hermes-webui that referenced this pull request on Apr 25, 2026
…se-url fix: update MiniMax/Z.AI model lists, pass base_url to AIAgent
nesquena-hermes pushed a commit that referenced this pull request on May 2, 2026
Three changes from the pre-merge Opus review:

**MUST-FIX** — XPC_SERVICE_NAME false positive on macOS Terminal

macOS launchd sets `XPC_SERVICE_NAME` in EVERY Terminal-spawned shell, not just real services. Typical noise values: `"0"` (truthy in Python!) and `"application.com.apple.Terminal.<UUID>"`. A bare `os.environ.get(name)` existence check would auto-promote interactive `./start.sh` runs to foreground mode on every Mac dev machine — silently breaking the most common installation path (no /health probe, no browser open, no log file, hanging shell).

Fix: new `_is_real_supervisor_value()` helper that filters the noise. For `XPC_SERVICE_NAME` specifically, reject `"0"` and any `"application.*"` prefix. Real launchd plists use the reverse-DNS Label form (`com.<rdns>.<svc>`), which still triggers correctly.

7 new tests in `TestXPCServiceNameNoiseFilter`:
- 4 noise values (`0`, Terminal.app, iTerm2, VSCode) → no detection
- 3 real Label forms → correct detection
- Mixed env with XPC noise + a real INVOCATION_ID → falls through to systemd

**SHOULD-FIX 1** — Test env leakage

The original `clean_env` fixture stripped the supervisor-detection env vars but not the resolved bootstrap vars (HERMES_WEBUI_HOST/PORT/AGENT_DIR) that `main()` mutates onto `os.environ`. After `test_foreground_exports_resolved_env_vars` ran, later tests would import bootstrap with polluted defaults (DEFAULT_HOST="0.0.0.0" instead of "127.0.0.1"). Existing assertions still passed (tautological vs DEFAULT_*), but it was a footgun for future tests. Fix: extend `clean_env` to also `delenv` the three resolved vars before each test.

**SHOULD-FIX 2** — Pre-execv executability guard

If `discover_launcher_python` returns a path that doesn't exist or isn't executable, `os.execv` raises OSError → the wrapper catches it → SystemExit(1) → the supervisor restarts → loop forever. That's exactly the failure mode this PR is supposed to eliminate. Fix: an `os.access(python_exe, os.X_OK)` check before execv converts the infinite supervisor loop into a single visible RuntimeError. 1 new test in `TestForegroundExecutabilityGuard` pins that the guard fires before execv when the python path is non-executable.

**Docs** — supervisor.md updates
- New section explaining the XPC_SERVICE_NAME noise filter and which values do and don't trigger detection
- New section listing supervisors that are NOT auto-detected (runit, daemontools, PM2, Foreman/Honcho, custom shell-script supervisors), with an explicit recommendation to set HERMES_WEBUI_FOREGROUND=1

Verification
- 3820 tests pass (+9 from this commit's new tests vs the original PR push of 3811)
- Filter manually verified end-to-end against the live os.environ: XPC=0 → None, XPC=application.* → None, XPC=com.example.foo → triggers
- run-browser-tests.sh ALL CHECKS PASSED on the worktree

Items deferred from the Opus review
- #4 chdir target may not exist: REPO_ROOT comes from __file__.resolve(), so it's stable; not a real concern in practice
- #6 two startup messages in foreground mode: cosmetic, useful for diagnostics
- #7 stricter explicit-only mode: leaves the user the override of simply not passing --foreground (current behavior)
- #8 test stub return value: trivial, can fix later if it becomes a regression surface
- #9 argparse positional-after-option ordering: the test reads fine

These can become follow-up issues if anyone hits them.
roadhero referenced this pull request in fox-in-the-box-ai/hermes-webui on May 5, 2026
Issue nesquena#11 Option B — augments the existing 3-step button wizard without replacing it.

- Skip CTA on every step. POST /api/setup/skip marks onboarding complete without collecting an API key. Provides an exit hatch for users who'll configure providers later from Settings, or who'll rely on local Ollama (nesquena#66).
- Externalized welcome text. Read from /data/config/onboarding.md via GET /api/setup/welcome; the default copy ships from the default-configs sweep. Edit the file in place to customize the greeting per install.
- State alignment. /api/setup/{complete,skip} now write both onboarding.json (the redirect middleware's flag) AND settings.json:onboarding_completed (the hermes-webui settings surface). onboarding_complete() reads either, so future code paths flipping settings.json directly are honoured.
- Local Ollama fast-path on Welcome (depends on nesquena#66). When a host-side Ollama daemon is detected with at least one model installed, the Welcome step surfaces a "Use <model name>" CTA that activates the model server-side and drops the user straight into chat — no API key step. Falls back gracefully to the existing button wizard when Ollama isn't detected.

Conversational onboarding "regardless of provider status" is tracked separately as nesquena#69 (depends on #9 — llama.cpp fallback).
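The dual-flag state alignment can be sketched like this; the file paths and key names follow the commit message (`onboarding.json` for the redirect middleware, `settings.json:onboarding_completed` for the settings surface) but are otherwise assumptions.

```python
import json
from pathlib import Path

# Hypothetical locations; the real install keeps these under /data/config.
ONBOARDING_FILE = Path("data/config/onboarding.json")
SETTINGS_FILE = Path("data/config/settings.json")

def _read_json(path: Path) -> dict:
    """Best-effort read; a missing or corrupt file is treated as empty."""
    try:
        return json.loads(path.read_text())
    except (OSError, ValueError):
        return {}

def mark_onboarding_complete() -> None:
    """Write BOTH stores so either surface sees the flag."""
    ONBOARDING_FILE.parent.mkdir(parents=True, exist_ok=True)
    ONBOARDING_FILE.write_text(json.dumps({"complete": True}))
    settings = _read_json(SETTINGS_FILE)
    settings["onboarding_completed"] = True
    SETTINGS_FILE.write_text(json.dumps(settings))

def onboarding_complete() -> bool:
    # Honour whichever store flipped the flag, so code paths writing
    # settings.json directly still count as completed.
    return bool(
        _read_json(ONBOARDING_FILE).get("complete")
        or _read_json(SETTINGS_FILE).get("onboarding_completed")
    )
```

Reading from either store is what makes the alignment forward-compatible: a future path that only touches settings.json is still honoured.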
roadhero referenced this pull request in fox-in-the-box-ai/hermes-webui on May 5, 2026
Bridges the v0.4.0 download manager (#10), the bundled llama.cpp binary (Dockerfile change), and the existing streaming.py fallback_model machinery. When the user opts into Settings → Providers → "Local fallback" and the GGUF is on disk, transient remote-provider failures (5xx, 429, connection errors) silently retry through the on-device model — no chat interruption.

api/local_fallback.py (new):
- is_enabled() / set_enabled() — single bool in settings.json, mirrors the existing onboarding_completed pattern (#11 alignment).
- supervisor_status() / start_llama_server() / stop_llama_server() — thin wrappers around `supervisorctl`. Soft-fail with status NO_SUPERVISOR when the binary isn't on PATH (upstream Hermes, dev environments), so the rest of the module still works.
- get_fallback_endpoint() — returns the routing dict the streaming.py hook reads, or None if the user is opted out OR the server isn't RUNNING. The two-condition gate prevents silent failover to a non-functional local model.
- should_failover() classifier — 5xx + 429 + connection-error substrings are eligible; 401/403/404 + auth/quota/billing/model-not-found substrings are NEVER eligible (config errors must not be masked by a local model). The NEVER list takes precedence.
- enable() / disable() — orchestration. enable() persists the flag, triggers #10's download_start, and either starts llama-server immediately (model already on disk) or spawns a background watcher that starts it once the file lands (handles 2.5 GB downloads without blocking the toggle-on POST). disable() persists the flag and stops llama-server; the downloaded model file is preserved.

api/streaming.py:
- Surgical 12-line addition just after the existing fallback_model resolution at line 1768. If local_fallback is enabled AND the live endpoint is up, override _fallback_resolved with the on-device config. The agent's existing rate-limit-recovery / 5xx-retry path uses fallback_model, so we get failover for free without touching the exception classifier or retry orchestration. Failure modes are silent (try/except, falls through to the original config-driven fallback or none).

api/config.py:
- Adds `local_fallback_enabled` to _SETTINGS_DEFAULTS (default False) AND _SETTINGS_BOOL_KEYS (so save_settings preserves it). Without these the toggle would silently drop on save — caught during Phase 4 testing.

api/routes.py:
- GET /api/local-fallback/status (full snapshot)
- POST /api/local-fallback/enable
- POST /api/local-fallback/disable

UI (static/index.html + panels.js):
- New "Local fallback" tile in Settings → Providers, below the Local Ollama tile. Status dot (gray=off, green=running/downloading), one-line status text, native checkbox toggle.
- _refreshLocalFallback() polls /api/local-fallback/status every 2.5s while Settings is open — catches downloading→ready state transitions without forcing the user to refresh the page.
- ui_state-driven copy:
  - disabled → "Off — toggle on to download (~2.5 GB) and …"
  - needs-download → "Will start downloading (~2.5 GB) …"
  - downloading → "Downloading 1.2 GB / 2.5 GB (48%)" (live)
  - starting/warming → "Starting llama-server…"
  - ready → "Ready — your provider failures will silently…"
  - no-supervisor → "Supervisor not available — only works in Fox"

Verified e2e against a freshly built test container with a deliberately broken MODEL_URL_PHI4MINI:
- Toggle OFF → ON: enabled flips, the model download starts (status reaches 'failed' because of the bogus URL — that's the test signal), ui_state correctly transitions disabled → downloading → needs-download
- Toggle ON → OFF: enabled flips back, settings.json is persisted, supervisorctl status confirms llama-server STOPPED
- settings.json contains "local_fallback_enabled": false after the off-toggle (persistence verified)

What's NOT in this commit (deferred to a follow-up):
- Reactive modal for users who haven't opted in when a chat fails. Today they see the existing remote-provider error; a small UX improvement would offer "Try local model? (one-time download)". Documented in #9's Phase 1 plan as a v0.4.2 polish item.
- Recovery banner ("Remote provider is back — switch?") — same reasoning, polish for v0.4.2.
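The should_failover() classifier can be sketched as below. The status codes and substrings follow the commit message; the exact substring list in api/local_fallback.py may differ, and the NEVER list is checked first so it always wins.

```python
from typing import Optional

# Config errors a local model cannot fix — never eligible for failover.
_NEVER_STATUS = {401, 403, 404}
_NEVER_SUBSTRINGS = ("auth", "quota", "billing", "model-not-found")

# Transient error hints that ARE eligible when no HTTP status is available.
_ELIGIBLE_SUBSTRINGS = ("connection", "timed out", "timeout")

def should_failover(status: Optional[int], message: str) -> bool:
    msg = message.lower()
    # NEVER list takes precedence: masking a config error would hide the real bug.
    if status in _NEVER_STATUS or any(s in msg for s in _NEVER_SUBSTRINGS):
        return False
    # 5xx and 429 are transient server-side conditions worth retrying locally.
    if status is not None and (status == 429 or 500 <= status < 600):
        return True
    return any(s in msg for s in _ELIGIBLE_SUBSTRINGS)
```

The precedence ordering is the important design choice: a 500 response whose body mentions "quota" is still rejected, because the substring check runs before the status-based eligibility check.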
roadhero referenced this pull request in fox-in-the-box-ai/hermes-webui on May 5, 2026
…l + banner) (#3)

* feat(tailscale): power-user fields in Settings advanced accordion (nesquena#96 phase 2)

Surfaces the 6 power-user Tailscale flags that nesquena#96 phase 1's argv builder already accepted but the UI didn't expose:
- Login server (custom control plane URL — headscale, on-prem)
- Advertise routes (subnet-router CIDRs)
- Advertise tags (ACL identity)
- Exit node (route all traffic via a peer)
- Accept routes (consume peers' subnet routes)
- Accept DNS (MagicDNS, default on)

Closes Persona 3 of nesquena#96.

Backend:
- api/config.py: 6 new keys in _SETTINGS_DEFAULTS, the two booleans added to the _SETTINGS_BOOL_KEYS allowlist
- api/tailscale.py: _load_persisted_opts() reads them in the body shape that _build_up_argv() expects; get_status() exposes current values under config{}; start_up() merges body opts on top of persisted values (body wins per key, empty string falls through)

Frontend:
- static/index.html: expand the Advanced accordion in the Tailscale tile with 4 text inputs, 2 checkboxes, a Save button, and a status line
- static/panels.js: _tsPopulateAdvanced() pre-fills from state.config on Settings load; tsSaveAdvanced() POSTs all 6 values to the existing /api/settings endpoint (which already validates types via the _SETTINGS_BOOL_KEYS allowlist)

Persistence is via settings.json — the same store as everything else. A power user can edit, Save, Connect; values survive container restarts and pre-populate the next session. Phase 1 of v0.4.6.

* feat(fallback): reactive modal + recovery banner (#9 polish)

Two pieces of UX layered on top of v0.4.1's silent failover:

1. Reactive modal — when a chat stream fails on a remote provider AND the user has NOT opted into local fallback, surface a one-time modal offering to enable it. Fires only on error types where local would actually help (stream_interrupted, rate_limit, no_response, unknown). Skips eligibility killers like auth_mismatch / model_not_found / quota_exhausted that a local model can't fix.
2. Recovery banner — when local fallback IS enabled, periodically poll /api/local-fallback/remote-health (a new endpoint that probes openrouter.ai/api/v1/models with a 5s timeout and a 30s in-process cache). When remote is reachable again, show a top banner offering to switch off local fallback.

Architecture:
- Surgical one-line dispatch in the messages.js apperror handler: window.dispatchEvent(new CustomEvent('fitb:stream-error', {detail}))
- New fallback-polish.js IIFE listens for the event and talks to /api/local-fallback/{status,enable,disable,remote-health}
- sessionStorage flags for "modal seen" / "banner dismissed" so the UI doesn't pester; reset on page reload.
- New CSS in fox-in-the-box.css styles the modal + banner using existing CSS vars (theme-aware).

Backend:
- local_fallback.get_remote_health() — urllib.request, 5s timeout, 30s in-process cache to keep multi-tab polling cheap. Returns {remote_healthy, tested_url, error}. OpenRouter is the FITB default provider; its public /models endpoint is the canonical "internet works and the primary provider is up" probe.
- routes.py — wires GET /api/local-fallback/remote-health.

Phase 2 of v0.4.6.

---------

Co-authored-by: Dennis <[email protected]>
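The cached health probe described above might look like the sketch below. The probe URL and return keys follow the commit message; the injectable `probe` parameter is an illustrative addition (not claimed to exist upstream) that makes the cache testable without network access.

```python
import time
import urllib.request

PROBE_URL = "https://openrouter.ai/api/v1/models"
CACHE_TTL = 30.0  # seconds; keeps multi-tab polling cheap
_CACHE = {"at": 0.0, "result": None}

def get_remote_health(probe=None) -> dict:
    """Probe the default provider, serving a cached result within CACHE_TTL."""
    now = time.monotonic()
    if _CACHE["result"] is not None and now - _CACHE["at"] < CACHE_TTL:
        return _CACHE["result"]
    if probe is None:
        # Real probe: any successful GET means the internet and provider are up.
        probe = lambda: urllib.request.urlopen(PROBE_URL, timeout=5).close()
    try:
        probe()
        result = {"remote_healthy": True, "tested_url": PROBE_URL, "error": None}
    except Exception as exc:
        result = {"remote_healthy": False, "tested_url": PROBE_URL, "error": str(exc)}
    _CACHE.update(at=now, result=result)
    return result
```

Because the cache is in-process and keyed by monotonic time, several browser tabs polling the endpoint every few seconds collapse into at most one outbound request per 30 seconds.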
Summary
- `_PROVIDER_MODELS['minimax']`: from the stale ABAB 6.5 models to the current MiniMax-M2.7/M2.5/M2.1 lineup (matching hermes-agent upstream)
- `_PROVIDER_MODELS['zai']`: from the GLM-4 series to the current GLM-5/4.7/4.5 lineup (matching hermes-agent upstream)
- `resolve_model_provider()`: now also returns `base_url` from config.yaml, so providers with custom endpoints (MiniMax, Z.AI) are routed correctly
- `base_url` passed to AIAgent in both the streaming and sync chat paths

Fixes #6
Test plan
- (`hermes_cli/models.py`)
- `base_url` parameter (upstream `run_agent.py`, line ~465)

🤖 Generated with Claude Code