
fix: update MiniMax/Z.AI model lists, pass base_url to AIAgent #9

Merged
nesquena merged 1 commit into master from fix/minimax-models-and-base-url
Apr 2, 2026

Conversation

@nesquena (Owner) commented Apr 2, 2026

Summary

  • Update _PROVIDER_MODELS['minimax'] from stale ABAB 6.5 models to current MiniMax-M2.7/M2.5/M2.1 lineup (matching hermes-agent upstream)
  • Update _PROVIDER_MODELS['zai'] from GLM-4 series to current GLM-5/4.7/4.5 lineup (matching hermes-agent upstream)
  • Extend resolve_model_provider() to return base_url from config.yaml so providers with custom endpoints (MiniMax, Z.AI) are routed correctly
  • Pass base_url to AIAgent in both streaming and sync chat paths

Fixes #6
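The extended 3-tuple return shape can be sketched as below. The function and table names `resolve_model_provider` / `_PROVIDER_MODELS` come from this PR; the base-URL table and the default-provider fallback are placeholders, since the real values are read from config.yaml:

```python
# Hypothetical sketch of the (model, provider, base_url) 3-tuple that
# resolve_model_provider() now returns. The URL values and the default
# provider below are assumptions, not the real config.yaml contents.
from typing import Optional, Tuple

_PROVIDER_MODELS = {
    "minimax": ["MiniMax-M2.7", "MiniMax-M2.5", "MiniMax-M2.1"],
    "zai": ["GLM-5", "GLM-4.7", "GLM-4.5"],
}

# Assumed stand-in for base URLs loaded from config.yaml.
_PROVIDER_BASE_URLS = {
    "minimax": "https://api.minimax.example/v1",
    "zai": "https://api.z.ai.example/v1",
}

def resolve_model_provider(model: str) -> Tuple[str, str, Optional[str]]:
    """Return (model, provider, base_url); base_url is None for
    providers that use their SDK's default endpoint."""
    for provider, models in _PROVIDER_MODELS.items():
        if model in models:
            return model, provider, _PROVIDER_BASE_URLS.get(provider)
    return model, "openai", None  # assumed default provider
```

Both chat paths would then unpack the third element and hand it to AIAgent, so custom-endpoint providers route correctly.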

Test plan

  • Syntax check all modified Python files
  • Full test suite: 201 passed, 23 failed (all pre-existing, no regressions)
  • Verified model IDs match hermes-agent upstream (hermes_cli/models.py)
  • Verified AIAgent accepts base_url parameter (upstream run_agent.py line ~465)

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@nesquena nesquena merged commit 4a1bf20 into master Apr 2, 2026
@nesquena nesquena deleted the fix/minimax-models-and-base-url branch April 2, 2026 07:40
nesquena-hermes pushed a commit that referenced this pull request Apr 4, 2026
Sync: apply all public repo changes to private master.

This brings private in line with public versioning and branding:
- Version scheme: now v0.x to match public (was v1.x)
- Branding: 'Hermes Web UI' throughout (Co-Work branding dropped)
- PORTABILITY.md removed (portability applied, no longer needed)

Changes from public PRs #3-5, #9, security PR, Sprint 15:
- api/routes.py: _serve_static() path traversal sandbox, skill category
  validation, inject_test loopback gate, project CRUD (5 endpoints),
  resolve_model_provider 3-tuple, cron import cleanup
- api/config.py: resolve_model_provider(), MiniMax/Z.AI model lists,
  PROJECTS_FILE constant
- api/models.py: Session.project_id, load_projects(), save_projects()
- api/streaming.py: resolve_model_provider 3-tuple unpack
- static/ui.js: renderMd esc() on all captures, Mermaid pinned @10.9.3
  with SRI hash, code copy button, tool card expand/collapse
- static/sessions.js: project filter bar, project chips and picker
- static/index.html: PrismJS SRI hashes on all 3 CDN tags
- static/style.css: mobile responsive rules, project/copy button styles
- static/workspace.js: new URL() + credentials:include
- static/boot.js: new URL() + credentials:include
- static/messages.js: new URL() + withCredentials on EventSource
- tests/test_sprint15.py: 13 new tests (237 total)

Additional fixes on branch (review pass):
- Project name capped at 128 chars on create/rename
- Project color validated as hex #RRGGBB on create/rename
- Removed 2 redundant sys.path.insert() calls in cron handlers

Test results: 237 passed, 0 failed (1 known timing flake excluded)
Ola-Turmo pushed a commit to Ola-Turmo/hermes-webui that referenced this pull request Apr 9, 2026
…se-url

fix: update MiniMax/Z.AI model lists, pass base_url to AIAgent
JKJameson pushed a commit to JKJameson/hermes-webui that referenced this pull request Apr 25, 2026
…se-url

fix: update MiniMax/Z.AI model lists, pass base_url to AIAgent
nesquena-hermes pushed a commit that referenced this pull request May 2, 2026
Three changes from the pre-merge Opus review:

**MUST-FIX** — XPC_SERVICE_NAME false-positive on macOS Terminal

macOS launchd sets `XPC_SERVICE_NAME` in EVERY Terminal-spawned shell, not
just real services. Typical noise values: `"0"` (truthy in Python!) and
`"application.com.apple.Terminal.<UUID>"`. A bare `os.environ.get(name)`
existence check would auto-promote interactive `./start.sh` runs to
foreground mode on every Mac dev machine — silently breaking the most
common installation path (no /health probe, no browser open, no log file,
hanging shell).

Fix: new `_is_real_supervisor_value()` helper that filters noise. For
`XPC_SERVICE_NAME` specifically, reject `"0"` and any `"application.*"`
prefix. Real launchd plists use reverse-DNS Label form (`com.<rdns>.<svc>`)
which still triggers correctly.

7 new tests in `TestXPCServiceNameNoiseFilter`:
- 4 noise values (`0`, Terminal.app, iTerm2, VSCode) → no detection
- 3 real Label forms → correct detection
- Mixed env with XPC noise + real INVOCATION_ID → falls through to systemd

**SHOULD-FIX 1** — Test env leakage

The original `clean_env` fixture stripped supervisor-detection env vars
but not the resolved bootstrap vars (HERMES_WEBUI_HOST/PORT/AGENT_DIR)
that `main()` mutates onto `os.environ`. After
`test_foreground_exports_resolved_env_vars` ran, later tests would import
bootstrap with polluted defaults (DEFAULT_HOST="0.0.0.0" instead of
"127.0.0.1"). Existing assertions still passed (tautological vs DEFAULT_*),
but it was a footgun for future tests.

Fix: extend `clean_env` to also `delenv` the three resolved vars before
each test.
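A minimal sketch of the leak fix, as a plain helper the `clean_env` fixture could call via `monkeypatch.delenv`. The three resolved vars are the ones named above; the supervisor-var list is illustrative:

```python
# The resolved bootstrap vars main() writes back onto os.environ
# (names from the commit message above).
RESOLVED_VARS = ("HERMES_WEBUI_HOST", "HERMES_WEBUI_PORT", "HERMES_WEBUI_AGENT_DIR")
# Illustrative subset of supervisor-detection vars.
SUPERVISOR_VARS = ("XPC_SERVICE_NAME", "INVOCATION_ID")

def strip_leaky_vars(env: dict) -> None:
    # Remove detection vars AND the resolved vars, so no later test
    # imports bootstrap with polluted defaults like
    # DEFAULT_HOST="0.0.0.0".
    for var in SUPERVISOR_VARS + RESOLVED_VARS:
        env.pop(var, None)
```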

**SHOULD-FIX 2** — Pre-execv executability guard

If `discover_launcher_python` returns a path that doesn't exist or isn't
executable, `os.execv` raises OSError → wrapper catches → SystemExit(1)
→ supervisor restarts → loop forever. That's exactly the failure mode
this PR is supposed to eliminate.

Fix: `os.access(python_exe, os.X_OK)` check before execv. Converts
infinite supervisor loop into a single visible RuntimeError.
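The guard can be sketched like this; the wrapper function name and argv shape are assumptions, only the `os.access(..., os.X_OK)` check before `os.execv` is from the commit:

```python
import os

def exec_foreground(python_exe: str, argv: list[str]) -> None:
    """Sketch of the pre-execv guard (function name assumed).

    Without this check, a missing or non-executable interpreter makes
    os.execv raise OSError, the wrapper exits 1, and the supervisor
    restarts it forever. Failing loudly first breaks that loop.
    """
    if not os.access(python_exe, os.X_OK):
        raise RuntimeError(f"launcher python is not executable: {python_exe}")
    os.execv(python_exe, [python_exe, *argv])
```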

1 new test in `TestForegroundExecutabilityGuard` pinning that the guard
fires before execv when the python path is non-executable.

**Docs** — supervisor.md updates

- New section explaining the XPC_SERVICE_NAME noise filter and what
  values trigger / don't trigger detection
- New section listing supervisors that are NOT auto-detected (runit,
  daemontools, PM2, Foreman/Honcho, custom shell-script supervisors)
  with explicit recommendation to set HERMES_WEBUI_FOREGROUND=1

Verification

- 3820 tests pass (+9 from this commit's new tests vs the original PR
  push of 3811)
- Filter manually verified end-to-end with the live os.environ:
  XPC=0 → None, XPC=application.* → None, XPC=com.example.foo → triggers
- run-browser-tests.sh ALL CHECKS PASSED on the worktree

Items deferred from the Opus review

- #4 chdir target may not exist: REPO_ROOT comes from __file__.resolve()
  so it's stable; not a real concern in practice
- #6 two startup messages in foreground mode: cosmetic, useful for
  diagnostics
- #7 stricter explicit-only mode: leaves user the override of just not
  passing --foreground (current behavior)
- #8 test stub return value: trivial, can fix later if regression surface
- #9 argparse positional-after-option ordering: test reads fine

These can be follow-up issues if anyone hits them.
roadhero referenced this pull request in fox-in-the-box-ai/hermes-webui May 5, 2026
Issue nesquena#11 Option B — augments the existing 3-step button wizard
without replacing it.

- Skip CTA on every step. POST /api/setup/skip marks onboarding
  complete without collecting an API key. Provides an exit hatch
  for users who'll configure providers later from Settings, or
  who'll rely on local Ollama (nesquena#66).
- Externalized welcome text. Read from /data/config/onboarding.md
  via GET /api/setup/welcome; default copy ships from the
  default-configs sweep. Edit the file in place to customize the
  greeting per install.
- State alignment. /api/setup/{complete,skip} now write both
  onboarding.json (the redirect middleware's flag) AND
  settings.json:onboarding_completed (the hermes-webui settings
  surface). onboarding_complete() reads either, so future code
  paths flipping settings.json directly are honoured.
- Local Ollama fast-path on Welcome (depends on nesquena#66). When a
  host-side Ollama daemon is detected with at least one model
  installed, the Welcome step surfaces a "Use <model name>" CTA
  that activates the model server-side and drops the user
  straight into chat — no API key step. Falls back gracefully
  to the existing button wizard when Ollama isn't detected.
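The state-alignment bullet above can be sketched as a dual-write / either-read pair; the file names and keys come from the commit message, but the data-dir layout and function bodies are assumptions:

```python
import json
from pathlib import Path

def mark_onboarding_complete(data_dir: Path) -> None:
    # Write BOTH flags: the redirect middleware's file and the
    # settings surface, so they never disagree after completion.
    (data_dir / "onboarding.json").write_text(json.dumps({"completed": True}))
    settings_path = data_dir / "settings.json"
    settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}
    settings["onboarding_completed"] = True
    settings_path.write_text(json.dumps(settings))

def onboarding_complete(data_dir: Path) -> bool:
    # Either flag counts, so code flipping settings.json directly
    # is honoured.
    for path, key in ((data_dir / "onboarding.json", "completed"),
                      (data_dir / "settings.json", "onboarding_completed")):
        if path.exists() and json.loads(path.read_text()).get(key):
            return True
    return False
```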

Conversational onboarding "regardless of provider status" is
tracked separately as nesquena#69 (depends on #9 — llama.cpp fallback).
roadhero referenced this pull request in fox-in-the-box-ai/hermes-webui May 5, 2026
Bridges the v0.4.0 download manager (#10), the bundled llama.cpp
binary (Dockerfile change), and the existing streaming.py
fallback_model machinery. When the user opts into Settings →
Providers → "Local fallback" and the GGUF is on disk, transient
remote-provider failures (5xx, 429, connection errors) silently
retry through the on-device model — no chat interruption.

api/local_fallback.py (new):
- is_enabled() / set_enabled() — single bool in settings.json,
  mirrors the existing onboarding_completed pattern (#11 alignment).
- supervisor_status() / start_llama_server() / stop_llama_server() —
  thin wrappers around `supervisorctl`. Soft-fail with status
  NO_SUPERVISOR when the binary isn't on PATH (upstream Hermes,
  dev environments) so the rest of the module still works.
- get_fallback_endpoint() — returns the routing dict the streaming.py
  hook reads, or None if the user is opted out OR the server isn't
  RUNNING. Two-condition gate prevents silent failover to a
  non-functional local model.
- should_failover() classifier — 5xx + 429 + connection-error
  substrings are eligible; 401/403/404 + auth/quota/billing/
  model-not-found substrings are NEVER eligible (config errors must
  not be masked by a local model). NEVER list takes precedence.
- enable() / disable() — orchestration. enable() persists the flag,
  triggers #10's download_start, and either starts llama-server
  immediately (model already on disk) or spawns a background watcher
  that starts it once the file lands (handles 2.5 GB long downloads
  without blocking the toggle-on POST). disable() persists the flag
  and stops llama-server; preserves the downloaded model file.
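The `should_failover()` classifier described above can be sketched as follows; the precedence rule is from the commit message, while the exact substring lists and the signature are assumptions:

```python
from typing import Optional

NEVER_STATUSES = {401, 403, 404}
NEVER_SUBSTRINGS = ("auth", "quota", "billing", "model not found", "model-not-found")
CONNECTION_SUBSTRINGS = ("connection", "timed out", "timeout")

def should_failover(status: Optional[int], message: str) -> bool:
    """Sketch: 5xx + 429 + connection errors are eligible; the NEVER
    list takes precedence, because config errors must not be masked
    by silently answering from a local model."""
    msg = message.lower()
    if status in NEVER_STATUSES or any(s in msg for s in NEVER_SUBSTRINGS):
        return False
    if status == 429 or (status is not None and 500 <= status < 600):
        return True
    return any(s in msg for s in CONNECTION_SUBSTRINGS)
```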

api/streaming.py:
- Surgical 12-line addition just after the existing fallback_model
  resolution at line 1768. If local_fallback enabled AND the live
  endpoint is up, override _fallback_resolved with the on-device
  config. The agent's existing rate-limit-recovery / 5xx-retry path
  uses fallback_model, so we get failover for free without touching
  the exception classifier or retry orchestration. Failure modes are
  silent (try/except, falls through to original config-driven
  fallback or none).
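The shape of that surgical hook might look like the sketch below; `get_fallback_endpoint` is the real module function named above, while the wrapper name and the exact resolution variable are assumptions about the surrounding streaming.py code:

```python
def apply_local_fallback(fallback_resolved, get_fallback_endpoint):
    """Sketch of the post-resolution override (wrapper name assumed)."""
    try:
        local = get_fallback_endpoint()  # None unless opted in AND server RUNNING
        if local is not None:
            return local  # override the config-driven fallback
    except Exception:
        pass  # failure modes are silent: fall through to original config
    return fallback_resolved
```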

api/config.py:
- Adds `local_fallback_enabled` to _SETTINGS_DEFAULTS (default False)
  AND _SETTINGS_BOOL_KEYS (so save_settings preserves it). Without
  these the toggle would silently drop on save — caught during
  Phase 4 testing.
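Why both tables matter can be shown with a minimal sketch; `save_settings` internals here are an assumption, illustrating only that non-allowlisted keys drop on save:

```python
_SETTINGS_DEFAULTS = {"local_fallback_enabled": False}  # plus existing keys
_SETTINGS_BOOL_KEYS = {"local_fallback_enabled"}

def save_settings(current: dict, incoming: dict) -> dict:
    """Minimal sketch (assumed): only allowlisted bool keys survive a
    save, which is why the new key must appear in BOTH tables."""
    merged = dict(_SETTINGS_DEFAULTS)
    merged.update(current)
    for key in _SETTINGS_BOOL_KEYS:
        if key in incoming:
            merged[key] = bool(incoming[key])
    return merged
```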

api/routes.py:
- GET /api/local-fallback/status (full snapshot)
- POST /api/local-fallback/enable
- POST /api/local-fallback/disable

UI (static/index.html + panels.js):
- New "Local fallback" tile in Settings → Providers, below the Local
  Ollama tile. Status dot (gray=off, green=running/downloading),
  one-line status text, native checkbox toggle.
- _refreshLocalFallback() polls /api/local-fallback/status every 2.5s
  while Settings is open — catches downloading→ready state transitions
  without forcing the user to refresh the page.
- ui_state-driven copy:
    disabled         → "Off — toggle on to download (~2.5 GB) and …"
    needs-download   → "Will start downloading (~2.5 GB) …"
    downloading      → "Downloading 1.2 GB / 2.5 GB (48%)" (live)
    starting/warming → "Starting llama-server…"
    ready            → "Ready — your provider failures will silently…"
    no-supervisor    → "Supervisor not available — only works in Fox"

Verified e2e against a freshly built test container with a deliberately
broken MODEL_URL_PHI4MINI:
- Toggle OFF → ON: enabled flips, model download starts (status
  reaches 'failed' because of the bogus URL — that's the test signal),
  ui_state correctly transitions disabled → downloading → needs-download
- Toggle ON → OFF: enabled flips back, settings.json persisted,
  supervisorctl status confirms llama-server STOPPED
- Settings.json contains "local_fallback_enabled": false after the
  off-toggle (persistence verified)

What's NOT in this commit (deferred to a follow-up):
- Reactive modal for users who haven't opted in when a chat fails.
  Today they see the existing remote-provider error; a small UX
  improvement would offer "Try local model? (one-time download)".
  Documented in #9's Phase 1 plan as a v0.4.2 polish item.
- Recovery banner ("Remote provider is back — switch?") — same
  reasoning, polish for v0.4.2.
roadhero referenced this pull request in fox-in-the-box-ai/hermes-webui May 5, 2026
…l + banner) (#3)

* feat(tailscale): power-user fields in Settings advanced accordion (nesquena#96 phase 2)

Surfaces the 6 power-user Tailscale flags that nesquena#96 phase 1's argv
builder already accepted but the UI didn't expose:

  - Login server (custom control plane URL — headscale, on-prem)
  - Advertise routes (subnet-router CIDRs)
  - Advertise tags (ACL identity)
  - Exit node (route all traffic via a peer)
  - Accept routes (consume peers' subnet routes)
  - Accept DNS (MagicDNS, default on)

Closes Persona 3 of nesquena#96.

Backend:
- api/config.py: 6 new keys in _SETTINGS_DEFAULTS, the two booleans
  added to _SETTINGS_BOOL_KEYS allowlist
- api/tailscale.py: _load_persisted_opts() reads them as the
  body-shape that _build_up_argv() expects; get_status() exposes
  current values under config{}; start_up() merges body opts on
  top of persisted (body wins per-key, empty string falls through)
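The per-key merge in `start_up()` can be sketched like this; the helper name is an assumption, the semantics (body wins per-key, empty string falls through to persisted) are from the bullet above:

```python
def merge_up_opts(persisted: dict, body: dict) -> dict:
    """Sketch (helper name assumed): request-body opts win per-key,
    but an empty string in the body falls through to the persisted
    settings.json value instead of clobbering it."""
    merged = dict(persisted)
    for key, value in body.items():
        if value == "":
            continue  # empty string falls through to persisted
        merged[key] = value
    return merged
```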

Frontend:
- static/index.html: expand Advanced accordion in the Tailscale
  tile with 4 text inputs, 2 checkboxes, Save button, status line
- static/panels.js: _tsPopulateAdvanced() pre-fills from
  state.config on Settings load; tsSaveAdvanced() POSTs all 6
  values to the existing /api/settings endpoint (which already
  validates types via _SETTINGS_BOOL_KEYS / allowlist)

Persistence is via settings.json — same store as everything else.
Power user can edit, Save, Connect; values survive container
restarts and pre-populate next session.

Phase 1 of v0.4.6.

* feat(fallback): reactive modal + recovery banner (#9 polish)

Two pieces of UX layered on top of v0.4.1's silent failover:

1. Reactive modal — when a chat stream fails on a remote provider AND
   the user has NOT opted into local fallback, surface a one-time
   modal offering to enable it. Fires only on error types where local
   would actually help (stream_interrupted, rate_limit, no_response,
   unknown). Skips eligibility-killers like auth_mismatch /
   model_not_found / quota_exhausted that local can't fix.

2. Recovery banner — when local fallback IS enabled, periodically poll
   /api/local-fallback/remote-health (new endpoint that probes
   openrouter.ai/api/v1/models with a 5s timeout, 30s in-process
   cache). When remote is reachable again, show a top banner offering
   to switch off local fallback.

Architecture:
- Surgical one-line dispatch in messages.js apperror handler:
  window.dispatchEvent(new CustomEvent('fitb:stream-error', {detail}))
- New fallback-polish.js IIFE listens for the event, talks to
  /api/local-fallback/{status,enable,disable,remote-health}
- sessionStorage flags for "modal seen" / "banner dismissed" so the
  UI doesn't pester. Reset on page reload.
- New CSS in fox-in-the-box.css styles the modal + banner using
  existing CSS vars (theme-aware).

Backend:
- local_fallback.get_remote_health() — urllib.request, 5s timeout,
  30s in-process cache to keep multi-tab polling cheap. Returns
  {remote_healthy, tested_url, error}. OpenRouter is the FITB default
  provider; its public /models endpoint is the canonical "internet
  works and primary provider is up" probe.
- routes.py — wires GET /api/local-fallback/remote-health.
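The probe-with-cache can be sketched as below; the return shape, the 5s timeout, the 30s cache, and the OpenRouter /models URL are from this commit, while the cache structure is an assumption (and this sketch caches one result process-wide rather than per-URL):

```python
import time
import urllib.request

_CACHE = {"ts": 0.0, "result": None}
_CACHE_TTL = 30.0  # seconds; keeps multi-tab polling cheap

def get_remote_health(url="https://openrouter.ai/api/v1/models", timeout=5):
    """Sketch of local_fallback.get_remote_health(); internals assumed."""
    now = time.time()
    if _CACHE["result"] is not None and now - _CACHE["ts"] < _CACHE_TTL:
        return _CACHE["result"]
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            result = {"remote_healthy": resp.status == 200,
                      "tested_url": url, "error": None}
    except Exception as exc:
        result = {"remote_healthy": False, "tested_url": url, "error": str(exc)}
    _CACHE.update(ts=now, result=result)
    return result
```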

Phase 2 of v0.4.6.

---------

Co-authored-by: Dennis <[email protected]>


Development

Successfully merging this pull request may close these issues.

MiniMax newest models not appearing in WebUI dropdown

1 participant