
fix(api): resolve model provider from config to prevent misrouting#4

Merged
nesquena merged 2 commits into nesquena:master from deboste:fix/model-provider-routing
Apr 2, 2026

Conversation

@deboste
Contributor

@deboste deboste commented Mar 31, 2026

When the model dropdown sends a prefixed ID like "anthropic/claude-xxx", AIAgent interprets the provider/model format as an OpenRouter path and routes through OpenRouter instead of the direct Anthropic API.

Fix: read the configured provider from config.yaml model section. If the model ID starts with the configured provider name followed by "/", strip that prefix and pass the provider explicitly to AIAgent. This ensures direct API providers (Anthropic, OpenAI, etc.) are used when configured, regardless of the model ID format from the dropdown.

Owner

@nesquena nesquena left a comment


Review: PR #4 — model provider routing fix

Problem Statement

Valid concern — when the dropdown sends a prefixed ID like anthropic/claude-sonnet-4.6, it could be misinterpreted as an OpenRouter model path. This is a real issue worth fixing.

Issues Found

1. provider parameter may not exist on AIAgent (Critical)

Per ARCHITECTURE.md (lines 566-574), AIAgent's documented constructor parameters are: model, platform, quiet_mode, enabled_toolsets, session_id, stream_delta_callback, tool_progress_callback. There is no provider parameter. Without access to the hermes-agent source to verify, this could either silently pass (if AIAgent uses **kwargs) or raise a TypeError at runtime.

Can you confirm this parameter exists in your hermes-agent version?

2. Wrong provider for cross-provider model selection (Logic Bug)

The fix always passes the globally configured provider to AIAgent, even when the user selects a model from a different provider. Example:

  • Config has provider: anthropic
  • User selects openai/gpt-5.4-mini from the dropdown
  • The prefix doesn't match, so the model stays as openai/gpt-5.4-mini
  • But provider='anthropic' is still passed to AIAgent
  • This would try to route an OpenAI model through the Anthropic API

The provider should be inferred from the model ID being selected, not from the global config.

3. Duplicated logic

The same provider-resolution block is copy-pasted in both routes.py (sync path) and streaming.py (async path). This should be a shared helper in api/config.py.

4. Variable naming style

Using underscore-prefixed names (_model, _prov, _mc, _hcfg) for local variables is unconventional — in Python, underscore prefix typically means module-private or 'unused'. Regular names like resolved_model and resolved_provider would be clearer.

Suggested Approach

A cleaner fix would be to add a helper in api/config.py:

def resolve_model_and_provider(model_id):
    """Strip provider prefix from model ID and return (model, provider)."""
    model_cfg = cfg.get('model', {})  # cfg: the module-level config loaded from config.yaml
    configured_provider = model_cfg.get('provider') if isinstance(model_cfg, dict) else None
    if configured_provider and '/' in model_id and model_id.startswith(configured_provider + '/'):
        return model_id.split('/', 1)[1], configured_provider
    return model_id, None

Then both routes.py and streaming.py can call it. But the cross-provider issue (point 2) still needs to be addressed — likely by parsing the provider from the model ID prefix itself, not from the global config.
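
A cross-provider-aware variant could infer the provider from the model ID prefix itself, as suggested in point 2. This is only a sketch: the `KNOWN_PROVIDERS` set and the exact function signature are assumptions, not the project's actual API.

```python
# Hypothetical provider list; the real set would come from the app's
# provider registry or config, not a hardcoded constant.
KNOWN_PROVIDERS = {'anthropic', 'openai', 'google', 'mistral'}

def resolve_model_and_provider(model_id, default_provider=None):
    """Return (model, provider), inferring the provider from the ID prefix."""
    if '/' in model_id:
        prefix, rest = model_id.split('/', 1)
        if prefix in KNOWN_PROVIDERS:
            # Known direct-API prefix: strip it and use it as the provider.
            return rest, prefix
    # No recognizable prefix: keep the ID and fall back to the config default.
    return model_id, default_provider
```

With this shape, `openai/gpt-5.4-mini` resolves to the OpenAI provider even when the config default is `anthropic`, which avoids the cross-provider misrouting described above.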

Tests

Ran full test suite: 201 passed, 23 failed — all 23 failures are pre-existing. No regressions.

Verdict

The concept is right but the implementation has a logic bug with cross-provider model selection and a potential runtime error with the provider parameter. Requesting changes before merge.

Owner

@nesquena nesquena left a comment


Update on point #1 from my review: I verified against the upstream hermes-agent source. AIAgent.__init__ does accept provider as a keyword argument (line 469 of run_agent.py). So that concern is resolved. Apologies for the incorrect call.

The remaining issues still stand:

  • Cross-provider model selection bug (point #2): passing the global config provider even when the user selects a model from a different provider
  • Duplicated logic (point #3): should be a shared helper

…outing

Replace duplicated inline provider resolution in routes.py and streaming.py
with a shared resolve_model_provider() helper in config.py.

Improvements over original:
- If model ID has a prefix matching any known direct-API provider
  (not just the config provider), strip it and route correctly.
  This handles edge cases like localStorage restoring a model from
  a different provider group.
- Single source of truth for the resolution logic.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Owner

@nesquena nesquena left a comment


Thanks @deboste — you identified a real and important routing issue here. I pushed a follow-up commit that:

  1. Extracts the resolution logic into a shared resolve_model_provider() helper in config.py (eliminates the duplication between routes.py and streaming.py)
  2. Extends the prefix matching to handle any known direct-API provider, not just the config provider — this covers edge cases like localStorage restoring a model ID from a different provider group

With those improvements in place, this is ready to merge. Thanks again for the solid bug find!

@nesquena nesquena merged commit b0da4db into nesquena:master Apr 2, 2026
Ola-Turmo pushed a commit to Ola-Turmo/hermes-webui that referenced this pull request Apr 9, 2026
fix(api): resolve model provider from config to prevent misrouting
nesquena-hermes pushed a commit that referenced this pull request Apr 12, 2026
- Title/badge/source: 2025 → 2026, sources updated to Chatbot Arena/BenchLM
- Overall: Opus 4.6 (#1, 1504 Arena Elo), Gemini 3.1 Pro (#2), GPT-5.4 (#3),
  Sonnet 4.6 (#4), DeepSeek V3.2 (#5, replaces DeepSeek R1)
- Coding: Opus 4.6 (#1, powers Claude Code/Cursor), GPT-5.4 (#2, SWE-Pro 57.7%),
  Sonnet 4.6 (#3), Gemini 3.1 Pro (#4), DeepSeek V3.2 (#5)
  Removed: Sonnet 4.5, Gemini 2.5 Pro, GPT-5.3 Codex
- Writing: Opus 4.6 (#1, Mazur 8.53), Gemini 3.1 Pro (#2, Arena CW 1487),
  Sonnet 4.6 (#3, EQ-Bench CW 1936), GPT-5.4 (#4), Meta Muse Spark (#5)
  Removed: Llama 4 Maverick (replaced by Meta Muse Spark)
- Search: Gemini 3.1 Pro (#1), Grok 4 (#2, live X data), Sonnet 4.6+tool (#3),
  GPT-5.4 (#4), Gemini 3 Flash Thinking (#5)
  Added Grok 4 for real-time social/X data
- Reasoning: Gemini 3.1 Pro (#1, GPQA 95.45%), GPT-5.4 (#2), Opus 4.6 (#3),
  Gemini 3 Flash Thinking (#4, best value), DeepSeek V3.2 (#5)
  Removed: Kimi K2, DeepSeek R1 standalone
- Quick picker: all model names updated to 2026 versions
- Setup boxes: OpenAI gpt-5-4 names, Google gemini-3-1-pro-preview,
  self-hosted section updated to DeepSeek V3.2 + Muse Spark
nesquena-hermes pushed a commit that referenced this pull request Apr 12, 2026
Coding section:
- Gemini 3.1 Pro: add Terminal-Bench 78.4% (highest of any frontier model on CLI/DevOps)
  Score badge updated to show Terminal-Bench rather than SWE-bench Verified
- GPT-5.4: note Terminal-Bench 75.1% in description, consolidate pill text
- #5 DeepSeek V3.2 → Qwen 3.6-Plus: leads Terminal-Bench at 61.6%, 88.2% GPQA,
  1M context, available now on Alibaba Cloud + OpenRouter

Writing section (reordered based on EQ-Bench CW scores):
- #1 Claude Sonnet 4.6 (1936 EQ-Bench CW — highest, best voice consistency)
- #2 Claude Opus 4.6 (Mazur 8.53, IF Arena #1, 1M context for literary depth)
- #3 Gemini 3.1 Pro (Arena CW #1 1487, AI-tell avoidance, 2M context)
- #4 GPT-5.4 (noted as ~9th on Arena CW, better for structured/commercial writing)
- #5 Meta Muse Spark → Kimi K2.5 ($0.60/.50, ~1700 EQ-Bench CW, live API)
  Muse Spark removed — no commercial API available yet

Reasoning section:
- Gemini 3.1 Pro GPQA: 95.45% → 94.1% (more conservative/recent figure, consistent
  with both agents' data)
- Added ARC-AGI-2 77.1% for Gemini 3.1 Pro (#1 on visual reasoning too)
- Opus 4.6: added note that Sonnet leads GDPval-AA (1633 Elo #1) for throughput
- #5 DeepSeek V3.2 → Qwen 3.6-Plus (88.2% GPQA, 1M context, same model as coding)

Quick picker:
- Creative writing: Opus → Sonnet 4.6 (EQ-Bench #1, 85% cheaper)
- Hard reasoning: 95.45% → 94.1%, add ARC-AGI-2 mention
- Budget pick: DeepSeek V3.2 → Gemini 3 Flash Thinking ($0.50/1M, 89.8% GPQA)

Setup boxes:
- Self-hosted: Muse Spark → Qwen 3.6-Plus + Gemma 4 26B MoE (Apache 2.0,
  82.3% GPQA with 3.8B active params, best edge/self-hosted reasoning)

Overall section: unchanged (top 5 still correct per both agents)
Search section: unchanged (no new data from either agent)
nesquena-hermes pushed a commit that referenced this pull request Apr 14, 2026
index.html:
- Hero badge: 56k+ → 77k+ GitHub stars (NousResearch/hermes-agent: 76,811)

community/index.html:
- All 24 project star counts updated to live GitHub values (April 13 2026)
  Notable bumps: gbrain 4.8k→7.4k, hermes-webui 1.3k→1.8k, hermes-agent 56k+→77k,
  hermes-workspace 1.1k→1.3k, hudui 571→827, self-evolution 893→1.6k
- Removed awizermann/scarf (repo no longer exists on GitHub)
- Added 3 new high-value projects:
  - vectorize-io/hindsight (9.1k ⭐) → Memory Providers: Official Hermes memory plugin,
    biomimetic 3-tier memory, topped LongMemEval Jan 2026
  - rohitg00/agentmemory (1.3k ⭐) → Memory Providers: #1 persistent memory for agents,
    explicit Hermes integration path, 43 MCP tools
  - builderz-labs/mission-control (4.1k ⭐) → Tools & Orchestration: self-hosted agent
    platform with Hermes gateway integration
- Updated project count: 25 → 27

models/index.html:
- Gemini 3.1 Pro: bumped from #4 Arena (1492 Elo) to #1 (1505 Elo)
- Claude Opus 4.6: updated from #1 (1504) to #2 (1503 Elo) — still top tier
- Updated datestamp to April 13, 2026
JKJameson pushed a commit to JKJameson/hermes-webui that referenced this pull request Apr 25, 2026
fix(api): resolve model provider from config to prevent misrouting
zhonghuaY referenced this pull request in zhonghuaY/hermes-webui Apr 27, 2026
…, CLI busy

Re-applied #4 + nesquena#6 against upstream v0.50.217:

- Add /api/gateway-status endpoint returning instance statuses
- Add status dot CSS (.status-dot variants + .stream-cursor + accent rule)
- Add gatewayDot inside composer-model-chip (replaces removed topbar modelChip)
- Append per-session .status-dot in renderSessionListFromCache with
  data-session-id, data-cli-session, data-updated-at metadata
- Per-conversation accent via _hashHue() → --conv-accent CSS var
- Append pollGatewayStatus() (10s) + updateSessionDots() (3s) + setBusy hook
  to static/ui.js — recognizes both composerModelChip and modelChip ids
- get_gateway_instance_statuses() helper in api/gateway_provider.py

Streaming-cursor injection in messages.js was skipped because upstream
replaced the simple Thinking placeholder with smd-based incremental
rendering; upstream's session-state-indicator already covers that role.

Co-authored-by: Copilot <[email protected]>
nesquena-hermes pushed a commit that referenced this pull request May 2, 2026
Three changes from the pre-merge Opus review:

**MUST-FIX** — XPC_SERVICE_NAME false-positive on macOS Terminal

macOS launchd sets `XPC_SERVICE_NAME` in EVERY Terminal-spawned shell, not
just real services. Typical noise values: `"0"` (truthy in Python!) and
`"application.com.apple.Terminal.<UUID>"`. A bare `os.environ.get(name)`
existence check would auto-promote interactive `./start.sh` runs to
foreground mode on every Mac dev machine — silently breaking the most
common installation path (no /health probe, no browser open, no log file,
hanging shell).

Fix: new `_is_real_supervisor_value()` helper that filters noise. For
`XPC_SERVICE_NAME` specifically, reject `"0"` and any `"application.*"`
prefix. Real launchd plists use reverse-DNS Label form (`com.<rdns>.<svc>`)
which still triggers correctly.

7 new tests in `TestXPCServiceNameNoiseFilter`:
- 4 noise values (`0`, Terminal.app, iTerm2, VSCode) → no detection
- 3 real Label forms → correct detection
- Mixed env with XPC noise + real INVOCATION_ID → falls through to systemd

**SHOULD-FIX 1** — Test env leakage

The original `clean_env` fixture stripped supervisor-detection env vars
but not the resolved bootstrap vars (HERMES_WEBUI_HOST/PORT/AGENT_DIR)
that `main()` mutates onto `os.environ`. After
`test_foreground_exports_resolved_env_vars` ran, later tests would import
bootstrap with polluted defaults (DEFAULT_HOST="0.0.0.0" instead of
"127.0.0.1"). Existing assertions still passed (tautological vs DEFAULT_*),
but it was a footgun for future tests.

Fix: extend `clean_env` to also `delenv` the three resolved vars before
each test.
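
The shape of that cleanup can be sketched as a plain env-scrubbing helper; the variable names follow the commit message (the full AGENT_DIR name is an assumption), and the real fixture uses pytest's monkeypatch rather than a dict copy.

```python
# Detection vars stripped by the original fixture, plus the resolved
# bootstrap vars that main() mutates onto os.environ.
SUPERVISOR_VARS = ('XPC_SERVICE_NAME', 'INVOCATION_ID')
RESOLVED_VARS = ('HERMES_WEBUI_HOST', 'HERMES_WEBUI_PORT', 'HERMES_WEBUI_AGENT_DIR')

def scrub_env(env):
    """Drop both detection vars and resolved bootstrap vars from an env dict."""
    blocked = set(SUPERVISOR_VARS) | set(RESOLVED_VARS)
    return {k: v for k, v in env.items() if k not in blocked}
```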

**SHOULD-FIX 2** — Pre-execv executability guard

If `discover_launcher_python` returns a path that doesn't exist or isn't
executable, `os.execv` raises OSError → wrapper catches → SystemExit(1)
→ supervisor restarts → loop forever. That's exactly the failure mode
this PR is supposed to eliminate.

Fix: `os.access(python_exe, os.X_OK)` check before execv. Converts
infinite supervisor loop into a single visible RuntimeError.
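
The guard can be sketched as follows; `exec_launcher` is a hypothetical wrapper name, not the repo's actual function.

```python
import os

def exec_launcher(python_exe, argv):
    """Replace the current process with python_exe, failing loudly on a bad path."""
    if not os.access(python_exe, os.X_OK):
        # Without this guard, os.execv raises OSError, the wrapper exits 1,
        # and the supervisor restarts the process in an infinite loop.
        raise RuntimeError(f'launcher python is not executable: {python_exe}')
    os.execv(python_exe, [python_exe, *argv])
```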

1 new test in `TestForegroundExecutabilityGuard` pinning that the guard
fires before execv when the python path is non-executable.

**Docs** — supervisor.md updates

- New section explaining the XPC_SERVICE_NAME noise filter and what
  values trigger / don't trigger detection
- New section listing supervisors that are NOT auto-detected (runit,
  daemontools, PM2, Foreman/Honcho, custom shell-script supervisors)
  with explicit recommendation to set HERMES_WEBUI_FOREGROUND=1

Verification

- 3820 tests pass (+9 from this commit's new tests vs the original PR
  push of 3811)
- Filter manually verified end-to-end with the live os.environ:
  XPC=0 → None, XPC=application.* → None, XPC=com.example.foo → triggers
- run-browser-tests.sh ALL CHECKS PASSED on the worktree

Items deferred from the Opus review

- #4 chdir target may not exist: REPO_ROOT comes from __file__.resolve()
  so it's stable; not a real concern in practice
- #6 two startup messages in foreground mode: cosmetic, useful for
  diagnostics
- #7 stricter explicit-only mode: leaves user the override of just not
  passing --foreground (current behavior)
- #8 test stub return value: trivial, can fix later if regression surface
- #9 argparse positional-after-option ordering: test reads fine

These can be follow-up issues if anyone hits them.
nesquena-hermes pushed a commit that referenced this pull request May 3, 2026
…n guard)

CHANGELOG, ROADMAP, TESTING bumped (3925 → 3929 tests collected).

Opus SHOULD-FIX absorbed in-release: tests #1-3 documented the dedup
contract via direct construction but did not invoke get_models_grouped().
Test #4 (test_get_models_grouped_unconfigured_providers_get_independent_dicts)
inspects the live source for the literal copy.deepcopy(auto_detected_models)
call AND runs an end-to-end smoke of the fixed assignment loop.

A future refactor that removes the deepcopy at api/config.py:2078 will
fail this test immediately.
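
A toy illustration of the aliasing that deepcopy guards against, with made-up model names; assigning the same dict to several provider groups lets a mutation through one group leak into the others.

```python
import copy

# Shared reference: every provider key points at the SAME dict object.
auto_detected = {'claude-x': {'ctx': 200000}}
shared = {p: auto_detected for p in ('anthropic', 'openai')}
shared['anthropic']['leaked'] = {}
assert 'leaked' in shared['openai']  # mutation is visible through the alias

# Independent copies: mutations stay local to one provider group.
independent = {p: copy.deepcopy(auto_detected) for p in ('anthropic', 'openai')}
independent['anthropic']['local-only'] = {}
assert 'local-only' not in independent['openai']
```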
nesquena-hermes added a commit that referenced this pull request May 4, 2026
SHOULD-FIX #1 (renamed-root client cross-alias): drop strict-equality client
filter at static/sessions.js:1853. Server-side _profiles_match cross-aliases
'default'-tagged rows to a renamed root 'kinni'; the strict-equality client
would reject them, dropping every legacy session for renamed-root users. The
server is now solely authoritative for profile scoping.

SHOULD-FIX #2 (messaging-source dedupe ordering): _keep_latest_messaging_session_per_source
now runs AFTER the profile filter at api/routes.py:2078. Before, it ran on
the merged-cross-profile list with profile-blind keys, discarding the older
profile's row across profiles before the scope filter — leaving zero rows for
any messaging identity the active profile shared with another profile.
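
The ordering bug can be shown with a toy version of the dedupe; the helper body and row shapes here are illustrative, not the repo's actual code.

```python
def keep_latest_per_source(rows):
    """Keep only the newest row for each messaging source key."""
    latest = {}
    for row in rows:
        key = row['source']
        if key not in latest or row['updated_at'] > latest[key]['updated_at']:
            latest[key] = row
    return list(latest.values())

rows = [
    {'source': 'telegram:alice', 'profile': 'work', 'updated_at': 2},
    {'source': 'telegram:alice', 'profile': 'default', 'updated_at': 1},
]

# Buggy order: dedupe across ALL profiles first, then filter.  The newer
# 'work' row wins the dedupe, so the 'default' profile sees zero rows
# for the shared identity.
buggy = [r for r in keep_latest_per_source(rows) if r['profile'] == 'default']
assert buggy == []

# Fixed order: filter to the active profile first, then dedupe within it.
fixed = keep_latest_per_source([r for r in rows if r['profile'] == 'default'])
assert len(fixed) == 1 and fixed[0]['updated_at'] == 1
```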

NIT #3: _projects_migrated flag now set only AFTER successful save_projects.
NIT #4: cleaned dead test code in test_is_root_profile_invalidation_drops_stale.
NIT #5: _create_profile_fallback's clone_from=='default' literal now routes
through _is_root_profile() for parity with the 5 other callsites.

+2 regression tests pin the SHOULD-FIX shapes:
- test_keep_latest_messaging_runs_after_profile_filter (source-string ordering)
- test_static_sessions_js_trusts_server_profile_scoping (no client re-filter)

4173 -> 4175 tests pass. 0 regressions.