fix(ui): prevent fuzzy match false positive in model chip after live Nous fetch #1192
nesquena-hermes wants to merge 2 commits into master from
Conversation
…alse positives (#1188)

Step 3 of _findModelInDropdown() used a truncated 'base' (target with the last version segment stripped) as the prefix to match against dropdown options. For 'gpt-5.5', target='gpt.5.5' and base='gpt.5', which incorrectly matched '@nous:openai/gpt-5.4-mini' (norm: 'gpt.5.4.mini') because it starts with 'gpt.5'. The chip would then show 'GPT-5.4 Mini (via Nous)' for a session that stores 'gpt-5.5'.

Fix: use the full target as the prefix when base has meaningful content (length > 4 and base !== target). Only fall back to the shorter base when it is a bare root word ('gpt', 'claude', etc.) where stripping the version segment would be a no-op.

'gpt-5.5' with prefixTarget='gpt.5.5': 'gpt.5.4.mini' does NOT start with 'gpt.5.5' → returns null (correct: no false match). 'gpt' with prefixTarget='gpt' (useBase=true): still finds 'gpt.5.4.mini' via the shorter base → prefix match for bare roots preserved.

Closes #1188
9 tests run the live _findModelInDropdown function via Node so the real regex/normalization rules are exercised (no Python mirror to drift). Two locked-bad cases (gpt-5.5 → gpt-5.4-mini, claude-opus-4.7 → claude-opus-4.6) reproduce on master and pass on the PR. Seven preserved-good cases (bare-root prefix match, exact match, unrelated) ensure the tighter check doesn't regress legit fuzzy lookups. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
nesquena
left a comment
Review — end-to-end ✅ (clean approve, regression test pushed)
What this ships
#1188 — _findModelInDropdown() step-3 fuzzy match was over-broad: stripping the trailing version segment from target (e.g. gpt-5.5 → base gpt.5) and matching against any option that startsWith(base) || includes(base). For gpt.5.5, the live Nous model @nous:openai/gpt-5.4-mini (norm: gpt.5.4.mini) starts with gpt.5 → false match. The chip showed "GPT-5.4 Mini (via Nous)" for a session storing gpt-5.5.
7-line fix in static/ui.js:103-110: use the FULL normalized target as the prefix when base.length > 4 and base !== target. Only fall back to the shorter base when it's a bare root (length ≤ 4) where stripping was effectively a no-op.
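The decision can be sketched as a small predicate. This is an illustrative reconstruction from the review's description, not the literal code in static/ui.js; the function name, variable names, and the version-stripping regex are assumptions.

```javascript
// Sketch of the step-3 prefix choice described above (assumed names/regex).
// Input is the already-normalized target, e.g. 'gpt-5.5' -> 'gpt.5.5'.
function pickPrefix(target) {
  // Strip a trailing version segment: a dot followed by a digit-led chunk.
  const base = target.replace(/\.\d[^.]*$/, '');
  // Fall back to the shorter base only for bare roots ('gpt') or when
  // stripping changed nothing (no version segment to remove).
  const useBase = base.length <= 4 || base === target;
  // Otherwise the full target is the (tight) prefix.
  return useBase ? base : target;
}
```

With this sketch, `pickPrefix('gpt.5.5')` stays `'gpt.5.5'` (tight), while `pickPrefix('gpt.5')` falls back to the bare root `'gpt'`.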
Traced against upstream hermes-agent
Pure WebUI display logic. Zero agent coupling — this is the chip-label resolver in the model picker.
End-to-end trace
The fix's decision matrix:
| Target | base (after strip) | useBase | prefixTarget | Behaviour |
|---|---|---|---|---|
| `gpt-5.5` → `gpt.5.5` | `gpt.5` (len 5) | false | `gpt.5.5` | Tight: `gpt.5.4.mini` doesn't start with `gpt.5.5` → no false match ✅ |
| `gpt-5` → `gpt.5` | `gpt` (len 3, ≤ 4) | true | `gpt` | Loose root match preserved ✅ |
| `gpt` → `gpt` | `gpt` (base === target) | true | `gpt` | Bare root match ✅ |
| `claude-opus-4.6` → `claude.opus.4.6` | `claude.opus.4` (len 13) | false | `claude.opus.4.6` | Tight match ✅ |
| `claude-opus` → `claude.opus` | `claude.opus` (no version, base === target) | true | `claude.opus` | `claude.opus.4.6` matches ✅ |
| `claude-opus-4.7` → `claude.opus.4.7` | `claude.opus.4` (len 13) | false | `claude.opus.4.7` | Won't match `claude.opus.4.6` → null instead of wrong version ✅ |
| `gpt-3.5` → `gpt.3.5` | `gpt.3` (len 5) | false | `gpt.3.5` | `gpt.3.5.turbo` matches; `gpt.4.0` doesn't ✅ |
The `includes` half of the original disjunction is gone. That was the only path that would match, e.g., a user-typed `mini` to `gpt-5.4-mini`; but that is a degenerate case users likely never hit, and dropping it tightens the match. ✅
Bug-confirmed harness — 9/9 PR, 7/9 master
Built a Node behavioural harness running the live _findModelInDropdown against fake <select> options:
PASS [gpt-5.5 should NOT match gpt-5.4-mini (issue #1188)] → null
PASS [gpt-5.5 finds @nous:openai/gpt-5.5 (exact post-norm prefix)]
PASS [gpt finds gpt-5.4-mini (bare root prefix)]
PASS [gpt-5 finds gpt-5.4-mini (base=gpt is bare root)]
PASS [claude-opus-4.7 should NOT match claude-opus-4.6]
PASS [claude finds claude-opus-4.6 (bare root)]
PASS [claude-opus finds claude-opus-4.6 (base===target since no version)]
PASS [exact match short-circuits]
PASS [unrelated target returns null]
On master, two fail: gpt-5.5 → @nous:openai/gpt-5.4-mini and claude-opus-4.7 → claude-opus-4.6. Exactly the issue's described over-match shape.
What I pushed — 6126552
The PR didn't add a regression test. I added tests/test_issue1188_fuzzy_match.py — 9 tests running the live function via Node so the real regex/normalization rules are exercised (no Python mirror to drift). 2 tests lock the over-match cases as None; 7 lock the preserved-good cases (bare-root prefix, exact short-circuit, unrelated). CI re-ran green on 6126552.
Edge-case trace
| Scenario | Pre-fix | Post-fix |
|---|---|---|
| `gpt-5.5` session, dropdown has `gpt-5.4-mini` only | shows wrong model | returns null (chip falls through to default) ✅ |
| `gpt-5.5` session, dropdown has `gpt-5.5` | exact match | exact match ✅ |
| `gpt` (legacy bare ID) | matches first `gpt-*` | matches first `gpt-*` (bare root preserved) ✅ |
| `claude-opus-4.7`, dropdown has `claude-opus-4.6` only | wrong sibling-version match | null (correct: not present) ✅ |
| `mistral-large`, dropdown has `mistral-medium` only | no version segment to strip, so base === target; `mistral.medium` never matched `mistral.large` | unchanged: null ✅ |
| Legacy `o3-2024-12-17` | `o3.2024.12.17` → base `o3.2024.12` (len 10) → useBase=false → tight | won't false-match `o3-2024-11-20` ✅ |
Tests
- My new `test_issue1188_fuzzy_match.py`: 9/9 pass.
- Local full suite: 2637 passed (+ my 9 new tests = 2646); the only failure is the pre-existing, unrelated macOS `test_sprint3` failure that PR #1186 fixes.
- CI on PR after my push: ✅ test (3.11), ✅ test (3.12), ✅ test (3.13).
Other audit — confirmed correct
- JS syntax: `node --check` passes on `ui.js`.
- No agent coupling: pure UI display code.
- `useBase` threshold of 4 chars: covers the realistic bare-root names (`gpt`, `claude`, `gemini`, `llama`, `qwen`). At first glance `gemini` (6) and `llama` (5) wouldn't trigger useBase, but re-checking: `target='gemini'` → `base='gemini'` (no version) → base === target → useBase=true regardless of length. ✅ Same for `llama`. The `length <= 4` check is a separate path for "target with the version stripped down to a bare 1-3-letter root".
Minor observations (non-blocking)
- The threshold of 4 chars is arbitrary; bare 5-letter roots (`llama`) hit base === target (no version to strip), so they go through the useBase branch anyway. The threshold is mostly defensive against pathological short inputs.
- Step 3 is itself a fuzzy-match fallback; exact match (step 1) and provider-prefix match (step 2) handle the common cases. If users find this still over-matches in some other shape, dropping step 3 entirely would be the next move.
Recommendation
Approved. Tight 7-line fix with clear decision logic. The `useBase = base.length <= 4 || base === target` predicate correctly distinguishes "stripping changed nothing meaningful" (bare roots) from "stripping is now over-loosening the match" (versioned IDs). The behavioural harness directly confirms the 2 master failures match the bug shape and all 9 cases pass on the PR. The pushed regression test runs the live function via Node so the rules can't silently drift. CI green; no agent coupling. Parked at approval; ready for the release agent's merge/tag pipeline.
…, timestamp sync (#1198)

Batch release v0.50.232: 4 fixes.

## PRs included

| PR | Author | Fix |
|---|---|---|
| #1192 | @nesquena-hermes | Model chip fuzzy-match false positive (#1188) |
| #1193 | @nesquena-hermes | openai-codex not detected in model picker (#1189) |
| #1196 | @nesquena-hermes | Workspace files blank after second empty-session reload |
| #1197 | @bergeouss | Session timestamps wrong with server/client clock drift (#1144) |

All four PRs independently reviewed and approved by @nesquena.

## Integration fixes applied

**#1193:** Updated a misleading comment: `OPENAI_API_KEY` does NOT authenticate the default Codex OAuth endpoint (that uses `chatgpt.com/backend-api/codex` and requires a separate OAuth flow). The comment now accurately states the known limitation. Also replaced a fragile 400-char source-scan test with an isolation-safe unit test. Note: OAuth-authenticated users are already detected via `hermes_cli.auth`; this fix only addresses the env-var fallback path.

## Test results

**2764 passed, 2 skipped** (macOS-only workspace tests). Browser QA: **21/21**. `/api/sessions` confirmed returning `server_time` and `server_tz` fields.

Merged as v0.50.232 via #1198. Thank you @nesquena-hermes!
Summary

Fixes a false-positive fuzzy match in `_findModelInDropdown()` where a session model like `gpt-5.5` would resolve to `@nous:openai/gpt-5.4-mini` once live Nous models loaded, showing the wrong label in the model chip.

Root cause

Step 3 of `_findModelInDropdown()` strips the last version segment from the normalized target to produce a `base`. For `gpt-5.5`, `base = "gpt.5"`. The Nous live model `@nous:openai/gpt-5.4-mini` normalizes to `gpt.5.4.mini`, which starts with `"gpt.5"` → wrong match. The chip shows "GPT-5.4 Mini (via Nous)" for a session storing `gpt-5.5`.

Fix

Use the full `target` as the prefix when `base` has meaningful content (length > 4 and `base !== target`). Only fall back to the shorter `base` when it is a bare root like `"gpt"` or `"claude"` (length ≤ 4), where stripping the version was essentially a no-op anyway.

Testing

OLD: incorrectly matched `@nous:openai/gpt-5.4-mini` for session model `gpt-5.5`; NEW: correctly returns null.

Closes #1188