perf: cache live model lookups briefly by NocGeek · Pull Request #1378 · nesquena/hermes-webui

NocGeek · 2026-05-01T00:22:52Z

Summary

add a 60-second in-memory TTL cache for /api/models/live
scope cache entries by active WebUI profile and provider
return deep copies so callers cannot mutate cached payloads
add regression tests for cache hits, expiry, profile scoping, and mutation isolation
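
For readers who have not opened the diff, the four points above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: the class name `LiveModelsCache`, the `fetch` callback, and the injectable `clock` are hypothetical, and the actual change in api/routes.py may be organized differently.

```python
import copy
import time
from typing import Any, Callable, Dict, Tuple

TTL_SECONDS = 60.0  # the 60-second TTL named in the summary


class LiveModelsCache:
    """Hypothetical in-memory TTL cache keyed by (profile, provider)."""

    def __init__(self, ttl: float = TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[Tuple[str, str], Tuple[float, Any]] = {}

    def get_or_fetch(self, profile: str, provider: str,
                     fetch: Callable[[], Any]) -> Any:
        key = (profile, provider)  # scoping: one entry per profile/provider pair
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            # Cache hit: return a deep copy so callers cannot mutate the stored payload.
            return copy.deepcopy(entry[1])
        payload = fetch()  # miss or expired entry: ask the provider
        # Store a private copy so later mutation of `payload` cannot leak into the cache.
        self._entries[key] = (now, copy.deepcopy(payload))
        return payload
```

A server that handles requests on multiple threads would likely also want a lock around the dictionary access; that is left out here to keep the sketch short.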

Tests

python3 -m py_compile api/routes.py tests/test_live_models_ttl_cache.py
python -m pytest tests/test_live_models_ttl_cache.py tests/test_opencode_providers.py::test_live_models_handler_delegates_to_provider_model_ids tests/test_opencode_providers.py::test_live_models_ui_no_longer_skips_any_provider
python -m pytest tests/test_byok_model_dropdown.py tests/test_credential_pool_providers.py tests/test_custom_providers_in_panel.py tests/test_issue1094_provider_bugs.py tests/test_provider_management.py tests/test_issues_373_374_375.py tests/test_live_models_ttl_cache.py tests/test_opencode_providers.py

Local focused result: 6 passed
Local broader provider/model result: 120 passed

nesquena-hermes · 2026-05-01T00:55:51Z

Nice — this directly closes the TODO already sitting in api/routes.py (lines ~146–148):

# TODO: Add TTL-based caching (e.g. 60s) so repeated model-list requests
# don't hit provider APIs.  The frontend already caches via _liveModelCache
# but the backend re-fetches on every /api/models/live call.

A few thoughts from skimming the diff:

Profile-scoping the cache key is the right call. /api/models/live results depend on the active profile's credentials, so a global cache would mask provider keys differing across profiles. Good that the regression test asserts this explicitly.
Deep-copying on read is the right call too. Returning a shared dict and letting a handler mutate it before serialization would be a great way to introduce a subtle "first request after restart looks fine, second request is missing fields" bug. Worth making sure the deep-copy path is exercised before serialization in the actual handler, not only in the test (copy.deepcopy on a 200-model list is cheap but not free — a quick time.perf_counter log would tell us if it shows up at scale).
60s TTL matches the comment hint. One thing to consider: when a user just added a new BYOK key and hits "refresh models" in the picker, they may now wait up to 60s before the new provider's models appear. If that becomes a complaint, the TTL cache could grow a "bypass on explicit refresh" query param (?nocache=1) — not blocking on this PR, just flagging.
Cache invalidation on profile switch / credential rotation — worth confirming that operations that mutate provider credentials (adding/removing a BYOK key, switching a profile's credential_pool) don't leave stale entries pinned for up to a minute. If invalidation does need to be added, exposing a small _clear_live_models_cache(profile=None) and calling it from the credential-update path would be the cleanest hook.

Tests cover cache hits, expiry, profile scoping, and mutation isolation, which are the four cases that matter here. Will defer merge to maintainer review.

nesquena-hermes · 2026-05-01T05:11:42Z

Released as part of v0.50.252 — thanks @NocGeek!

This PR was merged into the v0.50.252 release batch via #1387 alongside 5 other contributor fixes. The full CHANGELOG entry is at https://github.com/nesquena/hermes-webui/blob/master/CHANGELOG.md.

Pre-release verification: 3507 pytest tests pass, full QA harness pass (20 structural + 11 browser API + 23 Agent Browser CDP), Opus mentor APPROVED with two non-blocking follow-ups applied during the release batch (force=True on agent redactor, debug-log on profile fallback).

Closing this PR — the change is live on master.

perf: cache live model lookups briefly

6d1183a

nesquena-hermes mentioned this pull request May 1, 2026

release: v0.50.252 — 6 fork PRs (#1377 #1378 #1379 #1380 #1382 #1386) + Opus follow-ups #1387

Merged

nesquena-hermes added a commit that referenced this pull request May 1, 2026

Merge pull request #1387 from nesquena/stage-may1

b5009dd

release: v0.50.252 — 6 fork PRs (#1377 #1378 #1379 #1380 #1382 #1386) + Opus follow-ups

nesquena-hermes closed this May 1, 2026

pull Bot pushed a commit to JamesWilliam1977/hermes-webui that referenced this pull request May 1, 2026

Cache /api/models/live with 60s TTL (nesquena#1378)

a6d831f

GeoffBao pushed a commit to GeoffBao/hermes-webui that referenced this pull request May 1, 2026

Cache /api/models/live with 60s TTL (nesquena#1378)

830d9be

NocGeek wants to merge 1 commit into nesquena:master from NocGeek:perf/live-models-ttl-cache
