Skip to content

fix(profiles): profile switch now shows correct workspace, model, and chip label immediately#1203

Closed
nesquena-hermes wants to merge 2 commits intomasterfrom
fix/profile-switch-workspace-model
Closed

fix(profiles): profile switch now shows correct workspace, model, and chip label immediately#1203
nesquena-hermes wants to merge 2 commits intomasterfrom
fix/profile-switch-workspace-model

Conversation

@nesquena-hermes
Copy link
Copy Markdown
Collaborator

@nesquena-hermes nesquena-hermes commented Apr 28, 2026

Summary

Profile switching was broken in three distinct ways. After clicking a different profile in the composer chip dropdown, the workspace showed the wrong directory, the model dropdown kept the old profile's models, and the profile chip label stayed on the old name. After a page refresh the profile name corrected but workspace and model were still wrong.

Root causes were three separate bugs across backend and frontend. All fixed in this PR.


Bugs fixed

Bug 1 — wrong workspace immediately after switch

File: api/profiles.pyswitch_profile(process_wide=False)

switch_profile() called get_last_workspace() to populate the default_workspace field in its return value. get_last_workspace() routes through get_active_profile_name(), which reads the thread-local profile context. During the switch request, thread-local is still set to the old profile (the Set-Cookie response hasn't been processed by a new request yet). So it returned the old profile's workspace.

Fix: Read directly from home / 'webui_state' / 'last_workspace.txt' where home is the already-resolved target profile directory. Falls back to config.yaml keys (workspace, default_workspace, terminal.cwd), then DEFAULT_WORKSPACE. Also adds expanduser() on the file content (for ~/... paths) and fixes the exception fallback to use DEFAULT_WORKSPACE instead of re-calling get_last_workspace().

Bug 2 — wrong model dropdown immediately after switch

File: api/routes.py/api/profile/switch handler

The in-memory models cache (_available_models_cache) was not cleared after a profile switch. The client immediately calls /api/models after switching, but a fresh cache from the old profile was returned unchanged.

Fix: Call invalidate_models_cache() in the route handler immediately after switch_profile() succeeds. Full invalidation (in-memory + disk) is required because the disk cache is a single per-server file, not per-profile.

Bug 3 — profile chip label stays stale

File: static/ui.jssyncTopbar()

syncTopbar() has an early-return path when S.session is null — which it is during a profile switch with no active conversation. The profileChipLabel update was only in the main path, after that early return, so it never ran when there was no session.

Fix: Added the chip update inside the !S.session early-return branch, before return;.

Bug 4 + 4b — flaky timing tests

File: tests/test_issue1144_session_time_sync.py

Both skew-compensation tests (positive_skew, negative_skew) called Date.now() twice with an internal call in between and asserted exact equality. Under CPU load, 1ms jitter caused intermittent failure.

Fix: Both tests now use midpoint averaging of t0..t1 with ±5ms tolerance.


Tests

tests/test_profile_switch_1200.py — 9 tests total:

  • 5 direct bug-verification tests (one per fix + chip update path)
  • 4 named regression guards to catch future regressions immediately

Results: 2773 passed, 0 failed, 2 skipped (macOS-only tests, intentional)


Browser verification

Tested against a running server with four real profiles (default, camanji, webui, codepath):

Switch Profile chip Model Workspace
default → camanji camanji gpt-5.5 Camanji
camanji → webui webui correct ✅ correct ✅
webui → default default claude-sonnet-4.6 Home

All updates appear immediately after switch, no reload needed.

Closes #1204

… chip label immediately (#1200)

Three separate bugs caused profile switching to feel broken after each change:

Bug 1 — wrong workspace (api/profiles.py):
  switch_profile(process_wide=False) called get_last_workspace() to populate the
  default_workspace return field. That function routes through get_active_profile_name()
  which reads thread-local context — still pointing at the OLD profile during the switch
  request (the Set-Cookie hasn't been processed yet). Fix: read directly from
  home/webui_state/last_workspace.txt where home is the already-resolved target profile
  directory, with fallbacks to config.yaml terminal.cwd and DEFAULT_WORKSPACE.
  Also: expanduser() on file content for ~/... paths; exception fallback now uses
  DEFAULT_WORKSPACE instead of re-calling the buggy get_last_workspace().

Bug 2 — wrong model dropdown (api/routes.py):
  The in-memory models cache (_available_models_cache) was not cleared after a switch.
  A populated cache from the old profile was served unchanged to the first /api/models
  request after switching. Fix: call invalidate_models_cache() in the /api/profile/switch
  route handler immediately after a successful switch. This forces a cold rebuild from
  the new profile's config on the next request. Full disk+memory invalidation is
  necessary because the disk cache is a single per-server file, not per-profile.

Bug 3 — profile chip label stale (static/ui.js):
  syncTopbar() returns early when S.session is null, which is the state during a
  profile switch with no active conversation. The profileChipLabel update was only in
  the main path after that early return, so it never ran. Fix: added the chip update
  to the early-return branch before return;.

Bug 4+4b — flaky timing tests (tests/test_issue1144_session_time_sync.py):
  Both skew-compensation tests used exact Date.now() equality across two consecutive
  calls with an internal call in between. Under CPU load, 1ms jitter caused intermittent
  failure. Fixed both: midpoint averaging of t0..t1 with ±5ms tolerance window.

Browser verified: default→camanji→webui→default all show correct profile chip, model,
and workspace immediately after switch with no reload needed.

9 new tests in tests/test_profile_switch_1200.py, including 4 named regression guards.
2773 tests pass, 0 failures.
…ace lookup

api/profiles.py:17 already imports Path from pathlib at module level, so
re-importing it as _Path inside switch_profile's try-block was redundant.
The aliasing also created a latent NameError path: if both inner
"from api.config import DEFAULT_WORKSPACE" imports failed, the final
fallback str(_Path.home()) would NameError because _Path was never bound
(only conditionally assigned inside the outer try). Practically
unreachable (api.config has been imported successfully at module load
time already, so a second import attempt won't suddenly fail), but
unnecessary code surface.

Switch to the module-level Path everywhere in this block.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Copy link
Copy Markdown
Owner

@nesquena nesquena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — end-to-end ✅ (approved with small cleanup pushed)

What this ships

Three coordinated bugs fixed in profile switching (#1200), plus a flaky-test stabilisation:

Bug File Problem Fix
1 api/profiles.py:309-351 switch_profile(process_wide=False) called get_last_workspace() which routed through thread-local profile context — still pointing at the OLD profile during the switch. Returned old workspace. Read directly from home / 'webui_state' / 'last_workspace.txt' with config.yaml fallbacks.
2 api/routes.py:1556-1560 In-memory _available_models_cache not invalidated after switch → next /api/models returned old profile's models. Call invalidate_models_cache() after switch_profile() succeeds.
3 static/ui.js:1859-1861 syncTopbar() early-returns when S.session is null (the state during a switch with no active conversation). The profileChipLabel update only ran in the main path AFTER the early return. Update the chip inside the early-return branch before return;.
4 tests/test_issue1144_session_time_sync.py Two skew-compensation tests asserted exact equality across two Date.now() calls; intermittent 1-2ms jitter caused flakes. Midpoint averaging with ±5ms tolerance.

Plus 9 new tests in tests/test_profile_switch_1200.py (5 direct verification + 4 named regression guards).

Traced against upstream hermes-agent

Profile system is shared between WebUI and hermes-agent CLI. Pulled fresh nousresearch/hermes-agent tarball.

The agent reads profile state from ~/.hermes/profiles/<name>/ directly — no dependence on WebUI's request-scoped thread-local profile. The PR's fix reads from the target profile's resolved home directory, which is the same directory the agent uses. Cross-tool consistency: a workspace persisted by the agent CLI under <profile>/webui_state/last_workspace.txt would also be picked up by the WebUI's switch-profile flow. ✅

The webui_state/ subdirectory under each profile home is WebUI-specific (set up at api/workspace.py:46). The agent CLI writes its own state under different paths. No conflict.

End-to-end trace

Bug 1 — thread-local mismatch confirmation

api/profiles.py:94-104get_active_profile_name() returns _tls.profile (set per-request from cookie) or falls back to _active_profile (process-global). During a process_wide=False switch:

  1. Request arrives with the OLD profile cookie → set_request_profile(OLD) was called by the routing layer
  2. Handler calls switch_profile(NEW, process_wide=False)
  3. Inside the function, _tls.profile is still OLD (we haven't received a new request yet)
  4. The PR's pre-fix code called get_last_workspace()get_active_profile_name() → returns OLD → reads OLD's last_workspace.txt → wrong workspace
  5. Set-Cookie response is sent. NEXT request will have NEW cookie set in TLS, but THIS response already has the wrong default_workspace baked in.

The new code bypasses the thread-local by reading from home / 'webui_state' / 'last_workspace.txt' where home was already resolved at line 261-263 from the user-supplied name parameter. Direct, no thread-local indirection. ✅

Fallback chain order is sensible:

  1. last_workspace.txt — most recent user-chosen for this profile
  2. cfg.workspace / cfg.default_workspace — explicit profile config keys
  3. cfg.terminal.cwd — agent CLI's working dir (most common shared key)
  4. DEFAULT_WORKSPACE — boot-time constant

Each step verifies is_dir() so a stale path doesn't propagate.

Bug 2 — cache invalidation timing

api/routes.py:1553-1562:

result = switch_profile(name, process_wide=False)
from api.config import invalidate_models_cache
invalidate_models_cache()
return j(handler, result, extra_headers={'Set-Cookie': build_profile_cookie(name)})

invalidate_models_cache() clears BOTH the in-memory cache and the on-disk cache (api/config.py:1148-1172). The PR's claim about disk being single-per-server (not per-profile) is correct — verified by reading the cache implementation; full invalidation is necessary.

Order: switch happens, cache cleared, response sent with cookie. The next request (with NEW cookie) hitting /api/models cold-builds from the new profile's config. ✅

Bug 3 — early-return chip update

static/ui.js:1856-1864:

if(!S.session){
  ...
  if(typeof syncAppTitlebar==='function') syncAppTitlebar();
  // Update profile chip even when no session is active (e.g. right after profile switch)
  const _profileLabel=$('profileChipLabel');
  if(_profileLabel) _profileLabel.textContent=S.activeProfile||'default';
  return;
}

Verified the main-path chip update at the end of the function reads the same S.activeProfile value, so the new branch matches semantics. ✅

Bug 4 — flaky test stabilisation

The original tests asserted clientNow - serverNow === 3_600_000 (exact ms equality). Both clientNow and serverNow call Date.now(), but at different physical times — so a 1ms increment between the two calls produces a diff of 3,600,001 ms, failing the equality check. The fix samples t0 before and t1 after, takes the midpoint, asserts within ±5ms. Sound stabilisation pattern.

What I caught — latent NameError in the workspace fallback

The original PR's switch_profile code did a redundant local import:

from pathlib import Path as _Path  # but Path is already imported at line 17

In the exception handler:

except Exception:
    try:
        from api.config import DEFAULT_WORKSPACE as _DW2
        default_workspace = str(_DW2)
    except Exception:
        default_workspace = str(_Path.home())  # _Path may be unbound

If the OUTER try fails before line from pathlib import Path as _Path runs (e.g., the from api.config import DEFAULT_WORKSPACE as _DW import on the line above raises), then _Path is never bound. The double-fallback at str(_Path.home()) would raise NameError rather than gracefully producing a path.

In practice this is unreachable — api.config is already imported at module load time (line 17), so a second import attempt won't suddenly fail. But the code surface is unnecessary.

What I pushed — 5634f21

Removed the redundant Path as _Path aliasing; uses the module-level Path everywhere in the workspace-derivation block. Same behaviour, no latent NameError, cleaner. CI re-ran green on 5634f21.

Edge-case trace

Scenario Behaviour
Switch to profile with last_workspace.txt Returns that path ✅
Switch to profile with stale last_workspace.txt (deleted dir) is_dir() check skips, falls through to cfg keys ✅
Switch to profile with no webui_state/ yet Falls through to cfg keys → DEFAULT_WORKSPACE ✅
Switch to profile with config.yaml terminal.cwd only Returns that path ✅
Switch to profile with all-bad workspace candidates Returns DEFAULT_WORKSPACE ✅
Switch with home directory unreadable Falls through outer except → DEFAULT_WORKSPACE ✅
Switch + immediate /api/models call Models cache rebuilt from new profile ✅
Switch with no active session (composer-only) Profile chip updates ✅
Switch with active session Chip updates via main path (existing behaviour) ✅

Tests

  • PR's 9 new profile-switch tests + my pushed cleanup: pass.
  • #1144 stabilised tests: 23/23 in test_issue1144_session_time_sync.py, including my prior fractional-offset additions.
  • Local full suite: 2728 passed, 47 skipped, 0 failures (the macOS test_sprint3 quirk now fixed by my #1186 in v0.50.231).
  • CI on PR after my push: ✅ test (3.11), ✅ test (3.12), ✅ test (3.13).

Other audit — confirmed correct

  • JS syntax: node --check passes.
  • No agent coupling: profile system is shared but the read pattern matches what the agent CLI does.
  • Profile lock not held during workspace lookup: the fallback chain runs after with _profile_lock exits, which is correct (the lookup is read-only against a stable home path).
  • Browser QA in PR description: 3 manual switch sequences (default→camanji, camanji→webui, webui→default) all show correct chip/model/workspace immediately. Matches the static-test contract.

Minor observations (non-blocking)

  • The process_wide=False branch reads cfg via direct yaml.safe_load rather than get_config() — that's intentional (the PR comment explains: _cfg_cache is process-global and would still hold the OLD profile's config). The slight code duplication is a worthwhile trade-off for correctness.
  • _DW / _DW2 aliasing in the import statements (from api.config import DEFAULT_WORKSPACE as _DW) is unusual but harmless — protects against accidental shadowing within the function. Could be tightened to a single _DW outside the try block.
  • The cookie + cache-clear + response ordering means the cache is invalidated even if cookie-setting fails. Not a real concern since cookie-setting is just a header append.

Recommendation

Approved (with cleanup pushed). Three real, distinct bugs in profile switching all addressed at the right layer: backend reads from the target profile's home directly (closing the thread-local race), routes call cache-invalidate after the switch (closing the model-list race), frontend chip updates from the no-session path. The flaky-test fix using midpoint averaging is the right pattern. My cleanup removes a latent NameError + redundant import without changing behaviour. Comprehensive 9-test coverage with explicit named regression guards. CI green; full suite clean. Parked at approval — ready for the release agent's merge/tag pipeline.

@nesquena-hermes
Copy link
Copy Markdown
Collaborator Author

Thanks for the thorough write-up — the root cause analysis for all three bugs is clear and the fixes look correct.

Bug 1 (workspace): Reading directly from the target profile's last_workspace.txt instead of routing through get_last_workspace()get_active_profile_name() is the right call. The thread-local context is still pointing at the old profile during the switch request, so bypassing it here is exactly right. The fallback chain (file → config.yaml keys → DEFAULT_WORKSPACE) and the expanduser() fix are good additions.

Bug 2 (model dropdown): Calling invalidate_models_cache() immediately after switch_profile() succeeds makes sense. Since the disk cache is a single per-server file (not per-profile), full invalidation is the correct strategy rather than a targeted eviction.

Bug 3 (profile chip label): The early-return in syncTopbar() when S.session is null was the culprit — adding the chip update before return; in that branch is the minimal, correct fix.

Timing tests: The midpoint-averaging approach with ±5ms tolerance for the skew-compensation tests is a reasonable fix for CI flakiness under CPU load — cleaner than adding arbitrary sleeps.

Test coverage: 9 targeted tests covering the bug paths plus regression guards looks solid. The 2773-passed / 0-failed / 2-skipped run is a good signal. Noted that the 2 skipped are intentional macOS-only tests.

One thing worth confirming before merge: does switch_profile() with the new direct-file read handle the case where last_workspace.txt contains a path that no longer exists on disk (e.g. an external drive that was unmounted)? If so, make sure the fallback chain still kicks in rather than returning a non-existent path to the client.

Overall this looks mergeable — clean fixes, well-tested, browser-verified across four real profiles.

nesquena-hermes added a commit that referenced this pull request Apr 28, 2026
…le switch fixes (#1206)

fix: batch v0.50.234-235 — XSS hardening, workspace validation, profile switch fixes

v0.50.235 (#1203 — profile switch workspace/model/chip, 3 bugs + flaky test):
- switch_profile now reads target profile's workspace directly (thread-local bypass)
- invalidate_models_cache() after profile switch (model dropdown staleness)
- syncTopbar() updates chip before early-return (no-session path)

v0.50.234 (#1201/#1205 — XSS hardening + workspace security):
- renderMd() full HTML attribute sanitizer replacing tag-name-only allowlist
- Delegated image lightbox (removes all inline onclick)
- macOS /etc → /private/etc symlink bypass fixed
- /System /Library added to blocked workspace roots
- Legacy /api/chat workspace trust gap closed

Both PRs independently reviewed. 2787/2787 tests. QA harness 20/20 + 11/11 API checks.

Co-authored-by: Brendan Schmid <[email protected]>
Co-authored-by: Nathan Esquenazi <[email protected]>
@nesquena-hermes
Copy link
Copy Markdown
Collaborator Author

Merged via batch release PR #1206 as v0.50.235. Thanks @nesquena for the thorough review!

@nesquena-hermes nesquena-hermes deleted the fix/profile-switch-workspace-model branch April 28, 2026 04:40
JKJameson pushed a commit to JKJameson/hermes-webui that referenced this pull request Apr 29, 2026
…le switch fixes (nesquena#1206)

fix: batch v0.50.234-235 — XSS hardening, workspace validation, profile switch fixes

v0.50.235 (nesquena#1203 — profile switch workspace/model/chip, 3 bugs + flaky test):
- switch_profile now reads target profile's workspace directly (thread-local bypass)
- invalidate_models_cache() after profile switch (model dropdown staleness)
- syncTopbar() updates chip before early-return (no-session path)

v0.50.234 (nesquena#1201/nesquena#1205 — XSS hardening + workspace security):
- renderMd() full HTML attribute sanitizer replacing tag-name-only allowlist
- Delegated image lightbox (removes all inline onclick)
- macOS /etc → /private/etc symlink bypass fixed
- /System /Library added to blocked workspace roots
- Legacy /api/chat workspace trust gap closed

Both PRs independently reviewed. 2787/2787 tests. QA harness 20/20 + 11/11 API checks.

Co-authored-by: Brendan Schmid <[email protected]>
Co-authored-by: Nathan Esquenazi <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug(profiles): profile switch shows wrong workspace, model, and chip label

2 participants