fix: stabilize CI — TS widen, sys.modules restore, WS subscriber race#17836
Merged
Three narrow fixes targeting the remaining red checks after #17828:

1. ui-tui/src/app/slash/commands/ops.ts (Docker Build): /reload-mcp's local params type annotated session_id: string while ctx.sid is string | null. Widen to string | null, which matches every other rpc call site and the test harness, which passes { session_id: null }. Fixes TS2322 on line 86. The rpc signature itself is Record<string, unknown>, so this is purely a local typing fix, no behavioral change.

2. tests/plugins/test_achievements_plugin.py (13 cascading test failures): _install_fake_session_db did a raw sys.modules['hermes_state'] = fake_module without restoration, leaking the fake across xdist worker boundaries. Downstream tests doing from hermes_state import SessionDB got a module whose SessionDB was lambda: fake_db. As a result, 6 test_hermes_state.py tests failed with AttributeError: 'function' object has no attribute '_sanitize_fts5_query' / _contains_cjk, and 7 test_860_dedup.py tests failed with TypeError: got unexpected keyword argument 'db_path' (real code calls SessionDB(db_path=...)). Fix: stash monkeypatch on the plugin_api module object in the fixture, and have the helper do monkeypatch.setitem(sys.modules, 'hermes_state', fake_module) for auto-restoration at test teardown.

3. tests/hermes_cli/test_web_server.py (WS race): TestPtyWebSocket::test_pub_broadcasts_to_events_subscribers hit the 30s test timeout on CI. websocket_connect returns after ws.accept(), but /api/events registers the subscriber in _event_channels on the NEXT await (inside _event_lock). A publish immediately after connect could race ahead of registration and be dropped, and the subsequent receive_text() blocked until SIGALRM killed the test. Fix: poll _event_channels after the subscriber connects, before publishing.
Validation:

  scripts/run_tests.sh tests/plugins/test_achievements_plugin.py tests/run_agent/test_860_dedup.py tests/test_hermes_state.py tests/hermes_cli/test_web_server.py
    -> 338 passed
  cd ui-tui && npm run type-check
    -> clean
  cd ui-tui && npm run build
    -> clean

Remaining red checks are pure infra (Nix ubuntu hits TwirpErrorResponse ResourceExhausted on the GH Actions cache API; Nix macos bounces between npm build openssl-legacy and cache rate-limits) and cannot be fixed in the codebase.
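The sys.modules leak described in fix 2 can be reproduced in miniature. The sketch below is not the test code from this PR: the module name hermes_state_demo and every helper in it are hypothetical, and unittest.mock.patch.dict is used as a stdlib stand-in for pytest's monkeypatch.setitem so the example runs without pytest. Both restore the original sys.modules entry when the patch ends, which is the property the fix relies on.

```python
import sys
import types
from unittest import mock

# Hypothetical stand-in name; the real tests patch 'hermes_state'.
MODULE_NAME = "hermes_state_demo"


def make_real_module() -> types.ModuleType:
    """Stand-in for the real module: SessionDB is a class taking db_path."""
    module = types.ModuleType(MODULE_NAME)

    class SessionDB:
        def __init__(self, db_path=None):
            self.db_path = db_path

    module.SessionDB = SessionDB
    return module


def make_fake_module() -> types.ModuleType:
    """Test double shaped like the one in the PR: SessionDB = lambda: fake_db."""
    fake_db = object()
    module = types.ModuleType(MODULE_NAME)
    module.SessionDB = lambda: fake_db
    return module


def session_db_accepts_db_path() -> bool:
    """Mimic a downstream test that calls SessionDB(db_path=...)."""
    try:
        sys.modules[MODULE_NAME].SessionDB(db_path=":memory:")
        return True
    except TypeError:
        return False


# Start from a clean state with the "real" module installed.
sys.modules[MODULE_NAME] = make_real_module()

# Leaky pattern: a raw assignment that nothing undoes.
sys.modules[MODULE_NAME] = make_fake_module()
leaked = not session_db_accepts_db_path()  # later callers still see the fake

# Reset, then use a scoped patch that restores the entry on exit.
sys.modules[MODULE_NAME] = make_real_module()
with mock.patch.dict(sys.modules, {MODULE_NAME: make_fake_module()}):
    broken_inside = not session_db_accepts_db_path()
restored_after = session_db_accepts_db_path()

# Remove the demo entry so this sketch does not leak either.
sys.modules.pop(MODULE_NAME, None)
```

Inside a pytest test, the equivalent scoped swap is monkeypatch.setitem(sys.modules, name, fake_module), which pytest undoes at teardown.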
This was referenced Apr 30, 2026
alt-glitch pushed a commit that referenced this pull request Apr 30, 2026

magic-nix-cache caused recurring CI failures (TwirpErrorResponse ResourceExhausted) by hitting GitHub Actions Cache's 10 GB limit and 200 req/min rate limit. This was flagged as 'unfixable infra flake' in #17836 but is actually a fixable architecture choice.

Switch to Cachix (dedicated binary cache, no GHA quota dependency):
- Replace DeterminateSystems/magic-nix-cache-action with cachix/cachix-action
- Add cachix-auth-token input to nix-setup composite action
- Pass CACHIX_AUTH_TOKEN secret through all three nix workflows
- continue-on-error: true so cache failures never block CI

Cache 'hermes-agent' is public at hermes-agent.cachix.org. Devs can pull locally with: cachix use hermes-agent
alt-glitch added a commit that referenced this pull request Apr 30, 2026

* fix(nix): replace magic-nix-cache with Cachix

  magic-nix-cache caused recurring CI failures (TwirpErrorResponse ResourceExhausted) by hitting GitHub Actions Cache's 10 GB limit and 200 req/min rate limit. This was flagged as 'unfixable infra flake' in #17836 but is actually a fixable architecture choice.

  Switch to Cachix (dedicated binary cache, no GHA quota dependency):
  - Replace DeterminateSystems/magic-nix-cache-action with cachix/cachix-action
  - Add cachix-auth-token input to nix-setup composite action
  - Pass CACHIX_AUTH_TOKEN secret through all three nix workflows
  - continue-on-error: true so cache failures never block CI

  Cache 'hermes-agent' is public at hermes-agent.cachix.org. Devs can pull locally with: cachix use hermes-agent

* fix: correct cachix-action commit SHA pin

---------

Co-authored-by: Hermes Agent <[email protected]>
Follow-up to #17828 to turn the remaining red checks green.

Summary

Three narrow, unrelated fixes, each addressing one failing check on `main` after #17828 landed.

Changes

1. `ui-tui/src/app/slash/commands/ops.ts`: fixes Docker Build

`/reload-mcp`'s local `params` was annotated `session_id: string`, but `ctx.sid` is typed `string | null`. TS2322 on line 86. The rpc signature is `Record<string, unknown>` (accepts null), every other rpc call site does the same `session_id: ctx.sid` without narrowing, and the existing vitest test literally passes `{ session_id: null }` to this rpc. Widening the local type from `string` to `string | null` matches reality: one line, no runtime change.

2. `tests/plugins/test_achievements_plugin.py`: fixes 13 cascading test failures

`_install_fake_session_db` did a raw `sys.modules['hermes_state'] = fake_module` with no restoration, leaking the fake past test boundaries. In xdist workers that run this test before `tests/test_hermes_state.py` or `tests/run_agent/test_860_dedup.py`, later tests see `SessionDB = lambda: fake_db` instead of the real class, causing `AttributeError: 'function' object has no attribute '_sanitize_fts5_query'` (6x) and `TypeError: got unexpected keyword argument 'db_path'` (7x).

Fix: stash `monkeypatch` on the `plugin_api` module via the fixture, and route the `sys.modules` swap through `monkeypatch.setitem(...)` so teardown auto-restores.

3. `tests/hermes_cli/test_web_server.py`: fixes WS broadcast race timeout

`TestPtyWebSocket::test_pub_broadcasts_to_events_subscribers` hit the 30s test timeout on CI. `websocket_connect` returns after `ws.accept()`, but the server registers the subscriber in `_event_channels` on the next await (inside `_event_lock`). A publish immediately after connect can race ahead, the broadcast has no subscribers, and the subsequent `receive_text()` blocks forever.

Fix: after the subscriber connects, poll `_event_channels` until the registration is visible before opening the publisher.

Validation

- `scripts/run_tests.sh tests/plugins/test_achievements_plugin.py tests/run_agent/test_860_dedup.py tests/test_hermes_state.py tests/hermes_cli/test_web_server.py`
- `cd ui-tui && npm run type-check`
- `cd ui-tui && npm run build`

Not fixed (intentional)

Nix has been failing on `main` for 5+ consecutive runs due to GH Actions cache `TwirpErrorResponse { code: ResourceExhausted }` on ubuntu and npm openssl-legacy on macos. Pure infra flake, nothing to fix in the codebase.