
fix: add missing openai and httpx to requirements.txt#1383

Closed
wali-reheman wants to merge 1 commit into nesquena:master from wali-reheman:fix/missing-openai-dependency

Conversation

@wali-reheman

Summary

When the WebUI's Python venv lacks the openai package, running a chat that hits an API error causes the server to hang indefinitely.

Root Cause

api/streaming.py delegates chat execution to hermes-agent's AIAgent. When AIAgent raises an anthropic.BadRequestError (e.g. "context window exceeds limit"), its internal exception handler tries to import openai.APIError:

# run_agent.py ~line 7040
from openai import APIError as _APIError
if isinstance(e, _APIError) and not getattr(e, "status_code", None):
    ...

If openai is not installed in the WebUI's venv, this import raises ModuleNotFoundError as an unhandled exception inside a daemon thread. The orphaned thread exits while holding the global _ENV_LOCK (a threading.Lock in api/streaming.py). All subsequent HTTP requests block forever waiting for that lock.

Fix

Add openai>=1.0 and httpx>=0.25 to requirements.txt. Both packages are already imported by the WebUI streaming code or its dependency chain; they just weren't declared.

 # Hermes Web UI -- minimal Python dependencies
 # The server uses only stdlib + pyyaml.
 # All heavy ML/agent deps live in the Hermes agent venv.
 pyyaml>=6.0
+openai>=1.0
+httpx>=0.25

Testing

  1. Install the WebUI in a fresh venv without openai
  2. Send a chat message that triggers an API error (e.g. context window overflow)
  3. Observe: server hangs on all subsequent requests
  4. Apply fix (pip install openai httpx), restart server
  5. Observe: server continues responding normally after API errors

Related

  • hermes-agent issue: from openai import APIError in exception handler is not guarded by a try/except, so any ModuleNotFoundError propagates unhandled

Hermes WebUI's streaming module calls into hermes-agent's AIAgent,
which uses 'from openai import APIError' in exception handlers.
When hermes-agent raises an anthropic.APIError (e.g. context window
overflow), the handler attempts to import openai.APIError to check
for SSE-specific retryable errors.

If the WebUI's venv lacks the openai package, this import raises
ModuleNotFoundError as an unhandled exception in a daemon thread.
The orphaned thread exits while holding the global _ENV_LOCK, causing
all subsequent HTTP requests to hang indefinitely on that lock.

Adding openai>=1.0 to requirements.txt ensures the venv has the
package, so the import succeeds and the exception handler completes
normally, releasing _ENV_LOCK.

httpx is added as a peer dependency of openai and is also imported
directly by the WebUI streaming code.
@nesquena-hermes
Collaborator

Thanks for the careful root-cause analysis in the PR body — the daemon-thread holding _ENV_LOCK forever is a real bug, and your trace from from openai import APIError as _APIError inside run_agent.py looks correct.

That said, I'm holding this PR (and labeling it hold) because it fixes the symptom in the wrong layer:

The requirements.txt contract

The WebUI's requirements.txt is intentionally minimal — the comment block at the top says so:

# Hermes Web UI -- minimal Python dependencies
# The server uses only stdlib + pyyaml.
# All heavy ML/agent deps live in the Hermes agent venv.

The WebUI server itself doesn't import openai or httpx anywhere — grep -rn "^import openai\|^from openai\|^import httpx\|^from httpx" api/ static/ returns nothing. The only place those imports show up is inside hermes-agent code that the WebUI shells out to (i.e. they're already installed in the agent venv that actually runs AIAgent). Adding them to the WebUI's requirements.txt would inflate the WebUI install footprint by ~50 MB of transitive deps that the WebUI process itself never uses, and would make the comment block above misleading.

Where the fix belongs

The real bug is in hermes-agent: the exception handler does an unguarded import that throws ModuleNotFoundError and tears down the daemon thread while it holds the lock. Two viable fixes, both in hermes-agent (run_agent.py ~line 7040):

# Option A — guard the import
try:
    from openai import APIError as _APIError
except ImportError:
    _APIError = None
if _APIError is not None and isinstance(e, _APIError) and not getattr(e, "status_code", None):
    ...
# Option B — release the lock unconditionally
try:
    # ... existing handler body, including the openai import ...
except Exception:
    logger.exception("error handler crashed")
finally:
    # ensure _ENV_LOCK is released even if the handler raises
    _ENV_LOCK.release()
Option A is the minimal correct fix — it eliminates the ModuleNotFoundError path entirely. Option B is the architectural fix (handler crashes shouldn't ever leak the lock). Ideally both.

What I'd like in a follow-up PR

Either:

  1. Open the matching PR against nesquena/hermes-agent (or wherever run_agent.py lives in your fork) with the try/except guard around the openai import. We can land that and the lock won't deadlock anymore. Or
  2. If you don't have access to the hermes-agent repo and just need this unblocked locally, you can pip install openai httpx into your WebUI venv as a one-off — but the design contract says we shouldn't ship it as a default dep for everyone.

Closing thought: the bug analysis is great and I want to land the underlying fix. Just from the right side. Tag me on the agent-side PR (or open an issue here referencing the agent file/line) and I'll help push it through.

@wali-reheman
Author

Cross-reference: fix landed in hermes-agent

The root-cause fix (Option A -- try/except guard on the openai import) has been landed in hermes-agent:

NousResearch/hermes-agent#18247

Two sites in run_agent.py (~lines 7057 and 7166) now use:

try:
    from openai import APIError as _APIError
except ImportError:
    _APIError = None
if _APIError is not None and isinstance(e, _APIError) and not getattr(e, "status_code", None):
    ...

This eliminates the ModuleNotFoundError -> daemon-thread-teardown -> lock-deadlock path entirely. With openai absent, the handlers now degrade gracefully by setting _APIError = None and skipping the openai-specific branch.
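The degraded path can be sketched directly (assuming the guard from the patch above, fed a non-openai exception):

```python
# With openai missing, _APIError is None and the branch is skipped;
# with openai present, the isinstance check is simply False here.
try:
    from openai import APIError as _APIError
except ImportError:
    _APIError = None

e = RuntimeError("context window exceeds limit")
is_openai_api_error = _APIError is not None and isinstance(e, _APIError)
print(is_openai_api_error)  # -> False
```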

This supersedes the requirements.txt workaround in this PR. Recommend closing #1383 as superseded -- the fix belongs in hermes-agent, not in the WebUI's minimal requirements contract.
