[Fix] Hanging CI Tests in custom_httpx test_http_handler #23674
yuneng-jiang merged 3 commits into litellm_internal_dev_03_14_2026
- Remove real HTTP call to example.com in test_force_ipv4_transport (hangs in CI when network is slow/unavailable)
- Close leaked aiohttp.ClientSession in test_ssl_verification_with_aiohttp_transport
- Add cleanup for transports in test_aiohttp_transport_trust_env_setting
- Add cleanup for handler in test_ssl_security_level
- Add cleanup for transport in test_ssl_context_transport

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Greptile Summary
This PR fixes CI test hangs in
Key changes:
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| tests/test_litellm/llms/custom_httpx/test_gemini_session_leak.py | Removes the GC-traversal count_aiohttp_sessions() function and replaces it with direct session.closed assertions — substantially simpler and avoids the CI hang. Minor concern: test_atexit_cleanup mutates the global litellm.base_llm_aiohttp_handler session which may affect later tests. |
| tests/test_litellm/llms/custom_httpx/test_http_handler.py | Adds finally cleanup blocks and state restoration to several tests; removes the real HTTP call to example.com. However, test_force_ipv4_transport still leaks an unclosed httpx.AsyncHTTPTransport, and test_ssl_context_transport's finally block only handles the LiteLLMAiohttpTransport case, leaving non-aiohttp transports unclosed. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[CI Test Run] --> B{test_aiohttp_handler_cleanup}
B --> B1[Create BaseLLMAIOHTTPHandler]
B1 --> B2[Get session reference]
B2 --> B3[Assert session.closed == False]
B3 --> B4[del handler + gc.collect]
B4 --> B5[await asyncio.sleep 0.1]
B5 --> B6[Assert session.closed == True]
A --> C{test_atexit_cleanup}
C --> C1[Access global litellm.base_llm_aiohttp_handler]
C1 --> C2[Get session reference]
C2 --> C3[Assert session.closed == False]
C3 --> C4[Call close_litellm_async_clients]
C4 --> C5[Assert session.closed == True]
C5 --> C6["⚠️ Global handler session now closed"]
A --> D{test_force_ipv4_transport}
D --> D1[Save + set litellm flags]
D1 --> D2[_create_async_transport]
D2 --> D3[Assert httpx.AsyncHTTPTransport]
D3 --> D4["⚠️ Transport NOT closed in finally"]
D4 --> D5[Restore litellm flags]
A --> E{test_ssl_context_transport}
E --> E1[_create_async_transport with ssl_context]
E1 --> E2{isinstance LiteLLMAiohttpTransport?}
E2 -->|Yes| E3[Assert connector._ssl not None]
E3 --> E4[await transport.aclose]
E2 -->|No| E5["⚠️ Transport NOT closed"]
Comments Outside Diff (3)
- tests/test_litellm/llms/custom_httpx/test_http_handler.py, line 67-74: Unclosed transport resource leak

  The PR fixes resource leaks in other tests but misses this one. `AsyncHTTPHandler._create_async_transport()` with `force_ipv4=True` and `disable_aiohttp_transport=True` returns an `httpx.AsyncHTTPTransport`, which wraps an `httpcore.AsyncConnectionPool`. Not closing it will still trigger "unclosed resource" warnings in CI, which is the root problem this PR is trying to solve.

- tests/test_litellm/llms/custom_httpx/test_http_handler.py, line 86-96: Potential transport leak for non-aiohttp transport path

  Both the `try` body and the `finally` guard check `isinstance(transport, LiteLLMAiohttpTransport)`. If `disable_aiohttp_transport` happens to be `True` in the test environment, `_create_async_transport(ssl_context=...)` may return an `httpx.AsyncHTTPTransport` instead, and the `finally` block skips closing it. Consider closing any non-`None` transport regardless of type.

- tests/test_litellm/llms/custom_httpx/test_gemini_session_leak.py, line 40-57: Global handler session left closed after test

  `test_atexit_cleanup` calls `close_litellm_async_clients()` on the real `litellm.base_llm_aiohttp_handler` (the module-global singleton). After this test runs, that handler's session is permanently closed. Any test that subsequently calls into `litellm.base_llm_aiohttp_handler._get_async_client_session()` (e.g., a Gemini completion test) will either get an error or trigger an implicit re-creation, depending on the implementation.

  Depending on whether `BaseLLMAIOHTTPHandler` lazily re-creates its session after a `close`, this may cause silent failures in later tests. Consider either:
  - Saving and restoring the handler reference around the test, or
  - Creating a fresh `BaseLLMAIOHTTPHandler()` instead of using the global singleton, to avoid cross-test contamination.
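The second comment's suggestion (close any non-`None` transport regardless of type) could look roughly like the sketch below. It is not the code from the PR: `StubTransport`, `StubAiohttpTransport`, and `check_ssl_transport` are hypothetical stand-ins so the example runs without litellm or httpx. The only assumption carried over from the review is that each transport exposes an async `aclose()`.

```python
import asyncio
from typing import Optional


class StubTransport:
    """Hypothetical stand-in for any transport exposing an async aclose()."""

    def __init__(self) -> None:
        self.closed = False

    async def aclose(self) -> None:
        self.closed = True


class StubAiohttpTransport(StubTransport):
    """Hypothetical stand-in for the aiohttp-backed transport type."""

    ssl_configured = True


async def check_ssl_transport(transport: Optional[StubTransport]) -> None:
    try:
        # Type-specific assertions stay behind the isinstance guard.
        if isinstance(transport, StubAiohttpTransport):
            assert transport.ssl_configured
    finally:
        # Cleanup is unconditional: close whatever transport was created.
        if transport is not None:
            await transport.aclose()


plain = StubTransport()
aio = StubAiohttpTransport()
asyncio.run(check_ssl_transport(plain))
asyncio.run(check_ssl_transport(aio))
```

Keeping the `isinstance` check out of the `finally` block means the cleanup path no longer depends on which transport the environment happened to select.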
Last reviewed commit: 460f620
count_aiohttp_sessions() iterates every object in the Python GC, which hangs in CI when xdist workers have millions of loaded objects. Replace with direct session.closed checks — same coverage, no hang. Co-Authored-By: Claude Opus 4.6 <[email protected]>
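To illustrate the difference this commit describes, here is a minimal sketch contrasting the two styles of check. `StubSession` and `StubHandler` are hypothetical stand-ins (the real tests use litellm's handler and an `aiohttp.ClientSession`), and the sketch closes the handler explicitly rather than relying on `__del__`, which is an assumption made to keep the example deterministic.

```python
import asyncio
import gc


class StubSession:
    """Hypothetical stand-in for a client session with a `closed` flag."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class StubHandler:
    """Hypothetical stand-in for a handler that owns one session."""

    def __init__(self) -> None:
        self.session = StubSession()

    async def close(self) -> None:
        await self.session.close()


def count_sessions_via_gc() -> int:
    # Old style: walks every GC-tracked object in the process, so the cost
    # grows with everything the test worker has loaded.
    return sum(1 for obj in gc.get_objects() if isinstance(obj, StubSession))


async def check_handler_cleanup() -> bool:
    # New style: keep a reference to the one session under test and check it.
    handler = StubHandler()
    session = handler.session
    assert not session.closed
    await handler.close()
    return session.closed


result = asyncio.run(check_handler_cleanup())
```

The direct check costs the same regardless of process size, and it fails only if the specific session under test was not closed.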
- Fix __main__ block: test returns None now, so always exited 1
- Close litellm_async_client in test_ssl_verification_with_aiohttp_transport
- Save/restore litellm.force_ipv4 and litellm.disable_aiohttp_transport in test_force_ipv4_transport and test_aiohttp_disabled_transport

Co-Authored-By: Claude Opus 4.6 <[email protected]>
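One way to express the save/restore step is a small context manager, sketched below. This is an illustration, not the PR's code: `settings` is a hypothetical `SimpleNamespace` standing in for the `litellm` module, and `override_flags` is a hypothetical helper name.

```python
from contextlib import contextmanager
from types import SimpleNamespace
from typing import Any, Iterator

# Hypothetical stand-in for module-level flags such as litellm.force_ipv4.
settings = SimpleNamespace(force_ipv4=False, disable_aiohttp_transport=False)


@contextmanager
def override_flags(target: Any, **overrides: Any) -> Iterator[None]:
    """Temporarily set attributes on `target`, restoring them on exit."""
    saved = {name: getattr(target, name) for name in overrides}
    try:
        for name, value in overrides.items():
            setattr(target, name, value)
        yield
    finally:
        # Runs even if the test body raises, so later tests see the originals.
        for name, value in saved.items():
            setattr(target, name, value)


with override_flags(settings, force_ipv4=True, disable_aiohttp_transport=True):
    inside = (settings.force_ipv4, settings.disable_aiohttp_transport)

after = (settings.force_ipv4, settings.disable_aiohttp_transport)
```

A plain `try`/`finally` inside each test achieves the same result. pytest's built-in `monkeypatch` fixture (`monkeypatch.setattr`) is another option, since it undoes changes automatically at teardown.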
Merged commit d907a81 into litellm_internal_dev_03_14_2026
Summary

Failure Path (Before Fix)

CI unit tests consistently hang at ~98% completion on `test_aiohttp_handler_cleanup`. Two root causes:

- `count_aiohttp_sessions()` in `test_gemini_session_leak.py` calls `gc.get_objects()`, which traverses every object in the Python process. With xdist workers and hundreds of loaded test modules, this means millions of `isinstance()` checks, causing the test to hang in CI.
- `test_force_ipv4_transport` makes a real HTTP request to `http://example.com` with no timeout, which can also hang when the network is slow or restricted.

Two further problems:

- Several tests leak `aiohttp.ClientSession` and transport objects, causing potential hangs during teardown.
- Some tests mutate `litellm.force_ipv4` and `litellm.disable_aiohttp_transport` without restoring them.

Fix

test_gemini_session_leak.py:

- Replace `gc.get_objects()` session-counting with direct `session.closed` assertions. The original tests verified that the session count didn't increase after cleanup. The new tests cover the same cleanup behavior by checking that `session.closed` is `True` after `__del__` / `close_litellm_async_clients()`. This is a stronger assertion (it checks the specific session, not a global count) and avoids GC traversal entirely.
- Fix the `__main__` block: the refactored test returns `None`, so `0 if success else 1` always exited with code 1.

test_http_handler.py:

- Remove the real HTTP call to `example.com` in `test_force_ipv4_transport`. The test's purpose is to verify that `_create_async_transport()` returns an `httpx.AsyncHTTPTransport` when `force_ipv4=True`; the HTTP round-trip added no coverage for that.
- Close `litellm_async_client` (`AsyncHTTPHandler`) in `test_ssl_verification_with_aiohttp_transport`. Previously only the manually created `aiohttp_session` was closed.
- Add `finally` cleanup blocks for transports/handlers in `test_aiohttp_transport_trust_env_setting`, `test_ssl_security_level`, and `test_ssl_context_transport`.
- Save/restore `litellm.force_ipv4` and `litellm.disable_aiohttp_transport` in `test_force_ipv4_transport` and `test_aiohttp_disabled_transport`.

Testing

All 94 tests in `tests/test_litellm/llms/custom_httpx/` pass locally.

Type

🐛 Bug Fix
✅ Test