[Fix] Hanging CI Tests in custom_httpx test_http_handler #23674

Merged
yuneng-jiang merged 3 commits into litellm_internal_dev_03_14_2026 from litellm_fix_hanging_httpx_tests
Mar 15, 2026

@yuneng-jiang (Contributor) commented Mar 15, 2026

Summary

Failure Path (Before Fix)

CI unit tests consistently hang at ~98% completion on test_aiohttp_handler_cleanup. Four issues contribute:

  1. count_aiohttp_sessions() in test_gemini_session_leak.py calls gc.get_objects(), which traverses every object in the Python process. With xdist workers and hundreds of loaded test modules, this means millions of isinstance() checks — causing the test to hang in CI.
  2. test_force_ipv4_transport makes a real HTTP request to http://example.com with no timeout, which can also hang when the network is slow or restricted.
  3. Several tests leak unclosed aiohttp.ClientSession and transport objects, causing potential hangs during teardown.
  4. Tests mutate litellm.force_ipv4 and litellm.disable_aiohttp_transport without restoring them.
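
To illustrate the first cause, here is a minimal sketch of a counting helper of the kind described. The body of count_sessions_via_gc is an assumption (the removed count_aiohttp_sessions() is not shown in this PR), and FakeSession is a hypothetical stand-in for aiohttp.ClientSession so the snippet runs with the standard library only. The point is that one call visits every GC-tracked object in the process, so its cost depends on process size rather than on the number of sessions.

```python
import gc


class FakeSession:
    """Hypothetical stand-in for aiohttp.ClientSession (stdlib-only sketch)."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def count_sessions_via_gc():
    """Count open sessions by scanning every GC-tracked object.

    Assumed shape of the removed helper: one isinstance() check per
    tracked object, so the cost grows with the size of the whole process.
    """
    return sum(
        1
        for obj in gc.get_objects()
        if isinstance(obj, FakeSession) and not obj.closed
    )


session = FakeSession()
tracked_objects = len(gc.get_objects())  # objects a single call must visit
open_sessions = count_sessions_via_gc()
print(f"visited about {tracked_objects} objects to find {open_sessions} open session(s)")
```

Even in a fresh interpreter the visited-object count is in the thousands; in an xdist worker with many test modules imported it is far larger, which matches the hang described above.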

Fix

test_gemini_session_leak.py:

  • Replace gc.get_objects() session-counting with direct session.closed assertions. The original tests verified that the global session count did not increase after cleanup; the new tests verify the same cleanup behavior by checking that session.closed is True after __del__ / close_litellm_async_clients(). This is a stronger assertion (it checks the specific session, not a global count) and avoids GC traversal entirely.
  • Fix __main__ block: the refactored test returns None, so 0 if success else 1 always exited with code 1.
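
The two changes above can be sketched as follows. This is not the code from test_gemini_session_leak.py: FakeSession, FakeHandler, check_handler_cleanup, and run_as_script are hypothetical names used only to show the pattern of asserting on a held session reference, and why a runner written as 0 if success else 1 breaks once the test returns None.

```python
class FakeSession:
    """Hypothetical stand-in for aiohttp.ClientSession."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class FakeHandler:
    """Hypothetical stand-in for a handler that owns one session."""

    def __init__(self):
        self.session = FakeSession()

    def close(self):
        self.session.close()


def check_handler_cleanup():
    """Assert on the specific session instead of a process-wide count."""
    handler = FakeHandler()
    session = handler.session  # hold a direct reference to the session
    assert session.closed is False
    handler.close()
    assert session.closed is True
    # No return value: success means no AssertionError was raised.


# Old runner logic: the test now returns None, which is falsy, so this is 1.
buggy_exit_code = 0 if check_handler_cleanup() else 1


def run_as_script():
    """Derive the exit code from exceptions rather than the return value."""
    try:
        check_handler_cleanup()
    except AssertionError:
        return 1
    return 0


fixed_exit_code = run_as_script()
```

Here buggy_exit_code is 1 even though every assertion passed, while fixed_exit_code is 0.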

test_http_handler.py:

  • Remove real HTTP call to example.com in test_force_ipv4_transport. The test's purpose is to verify _create_async_transport() returns an httpx.AsyncHTTPTransport when force_ipv4=True — the HTTP round-trip added no coverage for that.
  • Close litellm_async_client (AsyncHTTPHandler) in test_ssl_verification_with_aiohttp_transport — previously only the manually-created aiohttp_session was closed.
  • Add finally cleanup blocks for transports/handlers in test_aiohttp_transport_trust_env_setting, test_ssl_security_level, and test_ssl_context_transport.
  • Save/restore litellm.force_ipv4 and litellm.disable_aiohttp_transport in test_force_ipv4_transport and test_aiohttp_disabled_transport.
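
One way to express the save/restore step is a small context manager. The PR does not show its exact code, so this is only a sketch under stated assumptions: fake_litellm is a hypothetical stand-in for the litellm module, and override_flags is an illustrative helper, not part of litellm or the test suite.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Hypothetical stand-in for the litellm module's global flags.
fake_litellm = SimpleNamespace(force_ipv4=False, disable_aiohttp_transport=False)


@contextmanager
def override_flags(module, **overrides):
    """Temporarily set attributes on `module`, restoring them on exit."""
    saved = {name: getattr(module, name) for name in overrides}
    try:
        for name, value in overrides.items():
            setattr(module, name, value)
        yield module
    finally:
        # Runs even if the test body raises, so later tests see the old values.
        for name, value in saved.items():
            setattr(module, name, value)


with override_flags(fake_litellm, force_ipv4=True, disable_aiohttp_transport=True):
    flags_inside = (fake_litellm.force_ipv4, fake_litellm.disable_aiohttp_transport)

flags_after = (fake_litellm.force_ipv4, fake_litellm.disable_aiohttp_transport)
```

In a pytest suite, the built-in monkeypatch fixture (monkeypatch.setattr) gives the same restore-on-teardown behavior without a custom helper.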

Testing

All 94 tests in tests/test_litellm/llms/custom_httpx/ pass locally.

Type

🐛 Bug Fix
✅ Test

- Remove real HTTP call to example.com in test_force_ipv4_transport
  (hangs in CI when network is slow/unavailable)
- Close leaked aiohttp.ClientSession in test_ssl_verification_with_aiohttp_transport
- Add cleanup for transports in test_aiohttp_transport_trust_env_setting
- Add cleanup for handler in test_ssl_security_level
- Add cleanup for transport in test_ssl_context_transport

Co-Authored-By: Claude Opus 4.6 <[email protected]>

@greptile-apps bot (Contributor) commented Mar 15, 2026

Greptile Summary

This PR fixes CI test hangs in tests/test_litellm/llms/custom_httpx/ caused by three root issues: a slow gc.get_objects() traversal, a real unbounded HTTP request to example.com, and leaked aiohttp sessions/transports. The fixes are generally correct and well-targeted, and the changes align with the no-real-network-calls rule for this test folder.

Key changes:

  • test_gemini_session_leak.py: Replaces gc.get_objects() counting with direct session.closed assertions — same coverage, no GC overhead. Verbose print statements are removed and the standalone main() runner is cleaned up.
  • test_http_handler.py: Removes the real http://example.com request from test_force_ipv4_transport; adds finally blocks with await client.close() / await transport.aclose() to test_ssl_security_level, test_ssl_context_transport, test_ssl_verification_with_aiohttp_transport, and test_aiohttp_transport_trust_env_setting; saves and restores litellm global flags in test_force_ipv4_transport and test_aiohttp_disabled_transport.
  • Two resource-cleanup gaps remain: test_force_ipv4_transport creates an httpx.AsyncHTTPTransport without closing it, and test_ssl_context_transport's finally only handles the LiteLLMAiohttpTransport branch.
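
The finally-based cleanup mentioned above can be sketched like this. FakeClient is a hypothetical stand-in for a handler with an async close() method, and run_check is an illustrative name; the actual test bodies are not shown in this PR. The sketch demonstrates that the resource is closed whether or not the assertion in the try body passes.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in for a handler exposing an async close()."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


async def run_check(client, should_fail):
    """Close the client in `finally` so cleanup survives a failed assertion."""
    try:
        assert not should_fail, "simulated test failure"
    finally:
        await client.close()


async def main():
    passing, failing = FakeClient(), FakeClient()
    await run_check(passing, should_fail=False)
    try:
        await run_check(failing, should_fail=True)
    except AssertionError:
        pass  # the failure still propagates; cleanup has already run
    return passing.closed, failing.closed


closed_flags = asyncio.run(main())
```

Both entries of closed_flags are True: the failing check still closed its client before the AssertionError reached the caller.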

Confidence Score: 4/5

  • Safe to merge with minor follow-up: two small resource-cleanup gaps in test teardown remain after this fix.
  • Changes are test-only with no production code impact. The root causes (GC traversal and unbounded real HTTP call) are correctly identified and fixed. Two remaining transport cleanup gaps in test_force_ipv4_transport and test_ssl_context_transport are minor and won't cause CI hangs, only potential unclosed-resource warnings.
  • tests/test_litellm/llms/custom_httpx/test_http_handler.py — specifically test_force_ipv4_transport (lines 67-74) and test_ssl_context_transport (lines 86-96)

Important Files Changed

Filename | Overview
tests/test_litellm/llms/custom_httpx/test_gemini_session_leak.py | Removes the GC-traversal count_aiohttp_sessions() function and replaces it with direct session.closed assertions, which is substantially simpler and avoids the CI hang. Minor concern: test_atexit_cleanup mutates the global litellm.base_llm_aiohttp_handler session, which may affect later tests.
tests/test_litellm/llms/custom_httpx/test_http_handler.py | Adds finally cleanup blocks and state restoration to several tests; removes the real HTTP call to example.com. However, test_force_ipv4_transport still leaks an unclosed httpx.AsyncHTTPTransport, and test_ssl_context_transport's finally block only handles the LiteLLMAiohttpTransport case, leaving non-aiohttp transports unclosed.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[CI Test Run] --> B{test_aiohttp_handler_cleanup}
    B --> B1[Create BaseLLMAIOHTTPHandler]
    B1 --> B2[Get session reference]
    B2 --> B3[Assert session.closed == False]
    B3 --> B4[del handler + gc.collect]
    B4 --> B5[await asyncio.sleep 0.1]
    B5 --> B6[Assert session.closed == True]

    A --> C{test_atexit_cleanup}
    C --> C1[Access global litellm.base_llm_aiohttp_handler]
    C1 --> C2[Get session reference]
    C2 --> C3[Assert session.closed == False]
    C3 --> C4[Call close_litellm_async_clients]
    C4 --> C5[Assert session.closed == True]
    C5 --> C6["⚠️ Global handler session now closed"]

    A --> D{test_force_ipv4_transport}
    D --> D1[Save + set litellm flags]
    D1 --> D2[_create_async_transport]
    D2 --> D3[Assert httpx.AsyncHTTPTransport]
    D3 --> D4["⚠️ Transport NOT closed in finally"]
    D4 --> D5[Restore litellm flags]

    A --> E{test_ssl_context_transport}
    E --> E1[_create_async_transport with ssl_context]
    E1 --> E2{isinstance LiteLLMAiohttpTransport?}
    E2 -->|Yes| E3[Assert connector._ssl not None]
    E3 --> E4[await transport.aclose]
    E2 -->|No| E5["⚠️ Transport NOT closed"]

Comments Outside Diff (3)

  1. tests/test_litellm/llms/custom_httpx/test_http_handler.py, lines 67-74

    Unclosed transport resource leak

    The PR fixes resource leaks in other tests but misses this one. AsyncHTTPHandler._create_async_transport() with force_ipv4=True and disable_aiohttp_transport=True returns an httpx.AsyncHTTPTransport, which wraps an httpcore.AsyncConnectionPool. Not closing it will still trigger "unclosed resource" warnings in CI, which is the root problem this PR is trying to solve.

  2. tests/test_litellm/llms/custom_httpx/test_http_handler.py, lines 86-96

    Potential transport leak for non-aiohttp transport path

    Both the try body and the finally block are guarded by isinstance(transport, LiteLLMAiohttpTransport). If disable_aiohttp_transport happens to be True in the test environment, _create_async_transport(ssl_context=...) may return an httpx.AsyncHTTPTransport instead, and the finally block skips closing it. Consider closing any non-None transport regardless of type.

  3. tests/test_litellm/llms/custom_httpx/test_gemini_session_leak.py, lines 40-57

    Global handler session left closed after test

    test_atexit_cleanup calls close_litellm_async_clients() on the real litellm.base_llm_aiohttp_handler (the module-global singleton). After this test runs, that handler's session is permanently closed. Any test that subsequently calls into litellm.base_llm_aiohttp_handler._get_async_client_session() (e.g., a Gemini completion test) will either get an error or trigger an implicit re-creation depending on the implementation.

    Depending on whether BaseLLMAIOHTTPHandler lazily re-creates its session after a close, this may cause silent failures in later tests. Consider either:

    1. Saving and restoring the handler reference around the test, or
    2. Creating a fresh BaseLLMAIOHTTPHandler() instead of using the global singleton, to avoid cross-test contamination.
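
For the second comment (the non-aiohttp transport path), the difference between a type-guarded cleanup and a type-agnostic one can be sketched as below. The two transport classes are hypothetical stand-ins for LiteLLMAiohttpTransport and httpx.AsyncHTTPTransport, and the sketch assumes both expose an async aclose(), as httpx's async transports do. It is not the reviewer's original suggestion, which is not shown on this page.

```python
import asyncio


class _FakeTransport:
    """Shared base for the hypothetical transports below."""

    def __init__(self):
        self.closed = False

    async def aclose(self):
        self.closed = True


class FakeAiohttpTransport(_FakeTransport):
    """Hypothetical stand-in for LiteLLMAiohttpTransport."""


class FakeHttpxTransport(_FakeTransport):
    """Hypothetical stand-in for httpx.AsyncHTTPTransport."""


async def close_guarded(transport):
    """Cleanup as described in the comment: only one type gets closed."""
    if isinstance(transport, FakeAiohttpTransport):
        await transport.aclose()


async def close_any(transport):
    """Type-agnostic cleanup: close whatever was created, if anything."""
    if transport is not None:
        await transport.aclose()


async def main():
    guarded, unguarded = FakeHttpxTransport(), FakeHttpxTransport()
    await close_guarded(guarded)  # skipped: not the aiohttp type
    await close_any(unguarded)    # closed regardless of type
    await close_any(None)         # safe when creation failed
    return guarded.closed, unguarded.closed


guarded_closed, unguarded_closed = asyncio.run(main())
```

With the guard, the httpx-style transport stays open (guarded_closed is False); without it, the transport is closed (unguarded_closed is True).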

Last reviewed commit: 460f620

count_aiohttp_sessions() iterates every object in the Python GC,
which hangs in CI when xdist workers have millions of loaded objects.
Replace with direct session.closed checks — same coverage, no hang.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Fix __main__ block: test returns None now, so always exited 1
- Close litellm_async_client in test_ssl_verification_with_aiohttp_transport
- Save/restore litellm.force_ipv4 and litellm.disable_aiohttp_transport
  in test_force_ipv4_transport and test_aiohttp_disabled_transport

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@yuneng-jiang yuneng-jiang merged commit d907a81 into litellm_internal_dev_03_14_2026 Mar 15, 2026
35 of 54 checks passed
@ishaan-berri ishaan-berri deleted the litellm_fix_hanging_httpx_tests branch March 26, 2026 22:29