Skip to content

Fix unclosed 'MemoryObjectReceiveStream' upon exception in 'BaseHTTPMiddleware' children#2813

Merged
Kludex merged 11 commits intomasterfrom
fix-resource-warning-anyio
Dec 29, 2024
Merged

Fix unclosed 'MemoryObjectReceiveStream' upon exception in 'BaseHTTPMiddleware' children#2813
Kludex merged 11 commits intomasterfrom
fix-resource-warning-anyio

Conversation

@Kludex
Copy link
Copy Markdown
Owner

@Kludex Kludex commented Dec 26, 2024

I open a different PR to remove the close_recv_stream_on_response_sent. I'm not sure that's why the test is failing...

…iddleware' children

Co-authored-by: Thomas Grainger <[email protected]>
Co-authored-by: Nikita Gashkov <[email protected]>
Comment on lines -133 to -136
async def close_recv_stream_on_response_sent() -> None:
await response_sent.wait()
recv_stream.close()

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kludex Kludex requested a review from graingert December 26, 2024 09:18
@Kludex
Copy link
Copy Markdown
Owner Author

Kludex commented Dec 26, 2024

Well... I'm very confused as to why the current test is failing. 😆

@graingert
Copy link
Copy Markdown
Contributor

graingert commented Dec 26, 2024

hmm, 3.8 is EOL so probably getting an old anyio?

edit 1: I ran the test suite locally and got the failure once, running the suite again 3 more times did not reproduce it

edit 2: I deleted my pyc files, ran and it passed. I deleted my pyc files again and recreated my venv and it passed

@Kludex
Copy link
Copy Markdown
Owner Author

Kludex commented Dec 26, 2024

The flakiness worries me. Do you know why? Even if it's an old version of anyio.

@graingert
Copy link
Copy Markdown
Contributor

I'm going to try running it 1000x

@graingert
Copy link
Copy Markdown
Contributor

for future reference in case the CI logs expire, in python3.8 CI the failure is:

=================================== FAILURES ===================================
_______________ test_websocket_endpoint_on_receive_text[asyncio] _______________

test_client_factory = functools.partial(<class 'starlette.testclient.TestClient'>, backend='asyncio', backend_options={})

    def test_websocket_endpoint_on_receive_text(
        test_client_factory: TestClientFactory,
    ) -> None:
        class WebSocketApp(WebSocketEndpoint):
            encoding = "text"
    
            async def on_receive(self, websocket: WebSocket, data: str) -> None:
                await websocket.send_text(f"Message text was: {data}")
    
        client = test_client_factory(WebSocketApp)
        with client.websocket_connect("/ws") as websocket:
            websocket.send_text("Hello, world!")
            _text = websocket.receive_text()
            assert _text == "Message text was: Hello, world!"
    
        with pytest.raises(RuntimeError):
            with client.websocket_connect("/ws") as websocket:
>               websocket.send_bytes(b"Hello world")
E               Failed: DID NOT RAISE <class 'RuntimeError'>

tests/test_endpoints.py:134: Failed
================== 1 failed, 878 passed, 4 xfailed in 13.38s ===================

@graingert
Copy link
Copy Markdown
Contributor

graingert commented Dec 26, 2024

ah I think it's a race condition here:

https://github.com/encode/starlette/blob/29b9c42273ad5e3a4a8120b3a1a8f7b578be894a/starlette/testclient.py#L132-L136

the other thread could be busy doing stuff moments before trying to put a message on the queue, so the queue appears empty when there's a message pending being put

You need to use queue.shutdown() 3.13+ or send an explicit EOF object when the queue is done with

@graingert
Copy link
Copy Markdown
Contributor

I think this is a pre-existing issue, just VERY rare

Comment on lines +88 to +93
class _Eof(enum.Enum):
EOF = enum.auto()


EOF: typing.Final = _Eof.EOF
Eof = typing.Literal[_Eof.EOF]
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem this solves is not related to this PR then?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes probably best to split into two PRs and add a changelog entry for the lost shutdown exception issue

@graingert
Copy link
Copy Markdown
Contributor

graingert commented Dec 26, 2024

omg it's still possible to fail!?

@Kludex
Copy link
Copy Markdown
Owner Author

Kludex commented Dec 26, 2024

Hmmm... Then it's not the queue shutdown thingy?

@graingert
Copy link
Copy Markdown
Contributor

rather than continue to hijack your PR I've opened a new one #2814

@graingert
Copy link
Copy Markdown
Contributor

truly bizarre

@Kludex Kludex merged commit 5a10fba into master Dec 29, 2024
@Kludex Kludex deleted the fix-resource-warning-anyio branch December 29, 2024 12:32
@Kludex
Copy link
Copy Markdown
Owner Author

Kludex commented Dec 29, 2024

I'll make a release when I get home.

@linbrandon4
Copy link
Copy Markdown

Good catch/fix. Closing MemoryObjectReceiveStream when a middleware child throws is one of those annoying “only shows up under exceptions / CI” issues, and it makes sense to handle it in BaseHTTPMiddleware so you don’t leak streams and trigger resource warnings.

The review thread about the flaky websocket test was useful too. It really does look like there was a separate race in the testclient/queue shutdown path (rare, but real), so I like that it got split into a different PR instead of mixing concerns here.

Only things I’d double-check:

  • the stream gets closed on all exit paths (normal response, exception, cancellation)
  • we’re not accidentally masking the original exception (close should be best-effort)
  • tests cover the exception path well enough so this doesn’t regress later

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants