Memory usage streaming large responses #1012
Description
We've been running into memory issues when providing very large async generators to a streaming response. We have these generators producing large (larger than memory set) responses in a way that allows us to only keep small chunks in memory at a time. However, it looks like the BaseHTTPMiddleware implementation uses an asyncio queue to store the individual chunks:
This prevents any network backpressure handling -- if the client receiving the streaming response is on a slow connection, the queue will happily grow without bound and consume all memory, triggering the kernel's out-of-memory killer. The ideal handling here would be for send to block (yield) when the client falls behind, which I believe would happen naturally if there were no queue here at all, so I am wondering why it needs to be here?
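To illustrate the backpressure point, here is a minimal asyncio sketch (not Starlette internals): an unbounded `asyncio.Queue` lets a fast producer buffer every chunk in memory, whereas a bounded queue makes `put` await the consumer, which is exactly the blocking behavior described above.

```python
import asyncio


async def demo() -> tuple[int, int]:
    # Case 1: unbounded queue, as used by the middleware.
    # put() never blocks, so a producer that outruns the consumer
    # accumulates every chunk in memory.
    unbounded: asyncio.Queue = asyncio.Queue()
    for _ in range(1000):
        await unbounded.put(b"chunk")  # never waits
    unbounded_depth = unbounded.qsize()  # all 1000 chunks buffered

    # Case 2: bounded queue (maxsize=1) -- put() suspends until the
    # consumer takes the chunk, so producer and consumer run in
    # lock-step and memory stays flat. This is the backpressure that
    # awaiting send() directly would provide.
    bounded: asyncio.Queue = asyncio.Queue(maxsize=1)
    consumed = 0

    async def consumer() -> None:
        nonlocal consumed
        while True:
            chunk = await bounded.get()
            if chunk is None:  # sentinel: producer is done
                break
            consumed += 1

    async def producer() -> None:
        for _ in range(1000):
            await bounded.put(b"chunk")  # blocks when the queue is full
        await bounded.put(None)

    await asyncio.gather(producer(), consumer())
    return unbounded_depth, consumed


unbounded_depth, consumed = asyncio.run(demo())
print(unbounded_depth, consumed)
```

With the unbounded queue, all 1000 chunks sit in memory at once; with the bounded one, at most one chunk is buffered at any moment while the same 1000 chunks flow through.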
Would a PR to remove the queueing be accepted?
If not, what is the appropriate way to override this to not use a queue? We can write our own, but the use of BaseHTTPMiddleware is hardcoded: https://github.com/encode/starlette/blob/519f5750b5e797bb3d4805fd29657674304ce397/starlette/applications.py#L197, leaving only some fairly hacky approaches to preventing this queueing.
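For reference, one workaround we've considered is skipping BaseHTTPMiddleware entirely and writing a pure ASGI middleware. The sketch below (class and helper names are hypothetical) passes the server's own `send` callable through to the downstream app, so each body chunk is awaited by the transport and slow clients apply backpressure naturally, with no intermediate queue:

```python
import asyncio


class NoQueueMiddleware:
    """Pure ASGI middleware sketch: no chunk buffering.

    Because the wrapped app awaits the transport's real send() for
    every message, flow control propagates from the client connection
    all the way back to the response generator.
    """

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        async def wrapped_send(message):
            # Inspect or modify messages here as needed; awaiting the
            # real send keeps the transport's flow control intact.
            await send(message)

        await self.app(scope, receive, wrapped_send)


# Tiny in-memory check with a fake ASGI app and a recording transport.
async def fake_app(scope, receive, send):
    await send({"type": "http.response.start", "status": 200})
    await send({"type": "http.response.body", "body": b"chunk"})


async def run_demo():
    received = []

    async def transport_send(message):
        received.append(message)

    async def transport_receive():
        return {"type": "http.request"}

    mw = NoQueueMiddleware(fake_app)
    await mw({"type": "http"}, transport_receive, transport_send)
    return received


messages = asyncio.run(run_demo())
```

The trade-off is losing the Request/Response convenience objects that BaseHTTPMiddleware provides, but for streaming bodies that indirection is exactly what introduces the queue.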