make HTTPResponse.stream use read1 when amt=None#3216
make HTTPResponse.stream use read1 when amt=None#3216smason wants to merge 4 commits intourllib3:mainfrom
Conversation
|
@illia-v A hopefully simple followup to my last PR. This makes the actual change to the behavior of Not sure why all those tests timed out! I rebased to main branch before making this, I see there's been a bit of a rework of the test infrastructure recently could that be related? |
|
@saschpe thanks! The test failures look to be a result of this PR. The case is pretty similar to hanging you described in python/cpython#112064. urllib3/src/urllib3/response.py Line 1022 in 4ae4b71 |
f6fa97b to
043edc8
Compare
|
@illia-v I was only running the few tests I cared about locally as the whole suite takes a long time and missed that unintended breakage, should be fixed now. I tried to rework while data := fp.read(amt):
yield databut this raises on the final read due to the response having been implicitly closed in the Am wondering whether this is worth trying to clean up closed handling separately? |
src/urllib3/response.py
Outdated
| if not data: | ||
| break |
There was a problem hiding this comment.
When the loop is broken here after read1 was used, self._fp remains not closed even though all data has been read. Closing it may be a better way to break the loop. What do you think about checking self.length_remaining after calling read1 and, if it is 0 and not self.closed, calling self._fp.close()?
There was a problem hiding this comment.
When the loop is broken here after
read1was used,self._fpremains not closed even though all data has been read.
Hadn't realised the semantics around what gets closed and when, have done a bit more reading now. Have had a bit of a refactor of the change.
Closing it may be a better way to break the loop. What do you think about checking
self.length_remainingafter callingread1and, if it is 0 andnot self.closed, callingself._fp.close()?
It feels as though it could raise an IncompleteRead if it finishes and length_remaining indicates there's missing data (as _raw_read does), any preferences on this / other changes
There was a problem hiding this comment.
It feels as though it could raise an
IncompleteReadif it finishes andlength_remainingindicates there's missing data (as_raw_readdoes), any preferences on this / other changes
Good point!
Looking at CPython's code, it's possible to get in the situation at least if self._fp is closed between read1 calls. However, you have to respect self.enforce_content_length as _raw_read does.
043edc8 to
24fb044
Compare
24fb044 to
83e4ab8
Compare
|
rebased due to all the bumping in dependencies and test changes |
src/urllib3/response.py
Outdated
| if not data: | ||
| break |
There was a problem hiding this comment.
It feels as though it could raise an
IncompleteReadif it finishes andlength_remainingindicates there's missing data (as_raw_readdoes), any preferences on this / other changes
Good point!
Looking at CPython's code, it's possible to get in the situation at least if self._fp is closed between read1 calls. However, you have to respect self.enforce_content_length as _raw_read does.
9417e22 to
4ce6a9b
Compare
|
I wanted to check in on this PR. It looks like the Python landscape and urllib3 have changed a bit since there was last discussion. I'm seeing the previously outstanding concerns as: Raising IncompleteRead correctly: Handled in this PR Is there anything else outstanding I'm missing? |
|
@nateprewitt IIRC this was blocked by python/cpython#115998 |
Following up on #3186 (and suggested in #2125) this uses the new
read1method when streaming a non-chunked response andamt=None.This would mean that psf/requests#5536 could also be closed.