[275] Parse JSON lines ourselves to avoid aiohttp's "Line is too long" #276
nolar merged 4 commits into zalando-incubator:master
Conversation
🤖 zincr found 0 problems, 0 warnings

🤖 zincr found 1 problem, 0 warnings

Details on how to resolve are provided below.

Approvals: All proposed changes must be reviewed by project maintainers before they can be merged. Not enough people have approved this pull request - please ensure that 1 additional user, who has not contributed to this pull request, approves the changes.
Thanks a lot for a quick patch!
PS: Releasing it as a 0.23.3 hotfix on top of 0.23.2 is not possible, as the CI/CD of that old state is broken due to new releases of the external components (#272 #269). So, this fix goes on top of master, with all the changes that are already there, and will be released as 0.24. Preview of 0.24:
Otherwise, it freezes at the real cluster (example 99) because an empty line is yielded and cannot be parsed.
@nolar seems to work fine with us! Eagerly waiting for the 0.24 release now ...

Parse the JSON lines of Kubernetes watch-streams with no memory limit.
Description
Kubernetes can send JSON lines with huge objects. E.g., secrets of 2MB in #275.
aiohttp has a hard-coded limit on the lines received from the stream: 128 KB (aiohttp.streams.DEFAULT_LIMIT = 2**16 = 64 KB for the low watermark of the buffer, doubled for the high watermark, where the exception is raised). There is no way to externally configure the buffer size of aiohttp's StreamReader, just as there is no way to inject our own StreamReader instances with limit= properly set. The docs say:

For this reason, we implement our own per-line iterator with no memory control: it takes as much memory as is needed. Though, we try not to waste memory by not keeping duplicate copies of each line in the buffer.
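As a rough sketch of the idea (not the actual kopf implementation; the names `iter_lines` and the chunk source are illustrative assumptions), such an unlimited per-line iterator can buffer arbitrary-size chunks and split on newlines itself, skipping the empty lines that would otherwise stall the consumer:

```python
import asyncio

async def iter_lines(chunks):
    """Re-assemble complete lines from arbitrary-size byte chunks.

    Unlike a bounded readline(), the buffer here has no upper limit:
    it grows as much as the longest line needs. Empty lines are skipped.
    """
    buffer = b''
    async for chunk in chunks:
        buffer += chunk
        while b'\n' in buffer:
            line, buffer = buffer.split(b'\n', 1)
            if line:
                yield line
    if buffer:  # the last line may lack a trailing newline
        yield buffer

async def demo():
    async def chunks():
        # A huge line split across two chunks, an empty line, and a tail.
        yield b'{"kind": "Secret", "data": "' + b'x' * 200_000
        yield b'"}\n\n{"kind": "Pod"}'
    return [line async for line in iter_lines(chunks())]

lines = asyncio.run(demo())
```

Here, the first yielded line is far larger than aiohttp's 128 KB cap, yet it is re-assembled without any error, and the empty line between the two objects is silently dropped.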
The efficiency is not tested or measured. Specifically, it can be slow if there are hundreds or thousands of such huge resources, or if events happen very often. But we assume it is sufficient for now, and it can be improved later once performance is measured and optimised.
FYI: The provided test clearly shows how the pure aiohttp approach fails with "Line is too long" when used without this wrapping iterator.
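For illustration only (this is a stand-in mimicking the behaviour, not aiohttp's actual code), a limit-bounded readline refuses any line longer than its cap, which is exactly why a ~2 MB secret serialized on one line trips the error:

```python
def limited_readline(data: bytes, limit: int = 2 ** 17) -> bytes:
    """Mimic a bounded readline: reject lines longer than `limit` bytes.

    Illustrative stand-in for the behaviour described above,
    not aiohttp's implementation.
    """
    newline = data.find(b'\n')
    length = newline + 1 if newline != -1 else len(data)
    if length > limit:
        raise ValueError('Line is too long')
    return data[:length]

huge = b'{"data": "' + b'x' * (2 ** 21) + b'"}\n'  # a ~2 MB one-line object
try:
    limited_readline(huge)
    error = None
except ValueError as exc:
    error = str(exc)
```

A short line passes through unharmed, while the huge one raises before it can ever be parsed; the wrapping iterator from this PR avoids the cap entirely.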
Types of Changes
Review
List of tasks the reviewer must do to review the PR