[275] Parse JSON lines ourselves to avoid aiohttp's "Line is too long" by nolar · Pull Request #276 · zalando-incubator/kopf

nolar · 2019-12-16T18:49:25Z

Parse the JSON lines of Kubernetes watch-streams with no memory limit.

Issue: #275

Description

Kubernetes can send JSON lines with huge objects. E.g., secrets of 2MB in #275.

aiohttp has a hard-coded limit on the length of lines received from the stream: 128 KB. (aiohttp.streams.DEFAULT_LIMIT = 2**16 = 64 KB is the low watermark of the buffer; it is multiplied by 2 for the high watermark, at which the exception is raised.)
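To make the arithmetic concrete, here is a tiny sketch of the limit as described above. The constants mirror the values quoted in this description rather than being imported from aiohttp, and the helper name `fits_in_line_limit` and the exact boundary comparison are assumptions for illustration only.

```python
# Values as quoted in the description; not imported from aiohttp.
LOW_WATERMARK = 2 ** 16              # 65536 bytes = 64 KB
HIGH_WATERMARK = 2 * LOW_WATERMARK   # 131072 bytes = 128 KB


def fits_in_line_limit(line: bytes, limit: int = HIGH_WATERMARK) -> bool:
    """Hypothetical check: would a line of this size stay within the limit?

    Whether the real check is strict or inclusive at the boundary is an
    assumption here; only the order of magnitude matters for this issue.
    """
    return len(line) <= limit


# A 2 MB secret, as in #275, is about 16 times larger than the limit.
two_mb_line = b"x" * (2 * 1024 * 1024)
secret_fits = fits_in_line_limit(two_mb_line)
```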

There is no way to externally configure the buffer size of aiohttp's StreamReader, nor is there a way to inject our own StreamReader instances with limit= properly set. The docs say:

User should never instantiate streams manually but use existing aiohttp.web.Request.content and aiohttp.ClientResponse.content properties for accessing raw BODY data.

For this reason, we implement our own per-line iterator with no memory control: it takes as much memory as it needs. Still, we try not to waste memory, and avoid keeping more than one copy of any line in the buffer.
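A minimal sketch of such an unbounded per-line iterator is shown here for illustration; it is not the code from this PR. It assumes the stream is available as an async iterator of byte chunks (with aiohttp, something like `response.content.iter_chunked(n)` could play that role). The names `iter_lines`, `_replay`, and `read_all` are hypothetical.

```python
import asyncio
from typing import AsyncIterator, Iterable, List


async def iter_lines(chunks: AsyncIterator[bytes]) -> AsyncIterator[bytes]:
    """Yield lines (without the trailing newline) from arbitrary byte chunks."""
    parts: List[bytes] = []  # pieces of the current, still incomplete line
    async for chunk in chunks:
        start = 0
        while True:
            end = chunk.find(b"\n", start)
            if end < 0:
                break
            parts.append(chunk[start:end])
            line = b"".join(parts)  # joined once, when the line is complete
            parts.clear()
            start = end + 1
            yield line
        if start < len(chunk):
            parts.append(chunk[start:])  # keep the unfinished tail for later
    if parts:
        yield b"".join(parts)  # final line that had no trailing newline


async def _replay(chunks: Iterable[bytes]) -> AsyncIterator[bytes]:
    """Hypothetical stand-in for a network stream: replays prepared chunks."""
    for chunk in chunks:
        yield chunk


def read_all(chunks: Iterable[bytes]) -> List[bytes]:
    """Convenience wrapper for trying the iterator out synchronously."""
    async def _run() -> List[bytes]:
        return [line async for line in iter_lines(_replay(chunks))]
    return asyncio.run(_run())
```

The pieces of an incomplete line are kept in a list and joined only once the newline arrives, so a huge line is not re-copied on every incoming chunk. There is deliberately no size check, so memory use is bounded only by the longest line.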

The efficiency has been neither tested nor measured. Specifically, it can be slow if there are hundreds or thousands of such huge resources, or if events happen very often. But we assume it is sufficient for now and can be improved later, once performance measurements and optimisations are done.

FYI: The provided test clearly shows how the pure aiohttp approach fails with "Line is too long" — if used without this wrapping iterator.

Types of Changes

Bug fix (non-breaking change which fixes an issue)

Review

List of tasks the reviewer must do to review the PR

Tests

zincr · 2019-12-16T18:49:36Z

🤖 zincr found 0 problems, 0 warnings

✅ Large Commits
✅ Approvals
✅ Specification
✅ Dependency Licensing

zincr · 2019-12-16T18:49:36Z

🤖 zincr found 1 problem, 0 warnings

❌ Approvals
✅ Large Commits
✅ Specification
✅ Dependency Licensing

Details on how to resolve are provided below

Approvals

All proposed changes must be reviewed by project maintainers before they can be merged

Not enough people have approved this pull request. Please ensure that 1 additional user, who has not contributed to this pull request, approves the changes.

✅ Approved by PR author @nolar
❌ 1 additional approval needed

pshchelo · 2019-12-16T19:04:49Z

Thanks a lot for a quick patch!
I will test it tomorrow on our actual use case and report back.

nolar · 2019-12-16T19:46:07Z

PS: Releasing it as 0.23.3 hotfix on top of 0.23.2 is not possible, as the CI/CD of that old state is broken due to new releases of the external components (#272 #269). So, this fix goes on top of the master — with all the changes that are already there — and will be released as 0.24.

Preview of 0.24:

pshchelo · 2019-12-17T14:35:09Z

@nolar seems to work fine with us! eagerly waiting for 0.24 release now ...

[275] Parse JSON lines ourselves to avoid aiohttp's "Line is too long"

9e492d4

nolar added the bug Something isn't working label Dec 16, 2019

nolar requested a review from samurang87 as a code owner December 16, 2019 18:49

[275] Reduce the memory footprint of the HTTP/JSON line parser

d2f8e77

nolar added 2 commits December 16, 2019 21:06

[275] Handle a case of no-content response and EmptyStreamReader

fd3e3fb

[275] Iterate as JSON-lines, not just as regular text lines (skip empty)

b5e6d21

Otherwise, it freezes on a real cluster (example 99) because an empty line is yielded and cannot be parsed.
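The idea behind this commit can be sketched as follows: treat the stream as JSON lines rather than plain text lines, so blank lines are skipped instead of being passed to the JSON parser. This is an illustrative sketch, not the code from the commit; `iter_jsonlines`, `_replay`, and `parse_all` are hypothetical names, and the input is assumed to be an async iterator of already split lines.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Iterable, List


async def iter_jsonlines(lines: AsyncIterator[bytes]) -> AsyncIterator[Any]:
    """Parse each non-empty line as JSON; silently skip blank lines."""
    async for line in lines:
        line = line.strip()  # drops surrounding whitespace, including "\r"
        if not line:
            continue  # an empty line is not a JSON document, so skip it
        yield json.loads(line)


async def _replay(lines: Iterable[bytes]) -> AsyncIterator[bytes]:
    """Hypothetical stand-in for the line stream: replays prepared lines."""
    for line in lines:
        yield line


def parse_all(lines: Iterable[bytes]) -> List[Any]:
    """Convenience wrapper for trying the parser out synchronously."""
    async def _run() -> List[Any]:
        return [event async for event in iter_jsonlines(_replay(lines))]
    return asyncio.run(_run())
```

Without the emptiness check, `json.loads(b"")` raises `json.JSONDecodeError`, which is the kind of failure an unfiltered empty line would cause.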

samurang87 approved these changes Dec 19, 2019

nolar mentioned this pull request Dec 19, 2019

"Line is too long" for JSON-lines from Kubernetes API aio-libs/aiohttp#4453

Closed

nolar merged commit 60038b5 into zalando-incubator:master Dec 19, 2019

nolar deleted the 275-line-too-long branch December 19, 2019 10:10

kopf-archiver bot mentioned this pull request Aug 19, 2020

[PR] [275] Parse JSON lines ourselves to avoid aiohttp's "Line is too long" nolar/kopf#276

Closed

1 task

nolar merged 4 commits into zalando-incubator:master from nolar:275-line-too-long
