Skip to content

Comments

Minidumps: PoC#270

Merged
vanschelven merged 34 commits intomainfrom
minidumps
Nov 15, 2025
Merged

Minidumps: PoC#270
vanschelven merged 34 commits intomainfrom
minidumps

Conversation

@vanschelven
Copy link
Contributor

PoC for minidumps; many TODOs remain, among which DOS-related ones.

for this reason, behind a setting that's default False

See #82

in preparation of the minidump handling through the envelope path,
which probably requires dealing with the whole envelope as a whole.

"theoretically" this might be less efficient, but [a] see the notes
at the top of the parser on how we think about streaming parsing and
[b] the inefficiencies are in the "immediate" path anyway (which has
the assumtion that it's not in high-performance envs).

"blind commit" (tests not run against this commit).
i.e. fix tests for 391e22b

(the changes in the present commit in `ingest/tests.py` are not strictly necessary
but they are principly right)
Although DIGEST_IMMEDIATELY=True is theoretically a nice thing to
have, the upkeep is not worth it now that we're about to introduce
minidump ingestion.

The only thing that you're saving is the round-trip via the filesystem,
but performance of that is negligable, and if you're configuring
DIGEST_IMMEDIATELY you're actually _not_ in the performance-critical path
anyway.

Getting rid of it _also_ harmonizes/reduces the number of paths to test.

It's approximately 1% of our installed base.
this makes this consistent with the work we did in the previous commit
at the price of being slightly more inefficient.

but it's a deprecated endpoint anyway
this was inlined in 7f831f5
envelope-based happy path only
(from reading the code, not actually double-checked right now)
works w/ sentry-client; no actual handling yet; TODOs in-code

See #82
(as per my notes; didn't recheck this when committing)
(ran into this b/c the native minidump upload uses chunked mode and
our impl. of that looks at request.body (via FILES))

See #9
….2 upgrade

this method wasn't changed upstream.

See #89
adds readline() method to GeneratorReader (ChatGPT-generated; eyeballed for
correctness) to match the Django FILES/POST handling expectations.
rather than try-and-recover, just look at the headers and show body/POST etc.
this avoids hard-to-reason about situations where either of those won't work
because the other has already been executed; in combination with reasoning
about max size usage the explicit solution is simply easier to reason about.

further:

* makes api_catch_all one of the content_encoding-ready views.
* implement a max length for the ingest api view
Plenty of TODOs left; but this proves we can find:

* file names
* function names
* line nos
* source context

See #82
mirrors how we show fetch it in `get_exception_type_and_value_for_exception`
(the now-removed 'treat as pid' was hallunicated by the bot; the
taken-from-sentry version missed the guard against -1)

> The index of the thread that requested a dump be written in the
> threads vector. [..] If the dump was not produced as a result of an exception
> [..] this field will be set to -1,
this code was created in a REPL/ChatGPT/minidump-API/HITL session,
until I had something that "seemed to work". the present commit
is the result of rereading, refactoring for understanding etc.
it's not "pure refacting" in the sense that it's behavior-changing,
but AFAICT for the better. e.g. "line 0" => just leave that out and
many similar changes.
@vanschelven vanschelven merged commit 1829465 into main Nov 15, 2025
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant