-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Reset ZLibStreamHandle when parsing concatenated GZip streams #113587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reset ZLibStreamHandle when parsing concatenated GZip streams #113587
Conversation
|
Tagging subscribers to this area: @dotnet/area-system-io-compression |
stephentoub
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems reasonable to me. Thanks.
Do we have sufficient test coverage of concatenated gzip streams to appropriately exercise this?
|
Thanks. I think we do - |
|
/azp run runtime-libraries-coreclr outerloop Edit: could someone trigger the outerloop tests for ManyManyConcatenatedGzipStreams please? |
This comment was marked as resolved.
This comment was marked as resolved.
1 similar comment
This comment was marked as resolved.
This comment was marked as resolved.
|
/azp run runtime-libraries-coreclr outerloop |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Thanks. The GZip tests and the wider set of System.IO.Compression tests are passing on both pipelines. The remaining test failures look unrelated to me. |
|
/azp run runtime-libraries-coreclr outerloop |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/ba-g Deadletter in an unrelated test |
carlossanlop
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I'll make sure we mention this perf improvement in the next preview release notes.
The
GZipStreamclass is capable of parsing multiple, concatenated GZip data streams. If it reaches the end of a GZip stream and it detects that there's more GZip data remaining to process, it currently disposes of its existingZLibStreamHandleand instantiates a new one with the same settings. As a result, it frees and reallocates enough memory for the ZLibStreamHandle (including theZStreamPAL structure it wraps) and thez_streamwrapped by the PAL structure. It also calls the ZLib methodsinflateEndandinflateInit2, reconstructing a brand new set of inflater state.This PR changes that logic to call
inflateReset2within ZLib. This means that we can reset and reuse the existing ZLibStreamHandle rather than allocating a new one. It also provides a modest reduction in execution time and memory usage.Full benchmark results are below, but to summarise:
NoCompressionandFastest, respectively. I'm not sure why, but my changes aren't invoked unless there are multiple streams so I think it's pretty likely that it's just background noise.GZipStream/DeflateStream/Inflater/ZLibStreamHandleseems to be close to an inlining threshold; we've replaced two of the PInvokes with one, so I think we've reduced the code size ofInflater.ResetStreamForLeftoverInputby enough to cross that threshold.Benchmark results
We'll also be able to revisit #71991, since one of the reasons for its partial reversion in #85001 was that the Inflater class would sometimes recreate its SafeHandle during its lifetime. This no longer happens.
I'd also appreciate it if someone could run the outer loop tests too - they contain the
ManyManyConcatenatedGzipStreamstest.