fix performance issue in scenario #2966 (part 1)#2969
Merged
Conversation
When re-using a compression state, across multiple successive compressions, the state should minimize the amount of allocation and initialization required. This mostly matters in situations where initialization is an overwhelming task compared to compression itself. This can happen when the amount to compress is small, while the compression state was given the impression that it would be much larger, aka, streaming mode without providing a srcSize hint. This lean-initialization optimization was broken in 980f3bb . This commit fixes it, making this scenario once again on par with v1.4.9. Note that this does not completely fix #2966, since another heavy initialization, specific to row mode, is also happening (and was not present in v1.4.9). This will be fixed in a separate commit.
felixhandte
approved these changes
Jan 4, 2022
Contributor
felixhandte
left a comment
There was a problem hiding this comment.
Ok, I believe I understand the functional change.
(It would be helpful in the future to separate functional and aesthetic changes into their own commits.)
lib/compress/zstd_cwksp.h
Outdated
| ws->tableValidEnd = end; | ||
| ws->objectEnd = objectEnd; | ||
| ws->tableEnd = objectEnd; | ||
| if (ws->tableEnd > ws->tableValidEnd) ws->tableValidEnd = objectEnd; |
Contributor
There was a problem hiding this comment.
I believe this is the single line of change that actually changes the behavior of the code and that everything else is just formatting/naming/comments. (Correct me if I'm wrong!)
I guess this makes sense.
Nit: the phrasing is a bit awkward. Normally I would expect an expression of this form to be if (a > b) b = a;, but here you have if (a > b) b = c; where it happens to be the case that a == c. I think this would be clearer as:
if (objectEnd > ws->tableValidEnd) {
ws->tableValidEnd = objectEnd;
}
terrelln
approved these changes
Jan 4, 2022
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Re-employing a compression state across multiple successive compression jobs
shall minimize the amount of allocation and initialization required.
This mostly matters in situations where initialization is an overwhelming task compared to compression itself.
This can happen when the amount to compress is small,
while the compression state was expecting it to be much larger,
aka, streaming mode without the benefit of a
srcSizehindsight.In which case,
zstdcan't rely on its automatic resource resizing capability, and initialization cost can only be made acceptable by amortizing it over a series of compressions.This diff restores a lean-initialization optimization trick that used to be present up to
v1.4.9and was lost in commit 980f3bb .The following measurement is taken on a core
i7-9700K(turbo disabled) withfullbench -b41, usinggeldings.txtas sample (a small text file). The test corresponds to a scenario usingZSTD_compressStream()without the benefit of knowing the small sample size beforehand.Note how important is this lean-initialization optimization for small data as the compression level increases, hence the amount of resources used increases too.
Note that this PR does not completely fix #2966,
since another heavy initialization, specific to
rowHashmode,is also present (and wasn't in
v1.4.9).This second issue will be fixed in a separate commit.
For information :
dfastrowHashrowHashrowHashbtlazy2