Support parallel decompression (pigz) #2640
Conversation
Is there any reason not to make this a compilation option and use pgzip by default?
The benchmarks can just be BenchmarkDecompression* and run with two different environments (or compilation options). This also makes it easier to use with benchcmp.
Can we avoid using env var?
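For context, here is a rough sketch of what a compile-time switch could look like: a pair of files selected by a build tag, assuming a hypothetical nopigz tag and helper name. This is only an illustration of the idea discussed above, not necessarily what was merged.

```go
// +build !nopigz

package compression

import (
	"io"

	"github.com/klauspost/pgzip"
)

// gzipDecompress picks the parallel pgzip implementation; a sibling file
// guarded by "// +build nopigz" could provide a compress/gzip fallback.
func gzipDecompress(r io.Reader) (io.ReadCloser, error) {
	gz, err := pgzip.NewReader(r)
	if err != nil {
		return nil, err
	}
	return gz, nil
}
```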
samuelkarp left a comment
Code LGTM, but it looks like b8aca28 still needs the Signed-off-by line.
Force-pushed from 889f1bd to 806f6e9.
Thanks for contributing this! http://github.com/klauspost/pgzip has no external dependencies, right? Just double check.
@Random-Liu yes, there are no external dependencies.
@mxpv Thanks!
Try this: _, err = io.CopyBuffer(ioutil.Discard, decompressor, buf), where buf is declared before the closure as buf := make([]byte, 32*1024). I saw a massive improvement locally with that.
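A minimal sketch of a benchmark loop with the suggested buffer reuse, assuming containerd's archive/compression.DecompressStream as the entry point; the surrounding benchmark helper and its name are illustrative, not the exact code in this PR.

```go
package compression_test

import (
	"bytes"
	"io"
	"io/ioutil"
	"testing"

	"github.com/containerd/containerd/archive/compression"
)

// benchmarkDecompression allocates a single 32 KiB buffer outside the loop
// and reuses it with io.CopyBuffer on every iteration, so the copy itself
// does not allocate per iteration.
func benchmarkDecompression(b *testing.B, data []byte) {
	buf := make([]byte, 32*1024) // reused across iterations

	b.SetBytes(int64(len(data)))
	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		decompressor, err := compression.DecompressStream(bytes.NewReader(data))
		if err != nil {
			b.Fatal(err)
		}
		if _, err := io.CopyBuffer(ioutil.Discard, decompressor, buf); err != nil {
			b.Fatal(err)
		}
		decompressor.Close()
	}
}
```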
I am seeing very different benchmark results. Are you running these with any other flags to see the benefits of parallelism?
@dmcgowan that's interesting. It looks like
@mxpv I am curious if that was similarly the bottleneck for the external process approach. This is a much cleaner approach, but I am wondering if it is a problem with the implementation or the way it is being tested (size of test, parallelism, etc.).
@dmcgowan 4440.48 MB/s is way over normal decompression speed. Either the content is far too compressible or far too incompressible to be a real-world test. Typical speeds should be at least an order of magnitude slower. Also, if you are just copying to
@dmcgowan I've reworked the benchmark to make it a bit closer to a real-world scenario. I ran
Force-pushed from 62bb3a9 to 8bb1073.
Can you squash your commits? I'm not a fan of the merge commit; it's better to do a rebase on master, not a merge. After a squash, this LGTM.
Force-pushed from 6c313ef to 6e131af.
@crosbymichael done
Can we move test data to gist or somewhere else? (could be downloaded from TestMain())
Force-pushed from 182766a to 1d909d7.
Signed-off-by: Maksym Pavlenko <[email protected]>
I am going to add this to the 1.2 milestone; we should have this in the release. There are still a few changes I want to see, but they could be done after we merge this.
const benchmarkTestDataURL = "https://git.io/fADcl"
I don't think it's a great idea to call out to an external URL every time the tests are run.
@AkihiroSuda @stevvooe could you please decide which approach works for you?
Sorry, I realize the previous suggestion may not have been clear. I believe the suggestion to move to a gist was to move the entire benchmark to a gist, not just the input file.
@dmcgowan oh, ok. I've moved the benchmark to https://git.io/fASMy and added sync.Once for initialization.
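A rough sketch of how the test-data download and initialization can be guarded with sync.Once so it runs only once per test run; the URL is the one from the PR, but the package and helper names here are assumptions, not the gist's actual code.

```go
package compression_test

import (
	"io/ioutil"
	"net/http"
	"sync"
)

const benchmarkTestDataURL = "https://git.io/fADcl"

var (
	loadOnce sync.Once
	testData []byte
	loadErr  error
)

// loadBenchmarkData downloads the benchmark payload on first use and caches
// it for the rest of the test run; sync.Once guarantees the download happens
// exactly once even if several benchmarks ask for it.
func loadBenchmarkData() ([]byte, error) {
	loadOnce.Do(func() {
		resp, err := http.Get(benchmarkTestDataURL)
		if err != nil {
			loadErr = err
			return
		}
		defer resp.Body.Close()
		testData, loadErr = ioutil.ReadAll(resp.Body)
	})
	return testData, loadErr
}
```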
Benchmark gist: https://git.io/fASMy
Signed-off-by: Maksym Pavlenko <[email protected]>
Codecov Report
@@            Coverage Diff             @@
##           master    #2640      +/-   ##
==========================================
- Coverage   47.54%   43.98%    -3.56%
==========================================
  Files          87       94        +7
  Lines        8211    10362     +2151
==========================================
+ Hits         3904     4558      +654
- Misses       3590     5083     +1493
- Partials      717      721        +4
Continue to review full report at Codecov.
LGTM
The benchmarks were deleted in containerd#2640 but we could use that to evaluate zstd further.
Signed-off-by: Kazuyoshi Kato <[email protected]>
This PR speeds up image (layer) decompression by using parallel gzip (http://github.com/klauspost/pgzip).
In the first revision, decompression was done by running an external unpigz process (similarly to how it's done in moby, moby/moby#35697), but benchmarks showed that to be surprisingly slow (I think due to external process overhead and moving data back and forth via stdin/stdout). Here are benchmark results:
CC @samuelkarp
Related #1896
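A minimal sketch of how parallel gzip decompression can be wired in with pgzip, assuming pgzip's NewReaderN entry point; block size, block count, and function names here are illustrative choices, not the exact code merged in this PR.

```go
package main

import (
	"io"
	"os"
	"runtime"

	"github.com/klauspost/pgzip"
)

// decompress streams a gzip-compressed archive to dst, decompressing blocks
// in parallel across the available CPUs.
func decompress(dst io.Writer, archive io.Reader) error {
	// NewReaderN sets the block size and how many blocks are decompressed
	// ahead; 1 MiB blocks with one block in flight per CPU is a reasonable
	// starting point.
	gz, err := pgzip.NewReaderN(archive, 1<<20, runtime.NumCPU())
	if err != nil {
		return err
	}
	defer gz.Close()

	_, err = io.Copy(dst, gz)
	return err
}

func main() {
	// Example: decompress a layer tarball from stdin to stdout.
	if err := decompress(os.Stdout, os.Stdin); err != nil {
		panic(err)
	}
}
```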