
avoid repeated decompression and further utilize the --GH option to improve conversion speed #2145

Merged
ktock merged 1 commit into containerd:main from escapefreeg:main
Oct 18, 2025

@escapefreeg

In the current image conversion process, the converted blob is decompressed in estargz/build.go to calculate its hash value. The same blob is then decompressed a second time in nativeconverter/estargz/estargz.go (estargz format) and nativeconverter/zstdchunked/zstdchunked.go (zstd format) to calculate its size. Decompression is time-consuming, and doing it twice is unnecessary. This PR merges the two decompressions into a single pass to improve image conversion speed.

Furthermore, according to #2117, command-line tools can decompress gzip archives faster than Go's built-in gzip library. When the --estargz-gzip-helper (--GH) option is specified, this PR also uses the gzip helper to speed up decompression of the converted image.

I tested these changes on 10 images. On average, the optimization reduces conversion time by 26.7%. For more details, see:
[image: test results]

The two improvement figures are calculated as follows:

- Time improvement from applying current optimization on pigz GH: (With pigz GH - With pigz GH and current optimization) / With pigz GH
- Time improvement from combining pigz GH and current optimization: (Without GH - With pigz GH and current optimization) / Without GH
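Both figures are the same relative-reduction formula applied to different baselines. A small sketch of that arithmetic follows. The timings in `main` are made-up placeholders for illustration only, not measurements from this PR, and the function name `improvement` is hypothetical.

```go
package main

import "fmt"

// improvement returns the relative time reduction (baseline - optimized) / baseline.
// Hypothetical helper that mirrors the two formulas above.
func improvement(baseline, optimized float64) float64 {
	return (baseline - optimized) / baseline
}

func main() {
	// Placeholder timings in seconds; NOT results from the PR.
	withoutGH := 100.0
	withPigzGH := 80.0
	withPigzGHAndOpt := 60.0

	fmt.Printf("optimization on top of pigz GH: %.1f%%\n",
		100*improvement(withPigzGH, withPigzGHAndOpt))
	fmt.Printf("pigz GH combined with optimization: %.1f%%\n",
		100*improvement(withoutGH, withPigzGHAndOpt))
}
```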

Note: Although ioutils.CountWriter ensures correct byte counting in concurrent environments, its usage in nativeconverter/estargz/estargz.go and nativeconverter/zstdchunked/zstdchunked.go does not involve concurrency. Therefore, the size of the converted blob can be directly obtained from the return value of io.Copy.

@escapefreeg force-pushed the main branch 3 times, most recently from 210513c to c8413f3 on October 14, 2025 02:59
estargz/build.go Outdated
Comment on lines +295 to +296
uncompressedSizeChan <- uncompressedSize
close(uncompressedSizeChan)
Member

Does this make UncompressedSize return a zero value when it is called more than once? Also, I think this goroutine leaks, blocking at L295, if UncompressedSize is never called.
Instead of a channel, I think it can just use an int64 pointer with atomic operations. UncompressedSize can have a comment stating that the value is valid only after the full read.

Author

Thanks for the suggestion. I've updated it based on your advice.

Member

@ktock left a comment

Thanks

@ktock ktock merged commit 73ac8ff into containerd:main Oct 18, 2025
44 checks passed