Decompress 7z archives on the fly rather than extracting to disk#2931

Merged
gantoine merged 2 commits into rommapp:master from Delgan:decompress-7z-on-the-fly
Jan 22, 2026

@Delgan Delgan commented Jan 22, 2026

Description

It looks to me like we don't need to extract 7z archives to a temporary file; we can instead stream-decompress them, calculating the hashes on the fly. It can't OOM since the pipes have a finite size: once the pipe buffer is full, the 7z process blocks until we read from it.

Tested happy and timeout paths, it works well.
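
For illustration, here is a minimal sketch of the idea, not the code from this PR. The function name `hash_stream`, the chunk size, and the choice of CRC32/MD5/SHA-1 are assumptions made for the example. It hashes whatever a child process writes to stdout, so a real call would pass a 7z command that writes to stdout (something like `7zz e -so <archive>`, which is also an assumption here).

```python
import hashlib
import subprocess
import sys
import zlib


def hash_stream(cmd, chunk_size=64 * 1024):
    """Run `cmd` and hash its stdout chunk by chunk (hypothetical helper)."""
    crc = 0
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        # read() returns b"" once the child closes its stdout.
        # At most one chunk is held in memory at a time.
        while chunk := proc.stdout.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
            md5.update(chunk)
            sha1.update(chunk)
    # Leaving the `with` block waits for the child, so returncode is set.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
    return {
        "crc32": f"{crc:08x}",
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
    }


if __name__ == "__main__":
    # Stand-in child process so the sketch runs without 7z installed.
    demo_cmd = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.buffer.write(b'abc' * 100000)",
    ]
    print(hash_stream(demo_cmd))
```

Because the parent only ever reads one chunk at a time, memory use stays flat regardless of the size of the decompressed file.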

Checklist
Please check all that apply.

  • I've tested the changes locally
  • I've updated relevant comments
  • I've assigned reviewers for this PR
  • I've added unit tests that cover the changes

Screenshots (if applicable)

@gemini-code-assist

Summary of Changes

Hello @Delgan, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the efficiency and resource management of 7z archive processing by transitioning from a disk-based extraction method to a direct streaming approach. This change allows for on-the-fly data processing and hash calculation, thereby reducing reliance on temporary files and improving overall performance and system resource utilization.

Highlights

  • Streaming Decompression: The 7z archive decompression process has been refactored to stream data directly from the 7z utility's standard output, eliminating the need for temporary file extraction.
  • Resource Efficiency: By avoiding temporary files, this change reduces disk I/O and storage requirements during archive processing, leading to more efficient resource utilization.
  • On-the-fly Hashing: File hashes are now calculated incrementally as data is streamed from the 7z process, integrating the hashing step directly into the decompression flow.
  • Custom Timeout Handling: A custom timeout mechanism has been implemented within the streaming loop to monitor decompression time and terminate the process if the configured timeout is exceeded, ensuring responsiveness.
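
The timeout check inside the streaming loop described in the last highlight could look roughly like the sketch below. This is an illustration under assumptions, not the merged implementation: `stream_with_deadline`, `DecompressionTimeout`, and the `on_chunk` callback are hypothetical names, and the child command is whatever process writes the decompressed data to stdout.

```python
import subprocess
import sys
import time


class DecompressionTimeout(Exception):
    """Hypothetical exception raised when the deadline is exceeded."""


def stream_with_deadline(cmd, timeout, on_chunk, chunk_size=64 * 1024):
    """Stream stdout of `cmd` to `on_chunk`, aborting after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        while True:
            chunk = proc.stdout.read(chunk_size)
            if not chunk:
                break  # child closed stdout: decompression finished
            on_chunk(chunk)
            # The deadline is only checked between reads.
            if time.monotonic() > deadline:
                raise DecompressionTimeout(f"exceeded {timeout}s")
        return proc.wait()
    finally:
        # Make sure the child never outlives this call.
        if proc.poll() is None:
            proc.kill()
        proc.stdout.close()
        proc.wait()
```

A caller would pass something like `hasher.update` as `on_chunk`. Using `time.monotonic()` rather than wall-clock time keeps the deadline immune to system clock adjustments.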

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the 7-Zip archive handling to decompress files on-the-fly by streaming from the 7zz process, instead of extracting to a temporary file on disk. This is a great optimization that reduces disk I/O and temporary storage requirements. My review focuses on the new implementation using subprocess.Popen. While the change is functionally correct for happy paths, I've identified a potential issue with the manual timeout implementation that could lead to the process hanging in certain edge cases. My feedback includes a detailed explanation of the issue and suggestions for improving robustness.
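
The detailed explanation is not reproduced in this conversation view. One plausible reading, stated here as an assumption, is that a deadline checked only between reads never fires if the child stalls without producing output, because the blocking `read()` never returns. A common way to guard against that is a watchdog timer that kills the child from another thread, which closes the pipe and unblocks the read. The sketch below uses the hypothetical names `stream_with_watchdog` and `on_chunk`.

```python
import subprocess
import sys
import threading


def stream_with_watchdog(cmd, timeout, on_chunk, chunk_size=64 * 1024):
    """Stream stdout of `cmd`; a timer thread kills the child after `timeout` s."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    timed_out = threading.Event()

    def _kill():
        timed_out.set()
        proc.kill()

    timer = threading.Timer(timeout, _kill)
    timer.start()
    try:
        # If the child is killed, its end of the pipe closes and read()
        # returns b"", so the loop exits even when no output was arriving.
        # (This assumes no other process still holds the pipe's write end.)
        while chunk := proc.stdout.read(chunk_size):
            on_chunk(chunk)
    finally:
        timer.cancel()
        if proc.poll() is None:
            proc.kill()
        proc.stdout.close()
        proc.wait()
    if timed_out.is_set():
        raise TimeoutError(f"child exceeded {timeout}s")
    return proc.returncode
```

The trade-off is one extra thread per decompression in exchange for a timeout that holds even when the child is completely silent.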

@gantoine gantoine merged commit bb3a2a0 into rommapp:master Jan 22, 2026
1 of 2 checks passed