
feat: speed up finding changed files in commits #2645

Merged
jskeet merged 2 commits into googleapis:main from jskeet:speed-up-finding-changed-files
Oct 22, 2025
@jskeet jskeet commented Oct 22, 2025

We don't need the actual diffs between commits - only the names of the changed files. This massively improves performance of "release init".

While there may be other things we want to do in the future, this makes enough of a difference that we can probably hold off on larger changes (at least for the sake of performance).

Fixes #2634

@jskeet jskeet requested a review from codyoss October 22, 2025 10:13
@jskeet jskeet requested a review from a team as a code owner October 22, 2025 10:13
@gemini-code-assist

Summary of Changes

Hello @jskeet, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant performance enhancement by refining how changed files are detected within Git commits. Instead of computing comprehensive content diffs, the system now efficiently retrieves only the names of modified files. This targeted optimization is expected to dramatically speed up the "release init" command, addressing a known performance bottleneck without requiring more extensive architectural changes at this time.

Highlights

  • Performance Improvement: The method for identifying changed files between Git commits has been optimized to avoid generating full diffs, significantly boosting the performance of the "release init" process.
  • Git Operations Refinement: The underlying Git operation in ChangedFilesInCommit was switched from fromTree.Patch(toTree) to fromTree.Diff(toTree) to retrieve only file change metadata instead of full content differences.
  • File Name Extraction Logic: The logic for extracting and collecting changed file names has been refactored to correctly handle deletions, modifications, insertions, and renames based on the new Diff output structure.
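The four cases named in the highlights can be told apart from which side of a change record is populated. The sketch below shows one way to do that; the `entry` and `change` types and the `kind` function are hypothetical, simplified models written for this example, and they are not the types used by `ChangedFilesInCommit`.

```go
package main

import (
	"errors"
	"fmt"
)

// entry and change are hypothetical, simplified stand-ins for the per-file
// records a tree comparison produces. An empty Name means "no file on this side".
type entry struct{ Name string }

type change struct{ From, To entry }

// kind classifies a change by looking at which sides are populated.
func kind(c change) (string, error) {
	switch {
	case c.From.Name == "" && c.To.Name == "":
		return "", errors.New("malformed change: both sides empty")
	case c.From.Name == "":
		return "insert", nil
	case c.To.Name == "":
		return "delete", nil
	case c.From.Name != c.To.Name:
		return "rename", nil
	default:
		return "modify", nil
	}
}

func main() {
	changes := []change{
		{To: entry{"new.go"}},
		{From: entry{"old.txt"}},
		{From: entry{"a.go"}, To: entry{"a.go"}},
		{From: entry{"x.go"}, To: entry{"y.go"}},
	}
	for _, c := range changes {
		k, err := kind(c)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-6s %q -> %q\n", k, c.From.Name, c.To.Name)
	}
}
```

Whether a real tree comparison reports a rename as one record or as a delete plus an insert depends on the library and its options, so the "rename" branch is an assumption of this model.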


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request improves the performance of finding changed files in a commit by using fromTree.Diff instead of fromTree.Patch. This is a great optimization as it avoids calculating the full diff when only file names are needed.

I've found a small bug in the new implementation where it would add an empty string to the list of files for deletions. I've left a comment with a suggested fix. Otherwise, the change looks good.


codecov bot commented Oct 22, 2025

Codecov Report

❌ Patch coverage is 90.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 85.68%. Comparing base (88c4379) to head (511bbd4).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/gitrepo/gitrepo.go | 90.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2645   +/-   ##
=======================================
  Coverage   85.68%   85.68%           
=======================================
  Files         108      108           
  Lines       11057    11059    +2     
=======================================
+ Hits         9474     9476    +2     
  Misses       1250     1250           
  Partials      333      333           



jskeet commented Oct 22, 2025

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request enhances the performance of the release init process by optimizing the method for identifying changed files in commits. Instead of retrieving the actual diffs, the updated code now focuses solely on extracting the names of the modified files, leading to significant speed improvements. The changes include modifications to the ChangedFilesInCommit function in internal/gitrepo/gitrepo.go and the addition of new test cases in internal/gitrepo/gitrepo_test.go to cover deletion and renaming scenarios.


@codyoss codyoss left a comment


Thanks Jon, great find!

@jskeet jskeet merged commit 1060946 into googleapis:main Oct 22, 2025
6 checks passed
@jskeet jskeet deleted the speed-up-finding-changed-files branch October 22, 2025 13:43
ldetmer pushed a commit that referenced this pull request Oct 22, 2025
Librarian Version: v0.0.0-20251022154542-dd249212325d
Language Image:
us-central1-docker.pkg.dev/cloud-sdk-librarian-prod/images-prod/librarian-release-container:latest
<details><summary>librarian: 0.5.0</summary>

## [0.5.0](v0.4.0...v0.5.0) (2025-10-22)

### Features

* speed up finding changed files in commits (#2645)
([1060946](10609465))

* Can config dart `export` (#2641)
([97eacd2](97eacd29))

* minor whitespace and doc changes to the Dart templates (#2636)
([b3ac7b4](b3ac7b42))

* only generate libraries with changed APIs (#2618)
([82171be](82171bed))

* make extra modules public (#2622)
([2c94a53](2c94a53f))

* skip a GitHub release for a library (#2612)
([6258f4d](6258f4d1))

* allow skipping semver checks for rust-publish (#2584)
([739ce0d](739ce0d5))

* add ability to open pull request as a draft (#2604)
([c1f0285](c1f02859))

* capture discovery revision (#2605)
([14a1483](14a14830))

* Add conditional instrumentation to gRPC clients (#2594)
([3cc63b2](3cc63b22))

* add update-image CLI command (#2580)
([90e0f6e](90e0f6e5))

* Generate more samples for oneof main setters. (#2592)
([c55f3ce](c55f3ceb))

* Add documentation for generated service constructors (#2575)
([6a4aead](6a4aeade))

* disable some clippy warnings (#2567)
([9f51084](9f510842))

* Generate setter samples for oneof fields. (#2573)
([8c2416a](8c2416ab))

* add default Rust features option (#2562)
([892f42b](892f42b7))

* Include a correct URL for issues (#2570)
([10493ed](10493ed9))

* add ability to find the latest docker image SHA (#2539)
([62e80f1](62e80f19))

* add ability to checkout a repo at a certain commit (#2555)
([23b8ffe](23b8ffea))

* add onboarding PR body (#2552)
([e32719c](e32719cd))

### Bug Fixes

* associate bulk change to individual libraries (#2626)
([dd24921](dd249212))

* Allow `unnecessary_import`s (#2642)
([88c4379](88c43794))

* resolve issue where onboarded library can't be released (#2632)
([b300a4e](b300a4ea))

* resolve issue where commits cannot be fetched for new library (#2631)
([45652c0](45652c03))

* address a typo in the Message.ServicePlaceholder docs (#2616)
([82fda96](82fda96b))

* bad version bump edits (#2613)
([9902b1d](9902b1d5))

* use templates for update-image PR body (#2602)
([7309cad](7309cadd))

* populate configure command pr content (#2591)
([811eb8e](811eb8e2))

* shrink release PR size when there are bulk changes (#2585)
([bcb914a](bcb914ac))

* Fixes several issues with oneof main setter samples (#2589)
([e4958d0](e4958d00))

* resolve broken link in PR body (#2579)
([098c1d2](098c1d24))

* Remove double sample code blocks (#2582)
([6b10456](6b104567))

* change commit package (#2571)
([45ee48f](45ee48f0))

* show chores in release notes (#2544)
([88b62cc](88b62ccb))

* avoid work duplication when finding changes (#2558)
([0adeeac](0adeeac6))

* mangled method names and doc links (#2565)
([895dac9](895dac94))

### Reverts

* show chores in release notes (#2601)
([7e6740f](7e6740ff))

</details>
Development

Successfully merging this pull request may close these issues.

librarian: improve performance of commit processing of libraries.
