git: worktree, optimize infiles function for very large repos#1853
git: worktree, optimize infiles function for very large repos#1853pjbgf merged 3 commits intogo-git:releases/v5.xfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR optimizes the inFiles function for very large repositories by replacing a linear search (O(n*m) complexity) with a map-based lookup (O(n+m) complexity). The optimization is particularly beneficial when dealing with large numbers of changes and file filters during git reset operations.
Changes:
- Replaced slice-based linear search in
inFileswith O(1) map lookup - Added
buildFilePathMaphelper function to create file path maps with cleaned paths - Pre-allocated
removedFilesslice capacity inresetIndexfor better memory efficiency
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
|
|
||
| func inFiles(files []string, v string) bool { | ||
| // inFiles checks if the given file is in the list of files. The incoming filepaths in files should be cleaned before calling this function. |
There was a problem hiding this comment.
The comment states "The incoming filepaths in files should be cleaned before calling this function", but this is misleading. The function already cleans the input parameter v on line 442, and the map parameter files is already built with cleaned paths by buildFilePathMap. Consider revising the comment to clarify that the map is expected to contain pre-cleaned paths (which is handled by buildFilePathMap), and that the function will clean the lookup key v internally.
| // inFiles checks if the given file is in the list of files. The incoming filepaths in files should be cleaned before calling this function. | |
| // inFiles checks if the given file is in the list of files. The files map is expected | |
| // to contain pre-cleaned paths (as produced by buildFilePathMap), and this function | |
| // will clean the lookup key v before checking for membership. |
|
Is it possible to have a benchmark or is there an existing benchmark that get better when using this PR? |
@cedric-appdirect I agree benchmarking performance changes is a good practice. On this case specifically I'm slighly less worried due to the change itself. That being said, I'm planning to have some level of performance regression tests into CI in the near future. |
Backport for #1852.
Closes #1845.