git/githistory: cull subtrees, blobs based on filepathfilter #2295
This pull request teaches the `*git/githistory.Rewriter` type how to cull out subtrees and blobs that won't be modified by the `BlobRewriteFn` by first checking each tree entry's absolute path against a `*filepathfilter.Filter` instance.

This is required work to boost the performance of the proposed `git-lfs-migrate(1)` command (see discussion: #2146). Since the migrate command will take `-I` and `-X` flags, we can precompute a `*filepathfilter.Filter` instance that is passed to the rewriter, which will avoid calling the `BlobCallbackFn` on blobs whose paths don't match the filter.

Instead of calling the `BlobRewriteFn` to generate a new object (and therefore, a new tree entry), we copy over the existing tree entry that we are trying to rewrite, and use the same reference to an existing object present in the object database. This ensures that performance for subtrees and blobs that don't match the given `*filepathfilter.Filter` is ⚡️.

To pass the filter to the `NewRewriter()` function, I used Dave Cheney's "option" funcs [source].

/cc @git-lfs/core
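
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the two ideas described above: passing a filter through a functional option, and copying a tree entry unchanged when its path doesn't match. It is not the code from this pull request. The `Filter` interface, `prefixFilter`, `Entry`, `WithFilter`, and `RewriteEntries` are hypothetical stand-ins invented for illustration, and the real `Rewriter` and `*filepathfilter.Filter` types may look quite different.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Filter is a hypothetical stand-in for *filepathfilter.Filter.
type Filter interface {
	Allows(filename string) bool
}

// prefixFilter is a toy filter that allows paths under any included prefix.
type prefixFilter struct{ include []string }

func (f prefixFilter) Allows(filename string) bool {
	for _, p := range f.include {
		if strings.HasPrefix(filename, p) {
			return true
		}
	}
	return false
}

// Entry is a simplified tree entry: a name and the ID of the object it points to.
type Entry struct {
	Name string
	OID  string
}

// Rewriter is a simplified stand-in for the rewriter discussed above.
type Rewriter struct {
	filter Filter
}

// Option is a functional option that configures a Rewriter.
type Option func(*Rewriter)

// WithFilter is an assumed option name; it attaches a precomputed filter.
func WithFilter(f Filter) Option {
	return func(r *Rewriter) { r.filter = f }
}

// NewRewriter applies each option in order to a fresh Rewriter.
func NewRewriter(opts ...Option) *Rewriter {
	r := &Rewriter{}
	for _, opt := range opts {
		opt(r)
	}
	return r
}

// RewriteEntries calls fn only for entries whose full path the filter allows.
// Every other entry is copied as-is, so it keeps pointing at the existing object.
func (r *Rewriter) RewriteEntries(dir string, entries []Entry, fn func(fullPath string, e Entry) Entry) []Entry {
	out := make([]Entry, 0, len(entries))
	for _, e := range entries {
		fullPath := path.Join(dir, e.Name)
		if r.filter != nil && !r.filter.Allows(fullPath) {
			out = append(out, e) // culled: reuse the existing entry and OID
			continue
		}
		out = append(out, fn(fullPath, e))
	}
	return out
}

func main() {
	r := NewRewriter(WithFilter(prefixFilter{include: []string{"assets/"}}))

	// A pretend blob rewrite that produces a new object ID.
	rewrite := func(fullPath string, e Entry) Entry {
		return Entry{Name: e.Name, OID: "rewritten-" + e.OID}
	}

	fmt.Println(r.RewriteEntries("assets", []Entry{{Name: "logo.png", OID: "aaa"}}, rewrite))
	fmt.Println(r.RewriteEntries("src", []Entry{{Name: "main.go", OID: "bbb"}}, rewrite))
}
```

In this sketch, a `Rewriter` built without `WithFilter` rewrites every entry, so existing callers would not need to change; that is the main appeal of the option-func approach over adding a positional argument.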