commands: add git-lfs-migrate(1) 'import' subcommand #2353
Merged
- indent ref updates by two spaces
- print ref.Name instead of full reference
This reverts commit f5b4f98.
Contributor
Author
Opened up #2358, which should cause this branch to pass on CircleCI after it's merged in.
technoweenie
approved these changes
Jun 26, 2017
Closed
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Nov 24, 2024
Our README file contains a brief note in its "Example Usage" section stating that Git LFS requires a Git version higher than 1.8.2 on Linux and 1.8.5 on macOS. This statement dates from commit 59a49b0 in PR git-lfs#412 in 2015, and so is relatively out of date. In particular, when we added support for the "git lfs migrate" command in PR git-lfs#2353, the actual minimum supported version of Git was changed from 1.8.x to 1.9.0 (in commit 1d0e834) and then to 2.0.0 (in commit 5aea841). These changes were made to the Travis CI configuration in use at the time, and later migrated to our current GitHub Actions CI workflow in commit c32820806229c3f42364d989f7a8597f73cb107ba of PR git-lfs#3808. This workflow continues to run our Git LFS test suite using Git 2.0.0. We therefore now update our README file to remove the outdated note about Git 1.8.x versions, and add a paragraph to the "Limitations" section which documents the current minimum supported Git version of 2.0.0 but also strongly advises the use of a more recent Git version.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Apr 3, 2025
Since commit a343a11 of PR git-lfs#1461, a number of our commands, including "git lfs pull", "git lfs push", and "git lfs track", have checked the version of the currently available Git program and reported an error if it was not at least version 1.8.2. However, when we added support for the "git lfs migrate" command in PR git-lfs#2353, the actual minimum supported version of Git was changed from 1.8.x to 1.9.0 (in commit 1d0e834) and then to 2.0.0 (in commit 5aea841). These changes were made to the Travis CI configuration in use at the time, and later migrated to our current GitHub Actions CI workflow in commit c32820806229c3f42364d989f7a8597f73cb107ba of PR git-lfs#3808. This workflow continues to run our Git LFS test suite using Git 2.0.0. More recently, in commit 1501265 of PR git-lfs#5921, we updated our README file to document that the current minimum supported version of Git we require is v2.0.0. We therefore now update the minimum Git version required by the Git LFS client to 2.0.0 by adjusting the version string defined in the requireGitVersion() function of our "commands" package.
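As a rough illustration of the kind of minimum-version check described above, the following shell sketch compares a Git version string against 2.0.0. It is not the Go implementation in requireGitVersion(); the `version_at_least` helper is hypothetical, and the sketch assumes a sort(1) that supports the `-V` option (GNU coreutils and recent BSD/macOS versions do).

```shell
#!/bin/sh
# Hypothetical helper, not part of Git LFS: succeeds when version $1 is
# greater than or equal to version $2.
# Assumes sort(1) supports -V (version sort).
version_at_least() {
  lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
  [ "$lowest" = "$2" ]
}

# "git version 2.43.0" -> "2.43.0"; falls back to 0.0.0 if git is unavailable.
current=$(git version 2>/dev/null | awk '{print $3}')
current=${current:-0.0.0}

if version_at_least "$current" "2.0.0"; then
  echo "git $current is supported"
else
  echo "git $current is too old; at least 2.0.0 is required" >&2
fi
```

A version sort is used rather than a plain string comparison so that, for example, 2.10.1 is ordered after 2.9.0.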
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Jun 19, 2025
When we build Debian and RPM Linux packages, we define the minimum versions of Git and Go required by the Git LFS client. However, the minimum versions we specify are at present somewhat out of date. Specifically, both the "control" file for our Debian packages and the SPEC file for our RPM packages state that we require at least Git version 1.8.2, and the former also specifies that we require at least Go version 1.12.0. In practice, though, since we introduced the "git lfs migrate" command in PR git-lfs#2353, Git v2.0.0 has been the earliest version of Git we support, as per commit 5aea841 of that PR. We have also required at least Go v1.23.0 to build the Git LFS client since commit 70e23fa of PR git-lfs#5997, when we updated the minimum version of the x/crypto Go module specified in our "go.mod" file and the "go mod tidy" command then also updated the minimum required version of Go to 1.23.0. Because we anticipate making a v3.7.0 release of the Git LFS client in the near future, we now update the "control" file for our Debian packages and the SPEC file for our RPM packages to indicate that the Git LFS client requires at least Git v2.0.0 and Go v1.23.0.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Nov 26, 2025
When the "git lfs migrate import" subcommand was implemented in PR git-lfs#2353, a few initial tests were included, beginning with those from commit e39a767 when the original version of what is now our t/t-migrate-import.sh test script was first added.

Several of these tests were designed to check that files matching certain path patterns are converted to Git LFS files, while files which do not match those patterns are left unchanged. For instance, the "migrate import (default branch with filter)" test intends to check that files matching the pattern "*.md" are converted to Git LFS by the "git lfs migrate import" command, while files matching the pattern "*.txt" are not converted.

One of the specific checks performed by these tests is to try to verify that no Git LFS "filter=lfs" entry has been added to the ".gitattributes" file for the "*.txt" path pattern. To do this, they read the ".gitattributes" file's contents from a given branch and then pipe its contents to a grep(1) command with the -v option, in the expectation that this will fail if a line is found which matches a regular expression containing the "*.txt" pattern.

However, the -v option of the "grep" command does not cause the command to fail (i.e., exit with a non-zero value) if a line is found in its input which matches the provided regular expression. Rather, the -v option causes the "grep" command to filter out any lines from its input which match the expression, and then the exit status is determined in the usual manner, so that the command only returns a non-zero value if no other lines were seen in the input. Since our tests happen to always generate entries in the ".gitattributes" files which do not match the "*.txt" pattern, the "grep" commands with the -v option always succeed, but without actually verifying that entries with "*.txt" patterns do not appear in the files.
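A small sketch of the behavior described above, using a hypothetical two-line ".gitattributes" sample rather than the actual test fixtures: with -v, "grep" still exits with status 0 even though a "*.txt" entry is present, while the -c count exposes it.

```shell
#!/bin/sh
# Hypothetical sample contents; the real tests read .gitattributes from a branch.
attrs='*.md filter=lfs diff=lfs merge=lfs -text
*.txt filter=lfs diff=lfs merge=lfs -text'

# -v prints the non-matching lines; the exit status is 0 because the "*.md"
# line was selected, so this check passes despite the "*.txt" entry.
if printf '%s\n' "$attrs" | grep -v '\*\.txt filter=lfs' >/dev/null; then
  v_status=0
else
  v_status=1
fi

# -c prints the number of matching lines, which can be compared with zero.
# grep -c exits with status 1 when the count is 0, hence the "|| true".
count=$(printf '%s\n' "$attrs" | grep -c '\*\.txt filter=lfs' || true)

echo "grep -v exit status: $v_status"
echo "grep -c count: $count"
```

Here the -v form reports success while the count is 1, which is the mismatch the commit message describes.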
If such an entry did appear, it would simply be filtered by the "grep" commands and then the existence of the other lines would still allow the commands to succeed.

In fact, this specific problem affects two of the tests in the t/t-migrate-import.sh test script, the "migrate import (default branch, exclude remote refs)" and "migrate import (given branch, exclude remote refs)" tests. In both cases, the ".gitattributes" files that our tests currently intend to prove do not contain entries with the "*.txt" pattern actually do contain such entries. We therefore correct these checks now, as discussed further below.

As additional tests have been added to the t/t-migrate-import.sh test script over time, misuse of the "grep" command's -v option has been accidentally propagated into a number of our tests in the script. We therefore rewrite all of these checks so that they do not use the -v option of the "grep" command. Instead, we utilize the "grep" command's -c option to produce a count of all lines matching the given pattern, and then verify that the count is zero (except in two cases where the existing checks are incorrect, as mentioned above).

We use this idiom throughout many of our other test scripts, in part because it has several advantages over other possible techniques for ensuring that a file contains no lines matching a certain pattern.

One alternative approach used in a few of our test scripts, solely for historical reasons, is to simply run "grep" without any options and then check that the command's output is empty with the -z or string comparison shell test operators. While this generally suffices, if by chance the "grep" command should fail with an error, the check will still pass and the error will not be reported.
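To illustrate why an empty-output check can mask a "grep" failure, the sketch below runs both styles of check against a file that does not exist, which makes "grep" fail with an error. The path used is a placeholder chosen for the example and is assumed not to exist.

```shell
#!/bin/sh
set -e

missing="./no-such-file-$$"   # placeholder path, assumed not to exist

# Empty-output check: grep fails with an error and prints nothing to stdout,
# so -z is satisfied and the check passes without noticing the failure.
if [ -z "$(grep 'pattern' "$missing" 2>/dev/null)" ]; then
  z_check="passed"
else
  z_check="failed"
fi

# Count check: on error grep prints no count, so the comparison with "0"
# is false and the check fails, which surfaces the problem.
if [ "0" = "$(grep -c 'pattern' "$missing" 2>/dev/null)" ]; then
  c_check="passed"
else
  c_check="failed"
fi

echo "-z check: $z_check"
echo "-c check: $c_check"
```

Note that "set -e" does not stop the script in either case, because the failing command runs inside a command substitution used as a test operand.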
In such a case, the command would return an exit status code of 2, and while the "errexit" shell attribute is enabled in all of our tests by the "set -e" command, because we run the "grep" command in a subshell within the shell "[" builtin, if the command returns a non-zero status code that will not cause the test's shell to exit immediately. The command would also likely print an error message to its standard error file descriptor but would not write anything to its standard output, so our checks using the -z or string comparison shell test operators would still succeed.

By comparison, when we use the -c option, if by chance the "grep" command were to fail with an error, it would not output an integer count value and so the test would fail.

In the case of the two specific tests where we currently attempt to verify that an entry with the "*.txt" pattern does not appear in a ".gitattributes" file, even though these entries do appear, we simply remove the -v option from the "grep" command rather than replace it with the -c option.

The purpose of both of these tests, the "migrate import (default branch, exclude remote refs)" and "migrate import (given branch, exclude remote refs)" tests, is to demonstrate that the "git lfs migrate import" command will not rewrite the Git history of any locally cached remote references. The tests therefore assert that after running the "git lfs migrate import" command, no ".gitattributes" file has been added to the Git tree of the commit associated with the "refs/remotes/origin/main" reference. This check alone is sufficient to prove that the reference has not been altered.

The two tests then proceeded to also try to check that no "*.txt" entries had been added to the ".gitattributes" files in one of the local references.
However, the specific "git lfs migrate import" commands performed by the tests are expected to create such entries, since they convert all of the files in the local branches, exactly as they do in the comparable "migrate import (default branch)" and "migrate import (given branch)" tests.

Finally, note that many of the regular expressions used in the "grep" commands we modify in this commit do not properly escape the "." character, so it will technically match any character and not just a literal "." character, as is intended. (The asterisk character in these commands' patterns would also normally be parsed as a metacharacter, but because it happens to be the first character in the expressions it is treated as a literal "*" character.)

We could resolve this problem for the regular expressions of the specific "grep" commands we are modifying in this commit, either by adding the -F option to the "grep" commands or by escaping the "." character. However, since this issue affects multiple other patterns as well (including some in other test scripts), we would prefer to address this issue in a more comprehensive fashion. We therefore defer to a future PR any revisions of the patterns used by any "grep" commands in our test suite.
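The two regular-expression details noted above can be checked with a short sketch. The sample line is made up for the illustration: an unescaped "." matches any character, a leading "*" in a basic regular expression is taken literally, and -F disables pattern interpretation altogether.

```shell
#!/bin/sh
# Made-up sample line in which "x" stands where a literal "." would be expected.
line='*xtxt filter=lfs'

# Unescaped ".": matches any single character, so this line is counted.
unescaped=$(printf '%s\n' "$line" | grep -c '*.txt filter=lfs' || true)

# Escaped "\.": only a literal dot matches, so the count is 0.
escaped=$(printf '%s\n' "$line" | grep -c '*\.txt filter=lfs' || true)

# -F treats the whole pattern as a fixed string, so the count is also 0.
fixed=$(printf '%s\n' "$line" | grep -c -F '*.txt filter=lfs' || true)

echo "unescaped=$unescaped escaped=$escaped fixed=$fixed"
```

Either the escaped pattern or the -F form rejects the malformed line, which matches the two remedies the commit message mentions.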
This pull request implements the git-lfs-migrate(1) 'import' subcommand.

The 'import' subcommand is designed to convert large blobs stored in Git history to LFS pointer files based on the `--include`, `--exclude`, and `--include-ref`, `--exclude-ref` flags. It works by calling the `commands.clean()` function with the blob loaded in memory, and then a) storing that blob's contents in `.git/lfs/objects` and b) writing out an LFS pointer file in its place in the git history.

Regarding some of the notes from #2146 about when/where to insert `.gitattributes` changes by @technoweenie and @andyneff:

I initially implemented this according to my original suggestion of appending patterns to the `.gitattributes` file when a file of that kind was first rewritten. This is problematic for two reasons: […] `-X`, `--exclude` flags. The `BlobFn` (used to rewrite blobs) is only called on blobs that do match the filter, not ones that don't. This prevents us from ever seeing tree entries that are excluded from the filterset, thus never presenting us an opportunity to add the negative matches to the `.gitattributes`.

Instead, I add the `.gitattributes` changes to the first commit that we migrate, writing lines for the tracked patterns and adding negative entries for `--exclude`'d patterns.

We keep track of these in a set of patterns that the LFS migrator has tracked, and merge them each time with the `.gitattributes` in the root tree, therefore persisting any `.gitattributes` changes that exist in the original history.

Here's some example output:

- `git lfs migrate info` on that data
- `git lfs migrate import` to rewrite those commits such that the large files are tracked with LFS
- […] `.gitattributes` contents are persisted, b) new `.gitattributes` patterns are added accordingly, and c) the `*.dat` file was added to Git LFS
- `HEAD` contains a checked-out LFS object, a pointer file in the index, and a corresponding entry in `.git/lfs/objects`

Closes: #2146.

/cc @git-lfs/core
/refs #2146
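For readers unfamiliar with the conversion described in this pull request, here is a rough shell sketch of the two effects it names: storing a blob's contents under `.git/lfs/objects` and writing a pointer file in its place. It is not the Go code path through `commands.clean()`; the `make_pointer` helper and the temporary directory are invented for the illustration, the pointer layout and sharded object directory are assumed to follow the Git LFS pointer specification, and `sha256sum` is assumed to be available (on macOS, `shasum -a 256` would be the substitute).

```shell
#!/bin/sh
set -e

# Hypothetical helper: store file $1 under $2/lfs/objects and print a pointer.
make_pointer() {
  file=$1
  gitdir=$2
  oid=$(sha256sum "$file" | cut -d ' ' -f 1)
  size=$(wc -c < "$file" | tr -d ' ')
  # Objects are sharded by the first two pairs of hex digits of the OID.
  shard1=$(printf '%s' "$oid" | cut -c 1-2)
  shard2=$(printf '%s' "$oid" | cut -c 3-4)
  mkdir -p "$gitdir/lfs/objects/$shard1/$shard2"
  cp "$file" "$gitdir/lfs/objects/$shard1/$shard2/$oid"
  printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' \
    "$oid" "$size"
}

# Scratch directory standing in for a repository; nothing here touches Git.
work=$(mktemp -d)
printf 'hello\n' > "$work/a.dat"
pointer=$(make_pointer "$work/a.dat" "$work/.git")
echo "$pointer"
```

The printed pointer is what would replace the blob in history, while the original bytes remain retrievable from the object store by their SHA-256 OID.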