Skip to content

Conversation

@bogomolets-owl
Copy link
Contributor

Hi team! Hope you are doing great!

Me again with some minor tweaks for git lfs.

What this request is about:
git lfs has batch api. Current batch api implemenataion has fixed batch request size (100 oids).
The purpose of this request is to make it possible to override the batch request size through the configuration.
The default value remains unchanged (100).

Motivation.
100 oids per batch is pretty good for general case of basic transfer agent.
But custom transfer agents implementations may want to override this value for some reasons.
I am working on a particular example of such transfer agent. The purpose of my agent is to amplify the transfer of small binary files using git lfs.
Generally using git lfs for small files is a bad practice, but the project i am currently working on requires me to use git lfs that way cause

  • i have exactly one file extension for files with different sizes (from 5 KB to 5 GB all have the same extension)
  • i can't rely on any sort of naming conventions to solve the issue via gitattributes (this will quickly go out of control due to team size)

And to solve my particular problem i need to override the size of a batch request.

Best regards, Alexander Bogomoletz.

Current code has fixed batch request size (100 oids).
This is okay for general case, but custom transfers implementation
may require to override this value.
Particular example would be bulk transfer adapter that expects recieving
many artifacts in a single download stream.
@bogomolets-owl bogomolets-owl requested a review from a team as a code owner October 2, 2024 13:21
Copy link
Member

@chrisd8088 chrisd8088 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks for the idea and the implementation! I think this is a useful additional feature and option to have, for sure.

Aside from a few implementation notes, the main three things I'd like to suggest for this PR are to rename the new configuration option to lfs.transfer.batchSize, add some documentation of the option to the docs/man/git-lfs-config.adoc file in the Upload and download transfer settings section, and probably also add a t/t-batch-size.sh test file with some small tests of the new option.

As far as documentation is concerned, we should probably note that a server is free to return an HTTP 413 status code
in its response if it deems the requested batch size too large, as our Batch API specification states.

Regarding a test file, I think a single pair of upload and download tests is probably sufficient, very similar to what we have now for the lfs.transfer.maxRetries configuration option in the t/t-batch-retries.sh file. In these tests we can set lfs.transfer.batchSize to a small number like 1 or 2, and try a push or pull of more than that number of objects while setting GIT_TRACE=1. Checking the trace logs should then find the expected number of API requests (e.g., if we push 3 objects with a batch size of 1, we should find 3 API requests in the logs).

Fix review comments. Update the documentation for this feature, remove
unnesessary usages, fix naming and codestyle, create a test for feature.
@bogomolets-owl
Copy link
Contributor Author

Hi @chrisd8088!
Thank you for your review!
I added the chages you mentioned to the merge request.

Though i have troubles with running my tests. I am currently working on windows and i managed to test the feature itself, but not the tests so i am not sure the tests are actully behave as expected.
Maybe there are some docs i missed to clarify the development pipeline for lfs?
I have two specific troubles:

  • i can build only windows platform currently by making some changes in makefile. This one can be skipped cause i dont have any other platforms to check the tests against.
  • i dont understand how to use my git lfs with my git not ruinig my daily workspace setup. This is what i need help with.

Best regards, Alexander Bogomoletz.

@chrisd8088
Copy link
Member

Thanks very much for the additional work! This is looking really good. I ran your new test on my system and it passes, so that's great.

I may not have time for a full second review for a few days, but I'll make sure to get to it shortly.

As for developing on Windows, the core team typically has not done that, so I'm not sure how much advice I can give. When I need to check some behaviour on Windows, I use a Git Bash terminal, as provided by the Git for Windows project, and install Go there so I can build the client. That usually works OK for me to build and run tests, but it's also been a while since I did that, so maybe some details have changed. You might also try an Azure VM or something like that, where you can develop without affecting your own regular workspace.

@chrisd8088
Copy link
Member

One small detail which I did notice just now, while running your new test script, is that it has non-executable permissions (100644). Would you mind giving it executable permissions (100755) in Git?

Copy link
Member

@chrisd8088 chrisd8088 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks so much for the additions; this is looking really good. It will be a nice new feature to include in the next release!

I had only a couple of minor suggestions for the new test file. I'd be happy to put those into a commit and push them to your branch, if you're willing to grant me permission to do so in GitHub.

Otherwise, if you don't mind pulling in the one suggested code change and running the git update-index --chmod=+x, that would be all we need before we can merge this.

Thanks again so much for the idea and the implementation! We sincerely appreciate these kinds of contributions from the community.

@chrisd8088
Copy link
Member

Hey @bogomolets-owl—thank you again for this PR! We'd really like to include it in the next Git LFS release.

Have you had a chance to look at my last few suggestions? If you're willing to either pull those into your branch, or grant me permissions to push them into your branch, that would be great.

Otherwise, if we don't hear from you for a while, we can create a new PR and pull in both your commits from this one plus one of my own, but I'd prefer to just stick with this PR if we can.

Let me know how you'd like to proceed, and thank you again for this new configuration option!

Fix review comments. Update the documentation, update test file attributes,
simplify test logic.
@bogomolets-owl
Copy link
Contributor Author

Hi @chrisd8088 !

Sorry for the delay, i've had some busy time recently.

Thank you for your comments!
I agree with your suggestions so i updated the pull request with a new commit!
Also i granted you acces to the fork repository so you can modify it directly just in case it will be needed, so please feel free to make any changes to it you see appropriate!

@chrisd8088 chrisd8088 self-requested a review October 25, 2024 19:53
Copy link
Member

@chrisd8088 chrisd8088 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Thank you again for the contribution!

@chrisd8088
Copy link
Member

Sorry for the delay, i've had some busy time recently.

No worries at all, and I really appreciate your time and effort on this PR. The CI failures are just due to an unrelated change in upstream Git for which I'm working on a fix, and once that's in place we can merge this PR.

Thanks again very much!

The t/t-batch-transfer-size.sh test script added in a prior commit
in this PR contains a leading Unicode byte-order mark which causes
the Perl prove(1) command used by our t/Makefile's recipes to be
unable to determine if it should be run by its
TAP::Parser::SourceHandler::Perl module (which looks for Perl scripts)
or its TAP::Parser::SourceHandler::Executable (which looks for
executable files).

We therefore just remove the BOM, and also add a bit of extra
whitespace to the "batch storage download with small batch size" test
to align it with the "batch storage upload with small batch size" test.
@chrisd8088 chrisd8088 merged commit dfa3b2d into git-lfs:main Oct 29, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants