Skip to content

Speed up cloning by using a depth of 1 if there is no refspec#420

Merged
betatim merged 2 commits intojupyterhub:masterfrom
evertrol:clone-depth-1
Oct 5, 2018
Merged

Speed up cloning by using a depth of 1 if there is no refspec#420
betatim merged 2 commits intojupyterhub:masterfrom
evertrol:clone-depth-1

Conversation

@evertrol
Copy link
Copy Markdown
Contributor

@evertrol evertrol commented Oct 1, 2018

To speed up cloning a repository, a shallow clone containing just the
last commit can be made, but only if no refspec is provided.

To speed up cloning a repository, a shallow clone containing just the
last commit can be made, but *only* if no refspec is provided.
@evertrol
Copy link
Copy Markdown
Contributor Author

evertrol commented Oct 1, 2018

I'm aware of #120 and #131 , but as far as I can tell, the issue there was that the shallow clone was performed for cases where the refspec was either empty or given. Here, the shallow clone is only made when there is no refspec.

Potentially, the same shallow cloning can be used for a branch of tag (but not for a specific commit), but this can't work with the current implementation: the refspec is parsed once the repository is cloned, and could be a branch, tag or commit. It would require a separate option to specify a branch or tag to allow shallow clones for those cases, but this would complicate the user interface.

@betatim
Copy link
Copy Markdown
Member

betatim commented Oct 1, 2018

It would be great to add this feature back in! I think as is this PR won't speed up BinderHubs as they always specify a --ref, making repo2docker faster for local users would be great though.

One question for my git education: if there are two branches in a remote git repository: master and some-feature and the last (in time) commit was to some-feature we will still get the last commit to master with our --depth 1 checkout?

Could you add some tests of the shallow/not shallow cloning based around a repo like https://github.com/betatim/repo2docker-ci-clone-depth so that this behaviour keeps working also in the future? Once we are actively using https://github.com/betatim/repo2docker-ci-clone-depth we should move it to https://github.com/binderhub-ci-repos (or similar org).

@evertrol
Copy link
Copy Markdown
Contributor Author

evertrol commented Oct 1, 2018

Git will use HEAD when cloning, which (for GitHub at least) points to master (unless this was explicitly changed by a repository owner):

If I try for a test repository, I indeed get master when cloning, not matter the last commit and branch, while a local (file) clone results in the current branch.


It's probably for the better that an explicit refspec is used (I assume the default is master), to avoid surprises; I hadn't realised BinderHub did that.
Of course, if there is a default and that is a branch, there would be the option to use --branch in the git command, as suggested above. But this requires some proper thinking how to implement that nicely, if wanted at all.

I'll try and add some tests in one of the next days.

@betatim
Copy link
Copy Markdown
Member

betatim commented Oct 2, 2018

This is the code that we use to look up explicit SHAs. This is the code used to run repo2docker to build the image.

Looking forward to some tests and thanks for explaining something new about git to me!

@betatim
Copy link
Copy Markdown
Member

betatim commented Oct 5, 2018

LGTM. I moved the "clone depth test repo" to https://github.com/binderhub-ci-repos/repo2docker-ci-clone-depth.

I think we should merge this after updating that link in the test. What do you think?

Adds test for the use of a clone depth of 1 for cases where no refspec
is used. This relies on an external source (an existing test
repository), so may break easier.
@betatim betatim merged commit d95f7dc into jupyterhub:master Oct 5, 2018
@evertrol
Copy link
Copy Markdown
Contributor Author

evertrol commented Oct 5, 2018

Thanks!
Let's see in time if the real world agrees with this change and doesn't fall over.

@evertrol evertrol deleted the clone-depth-1 branch October 5, 2018 08:34
markmo pushed a commit to markmo/repo2docker that referenced this pull request Jan 22, 2021
Speed up cloning by using a depth of 1 if there is no refspec
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants