Add support for cloning repositories from github#534
Add support for cloning repositories from github#534shreyas-goenka wants to merge 14 commits intomainfrom
Conversation
libs/git/clone.go
Outdated
| return err | ||
| } | ||
|
|
||
| func Clone(org string, repoName string, targetDir string) error { |
There was a problem hiding this comment.
This is specific to GitHub. Please turn this into an interface (e.g. with only the Clone function) such that we can determine which one to use based on the URL or whatever parameter we get.
There was a problem hiding this comment.
I did something different, ie added CloneOptions with the provider as an option. Please let me know if that looks good
|
I could not make integration tests for the private repository work. With some cursory investigation, I found out that it's probably the GITHUB_TOKEN that allows the CI runner to download private repos, which is not something the git CLI will support. Manually tested however |
|
To create an integration test for cloning private repositories, we can create a new deco account on github, and add that to our test infra. WDYT? |
| } | ||
|
|
||
| func (opts CloneOptions) repoUrl() string { | ||
| return fmt.Sprintf(`https://github.com/%s/%s`, opts.Organization, opts.RepositoryName) |
There was a problem hiding this comment.
It assumes it's always github but it can be GitLab for example or any other provider, right?
There was a problem hiding this comment.
Yes, we can add support for gitlab as a followup. gitlab also support zip downloads, and for other providers like bitbucket we can defer to the CLI
| return cmd.Wait() | ||
| } | ||
|
|
||
| func clonePublic(ctx context.Context, opts CloneOptions) error { |
There was a problem hiding this comment.
I would call it "downloadRepo" or something like this, because clone assumes we do the git operation while we don't
| zipDst := filepath.Join(tmpDir, opts.RepositoryName+".zip") | ||
|
|
||
| // Download public repository from github as a ZIP file | ||
| err := download(ctx, opts.zipUrl(), zipDst) |
There was a problem hiding this comment.
Shall we have a timeout for download operation?
| } | ||
|
|
||
| func (opts CloneOptions) zipUrl() string { | ||
| return fmt.Sprintf(`%s/archive/%s.zip`, opts.repoUrl(), opts.Reference) |
There was a problem hiding this comment.
This would work only for Github, right?
There was a problem hiding this comment.
Yes, gitlab has a different format for zip URLs
|
|
||
| func Clone(ctx context.Context, opts CloneOptions) error { | ||
| if opts.Provider != "github" { | ||
| return fmt.Errorf("git provider not supported: %s", opts.Provider) |
There was a problem hiding this comment.
Why do we support only Github?
There was a problem hiding this comment.
Just as a starting point, followups coming soon :)
There was a problem hiding this comment.
But design of this code is very tightly coupled with Github specific implementation. If we plan to support other providers, shall we abstract this with some type Provider interface? You could have some implementation specific logic like zipUrl and etc in the implementation of this interface, so later when you decide to add GitLab support you will just need to extend and modify the code
There was a problem hiding this comment.
I left some inline suggestion on how to make it more flexible and rely less on implementation details
| return other == errNotFound | ||
| } | ||
|
|
||
| type CloneOptions struct { |
There was a problem hiding this comment.
Add
type provider interface {
repoUrl() string,
zipUrl() string,
destination() string
}
| TargetDir string | ||
| } | ||
|
|
||
| func (opts CloneOptions) repoUrl() string { |
There was a problem hiding this comment.
replace with
func (provider GitHubProvider) repoUrl() string
| return err | ||
| } | ||
|
|
||
| func clonePrivate(ctx context.Context, opts CloneOptions) error { |
There was a problem hiding this comment.
replace with
func clonePrivate(ctx context.Context, provider Provider) error
|
I had a discussion with @pietern and based on feedback from @lennartkats-db we decided to not support ZIP downloads right out of the gate. Lets start with git CLI by default, and switch over to supporting ZIP downloads if there's a need based on customer feedback. This PR is being closed in favour of #544 |
Changes
Adds support for cloning public and private github repositories for databricks templates
Tests
Integration tests