Skip to content

Add support for cloning repositories from github#534

Closed
shreyas-goenka wants to merge 14 commits intomainfrom
git-clone
Closed

Add support for cloning repositories from github#534
shreyas-goenka wants to merge 14 commits intomainfrom
git-clone

Conversation

@shreyas-goenka
Copy link
Copy Markdown
Contributor

@shreyas-goenka shreyas-goenka commented Jun 29, 2023

Changes

Adds support for cloning public and private github repositories for databricks templates

Tests

Integration tests

@shreyas-goenka shreyas-goenka changed the title Add support for cloneing public repositories from github Add support for cloning public repositories from github Jun 29, 2023
return err
}

func Clone(org string, repoName string, targetDir string) error {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is specific to GitHub. Please turn this into an interface (e.g. with only the Clone function) such that we can determine which one to use based on the URL or whatever parameter we get.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did something different, ie added CloneOptions with the provider as an option. Please let me know if that looks good

@shreyas-goenka shreyas-goenka changed the title Add support for cloning public repositories from github [WIP] Add support for cloning public repositories from github Jun 29, 2023
@shreyas-goenka shreyas-goenka changed the title [WIP] Add support for cloning public repositories from github Add support for cloning public repositories from github Jun 30, 2023
@shreyas-goenka
Copy link
Copy Markdown
Contributor Author

I could not make integration tests for the private repository work. git clone does not work out of the box in the CI runner environment.

With some cursory investigation, I found out that it's probably the GITHUB_TOKEN that allows the CI runner to download private repos, which is not something the git CLI will support.

Manually tested however git clone functionality works properly for private repositories (for both branches and tags)

@shreyas-goenka shreyas-goenka requested a review from pietern June 30, 2023 00:30
@shreyas-goenka shreyas-goenka changed the title Add support for cloning public repositories from github Add support for cloning repositories from github Jun 30, 2023
@shreyas-goenka
Copy link
Copy Markdown
Contributor Author

To create an integration test for cloning private repositories, we can create a new deco account on github, and add that to our test infra. WDYT?

}

func (opts CloneOptions) repoUrl() string {
return fmt.Sprintf(`https://github.com/%s/%s`, opts.Organization, opts.RepositoryName)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It assumes it's always github but it can be GitLab for example or any other provider, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can add support for gitlab as a followup. gitlab also support zip downloads, and for other providers like bitbucket we can defer to the CLI

return cmd.Wait()
}

func clonePublic(ctx context.Context, opts CloneOptions) error {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call it "downloadRepo" or something like this, because clone assumes we do the git operation while we don't

zipDst := filepath.Join(tmpDir, opts.RepositoryName+".zip")

// Download public repository from github as a ZIP file
err := download(ctx, opts.zipUrl(), zipDst)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we have a timeout for download operation?

}

func (opts CloneOptions) zipUrl() string {
return fmt.Sprintf(`%s/archive/%s.zip`, opts.repoUrl(), opts.Reference)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would work only for Github, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, gitlab has a different format for zip URLs


func Clone(ctx context.Context, opts CloneOptions) error {
if opts.Provider != "github" {
return fmt.Errorf("git provider not supported: %s", opts.Provider)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we support only Github?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a starting point, followups coming soon :)

Copy link
Copy Markdown
Contributor

@andrewnester andrewnester Jun 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But design of this code is very tightly coupled with Github specific implementation. If we plan to support other providers, shall we abstract this with some type Provider interface? You could have some implementation specific logic like zipUrl and etc in the implementation of this interface, so later when you decide to add GitLab support you will just need to extend and modify the code

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some inline suggestion on how to make it more flexible and rely less on implementation details

return other == errNotFound
}

type CloneOptions struct {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add

type provider interface {
  repoUrl() string,
  zipUrl() string,
  destination() string
}

TargetDir string
}

func (opts CloneOptions) repoUrl() string {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace with

func (provider GitHubProvider) repoUrl() string

return err
}

func clonePrivate(ctx context.Context, opts CloneOptions) error {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace with

func clonePrivate(ctx context.Context, provider Provider) error 

@shreyas-goenka
Copy link
Copy Markdown
Contributor Author

I had a discussion with @pietern and based on feedback from @lennartkats-db we decided to not support ZIP downloads right out of the gate. Lets start with git CLI by default, and switch over to supporting ZIP downloads if there's a need based on customer feedback.

This PR is being closed in favour of #544

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants