Add --from-template flag on CLI Project Initialization to setup template based project#1728
Add --from-template flag on CLI Project Initialization to setup template based project#1728
--from-template flag on CLI Project Initialization to setup template based project#1728Conversation
lzdanski
left a comment
There was a problem hiding this comment.
Found some grammar/spelling nits.
neel-astro
left a comment
There was a problem hiding this comment.
Left some comments, but great work for a first time 🚀
airflow/airflow.go
Outdated
| if err := initDirs(path, dirs); err != nil { | ||
| return errors.Wrap(err, "failed to create project directories") | ||
| } | ||
| if template != "" { |
There was a problem hiding this comment.
how would the custom runtime version work with this new template logic?
There was a problem hiding this comment.
Template based astro projects will always have latest runtime version as we are fetching these files from astronomer/templates repo. I have included a note in description that airflow-version flag will not be respected when using --template flag.
| @@ -0,0 +1,200 @@ | |||
| package runtimetemplateclient | |||
There was a problem hiding this comment.
I'm not sure this warrants a top-level "client" for the simple task of cloning a repo.
I'd suggest a different structure to this:
- Add a
Clonefunction topkg/git/git.goto usegitto clone a given repo to a given path. We should not need to be doing all this low-level tar/gzip work, andgitis already expected to be installed - Add a new file
airflow/templates.gowith:- An interface (e.g.
gitClient) with the signature forClone - A struct that implements the interface by simply calling the
pkgfunction - A top-level variable that is instantiated with the struct
- A function (e.g.
initFromTemplate) that uses the variable to clone the templates repo to a temp dir and moves the specific template subdirectory to a given path
- An interface (e.g.
- In
Initcall the above function with the template name - In the
Inittests replace the top-level variable with a mocked struct that doesn't do a real clone but instead inserts the required dirs/files for the rest to work
What do you think?
There was a problem hiding this comment.
This is something @pritt20 and I discussed. As far as I know, we don't currently list git as a prerequisite. In this particular instance, we're trying to get the user up and running with the least amount of dependencies and prerequisites as possible. Even though most of our user base will probably have git installed, it just felt too risky to rely on for such a critical onboarding step where we're doing everything we can to optimize user experience and remove friction.
Are there other places in the CLI code where git is currently a hard dependency?
There was a problem hiding this comment.
Yeah that is true, we do currently use git in a few places but not as a hard dependency. If we go with the tar/gzip approach can we follow a similar structure to what I listed above, so we do not need a new top-level client? Where a top-level variable can be replaced by a unit test to simulate the file download. We do this in a few other places, e.g. for deploying an image which a unit test cannot really do.
I'm also hoping we can extract the contents straight into the target dir instead of into a temp dir and then copying the dirs and files across?
There was a problem hiding this comment.
Thanks for the suggestion @jeremybeard and @schnie.
I have updated the code to reflect above changes.
I tried to dump the extracted files to target directory but the tarball that is extracted contains all templates directories and use path like target-directory/astronomer-templates-7a8aa17/<templates-directories>. We want the selected template directory files to be in target-directory without any sub directories. Therefore I have kept the temp directory logic intact to move files around and copy only required files to target directory 😅
cmd/airflow.go
Outdated
| errPytestArgs = errors.New("you can only pass one pytest file or directory") | ||
| buildSecrets = []string{} | ||
| errNoCompose = errors.New("cannot use '--compose-file' without '--compose' flag") | ||
| templateVariations = []string{"etl", "dbt-on-astro", "learning-airflow", "generative-ai"} |
There was a problem hiding this comment.
I don't think we should embed the current list of template names in the CLI. If we do this then whenever we add/remove a template we will immediately need a new CLI release, and even then users will not always upgrade and so the validation behavior there will not be correct. Instead can we add another flag --list-templates that queries https://updates.astronomer.io/astronomer-templates and prints the list of available templates?
There was a problem hiding this comment.
I think this could makes sense - the less steps involved with adding a new template, the better. We could use that same list for validation right?
There was a problem hiding this comment.
Thanks @jeremybeard! that makes sense.
Would it be okay if we make template options interactively available to users that they can choose from? I tried to implement this and named this flag as --select-template ( we can change this if we find better name). --select-template is boolean flag and if passed, user will be provided with all available template options that user can then select from and CLI will create selected template based project. Please find below example:
astro dev init --select-template
# TEMPLATE
1 dbt-on-astro
2 learning-airflow
3 generative-ai
4 etl
> 2
Initializing Astro project
Pulling Airflow development files from Astro Runtime 12.2.0
/Users/pritt/Astronomer/plg/astro-cli/airflow-test-2
You are not in an empty directory. Are you sure you want to initialize a project? (y/n) y
Reinitialized existing Astro project in /Users/pritt/Astronomer/plg/astro-cli/airflow-test-2
cmd/airflow_test.go
Outdated
| } | ||
| for _, f := range files { | ||
| e := os.Remove(f) | ||
| e := os.RemoveAll(f) |
There was a problem hiding this comment.
Is there some way we can clean this up in one call? Seems very brittle to list these out individually.
There was a problem hiding this comment.
maybe create and run the tests in a temp dir, and delete the temp dir itself at the end of each test as part of TearDownSubTest
There was a problem hiding this comment.
Updated all tests to make use of temp directory (which will then be removed after tests are executed) instead of cmd directory
airflow/airflow.go
Outdated
| if err := initDirs(path, dirs); err != nil { | ||
| return errors.Wrap(err, "failed to create project directories") | ||
| } | ||
| if template != "" { |
There was a problem hiding this comment.
This placement is mixed in with the existing non-template init flow, which is a bit confusing. Can we add the template != "" check at the top of the function and have it return early so we don't do unnecessary things like construct the files map?
airflow/airflow_test.go
Outdated
| "path/filepath" | ||
|
|
||
| "github.com/astronomer/astro-cli/pkg/fileutil" | ||
| "github.com/pkg/errors" |
There was a problem hiding this comment.
can we use "errors" please :), pkg/errors is not actively maintained any longer
airflow/runtime_templates.go
Outdated
| if resp.StatusCode != http.StatusOK { | ||
| return nil, fmt.Errorf("received non-200 status code: %d", resp.StatusCode) | ||
| } | ||
|
|
||
| body, err := io.ReadAll(resp.Body) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to read response body: %w", err) | ||
| } |
There was a problem hiding this comment.
can we try to read the response and add it to the error when we get back non 200 response code, just so that we are passing enough information to help the customer help us debug issues
cmd/airflow.go
Outdated
| } | ||
| cmd.Flags().StringVarP(&projectName, "name", "n", "", "Name of Astro project") | ||
| cmd.Flags().StringVarP(&airflowVersion, "airflow-version", "a", "", "Version of Airflow you want to create an Astro project with. If not specified, latest is assumed. You can change this version in your Dockerfile at any time.") | ||
| cmd.Flags().StringVarP(&fromTemplate, "from-template", "t", "", "Template name that you want to use to create the local astro project. Possible values can be etl, dbt-on-astro, generative-ai and learning-airflow. Please note template based astro project use latest runtime version and airflow-version flag will be ignored when creating a project with template flag") |
There was a problem hiding this comment.
maybe either we don't specify the option in the description if that is subject to change, or we call the airflow.FetchTemplateList as part of the newAirflowInitCmd itself so that the description could be dynamic based on response, wdyt?
There was a problem hiding this comment.
Thanks @neel-astro ! I've updated the description of flag to remove the template names
cmd/airflow_test.go
Outdated
| } | ||
| for _, f := range files { | ||
| e := os.Remove(f) | ||
| e := os.RemoveAll(f) |
There was a problem hiding this comment.
maybe create and run the tests in a temp dir, and delete the temp dir itself at the end of each test as part of TearDownSubTest
cmd/airflow.go
Outdated
| cmd.Flags().StringVarP(&projectName, "name", "n", "", "Name of Astro project") | ||
| cmd.Flags().StringVarP(&airflowVersion, "airflow-version", "a", "", "Version of Airflow you want to create an Astro project with. If not specified, latest is assumed. You can change this version in your Dockerfile at any time.") | ||
| cmd.Flags().StringVarP(&fromTemplate, "from-template", "t", "", "Template name that you want to use to create the local astro project. Possible values can be etl, dbt-on-astro, generative-ai and learning-airflow. Please note template based astro project use latest runtime version and airflow-version flag will be ignored when creating a project with template flag") | ||
| cmd.Flags().BoolVarP(&selectTemplate, "select-template", "s", false, "Provides a list of templates to select from and create the local astro project based on selected template") |
There was a problem hiding this comment.
I quite like this UX, the user is presented with the valid choices and picks the one they want. Can we just make this the only way to create a project from a template? It could even be what --from-template does and then we don't need --select-template. We shouldn't need a non-interactive variant because this isn't going to be called by scripts/CI/etc.
There was a problem hiding this comment.
Hey @jeremybeard ,
The initial Idea was to provide users with simple set of commands that they can execute to set up template based astro project. Eg: astro dev init --from-template etl
If we keep only --select-template flag functionality then we will have to guide users in our onboarding workflow to select the template and set up astro project accordingly.
CC: @schnie
There was a problem hiding this comment.
Yea the interactive mode is definitely awesome but I would still like to have the one-shot option available so we can compose it with a few other commands when we display them during onboarding. It'd be easier to copy/paste/execute the little block of commands without the interactive piece running. So in a way, we are kind of scripting with it here, its just not something a user would hook up in a CI pipeline or something.
I really wish we could just use the same flag and run interactive mode when the option is not provided but it seems like that's not really possible :/
There was a problem hiding this comment.
@pritt20 we could use the same flag if we used the = format right?
So astro dev init --from-template would be interactive, and
astro dev init --from-template=etl would work for the one-shot version?
There was a problem hiding this comment.
yeah, this format can be implemented to use same flag for interactive and for one-shot version. Its just that we don't really use this format anywhere in CLI for existing commands😅
There was a problem hiding this comment.
From a CLI UX perspective, I might be a bit concerned about changing the behavior of a single flag in the CLI, where a user could only pass a flag value with =.
From what I understand based on comments if passing a template value in the command isn't going to be a major form of usage (because the user has to copy/paste the value from somewhere), then should we just skip that and always present an interactive mode? . astro dev init does not have a CI-CD use case, so we don't necessarily need to support a non-interactive mode.
There was a problem hiding this comment.
Our next steps for the onboarding flow are to show the user a small block of commands to copy and execute in their terminal. I think we want a non-interactive way to do this to reduce the keystrokes (minor, I know) and steps involved to get setup. It'd be great to just paste in the terminal and go with no more interactions.
Given that, we have a choice to either piggyback on the same flag (--from-template), or add a separate, but similar command. The latter feels a little awkward and a little confusing. For that reason, I lean towards just using the = format for this command alone right now, but keep our existing pattern for other flags. We can treat this as a special case for onboarding purposes.
I imagine most users exploring the CLI on their own would run the command interactively. For our onboarding use-case we want to specify the exact template to use, so we'd provide them a command to run that matches the template they selected when going through the onboarding flow, like astro dev init --from-template=generative-ai.
Maybe we just document the bare, interactive --from-template mode and reserve the = format (--from-template=generative-ai just for the case where we give them the command to copy/paste.
There was a problem hiding this comment.
Maybe we just document the bare, interactive --from-template mode and reserve the = format (--from-template=generative-ai just for the case where we give them the command to copy/paste.
That sounds good to me.
There was a problem hiding this comment.
Our next steps for the onboarding flow are to show the user a small block of commands to copy and execute in their terminal.
Cool, I was missing this context, I am on board with that approach 👍
There was a problem hiding this comment.
Thanks everyone! I've updated the PR to reflect above discussed changes.
airflow/runtime_templates.go
Outdated
| tarReader := tar.NewReader(gr) | ||
| var baseDir string | ||
|
|
||
| limitTarReader := io.LimitReader(tarReader, maxExtractSize) |
There was a problem hiding this comment.
Yeah, I initially tried without LimitReader and ran into vulnerability warning by go linter: G110: Potential DoS vulnerability via decompression bomb. Therefore modified the code to use LimitReader
There was a problem hiding this comment.
We are in control of the tar files so that doesn't apply in this case.
There was a problem hiding this comment.
Updated the code to use tarReader and have flagged it with nolint comment.
airflow/runtime_templates.go
Outdated
| } | ||
| defer out.Close() | ||
|
|
||
| if _, err = io.Copy(out, in); err != nil { |
There was a problem hiding this comment.
I've hoping we don't need to do this copy, but if we do I'd expect this should just be a rename not a literal copy.
There was a problem hiding this comment.
We do not require this copy function anymore as we are now extracting the template files directly to user directory, hence cleaned up this code.
airflow/runtime_templates_test.go
Outdated
| "github.com/stretchr/testify/assert" | ||
| ) | ||
|
|
||
| func TestFetchTemplateList_Success(t *testing.T) { |
There was a problem hiding this comment.
Could we use the test suite like we have in other unit tests, even though there might not be much common between these tests, that would put it in the framework that we would enable others to use it easily if need be down the line.
Also, all of these could be sub-tests for a test function in that.
There was a problem hiding this comment.
Yeah, that make sense! I've updated the tests to use test suit to keep these in line with existing tests.
cmd/airflow_test.go
Outdated
| }) | ||
|
|
||
| s.Run("success", func() { | ||
| config.WorkingPath = s.tempDir |
There was a problem hiding this comment.
this could be part of setup, wdyt?
There was a problem hiding this comment.
Thanks @neel-astro ! I've updated the tests now.
|
|
||
| func (s *AirflowSuite) cleanUpInitFiles() { | ||
| s.T().Helper() | ||
| files := []string{ |
There was a problem hiding this comment.
thanks for fixing this 🙇♂️
--template flag on CLI Project Initialization to setup template based project--from-template flag on CLI Project Initialization to setup template based project
neel-astro
left a comment
There was a problem hiding this comment.
Left some nitpicks to squash, but otherwise LGTM
jeremybeard
left a comment
There was a problem hiding this comment.
Looks good! Thanks for working through all the feedback. I left a nit comment but approving ahead of that
airflow/runtime_templates.go
Outdated
| } | ||
|
|
||
| func SelectTemplate(templateList []string) (string, error) { | ||
| TemplatesTab := printutil.Table{ |
|
Blocking feedback request was handled and approved async from Github. Administrative merge for that reason. |
Description
This PR includes changes to add
--from-templateflag toastro dev initcommand. This option will allow users to specify runtime templates name likeetl,dbt-on-astro,generative-aietc and CLI will setup an astro project based on provided template by user.These changes are extension to the enhancements being made in onboarding-workflow where once user has completed the process of creating the deployment and now wants to make changes to deployed dags or deployment, we want them to explore astro-cli option.
These changes will make it easy for users to modify/update dags as the template based project would setup all the required files which were auto-deployed during deployment create process as part of onboarding-workflow.
🎟 Issue(s)
Related https://github.com/astronomer/astro/issues/24343
🧪 Functional Testing
Tested the change on local machine:
Template:
generative-aiTemplate:
etlDefault Project:
Interactive UX test:
Template:
dbt-on-astroTemplate:
learning-airflow📸 Screenshots
📋 Checklist
make testbefore taking out of draftmake lintbefore taking out of draft