{"id":60543,"date":"2020-12-23T07:00:46","date_gmt":"2020-12-23T15:00:46","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/devops\/?p=60543"},"modified":"2020-12-20T16:40:56","modified_gmt":"2020-12-21T00:40:56","slug":"using-azure-machine-learning-from-github-actions","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/devops\/using-azure-machine-learning-from-github-actions\/","title":{"rendered":"Using Azure Machine Learning from GitHub Actions"},"content":{"rendered":"<p><a href=\"https:\/\/azure.microsoft.com\/services\/machine-learning\/?WT.mc_id=devops-9793-dabrady\">Azure Machine Learning<\/a> is an Enterprise-grade Machine Learning service that can help you build and deploy your predictive models faster. It also has a number of features to help you mature your machine learning process with MLOps.<\/p>\n<p>One of the important steps a data science team should take when starting down an MLOps path is to <a href=\"https:\/\/damianbrady.com.au\/2020\/11\/02\/parts-of-an-ml-project-in-source-control\/\">put all their code in source control<\/a>. This allows the team to collaborate effectively and enables continuous integration in much the same way as we see for traditional software development. When any of the training code changes, we can automatically merge it with the rest of the team&#8217;s changes, check the quality, and even automate training a new model.<\/p>\n<h2>Why not use GitHub Actions for everything?<\/h2>\n<p>Automation tools like GitHub Actions are great for continuous integration and continuous delivery\/deployment (CI\/CD). They&#8217;re designed to take your code and build it. They&#8217;re also great for running tests, checking quality, and communicating with third party services.<\/p>\n<p>But the tasks we use for traditional software development tend to have limited time requirements. Even long-running builds tend to take minutes rather than hours. For example, the main build for <a href=\"https:\/\/azure.microsoft.com\/services\/devops?WT.mc_id=devops-9793-dabrady\">Azure DevOps<\/a> takes about an hour at most!<\/p>\n<p>By contrast, a training run for a reasonably complicated predictive model might take hours, days, or even weeks. It might require tens or hundreds of virtual machines with multiple GPUs and process terabytes or petabytes of data. GitHub Actions, Azure Pipelines, and other similar products are simply not built for this.<\/p>\n<p>Azure Machine Learning <em>does<\/em> support long-running, highly data- and compute-intensive pipelines. It&#8217;s designed specifically for that use case, with easy access to scalable cloud compute and data sources.<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/devops\/wp-content\/uploads\/sites\/6\/2020\/12\/github-actions.png\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/devops\/wp-content\/uploads\/sites\/6\/2020\/12\/github-actions-300x176.png\" alt=\"GitHub Actions\" width=\"600\" height=\"352\" class=\"alignnone size-medium wp-image-60546\" srcset=\"https:\/\/devblogs.microsoft.com\/devops\/wp-content\/uploads\/sites\/6\/2020\/12\/github-actions-300x176.png 300w, https:\/\/devblogs.microsoft.com\/devops\/wp-content\/uploads\/sites\/6\/2020\/12\/github-actions-1024x602.png 1024w, https:\/\/devblogs.microsoft.com\/devops\/wp-content\/uploads\/sites\/6\/2020\/12\/github-actions-768x451.png 768w, https:\/\/devblogs.microsoft.com\/devops\/wp-content\/uploads\/sites\/6\/2020\/12\/github-actions.png 1043w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/a><\/p>\n<h2>So why not use Azure Machine Learning for everything?<\/h2>\n<p>A reasonable question! If Azure Machine Learning is designed for these workloads, why not use it in isolation?<\/p>\n<p>The short answer is services like GitHub Actions give greater control over the orchestration of a pipeline in its entirety. While the <em>training run<\/em> is ideally handled in a service like Azure Machine Learning, GitHub is great at: * <a href=\"https:\/\/docs.github.comfree-pro-team@latest\/github\/creating-cloning-and-archiving-repositories\/about-repositories\">Managing your team&#8217;s code<\/a>, * <a href=\"https:\/\/docs.github.comfree-pro-team@latest\/actions\/reference\/events-that-trigger-workflows\">Triggering workflows on code changes<\/a>, * <a href=\"https:\/\/devblogs.microsoft.com\/devops\/i-need-manual-approvers-for-github-actions-and-i-got-them-now\/?WT.mc_id=devops-9793-dabrady\">Controlling the rollout of your model to test and production environments<\/a>, * <a href=\"https:\/\/docs.github.com\/free-pro-team@latest\/github\/collaborating-with-issues-and-pull-requests\/about-pull-requests\">Pull request processes<\/a>, * <a href=\"https:\/\/docs.github.comfree-pro-team@latest\/github\/administering-a-repository\/about-protected-branches\">Branch protection<\/a>, * <a href=\"https:\/\/github.com\/features#features-project-management\">Project management tools<\/a>, * <a href=\"https:\/\/github.com\/features\/security\">Advanced security features<\/a><\/p>\n<p>All of these features can help add another layer of DevOps maturity.<\/p>\n<p>The key here is to use the right tool for the right job. Pipelines that define the training run can be incredibly powerful, and that&#8217;s where you should focus your Azure Machine Learning efforts.<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/devops\/wp-content\/uploads\/sites\/6\/2020\/12\/azure-ml-pipeline.png\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/devops\/wp-content\/uploads\/sites\/6\/2020\/12\/azure-ml-pipeline-300x168.png\" alt=\"Pipelines in Azure Machine Learning\" width=\"600\" height=\"336\" class=\"alignnone size-medium wp-image-60547\" srcset=\"https:\/\/devblogs.microsoft.com\/devops\/wp-content\/uploads\/sites\/6\/2020\/12\/azure-ml-pipeline-300x168.png 300w, https:\/\/devblogs.microsoft.com\/devops\/wp-content\/uploads\/sites\/6\/2020\/12\/azure-ml-pipeline-1024x575.png 1024w, https:\/\/devblogs.microsoft.com\/devops\/wp-content\/uploads\/sites\/6\/2020\/12\/azure-ml-pipeline-768x431.png 768w, https:\/\/devblogs.microsoft.com\/devops\/wp-content\/uploads\/sites\/6\/2020\/12\/azure-ml-pipeline.png 1468w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/a><\/p>\n<h2>How do we use them together?<\/h2>\n<p>There are three ways to work with Azure Machine Learning from GitHub Actions: 1. <a href=\"https:\/\/docs.microsoft.com\/python\/api\/overview\/azure\/ml\/?WT.mc_id=devops-9793-dabrady\">The Python SDK<\/a> 2. <a href=\"https:\/\/docs.microsoft.com\/azure\/machine-learning\/tutorial-train-deploy-model-cli?WT.mc_id=devops-9793-dabrady\">The Azure ML CLI<\/a> 3. <a href=\"https:\/\/github.com\/Azure\/actions-workflow-samples\/tree\/master\/MachineLearning\">GitHub Actions for Azure Machine Learning<\/a><\/p>\n<p>The best way to see some of these in action is to check out the <a href=\"https:\/\/github.com\/Azure\/azureml-examples\">Azure ML examples on GitHub<\/a>.<\/p>\n<p>Let&#8217;s look at how to run an Azure Machine Learning pipeline from GitHub Actions using each of these methods.<\/p>\n<h3>Using the Python SDK<\/h3>\n<p>The Python SDK is the most powerful way of integrating, but requires a little more work than the other methods.<\/p>\n<p>In your Actions workflow, you&#8217;ll need to set up the correct version of Python, then install the <a href=\"https:\/\/docs.microsoft.com\/python\/api\/azureml-core\/azureml.core?WT.mc_id=devops-9793-dabrady\"><code>azureml-core<\/code><\/a> python package. Finally, run a python script that makes use of the SDK.<\/p>\n<p><strong>Here&#8217;s an example of the job and steps you might have in your workflow to run a training job using the Python SDK:<\/strong><\/p>\n<pre><code class=\"yaml\">jobs:\n  build:\n    runs-on: ubuntu-latest \n\n    steps:\n    - name: check out repo\n      uses: actions\/checkout@v2\n\n    - name: setup python\n      uses: actions\/setup-python@v2\n      with: \n        python-version: \"3.8\"\n\n    - name: pip install\n      run: pip install azureml-core\n\n    - name: run workflow \n      run: python job.py\n<\/code><\/pre>\n<p>There are great examples of how to work with the Python SDK in the <a href=\"https:\/\/github.com\/Azure\/azureml-examples\"><code>azureml-examples<\/code> repository in GitHub<\/a>.<\/p>\n<p><strong>Here&#8217;s an example of a python script that triggers a training run on the MNIST data set using tensorflow (<a href=\"https:\/\/github.com\/Azure\/azureml-examples\/blob\/main\/workflows\/train\/tensorflow\/mnist\/job.py\">you can find the file here<\/a>):<\/strong><\/p>\n<pre><code class=\"py\"># description: train tensorflow NN model on mnist data\n\n# imports\nfrom pathlib import Path\nfrom azureml.core import Workspace\nfrom azureml.core import ScriptRunConfig, Experiment, Environment\n\n# get workspace\nws = Workspace.from_config()\n\n# get root of git repo\nprefix = Path(__file__).parent\n\n# training script\nscript_dir = str(prefix.joinpath(\"src\"))\nscript_name = \"train.py\"\n\n# environment file\nenvironment_file = str(prefix.joinpath(\"environment.yml\"))\n\n# azure ml settings\nenvironment_name = \"tf-gpu-example\"\nexperiment_name = \"tf-mnist-example\"\ncompute_name = \"gpu-cluster\"\n\n# create environment\nenv = Environment.from_conda_specification(environment_name, environment_file)\n\n# create job config\nsrc = ScriptRunConfig(\n    source_directory=script_dir,\n    script=script_name,\n    environment=env,\n    compute_target=compute_name,\n)\n\n# submit job\nrun = Experiment(ws, experiment_name).submit(src)\nrun.wait_for_completion(show_output=True)\n<\/code><\/pre>\n<p>Note that in this example, we wait for the experiment to complete. For long-running training jobs, we could just leave out the final line and our GitHub Actions workflow would finish, leaving the experiment to complete in its own time.<\/p>\n<h3>The Azure ML CLI<\/h3>\n<p>The Azure CLI is also extremely powerful. Because it&#8217;s an extension to the <a href=\"https:\/\/docs.microsoft.com\/cli\/azure\/install-azure-cli?WT.mc_id=devops-9793-dabrady\">cross-platform Azure CLI<\/a>, it can be used almost anywhere.<\/p>\n<p>The <a href=\"https:\/\/docs.microsoft.com\/azure\/machine-learning\/tutorial-train-deploy-model-cli?WT.mc_id=devops-9793-dabrady\">Azure ML CLI documentation<\/a> links to a great <a href=\"https:\/\/github.com\/microsoft\/MLOps\">example repository showing how to use the CLI<\/a>, but it doesn&#8217;t specifically talk about using it in the context of CI\/CD pipeline. Of course the great thing about GitHub Actions is that if you can script it, you can run it &#8211; so we can easily run the CLI from a workflow!<\/p>\n<p>The Azure CLI is already installed on the hosted agents you&#8217;ll use in your workflow. That means you just need to install the Azure Machine Learning extension. From there you&#8217;ll need a step to log into Azure and you&#8217;re ready to go!<\/p>\n<p><strong>Here&#8217;s an example of the job and steps you might have in your workflow:<\/strong><\/p>\n<pre><code class=\"yaml\">jobs:\n  build:\n    runs-on: ubuntu-latest \n    steps:\n    - name: check out repo\n      uses: actions\/checkout@v2\n    - name: azure login\n      uses: azure\/login@v1\n      with:\n        creds: ${{secrets.AZURE_CREDENTIALS}}\n    - name: attach workspace\n      run: az ml folder attach -w myworkspace -g myresourcegroup\n    - name: run pipeline\n      run: az ml run submit-pipeline -n myproject-pipeline -y aml_config\/pipeline.yml\n<\/code><\/pre>\n<p>That final line is a call to Azure Machine Learning to submit a pipeline for execution called <code>myproject-pipeline<\/code> defined in <code>aml_config\/pipeline.yml<\/code> in the repository.<\/p>\n<p>You can find more details of how this all works in the <a href=\"https:\/\/github.com\/microsoft\/AML-R-acceleration-template\"><code>AML-R-acceleration-template<\/code> repository in GitHub<\/a>.<\/p>\n<h3>The Azure GitHub Actions for Azure ML<\/h3>\n<p>Finally, Microsoft has published a number of GitHub Actions you can use directly in your workflow. You can find them all by [searching the Marketplace for &#8220;Azure Machine Learning&#8221;(https:\/\/github.com\/marketplace?type=actions&amp;query=azure+machine+learning).<\/p>\n<p>This is by far the easiest way to use Azure Machine Learning from GitHub Actions, however each Action has a single specific purpose. By contrast, The CLI and Python SDK allow you to perform any operation in Azure Machine Learning.<\/p>\n<p><strong>Here&#8217;s an example of a manually-run workflow that uses the Azure ML Actions to deploy a trained model:<\/strong><\/p>\n<pre><code class=\"yaml\">name: Deploy my model\non:\n  workflow_dispatch:\n    inputs:\n      name:\n        description: 'Model to deploy'\n        required: true\n        default: 'mymodel'\n      version:\n        description: 'Version'\n        required: true\n        default: '1'\n\njobs:\n  build:\n    runs-on: ubuntu-latest\n\n    steps:\n    - name: check out repo\n      uses: actions\/checkout@v2\n\n    - name: connect to workspace\n      uses: Azure\/aml-workspace@v1\n      with:\n        azure_credentials: ${{ secrets.AZURE_CREDENTIALS }}\n\n    - name: deploy model\n      uses: Azure\/aml-deploy@v1\n      with:\n        azure_credentials: ${{ secrets.AZURE_CREDENTIALS }}\n        model_name:  ${{ github.event.inputs.name }}\n        model_version: ${{ github.event.inputs.version }}\n<\/code><\/pre>\n<p>If you&#8217;re interested, you can have a look at the code behind these Actions! For example, <a href=\"https:\/\/github.com\/Azure\/aml-deploy\">here&#8217;s the repository with the aml-deploy action<\/a>.<\/p>\n<h2>Get Started<\/h2>\n<p>Using GitHub alongside Azure Machine Learning gives you the best of both worlds and lets you use the right tool for the job without compromising.<\/p>\n<p>To get started with Azure Machine Learning, I can highly recommend the <a href=\"https:\/\/docs.microsoft.com\/learn\/paths\/build-ai-solutions-with-azure-ml-service\/?WT.mc_id=devops-9793-dabrady\">Build AI solutions with Azure Machine Learning<\/a> learning path on Microsoft Learn. It takes you through the product step by step, and best of all, it&#8217;s free!<\/p>\n<p>There&#8217;s also a <a href=\"https:\/\/docs.microsoft.com\/learn\/modules\/github-actions-automate-tasks\/?WT.mc_id=devops-9793-dabrady\">great learning path that can teach you how to use GitHub Actions<\/a>. While it doesn&#8217;t have anything to do with machine learning, it will equip you with the skills you need to get started.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Azure Machine Learning is the ideal product to help you mature your machine learning process with MLOps. Even better, it integrates very easily with GitHub Actions, enabling you to train your models automatically when your code changes.<\/p>\n","protected":false},"author":39350,"featured_media":60549,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[226,1],"tags":[],"class_list":["post-60543","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ci","category-devops"],"acf":[],"blog_post_summary":"<p>Azure Machine Learning is the ideal product to help you mature your machine learning process with MLOps. Even better, it integrates very easily with GitHub Actions, enabling you to train your models automatically when your code changes.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/devops\/wp-json\/wp\/v2\/posts\/60543","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/devops\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/devops\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/devops\/wp-json\/wp\/v2\/users\/39350"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/devops\/wp-json\/wp\/v2\/comments?post=60543"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/devops\/wp-json\/wp\/v2\/posts\/60543\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/devops\/wp-json\/wp\/v2\/media\/60549"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/devops\/wp-json\/wp\/v2\/media?parent=60543"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/devops\/wp-json\/wp\/v2\/categories?post=60543"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/devops\/wp-json\/wp\/v2\/tags?post=60543"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}