-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Update workflows to use ARM runners and new Apple signing certificate #5977
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In commit df6e49c of PR git-lfs#5243 we added a step to all of the jobs in our CI and release GitHub Actions workflows to work around the problem described in actions/checkout#290 and actions/checkout#882. This step, which only executes if the job is running due to the push of a new tag, performs a "git fetch" command of the tag's reference to ensure that the local copy is identical to the remote one and has not been converted from an annotated tag into a lightweight one. Starting with v2 of the actions/checkout action, annotated tags are by default replaced with lightweight ones, which then causes any subsequent "git describe" commands to return an incorrect value. Since we depend on the output of "git describe" in several places in our workflows to generate the appropriate version name for our release artifacts, we need to ensure we have the full annotated tag for the current reference rather than a lightweight one. Recently, in commit 9c3fab1 of PR git-lfs#5930, we strengthened the security of our GitHub Action workflows by setting the "persist-credentials" option of the actions/checkout action to "false", so that any cached Git credentials are removed at the end of that step. While this causes no problems when our CI workflow runs after a branch is pushed, as is the case for new PRs, when we push a new tag the "git fetch" step now fails as it depends on the cached Git credentials from the actions/checkout step. We could use the GITHUB_TOKEN Action secret to temporarily set an appropriate HTTP Authorization header to make the "git fetch" command succeed. However, a more straightforward solution exists whereby we specify explicitly the reference we want to check out using the "ref" option of the actions/checkout action. This causes the action to fetch the reference such that if the reference is an annotated tag, it remains one and is not converted into a lightweight one. For reference, see: actions/checkout#882 (comment) actions/runner-images#1717 (comment) h/t classabbyamp and xenoterracide for documenting this workaround
Since PR git-lfs#3840 introduced a GitHub Actions workflow to automate much of the process of releasing a new version of Git LFS, that workflow has concluded by running the script/upload script, which makes requests to GitHub's API to create a new draft release announcement and attach the binary release assets built by the workflow to the announcement. This step necessarily requires that the GITHUB_TOKEN credential provided to the workflow have permission to create a release, which implies fairly broad write access to the repository contents as GitHub does not offer a more fine-grained permission model for this particular task. In order to ensure that our GitHub Actions workflows can not be misused, we would much prefer to run them without write permissions to our repositories. We therefore revise the "build-main" job in our release workflow to remove the step which tries to run the script/upload script, and replace it with steps that instead create a "release-assets" artifact containing the binary release assets built by the workflow. The administrator performing the release may then download this artifact and unpack it, after which they can run the script/upload script manually using their own personal GitHub credentials. We update our release process documentation to explain this new approach and also clarify a few other details of the release process. This change does make our release process slightly more manual, and if in the future GitHub provides a convenient way to enable a single job to have just sufficient permissions to create a draft release announcement, we can revisit this decision in the future.
In commit bfc5304 of PR git-lfs#4143 we introduced a set of Makefile targets which sign and notarize binaries on macOS using the "codesign" and "security" utilities, among others. The recipes for these targets create a new, dedicated keychain and add our Apple Developer ID certificate to it, and then use the keychain to sign the macOS binaries we build. Although we refer to this keychain in some commands using the Makefile variable DARWIN_KEYCHAIN_ID, whose default is the value "CI.keychain", in other commands we refer to that keychain name explicitly. As a consequence, the variable can not actually be set to something other than its default value without causing problems. As we expect to further revise some of these Makefile targets in subsequent commits in this PR, we first adjust their recipes to use the DARWIN_KEYCHAIN_ID variable consistently, and we also take the opportunity to alter its default value to "lfs.keychain", since we run these recipes in both our CI and release GitHub Actions workflows, not just in the CI workflow.
In commit bfc5304 of PR git-lfs#4143 we introduced a set of Makefile targets which sign and notarize binaries on macOS using the "codesign" and "security" utilities, among other tools. The recipes for these targets create a new, dedicated keychain and add our Apple Developer ID certificate to it, and then use the keychain to sign the macOS binaries we build. These recipes include several commands which generate a considerable amount of detailed diagnostic log output, making the logs somewhat difficult to read. We can clarify the steps involved in the signing process by adding several log statements, removing commands which only generate extra diagnostics, and silencing the output of others. If we encounter problems with the macOS signing process in the future, we can always reinstate these commands while running the Makefile recipes on a local machine or in a private repository in order to trace the source of the issues.
In commit bfc5304 of PR git-lfs#4143 we introduced a set of Makefile targets which sign and notarize binaries on macOS using the "codesign" and "security" utilities, among other tools. The recipes for these targets create a new, dedicated keychain and add our Apple Developer ID certificate to it, and then use the keychain to sign the macOS binaries we build. At present, we retrieve our Apple Developer ID certificate and password from GitHub Actions secrets, along with our Developer ID signing identity, which is provided in an Actions secret named MACOS_CERT_ID. We then pass that identity to the "codesign" command with the -s option (i.e., the --sign option) in order to sign our binary release assets. In a subsequent commit in this PR we will modify our release GitHub Actions workflow to retrieve the certificate data from a new system, as our current Apple Developer ID certificate expired recently and can not simply be renewed. This new system to manage Apple Developer ID certificates does not provide the signing identity in an Actions secret, just the certificate and password. Fortunately, we can retrieve the signing identity from the local macOS keychain we have created in the recipe for our "release-import-certificate" Makefile target, using the "find-identity" subcommand of the "security" command, so we do that now at the start of our "release-darwin" target's recipe. This in turn allows us to remove the DARWIN_CERT_ID variable from our Makefile and update our release workflow as it no longer needs to set that environment variable from the MACOS_CERT_ID Actions secret.
In commit bfc5304 of PR git-lfs#4143 we introduced a set of Makefile targets which sign and notarize binaries on macOS using the "codesign" and "security" utilities, among other tools. The recipes for these targets create a new, dedicated keychain and add our Apple Developer ID certificate to it, and then use the keychain to sign the macOS binaries we build. Our most recent certificate expired recently, and per the current Apple documentation, a maximum of five certificates are permitted for each Apple Developer ID: You can create up to five Developer ID Application certificates and up to five Developer ID Installer certificates using either your developer account or Xcode. https://developer.apple.com/help/account/create-certificates/create-developer-id-certificates As our Developer ID is provided to us by GitHub, and they distribute multiple other projects which also need to be signed, we are no longer able to use a certificate dedicated just to Git LFS, and so need to adopt GitHub's process for managing these certificates. To do so we alter our release GitHub Actions workflow to retrieve the necessary certificate and password from the Actions secrets provided by GitHub's internal tooling. This tool makes the certificate data available in Actions environment secrets using an environment named "production", so we add that environment to most of the jobs in our release workflow. When we push a new release tag and our release workflow is enqueued, it will now wait to begin running these jobs until we manually validate the workflow in GitHub's Apple Developer ID certificate management interface, so we add that step to our release process documentation. Note that we exclude the "build-main" job in our release workflow from the "production" environment because it runs only after the "build-macos" and "build-windows" jobs are complete, and if it ran in the "production" environment it would therefore require a second manual validation in the GitHub certificate management tool. As this would be inconvenient and unnecessary, we just allow the job to run in the default environment.
In the "build-docker" and "build-docker-cross" jobs in our CI GitHub Actions workflow we currently install Ruby using the ruby/setup-ruby action, which has been true since we introduced the first of these jobs in commit 212c051 of PR git-lfs#3932. However, unlike the corresponding jobs in our release workflow, where we need the Ruby interpreter in order to run our script/packagecloud.rb utility at the end of the job, we do not actually require Ruby for the CI versions of these jobs, so we can simply remove the ruby/setup-ruby step from them now.
In commit 0044d78 of PR git-lfs#4728 we added the "build-docker-cross" job both our CI and release GitHub Actions workflows in order to build Debian Linux packages of Git LFS on the ARM64 platform. We only build an ARM64 package for Debian 12 because the build process is very slow, as we have had to use the Quick Emulator (QEMU) on AMD64 (x86_64) runners. Lately, some of these CI jobs have failed due to segmentation faults reported from the gcc compiler when it attempts to compile various modules from the Go standard packages. These failures appear to be unrelated to any changes in our code, but may be due to issues with the ARM64 emulation or some other dependency. Fortunately, GitHub has recently made ARM64 Actions runners available for use in public repositories like ours: https://github.com/orgs/community/discussions/148648 This means we can simply replace our "build-docker-cross" jobs with "build-docker-arm" jobs that utilize the ARM64 runners, and which do not exhibit the same kinds of failures with the gcc compiler. The build time for these new jobs is significantly faster than the old ones which ran in emulation, to the point where it now becomes feasible for us to build a Debian 11 package as well as the Debian 12 one. (We skip building a Debian 10 package since we expect to drop support for that platform fairly soon.) We should also be able to build an RPM package for ARM64 using Rocky 9, once we make some updates to our RPM package metadata files and the corresponding Dockerfile in our git-lfs/build-dockers repository. For the time being, however, we defer these changes to future PRs.
larsxschneider
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you 🙇
| ```ShellSession | ||
| $ script/upload --finalize vM.N.P | ||
| ``` | ||
| $ script/upload --skip-verify vM.N.P |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you skip verify here because this step would also try to download the asset which we already have downloaded manually. However, wouldn't it make sense to verify the signatures of the files that we have locally before we upload them or is that already checked elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "verify" step in this script downloads the assets attached to the release announcement page and then verifies that the GPG-signed hashes.asc and sha256sums.asc are valid and that the hashes in those files match the hashes of the binary assets. So that's a different operation from downloading the assets from the summary page of the release Actions job, which we'll now do manually, at least for the time being.
When we run script/upload --skip-verify without the --finalize option, that will create the draft release announcement page, and then upload the binary assets (which we've downloaded manually from the Actions job) as attachments to that page. The hashes and sha256sums files have not been created yet.
When we run script/upload --finalize, that will first download the binary assets attached the draft release announcement (as created by script/upload --skip-veify), create the hashes.asc and sha256sum.asc files for the first time, and then upload those and attach them to the announcement page. Finally, because we don't use --skip-verify in this version of the command, it will end by verifying, meaning it will download the binary assets and the hash files from the draft release announcement and check that they match.
Lastly, we manually click the "Publish" button on the announcement page, which makes the release public.
Users have recently requested packages for arm64, which is an increasingly popular architecture. In addition, it is the architecture used on newer Macs and some Windows systems, which may commonly run Linux VMs or WSL. However, GitHub Actions does not support arm64 natively, so we must build in emulation. This is extraordinarily slow, so we'll build only for the latest Debian release. All Debian-based images use our Debian images, and Debian is by far the most portable mainstream Linux distribution in terms of architecture support, so the cost to add additional architectures should be extremely minimal. Users who would like versions for older releases may build their own packages with our tooling. Build these packages in both CI and for releases.
This PR updates our GitHub Actions workflows so they avoid several permissions issues, make use of the newly available ARM runners, and sign our macOS release binaries with a new Apple Developer ID certificate that replaces the previous one, which has now expired.
First, we revise our CI and release workflows to avoid a problem introduced when we changed the
actions/checkoutactions in PR #5930 to no longer leave Git credentials in place for subsequent steps to use. While this had no effect on CI jobs run for PRs, it does affect CI and release jobs run when we push a new Git LFS version tag, because thegit fetchcommands we added in PR #5243 to work around the problem described in actions/checkout#290 and actions/checkout#882 now fail due to the lack of cached Git credentials.Next, because we would like to run our Actions workflows with the minimal required permissions, we alter the
build-mainjob in our release workflow so that is does not try to create a draft release announcement or attach binary release assets to that announcement. At present, this step is the only one in our workflows which requires write permissions on our whole repository's contents, and GitHub does not offer a way to make those permissions more fine-grained (e.g., by limiting them to just the creation of release announcements, for instance). Instead, in the future we will just run ourscript/uploadscript manually, which allows us to restrict all our workflows to having just read permissions on our repository contents.We then change our release workflow to replace our expired Apple Developer ID certificate with a new one that is provided by GitHub. Because Apple limits the number of certificates which can be created for a Developer ID to just five, GitHub is no longer able to assign a dedicated certificate to this project. In order to share a certificate with other projects, we adopt a GitHub management tool which allows us to approve our release workflow and confirm the use of the new certificate's secrets to sign our release binaries.
Finally, we revise our Actions workflows to make use of the ARM64 runners GitHub now offers for public repositories like ours. This allows us to bypass some recent issues seen in our Linux package builds that use QEMU-based ARM64 emulation, and because these jobs now run significantly faster than before, we can add a build of a Debian 11 release package for ARM64 systems. (We should also be able to offer a Rocky 9 RPM package for ARM64 systems, once we make some adjustments to our RPM package metadata and Docker containers.)
The changes to our release workflow in this PR have been tested using a private repository to confirm they work as expected when a new version tag is pushed. One exception is the use of ARM64 runners in the release workflow, as GitHub does not yet support ARM64 runners in private repositories without an Enterprise account, but we have tested these build jobs in our CI workflow in public repositories to confirm they succeed.
This PR will be most easily reviewed on a commit-by-commit basis.