Skip to content

Conversation

@BradyBonnette
Copy link
Contributor

@BradyBonnette BradyBonnette commented Jun 9, 2023

The changes included here, at a 10,000 foot view:

  • An overhaul of how tests are ran on Amazon Linux 2 on aarch64 platforms. An adjacent git repo (not shown here) also includes a lot of work to make this happen. Basically, we relied on a "pool" of CI runners to sit and wait idle until a new job came through. Now, CI runners are launched ad-hoc, for as many runners that are needed with or without a matrix. This also allows for the re-run of individual failed tasks if required, whereas before this simply would not work
  • Re-introduces sccache back into the runs. Previously, PL/Rust and some required dependencies flat out not work with sccache. However, those issues have been addressed
  • Introduces a nightly run setup, which for now does two things: 1.) Tests to make sure that the Dockerfile.try setup works as per the documentation, and 2.) Tests that the Debian packages can be both built and installed. These will be ran nightly against the develop branch, and we can certainly add more to it in the future

NOTE: The nightly CI runs will not happen until these changes eventually land in main on next release.

Comment on lines -22 to +24
if aws s3api head-object --profile $ARTIFACT_USER_AWS_PROFILE --bucket $AWS_CACHE_BUCKET --key $cache_key &> /dev/null; then
if aws s3api head-object --bucket $AWS_CACHE_BUCKET --key $cache_key &> /dev/null; then
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the instances have direct access to the buckets they require.

# If something is missing, it can be added here or baked into a new image outside of Github Actions.
plrust_arm64_amzn2:
name: aarch64 tests (amzn2)
runs-on: [self-hosted, linux, ARM64, launch_template_id__lt-0013395587950cb5d]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

launch_template_id__lt-xxxxxxxxxx is important for the new CI setup. The private ops repo has more explanation on what this means.

cd $WORK_DIR
shopt -s globstar
checksum=$(cat **/Cargo.lock .github/workflows/ci.yml | sha256sum | awk '{print $1}')
checksum=$(cat **/Cargo.lock **/rust-toolchain.toml .github/workflows/ci.yml | sha256sum | awk '{print $1}')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adds any rust-toolchain.toml file as a cache key calculation

Comment on lines +236 to +237
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list' && \
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/apt.postgresql.org.gpg >/dev/null && \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old way of installing keys was deprecated by Ubuntu. This is the new way.

Comment on lines 14 to 20
# CODE REVIEW: Ensure this event setup is used and not on push/PR below!
# CODE REVIEW: Ensure this event setup is used and not on push/PR below!
# CODE REVIEW: Ensure this event setup is used and not on push/PR below!
# on:
# schedule:
# - cron: '0 7 * * *'
# workflow_dispatch:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I promise Ill change this before merging in. I am leaving it like this until it's been reviewed because any subsequent CI run that's needed in the future (pre-merging) should also re-run the nightly things to be sure.

name: Spin up aarch64 runner instances
arm64_deb_artifacts:
name: Build aarch64 Debian artifacts
runs-on: [self-hosted, linux, ARM64, launch_template_id__lt-0bad2911d6aad1b0d]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above regarding launch_template_id__lt-xxxxxxxxxxxx

Comment on lines -97 to +102
# The following permissions are required to instruct GHA that it is allowed
# to communicate to another service via OIDC
permissions:
id-token: write # Required for requesting OIDC JWTs
contents: read # This is required for actions/checkout

runs-on: ubuntu-latest

# In order to run this against aarch64 instances in AWS, GHA needs to authenticate
# against AWS via OIDC so it can directly communicate and launch AWS instances as
# needed. All the infrastructure and machinery is set up outside of Github Actions.
# Using OIDC is an alternative to setting up hard-coded AWS CLI credentials somewhere
# on a Github Action runner.
steps:
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v2
with:
aws-region: us-east-2
role-to-assume: arn:aws:iam::950481341027:role/github_oidc_iam_role
role-session-name: GithubPlrustSession

# Launch as many instances as needed, dictated by strategy.matrix.pg_version above.
# LaunchTemplateId is something known ahead of time, and is created elsewhere outside
# of Github Actions.
- name: Launch runner instance
run: aws ec2 run-instances --launch-template LaunchTemplateId=lt-0bad2911d6aad1b0d

build_deb_artifacts:
name: Build aarch64 Debian artifacts
needs: spin_up_aarch64_instances # Dont run this until 'spin_up_aarch64_instances' has completed
runs-on: [self-hosted, linux, ARM64, plrust_artifacts]

strategy:
matrix:
pg_version: [pg13, pg14, pg15] # See MATRIX NOTES above
contents: write
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OIDC is no longer needed as per the new CI launcher setup.

opt-level = 3
lto = "fat"
codegen-units = 1
strip = "debuginfo"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See: #305


[dev-dependencies]
compiletest_rs = "0.10"
compiletest_rs = "0.10.2"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An update to compiletest_rs was needed to make it compatible with sccache.

Copy link
Contributor

@eeeebbbbrrrr eeeebbbbrrrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to accept this on faith. I don't have enough XP to validate every line of these .yml and .sh files.

Just make sure to hit the scheduling section you already noted in your own review.

And thanks! Excited to get this merged -- it'll be great!

@BradyBonnette
Copy link
Contributor Author

@eeeebbbbrrrr

That's understandable, and also why I put a bunch of discussion comments above as to maybe take away some of the mystery. Also, we can use the actual CI runs/checks here as a guide as to whether or not this works properly.

@BradyBonnette BradyBonnette merged commit 6481666 into tcdi:develop Jun 9, 2023
@eeeebbbbrrrr eeeebbbbrrrr mentioned this pull request Jun 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants