Refactor CI #1693

Zerpet · 2024-08-06T11:46:03Z

Note to reviewers: remember to look at the commits in this PR and consider if they can be squashed

Summary Of Changes

Do not notify failed tests in PR
Add a test-ci rule branch
Split image building into different jobs
Install tools as part of unit test
Add permissions for steps using GCP
[ci] Use local builds in other jobs
[ci] Refactor build-test workflow
Fix examples
Update CMCTL binary
Use github token in carvel-setup action
Checkout code before setup GO
Re-enable GChat triggers

Additional Context

We are moving away from the massive CI image (~1GB) that served as a catch-all for tools, toward actions based
on small, simple steps that set up the worker as needed and rely on the worker cache.

Another important step is moving away from our GCP project for system tests, into local system tests
using KinD. This is important because it relieves the project from depending on sponsored infrastructure,
and it gives more independence to run its test suites.

There's an important fix to the examples job, which previously always passed. Now, any failure in the
examples will also fail the workflow. In PRs, we only dry-run the examples to verify that their syntax
is correct. Once merged, the examples run in KinD. The examples job is the heaviest, taking
about 17 minutes to run.
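
As a rough illustration of the PR versus post-merge behaviour described above, the sketch below picks between a dry run and a real apply based on the event that triggered the workflow. It assumes a client-side `kubectl apply --dry-run=client`; the PR does not say which dry-run mode the examples job actually uses. `apply_example` is a hypothetical helper, not a function from the repository. `GITHUB_EVENT_NAME` is the variable GitHub Actions sets for the triggering event.

```shell
#!/usr/bin/env bash
# Sketch only: apply_example is a hypothetical helper, not taken from the repository.
set -euo pipefail

# Apply one example manifest, choosing the mode from the triggering event.
apply_example() {
  local manifest="$1"
  if [[ "${GITHUB_EVENT_NAME:-}" == "pull_request" ]]; then
    # In PRs: only check that the manifest is well formed, create nothing.
    kubectl apply --dry-run=client --filename "${manifest}"
  else
    # After merge: create the resources in the cluster of the current context (KinD in CI).
    kubectl apply --filename "${manifest}"
  fi
}
```

A job step could then call `apply_example rabbitmq.yaml` for each example, where the manifest name is again only a placeholder.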

[ci] Refactor build-test workflow

The GCS and manifest part is now a separate job, so it doesn't delay the execution of other parts of the workflow. The kubectl-rabbitmq plugin tests now use the locally built image instead of building one themselves. This should speed up the execution of this job.

Fix examples

The examples test.sh script was not exiting when an error was returned. Any error in the test.sh scripts should now stop the execution and return the exit code.

Downscale vault-tls example: The GitHub runners struggle to run a 3-node RabbitMQ cluster due to resource constraints.

Delete obsolete example: Federation is better achieved using the Messaging Topology Operator. There's another example that covers how to set up TLS. Federation over TLS is a matter of combining the Topology Operator with the TLS example.

Fix external secret example: The test.sh file was missing, and this test was not excluded from the CI runs. Added a short script to validate that the external secret is used to seed the default user.

Removed system tests in GKE: They do not provide more value than local KinD system tests.

cmctl now waits watching the correct namespace: Earlier, if the kubeconfig's current namespace did not exist, cmctl would always fail, even when cert-manager was ready.

Skip mtls internode example tests: It requires a 3-node cluster, and this setup is not reliable inside a GitHub runner, given the resource constraints inside the runner.

Typo in kubectl tests
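
The fail-fast behaviour described for the examples' test.sh scripts can be sketched as follows. This is a minimal illustration, not the repository's actual script: the step functions are hypothetical stand-ins for the kubectl commands a real example test would run, and `VERIFY_EXIT_CODE` exists only so that a failure can be simulated.

```shell
#!/usr/bin/env bash
# Sketch of a fail-fast test.sh; the steps below are hypothetical placeholders.
# -e: exit on the first failing command, -u: error on unset variables,
# -o pipefail: a pipeline fails if any of its stages fails.
set -euo pipefail

deploy_example() { echo "deploying example"; }

# VERIFY_EXIT_CODE is an assumption of this sketch, used to simulate a failing check.
verify_example() { echo "verifying example"; return "${VERIFY_EXIT_CODE:-0}"; }

main() {
  deploy_example
  verify_example          # a non-zero status here aborts the script with that status
  echo "example passed"   # only reached when every step above succeeded
}

main "$@"
```

Running the script with `VERIFY_EXIT_CODE=3` exits with status 3 and never prints the final line, so a calling workflow step fails as well. One caveat of `set -e`: it is ignored inside a function that is called as part of an `if` condition or on the left side of `||`, so the script should be run as its own process rather than wrapped that way.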

Zerpet · 2024-08-08T12:41:31Z

@PujaVad @DanielePalaia would you have some bandwidth to review?
I'd like to merge this PR this week, as it is blocking other PRs.

DanielePalaia · 2024-08-08T13:10:26Z

Hey @Zerpet, it looks good to me. I noticed that test_upgrade still uses the GKE cluster. Are we planning to migrate it in a later step, or was it left out because of some limitation you ran into?

Zerpet · 2024-08-13T15:42:38Z

I left it out because I suspect we may have a resource limitation in Actions. The upgrade tests use a 3-node rabbit. Those were not quite deploying when I did this refactor. We can try a couple things to make those tests work in KinD. I would prefer to do that in a separate PR.

Zerpet added 11 commits August 6, 2024 12:29

Do not notify failed tests in PR

81a0063

The main reason is that PRs won't have the secret to post to the webhook URL. In addition, there's someone looking at PRs, at least the author, and therefore the failure will be noticed.

Add a test-ci rule branch

48a750c

This allows us to test CI changes in a dedicated branch, and keep the main branch history clean.

Split image building into different jobs

a7647b8

This allows them to run in parallel, unblocking other jobs that depend on the build_operator job.

Install tools as part of unit test

de6ee05

Unfortunately, we cannot get rid of this step, because we need to install testenv, and the setup-env binary for testing.

Add permissions for steps using GCP

02406c3

[ci] Use local builds in other jobs

70e1539

We are pushing every commit to a registry. This seems a bit unnecessary, especially given that we can build locally very quickly. It's more efficient to build and export to a tarball, and then share this tarball among other jobs.
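
One way to picture the build-and-share approach described above is the sketch below. The image tag, tarball path and cluster name are assumptions made for illustration, and the workflow may well use a different export mechanism (for example a buildx output) than plain `docker save`.

```shell
#!/usr/bin/env bash
# Sketch only: IMAGE, TARBALL and KIND_CLUSTER are hypothetical values, not taken from the workflow.
set -euo pipefail

IMAGE="${IMAGE:-rabbitmq-cluster-operator:dev}"   # hypothetical local tag
TARBALL="${TARBALL:-/tmp/operator-image.tar}"      # hypothetical artifact path
KIND_CLUSTER="${KIND_CLUSTER:-kind}"               # kind's default cluster name

# Build job: build the image locally and export it to a tarball,
# which the workflow can then share with other jobs as an artifact.
export_image() {
  docker build --tag "${IMAGE}" .
  docker save --output "${TARBALL}" "${IMAGE}"
}

# Consumer job: after fetching the tarball, load it into the KinD cluster
# so pods can use the image without pulling from a registry.
load_image_into_kind() {
  kind load image-archive "${TARBALL}" --name "${KIND_CLUSTER}"
}
```

The build job would call `export_image` and the test jobs `load_image_into_kind`. Transferring the tarball between jobs is left to the workflow and is not shown here.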

Update CMCTL binary

04aa626

Starting with cert-manager 1.15.x, the cmctl CLI and cert-manager are shipped from different repos, as separate software.

Use github token in carvel-setup action

25773bd

To avoid rate limiting from GitHub.

Checkout code before setup GO

e3b14e5

This allows the setup-go action to cache our dependencies. In order to cache our deps, the go.sum file needs to be present before executing the setup-go action.

Zerpet self-assigned this Aug 6, 2024

Zerpet requested review from DanielePalaia, MirahImage and PujaVad August 6, 2024 11:46

Re-enable GChat triggers

5ce3583

Zerpet force-pushed the test-ci/drop-gcr branch from b5c3ca7 to 5ce3583 Compare August 6, 2024 11:59

Zerpet mentioned this pull request Aug 6, 2024

Default to RabbitMQ 3.13.6 #1691

Merged

MirahImage reviewed Aug 6, 2024

docs/examples/federation-over-tls/README.md (resolved)

Zerpet added 2 commits August 6, 2024 13:20

Always build manifest in CI

69c9f1e

Fix single-arch builds in CI

0f722bb

Login to Dockerhub, otherwise, the push will fail. [skip ci]

Zerpet merged commit fb5e375 into main Aug 14, 2024

Zerpet deleted the test-ci/drop-gcr branch August 14, 2024 10:30

Zerpet linked an issue Aug 14, 2024 that may be closed by this pull request

Fix CI #1686

Closed
