GitLab CI Notes
Tools
pip
twine
poetry
CI Variables
GitLab Runner
The agent that runs the scripted commands of a GitLab CI/CD pipeline, allowing the
automation of tasks, tests, builds, and deployments.
Docker Executor
GitLab Runner can execute jobs in different environments, such as a shell, virtual machines,
or Docker. We use the Docker executor, which essentially:
1. Pulls a Docker image (from a remote registry, or a custom one)
2. Creates a container from the image
3. Executes the job's commands in the container's shell
4. Removes the container
The files required by the job are mounted into the container. With the Docker executor we
are free to install packages and modify the build and test environments without affecting
the host machine's system.
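The four steps above can be approximated with plain docker CLI calls. The following is a minimal sketch, not how GitLab Runner is actually implemented: the function name, the container name ci-job, the mount paths, and the use of "sleep infinity" to keep the container alive are all assumptions made for illustration. The function only builds the commands; it does not run them.

```python
import shlex


def docker_executor_commands(image, script, mounts, name="ci-job"):
    """Build docker CLI calls that roughly mirror the four executor steps.

    `name` and the keep-alive command are hypothetical choices for this sketch.
    """
    create = ["docker", "create", "--name", name]
    for host_path, container_path in mounts.items():
        # Mount the files the job needs into the container.
        create += ["-v", f"{host_path}:{container_path}"]
    # Keep the container alive so each script line can be exec'd into it.
    create += [image, "sleep", "infinity"]

    commands = [
        ["docker", "pull", image],    # 1. pull the image
        create,                       # 2. create a container from it
        ["docker", "start", name],
    ]
    # 3. execute every task in the container's shell
    commands += [["docker", "exec", name, "sh", "-c", line] for line in script]
    # 4. remove the container
    commands.append(["docker", "rm", "-f", name])
    return commands


if __name__ == "__main__":
    demo = docker_executor_commands(
        "python:3.9", ["python -V"], {"/builds/demo": "/builds/demo"}
    )
    for cmd in demo:
        print(shlex.join(cmd))
```

Because the host is only touched through the mounted paths, anything installed by the script lines disappears together with the container in the last step.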
CI/CD Variables
CI/CD variables are environment variables that the runners make available to jobs. They can
be defined manually in the UI. There are also predefined ones, for instance
$CI_COMMIT_BRANCH, which holds the name of the branch the pipeline is running for (it is
set in branch pipelines and is not available in merge request or tag pipelines).
CI/CD variables can be defined per project or per group (for example, for the CHIDOS group).
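Inside a job, CI/CD variables are ordinary environment variables, so any script can read them. A minimal sketch, assuming a hypothetical helper named job_context; the variable names are the ones used in these notes:

```python
import os


def job_context(env=None):
    """Collect a few CI/CD variables; missing ones come back as None."""
    env = os.environ if env is None else env
    return {
        # Predefined by GitLab; not set in merge request pipelines.
        "branch": env.get("CI_COMMIT_BRANCH"),
        # Predefined by GitLab, e.g. "push" or "merge_request_event".
        "source": env.get("CI_PIPELINE_SOURCE"),
        # Custom group-level variable defined in the UI.
        "registry": env.get("PACKAGE_REGISTRY"),
    }


if __name__ == "__main__":
    print(job_context())
```

Passing a plain dict instead of relying on os.environ makes the helper easy to try outside a pipeline.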
GitLab Package Registry
GitLab provides package registries, i.e. registries where we can store pre-built
Python (and npm) packages, serving as a private PyPI. Although each repo can have its own
package registry, a good practice is to use a single shared one per group or instance. This
way, all related packages are published to, and fetched from, the same URL.
.[Link]
A configuration file stored in every repo, which tells GitLab which jobs to trigger. GitLab runs
these jobs on the available runners, which may use the shell, Docker, or VM executors. An
example .[Link] is shown below.
stages:
  - test
  - build

test job:
  stage: test
  script:
    - echo "Running some tests..."
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'

build and publish:
  stage: build
  image: python:3.9
  before_script:
    - python -V
    - apt-get update && apt-get install -y ca-certificates
    - cat /etc/gitlab-runner/certs/[Link] >> /usr/local/share/ca-certificates/[Link]
    - update-ca-certificates
    - pip install poetry
    - poetry install
  script:
    - poetry build
    - poetry config [Link] ${PACKAGE_REGISTRY}
    - poetry publish --repository gitlab -u ${GITLAB_USER_LOGIN}--ci-token -p glpat--${PACKAGE_REGISTRY_TOKEN} --cert /etc/gitlab-runner/certs/[Link]
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
Here, we have two jobs, "test job" and "build and publish". The first one defines no image,
so it runs on the default image of the Docker executor (python:3.9); it prints a message to
the terminal and is triggered only on merge requests. We control when each job is triggered
using the rules keyword and the predefined CI/CD variables.
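The effect of the two rules can be illustrated with a small simulation. This is only a sketch of the two conditions in the example file, not GitLab's rule engine; the function name jobs_to_run is hypothetical.

```python
def jobs_to_run(variables):
    """Return the jobs of the example pipeline whose rule matches."""
    rules = {
        # if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
        "test job": variables.get("CI_PIPELINE_SOURCE") == "merge_request_event",
        # if: '$CI_COMMIT_BRANCH == "main"'
        "build and publish": variables.get("CI_COMMIT_BRANCH") == "main",
    }
    return [job for job, matched in rules.items() if matched]


if __name__ == "__main__":
    # A merge request pipeline: CI_COMMIT_BRANCH is not set there.
    print(jobs_to_run({"CI_PIPELINE_SOURCE": "merge_request_event"}))
    # A push to main.
    print(jobs_to_run({"CI_PIPELINE_SOURCE": "push", "CI_COMMIT_BRANCH": "main"}))
```

A push to any other branch matches neither rule, so in this sketch no job is selected for it.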
The "build and publish" job, on the other hand, follows these steps:
1. Pulls the defined image
2. Executes the "before_script" commands in the container's terminal one by one:
   1. Prints the Python version
   2. Updates the package index and installs the ca-certificates package inside the container
   3. Appends the certificate from /etc/gitlab-runner/certs/[Link] to
      /usr/local/share/ca-certificates/[Link]. (Note: the file
      /etc/gitlab-runner/certs/[Link] is mounted from skala (the host machine) into
      the container by default, through the configuration in the
      /etc/gitlab-runner/[Link] file. This way we can read the host's certificate
      from inside the container.)
   4. Updates the container's certificates
   5. Installs poetry
   6. Uses poetry to install the dependencies of the repo
3. Executes the "script" commands one by one:
   1. Builds the Python package
   2. Configures the remote package registry for poetry
   3. Publishes the built package to the package registry
For CHIDOS, we use the package-registry repo to host all packages. This means that all of
our projects will be published in the same package registry, in the URL [Link]
[Link]/api/v4/projects/81/packages/pypi/simple. This URL is defined as a group-level
CI/CD variable with the name ${PACKAGE_REGISTRY}.
To publish to the registry, we use the following command:
poetry publish --repository gitlab -u ${GITLAB_USER_LOGIN}--ci-token -p glpat--${PACKAGE_REGISTRY_TOKEN} --cert /etc/gitlab-runner/certs/[Link]
Here:
--repository is the repository name that was configured with poetry config, as mentioned
previously.
-u (user) uses GITLAB_USER_LOGIN, a predefined CI/CD variable holding the GitLab username
of the user who started the pipeline.
-p is the password needed to publish to the registry, which is a token. We create the
group-level token (it has an expiration date) under CHIDOS/Access Tokens. These
tokens come in the format:
glpat-XXXXXXXXXXXXXXXXXXXX
When a token needs replacing, we create a new one and copy the part after the dash (the
characters marked with X above). This value must be saved in the CI/CD variable named
PACKAGE_REGISTRY_TOKEN under CHIDOS/Settings/CI/CD, i.e. we update the value of the
CI/CD variable that the .[Link] uses to publish the repos.
--cert passes the certificate to poetry when publishing the built package. Note: it points
to the mounted path, not the installed one, which means that installing the certificate
might not be needed in the first place.
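The token handling described above (store only the part after the dash) can be sketched as follows. The helper name token_variable_value is hypothetical, and the token in the usage lines is a placeholder, not a real credential.

```python
PREFIX = "glpat-"


def token_variable_value(full_token):
    """Return the part of a group access token that goes into the CI/CD variable."""
    if not full_token.startswith(PREFIX):
        raise ValueError("expected a token starting with 'glpat-'")
    # Keep everything after the prefix; the pipeline adds the prefix back.
    return full_token[len(PREFIX):]


if __name__ == "__main__":
    placeholder = "glpat-" + "X" * 20
    print(token_variable_value(placeholder))
```

Rejecting anything without the prefix catches the case where the already-stripped value is pasted a second time.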
The publish job is triggered when the commit branch is named main:
rules:
  - if: '$CI_COMMIT_BRANCH == "main"'
Poetry vs PIP & TWINE
Poetry acts as a dependency manager, build tool, and publisher, and keeps all dependency
information in a single file.
pip, on the other hand, only fetches and installs dependencies. Building is done with
Python's build package, and publishing with Twine.
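To make the comparison concrete, the sketch below lists the commands each toolchain would need to build and publish a package. It is an illustration under assumptions: the function name publish_commands is hypothetical, the repository name gitlab is taken from the --repository flag used earlier, and the registry URL and credentials are passed in rather than taken from these notes. The function only returns the commands.

```python
def publish_commands(tool, registry_url, username, password):
    """Return the build-and-publish commands for 'poetry' or 'pip' (build + twine)."""
    if tool == "poetry":
        return [
            ["poetry", "build"],
            ["poetry", "config", "repositories.gitlab", registry_url],
            ["poetry", "publish", "--repository", "gitlab",
             "-u", username, "-p", password],
        ]
    if tool == "pip":
        return [
            # pip only installs; build and twine are separate packages.
            ["python", "-m", "pip", "install", "build", "twine"],
            ["python", "-m", "build"],
            # "dist/*" must be expanded by a shell before twine sees it.
            ["twine", "upload", "--repository-url", registry_url,
             "-u", username, "-p", password, "dist/*"],
        ]
    raise ValueError(f"unknown tool: {tool}")


if __name__ == "__main__":
    for cmd in publish_commands("pip", "https://gitlab.example.com/upload", "user", "secret"):
        print(" ".join(cmd))
```

In both cases there are three steps; the difference is that poetry covers them with one tool, while the pip route needs build and twine installed alongside.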
Case 1: Testing
PIP
Let's consider a case where a repo has requirements both from PyPI (trimesh) and from
chidos (meshmedley, geometry). The associated [Link] is:
--index-url [Link]
[Link]/api/v4/projects/81/packages/pypi/simple
geometry
meshmedley
trimesh
Here, we used the --index-url setting to make pip use the private package registry as its
package index (given by its URL). pip itself queries only this index; packages that the
registry does not host, such as trimesh, are still installed because the GitLab package
registry forwards requests for unknown packages to PyPI (GitLab's default behaviour). To
install those dependencies, we use the command
pip install --trusted-host [Link] -r [Link]
Here, the --trusted-host flag makes pip skip SSL certificate validation, only for
[Link], i.e. the host of our package registry. As of now, this is the only approach that
works for us to install packages from our own private package registry.
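The value given to --trusted-host is just the host part of the index URL, so it can be derived rather than typed by hand. A minimal sketch, assuming a hypothetical host gitlab.example.com and the file name requirements.txt (neither appears in these notes); the function only builds the command.

```python
from urllib.parse import urlparse


def pip_install_command(index_url, requirements_file="requirements.txt"):
    """Build a pip call that trusts only the host of the given index URL."""
    host = urlparse(index_url).hostname
    if host is None:
        raise ValueError(f"cannot extract a host from {index_url!r}")
    return ["pip", "install", "--trusted-host", host, "-r", requirements_file]


if __name__ == "__main__":
    # Hypothetical URL with the same path layout as the registry above.
    url = "https://gitlab.example.com/api/v4/projects/81/packages/pypi/simple"
    print(" ".join(pip_install_command(url)))
```

Only the named host is exempted from certificate validation; requests to any other host are still verified.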
POETRY
In the case of poetry, we must make sure to install it first:
pip install poetry
Poetry can be used both interactively and using a [Link] file. Let's consider the
latter.
[[Link]]
name = "your-project-name"
version = "0.1.0"
description = ""
authors = ["Your Name <you@[Link]>"]

[[[Link]]]
name = "pypi"
priority = "primary"

[[[Link]]]
name = "chidos"
url = "[Link]"
priority = "explicit"

[[Link]]
python = "^3.9"
numpy = "^1.21"
meshmedley = {version = "^1.0", source = "chidos"}
geometry = {version = "^1.0", source = "chidos"}

[[Link]-dependencies]
ruff = "^0.3"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "[Link]"
Here, we define information about the project itself, as well as:
sources, which are the registries where poetry looks for the dependencies. PyPI is
primary and chidos is explicit, i.e. chidos is only used for dependencies that explicitly
name it as their source.
[Link], which lists the main dependencies of the project. As shown, we can explicitly
define which source to use for each package.
[Link]-dependencies, which is another "group" of dependencies. Poetry lets us
define as many groups as we want. For instance, in this case, it makes sense to install
ruff (a linting and formatting tool) when developing the repo, but not in production or
testing environments. We can skip the dev group with:
poetry config [Link] false
poetry install --without dev
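The source priorities described above can be modelled in a few lines. This is a simplified sketch of the lookup rule only (a dependency that names a source uses it, everything else goes to the primary source), not Poetry's resolver; the function name resolve_sources is hypothetical.

```python
def resolve_sources(dependencies, sources):
    """Map each dependency to the source it would be looked up in.

    `sources` maps source name -> priority; `dependencies` maps package name
    to either a version string or a dict that may contain a "source" key.
    """
    primary = next(name for name, prio in sources.items() if prio == "primary")
    resolved = {}
    for package, spec in dependencies.items():
        if isinstance(spec, dict) and "source" in spec:
            if spec["source"] not in sources:
                raise KeyError(f"unknown source for {package}: {spec['source']}")
            # Explicit sources are used only when a dependency names them.
            resolved[package] = spec["source"]
        else:
            resolved[package] = primary
    return resolved


if __name__ == "__main__":
    sources = {"pypi": "primary", "chidos": "explicit"}
    deps = {
        "numpy": "^1.21",
        "meshmedley": {"version": "^1.0", "source": "chidos"},
        "geometry": {"version": "^1.0", "source": "chidos"},
    }
    print(resolve_sources(deps, sources))
```

With the dependencies from the file above, numpy is looked up on PyPI while meshmedley and geometry are looked up only in the chidos registry.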