The objective of this project is to provide a generic machine learning template for Python-based projects.
It includes a folder structure plus testing and documentation tooling that should work well for most small to mid-size projects (in terms of number of features and examples) that run on a single machine.
The template combines simplicity, best practices for folder structure, and good OOP design.
The main idea is that much of the setup is the same every time you start a machine learning project, so wrapping the shared parts in a template lets you change only the core idea for each new project.
In short, this is a simple template that helps you get into your main project faster and focus on its core (model architecture, training flow, etc.).
- Powered by `cruft`;
- Keep your project up-to-date with best practices;
- Good base folder structure for many kinds of ML projects (see below);
- PEP 8 is the universally accepted style guide for Python code. PEP 8 compliance is verified using `ruff`;
- Consistent code quality: formatting the code with `ruff`, and `isort` for sorting imports;
- Testing setup with `pytest` with `coverage` plugin;
- Type checks with `mypy`;
- Security checks with `safety` and `bandit`;
- Ready-to-use `.editorconfig`, `.dockerignore`, `.gitignore` and `.gitattributes`. You don't have to worry about those things;
- (Optional) `Hydra` config templates with `ray` integration for elegantly configuring complex applications;
- (Optional) `typer` CLI template to get you started quickly;
- Simple `helm` chart or `kustomize` to deploy to k8s.

- Docstring coverage with `interrogate`;
- Diagrams as code with `Diagrams`;
- Documentation with `MkDocs`.

- Standard way of defining commit rules and communicating them using `commitizen`;
- Follow `conventional commits`;
- Bump version automatically using `semantic versioning` based on the commits;
- Generate a changelog using `keep a Changelog`.

- Ready-to-use `pre-commit` hooks with code-formatting and security features;
- Azure pipeline template available;
- Dockerfile linter with `hadolint`.
Generate a machine learning project from this template:

    cookiecutter [email protected]:thatmlopsguy/cookiecutter-ml-project.git

or, for a specific branch, tag, or commit SHA `{SPECIFIC}`, run:

    cookiecutter -c {SPECIFIC} [email protected]:thatmlopsguy/cookiecutter-ml-project.git

or, using cruft:

    cruft create -c {SPECIFIC} [email protected]:thatmlopsguy/cookiecutter-ml-project.git

Follow the prompts; if you are asked to re-download the cookiecutter template, input yes.
Default responses are shown in square brackets; to use one, leave your response blank and press Enter.
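
For scripted or CI use, the same generation step can be driven from Python instead of the interactive prompts. The following is a minimal sketch, assuming the `cookiecutter` package is installed; `build_extra_context` and `generate` are hypothetical helper names, and the template location is passed in by the caller rather than hard-coded.

```python
from __future__ import annotations

from typing import Any, Optional


def build_extra_context(project_name: str, repo_name: str, line_length: int = 120) -> dict[str, Any]:
    """Collect the answers that would otherwise be typed at the prompts.

    Only three of the template's input variables are shown; any variable
    left out falls back to the template's default value.
    """
    return {
        "project_name": project_name,
        "repo_name": repo_name,
        # Passed as a string, like a value typed at the prompt.
        "line_length": str(line_length),
    }


def generate(template: str, checkout: Optional[str] = None, **answers: Any) -> str:
    """Render the template without prompting and return the output directory."""
    # Imported lazily so build_extra_context works without cookiecutter installed.
    from cookiecutter.main import cookiecutter

    return cookiecutter(
        template,
        checkout=checkout,  # branch, tag, or commit SHA; same role as `-c`
        no_input=True,  # accept defaults for everything not in extra_context
        extra_context=build_extra_context(**answers),
    )
```

A call such as `generate(template_url, checkout="main", project_name="Demo", repo_name="demo")` would then correspond to the `cookiecutter -c` form above, where `template_url` and `"main"` are placeholders for your own values.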
After creating the project, you should follow a couple of steps to make sure everything works automatically.
Head over to the generated README.md file to read about the next steps and a more in-depth explanation of the generated project's features.
Have an existing project that you created from a template in the past using Cookiecutter directly?
Consider using the cruft package to integrate future cookiecutter releases.

    pip3 install cruft[pyproject]
    cruft link [email protected]:thatmlopsguy/cookiecutter-ml-project.git

To update an existing project that was created using cruft, run `cruft update` in the root of the project.
If there are any updates, cruft will have you review them before applying.
If you accept the changes, cruft will apply them to your project and update the `.cruft.json` file for you.
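
To see which template revision a project is currently linked to, you can inspect `.cruft.json` directly. This is a small sketch, assuming the file stores the template location under a `template` key and the linked revision under a `commit` key; `describe_cruft_link` is a hypothetical helper, not part of cruft.

```python
import json
from pathlib import Path


def describe_cruft_link(project_root: str = ".") -> str:
    """Summarise which template and commit a project is linked to."""
    state_file = Path(project_root) / ".cruft.json"
    state = json.loads(state_file.read_text(encoding="utf-8"))
    # Key names are an assumption about the file layout; fall back gracefully.
    template = state.get("template", "<unknown template>")
    commit = state.get("commit", "<unknown commit>")
    return f"{template} @ {commit[:7]}"
```

Comparing this summary before and after an update is a quick way to confirm that the linked revision actually changed.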
The template generator will ask you to fill in some variables.
The input variables, with their default values:

| Parameter | Default value | Description |
|---|---|---|
| `project_name` | `project_name` | Project name. |
| `repo_name` | `repo_name` | Repository name. |
| `description` | based on the `project_name` | Brief description of your project. |
| `organization` | based on the `project_name` | Name of the organization. Needed to generate the LICENSE and to specify ownership in `pyproject.toml`. |
| `license` | `MIT` | One of `MIT`, `BSD-3`, `GNU GPL v3.0` and `Apache Software License 2.0`. |
| `minimal_python_version` | `3.10` | Minimal Python version. Used for builds and for the formatters `ruff` and `isort`. |
| `organization_email` | based on the `organization` | Email for `SECURITY.md` files and to specify the ownership of the project in `pyproject.toml`. |
| `version` | `0.0.0` | Initial version of the package. Make sure it follows the semantic versioning specification. |
| `line_length` | `120` | The maximum line length (used for code style with `ruff` and `isort`). NOTE: This value must be between 50 and 140. |
| `command_line_interface` | `none` | If `typer` is chosen, the generator creates a simple CLI application with the `typer` library. |
| `k8s` | `none` | Choose `helm` charts or `kustomize` to deploy to Kubernetes. |
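
Two of the constraints in the table (a semantic `version` and a `line_length` between 50 and 140) are easy to check before generating a project. The sketch below is illustrative only: `validate_answers` is a hypothetical helper, not the template's actual validation hook, and the version pattern is simplified to plain `MAJOR.MINOR.PATCH` without pre-release or build metadata.

```python
from __future__ import annotations

import re

# Simplified pattern: MAJOR.MINOR.PATCH with no leading zeros.
SEMVER = re.compile(r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$")

# Bounds taken from the `line_length` row of the table.
MIN_LINE_LENGTH = 50
MAX_LINE_LENGTH = 140


def validate_answers(version: str, line_length: int) -> list[str]:
    """Return a list of human-readable problems; empty means the answers are valid."""
    errors = []
    if not SEMVER.match(version):
        errors.append(f"version {version!r} is not of the form MAJOR.MINOR.PATCH")
    if not MIN_LINE_LENGTH <= line_length <= MAX_LINE_LENGTH:
        errors.append(
            f"line_length {line_length} must be between "
            f"{MIN_LINE_LENGTH} and {MAX_LINE_LENGTH}"
        )
    return errors
```

Returning a list instead of raising on the first problem lets a caller report every invalid answer in one pass.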
Contributions are welcome, including improvements to the template and the example projects.
Pull requests are welcome if they are small and atomic, and if they make my own packaging experience better.
To set up a development environment:

    python3 -m venv .venv
    source .venv/bin/activate
    python3 -m pip install -r requirements/requirements-dev.txt

See credits for all acknowledgements.