MLRun Demos

The mlrun/demos repository provides demos that implement full end-to-end ML use-case applications with MLRun and demonstrate different aspects of working with MLRun.

For more information about the MLRun Hackathon, refer to the hackathon getting-started section.

In This Document

Overview
- General ML Workflow
Prerequisites
Getting-started Tutorial
How-To: Converting Existing ML Code to an MLRun Project
Integrating with CI Pipelines
Model deployment Pipeline: Real-time operational Pipeline
Healthcare Demo with Feature Store

Overview

The MLRun demos are end-to-end use-case applications that leverage MLRun to implement complete machine-learning (ML) pipelines — including data collection and preparation, model training, and deployment automation.

The demos demonstrate how you can

Run ML pipelines locally from a web notebook such as Jupyter Notebook.
Run some or all tasks on an elastic Kubernetes cluster by using serverless functions.

The demo applications are tested on the Iguazio Data Science Platform ("the platform") and use its shared data fabric, which is accessible via the v3io file-system mount; if you're not already a platform user, request a free trial.

General ML Workflow

The provided demos implement some or all of the ML workflow steps illustrated in the following image:

Prerequisites

To run the MLRun demos, first do the following:

Prepare a Kubernetes cluster with preinstalled operators or custom resources (CRDs) for Horovod and/or Nuclio, depending on the demos that you wish to run.
Install an MLRun service on your cluster. See the instructions in the MLRun documentation.
Ensure that your cluster has a shared file or object storage for storing the data (artifacts).

Getting-started Tutorial

The tutorial covers MLRun fundamentals such as creation of projects and data ingestion and preparation, and demonstrates how to create an end-to-end machine-learning (ML) pipeline. MLRun is integrated as a default (pre-deployed) shared service in the Iguazio Data Science Platform.

You'll learn how to

Collect (ingest), prepare, and analyze data
Train, deploy, and monitor an ML model

You'll also learn about the basic concepts, components, and APIs that allow you to perform these tasks, including

Setting up MLRun
Creating and working with projects
Creating, deploying and running MLRun functions
Using MLRun to run functions, jobs, and full workflows
Deploying a model to a serving layer using serverless functions

How-To: Converting Existing ML Code to an MLRun Project

The converting-to-mlrun how-to demo demonstrates how to convert existing ML code to an MLRun project. The demo implements an MLRun project for taxi ride-fare prediction based on a Kaggle notebook with an ML Python script that uses data from the New York City Taxi Fare Prediction competition.

The code includes the following components:

Data ingestion
Data cleaning and preparation
Model training
Model serving
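
As a rough illustration of what such a conversion can involve, the sketch below wraps a plain data-cleaning step in a handler that takes a context object as its first argument and logs results through it. It assumes MLRun's convention that a job handler receives a `context` argument exposing `log_result`. The `StubContext` class, the `clean_fares` handler, the `fare_amount` field, and the fare bounds are hypothetical stand-ins so the example runs without a cluster; they are not taken from the demo code.

```python
from typing import Dict, List


class StubContext:
    """Hypothetical stand-in for a run context; it only records logged results."""

    def __init__(self) -> None:
        self.results: Dict[str, float] = {}

    def log_result(self, key: str, value: float) -> None:
        self.results[key] = value


def clean_fares(context, rides: List[dict],
                min_fare: float = 0.0, max_fare: float = 500.0) -> List[dict]:
    """Keep rides whose fare lies in [min_fare, max_fare] and log row counts."""
    kept = [r for r in rides if min_fare <= r["fare_amount"] <= max_fare]
    context.log_result("rows_in", len(rides))
    context.log_result("rows_kept", len(kept))
    return kept


# Illustrative sample rows (made up for this sketch).
rides = [
    {"fare_amount": 7.5, "passenger_count": 1},
    {"fare_amount": -3.0, "passenger_count": 2},   # invalid: negative fare
    {"fare_amount": 920.0, "passenger_count": 1},  # outlier above max_fare
    {"fare_amount": 12.0, "passenger_count": 3},
]
ctx = StubContext()
cleaned = clean_fares(ctx, rides)
```

Keeping the logic in a function with explicit parameters, rather than in notebook cells, is what lets the same code be called locally with a stub and, under the assumption above, registered as a job handler.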

Pipeline Output

Integrating with CI Pipelines

The CI Pipeline demo demonstrates how to build a full end-to-end automated-ML pipeline using scikit-learn and the UCI Iris data set.

Users may want to run their ML pipelines using CI frameworks such as GitHub Actions or GitLab CI/CD. MLRun supports simple, native integration with such CI systems. The following example combines local code (from the repository) with MLRun marketplace functions to build an automated ML pipeline that does the following:

Runs data preparation
Trains a model
Tests the trained model
Deploys the model into a cluster
Tests the deployed model

By default, the demo uses Slack notifications. To enable Slack notifications, you need to create a Slack app and enable incoming webhooks. This process is straightforward and should take a few minutes. For more information, see the Slack documentation.
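
For orientation, a Slack incoming webhook accepts an HTTP POST with a JSON body that contains a `text` field. The sketch below builds such a request with the Python standard library without sending it. The webhook URL is a placeholder and `build_slack_request` is a hypothetical helper; how the demo itself sends notifications is not shown here.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL; a real one comes from the Slack app's webhook settings.
request = build_slack_request(
    "https://hooks.slack.com/services/PLACEHOLDER",
    "Pipeline finished: model deployed",
)
# To actually send the message: urllib.request.urlopen(request)
```

Treat the webhook URL as a secret and store it in your CI system's secret store rather than in the repository.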

Model deployment Pipeline: Real-time operational Pipeline

This demo shows how to deploy a model with streaming information.

This demo comprises several steps, which are listed below.

Note: this demo uses the multi-model data layer (V3IO), primarily for real-time streaming. Contact Iguazio to get credentials to access a V3IO system. To test access to the V3IO API see the v3io-api test notebook.

While this demo covers the use case of first-day churn, it is easy to replace the data, related features, and training model, and to reuse the same workflow for different business cases.

These steps are covered by the following pipeline:

1. Data generator - Generates events for training and serving, and creates an enrichment table (lookup values).
2. Event handler - Receives data from the input. This is a common input stream for all the data, so you can easily replace the event source (in this case, a data generator) without affecting the rest of the flow. It also stores all incoming data to Parquet files.
3. Stream to features - Enriches the stream using the enrichment table and updates aggregation features using the incoming events from the event handler.
4. Optional model training steps -
4.1 Get Data Snapshot - Takes a snapshot of the feature table for training.
4.2 Describe the Dataset - Runs common analysis on the datasets and produces plots such as histograms, feature importance, correlation, and more.
4.3 Training - Runs training with multiple classification models.
4.4 Testing - Tests the best-performing model.
5. Serving - Serves the model and processes the data from the enriched stream and aggregation features.
6. Inference logger - Uses the same event handler function as above, but only its capability to store incoming data to Parquet files.
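
The "stream to features" step can be pictured with a small plain-Python sketch: each incoming event is enriched from a lookup table and extended with running per-user aggregates. This is only an illustration of the idea, not the demo's implementation, which runs on V3IO streams and serverless functions. The field names (`user_id`, `amount`, `country`) and the `stream_to_features` function are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator

# Hypothetical enrichment (lookup) table keyed by user ID.
ENRICHMENT = {
    "u1": {"country": "US"},
    "u2": {"country": "DE"},
}


def stream_to_features(events: Iterable[dict],
                       enrichment: Dict[str, dict]) -> Iterator[dict]:
    """Enrich each event from the lookup table and attach running aggregates."""
    count = defaultdict(int)
    total = defaultdict(float)
    for event in events:
        user = event["user_id"]
        count[user] += 1
        total[user] += event["amount"]
        yield {
            **event,
            **enrichment.get(user, {}),  # unknown users get no extra fields
            "event_count": count[user],
            "amount_sum": total[user],
        }


# Illustrative events (made up for this sketch).
events = [
    {"user_id": "u1", "amount": 5.0},
    {"user_id": "u2", "amount": 2.0},
    {"user_id": "u1", "amount": 3.0},
]
features = list(stream_to_features(events, ENRICHMENT))
```

Because the function is a generator over any iterable, the event source can be swapped without touching the feature logic, which mirrors the role of the common input stream in the event-handler step.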

Healthcare Demo with Feature Store

This demo shows how to use MLRun and the feature store.

Healthcare facilities need to closely monitor their patients and identify early signs that medical intervention is necessary. Time is a key factor: the earlier the medical teams can attend to an issue, the better the outcome. An effective system that can alert on issues in real time can therefore save lives.

In this demo, we learn how to ingest different data sources into the feature store. Specifically, this patient data has been successfully used to treat hospitalized COVID-19 patients before their condition became severe or critical. To do this, we use a medical dataset that includes three types of data:

Healthcare systems: Batch-updated dataset containing different lab test results (for example, blood test results).
Patient Records: Static dataset containing general patient details.
Real-time sensors: Real-time patient-metric monitoring sensors.
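
To make the combination of the three data types concrete, the sketch below joins a static record, the latest batch lab result, and the latest sensor reading for one patient as of a given time. It is a plain-Python illustration only. In the demo this role is played by the feature store, whose API is not shown here. All field names, sample values, and helper functions are hypothetical.

```python
# Hypothetical sample data for the three source types.
patient_records = {"p1": {"age": 64, "sex": "F"}}  # static
lab_results = [  # batch updated
    {"patient_id": "p1", "timestamp": 100, "crp": 12.0},
    {"patient_id": "p1", "timestamp": 200, "crp": 30.0},
]
sensor_readings = [  # real-time
    {"patient_id": "p1", "timestamp": 250, "heart_rate": 88},
    {"patient_id": "p1", "timestamp": 260, "heart_rate": 97},
]


def latest(rows, patient_id, as_of):
    """Return the patient's most recent row at or before as_of, or None."""
    candidates = [
        r for r in rows
        if r["patient_id"] == patient_id and r["timestamp"] <= as_of
    ]
    return max(candidates, key=lambda r: r["timestamp"]) if candidates else None


def build_feature_vector(patient_id, as_of):
    """Merge static details with the latest lab and sensor values as of a time."""
    vector = {"patient_id": patient_id, **patient_records.get(patient_id, {})}
    for rows in (lab_results, sensor_readings):
        row = latest(rows, patient_id, as_of)
        if row:
            vector.update(
                {k: v for k, v in row.items()
                 if k not in ("patient_id", "timestamp")}
            )
    return vector


vector = build_feature_vector("p1", as_of=255)
```

Selecting only rows at or before `as_of` keeps values recorded later out of the vector, which matters when the same logic is used to build training data.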

Note: this demo uses the multi-model data layer (V3IO), primarily for real-time streaming. Contact Iguazio to get credentials to access a V3IO system. To test access to the V3IO API see the v3io-api test notebook.
