Machine Learning Model Deployment Using AWS Sagemaker Batch Transformation Job
In this tutorial, a solution for deploying Machine Learning models is implemented using the S3, Sagemaker, and Lambda AWS
services. The model to be deployed is a KMeans model saved in a pickle file, and the OS used is Linux.
The goal is to have a system in which inferences are made on a regular basis.
Pre-requisites
AWS Credentials and IAM Roles with the right permissions
Docker and Python installed on your local machine
Scheme
Fig. 1: Solution architecture
1. Building and pushing the Docker Image
To build a Docker image you need to write the Dockerfile and follow a standard container folder structure so the
image can be pushed to AWS ECR. The structure must be as shown below:
Fig. 2: Container folder structure
This allows the container to know where to look for the installed programs.
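For reference, a layout modeled on the AWS sample scikit-learn container might look like the sketch below; the kmeans folder name matches the one copied in the Dockerfile, while the individual file names are assumptions based on that sample:
container/
    Dockerfile
    kmeans/
        nginx.conf      # reverse proxy sitting in front of the inference server
        predictor.py    # Flask app that implements the /ping and /invocations routes
        serve           # entry point executed by Sagemaker to start nginx and the app server
        wsgi.py         # small wrapper exposing the Flask app to the app server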
In the Dockerfile, you can specify the requirements and dependencies to be installed in the container (such as Python, NGINX, and
Scikit-learn). Next, you need a line in the Dockerfile to copy the program folder to the container's WORKDIR, which is also defined
in the Dockerfile. See the Dockerfile below:
FROM ubuntu:16.04
MAINTAINER fealbuqu
# 1. Define the packages required in our environment.
RUN apt-get -y update && apt-get install -y --no-install-recommends \
wget \
python \
python3 \
nginx \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
# 2. Define the folder (kmeans) where our inference code is located and set the working directory
COPY kmeans /opt/program
WORKDIR /opt/program
# 3. Here we define all python packages we want to include in our environment.
RUN wget [Link] && python3 [Link] && \
pip install -r [Link] && \
rm -rf /root/.cache
# 4. Set some environment variables.
ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE
ENV PATH="/opt/program:${PATH}"
Then the container executes the programs included in the kmeans folder to start the server. The prediction code is located in
the file [Link], more specifically inside the function transformation(), as follows:
# Imports and Flask app set-up needed by the route below
import json

import flask
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

app = flask.Flask(__name__)


@app.route('/invocations', methods=['POST'])
def transformation():
    """Do an inference on a single batch of data. In this sample server, we take data as JSON,
    convert it to a pandas data frame for internal use and then convert the predictions back to
    JSON (which really just means one prediction per line, since there's a single column).
    """
    # Get the input JSON data and convert it to a data frame
    input_json = flask.request.get_json()
    input_json = json.dumps(input_json)
    input_df = pd.read_json(input_json)
    print('Invoked with {} records'.format(input_df.shape[0]))

    # Transforming the data: keep only the rows for the most recent date
    input_df.fecha = pd.to_datetime(input_df.fecha)
    input_df.fecha = input_df.fecha.dt.date
    fecha_predict = input_df.fecha.max()
    predict_cross_selling = input_df[input_df.fecha == fecha_predict]

    # Scaling: every column except the id and date columns
    feature_cols = ~predict_cross_selling.columns.isin(['tracab_idusua', 'fecha'])
    scaler = StandardScaler()
    scaler.fit(predict_cross_selling.loc[:, feature_cols])
    cross_scale_predict = scaler.transform(predict_cross_selling.loc[:, feature_cols])
    cross_scale_predict = pd.DataFrame(cross_scale_predict)
    cross_scale_predict.columns = list(predict_cross_selling.loc[:, feature_cols].columns)

    # PCA
    pca_predict = PCA(n_components=10)
    pca_predict.fit(cross_scale_predict)
    cross_scale_predict = pca_predict.transform(cross_scale_predict)
    cross_scale_predict = pd.DataFrame(cross_scale_predict)

    # Do the prediction ('model' is assumed to be the KMeans estimator loaded from the pickle file
    # elsewhere in this script)
    predictions = model.predict(cross_scale_predict)

    # Transform predictions to JSON
    result = {'output': []}
    list_out = []
    for label in predictions:
        row_format = {'label': '{}'.format(label)}
        list_out.append(row_format)
    result['output'] = list_out
    result = json.dumps(result)
    return flask.Response(response=result, status=200, mimetype='application/json')
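Before pushing anything, the container can be exercised locally. The sketch below assumes the image is tagged kmeans, that the serve script from the folder structure above starts the server on port 8080 (Sagemaker's default), and that a hypothetical sample_input.json holds a test payload:
# Build the image and start the inference server locally
docker build -t kmeans .
docker run -p 8080:8080 kmeans serve

# In another terminal, send the test payload to the invocations endpoint
curl -X POST -H "Content-Type: application/json" \
    -d @sample_input.json http://localhost:8080/invocations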
The next step is to build and push the Docker image to an AWS ECR repository. So the first thing is to create a repository.
Fig. 3: Create repository
Now let's build and push the image to the created repository. For this, configure the AWS CLI on your local machine so you can
interact with your account programmatically. Install the AWS CLI using pip install awscli and then set the AWS credentials with
aws configure in the terminal.
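For reference, that setup is only two commands; the values below are placeholders for your own credentials:
pip install awscli
aws configure
# AWS Access Key ID [None]: <your-access-key-id>
# AWS Secret Access Key [None]: <your-secret-access-key>
# Default region name [None]: us-east-1
# Default output format [None]: json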
With everything set up, click on your repository, then View push commands, and follow the steps.
Fig. 4: Steps to build and push image
Note: the commands below must be run in a terminal inside the container folder.
1. Retrieve an authentication token and authenticate your Docker client to your registry:
aws ecr get-login-password --region us-east-1 |
docker login --username AWS --password-stdin <user>.dkr.ecr.<region>.amazonaws.com
2. Build your Docker image:
docker build -t <image> .
3. After the build completes, tag your image so you can push the image to this repository:
docker tag <image>:latest <user>.dkr.ecr.<region>.amazonaws.com/<repository>:latest
4. Run the following command to push this image to your newly created AWS repository:
docker push <user>.dkr.ecr.<region>.amazonaws.com/<repository>:latest
At the end, the image is pushed to the repository.
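To confirm the push worked, you can list the images stored in the repository; a quick check, assuming the same repository name and region used above:
aws ecr describe-images --repository-name <repository> --region us-east-1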
2. Creating AWS Sagemaker's model
Now, with the inference code written and the image in an AWS ECR repository, we can create a Sagemaker model that will use this
image.
For this, we need to compress the model's pickle file with gzip. Use the following command to compress an entire
directory or a single file on Linux. It also compresses every directory inside the directory you specify; in other words, it works
recursively.
tar -czvf [Link] /path/to/directory-or-file
Here’s what those switches actually mean:
-c: Create an archive
-z: Compress the archive with gzip
-v: Display progress in the terminal while creating the archive, also known as “verbose” mode. The v is always optional in
these commands, but it’s helpful.
-f: Allows you to specify the filename of the archive.
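For example, assuming the KMeans pickle is saved as model.pkl (a hypothetical file name), the archive Sagemaker expects, conventionally called model.tar.gz, can be created with:
tar -czvf model.tar.gz model.pkl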
The next step is to upload the compressed archive to S3; you can do that using the AWS CLI or through the Console. With the S3 path
to the archive and the ECR path to the image, you can create a Sagemaker model, as shown below.
Fig. 5: Create Sagemaker model
Fig. 6: Name it, set the permissions and the right paths then create
Now you should have a model set up; with this model you can create an Endpoint or a Batch Transformation Job for the
inferences.
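If you prefer to script this step instead of using the Console, the same model can be created with boto3; a minimal sketch, where the image URI, S3 path, and role ARN are placeholders you need to fill in (the model name matches the one referenced by the Lambda function later on):
import boto3

sm = boto3.client('sagemaker')

# Register the model: the ECR image plus the compressed artifact in S3,
# executed with a role allowed to read both.
sm.create_model(
    ModelName='crossSellKmeans',
    PrimaryContainer={
        'Image': '<user>.dkr.ecr.<region>.amazonaws.com/<repository>:latest',
        'ModelDataUrl': 's3://<bucket>/<path>/model.tar.gz'
    },
    ExecutionRoleArn='arn:aws:iam::<account-id>:role/<sagemaker-execution-role>'
)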
3. Creating Lambda function to start a Batch Transformation Job
To make the scheduled inferences, imagine that a cron job regularly uploads a .json file, containing the samples to be predicted
with the model created before, to a specific S3 path.
We can then create a Lambda function, triggered when the input data is uploaded, that runs a script to get this input data from S3
and start a Batch Transformation Job in Sagemaker; the predicted values will be stored in a specified folder in S3.
So let's start creating a Lambda function.
Fig. 7: Create lambda function
Our code is written in Python, so select Python 3.8 as the runtime, name the function, and choose or create an IAM role with
permissions to access the files in S3 and start a Batch Transformation Job in Sagemaker (a minimal policy sketch follows Fig. 8).
Fig. 8: Lambda's function configuration
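For reference, a minimal inline policy for that role might look like the sketch below; the bucket name is a placeholder, and the CloudWatch Logs permissions from the basic Lambda execution role should also be attached:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::<bucket>/*"
    },
    {
      "Effect": "Allow",
      "Action": ["sagemaker:CreateTransformJob"],
      "Resource": "*"
    }
  ]
}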
The inference code shown before receives a JSON as input; consequently, the function must be triggered when a .json object is
created in S3.
Fig. 9: Add the trigger
To start a Batch Transformation Job we can use the Python module boto3: we instantiate a Sagemaker client and call it with the
path of the uploaded file, a specified output path, and other keyword arguments to start the transformation job. See the code
below:
import json
import boto3
from datetime import datetime


def find_indices(lst, condition):
    # Return the positions of every element of lst that satisfies condition
    return [i for i, elem in enumerate(lst) if condition(elem)]


def lambda_handler(event, context):
    for record in event['Records']:
        # Bucket and key of the .json file that triggered the function
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']

        # Everything before the 'input' folder becomes the base prefix of the output path
        tmp = key.split('/')
        out_key = '/'.join(tmp[:find_indices(tmp, lambda e: e == 'input')[0]])

        sm = boto3.client('sagemaker')
        stringNow = datetime.now().strftime("%d%m%Y-%H%M%S")
        data_path = "s3://{}/{}".format(bucket, key)
        output_path = "s3://{}/{}output/{}".format(bucket, out_key, stringNow)
        print(output_path)

        # Start the Batch Transformation Job using the uploaded file as input
        response = sm.create_transform_job(
            TransformJobName='crossSellKmeans-' + stringNow,
            ModelName='crossSellKmeans',
            MaxConcurrentTransforms=1,
            MaxPayloadInMB=100,
            BatchStrategy='MultiRecord',
            TransformInput={
                'DataSource': {
                    'S3DataSource': {
                        'S3DataType': 'S3Prefix',
                        'S3Uri': data_path
                    }
                },
                'SplitType': 'Line',
                'ContentType': 'application/json'
            },
            TransformOutput={
                'S3OutputPath': output_path,
                'Accept': 'application/json'
            },
            TransformResources={
                'InstanceType': '<instance-type>',  # set the desired instance type, e.g. ml.m4.xlarge
                'InstanceCount': 1
            }
        )

    r = {
        'status': 200,
        'body': response
    }
    return r
Just save the Lambda function with the script above, setting the paths and desired keyword args (e.g. InstanceType) for the job.
Fig. 10: Save the changes made
After all this, we can upload a .json file to trigger our function and check in Sagemaker's Batch Transformation Jobs whether the
job started and/or completed successfully (the same check can also be done from the command line, as sketched after Fig. 11).
Fig. 11: Batch transformation jobs
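A sketch of that command-line check; the job name prefix matches the one set in the Lambda code, and the bucket and prefix are placeholders:
# List recent transform jobs and their statuses
aws sagemaker list-transform-jobs --name-contains crossSellKmeans

# Inspect a specific job (the suffix is the timestamp printed by the Lambda)
aws sagemaker describe-transform-job --transform-job-name crossSellKmeans-<timestamp>

# Check the predictions written to the output path
aws s3 ls s3://<bucket>/<prefix>/output/ --recursive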