Databricks - Scalable Machine Learning with Apache Spark
Code: SCALABLEML
Length: 2 days
This course guides students through the process of building machine learning solutions using Spark. You will build
and tune ML models with SparkML using transformers, estimators, and pipelines. This course highlights some of the
key differences between SparkML and single-node libraries such as scikit-learn. Furthermore, you will reproduce
your experiments and version your models using MLflow.
You will also integrate third-party libraries, such as XGBoost, into Spark workloads. In addition, you will use Spark
to scale inference of single-node models and to parallelize hyperparameter tuning. This course includes hands-on labs
and concludes with a collaborative capstone project. All notebooks are available in Python, and in Scala where a
Scala version exists.
Skills Gained
Create data processing pipelines with Spark
Build and tune machine learning models with SparkML
Track, version, and deploy models with MLflow
Perform distributed hyperparameter tuning with Hyperopt
Use Spark to scale the inference of single-node models
Who Can Benefit
Data scientist
Machine learning engineer
Prerequisites
Intermediate experience with Python
Beginning experience with the PySpark DataFrame API (or completion of the Apache Spark Programming with Databricks class)
Working knowledge of machine learning and data science
Schedule (as of 4 )
Date                           Location
Aug 25, 2022 – Aug 26, 2022    iMVP
Sep 29, 2022 – Sep 30, 2022    iMVP
Oct 27, 2022 – Oct 28, 2022    iMVP
Nov 10, 2022 – Nov 11, 2022    iMVP
Dec 12, 2022 – Dec 13, 2022    iMVP
ExitCertified® Corporation and iMVP® are registered trademarks of ExitCertified ULC and
ExitCertified Corporation and Tech Data Corporation, respectively.
Copyright ©2022 Tech Data Corporation and ExitCertified ULC & ExitCertified Corporation.
All Rights Reserved.