Post Graduate Program
in Data Engineering
In collaboration with IBM
1 | www.simplilearn.com
Table of Contents
About the Program
Key Features of the Post Graduate Program in Data Engineering
About the Post Graduate Program in Data Engineering in Partnership with Purdue University
About Simplilearn
Program Eligibility Criteria and Application Process
Learning Path Visualization
Program Outcomes
Who Should Enroll in this Program
Courses
Step 1 - Big Data for Data Engineering
Step 2 - Data Engineering with Hadoop
Step 3 - Data Engineering with Scala
Step 4 - Big Data Hadoop and Spark Developer
Step 5 - AWS Technical Essentials
Step 6 - Big Data on AWS
Step 7 - Azure Fundamentals
Step 8 - Azure Data Engineer
Step 9 - Data Engineering Capstone
Electives
Certification
Advisory Board Members
About
the Program
Accelerate your career with this
acclaimed Post Graduate Program in
Data Engineering, in partnership with
Purdue University and in collaboration
with IBM. This program provides a
perfect mix of theory, case studies,
and extensive hands-on practice.
Learners will receive a comprehensive
data engineering education, leveraging
Purdue’s academic excellence in
data engineering and Simplilearn’s
partnership with IBM.
This post graduate program is designed
to give experienced professionals,
coming from diverse backgrounds, an
extensive data engineering education
through a blend of online self-paced
videos, live virtual classes, hands-on
projects, and labs. Learners will also
get access to mentorship sessions
that provide a high-engagement
learning experience and real-world
applications to help master essential
data engineering skills. You’ll learn to
implement data engineering concepts
such as distributed processing with the
Hadoop framework, large-scale data
processing with Spark, building data
pipelines with Kafka, and working with
databases both on-premises and on
AWS and Azure cloud infrastructures.
Key Features of the Post Graduate
Program in Data Engineering in
Partnership with Purdue University
Purdue Post Graduate Program Certification
180+ hours of Blended Learning
14+ hands-on projects
Masterclasses from Purdue faculty
Purdue Alumni Association membership
Capstone project in 3 domains
Curriculum aligned with Microsoft Azure (DP-200), AWS (DAS-C01), and Cloudera (CCA175) certifications
About the Post Graduate Program in
Data Engineering in Partnership with
Purdue University
Purdue University, a top public research institution, offers higher education at its highest proven value. Committed to student success, Purdue is changing the student experience with a greater focus on faculty-student interaction and creative use of technology.

This Post Graduate Program in Data Engineering in partnership with Purdue University will open pathways for you in the data engineering field, which has a presence in all industry sectors and verticals, including manufacturing, healthcare, and ecommerce.

Upon completing this program, you will receive a joint Purdue-Simplilearn certification of completion.
About Simplilearn
Simplilearn is a leader in digital skills training, focused on the emerging technologies
that are transforming our world. Our unique Blended Learning approach drives
learner engagement and is backed by the industry’s highest course completion rates.
Partnering with professionals and companies, we identify their unique needs and
provide outcome-centric solutions to help them achieve their professional goals.
Program Eligibility Criteria and
Application Process
Those wishing to enroll in this Post Graduate Program in Data Engineering in
partnership with Purdue University will be required to apply for admission.
Eligibility Criteria
For admission to this Post Graduate Program in Data Engineering,
candidates should have:
A bachelor’s degree with an average of 50% or higher marks
2+ years of work experience (preferred)
Basic understanding of object-oriented programming (preferred)
Application Process
The application process consists of three simple steps. An offer of admission will be made to selected candidates, who can accept it by paying the admission fee.

STEP 1 - SUBMIT AN APPLICATION
Complete the application and include a brief statement of purpose, which informs our admissions counselors why you’re interested in and qualified for the program.

STEP 2 - APPLICATION REVIEW
A panel of admissions counselors will review your application and statement of purpose to determine whether you qualify for acceptance.

STEP 3 - ADMISSION
An offer of admission will be made to qualified candidates. You can accept this offer by paying the program fee.
Talk to an Admissions Counselor
We have a team of dedicated admissions counselors who are here to help
guide you in applying to the program. They are available to:
Address questions related to the application
Assist with financial aid (if required)
Help you resolve your questions and understand the program
Learning Path
Step 1 - Big Data for Data Engineering
Step 2 - Data Engineering with Hadoop
Step 3 - Data Engineering with Scala
Step 4 - Big Data Hadoop and Spark Developer
Step 5 - AWS Technical Essentials
Step 6 - Big Data on AWS
Step 7 - Azure Fundamentals
Step 8 - Azure Data Engineer
Step 9 - Data Engineering Capstone

Electives
Python for Data Science
PySpark Training
Apache Kafka
MongoDB Developer and Administrator
GCP Fundamentals
SQL Training
Java Training
Spark for Scala Analytics
Academic Masterclass - Data Engineering - Purdue University
Program Outcomes
Gather data requirements, access data from multiple sources, process data for business needs, and store data on the cloud as well as on-premises
Gain insights into how to improve business productivity by processing Big Data on platforms that can handle its volume, velocity, variety, and veracity
Get a solid understanding of the fundamentals of the Scala language, its tooling, and the development process
Master the various components of the Hadoop ecosystem, such as Hadoop, YARN, MapReduce, Pig, Hive, Impala, HBase, ZooKeeper, Oozie, Sqoop, and Flume
Identify AWS concepts, terminologies, benefits, and deployment options to meet business requirements
Understand how to use Amazon EMR for processing data using Hadoop ecosystem tools
Understand how to use Amazon Kinesis for real-time Big Data processing
Analyze and transform Big Data using Kinesis Streams, and visualize data and perform queries using Amazon QuickSight
Implement data storage solutions; manage and develop data processing; and monitor and optimize data solutions using Azure Cosmos DB, Azure SQL Database, Azure Synapse Analytics, Azure Data Lake Storage, Azure Data Factory, Azure Stream Analytics, Azure Databricks, and Azure Blob Storage services
Apply the knowledge and skills gathered throughout the program to build an industry-ready product
Who Should Enroll in this Program?
This program caters to graduates in any discipline and working professionals from diverse backgrounds who have basic programming knowledge. The diversity of our students adds richness to class discussions and interactions.

A data engineer builds and maintains data structures and architectures for data ingestion, processing, and deployment for large-scale, data-intensive applications. It’s a promising career for both new and experienced professionals with a passion for data, including:

IT professionals
Database administrators
Beginners in the data engineering domain
BI developers
Data science professionals who want to expand their skill set
Students in UG/PG programs
Step 1 - Big Data for Data Engineering

The Big Data for Data Engineering course from IBM covers the basic concepts and terminologies of Big Data and its real-life applications across industries. You will gain insights into how to improve business productivity by processing large volumes of data and extracting valuable information from them.

Key Learning Objectives
Understand what Big Data is, sources of Big Data, and real-life examples
Learn the key differences between Big Data and data science
Master the usage of Big Data for operational analysis and better customer service
Gain knowledge of the Big Data ecosystem and the Hadoop framework

Course curriculum
Lesson 01 - What is Big Data
Lesson 02 - Beyond the Hype
Lesson 03 - Big Data and Data Science
Lesson 04 - Big Data Use Cases
Lesson 05 - Processing Big Data
Step 2 - Data Engineering with Hadoop

The Data Engineering with Hadoop course by IBM gives you an overview of Hadoop and its components, such as MapReduce and the Hadoop Distributed File System (HDFS). Additionally, this course teaches you to explore large data sets using Hadoop’s method of distributed processing.

Key Learning Objectives
Understand Hadoop’s architecture and primary components, such as MapReduce and HDFS
Add and remove nodes from Hadoop clusters, check the available disk space on each node, and modify configuration parameters
Learn about Apache projects that are part of the Hadoop ecosystem, including Pig, Hive, HBase, ZooKeeper, Oozie, Sqoop, and Flume

Course curriculum
Lesson 01 - Introduction to Hadoop
Lesson 02 - Hadoop Architecture and HDFS
Lesson 03 - Hadoop Administration
Lesson 04 - Hadoop Components
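The map, shuffle, and reduce phases at the heart of Hadoop's processing model can be sketched in a few lines of plain Python. This is an illustrative toy only (the function names and sample lines are invented for the example); a real Hadoop job runs each phase in parallel across a cluster, with HDFS supplying the input splits:

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in the input
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    # Shuffle: group all values by key, as the framework does between map and reduce
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts collected for each word
    return {word: sum(values) for word, values in groups.items()}

lines = ["big data needs big tools", "data tools"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
print(counts)  # {'big': 2, 'data': 2, 'needs': 1, 'tools': 2}
```

In Hadoop itself, the shuffle is handled entirely by the framework, and YARN schedules the map and reduce tasks on cluster nodes.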
Step 3 - Data Engineering with Scala

The Data Engineering with Scala course, carefully crafted by IBM, focuses on Scala programming. You will learn to write Scala code, perform Big Data analysis using Scala, and create your own Scala projects.

Key Learning Objectives
Create your own Scala project
Understand basic object-oriented programming methodologies in Scala
Work with data in Scala, including pattern matching, applying synthetic methods, and handling options, failures, and futures

Course curriculum
Lesson 01 - Introduction
Lesson 02 - Basic Object Oriented Programming
Lesson 03 - Case Objects and Classes
Lesson 04 - Collections
Lesson 05 - Idiomatic Scala
Step 4 - Big Data Hadoop and Spark Developer

This Big Data Hadoop and Spark Developer course helps you master the concepts of the Hadoop framework, Big Data, and Hadoop ecosystem tools such as HDFS, YARN, MapReduce, Hive, Impala, Pig, HBase, Spark, Flume, and Sqoop, along with the Big Data processing life cycle. This course is aligned with Cloudera’s CCA175 Big Data certification.

Key Learning Objectives
Learn how to navigate the Hadoop ecosystem and understand how to optimize its use
Ingest data using Sqoop, Flume, and Kafka
Implement partitioning, bucketing, and indexing in Hive
Work with RDDs in Apache Spark
Process real-time streaming data
Perform DataFrame operations in Spark using SQL queries
Implement User-Defined Functions (UDFs) and User-Defined Aggregate Functions (UDAFs) in Spark

Course curriculum
Lesson 01 - Course Introduction
Lesson 02 - Introduction to Big Data and Hadoop
Lesson 03 - Hadoop Architecture, Distributed Storage (HDFS), and YARN
Lesson 04 - Data Ingestion into Big Data Systems and ETL
Lesson 05 - Distributed Processing - MapReduce Framework and Pig
Lesson 06 - Apache Hive
Lesson 07 - NoSQL Databases - HBase
Lesson 08 - Basics of Functional Programming and Scala
Lesson 09 - Apache Spark - Next Generation Big Data Framework
Lesson 10 - Spark Core - Processing RDDs
Lesson 11 - Spark SQL - Processing DataFrames
Lesson 12 - Spark MLlib - Modeling Big Data with Spark
Lesson 13 - Stream Processing Frameworks and Spark Streaming
Lesson 14 - Spark GraphX
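The RDD workflow covered in this course pairs transformations (map, filter) with actions (reduce) that collapse a dataset to a result. A rough single-machine analogue can be written with Python built-ins, using an ordinary list to stand in for a distributed dataset (illustrative only; it ignores Spark's laziness, partitioning, and fault tolerance):

```python
from functools import reduce

# A toy "RDD": a plain list standing in for a partitioned, distributed dataset
numbers = [1, 2, 3, 4, 5, 6]

# Transformations, analogous to rdd.map(...) and rdd.filter(...)
squared = map(lambda x: x * x, numbers)
evens = filter(lambda x: x % 2 == 0, squared)

# Action, analogous to rdd.reduce(...): collapses the dataset to one value
total = reduce(lambda a, b: a + b, evens)
print(total)  # 4 + 16 + 36 = 56
```

In Spark, the same chain of transformations is recorded as a lineage graph and only executed, in parallel across executors, when an action such as reduce or collect is called.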
Step 5 - AWS Technical Essentials

The AWS Technical Essentials course teaches you how to navigate the AWS Management Console; understand AWS security measures, storage, and database options; and gain expertise in web services like RDS and EBS. This course helps you become proficient in identifying and efficiently using AWS services.

Key Learning Objectives
Understand the fundamental concepts of the AWS platform and cloud computing
Identify AWS concepts, terminologies, benefits, and deployment options to meet business requirements
Identify deployment and network options in AWS

Course curriculum
Lesson 01 - Introduction to Cloud Computing
Lesson 02 - Introduction to AWS
Lesson 03 - Storage and Content Delivery
Lesson 04 - Compute Services and Networking
Lesson 05 - AWS Managed Services and Databases
Lesson 06 - Deployment and Management
Step 6 - Big Data on AWS

The Big Data on AWS course helps you understand the Amazon Web Services cloud platform; Kinesis Analytics; AWS Big Data storage, processing, analysis, visualization, and security services; EMR; AWS Lambda; Glue; and machine learning algorithms.

Key Learning Objectives
Understand how to use Amazon EMR for processing data using Hadoop ecosystem tools
Understand how to use Amazon Kinesis for real-time Big Data processing, and analyze and transform Big Data using Kinesis Streams
Visualize data and perform queries using Amazon QuickSight

Course curriculum
Lesson 01 - Big Data on AWS Certification Course Overview
Lesson 02 - Big Data on AWS Introduction
Lesson 03 - AWS Big Data Collection Services
Lesson 04 - AWS Big Data Storage Services
Lesson 05 - AWS Big Data Processing Services
Lesson 06 - Analysis
Lesson 07 - Visualization
Lesson 08 - Security
Step 7 - Azure Fundamentals

The Azure Fundamentals course covers the main principles of cloud computing and how they are implemented in Microsoft Azure. You will work on the concepts of Azure services, security, privacy, compliance, trust, pricing, and support, and learn how to create the most common Azure services, including virtual machines, web apps, SQL databases, and features of Azure Active Directory, along with methods of integrating it with on-premises Active Directory.

Key Learning Objectives
Describe Azure storage and create Azure web apps
Deploy databases in Azure
Understand Azure AD, cloud computing, Azure, and Azure subscriptions
Create and configure VMs in Microsoft Azure

Course curriculum
Lesson 01 - Cloud Concepts
Lesson 02 - Core Azure Services
Lesson 03 - Security, Privacy, Compliance, and Trust
Lesson 04 - Azure Pricing and Support
Step 8 - Azure Data Engineer

The Azure Data Engineer course focuses on data-related implementation, including provisioning data storage services, ingesting streaming and batch data, transforming data, implementing security requirements, implementing data retention policies, identifying performance bottlenecks, and accessing external data sources.

Key Learning Objectives
Implement data storage solutions using Azure Cosmos DB, Azure SQL Database, Azure Synapse Analytics, Azure Data Lake Storage, Azure Data Factory, Azure Stream Analytics, Azure Databricks, and Azure Blob Storage services
Develop batch processing and streaming solutions
Monitor data storage and data processing
Optimize Azure data solutions

Course curriculum
Implement data storage solutions
Manage and develop data processing
Monitor and optimize data solutions
Step 9 - Data Engineering Capstone

The data engineering capstone project gives you an opportunity to implement the skills you learned throughout this program. Through dedicated mentoring sessions, you’ll learn how to solve a real-world, industry-aligned data engineering problem, from setting up configuration, ETL, data streaming, and data analysis to data visualization. This project is the final step in the learning path and will enable you to showcase your expertise in data engineering to future employers.

You can choose to work on projects in the most relevant domains (ecommerce, BFSI, video sharing) to make your practice more relevant.
Elective Courses
Python for Data Science
The Python for Data Science course, carefully crafted by IBM, helps you understand how to integrate Python using the PySpark interface. This course enables you to write your own Python scripts, perform fundamental hands-on data analysis using the Jupyter-based lab environment, and create your own data science projects using IBM Watson.
PySpark Training
This PySpark course provides an overview of Apache
Spark, the open-source query engine for processing
large datasets, and how to integrate it with Python
using the PySpark interface. You will learn to build
and implement data-intensive applications as you
dive into the world of high-performance machine
learning and leverage Spark RDD, Spark SQL, Spark
MLlib, Spark Streaming, HDFS, Sqoop, Flume, Spark
GraphX, and Kafka.
Apache Kafka
In this Apache Kafka course, you will learn the architecture, installation, configuration, and interfaces of the open-source Kafka messaging system. You will gain an understanding of the basics of Apache ZooKeeper as a centralized service and develop the skills to deploy Kafka for real-time messaging.
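The producer/consumer pattern at the heart of Kafka's messaging model can be pictured with an in-memory queue. This is a conceptual sketch, not the Kafka client API (the topic and event names are invented for illustration); real Kafka persists messages in partitioned, replicated logs on a broker cluster:

```python
import queue
import threading

topic = queue.Queue()  # stands in for a single Kafka topic partition
consumed = []

def producer():
    # Publish events to the topic, then a sentinel marking end of stream
    for event in ["order_created", "order_paid", "order_shipped"]:
        topic.put(event)
    topic.put(None)

def consumer():
    # Poll the topic until the sentinel arrives
    while True:
        event = topic.get()
        if event is None:
            break
        consumed.append(event)

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(consumed)  # ['order_created', 'order_paid', 'order_shipped']
```

Because the queue decouples the two threads, either side can run at its own pace — the same decoupling a Kafka topic provides between producing and consuming services.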
MongoDB Developer and Administrator
This course provides you with in-depth knowledge of NoSQL, data modeling, ingestion, querying, sharding, and data replication.
GCP Fundamentals
In the GCP Fundamentals course, you will learn how
to analyze and deploy infrastructure components
such as networks, storage systems, and application
services in Google Cloud Platform. This course
covers IAM, networking, and cloud storage and
introduces you to the flexible infrastructure and
platform services provided by Google Cloud
Platform.
SQL Training
This SQL course covers all the information you need
to successfully start working with SQL databases
and make use of the database in your applications.
You will learn to structure your database, author efficient SQL statements and clauses, and manage your SQL database for scalable growth.
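The kinds of statements such a course covers — schema definition, inserts, and aggregate queries with clauses like GROUP BY and HAVING — can be tried directly with Python's bundled sqlite3 module (the orders table and its data are invented for illustration):

```python
import sqlite3

# In-memory database; a server-backed RDBMS is driven by the same SQL
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        amount REAL NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 30.0), ("bob", 20.0), ("alice", 50.0)],
)

# Aggregate per customer, keeping only groups whose total exceeds 25
rows = conn.execute("""
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING total > 25
    ORDER BY total DESC
""").fetchall()
print(rows)  # [('alice', 80.0)]
conn.close()
```

Using parameter placeholders (`?`) rather than string formatting is also the standard guard against SQL injection in application code.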
Java Training
This Java course covers the concepts of Java, from introductory techniques to advanced programming skills, and provides you with knowledge of Core Java 8, operators, arrays, loops, methods, and constructors, as well as the JDBC and JUnit frameworks.
Spark for Scala Analytics
This Spark for Scala Analytics course provides you with an overview of the history of Apache Spark, how it evolved, how to build applications with Spark, RDDs and DataFrames, and Spark’s associated ecosystem. It teaches you to leverage the core RDD and DataFrame APIs to perform analytics on datasets with Scala.

Academic Masterclass - Data Engineering - Purdue University
Attend an online interactive Masterclass and get insights into the world of
data engineering.
Certification
Upon completion of this Post Graduate Program in Data Engineering in
partnership with Purdue University, you will receive the Post Graduate Program
Certification from Purdue University. You will also receive certificates from
Simplilearn for the courses in the learning path. These certificates will testify to
your skills as a data engineering expert.
Advisory Board Members
Aly El Gamal
Assistant Professor, Purdue University
Aly El Gamal has a Ph.D. in Electrical
and Computer Engineering and M.S. in
Mathematics from the University of Illinois. Dr.
El Gamal specializes in the areas of information
theory and machine learning and has received
multiple commendations for his research and
teaching expertise.
Ronald Van Loon
Big Data Expert, Director - Adversitement
Named by Onalytica as one of the three most
influential people in Big Data, Ronald van Loon
is an author for a number of leading Big Data
and Data Science websites, including Datafloq,
Data Science Central, and The Guardian. He is
also a renowned speaker at industry events.
USA
Simplilearn Americas, Inc.
201 Spear Street, Suite 1100, San Francisco, CA 94105
United States
Phone No: +1-844-532-7688
INDIA
Simplilearn Solutions Pvt Ltd.
# 53/1 C, Manoj Arcade, 24th Main, Harlkunte
2nd Sector, HSR Layout
Bangalore - 560102
Call us at: 1800-212-7688
www.simplilearn.com
Disclaimer: All programs are offered on a non-credit basis and are not transferable to a degree.