Learn PySpark
In Just 30 Days
*Disclaimer*
Everyone has their own way of learning. The key
is focusing on the core elements of PySpark to
build a strong understanding.
This guide is designed to assist you in that
journey.
www.bosscoderacademy.com 1
Introduction:
Week 1: Introduction to PySpark and Environment Setup
Day 1
Introduction to Big Data and Spark Ecosystem
Objective:
Topics:
Practice:
Day 2
Setting Up PySpark Environment
Objective:
Topics:
Practice:
Day 3
SparkContext and SparkSession
Objective:
Topics:
Practice:
Day 4
RDDs (Resilient Distributed Datasets) Basics
Objective:
Topics:
Practice:
Day 5
RDD Transformations and Actions
Objective:
Topics:
Practice:
Day 6
Key-Value Pair RDDs
Objective:
Topics:
Practice:
Day 7
Data Persistence and Partitioning
Objective:
Topics:
Practice:
Week 2: DataFrames and Spark SQL
Day 8
Introduction to DataFrames
Objective:
Topics:
Practice:
Day 9
DataFrame Operations
Objective:
Topics:
Practice:
Day 10
Aggregations in DataFrames
Objective:
Topics:
Practice:
Day 11
DataFrame Joins
Objective:
Topics:
Practice:
Day 12
DataFrame Functions and Expressions
Objective:
Topics:
Practice:
Day 13
Working with Dates and Timestamps
Objective:
Topics:
Practice:
Day 14
Spark SQL Introduction
Objective:
Topics:
Practice:
Week 3: Advanced Data Processing and Optimization
Day 15
Working with Complex Data Types
Objective:
Topics:
Practice:
Day 16
User-Defined Functions (UDFs)
Objective:
Topics:
Practice:
Day 17
Broadcast Variables and Accumulators
Objective:
Topics:
Practice:
Day 18
Performance Tuning and Optimization
Objective:
Topics:
Practice:
Day 19
Working with Parquet and ORC Formats
Objective:
Topics:
Practice:
Day 20
Window Functions
Objective:
Topics:
Practice:
Day 21
Handling Missing Data
Objective:
Topics:
Practice:
Week 4: Machine Learning and PySpark MLlib
Day 22
Introduction to PySpark MLlib
Objective:
Topics:
Practice:
Day 23
Feature Engineering
Objective:
Topics:
Practice:
Day 24
Classification Models in PySpark
Objective:
Topics:
Practice:
Day 25
Regression Models in PySpark
Objective:
Topics:
Practice:
Day 26
Clustering Models in PySpark
Objective:
Topics:
Practice:
Day 27
Dimensionality Reduction Techniques
Objective:
Topics:
Practice:
Day 28
Model Persistence and Deployment
Objective:
Topics:
Practice:
Day 29
Spark Streaming Basics
Objective:
Topics:
Practice:
Day 30
Spark Structured Streaming
Objective:
Topics:
Practice:
Conclusion:
Why Bosscoder?
1000+ alumni placed at top product-based companies.
A hike of more than 136% for 2 out of 3 working professionals.
Average package of 24 LPA.