Predictive Analytics
with R
Session 1
INSTRUCTOR: DR. NITIN KOSHTA
QUANTITATIVE METHODS & OPERATIONS MANAGEMENT
PREDICTIVE ANALYTICS USING R: SESSION 1 October 10, 2023 1
Overview
Textbook
Data Mining and Predictive Analytics by D.T. Larose and C.D. Larose, Wiley Publication
Overview
Assessment
Assessment tool        Weightage
Mid-term               20%
End term               35%
Group Project          25%
Class participation    10%
Quiz                   10%
Overview
Group Project
▪Project proposal – Not exceeding 3 pages
▪Final project report – At least 3 pages but not exceeding 10 pages
The following information is mandatory for the final submission:
➢ Title of the project with introduction and motivation for the problem
➢ Discussion on the methodology used for collecting the data
➢ Methodologies used and details on data analyses
➢ Results, Implications and Conclusion
https://www.tableau.com/learn/articles/free-public-data-sets
Overview
Group Project
▪Soft copies of the project report (in PDF format), R file, and data set(s) used (in MS Excel format) must be
submitted.
▪The report must include the names of all the group members along with the group number.
▪Also include details of individual contributions (e.g., data collection or analysis or problem selection,
preparation of graphs/tables etc.) at the end of the report.
Overview
Academic Dishonesty
The following illustrates only three forms of academic dishonesty:
▪Plagiarism, e.g. the submission of work that is not one’s own or for which other credit has been obtained.
▪Improper collaboration in group work.
▪Copying or using unauthorized aids in tests and examinations
Introduction
❑During the early period of the 20th century, many companies
made business decisions based on opinions rather than on
proper data analysis.
❑Opinion-based decision-making often leads to incorrect
decisions.
❑Primary objective of business analytics: to improve the
quality of decision-making using data analysis.
❑Business analytics is about extracting meaningful insights
from data to guide organizational decision-making.
❑It supports tasks such as budgeting, forecasting, and product
development.
Introduction
Example:
❑At technology giant Microsoft, collaboration is key to a productive, innovative work environment.
❑Following a 2015 move of its engineering group's offices, the company sought to understand
how fostering face-to-face interactions among staff could boost employee performance and save
money.
❑Microsoft’s Workplace Analytics team hypothesized that moving the 1,200-person group from
five buildings to four could improve collaboration by cutting down on the number of buildings and
reducing the distance that staff needed to travel for meetings.
❑A past study found that people are more likely to collaborate when they are located closer to
one another.
Introduction
Example:
❑Results
▪46% decrease in meeting travel time
▪100 hours saved per week across all relocated staff members
▪$520,000 per year savings in employee time
▪Meetings per person increased from 14 to 18
▪The average duration of meetings declined from 0.85 to 0.77 hours
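The reported figures can be cross-checked with simple arithmetic. The sketch below is written in Python for brevity and uses only the numbers quoted above. The 52-week year and all variable names are assumptions made for illustration, not part of the Microsoft study.

```python
# Figures quoted in the results above
hours_saved_per_week = 100
annual_savings_usd = 520_000
meetings_before, meetings_after = 14, 18
duration_before, duration_after = 0.85, 0.77  # hours per meeting

WEEKS_PER_YEAR = 52  # assumption: savings accrue over a full 52-week year

# Annual hours saved and the value per hour that the dollar figure implies
hours_saved_per_year = hours_saved_per_week * WEEKS_PER_YEAR
implied_hourly_value = annual_savings_usd / hours_saved_per_year

# Relative changes in meeting count and meeting duration
meeting_count_change = (meetings_after - meetings_before) / meetings_before
duration_change = (duration_after - duration_before) / duration_before

print(f"Hours saved per year: {hours_saved_per_year}")
print(f"Implied value per hour saved: ${implied_hourly_value:.2f}")
print(f"Change in meetings per person: {meeting_count_change:.1%}")
print(f"Change in average meeting duration: {duration_change:.1%}")
```

Under the 52-week assumption, the two savings figures are consistent with each other if an hour of employee time is valued at about $100.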
Introduction
Example:
❑For meal kit delivery service Blue Apron, understanding customer behaviour and preferences is
vitally important to its success.
❑Each week, the company presents subscribers with a fixed menu of meals available for purchase
and employs predictive analytics to forecast demand, with the aim of using data to avoid product
spoilage and fulfil orders.
❑What variables/factors should Blue Apron use to make these predictions?
Introduction
Example:
Three categories of variables
▪Customer-related: historical data that depicts a given user’s order frequency
▪Recipe-related features: subscriber’s past recipe preferences; which upcoming meal they are likely to order
▪Seasonality: when order rates may be higher or lower, depending on the time of the year
Introduction
Example:
❑Results: Regression analysis
❑High forecasting accuracy, measured by RMSE (root mean squared error), which summarizes the
difference between predicted and observed values
❑What did Blue Apron achieve?
❑Improved user experience
❑Identified how subscriber tastes change over time
❑Recognized how shifting preferences are affected by recipe offerings
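To make the link between regression and RMSE concrete, here is a minimal Python sketch of a one-predictor least-squares fit and its RMSE. The data are toy values invented for illustration and are not Blue Apron figures. The function names and the meaning attached to `x` and `y` are assumptions. The RMSE here is computed on the same data used for fitting, so it understates the error expected on new data.

```python
from math import sqrt


def fit_simple_regression(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared_errors) / len(observed))


# Toy data (hypothetical): x could be a predictor such as past order
# frequency, y the demand to be forecast.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope = fit_simple_regression(x, y)
predicted = [intercept + slope * xi for xi in x]
error = rmse(y, predicted)

print(f"intercept={intercept:.2f}, slope={slope:.2f}, RMSE={error:.3f}")
```

A lower RMSE means the predictions sit closer to the observed values, in the same units as the variable being forecast.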
Types of Data Analytics
❑Descriptive Analytics
❑Diagnostic Analytics
❑Predictive Analytics
❑Prescriptive Analytics
Descriptive Analytics
❑Simplest type of analytics
❑Foundation for other types of analytics
❑Pull trends from raw data
❑Data visualization – charts, graphs, maps
❑It answers the question “What happened?”
❑Example: seasonal surge in sales
❑Sales of a video game console increase in October, November, and early December each year.
Diagnostic Analytics
❑Address the next logical question
❑“Why did this happen?”
❑It includes comparing coexisting trends or movements
❑Uncovering correlations between variables
❑Determining causal relationships
❑Example, video games example
❑Users are aged between eight and 18
❑Buyers are aged between 35 and 55
❑Primary motivator for the purchase - gifting
❑The spike in video game sales in the fall and early winter months may be due to the holidays
that include gift-giving
Predictive Analytics
❑Used to make predictions about future trends or
events
❑Finds an answer to the question – “What might
happen in the future?”
❑Uses historical data in tandem with industry trends to make
informed predictions about what the future could hold
for your company.
Predictive Analytics
❑Example: video game sales data for the past decade provides ample information to predict that
the same seasonal trend will occur next year.
❑This prediction is backed by upward trends in the video game industry as a whole.
❑Making predictions for the future can help your organization formulate strategies based on likely
scenarios.
Predictive Analytics
❑Other examples
❑Forecast demand for products and services
❑Stock market fluctuations
❑Economic slowdown
❑Which customer is likely to churn
Prescriptive Analytics
❑It helps answer the question – “What should we do next?”
❑It takes into account all possible factors in a scenario and suggests actionable takeaways.
❑Example: what should your team decide to do given the predicted seasonal trend due to
winter gift-giving?
❑Develop strategies to capitalize further on the seasonal spike and its supposed cause.
Prerequisites and Background for
Prescriptive Analytics
Knowledge of Inferential Statistics
▪Probability & Probability Distributions
▪Sampling & Sampling Distributions
▪Confidence Interval Estimation
▪Hypothesis Testing
▪Chi-squared test / Non-parametric tests
Prerequisites and Background for
Prescriptive Analytics
Statistical tools and models: Probability concepts
Application areas:
▪What is the probability that the stock index will go up by 15% by the end of this year?
▪What is the probability that the Reserve Bank will raise the interest rates?
▪What is the probability that a customer will default on a loan, and is a potential risk?
Prerequisites and Background for
Prescriptive Analytics
Statistical tools and models: Probability distributions
Application areas:
▪To study customer calls in a call center and customer arrival in a drive-through restaurant
▪To study customer waiting time and service time
▪Length of time to assemble a product or life span of a product
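As an illustration of how a distribution answers such questions, the Python sketch below models call arrivals with a Poisson distribution and the time between calls with an exponential distribution. These are common modelling choices and are assumed here, since the slide does not name specific distributions. The arrival rate of 4 calls per minute is a hypothetical value.

```python
from math import exp, factorial


def poisson_pmf(k, rate):
    """P(exactly k arrivals in one time unit) when arrivals average `rate`."""
    return rate ** k * exp(-rate) / factorial(k)


def exponential_cdf(t, rate):
    """P(time until the next arrival <= t) for arrival rate `rate`."""
    return 1 - exp(-rate * t)


calls_per_minute = 4  # hypothetical average arrival rate

# Probability of exactly 2 calls in a given minute
p_two_calls = poisson_pmf(2, calls_per_minute)

# Probability that the gap between two calls exceeds half a minute
p_long_gap = 1 - exponential_cdf(0.5, calls_per_minute)

print(f"P(exactly 2 calls in a minute) = {p_two_calls:.4f}")
print(f"P(gap between calls > 0.5 min) = {p_long_gap:.4f}")
```

The same rate parameter drives both results: the Poisson distribution counts arrivals per time unit, and the exponential distribution describes the waiting time between them.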
Prerequisites and Background for
Prescriptive Analytics
Statistical tools and models: Sampling and sampling distributions
Application areas:
▪In the case of assembly, sample data can be used to estimate the probability that the mean
assembly time will be longer than X minutes.
▪Samples are used to make inferences about the population.
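The assembly-time question can be sketched with the sampling distribution of the mean. The Python example below assumes the sample mean is approximately normal, with standard error equal to the population standard deviation divided by the square root of the sample size. The population mean, standard deviation, sample size, and threshold are all hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_exceeds(threshold, pop_mean, pop_sd, n):
    """P(sample mean > threshold) under a normal sampling distribution."""
    standard_error = pop_sd / sqrt(n)
    sampling_dist = NormalDist(mu=pop_mean, sigma=standard_error)
    return 1 - sampling_dist.cdf(threshold)


# Hypothetical assembly process: mean 20 minutes, standard deviation 4 minutes
probability = prob_sample_mean_exceeds(threshold=21, pop_mean=20, pop_sd=4, n=64)

print(f"P(mean assembly time of 64 units > 21 min) = {probability:.4f}")
```

With 64 units the standard error is 0.5 minutes, so a sample mean of 21 minutes lies two standard errors above the population mean.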
Prerequisites and Background for
Prescriptive Analytics
Statistical tools and models: Confidence interval estimation
Application areas:
▪To estimate the proportion of people with diabetes in a population.
▪To estimate the mean IQ score of the students of IIM Amritsar.
▪To estimate the length of time to assemble a product or the life span of a product.
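A confidence interval for a proportion can be sketched as follows. The Python example uses the normal-approximation (Wald) interval, which is one of several possible methods and is assumed here for simplicity. The survey counts (80 people with diabetes out of 1,000 sampled) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical survey: 80 of 1,000 sampled people have diabetes
lower, upper = proportion_confidence_interval(successes=80, n=1000)

print(f"95% CI for the proportion: ({lower:.4f}, {upper:.4f})")
```

The approximation works best with large samples and proportions that are not close to 0 or 1. Other interval methods may be preferable outside that range.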
Prerequisites and Background for
Prescriptive Analytics
Statistical tools and models: Hypothesis testing
Application areas:
▪It helps analyse the claims.
▪Toyota Prius claims to provide about 50 mpg in the city and 48 mpg on the highway.
▪Honda Civic Hybrid claims to provide 40 mpg in the city and 45 mpg on the highway.
▪Ford Fusion Hybrid claims to provide 41 mpg in the city and 36 mpg on the highway.
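A claim such as "about 50 mpg in the city" can be checked with a one-sample test. The Python sketch below runs a one-sided test using a normal approximation to the test statistic, which is assumed here because it needs only the standard library. With small samples a t-test would normally be used instead. The sample size, sample mean, and sample standard deviation are hypothetical values, not measured data.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test(sample_mean, sample_sd, n, claimed_mean):
    """Test H0: mean >= claimed_mean against H1: mean < claimed_mean."""
    standard_error = sample_sd / sqrt(n)
    z = (sample_mean - claimed_mean) / standard_error
    p_value = NormalDist().cdf(z)  # lower-tail probability
    return z, p_value


# Hypothetical road test of 40 cars against the claimed 50 mpg city figure
z_stat, p_value = one_sided_z_test(
    sample_mean=48.9, sample_sd=3.0, n=40, claimed_mean=50
)

ALPHA = 0.05  # significance level chosen for illustration
reject_claim = p_value < ALPHA

print(f"z = {z_stat:.3f}, p-value = {p_value:.4f}, reject claim: {reject_claim}")
```

A p-value below the chosen significance level would indicate that the sample is unlikely if the claimed mileage were true.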