Applied Predictive Analytics
Key Data Mining Tasks
(Slide columns: Task | Difficulty Level | Automated Usage | Business Example: Traffic Management)
1. Descriptive Analytics
   Human-interpretable patterns from data (large volume)
   Techniques: Clustering, Summary & Central Tendency, Association Analysis ..
   Role (traffic management example): Inform
2. Predictive Analytics
   Build models to predict the outcome of future data
   Techniques: Classification, Regression, Anomaly & Deviation Detection ..
   Role (traffic management example): Inform & Warn
3. Prescriptive Analytics
   Predict future outcomes and suggest actions to mitigate the impact of the predicted outcomes
   Techniques: Optimization Techniques ..
   Role (traffic management example): Inform, Warn & Advisory Role
Predictive Analytics Process
Uses techniques from data mining (DM), statistics, modelling, machine learning and AI to analyse
current data and make predictions about the future
[Figure: progression over TIME, from DATA to ACTION]
Reporting / Analysis: What happened? Why did it happen?
Monitoring: What is happening now?
Predictive Analytics: What is going to happen in the future?
Predictive Models
In predictive modelling, data is collected for the relevant predictors, a statistical model is formulated,
predictions are made, and the model is validated (or revised) as additional data becomes available. The
model may be as simple as a linear equation or as complex as a neural network.
Business Process
- Create the Model
- Test the Model
- Validate the Model
- Evaluate the Model
Features of Modelling
- Data Analysis and Manipulation
- Visualization
- Statistics
- Hypothesis Testing
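The modelling cycle described in this section (fit a model, check it on data it has not seen, revise it when more data arrives) can be sketched in a few lines of Python. This is only a sketch under stated assumptions: it uses the simplest case of a one-predictor linear equation, and the synthetic data, the 150/50 split, and the helper names `fit_line` and `mse` are illustrative choices, not something the slides prescribe.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mse(model, xs, ys):
    """Mean squared prediction error of a fitted (intercept, slope) pair."""
    intercept, slope = model
    errors = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)

random.seed(0)
# Hypothetical data (an assumption for illustration): y = 2 + 3x plus noise.
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [2 + 3 * x + random.gauss(0, 1) for x in xs]

# Create the model on the first 150 observations.
model = fit_line(xs[:150], ys[:150])

# Validate it on the 50 observations that arrived later.
holdout_mse = mse(model, xs[150:], ys[150:])

# Revise: refit once the additional data is available.
revised = fit_line(xs, ys)
```

A hold-out error close to the noise variance (1.0 in this synthetic setup) suggests the linear equation is adequate; a much larger value would be a reason to revise the model form rather than just refit it.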
Predictive Analytics Everywhere
How Does the Model Work?
Predictive modelling can actively guide a marketing campaign
(a key form of Business Intelligence):
• Predictors rank your customers to guide your marketing
• Combined predictors mean smarter rankings
• The computer makes your model from your customer data
• A simple curve shows how well your model works
Common Models to Predict
• Customer Relationship Management
• Chance / prospect of responding to an ad
• Mail recipients likely to buy
• When a customer is likely to churn
• Whether a person is likely to get sick
• Portfolio / product prediction
• Risk management & pricing
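The ranking idea can be illustrated with a small sketch. It assumes that the "simple curve" is a cumulative gains curve (the share of all responders reached when contacting customers in ranked order), which is one common choice and not necessarily the curve the slide shows. The customer records and the equal-weight combination of predictors are hypothetical; in practice the weights would be learned from customer data.

```python
def cumulative_gains(scores, responded):
    """Share of all responders captured after contacting the top 1, 2, ..., n customers."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total = sum(responded)
    captured, curve = 0, []
    for i in order:
        captured += responded[i]
        curve.append(captured / total)
    return curve

# Hypothetical customers: (recent purchases, e-mail opens, responded to last campaign)
customers = [
    (5, 9, 1), (4, 1, 0), (1, 8, 1), (3, 7, 1),
    (3, 0, 0), (0, 2, 0), (2, 6, 1), (1, 1, 0),
]
responded = [c[2] for c in customers]

# Rank by a single predictor, then by two predictors combined.
single = [c[0] for c in customers]
combined = [c[0] + c[1] for c in customers]  # equal weights are an assumption

gains_single = cumulative_gains(single, responded)
gains_combined = cumulative_gains(combined, responded)
```

With this toy data, contacting the top four customers reaches half of the responders under the single-predictor ranking and all of them under the combined ranking.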
Statistical Learning Explained
Shown are plots of Sales against TV, Radio and Newspaper, each with a fitted
linear-regression line
Question ➔ How do we predict Sales in general? We do it with a
model:
Sales = f(TV, Radio, Newspaper)
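One simple candidate for f is a linear model in all three predictors. The sketch below fits such a model by least squares using only the Python standard library. The numbers are hypothetical, and Sales is generated from a known noise-free linear rule purely so the fit can be checked. The helper names `solve`, `fit_linear` and `predict_sales` are illustrative.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_linear(rows, y):
    """Least-squares coefficients [b0, b1, ..., bp] for y = b0 + b1*x1 + ... + bp*xp."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)  # normal equations: (X'X) b = X'y

# Hypothetical (TV, Radio, Newspaper) values, not real data.
X = [(230, 38, 69), (45, 39, 45), (17, 46, 69), (152, 41, 59),
     (181, 11, 58), (9, 49, 75), (57, 33, 24), (120, 20, 12)]
# Assumed generating rule: Sales = 3 + 0.05*TV + 0.2*Radio + 0*Newspaper
sales = [3.0 + 0.05 * tv + 0.2 * radio + 0.0 * news for tv, radio, news in X]

coef = fit_linear(X, sales)

def predict_sales(tv, radio, news):
    """Predicted Sales for new predictor values, using the fitted coefficients."""
    return coef[0] + coef[1] * tv + coef[2] * radio + coef[3] * news
```

Because the toy data contains no noise, the fitted coefficients recover the generating rule, including a Newspaper coefficient of essentially zero. On real data the estimates would only approximate the underlying relationship.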
Statistical Learning cont …
SALES → what we want to PREDICT; also called the RESPONSE or OUTPUT, usually denoted Y
TV, RADIO, NEWSPAPER → each is a PREDICTOR / FEATURE / ATTRIBUTE / INPUT, denoted X1, X2, X3, ...
X is the INPUT VECTOR: X = (X1, X2, X3)^T, a column vector
Now we write the model: Y = f(X) + ϵ
(ϵ stands for measurement errors / discrepancies)
With a good f we can predict Y at given values of X. Depending on the complexity
of f, we may also be able to understand how each component of X affects Y
Can we have an ideal f(X)?
What value does Y take for X = 4?
A good value is f(4) = E(Y | X = 4), the average value of Y given X = 4
Regression function: f(x) = E(Y | X = x)
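A minimal sketch of estimating the regression function from a sample: group the observations by their x value and average the y values in each group. The data and the helper name `regression_function` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def regression_function(xs, ys):
    """Estimate f(x) = E(Y | X = x) by the sample mean of y at each observed x."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    return {x: mean(values) for x, values in groups.items()}

# Hypothetical sample with repeated x values.
xs = [3, 4, 4, 4, 5, 5, 4, 3]
ys = [5.1, 6.8, 7.4, 7.0, 9.2, 8.8, 6.8, 4.9]

f_hat = regression_function(xs, ys)
estimate_at_4 = f_hat[4]  # average of the four y values observed at x = 4
```

When few or no observations fall exactly at x = 4, a common workaround is to average Y over a small neighbourhood of x instead.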
Broad Types of Learning
Supervised Learning
- With a pre-classified training sample: input is X and output is Y
- Given an observation x, what is the BEST label y?
Unsupervised Learning
- Given only inputs x (no labels)
- From the set of X's, cluster or summarize them
Semi-Supervised Learning
- Combination of Supervised & Unsupervised Learning
Reinforcement Learning
- Determine What to do based on Rewards and Punishment
Broad Types of Learning cont..
Supervised Learning –
Predict the value of the target feature for a new example
Input features X1, X2, ..., Xn and output feature Y
Classification ➔ when Y is discrete
Regression ➔ when Y is continuous
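The difference between the two supervised tasks shows up in how the same method produces its output. The sketch below uses k-nearest neighbours, chosen here only for illustration: a majority vote gives a discrete label (classification), while an average gives a continuous value (regression). The data ("hours studied" against pass/fail or exam score) is hypothetical.

```python
from collections import Counter

def k_nearest(train, x, k):
    """The k training pairs (x_i, y_i) whose x_i is closest to x."""
    return sorted(train, key=lambda pair: abs(pair[0] - x))[:k]

def knn_classify(train, x, k=3):
    """Discrete Y: majority label among the k nearest neighbours."""
    labels = [y for _, y in k_nearest(train, x, k)]
    return Counter(labels).most_common(1)[0][0]

def knn_regress(train, x, k=3):
    """Continuous Y: mean target value among the k nearest neighbours."""
    values = [y for _, y in k_nearest(train, x, k)]
    return sum(values) / len(values)

# Hypothetical training samples: x = hours studied.
pass_fail = [(1, "fail"), (2, "fail"), (3, "fail"), (6, "pass"), (7, "pass"), (8, "pass")]
exam_score = [(1, 35.0), (2, 42.0), (3, 50.0), (6, 71.0), (7, 78.0), (8, 85.0)]

label = knn_classify(pass_fail, 6.5)   # classification: a class label
score = knn_regress(exam_score, 6.5)   # regression: a number
```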
Semi-Supervised Learning –
Self-Training Method (popular)
1. Learn a classification model using the labelled data
2. Predict classes for the unlabelled data
3. Extend the training base (labelled data) with the confident examples
4. Repeat from step 1 with the enlarged training base
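The self-training loop can be sketched as follows. The classifier (nearest class centroid on one numeric feature), the confidence measure (distance margin between the two closest centroids) and the threshold value are all assumptions made for this illustration. Any classifier that reports a confidence could take their place.

```python
def centroids(labelled):
    """Mean x per class; this plays the role of the classification model here."""
    sums = {}
    for x, label in labelled:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}

def predict_with_confidence(model, x):
    """Return (label, confidence); confidence is the margin between the two nearest centroids."""
    ranked = sorted(model, key=lambda label: abs(x - model[label]))
    best, runner_up = ranked[0], ranked[1]
    return best, abs(x - model[runner_up]) - abs(x - model[best])

def self_train(labelled, unlabelled, threshold, max_rounds=10):
    """Grow the labelled set with confidently predicted examples, then relearn."""
    labelled, pool = list(labelled), list(unlabelled)
    for _ in range(max_rounds):
        model = centroids(labelled)              # learn from labelled data
        confident, remaining = [], []
        for x in pool:                           # predict classes for unlabelled data
            label, confidence = predict_with_confidence(model, x)
            if confidence >= threshold:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break                                # nothing new to add
        labelled += confident                    # extend the training base
        pool = remaining
    return centroids(labelled), labelled, pool

# Hypothetical data: two labelled points and five unlabelled ones.
labelled = [(1.0, "A"), (9.0, "B")]
unlabelled = [2.0, 3.0, 8.0, 7.0, 5.2]
model, grown, leftover = self_train(labelled, unlabelled, threshold=2.0)
```

In this toy run, four unlabelled points are absorbed in the first round, while the ambiguous point 5.2 never clears the threshold and stays unlabelled.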
Feature/ Attribute Types
❖ Categorical (Example: A,B,AB,O for Blood Group)
❖ Ordinal (Example: Large, Medium, Small)
❖ Integer-Valued (Example: Number of words in a Text)
❖ Real-Valued (Example: Height)
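Each feature type usually needs a different numeric encoding before it can be fed to a model. The sketch below shows one common convention: one-hot for categorical values, an integer rank for ordinal values, and the raw number for integer-valued and real-valued features. The record layout and field names are hypothetical.

```python
BLOOD_GROUPS = ["A", "B", "AB", "O"]                  # categorical: no order
SIZE_ORDER = {"Small": 0, "Medium": 1, "Large": 2}    # ordinal: order matters

def one_hot(value, categories):
    """Encode a categorical value as a 0/1 indicator per category."""
    return [1 if value == c else 0 for c in categories]

def encode(record):
    """Turn one mixed-type record into a flat numeric feature list."""
    return (
        one_hot(record["blood_group"], BLOOD_GROUPS)   # categorical
        + [SIZE_ORDER[record["size"]]]                 # ordinal
        + [len(record["text"].split())]                # integer-valued: word count
        + [float(record["height_cm"])]                 # real-valued
    )

# Hypothetical record with one feature of each type.
row = {"blood_group": "AB", "size": "Medium",
       "text": "the quick brown fox", "height_cm": 172.5}
features = encode(row)
```

One-hot encoding avoids implying an order among blood groups, whereas the integer rank deliberately preserves Small < Medium < Large.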