Lecture 9 Segmentation Methods

This lecture focuses on understanding consumers through segmentation methods in data-driven marketing, covering key concepts such as CART, CHAID, and neural networks. It discusses various segmentation variables, both a priori and post hoc, and provides examples of segmentation schemes based on income, gender, and age. The lecture also evaluates the advantages and disadvantages of different segmentation methods, emphasizing their applications in predicting consumer behavior.

Uploaded by Haocheng Zhang

BUS119 Data-Driven Marketing

Lecture 9: Understanding Consumers Using Segmentation Methods in Data-Driven Marketing

Data-Driven Marketing
Lecture 9 Understanding Consumers Using
11/10/2023 Segmentation Methods 1
Class Outline

In this class, we will learn:


• What is segmentation?
• Several common segmentation methods:
  • CART
  • CHAID
  • Neural networks
• Applications of segmentation in data-driven marketing

Review: what is segmentation?

Segment: A group of consumers who are similar in terms of how they respond to your marketing mix.

Segmentation: The act of dividing the market into segments.

Segmentation Variables – “A Priori”

1. Demographic: e.g., gender, age, income, education, family size, etc.
2. Geographic: e.g., zip code, census tract, block group, etc.
3. Psychographic: e.g., lifestyle, personality, etc.

Segmentation Variables – “Post Hoc”

1. Attributes: e.g., Book Club's Selection of the Month
2. Benefit: e.g., satisfaction
3. Behavioral: e.g., RFM (recency, frequency, and monetary value)

The following two types of segmentation methods are often used in data-driven marketing:

• Classification trees: inclusive of all the tree-generating techniques
  • CHAID: Chi-square Automatic Interaction Detection
  • CART: Classification and Regression Trees
• Neural networks

The goals of Netflix's database marketing efforts are…

• Predicting movie preferences
• Purchase price for DVD rentals
• “Throttling” heavy renters

Here is the question:

How do you become more intelligent about your potential consumers' purchases? (Assume that each consumer observation contains Income, Gender, and Age.)

Segmentation Scheme 1

Income       Response rate
< 30K        0.087
30K-60K      0.087
> 60K        0.087

Segmentation Scheme 2

Gender       Response rate
Female       0.086
Male         0.088

Segmentation Scheme 3

Age          Response rate
< 25         0.119
25 - 35      0.070
> 35         0.039
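
As a minimal sketch of how single-variable tables like Segmentation Schemes 1 to 3 could be produced, the Python below groups customer records by one variable and computes the share that responded. The records, the field names (`age`, `gender`, `responded`), and the helper `response_rate_by` are made up for illustration; they are not the lecture's data.

```python
from collections import defaultdict

def response_rate_by(records, key):
    """Return {segment value: share of records with responded == 1}."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responders, total]
    for rec in records:
        counts[rec[key]][0] += rec["responded"]
        counts[rec[key]][1] += 1
    return {seg: resp / total for seg, (resp, total) in counts.items()}

# Hypothetical toy records (NOT the lecture's data).
customers = [
    {"age": "< 25", "gender": "Male", "responded": 1},
    {"age": "< 25", "gender": "Female", "responded": 1},
    {"age": "25 - 35", "gender": "Male", "responded": 1},
    {"age": "25 - 35", "gender": "Female", "responded": 0},
    {"age": "25 - 35", "gender": "Male", "responded": 0},
    {"age": "25 - 35", "gender": "Female", "responded": 1},
    {"age": "> 35", "gender": "Male", "responded": 0},
    {"age": "> 35", "gender": "Female", "responded": 0},
]

by_age = response_rate_by(customers, "age")
by_gender = response_rate_by(customers, "gender")
print(by_age)
print(by_gender)
```

In this toy sample the response rate varies by age but is flat across gender, which mirrors the pattern in the three schemes: a variable is only useful for segmentation if the response rate differs across its categories.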

CHAID (Chi-square Automatic Interaction Detection)
Analysis: segmentation tree

How do we come up with the CHAID tree?

Not P-value in this tree


As a review, let's look at the chi-square test of independence again:

                 Response
Gender       Will Buy     Will Not Buy
Male         255          1554
Female       127          1274

Chi-Square Test of Association

H0: Gender and Response are independent.

Chi-square statistic:

$$\chi^2 = \sum_i \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$$

• If the chi-square statistic is “large” (i.e., p-value < 0.05):
  • Observed and expected frequencies are different from each other
  • We reject H0

• If the chi-square statistic is “small” (i.e., p-value > 0.05):
  • Observed and expected frequencies are similar to each other
  • We do not reject H0

To test independence, we need to get the expected
frequency conditional on independence for each cell.
How to do it? Here is Step 1.

                          Response
Gender               Will Buy             Will Not Buy          Row proportion
Male                 255                  1554                  1809/3210 = 0.56
Female               127                  1274                  1401/3210 = 0.44
Column proportion    382/3210 = 0.12      2828/3210 = 0.88      N = 3210
To test independence, we need to get the expected
frequency conditional on independence for each
cell. How to do it? Here is Step 2.

                          Response
Gender               Will Buy                    Will Not Buy                 Row proportion
Male                 0.56*0.12*3210 = 215.7      0.56*0.88*3210 = 1581.9      0.56
Female               0.44*0.12*3210 = 169.5      0.44*0.88*3210 = 1242.9      0.44
Column proportion    0.12                        0.88                         N = 3210
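
Steps 1 and 2 can be sketched in Python as follows. This is an illustrative sketch, not the lecture's procedure in SPSS: the helper name `expected_counts` and the dictionary layout are assumptions. It uses unrounded row and column totals, so its results differ slightly from the slide, which rounds the proportions to two decimals (for example, about 215.28 here versus 215.7 on the slide for Male / Will Buy).

```python
# Observed counts from the Gender x Response table.
observed = {
    ("Male", "Will Buy"): 255, ("Male", "Will Not Buy"): 1554,
    ("Female", "Will Buy"): 127, ("Female", "Will Not Buy"): 1274,
}

def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / N."""
    n = sum(observed.values())
    row_totals, col_totals = {}, {}
    for (row, col), count in observed.items():
        row_totals[row] = row_totals.get(row, 0) + count
        col_totals[col] = col_totals.get(col, 0) + count
    return {(r, c): row_totals[r] * col_totals[c] / n for (r, c) in observed}

expected = expected_counts(observed)
for cell, value in expected.items():
    print(cell, round(value, 2))
```

Multiplying the row proportion by the column proportion and then by N is the same calculation as row total times column total divided by N.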


Lastly, we compare the observed and expected frequencies, and this gives us the test of independence

                          Response
Gender                Will Buy     Will Not Buy
Male      Actual      255          1554
          Expected    215.7        1581.9
Female    Actual      127          1274
          Expected    169.5        1242.9

N = 3210
Chi-Square Test of Independence

Chi-square statistic:

$$\chi^2 = \frac{(255-215.7)^2}{215.7} + \frac{(127-169.5)^2}{169.5} + \frac{(1554-1581.9)^2}{1581.9} + \frac{(1274-1242.9)^2}{1242.9} = 19.09$$

How does one define “large”?

• Use a critical value for the chi-square statistic
• Degrees of freedom (df) = (# rows – 1) × (# columns – 1); here df = (2 – 1) × (2 – 1) = 1
• Reject H0 when the p-value is smaller than the 0.05 level

And this is easily done in SPSS!
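
For readers without SPSS, the same test can be sketched in plain Python. The function name `chi_square_2x2` is hypothetical, and the sketch makes two assumptions: it uses unrounded expected counts (so the statistic comes out near 19.06 rather than the slide's 19.09, which is based on rounded proportions), and it applies no continuity correction. The p-value uses the closed-form tail probability that holds only when df = 1.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table (df = 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected under independence
            stat += (obs - exp) ** 2 / exp
    # For df = 1, P(chi-square > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Rows: Male, Female. Columns: Will Buy, Will Not Buy.
stat, p_value = chi_square_2x2([[255, 1554], [127, 1274]])
print(round(stat, 2), p_value)
```

Because the p-value is far below 0.05, the sketch reaches the same conclusion as the slide: reject H0, so Gender and Response are associated.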

Why CHAID divided AGE1 by GENDER (not INCOME)

• Run a chi-square test between Gender and Response; get the p-value
• Run a chi-square test between Income and Response; get the p-value
• Whichever p-value is lower indicates the superior segmentation
……

And this continues for the second and final steps of segmentation
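
The selection rule above can be sketched in Python: compute a chi-square p-value for each candidate predictor within a node and pick the lowest. The counts below are made up for illustration (only the Gender response rates of 0.141 and 0.091 echo the lecture), and `chi_square_p_value` is a hypothetical helper that only handles df of 1 or 2. Full CHAID implementations typically also merge similar categories and adjust the p-values, which this sketch omits.

```python
import math

def chi_square_p_value(table):
    """p-value of the Pearson chi-square test for a table with df of 1 or 2."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only handles df of 1 or 2")

# Hypothetical [Will Buy, Will Not Buy] counts inside the "< 25" age node.
candidates = {
    "Gender": [[141, 859], [91, 909]],            # Male, Female
    "Income": [[78, 592], [77, 589], [77, 587]],  # < 30K, 30K-60K, > 60K
}

p_values = {name: chi_square_p_value(tbl) for name, tbl in candidates.items()}
best = min(p_values, key=p_values.get)
print(p_values, best)
```

With these toy counts Gender has the much lower p-value, so the node would be split by Gender rather than Income.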
CHAID (Chi-square Automatic Interaction Detection)
Analysis: segmentation tree

CHAID Segments

Segment                Response rate
1. < 25, Male          0.141
2. < 25, Female        0.091
3. 25-35, < 30K        0.081
4. 25-35, 30K-60K      0.069
5. 25-35, > 60K        0.059
6. > 35                0.039
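
Once the tree is built, scoring a consumer is a simple lookup. The sketch below encodes the six segments from the table as a function; the function name `chaid_segment`, the argument units (age in years, income in thousands), and the treatment of boundary values (for example, age 25 and 35 both falling in "25-35") are assumptions, since the slides do not specify them.

```python
def chaid_segment(age, gender, income_k):
    """Return (segment number, response rate) from the CHAID segments table."""
    if age < 25:
        # The youngest group is split by gender.
        return (1, 0.141) if gender == "Male" else (2, 0.091)
    if age <= 35:
        # The middle group is split by income.
        if income_k < 30:
            return (3, 0.081)
        if income_k <= 60:
            return (4, 0.069)
        return (5, 0.059)
    # The oldest group is not split further.
    return (6, 0.039)

print(chaid_segment(22, "Male", 40))
print(chaid_segment(30, "Female", 75))
print(chaid_segment(50, "Male", 20))
```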

Summary of CHAID
• CHAID analysis has the following requirements:
  • The predictor variables must be categorical;
  • The splitting can be into 2 branches or more;
  • The splitting is based on a chi-square test;
  • No variable is included unless there is a statistically significant association between the dependent variable and the predictor;
  • There is NO pruning of the final tree.

• Disadvantages of CHAID:
  • No pruning, so the tree can overfit;
  • Once a variable is used, it cannot be used again.

• Advantages of CHAID:
  • Allows splits to be more than binary;
  • Is also part of the SPSS family (an add-on product).
CART (Classification and Regression Trees) is very similar to CHAID
• Very similar to CHAID, but the differences are:
  • The predictor variables can be both categorical and interval;
  • The splitting must be binary;
  • The splitting is based on a Gini measure, a measure of “impurity”, which is not a chi-square test;
  • The final tree can be pruned backwards.

• CART's advantage is its ability to handle complex interactions and to uncover these interactions through data analysis; CART is also robust to outliers.

• CART's disadvantage: since it is based on stepwise sample splits and not precise values, it is potentially unstable.
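
To make the Gini measure concrete, here is a minimal sketch of how one binary split on an interval predictor could be chosen. It is not a full CART implementation (no categorical predictors, no recursion, no pruning); the function names and the toy ages and responses are made up for illustration. For a 0/1 response with responder share p, the Gini impurity is 1 - p^2 - (1 - p)^2 = 2p(1 - p).

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # share of responders
    return 2 * p * (1 - p)

def best_binary_split(values, labels):
    """Return (threshold, weighted Gini) for the best split 'value <= threshold'."""
    n = len(values)
    best = None
    for threshold in sorted(set(values))[:-1]:  # last value would leave the right side empty
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical toy data: age in years and whether the consumer responded.
ages = [19, 22, 24, 28, 33, 41, 47, 52]
responded = [1, 1, 1, 1, 0, 1, 0, 0]

threshold, score = best_binary_split(ages, responded)
print(threshold, score)
```

The split with the lowest weighted impurity wins; here that is "age <= 28", which leaves a pure left branch. Small changes in the sample can move the chosen threshold, which is one way to see the instability noted above.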
Neural Networks: handle complex relationships by assuming NO specific statistical relationship between variables

• Used for building predictive models in situations where the analyst has little knowledge about the form of the relationship between the independent and dependent variables;

• Previous lectures covered tools (e.g., linear regression) that can be used for prediction. However, they assume very specific mathematical forms (often linear) in the pattern relating the dependent and independent variables.

• When patterns are too complex to be captured by these forms, neural networks provide a viable alternative.
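
A tiny illustration of a pattern that a linear form cannot capture: the output should be 1 when exactly one of two binary inputs is 1 (the XOR pattern). No single linear function of the two inputs reproduces this exactly, but a network with one small hidden layer can. The sketch below is an assumption-laden toy: the weights are hand-picked rather than trained, and the function names are hypothetical.

```python
def relu(x):
    """Rectified linear activation: pass positives, zero out negatives."""
    return max(0.0, x)

def tiny_network(x1, x2):
    """Two hidden ReLU units and one linear output, with hand-picked weights."""
    h1 = relu(x1 + x2)        # active when at least one input is on
    h2 = relu(x1 + x2 - 1.0)  # active only when both inputs are on
    return h1 - 2.0 * h2      # cancels the "both on" case

outputs = {(a, b): tiny_network(a, b) for a in (0, 1) for b in (0, 1)}
print(outputs)
```

In practice the weights are learned from the training sample rather than set by hand, which is why the quality of that sample matters so much.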

A Common Neural Network

Summary of Neural Networks: pros and cons

• Neural networks are “trained” since there is no pre-specified mathematical model relating input and output. Therefore, they place an extremely heavy burden on the data in the training sample.

• Disadvantages: neural networks are not intuitive, their results are often difficult to interpret (black box), and they require very specific software.

Overall evaluation of tree and neural net segmentation methods
• Useful in dealing with large amounts of data with many variables;

• Neural nets (NN) provide an alternative to regression or other segmentation analysis methods. However, because an NN depends so much on the training sample, its performance on the test dataset is questionable. Selecting the right variables (e.g., via stepwise regression) could improve the fit.
