Work Plan
Quantitative Methods
Multivariate Data Analysis
2016/2017
Pedro Campos / Paula Brito
DATA ARRAY
n "individuals" in rows
p variables (attributes) in columns
     nb. children   weight   gender   education
I1   2              52       1        2
I2   1              55       1        3
I3   0              50       1        2
I4   3              60       2        1
Data array - Example
The table below records, for some Portuguese towns, the percentage of workers in industry, the number of ATM machines, and the number of available sports facilities (old data...).

            Workers in industry (%)   ATM machines   Sports facilities
Aveiro      47.07                     36             81
Beja        10.35                     12             52
Braga       46.81                     50             125
Guimarães   79.36                     34             103
Portimão    6.07                      19             104
DATA ARRAY
X      Y1     Y2     ...   Yj     ...   Yp
I1     x_11   x_12   ...   x_1j   ...   x_1p
I2     x_21   x_22   ...   x_2j   ...   x_2p
...    ...    ...    ...   ...    ...   ...
Ii     x_i1   x_i2   ...   x_ij   ...   x_ip
...    ...    ...    ...   ...    ...   ...
In     x_n1   x_n2   ...   x_nj   ...   x_np
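As a minimal illustration (hypothetical code, not part of the slides), such an n x p data array can be held in a pandas DataFrame, here filled with the Portuguese-towns example above:

import pandas as pd

# The towns example above as an n x p data array: individuals (towns) in rows,
# variables in columns. Column names are our own choice.
towns = pd.DataFrame(
    {
        "workers_in_industry_pct": [47.07, 10.35, 46.81, 79.36, 6.07],
        "atm_machines": [36, 12, 50, 34, 19],
        "sports_facilities": [81, 52, 125, 103, 104],
    },
    index=["Aveiro", "Beja", "Braga", "Guimarães", "Portimão"],
)

print(towns.shape)       # (n, p) = (5, 3)
print(towns.describe())  # basic univariate summaries, one column per variable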
VARIABLES
• Numerical (Quantitative)
Their values are real numbers.
• Discrete: the value set is finite or countably infinite.
Ex.: number of children; number of times you use the cell phone each day
• Continuous: the value set is infinite and uncountable.
Ex.: height, weight, temperature
VARIABLES
• Numerical
• Interval scale: the scale has no absolute zero, so only differences between values are meaningful.
Ex.: temperature in °C or °F
• Ratio scale: the scale has an absolute zero, so exact ratios between values are meaningful.
Ex.: weight
VARIABLES
• Categorical
Their values - categories, or modalities - are not real numbers, although numerical codes may be used to represent them.
• Ordinal: the values are naturally ordered.
Ex.: education level
• Nominal: the values are not ordered.
Ex.: nationality, job
Multivariate Data Analysis
Multivariate data analysis comprises a set of statistical methods used to analyse several variables jointly, observed on each individual or object.
Multivariate Data Analysis
Steps of a multivariate data analysis:
1. Establish the objectives of the analysis;
2. Design the analysis (sample size, type of variables, statistical methods, ...);
3. Check the hypotheses/assumptions of the selected methods;
4. Perform the analysis (estimation of the multivariate model);
5. Interpret the obtained results (this often leads to a reformulation of the model);
6. Validate the results.
Multivariate Data Analysis
There are two main groups of multivariate methods:
• Dependence methods
• Interdependence methods
Main Techniques of Multivariate Analysis
Dependence methods assume a division of the variables into two groups, the dependent and the independent variables; the objective is to assess whether, and how, the independent variables influence the dependent ones.
Interdependence methods make no distinction between dependent and independent variables; the objective is to determine which variables are related, how they are related, and why.
Examples of Dependence Methods
Quantitative dependent variable:
• Multiple Linear Regression
• Multivariate Analysis of Variance (MANOVA)

Qualitative dependent variable:
• Discriminant Analysis
• Logistic Regression
Interdependence Methods
Quantitative data:
• Factor Analysis
• Principal Component Analysis
• Canonical Correlation
• Cluster Analysis

Qualitative data:
• Log-linear models
• Multiple Correspondence Analysis (HOMALS)
• CatPCA
One view of multivariate methods…
What type of relation?

Dependence (supervised methods):
  How many variables to predict?
  - Several dependent variables in one relation:
      What is the type of the dependent variables?
      - Quantitative: what is the type of the independent variables?
          - Quantitative → Multivariate Regression
          - Qualitative → Multivariate Analysis of Variance (MANOVA)
  - One dependent variable in one relation:
      What is the type of the dependent variable?
      - Quantitative → Multiple Regression
      - Qualitative → Discriminant Analysis, Logistic Regression

Interdependence (non-supervised methods):
  Is the relation between variables or between cases?
  - Variables → Factor Analysis, Principal Component Analysis, Canonical Correlation
  - Cases → Cluster Analysis
Multiple Regression
Goal: explain the behaviour of one or more variables as a function of other variables.

Dependent variables: quantitative
Independent variables: quantitative, or qualitative converted into binary (dummy) variables

Example:
Create a model to predict profitability as a function of equity, return on equity and solvency.

Linear Regression: the relation between the variables can be described by a linear function
(if there is only one independent variable → a straight line).
Multiple Regression
Model:
Y = β0 + β1 X1 + β2 X2 + ... + βp Xp + ε

For each case:
Yi = β0 + β1 Xi1 + β2 Xi2 + ... + βp Xip + εi   (i = 1, ..., n)

βj - regression coefficients; εi - residuals

Assumptions:
Only Y is affected by measurement errors.
The residuals εi are random, independent, and Normally distributed with zero mean and constant variance: εi ~ N(0, σ²).
The residuals εi are uncorrelated with the independent variables X1, ..., Xp.
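As a minimal sketch (not taken from the slides), the model can be fitted by ordinary least squares with NumPy; the data below are randomly generated purely for illustration:

import numpy as np

# Ordinary least squares for Y = b0 + b1*X1 + ... + bp*Xp + error,
# on randomly generated data (illustration only).
rng = np.random.default_rng(0)
n, p = 100, 3
X = rng.normal(size=(n, p))                   # independent variables
beta_true = np.array([1.0, 2.0, -0.5, 0.7])   # b0, b1, b2, b3
y = beta_true[0] + X @ beta_true[1:] + rng.normal(scale=0.3, size=n)

X_design = np.column_stack([np.ones(n), X])   # add the intercept column
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)
residuals = y - X_design @ beta_hat

print("estimated coefficients:", beta_hat.round(3))
print("residual variance:", residuals.var(ddof=p + 1).round(4))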
Analysis of Variance – (M)ANOVA
Goal: verify whether the behaviour of one (or more) numerical variables depends on qualitative variables - factors.

Dependent variables: quantitative
Independent variables: qualitative

Example:
Verify whether the sales of specific products depend on the size and location of the store.

Assumptions: the numerical (dependent) variables follow a Normal distribution in each population, and the variances in the different populations are equal.
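A minimal one-way ANOVA sketch (with made-up data, not the slides' example) is shown below; for several dependent variables at once, a MANOVA routine such as the one in statsmodels would be used instead:

import numpy as np
from scipy import stats

# Does a numerical variable (sales) differ across the levels of a factor (store size)?
rng = np.random.default_rng(1)
sales_small = rng.normal(loc=100, scale=15, size=30)
sales_medium = rng.normal(loc=110, scale=15, size=30)
sales_large = rng.normal(loc=125, scale=15, size=30)

f_stat, p_value = stats.f_oneway(sales_small, sales_medium, sales_large)
print(f"F = {f_stat:.2f}, p-value = {p_value:.4f}")
# A small p-value suggests that mean sales differ across store sizes.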
Discriminant Analysis (Linear)
Dependent variable: qualitative; its categories are the groups to discriminate
Independent variables: quantitative

Examples:
A marketing department wants to find characteristics of customers that distinguish buyers from non-buyers of certain products, and to use this information to predict the behaviour of new customers.
A bank needs to find indicators that identify successful firms (and those that fail), and to use this information to make decisions about loans.
Discriminant Analysis (Linear)
Goals:
Identify the variables that best distinguish the groups;
Use these variables to build an index that summarizes the differences between the groups;
Use the identified variables and the index to create a rule for classifying future observations into one of the groups.

Assumptions:
The explanatory (independent) variables follow a multivariate Normal distribution in each group;
The variance-covariance matrices are equal in all groups.
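A minimal sketch of linear discriminant analysis with scikit-learn (on made-up data, not the slides' example):

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Two groups (e.g. buyers vs non-buyers) described by two quantitative variables.
rng = np.random.default_rng(2)
cov = [[1.0, 0.3], [0.3, 1.0]]
X_buyers = rng.multivariate_normal([2.0, 3.0], cov, size=50)
X_non_buyers = rng.multivariate_normal([0.0, 0.5], cov, size=50)
X = np.vstack([X_buyers, X_non_buyers])
y = np.array([1] * 50 + [0] * 50)                 # group labels

lda = LinearDiscriminantAnalysis().fit(X, y)
print("discriminant coefficients:", lda.coef_.round(3))     # the discriminant "index"
print("predicted group for a new customer:", lda.predict([[1.5, 2.0]]))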
Discriminant Analysis (Linear)
When the assumptions are not met:

The variance-covariance matrices are NOT equal in all the groups:
→ Quadratic Discriminant Analysis (requires large samples).

The explanatory (independent) variables deviate strongly from the Normal distribution:
→ Logistic Regression.
Factorial Analysis
Applies to quantitative (numerical) variables.

Objective: identify a small number of factors that explain the relations between the variables.

Example: the sales values of different products may be explained by common factors such as quality, utility, etc.

Factorial Analysis identifies underlying factors, which cannot be directly observed.
Factorial Analysis
The observed correlations between the variables are thus due to the fact that they "share" these factors.

Analysis of the correlation matrix:
the factorial model only makes sense if the variables are indeed correlated;
if the correlations are very low, it is unlikely that the variables share common factors.
Factorial Analysis
In general, the model is written as:

Yj = aj1 F1 + aj2 F2 + ... + ajk Fk + Uj

F1, F2, ..., Fk - common factors
Uj - specific factor
aj1, aj2, ..., ajk - loadings
Factorial Analysis
It is assumed that:
a) the observed variables Yj, the common factors and the specific factors have zero mean;
b) the specific factors are uncorrelated among themselves and uncorrelated with the common factors.

Orthogonal model:
c) the common factors are uncorrelated among themselves and have unit variance.
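A minimal sketch (illustrative only, not the slides' procedure) of fitting such an orthogonal factor model with scikit-learn, on randomly generated data:

import numpy as np
from sklearn.decomposition import FactorAnalysis

# Simulate data that follows the model Y_j = a_j1 F_1 + ... + a_jk F_k + U_j.
rng = np.random.default_rng(3)
n, p, k = 200, 6, 2
F = rng.normal(size=(n, k))                  # unobserved common factors
A = rng.normal(size=(p, k))                  # true loadings
Y = F @ A.T + 0.5 * rng.normal(size=(n, p))  # observed variables = common part + specific part

fa = FactorAnalysis(n_components=k).fit(Y)
print("estimated loadings (a_jk), one row per variable:")
print(fa.components_.T.round(2))
print("specific variances Var(U_j):", fa.noise_variance_.round(2))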
Factorial Analysis
Example (Sharma):
Consider students' marks in 6 subjects:
Mathematics, Physics, Chemistry, English, History and French.

Each mark may be written as a function of:
- the student's overall intelligence/ability - common factor
- the opposition between quantitative ability and verbal ability - common factor
- the aptitude for the subject - specific factor
Example - cont.
Correlation matrix between marks (given):

      M      P      C      E      H      F
M     1
P     0.62   1
C     0.54   0.51   1
E     0.32   0.38   0.36   1
H     0.284  0.351  0.336  0.686  1
F     0.37   0.43   0.405  0.73   0.735  1
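As a sketch, two factors can be extracted from this correlation matrix by the principal-component method (loading = eigenvector x sqrt(eigenvalue)); depending on the extraction method and sign conventions used, the values will not match the slide's loadings exactly:

import numpy as np

# Correlation matrix from the slide (lower triangle mirrored to a full matrix).
R = np.array([
    [1.000, 0.620, 0.540, 0.320, 0.284, 0.370],
    [0.620, 1.000, 0.510, 0.380, 0.351, 0.430],
    [0.540, 0.510, 1.000, 0.360, 0.336, 0.405],
    [0.320, 0.380, 0.360, 1.000, 0.686, 0.730],
    [0.284, 0.351, 0.336, 0.686, 1.000, 0.735],
    [0.370, 0.430, 0.405, 0.730, 0.735, 1.000],
])

eigvals, eigvecs = np.linalg.eigh(R)               # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]                  # sort them in descending order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

loadings = eigvecs[:, :2] * np.sqrt(eigvals[:2])   # loadings on F1 and F2
for subject, (a1, a2) in zip(["M", "P", "C", "E", "H", "F"], loadings):
    print(f"{subject}: {a1:+.3f} F1 {a2:+.3f} F2")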
Factorial Analysis
Mathematics = 0.675 F1 + 0.557 F2 + Ap_M
Physics     = 0.717 F1 + 0.447 F2 + Ap_P
Chemistry   = 0.683 F1 + 0.418 F2 + Ap_C
English     = 0.793 F1 - 0.410 F2 + Ap_E
History     = 0.774 F1 - 0.461 F2 + Ap_H
French      = 0.837 F1 - 0.359 F2 + Ap_F

(Ap_j denotes the specific "aptitude" factor of each subject.)
Factorial Analysis
The correlations between the observed variables
and the common factors (standardized principal
components) are given by the pattern loadings.
→ Interpretation of the factors
Factorial Analysis
Methods for factor extraction:
Principal Components
Principal Axis Factoring
Unweighted Least Squares
Generalized Least Squares
Maximum Likelihood
Alpha Factoring
Image Factoring
Principal Component Analysis
Principal components:
New variables;
Linear combinations of the original variables, uncorrelated with one another, and with maximal variance;
They are obtained from the eigenvectors of the correlation matrix associated with the largest eigenvalues.
Principal Component Analysis
If an important part of the dispersion is explained by a small number of principal components, then we may use just those components for interpretation and further analysis, instead of the original p variables.

How many components should be kept?
What percentage of the dispersion are we willing to sacrifice?
How much of it is just "noise"?
Principal Component Analysis
1) Pearson's criterion:
Keep a number q of components that together explain at least 80% of the total dispersion.

2) Scree plot ("elbow rule"):
Plot the eigenvalues in decreasing order and keep the components up to the point where the differences between successive eigenvalues become small.

3) Kaiser's criterion:
Keep only the components with eigenvalue above 1, i.e., the principal components that are "more informative" than the original standardized variables, whose variance is 1.

Criteria 1) and 3) are applied numerically in the sketch below.
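A minimal sketch (illustrative data) of principal components on standardized variables, with the 80% rule and Kaiser's criterion applied to the eigenvalues:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))   # correlated variables

Z = StandardScaler().fit_transform(X)        # standardize: work with the correlation structure
pca = PCA().fit(Z)

eigenvalues = pca.explained_variance_                 # variances of the components
cum_share = np.cumsum(pca.explained_variance_ratio_)  # cumulative explained dispersion

q_pearson = int(np.searchsorted(cum_share, 0.80) + 1) # smallest q reaching 80%
q_kaiser = int((eigenvalues > 1).sum())               # Kaiser: eigenvalues above 1

print("eigenvalues:", eigenvalues.round(2))
print("components kept by the 80% rule:", q_pearson)
print("components kept by Kaiser's criterion:", q_kaiser)
scores = pca.transform(Z)[:, :q_pearson]              # component scores for further analysis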
Factorial Analysis with Qualitative Variables
Specific methods for qualitative variables
• Multiple Correspondence Analysis
• CatPCA
CLUSTER ANALYSIS
Marketing:
Potential clients: socio-economic characteristics, preferences
→ Identification of market segments

Finance:
Companies: financial indicators
→ Typology of companies?
CLUSTER ANALYSIS
Applies to elements described by numerical or binary variables (not both simultaneously).

Objective:
Given n objects described by p variables, e.g.
  Potential clients - socio-economic characteristics, past expenses
  Companies - financial indicators
  Cities - social structure, facilities
  ...
determine a CLUSTERING, i.e., structure the objects into classes.
CLUSTER ANALYSIS
The objective is to group the objects into classes such that:
- elements of a given class are quite similar to each other - homogeneous classes;
- classes are "relatively distinct" from each other - well separated classes.
Clustering Models
Partition
Disjoint classes which together cover the whole set to
be clustered
Clustering Models
Hierarchical Models
Classes are organized in a nested structure
Comparing elements
A comparison measure between pairs of elements of the set to be clustered must be selected.

Examples of measures for numerical data (see the sketch below):
- Euclidean distance
- Manhattan, or City-Block, distance
- Mahalanobis distance
- ...
Consider standardizing the variables first.

Many measures exist for binary variables.
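A minimal sketch (using three towns from the earlier example) of pairwise distances on standardized data:

import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.preprocessing import StandardScaler

X = np.array([[47.07, 36, 81],     # Aveiro
              [10.35, 12, 52],     # Beja
              [46.81, 50, 125]])   # Braga

Z = StandardScaler().fit_transform(X)   # variables on very different scales: standardize first

d_euclid = squareform(pdist(Z, metric="euclidean"))
d_city = squareform(pdist(Z, metric="cityblock"))   # Manhattan / City-Block

print("Euclidean distances:\n", d_euclid.round(2))
print("City-Block distances:\n", d_city.round(2))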
Hierarchical Clustering
Hierarchical model:
Set of nested partitions
Dendrogram
Example
CITY             BUYING POWER   BASKET    RENT      HOTEL NIGHT   TAXI     AVERAGE INCOME
Amsterdam        78.00          1339.00   520.00    10.58         286.00   16486.00
Caracas          14.30          795.00    210.00    2.96          148.00   1910.00
Chicago          99.70          1474.00   900.00    5.00          218.00   25129.00
Helsinki         54.80          1597.00   570.00    7.56          194.00   13463.00
Houston          96.30          1314.00   430.00    6.00          149.00   21997.00
Jakarta          18.10          1035.00   980.00    1.42          245.00   3253.00
London           59.90          1354.00   810.00    7.16          375.00   13348.00
Luxembourg       114.00         1371.00   1080.00   8.92          227.00   24564.00
Rio de Janeiro   22.20          1067.00   450.00    2.58          194.00   3900.00
Zurich           100.00         1946.00   740.00    14.36         287.00   32420.00
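A minimal sketch of hierarchically clustering these ten cities (the distance, linkage and standardization choices here are ours, so the two classes obtained may or may not coincide exactly with the grouping on the next slide):

import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from sklearn.preprocessing import StandardScaler

cities = ["Amsterdam", "Caracas", "Chicago", "Helsinki", "Houston",
          "Jakarta", "London", "Luxembourg", "Rio de Janeiro", "Zurich"]
X = np.array([
    [78.00, 1339.00, 520.00, 10.58, 286.00, 16486.00],
    [14.30, 795.00, 210.00, 2.96, 148.00, 1910.00],
    [99.70, 1474.00, 900.00, 5.00, 218.00, 25129.00],
    [54.80, 1597.00, 570.00, 7.56, 194.00, 13463.00],
    [96.30, 1314.00, 430.00, 6.00, 149.00, 21997.00],
    [18.10, 1035.00, 980.00, 1.42, 245.00, 3253.00],
    [59.90, 1354.00, 810.00, 7.16, 375.00, 13348.00],
    [114.00, 1371.00, 1080.00, 8.92, 227.00, 24564.00],
    [22.20, 1067.00, 450.00, 2.58, 194.00, 3900.00],
    [100.00, 1946.00, 740.00, 14.36, 287.00, 32420.00],
])

Z = StandardScaler().fit_transform(X)               # variables have very different scales
link = linkage(Z, method="ward")                    # nested partitions (the dendrogram)
labels = fcluster(link, t=2, criterion="maxclust")  # cut the tree into 2 classes

for city, label in zip(cities, labels):
    print(label, city)
# dendrogram(link, labels=cities) would draw the tree (requires matplotlib).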
Example
Class 1: Amsterdam, Chicago, Helsinki, Houston, Luxembourg, Zurich
Class 2: Caracas, Jakarta, London, Rio de Janeiro

Class means:
CLASS     BUYING POWER   BASKET    RENT     HOTEL NIGHT   TAXI     AVERAGE INCOME
class 1   90.47          1506.83   706.67   8.74          226.83   22343.17
class 2   28.63          1062.75   612.50   3.53          240.50   5602.75
Non-Hierarchical Clustering
Objective:
Determine (directly) a partition P = {C1, ..., Ck}, i.e., a family of k classes that do not intersect and that jointly cover the whole set Ω:
Ci ∩ Cj = ∅ for i ≠ j, and C1 ∪ ... ∪ Ck = Ω.
K-Means method
Fix the number of clusters, k.
Starting from a set of k initial centers (elements of the set Ω to be clustered), assign each element to the class with the nearest center.
After each assignment, the cluster center is re-computed.
After all elements have been assigned, the procedure may be iterated.
Also known as the moving-centers method.
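A minimal k-means sketch with scikit-learn, on artificial data (scikit-learn implements the standard Lloyd-type k-means, which is closely related to the moving-centers procedure described above):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(loc=0.0, size=(50, 2)),
               rng.normal(loc=5.0, size=(50, 2))])   # two artificial groups

Z = StandardScaler().fit_transform(X)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(Z)   # k fixed in advance

print("first cluster labels:", km.labels_[:10])
print("cluster centers (standardized units):")
print(km.cluster_centers_.round(2))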
Hierarchical vs Non-Hierarchical Clustering

Hierarchical                               Non-hierarchical
Series of nested "solutions"               One single solution
No need to fix the number of clusters      The number of clusters must be fixed
Solution is not revised once built         Solution is iteratively optimized
Computationally very heavy                 Computationally "lighter": fewer calculations and comparisons
Not indicated for large datasets           Indicated for large datasets
Combining Factorial Analysis and Clustering
Determine the principal components;
Select the relevant ones;
Cluster the data using the values of the principal components (or factor scores) instead of the original data (see the sketch below).
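A minimal sketch of this pipeline (illustrative data): keep enough components for 80% of the dispersion, then cluster the component scores:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)
X = rng.normal(size=(200, 8)) @ rng.normal(size=(8, 8))   # correlated quantitative variables

pipe = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.80),                 # keep enough components for 80% of the dispersion
    KMeans(n_clusters=3, n_init=10, random_state=0),
)
labels = pipe.fit_predict(X)                # cluster the component scores, not the raw data
print("components kept:", pipe.named_steps["pca"].n_components_)
print("first cluster labels:", labels[:10])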
Clustering with qualitative data
Do not apply a clustering method directly to the raw categories!
Perform a Multiple Correspondence Analysis;
Select the relevant dimensions;
Cluster the data using the values of those dimensions (the factor scores) instead of the original data (see the sketch below).
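A minimal sketch of this idea, with MCA computed by hand as a correspondence analysis of the indicator (dummy) matrix; the data frame and its columns are hypothetical, and dedicated packages (e.g. prince) provide MCA directly:

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

def mca_row_coordinates(df, n_components=2):
    """Row (individual) coordinates of an MCA, via SVD of the standardized indicator matrix."""
    Z = pd.get_dummies(df).to_numpy(dtype=float)         # indicator (disjunctive) matrix
    P = Z / Z.sum()                                      # correspondence matrix
    r, c = P.sum(axis=1), P.sum(axis=0)                  # row and column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))   # standardized residuals
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    F = (U * s) / np.sqrt(r)[:, None]                    # principal row coordinates
    return F[:, :n_components]

# Hypothetical qualitative data: each column is a categorical variable.
df = pd.DataFrame({
    "education": ["low", "high", "medium", "high", "low", "medium"],
    "job":       ["A", "B", "A", "C", "C", "B"],
    "region":    ["north", "south", "north", "south", "north", "south"],
})

scores = mca_row_coordinates(df, n_components=2)              # factor scores
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scores)
print("cluster labels:", labels)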