14-Jan-21
DATA PROCESSING AND ANALYSIS
Lecture 11
DATA ANALYSIS
The purpose of analyzing data is
• To obtain usable and useful information.
• To answer the research questions and to help determine the
trends and relationships among the variables.
The analysis, irrespective of whether the data is
qualitative or quantitative, may:
• describe and summarize the data
• identify relationships between variables
• compare variables
• identify the difference between variables
• forecast outcomes
THREE TYPES OF ANALYSIS
Univariate Analysis
• The examination of the distribution of cases on only one variable at a
time.
• Purpose: mainly Description.
Bivariate Analysis
• The examination of two variables simultaneously.
• Purpose: determining the empirical relationship between two variables.
Multivariate Analysis
• The examination of more than two variables simultaneously.
• Purpose: determining the empirical relationship among multiple
variables.
DATA ANALYSIS
Data analysis is concerned with the analysis of data of any kind,
and by any means.
Types of data analysis
Descriptive Analysis
Inferential Analysis
DESCRIPTIVE ANALYSIS
Refers to the description of the data from a particular sample;
hence the conclusion must refer only to the sample.
In other words, descriptive analyses summarize the data and describe
sample characteristics.
Descriptive statistics are numerical values obtained from the
sample that give meaning to the data collected.
The description may be quantitative (simple summary statistics
about the sample) or visual (simple graphs).
DESCRIPTIVE ANALYSIS OF
UNIVARIATE
A. Frequency Distribution
A systematic arrangement of numeric
values from the lowest to the highest or
highest to lowest.
A frequency distribution groups
respondents into the subcategories into
which a variable can be divided.
Unless you plan not to use the
answers to some of the questions, you
should have a frequency distribution for
all the variables.
Each variable can be specified either
separately or collectively in the frame of
analysis.
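A frequency distribution can be sketched in a few lines of Python. This is only a minimal illustration: the function name `frequency_distribution` and the survey responses are invented for the example and are not part of the lecture.

```python
from collections import Counter

def frequency_distribution(values):
    """Return (value, count, percent) rows ordered from lowest to highest value."""
    counts = Counter(values)
    total = len(values)
    return [(value, count, 100.0 * count / total)
            for value, count in sorted(counts.items())]

# Hypothetical responses on a 1-5 satisfaction scale (invented data).
responses = [3, 5, 4, 3, 2, 3, 5, 4, 4, 3]
table = frequency_distribution(responses)

for value, count, percent in table:
    print(f"{value}: n={count} ({percent:.0f}%)")
```

Sorting the rows gives the lowest-to-highest arrangement described above; the percent column shows how respondents fall into each subcategory.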
DESCRIPTIVE ANALYSIS OF
UNIVARIATE
B. Measure of Central Tendency
• A statistical index that describes the average of a set of values.
Types of Averages
• Mode:
• a numerical value in a distribution that occurs most frequently.
• Median:
• an index of average position in a distribution of numbers.
• Mean:
• the point on the score scale that is equal to the sum of the
scores divided by the total number of scores.
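The three averages can be computed with Python's standard `statistics` module. The scores below are invented for illustration only.

```python
import statistics

# Hypothetical test scores (invented data).
scores = [70, 85, 85, 90, 100]

mode_value = statistics.mode(scores)      # most frequent value
median_value = statistics.median(scores)  # middle value of the sorted scores
mean_value = statistics.mean(scores)      # sum of scores / number of scores

print(mode_value, median_value, mean_value)
```

Here the mode and median are both 85, while the mean is 430 / 5 = 86, so the three averages need not coincide.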
DESCRIPTIVE ANALYSIS OF
UNIVARIATE
C. Measure of Variability
• Statistics that concern the degree to which the scores in a distribution are
different from or like each other.
Range
• The distance between the highest score and the lowest score in a distribution
• Example: the range for learning center A is 500 (750 - 250), and the
range for the second learning center is 300 (650 - 350).
Standard Deviation
• The most commonly used measure of variability; it indicates the average
amount by which the scores deviate from the mean.
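The range and standard deviation can be sketched as follows. Only the highest and lowest scores (750 and 250, 650 and 350) come from the example above; the intermediate scores and the variable names are assumptions made up for the illustration.

```python
import statistics

# Hypothetical score lists; only the minimum and maximum match the slide example.
center_a = [250, 400, 500, 600, 750]
center_b = [350, 450, 500, 550, 650]

range_a = max(center_a) - min(center_a)  # 750 - 250
range_b = max(center_b) - min(center_b)  # 650 - 350

# Sample standard deviation (divides by n - 1).
sd_a = statistics.stdev(center_a)
sd_b = statistics.stdev(center_b)

print(range_a, range_b, round(sd_a, 2), round(sd_b, 2))
```

Both groups have the same mean (500), but the scores in the first group are more spread out, which shows up in both the range and the standard deviation.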
DESCRIPTIVE ANALYSIS OF
BIVARIATE
D. Cross Tables
• To interpret a cross table, first identify the dependent and
independent variables.
• Percentages should be computed in the direction of the independent
variable.
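Computing percentages in the direction of the independent variable means that, within each category of the independent variable, the percentages add up to 100. A minimal sketch, assuming a hypothetical data set in which residence (independent) is crossed with a yes/no answer (dependent); the function name and data are invented:

```python
from collections import Counter

def percentages_by_independent(pairs):
    """pairs: list of (independent, dependent) categories.

    Returns {independent: {dependent: percent}}, where the percentages
    within each independent category sum to 100.
    """
    cell_counts = Counter(pairs)
    totals = Counter(independent for independent, _ in pairs)
    table = {}
    for (independent, dependent), n in cell_counts.items():
        table.setdefault(independent, {})[dependent] = 100.0 * n / totals[independent]
    return table

# Hypothetical respondents (invented data).
pairs = ([("urban", "yes")] * 30 + [("urban", "no")] * 20 +
         [("rural", "yes")] * 10 + [("rural", "no")] * 40)
table = percentages_by_independent(pairs)
print(table)
```

Reading across the categories, 60% of urban respondents answered yes against 20% of rural respondents, which is the comparison the cross table is meant to support.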
INFERENTIAL ANALYSIS
The use of statistical tests, either to test for significant
relationships among variables or to find statistical support
for the hypothesis.
Inferential statistics are numerical values that enable the
researcher to draw conclusions about a population based
on the characteristics of a sample drawn from that population.
• This is based on the laws of probability.
INFERENTIAL ANALYSIS
Level of Significance
• An important factor in determining the representativeness
of the sample and the degree to which chance affects the
findings.
The level of significance is a numerical value selected by the
researcher before data collection to indicate the probability
of erroneous findings being accepted as true.
• This value is represented typically as 0.01 or 0.05.
INFERENTIAL ANALYSIS
Uses of Inferential Analysis
• Some statistical tests used for inferential analysis are cited below.
T-test
• Is used to examine the difference between the means of two
independent groups.
Analysis of Variance (ANOVA)
• Is used to test the significance of differences between means of two or
more groups.
Chi-square
• This is used to test hypotheses about the proportion of elements that fall
into the various cells of a contingency table.
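As an illustration of the chi-square test, the statistic for a contingency table can be computed by comparing each observed cell count with the count expected if the two variables were unrelated. The function name and the 2 x 2 table below are invented for the example.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = two groups, columns = yes / no (invented data).
observed = [[30, 20],
            [10, 40]]
chi2, df = chi_square_statistic(observed)
print(round(chi2, 2), df)
```

With 1 degree of freedom the tabled critical value at the 0.05 level is 3.841, so a statistic of about 16.67 would lead to rejecting the hypothesis of no relationship for these made-up counts.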
INFERENTIAL ANALYSIS
Cross Tabulation
Cross-tabulations analyze two variables, usually
independent and dependent or attribute and
dependent, to determine if there is a relationship
between them.
The subcategories of both the variables are cross-
tabulated to ascertain if a relationship exists
between them. Usually, the absolute number of
respondents, and the row and column percentages,
give you a reasonably good idea as to the possible
association.
DATA ANALYSIS
Linear correlation
• refers to straight-line relationships between two variables.
A correlation can range between -1 (perfect negative
relationship) and +1 (perfect positive relationship), with 0
indicating no straight-line relationship.
Simple linear regression analysis
• is a statistical tool for quantifying the relationship between
just one independent variable (hence "simple") and one
dependent variable based on experience (observations).
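Both ideas can be sketched together: the correlation coefficient measures the strength of the straight-line relationship, and simple linear regression estimates the line itself by least squares. The function name and the data (hours studied versus score) are assumptions for illustration.

```python
import math

def correlation_and_line(x, y):
    """Return (r, intercept, slope) for paired observations x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)        # Pearson correlation, between -1 and +1
    slope = sxy / sxx                     # least-squares slope
    intercept = mean_y - slope * mean_x   # line passes through the two means
    return r, intercept, slope

# Hypothetical observations (invented data).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]
r, intercept, slope = correlation_and_line(hours, score)

predicted_at_6 = intercept + slope * 6
print(round(r, 3), round(intercept, 1), round(slope, 1), round(predicted_at_6, 1))
```

For these made-up numbers r is close to +1, and the fitted line is roughly score = 47.7 + 4.1 x hours.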
HYPOTHESIS – TESTING
PROCEDURES
The outcome of the study may lead the researcher to retain,
revise, or reject the hypothesis, and this determines the
acceptability of the hypothesis and of the theory from
which it was derived.
Steps in testing hypothesis:
Determine the test statistic to be used
Establish the level of significance
Select a one-tailed or two-tailed test
Compute the test statistic
Calculate the degrees of freedom
Obtain the tabled value for the statistical test
Compare the test statistic to the tabled value
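The steps above can be walked through with a small worked example. This sketch assumes an independent-samples t-test with pooled variance, a 0.05 level of significance, and a two-tailed test; the two groups of scores and the function name are invented for illustration.

```python
import math

def pooled_t_statistic(a, b):
    """Return (t statistic, degrees of freedom) for two independent groups."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    ss_a = sum((v - mean_a) ** 2 for v in a)
    ss_b = sum((v - mean_b) ** 2 for v in b)
    df = len(a) + len(b) - 2
    pooled_variance = (ss_a + ss_b) / df
    standard_error = math.sqrt(pooled_variance * (1 / len(a) + 1 / len(b)))
    return (mean_a - mean_b) / standard_error, df

# Steps 1-3: t-test, alpha = 0.05, two-tailed (choices assumed for this example).
ALPHA = 0.05

# Hypothetical scores for two independent groups (invented data).
group_a = [5, 7, 8, 6, 9]
group_b = [3, 4, 6, 5, 2]

# Steps 4-5: compute the test statistic and the degrees of freedom.
t_value, df = pooled_t_statistic(group_a, group_b)

# Step 6: tabled two-tailed critical value of t for alpha = 0.05 and df = 8.
CRITICAL_T = 2.306

# Step 7: compare the test statistic to the tabled value.
reject_null = abs(t_value) > CRITICAL_T
print(round(t_value, 2), df, reject_null)
```

Because the computed t of about 3.0 exceeds the tabled value of 2.306, the hypothesis of no difference between the group means would be rejected for these made-up data.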
MULTIVARIATE ANALYSIS
A collection of procedures for analyzing the
association between two or more sets of
measurements that were made on each object in one
or more samples of objects.
SELECTING A MULTIVARIATE
TECHNIQUE
Dependency
• Dependent variables and independent
variables are present.
Interdependency
• Variables are interrelated without
designating some dependent and others
independent.
DEPENDENCY TECHNIQUES
Multiple Regression
Discriminant Analysis
Multivariate Analysis of Variance (MANOVA)
MULTIPLE REGRESSION
Multiple regression is a measure of relationship that involves a
single dependent variable and two or more independent
variables.
Predict values for a dependent variable by developing a self-
weighting estimating equation.
Control for confounding variables to better evaluate the
contribution of other variables
Test and explain causal theories
• Path Analysis
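An estimating equation with two independent variables can be sketched by solving the least-squares normal equations directly. Everything below is an assumption for illustration: the helper names are hypothetical, and the data were constructed so that y = 1 + 2*x1 + 3*x2 exactly, which lets the recovered weights be checked by eye.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_multiple_regression(rows, y):
    """Return [intercept, b1, b2, ...] minimizing squared prediction error."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * target for d, target in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Invented data generated from y = 1 + 2*x1 + 3*x2 (no noise).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [9, 8, 19, 18, 26]
coefficients = fit_multiple_regression(rows, y)
print([round(c, 6) for c in coefficients])
```

Each weight estimates the contribution of one independent variable while the other is held constant, which is what makes it possible to control for a second variable when evaluating the first.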
DISCRIMINANT ANALYSIS
Classify persons or objects into various
groups
Analyze known groups to determine the
relative influence of specific factors.
Examine whether there are any significant
differences between the groups created.
MULTIVARIATE ANALYSIS OF
VARIANCE (MANOVA)
Assess the relationship between two or more dependent
variables and classificatory variables or factors.
Example:
Measure differences between samples of
Employees
Customers
Manufactured items
Production Parts
INTERDEPENDENCY TECHNIQUES
Factor Analysis
Cluster Analysis
Multidimensional Scaling Analysis (MDS)
FACTOR ANALYSIS
Computational techniques that reduce
variables to a manageable number
• Construction of a new set of variables based on
relationships in the correlation matrix
• Principal Components Analysis
• Communalities
• Rotation
Measurement statistics
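To give a feel for how a new variable is built from the correlation matrix, the sketch below extracts the first principal component by power iteration. Power iteration is a choice made for this illustration, not a method named in the lecture, and the correlation matrix (three variables, all pairwise correlations 0.5) is invented.

```python
import math

def first_principal_component(corr, iterations=100):
    """Return (largest eigenvalue, unit eigenvector) of a correlation matrix."""
    n = len(corr)
    vec = [1.0] + [0.0] * (n - 1)  # arbitrary starting direction
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Rayleigh quotient: variance captured along the final direction.
    eigenvalue = sum(vec[i] * sum(corr[i][j] * vec[j] for j in range(n))
                     for i in range(n))
    return eigenvalue, vec

# Hypothetical correlation matrix for three variables (invented values).
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
eigenvalue, loadings = first_principal_component(corr)
share_of_variance = eigenvalue / len(corr)
print(round(eigenvalue, 4), round(share_of_variance, 4))
```

For this matrix the first component has an eigenvalue of 2 and so accounts for two thirds of the total variance of the three variables, which is the sense in which the variables are reduced to a more manageable number.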
CLUSTER ANALYSIS
Select sample to be clustered
Define measurement variables
Compute similarities among the entities through correlation,
Euclidean distance, and other techniques
Select mutually exclusive clusters
Compare and validate the clusters
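The clustering steps can be sketched with Euclidean distance and a simple k-means procedure. K-means is one possible technique chosen for this illustration; the lecture does not prescribe it. The points, starting centroids, and function names are invented.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def k_means(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: euclidean(p, centroids[i]))
            clusters[nearest].append(p)  # each point joins exactly one cluster
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters, centroids

# Hypothetical entities measured on two variables (invented data).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
clusters, centroids = k_means(points, centroids=[(1, 1), (8, 8)])
print(clusters)
```

Because every point is assigned to its single nearest centroid, the resulting clusters are mutually exclusive; validating them (for example on a second sample) remains a separate step.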
MULTIDIMENSIONAL SCALING
Creates a spatial description of a participant’s
perception of a product, service, or other object
of interest.
ANY QUESTIONS???