Nishant Chahal RM


MAHARAJA SURAJMAL INSTITUTE

RESEARCH METHODOLOGY LAB


PRACTICAL FILE

Submitted to:
Dr. Anupama Sharma
Associate Professor
Department of Business Administration

Submitted by:
Nishant Chahal
BBA (Gen) 3rd Sem, 1st Shift
17714901722

BATCH 2022-25

DEPARTMENT OF BUSINESS ADMINISTRATION

Maharaja Surajmal Institute


Recognized by UGC u/s 2(f), NAAC Accredited ‘A’ Grade
Affiliated to Guru Gobind Singh Indraprastha University, Delhi
C-4, Janakpuri, New Delhi-110058
ACKNOWLEDGEMENT

The success and final outcome of this RM Lab Practical File required a lot of guidance and
assistance from many people, and we are extremely fortunate to have received it throughout the
completion of our work. Whatever we have done is only due to such guidance and
assistance, and we will not forget to thank them. We respect and thank Dr. Anupama Sharma
for giving us the opportunity to do this project work and for providing all the support and guidance
that helped us complete the project on time. We are extremely grateful to her for such
support and guidance.

We are really grateful that we managed to complete this project within the time given by
Dr. Anupama Sharma. Last but not least, we would like to express our gratitude to our
friends and parents for their support and their willingness to spend time with us.
INDEX
Modules     Topics

Module 1    Q1- Define SPSS. Explain its steps.
            Q2- Difference between Data View and Variable View.
Module 2    Q3- Explain the basic elements of SPSS.
            Q4- Describe the functions, advantages, and disadvantages of SPSS.
Module 3    Q5- How to import data in SPSS. Assigning values in variables.
            Q6- Descriptive Analysis through SPSS- Frequency distribution, Measures of
            Central Tendency, Defining Mean, Median, Mode, Minimum, Maximum, etc.
Module 4    Q7- Define graphs. Explain nodes and edges.
            Q8- Explain pie chart, bar graph, histogram, and scatter plots.
Module 5    Q9- Define box plots and also explain the significance of box plots in research.
            Q10- Explain the single box plot and cluster box plot. Also, explain their path
            and steps.
Module 6    Q11- Cross Tabulation- steps and interpretation.
Module 7    Q12- Meaning of correlation, understanding the different categories of
            correlation- high, moderate, low.
            Q13- Steps for computing correlation, cleaning correlation table and its
            interpretation.
Module 8    Q14- Define Skewness and Kurtosis, Types of Kurtosis.
            Q15- Steps to compute Skewness and Kurtosis in SPSS. Understanding computation
            of z value, Shapiro-Wilk test of Normality. Understanding the significance of
            the p value in the test of Normality.
Module 9    Q16- How to compute average mean for each construct variable? Also explain its
            steps showing in data view window. Calculate mean for each variable in the
            study data.
Module 10   Q17- Meaning and significance of reliability in research. Steps to compute
            reliability. Understanding the significance of Cronbach’s Alpha value in
            reliability.
Module 11   Q18- Defining Chi-square, its steps, and understanding the significance of
            p-value in chi-square. Understanding acceptance and rejection of null
            hypothesis based on the p-value.
Module 12   Q19-
            A. Define t test and discuss the procedure for t test with one sample in SPSS
               (car, with ethanol and without ethanol).
            B. To understand the procedure for repeated measures t test (dependent
               samples/paired t test) in SPSS.
            C. To know the procedure for independent groups in t test.
Module 13   Q20- Explain Regression, its significance in research, and steps to compute
            Regression in SPSS. Understanding r value, t test, F test, z test.
Module 14   Q21- ANOVA
MODULE-1

Q1- Define SPSS. Explain its steps.


SPSS is short for Statistical Package for the Social Sciences, and it is used by researchers of
many kinds for complex statistical data analysis. The SPSS software package was created for
the management and statistical analysis of social science data. It was originally launched in
1968 by SPSS Inc. and was acquired by IBM in 2009.
To begin, follow these steps:
Step1- Choose Start→ All Programs→ IBM SPSS Statistics→ SPSS Statistics 27. The SPSS
Welcome dialog appears. This is where you can see what’s new in the software,
provide user feedback, and navigate to data files. Here you will open one of the sample SPSS data files.

Step2- Click the Sample Files tab in the lower-left corner of the dialog.
Step3- Select the bankloan.sav data file and then click Open.

Q2- Differentiate between data view and variable view.


Data View                                    Variable View
Used to display data                         Used to display information on the variables in
                                             the data set
Columns represent variables                  TYPE: Allows for various styles of displaying
                                             MEASURE: Allows choice of measurement scale
Rows represent individual units or groups    LABEL: Allows for longer description of
of units that share common values of         variable name
variables                                    VALUES: Allows for longer description of
                                             variable levels
DATA VIEW

VARIABLE VIEW
MODULE-2

Q3- Describe the basic elements of SPSS.


1) Name: This column holds the unique name of each variable, for example gender, age,
or educational qualification. The name identifies the variable when sorting and
analyzing data. The only restriction is that spaces and most special characters are
not allowed in a name.
2) Label: A longer description of the variable. Unlike the name, it may contain spaces
and special characters.
3) Type: Sets the kind of data the variable holds, which is very useful when different
kinds of data are entered.
4) Width: Sets the maximum number of characters stored for a value.
5) Decimals: Sets how many digits are shown after the decimal point, for example
when entering percentage values.
6) Values: Assigns descriptive labels to coded values, for example 1 = Male and 2 = Female.
7) Missing: Defines the codes that should be treated as missing, so that those values
are left out of the analysis.
8) Align: Sets whether values are aligned left, right, or center in the column.
9) Measure: Sets the level of measurement of the variable: scale, ordinal, or
nominal.
Variable View
Variables are defined in the sheet named “Variable View”. It allows us to customize each
variable as required for analyzing it.
To do so, one needs to fill in the column headings Name, Label,
Type, Width, Decimals, Values, Missing, Columns, Align, and Measure.
These headings are the attributes that characterize the data.
Data View
The Data View is structured as rows and columns: each row is a case and each column is a
variable. Data can be added by importing a file or by entering it manually.

Q4- Describe the functions, advantages and disadvantages of SPSS.


Functions of SPSS
SPSS offers four programs that assist researchers with their complex data analysis
needs:
a) Statistics Program: It furnishes a plethora of basic statistical functions like frequencies
and cross tabulation.
b) Modeler Program: It enables researchers to build and validate predictive models using
advanced statistical procedures.
c) Text Analytics for Surveys Program: It helps survey administrators uncover powerful
insights.
d) Visualization Designer: It allows researchers to use their data to create a wide variety
of visuals like density charts and radial boxplots very easily.

ADVANTAGES:

The advantages of using SPSS compared to other software packages are:
• SPSS is a comprehensive statistical software package.
• Many complex statistical tests are available as built-in features.
• Interpretation of results is relatively easy.
• Data tables are displayed easily and quickly, and can be expanded.
LIMITATIONS
• SPSS can be expensive for students to purchase.
• It usually requires added training to fully exploit all the available features.
• The graph features are not as simple as those of Microsoft Excel.
MODULE-3

Q5- How to import data in SPSS and assign values to variables?
Step1- Create an Excel sheet with the data, coded as follows:-
Gender:- Male(1), Female(2)
Marital Status:- Married(1), Unmarried(2)
Education:- 12th Pass(1), PG(2), UG(3)
Experience:- 0-2 years(1), 3-5 years(2), 6-8 years(3), 9-10 years(4)

Step2- Now import the data from Excel into SPSS

Step3- Now assign value labels to the numeric codes
Step4- The labels will then be reflected in the SPSS file
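
The idea behind Step3 and Step4, attaching a readable label to each numeric code, can be sketched outside SPSS as well. The snippet below is only an illustration in Python: the variable names and the two sample respondents are made up, and only the codes themselves come from Step1.

```python
# Value labels mirroring the coding scheme in Step1 (variable names are assumed).
VALUE_LABELS = {
    "Gender": {1: "Male", 2: "Female"},
    "MaritalStatus": {1: "Married", 2: "Unmarried"},
    "Education": {1: "12th Pass", 2: "PG", 3: "UG"},
    "Experience": {1: "0-2 years", 2: "3-5 years", 3: "6-8 years", 4: "9-10 years"},
}

def label_row(row):
    """Replace each numeric code in a row with its label; unknown codes stay as-is."""
    return {
        variable: VALUE_LABELS.get(variable, {}).get(code, code)
        for variable, code in row.items()
    }

# Two invented respondents, stored as codes the way the Excel sheet stores them.
rows = [
    {"Gender": 1, "MaritalStatus": 2, "Education": 3, "Experience": 1},
    {"Gender": 2, "MaritalStatus": 1, "Education": 2, "Experience": 4},
]
for row in rows:
    print(label_row(row))
```

Keeping the codes in the data and the labels in a separate lookup is the same split SPSS makes between Data View and the Values column of Variable View.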

Q6- Descriptive Analysis through SPSS- Frequency distribution, Measures of Central
Tendency, Defining Mean, Median, Mode, Minimum, Maximum, etc.
Descriptive statistics give a basic, brief view of the data. They provide quick summary
information about aspects of the data, such as how many males and females participated in
the survey, what the average age of respondents is, or what the median salary is. The
standard deviation can additionally be used to describe how spread out the data are.
Frequency Distributions: These display the frequency of occurrence of each value. A
frequency distribution can be represented in tabular or graphical form. For continuous variables
with ratio/interval scales, you may use histograms or frequency polygons. For categorical
variables with nominal/ordinal scales, bar charts can be used.
Measures of Central Tendency:
There are three main measures of central tendency: mean, median and mode. The mean requires
interval or ratio data, whereas the median can also be used with ordinal data and the mode
with any scale. The measures of variability include range, interquartile range, standard
deviation and variance. The term descriptive statistics basically means the summary, analysis
and presentation of a data set derived from a sample of the population.
Mean – It is the arithmetic average of the given data.

Median – The middle value of the data set when it is arranged in ascending order.

Mode – The most frequently occurring value in the data set.

Maximum – The largest value occurring in the data set.

Minimum – The smallest value occurring in the data set.
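
As a rough cross-check of these definitions, the same measures can be computed with Python's standard library. This is a minimal sketch, not SPSS output; the ages are invented for illustration.

```python
from collections import Counter
from statistics import mean, median, multimode

def describe(values):
    """Return the summary measures defined above for a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": multimode(values),   # a list, because a data set can have several modes
        "minimum": min(values),
        "maximum": max(values),
        "frequencies": dict(sorted(Counter(values).items())),
    }

ages = [21, 22, 22, 23, 25, 25, 25, 30]  # illustrative ages, not study data
summary = describe(ages)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The "frequencies" entry plays the role of the frequency table that SPSS prints for each selected variable.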

To perform the descriptive analysis of the SPSS data:-

Step1- Click on the Analyze menu, then on Descriptive Statistics

Step2- Select Frequencies and then the variables for analysis

Step3- The output will then be displayed
MODULE 4

Q7- Define graphs. Explain nodes and edges.

A graph is a mathematical representation of a network, and it describes the relationship between
lines and points. A graph consists of some points and lines between them. The length of the
lines and position of the points do not matter.

Node: A node is also known as a graph vertex. It is a point on which the graph is defined and
which may be connected by graph edges. Each object in a graph is called a node.

Edge: An edge e is a link between two nodes. A link denotes movement between nodes. It has
a direction that is generally represented as an arrow. If an arrow is not used, it means the link
is bidirectional.
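
Nodes and edges map naturally onto an adjacency list. The sketch below is one possible representation in Python; the node names A, B and C are hypothetical.

```python
# Edges as (from_node, to_node) pairs; the node names are hypothetical.
edges = [("A", "B"), ("B", "C"), ("A", "C")]

def build_adjacency(edge_list, directed=True):
    """Map each node to the list of nodes reachable over a single edge."""
    adjacency = {}
    for source, target in edge_list:
        adjacency.setdefault(source, []).append(target)
        adjacency.setdefault(target, [])
        if not directed:
            # A link drawn without an arrow can be traversed both ways.
            adjacency[target].append(source)
    return adjacency

directed_graph = build_adjacency(edges)
undirected_graph = build_adjacency(edges, directed=False)
print(directed_graph)
print(undirected_graph)
```

Note that the representation stores only which nodes are connected, which matches the remark that the length of the lines and the position of the points do not matter.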

Q8- Explain pie chart, bar graph, histogram, scatter plots.

1. Pie Chart:

A pie chart is a circular graphical representation used to display data in a way that illustrates
the proportion of various categories within a whole. The circle represents the whole data set,
and the slices of the pie represent the individual categories or segments. The size of each slice
is proportional to the quantity it represents. Pie charts are most effective when showcasing data
with a relatively small number of categories and when the categories are non-overlapping and
distinct. They help in conveying the distribution of parts within a whole at a glance.

2. Bar Graph:

A bar graph (also known as a bar chart or bar diagram) is a graphical representation of data
using rectangular bars of varying lengths. It's used to compare values across different categories
or groups. Each bar represents a category, and the length or height of the bar is proportional to
the value it represents. Bar graphs can be displayed either horizontally or vertically, with the
categories on one axis and the values on the other. They're useful for showing comparisons,
trends, and patterns within categorical data.
3. Histogram:

A histogram is a graphical representation used to visualize the distribution of continuous data
or a large dataset. It consists of a series of adjacent rectangles, or bins, where the width of each
bin corresponds to a range of values and the height represents the frequency or count of data
points within that range. Histograms help in understanding the shape of the data distribution,
whether it's symmetric, skewed, bimodal, etc. They're particularly useful for identifying
patterns, trends, and outliers in data.
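
The binning step behind a histogram can be sketched in a few lines of Python. The bin start, bin width and data values below are assumptions chosen for illustration; SPSS picks its own bins automatically.

```python
def histogram_counts(values, start, bin_width, n_bins):
    """Count how many values fall in each bin [lower, lower + bin_width)."""
    counts = [0] * n_bins
    for value in values:
        index = int((value - start) // bin_width)
        if 0 <= index < n_bins:       # values outside the covered range are ignored
            counts[index] += 1
    return counts

ages = [21, 22, 22, 23, 25, 25, 25, 30, 34, 38]  # illustrative values only
counts = histogram_counts(ages, start=20, bin_width=5, n_bins=4)
for i, count in enumerate(counts):
    lower = 20 + 5 * i
    print(f"{lower}-{lower + 5}: {'#' * count}")
```

Each printed row corresponds to one rectangle of the histogram, with the number of `#` marks standing in for its height.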
4. Scatter Plot:
A scatter plot is a graphical representation that displays individual data points as dots on a two-
dimensional plane. It's used to visualize the relationship between two continuous variables.
Each dot on the scatter plot represents a single data point with a specific value for each variable.
The position of the dot is determined by the values of the two variables it represents. Scatter
plots are helpful in identifying correlations, patterns, clusters, and outliers within data. They
can provide insights into how two variables interact or influence each other.
MODULE 5

Q9- Define boxplot and also explain the significance of boxplot in research.

Ans) A boxplot, also termed a box-and-whisker plot, displays the distribution of data in a
standardized way using the five-number summary: minimum, Q1 (first quartile), median,
Q3 (third quartile), and maximum.

A box plot is thus a chart that shows data from a five-number summary, including one of the
measures of central tendency (the median). It does not show the distribution in as much detail
as a stem-and-leaf plot or histogram does, but it is primarily used to indicate whether a
distribution is skewed and whether there are potential unusual observations (also called
outliers) present in a data set. Box plots are also very beneficial when a large number of data
sets are involved or compared.

SIGNIFICANCE

It is used to know:-

• The outliers and their values
• Symmetry of data
• Tight grouping of data
• Data skewness, and if so in which direction and how strongly
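
The five-number summary and the usual outlier rule can be sketched in Python as follows. Two assumptions are worth flagging: the quartiles use the standard library's "inclusive" method, which may differ slightly from the quartile method SPSS applies, and outliers are flagged with the common 1.5 × IQR rule. The ages are invented.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

def find_outliers(values):
    """Flag values more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]

ages = [21, 22, 23, 24, 25, 26, 27, 28, 60]  # illustrative, with one extreme value
print(five_number_summary(ages))
print(find_outliers(ages))
```

Here the box would span Q1 to Q3, the line inside it would sit at the median, and the extreme age would be drawn as a separate point beyond the whisker.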

Q10- Explain single box plot and cluster box plot and also explain the path and steps.

Ans) A single box plot is a graph displaying data from one quantitative variable. It is also
known as a “box-and-whisker plot”. The box represents the middle 50% of observed values. The
bottom of the box is the first quartile (25th percentile) and the top of the box is the third
quartile (75th percentile).

When a box plot is designed for a data set with two or more categorical variables, one may
need to group/cluster some of the boxes by category. Such a grouped box plot is
called a clustered box plot.

Steps:

Step 1 - Create data file
Step 2 - Click on Chart Builder (Graphs -> Chart Builder)
Step 3 - Click Boxplot and take gender on the x axis and age on the y axis
Step 4 - Click OK to view the result.
Step 5 - Now right click on the diagram; a dialog box will open. Click on data value
labels and select the options that are ticked in the given image.
Step 6 - Click Apply and a median line will be seen on the box plot.
MODULE-6

Q11- Cross Tabulation-steps and its interpretation

Definition: To describe a single categorical variable, we use frequency tables. To describe the
relationship between two categorical variables, we use a special type of table called a cross-
tabulation (or "crosstab" for short). In a cross-tabulation, the categories of one variable
determine the rows of the table, and the categories of the other variable determine the
columns. The cells of the table contain the number of times that a particular combination of
categories occurred. The "edges" (or "margins") of the table typically contain the total
number of observations for that category.
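
To make the structure of a crosstab concrete, the sketch below counts category pairs in plain Python. The two variables and their six responses are invented; SPSS produces the same kind of table through Analyze -> Descriptive Statistics -> Crosstabs.

```python
from collections import Counter

def crosstab(row_values, col_values):
    """Count how often each (row category, column category) pair occurs."""
    cells = Counter(zip(row_values, col_values))
    row_totals = Counter(row_values)
    col_totals = Counter(col_values)
    return cells, row_totals, col_totals

# Made-up responses for two categorical variables.
gender = ["Male", "Female", "Male", "Female", "Male", "Female"]
orders = ["Always", "Never", "Always", "Always", "Never", "Never"]

cells, row_totals, col_totals = crosstab(gender, orders)
for row in sorted(row_totals):
    counts = [cells[(row, col)] for col in sorted(col_totals)]
    print(row, counts, "total:", row_totals[row])
print("column totals:", [col_totals[col] for col in sorted(col_totals)])
```

The `cells` counter holds the body of the table, while `row_totals` and `col_totals` are the margins described above.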

Sandhya Gupta wants to see the cross-tabulation of information availability (cause) and online
ordering (effect) by the respondents.

Information Availability     Online Ordering
Not Important                Never
Less Important               Occasionally
Important                    Considerably
Very Important               Almost always
Extremely Important          Always

Step1- Create Data File


Step2- Import the file in SPSS

Step3- Put the values of the variables in Variable View


Data View:

Step4- Insert crosstabs and put the rows and columns from the data

Step5- Output of the cross tabulation


Step6- Select both the variables

Step7- Select Chi square in crosstabs statistics

Step8- Output obtained


Interpretation: The above table shows the cross tabulation of the importance respondents
place on information availability against their frequency of online ordering.
MODULE 7
Q12- Meaning of correlation, understanding the different categories of correlation-
high, moderate, low.

Correlation is a statistical measure that expresses the extent to which two variables are
linearly related (meaning they change together at a constant rate). It’s a common tool for
describing simple relationships without making a statement about cause and effect.
Correlation is measured by the correlation coefficient, which is very easy to calculate in
SPSS. Before calculating the correlation in SPSS, we should have
some basic knowledge about correlation.

The correlation coefficient always lies in the range of -1 to 1.

Assumptions:

• Related pairs
• Data should be ratio or interval in nature
• Scores of each variable should be normally distributed
• Linear relationship between the two variables
• Homoscedasticity- the variability in scores for one variable is approximately the same
at all values of the other variable.

Types of correlation:

1. Bivariate correlation- It signifies a correlation between 2 continuous variables and is
the most common measure of linear relationships. The possible values in this
correlation range from -1 to +1. The value indicates the strength of the relationship and
the sign (- or +) indicates the direction.
2. Partial correlation- Partial correlation gives a single measure of linear association
between two variables while adjusting for the effects of one or more additional
variables. In case the assumptions discussed above cannot be met, the non-
parametric Spearman rank-order correlation can be used.

Degree Of Correlation:

• r = +1 : Perfect positive correlation
• 0.75 ≤ r < 1 : High degree of positive correlation
• 0.5 ≤ r < 0.75 : Moderate positive correlation
• 0.25 ≤ r < 0.5 : Low degree of positive correlation
• 0 ≤ r < 0.25 : No correlation or absence of correlation
• The same ranges with a negative sign describe negative correlation
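
A minimal Python sketch of the Pearson coefficient and of the degree bands listed above follows. The cut-offs 0.25, 0.5 and 0.75 are taken from the list; the seven paired scores are invented, and the function names are illustrative rather than anything SPSS provides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def degree_of_correlation(r):
    """Classify r using the cut-offs 0.25, 0.5 and 0.75 listed above."""
    size = abs(r)
    direction = "positive" if r > 0 else "negative"
    if abs(size - 1) < 1e-12:
        return f"perfect {direction}"
    if size >= 0.75:
        return f"high {direction}"
    if size >= 0.5:
        return f"moderate {direction}"
    if size >= 0.25:
        return f"low {direction}"
    return "none"

# Invented 1-5 ratings for two variables from seven respondents.
information = [1, 2, 3, 4, 5, 3, 4]
ordering = [1, 2, 2, 4, 5, 3, 5]
r = pearson_r(information, ordering)
print(f"r = {r:.3f} ({degree_of_correlation(r)})")
```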

Q13- Steps for computing correlation, cleaning correlation table and its interpretation.

Mr. X wants to see the scatter plot of information availability (the cause) and online
ordering (the effect) by the respondents, and also to find Pearson’s correlation. The response
scales are given below:
Information Availability     Online Ordering
Not Important                Never
Less Important               Occasionally
Important                    Considerably
Very Important               Almost always
Extremely Important          Always

Step 1- Below is the variable view.

Step 2 - Data View

Step 3 - Go to Analyze tab. Go to correlate and select Bivariate.


Step 4 - Select the fields (Information and Order Frequency) and drag them to Variables box.

Step 5 - Select the Pearson option in correlation coefficients. Click ok and then the output
window will open.
Step 6 -
a. For Scatter plot, go to Graph tab and select chart builder
b. Select Scatter/Dot from “Choose from” menu and double-click on second graph.

Step 7 - Click ok.


INTERPRETATION

• The value of the Pearson correlation is 0.801, which lies between 0.75 and 1; hence
there is a high degree of positive correlation between order frequency and
information.
• The scatter diagram also shows the dispersion of the data.
MODULE-9

Q16- How to compute the mean for each construct variable? Also explain its steps
as shown in the Data View window. Calculate the mean for each variable in the study data.
Step1- Create a data file in Excel and import it in SPSS.

Step2- Add values for Gender[Male(1), Female(2)], Marital Status[Married(1), Un-Married(2)],
Education[Graduate(1), Post Graduate(2), PhD(3)] and Experience[0-2 years(1), 3-5
years(2), 5-7 years(3), 7-9 years(4), Above 10 years(5)]

Step3- Add values for OB1-3, MET1-6 and USE1-3 as given in the image.
Step4- Go to the Transform menu and select Compute Variable
Step5- Calculate the mean of the items of each construct.

Step6- The computed means are displayed as new variables in the Data View.
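
What Compute Variable does here, averaging the items of a construct for each respondent and storing the result in a new column, can be sketched in Python. The item names OB1-OB3 and USE1-USE3 follow the step above, but the scores and the new column names OB_mean and USE_mean are invented for illustration.

```python
from statistics import mean

# Hypothetical item scores for two respondents (numbers invented).
respondents = [
    {"OB1": 4, "OB2": 5, "OB3": 3, "USE1": 2, "USE2": 3, "USE3": 4},
    {"OB1": 2, "OB2": 2, "OB3": 5, "USE1": 5, "USE2": 4, "USE3": 3},
]

# Which items belong to which construct; the new column names are assumptions.
CONSTRUCTS = {
    "OB_mean": ["OB1", "OB2", "OB3"],
    "USE_mean": ["USE1", "USE2", "USE3"],
}

def add_construct_means(row):
    """Return a copy of the row with one averaged column per construct."""
    result = dict(row)
    for new_name, items in CONSTRUCTS.items():
        result[new_name] = mean(row[item] for item in items)
    return result

scored = [add_construct_means(row) for row in respondents]
for row in scored:
    print(row["OB_mean"], row["USE_mean"])
```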


MODULE 10

Q17- Meaning and significance of reliability in research. Steps to compute reliability.
Understanding the significance of Cronbach’s Alpha value in Reliability.

Sandhya wants to conduct research on internet use. She gathers 32 responses on each of 13
variables, rated on the scale 1 (never), 2 (occasionally), 3 (considerably), 4
(almost always), 5 (always). The details of the variables are shown below.

Name       Label

InfoG1     Collecting product/service information and specification
InfoG2     Collecting information of current vendor
InfoG3     Searching and collecting information of new vendor
InfoG4     Collecting competitive and other information for purchase
InfoG5     Cost/Price comparison
InfoE1     Email
InfoE2     Web conferencing with vendors
InfoE3     Electronic Data Interchange (EDI)
InfoE4     Discussion groups
InfoE5     Just in time inventory planning
Online1    Online ordering
Online2    Online status checking
Online3    Online product/service support

❖ Steps

Step 1 - Create data file


Step 2 - Go to analyze -> scale -> reliability analysis

Step 3 - A dialog box will open. Add all the variables to know the reliability.
Step 4 - Click on Statistics and tick the options shown in the dialog box.
Step 5 - Click continue to view results
❖ Interpretation

The value of Cronbach’s alpha should be .7 or more for the data to be reported as reliable.
As can be seen, the value of Cronbach’s alpha is .732, which is above this threshold and
signifies that the data is reliable.
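
For reference, Cronbach's alpha is computed from the number of items k, the variance of each item, and the variance of the respondents' total scores: alpha = k/(k-1) × (1 − sum of item variances / variance of totals). The Python sketch below applies that formula to invented responses; it does not reproduce the .732 from the study data.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns (one list of scores per item)."""
    k = len(items)
    item_variances = sum(variance(column) for column in items)
    totals = [sum(scores) for scores in zip(*items)]   # each respondent's total score
    return (k / (k - 1)) * (1 - item_variances / variance(totals))

# Invented 1-5 responses from five respondents on three items.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.3f}")
print("reliable" if alpha >= 0.7 else "below the 0.7 threshold")
```

Alpha rises when the items move together across respondents, which is why it is read as a measure of internal consistency.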
Module 11

Q18- Defining Chi-square, its steps and understanding the significance of p-value
in chi-square. Understanding acceptance and rejection of null hypothesis based on
p-value.
Respondents were asked their gender and whether or not they are cigarette smokers. There
were 3 choices: smokers, past smokers and non-smokers. Suppose we want to test for
association between gender (male and female) and smoking behaviour using the chi-square
test of independence.
Meaning --- The chi-square test for association is used when you want to check the
association between 2 categorical variables measured on a nominal scale. With 2 variables,
the test can also be interpreted as determining whether the distribution of one variable
differs across the categories of the other. The test is also referred to as the chi-
square test of independence and is also known as the Pearson chi-square test. It is used to
determine whether there is a statistically significant difference between the observed
frequencies and the frequencies that would be expected if the null hypothesis were true.
Problem Statement- To identify the association between gender and smoking
behaviour.

❖ Hypothesis:
H0- There is no significant association between gender and smoking behaviour.
H1- There is a significant association between gender and smoking behaviour.

❖ STEPS
1) Create a data file showing gender and smoking behaviour in Excel
2) Import the Excel file in SPSS

3) Assign values to the data

4) Click on Analyze -> Descriptive Statistics -> Crosstabs

5) Drag and drop smoking behaviour into the Row(s) box and gender into the Column(s) box
6) Click on Statistics and select Chi-square

7) Press Continue and OK to run the chi-square test

8) The result will appear in the SPSS output viewer
❖ INTERPRETATION
The chi-square statistic was used to examine the association between the categorical
variables. There was a significant relationship at the 5% significance level between
gender and smoking behaviour of respondents (χ² = 10.153, df = 2, p = 0.006).

As can be seen from the above table, the p value (0.006) is lower than the
alpha value (0.05). Hence the null hypothesis was rejected and H1 was supported.
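
The statistic itself sums (observed − expected)² / expected over all cells, where each expected count is row total × column total / grand total. The sketch below computes it for a hypothetical 3 × 2 table of counts (the study's own counts are not reproduced here). For exactly 2 degrees of freedom the upper-tail p-value has the closed form exp(−χ²/2), which the helper `p_value_df2` uses; for other degrees of freedom a statistics library or table would be needed.

```python
from math import exp

def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return statistic, df

def p_value_df2(statistic):
    """Upper-tail p-value of a chi-square statistic; valid only for df = 2."""
    return exp(-statistic / 2)

# Hypothetical counts: rows = smoker, past smoker, non-smoker; columns = male, female.
observed = [[30, 10], [15, 15], [20, 30]]
statistic, df = chi_square(observed)
p = p_value_df2(statistic)
print(f"chi-square = {statistic:.3f}, df = {df}, p = {p:.3f}")
print("reject H0" if p < 0.05 else "fail to reject H0")
print(f"p for a statistic of 10.153 with df = 2: {p_value_df2(10.153):.3f}")
```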
MODULE 12
Q19-
A. Define t test and discuss the procedure for the one-sample t test in SPSS (car, with
ethanol and without ethanol).
B. To understand the procedure for the repeated measures t test (dependent
samples/paired t test) in SPSS.
C. To know the procedure for independent groups in t test.

ANS- (A)
• T-test-

T-tests are used to determine whether there is a significant difference between 2 sets of
scores. A t test may be a one-sample, independent groups, or repeated measures test.
• Basic assumptions of t test
1. Data should be at interval or ratio level of measurement.
2. Data should be randomly sampled.
3. Data should be numerical, representing samples from normally distributed populations.

What is one sample T – test?

This test is used when data from a single sample of participants are available and you want to
know whether the mean of the population from which the sample is drawn is the same as a
hypothesized mean.

How to use one sample T – test in SPSS?

Q 1. Indian Oil has developed a formulation with increased use of ethanol in petroleum
products, which increases engine efficiency with less harmful emissions. 30 cars were
test driven with and without the ethanol, and the number of kilometers per litre was
recorded. The cars used for the tests had either automatic or manual transmission.

Car coding: 1 (automatic), 2 (manual)

Steps:

Step 1 – Enter variables in Variable View.
Step 2 – Enter data in Data View.
Step 3 – Click Analyze -> Compare Means -> One-Sample T Test.

Output :
Interpretation:

There are two tables. The table named One-Sample Statistics shows the mean and standard
deviation of the sample along with the standard error of the mean.
In the next table, named One-Sample Test, if the calculated value of t is greater than the table
value, we accept the alternate hypothesis, but if the calculated value of t is less than the
table value, we accept the null hypothesis. In our working example, the
two-tailed significance is less than .05 (p<.05), so the difference between means is
significant. The output indicates that there is a significant difference in engine efficiency
between the previous and current trial. The cars in the current trial have more engine efficiency
than those in the earlier trial, with t(29) = 4.597, p<.05.
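
The t value in the One-Sample Test table is the difference between the sample mean and the hypothesized mean, divided by the standard error of the mean. A minimal Python sketch, using invented kilometres-per-litre readings and a hypothetical benchmark of 15 km/l rather than the study data:

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, hypothesized_mean):
    """Return (t, degrees of freedom) for a one-sample t test."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)   # standard error of the mean
    return (mean(sample) - hypothesized_mean) / standard_error, n - 1

# Invented kilometres-per-litre readings; 15.0 is an assumed benchmark value.
kmpl = [15.2, 16.1, 14.8, 17.0, 16.4, 15.9]
t, df = one_sample_t(kmpl, hypothesized_mean=15.0)
print(f"t({df}) = {t:.3f}")
```

The resulting t is then compared with the table value for n − 1 degrees of freedom, as described in the interpretation above.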

(B).
What is the paired-samples t test in SPSS?
This test is used to ask whether two sets of values are random samples from the same or
different populations.
• If they are random samples from the same population, then any differences across
conditions or groups can be attributed to random sampling variability.
• And, if the two sets of values are random samples from different populations, then you
can attribute any difference between means across conditions to the independent
variable.

The paired t test is used when you have data from one group of participants and each
individual provides two values under different levels of the independent variable.
STEPS:

Step 1 - Click Analyze -> Compare Means -> Paired-Samples T Test.

Step 2 - A dialog box will appear. Select both the variables to study as a
pair, then click OK.

Output:
• Interpretation
The first table, named Paired Samples Statistics, shows the statistics for both the
with-ethanol and without-ethanol conditions.
The next table, Paired Samples Correlations, shows a correlation value of .934, p<.05.
The last table, Paired Samples Test, shows that the two-tailed significance is less than
.05 (p<.05), so the difference between means is significant. The output indicates
that there is a significant difference in engine efficiency between the with-ethanol and
without-ethanol trials. The cars with the ethanol additive have more engine efficiency than
those without ethanol, with t(29) = 3.753, p<.05.
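
A paired t test works on the per-car differences: t is the mean difference divided by its standard error. The sketch below illustrates this in Python with invented readings for five cars; it is not the study data and does not reproduce the reported t(29).

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, degrees of freedom) for a paired (repeated measures) t test."""
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1

# Invented km/l readings for the same five cars under both conditions.
without_ethanol = [14.1, 15.0, 13.8, 16.2, 15.5]
with_ethanol = [14.9, 15.4, 14.6, 16.5, 16.3]
t, df = paired_t(without_ethanol, with_ethanol)
print(f"t({df}) = {t:.3f}")
```

Because each car serves as its own control, only the spread of the differences enters the standard error, not the spread between cars.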

(C).
• Steps for independent-samples t test-

1. Follow the same steps to insert the data.
2. Go to Analyze, then Compare Means, and choose Independent-Samples T Test.
3. Transfer the with-ethanol and without-ethanol variables to Test Variable(s), and car to
Grouping Variable.
4. Then click Define Groups, enter the two group codes (1 and 2), click Continue, press OK
and get the results.
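
For two independent groups, one common form of the statistic pools the two sample variances (the equal-variances version of the test). The Python sketch below assumes that form and uses invented readings for automatic and manual cars; the unequal-variances version would use a different standard error.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group1, group2):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n1, n2 = len(group1), len(group2)
    pooled_variance = (
        (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    ) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / standard_error, n1 + n2 - 2

# Invented km/l readings for cars coded 1 (automatic) and 2 (manual).
automatic = [14.2, 15.1, 13.9, 14.8, 15.4]
manual = [15.0, 16.2, 15.7, 16.8, 15.9]
t, df = independent_t(automatic, manual)
print(f"t({df}) = {t:.3f}")
```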
MODULE 14

Q21- What is ANOVA?


ANOVA – Analysis of Variance
ANOVA is used to test whether the means of several groups differ significantly. The procedure
involves the derivation of two different estimates of population variance from the
data. The F statistic is then calculated from the ratio of these two estimates. One of these
estimates (between-group variance) measures the effect of the independent variable combined
with error variance. The other (within-group variance) measures error variance alone.
The F ratio is the ratio of between-group to within-group variance. In case the null hypothesis
is rejected, i.e. when a significant difference exists, post hoc analysis or other tests need to be
performed to see which groups differ.
The ANOVA test has certain assumptions:
1) Population normality – data is numerical data representing samples from normally
distributed populations
2) Homogeneity of variance – the variances of the groups are “similar”
3) The sizes of the groups are “similar”
4) The groups should be independent

• Homogeneity of variance is an assumption of ANOVA, so a test of homogeneity of
variance is performed. If this assumption is violated, the Brown-Forsythe and
Welch options display alternative versions of the F statistic.
• Homogeneity of Variance: If the significance value is less than 0.05, the variances of the
groups are significantly different.
• Brown-Forsythe and Welch test options: If the significance value is less than 0.05, reject
the null hypothesis.
• ANOVA: If the significance value is less than 0.05, reject the null hypothesis.

Post hoc analysis involves hunting through the data for some significance, which carries a risk
of type 1 errors. Post hoc tests are designed to protect against type 1 errors, given that all the
possible comparisons are going to be made. These tests are stricter than planned comparisons,
so it is more difficult to obtain significance. Some post hoc tests are:
1) Scheffe test – allows every possible comparison to be made but is tough on rejecting
the null hypothesis
2) Tukey test (honestly significant difference test) – more lenient, but the types of
comparison that can be made are restricted. This module will also show the Tukey test.

Q) Vijender Gupta wants to compare the scores of CBSE students from four metro
cities of India, i.e. Delhi, Kolkata, Mumbai and Chennai. He obtained 20 participants by
random sampling from each of the four metro cities, collecting 80 responses. Also note
that this is an independent design, since the respondents are from different cities. He made
the following hypotheses:
Null Hypothesis: There is no significant difference in scores from different metro cities of
India
Alternate Hypothesis: There is a significant difference in scores from different metro cities of
India

Enter the values of city as 1-Delhi, 2-Kolkata, 3-Mumbai, 4-Chennai.

• Fill the data view with the following data

CITY SCORE
1 400
1 450
1 499
1 480
1 495
1 300
1 350
1 356
1 269
1 298
1 299
1 599
1 466
1 591
1 502
1 598
1 548
1 459
1 489
1 499
2 389
2 398
2 399
2 599
2 598
2 457
2 498
2 400
2 300
2 369
2 368
2 348
2 499
2 475
2 489
2 498
2 399
2 398
2 378
2 498
3 488
3 469
3 425
3 450
3 399
3 385
3 358
3 299
3 298
3 389
3 398
3 349
3 358
3 498
3 452
3 411
3 398
3 379
3 295
3 250
4 450
4 400
4 450
4 428
4 398
4 359
4 360
4 302
4 310
4 295
4 259
4 301
4 322
4 365
4 389
4 378
4 345
4 498
4 489
4 456
❖ According to the test of homogeneity of variances, the significance value is 0.077
(p>0.05), which means the variances of the four groups are not significantly
different and the homogeneity of variance assumption is met.

❖ According to the ANOVA, the significance value is 0.015 (p<0.05), which means
the null hypothesis is rejected and the alternative hypothesis, which says there is a
significant difference in scores from different metro cities of India, is
accepted.
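
As a cross-check on the F ratio described earlier, the sketch below computes the between-group and within-group variance estimates for the 80 scores listed in the data table. It is written in plain Python rather than SPSS, the function name is illustrative, and it stops at the F ratio: turning F into a significance value needs an F table or a statistics library (the 5% critical value for 3 and 76 degrees of freedom is roughly 2.7).

```python
# Scores copied from the data table above, grouped by city code.
scores = {
    "Delhi": [400, 450, 499, 480, 495, 300, 350, 356, 269, 298,
              299, 599, 466, 591, 502, 598, 548, 459, 489, 499],
    "Kolkata": [389, 398, 399, 599, 598, 457, 498, 400, 300, 369,
                368, 348, 499, 475, 489, 498, 399, 398, 378, 498],
    "Mumbai": [488, 469, 425, 450, 399, 385, 358, 299, 298, 389,
               398, 349, 358, 498, 452, 411, 398, 379, 295, 250],
    "Chennai": [450, 400, 450, 428, 398, 359, 360, 302, 310, 295,
                259, 301, 322, 365, 389, 378, 345, 498, 489, 456],
}

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    ss_within = sum(
        (value - m) ** 2 for group, m in zip(groups, group_means) for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

f_ratio, df_between, df_within = one_way_anova(list(scores.values()))
print(f"F({df_between}, {df_within}) = {f_ratio:.3f}")
```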
