0% found this document useful (0 votes)
16 views2 pages

Program 5

Uploaded by

Kantara
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
16 views2 pages

Program 5

Uploaded by

Kantara
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd

5.

Analysing categorical data and assessing relationships between variables (Chi-Square)


Step 1: Structure of data
Let’s assume you have two categorical variables as follows:
 Variable 1 (Rows): Customer Satisfaction (Satisfied, Neutral, Unsatisfied)
 Variable 2 (Columns): Age Group (18-29, 30-49, 50+)
Your data (observed frequencies) could look like this:

18-29 30-49 50+ Total (Row Sum)

Satisfied 45 30 25 100

Neutral 35 40 25 100

Unsatisfied 20 30 50 100

Total 100 100 100 300

Step 2: Calculate Expected Frequencies


You will need to calculate the expected frequencies under the assumption that the two variables are
independent.
In Excel, you can calculate this using the following formula in the corresponding cells:
Expected Frequencies= (Row Total) * (Column Total) / (Grand Total)
1. Observed Frequencies: Place your observed frequencies in a table as shown above (A1
).
2. Row and Column Totals: Calculate the row and column totals using the SUM function. For
example:
o Row Sum for Satisfied: =SUM(B2:D2)

o Column Sum for 18-29: =SUM(B2:B4)

Apply the above sum function on rest of the customer satisfaction and Age group.
3. Grand Total: This will be the sum of all observations, which can be calculated as:
= SUM(B5:D5)
4. Expected Frequencies: Create another table for the expected frequencies by placing the
formula for each cell:
o Expected frequency for Satisfied and 18-29

= (B5 * E2) / E5

Step 3: Chi-Square Test Formula in Excel


1. Chi-Square Statistic: For each cell, calculate the Chi-Square statistic using the formula:
In Excel, the formula for the first cell can be:
= (B2 – B7)^2 / B7
Repeat this for all cells and sum them up to get the total Chi-Square statistic:
= SUM(B10:D12)
2. Degrees of Freedom: Degrees of freedom (df) is calculated as:

In Excel, calculate the degrees of freedom as:


= (Number of Rows - 1) * (Number of Columns - 1)
= (3 - 1) * (3 - 1) = 4
3. P-Value Calculation: Finally, calculate the p-value using the CHISQ.DIST.RT function in
Excel, which gives the right-tailed Chi-Square distribution:
= CHISQ.DIST.RT(Chi-Square Statistic, Degrees of Freedom)
This will give you the p-value to assess whether there is a statistically significant relationship
between the two categorical variables.
Step 4: Interpretation
 If the p-value is less than the significance level (commonly 0.05), reject the null hypothesis.
This means that there is a significant relationship between the two variables.
 If the p-value is greater than 0.05, you fail to reject the null hypothesis, meaning there is no
significant relationship between the variables.

This will give you a p-value ≈ 0.00007

 Since the p-value is very small (less than 0.05), you reject the null hypothesis. This
suggests that there is a statistically significant relationship between Customer
Satisfaction and Age Group.

You might also like