Categorical Data Notes

The document discusses categorical data and different ways to represent it visually. It introduces frequency tables, bar charts, pie charts and contingency tables. It also covers marginal and conditional distributions and how to identify if variables are independent using these distributions.

Uploaded by

Quackadack

CHAPTER 3 CATEGORICAL DATA

Create a frequency table

Example: number of students at CB South High School in each grade:

Grade Total
9 520
10 534
11 552
12 515
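
A frequency table is often paired with relative frequencies (each count divided by the grand total). The following is a minimal sketch using the counts above; the variable names are illustrative and not part of the original notes.

```python
# Counts copied from the frequency table above.
grade_counts = {"9": 520, "10": 534, "11": 552, "12": 515}

total_students = sum(grade_counts.values())

# Relative frequency = count / grand total, expressed as a percent.
relative_freq = {
    grade: 100 * count / total_students
    for grade, count in grade_counts.items()
}

print(f"{'Grade':<6}{'Total':>6}{'Percent':>9}")
for grade, count in grade_counts.items():
    print(f"{grade:<6}{count:>6}{relative_freq[grade]:>8.1f}%")
print(f"{'All':<6}{total_students:>6}{100.0:>8.1f}%")
```

The percents should add to 100% apart from rounding, which is a quick check on the arithmetic.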

Categorical Distributions:

1. Bar Chart
[Figure: two bar charts by grade (9th, 10th, 11th, 12th). One is titled "Percent of Students Taking AP Exam" (y-axis: Percent, 0.0% to 50.0%); the other is titled "Number of Students Taking AP Exam" (y-axis: Frequency, 0 to 80).]

2. Pie Chart
[Figure: pie chart titled "Percent of Students Taking AP Exam by Grade Level": 9th 6%, 10th 19%, 11th 32%, 12th 43%.]

3. Contingency tables (aka 2-Way tables): Use the information from CB South High School
and the data provided in the chart to complete the Contingency table.

Frosh Soph Junior Senior Total


Male 270 281 1124
Female 257
Total

Identify:
• Row variable
• Column variable
• Values of the variable
• Total (n)
• # of Cells
• Totals

Example: Hospitals
            Hospital A   Hospital B   Totals
Died            63           16
Survived      2037          784
Totals

• What percent of people died?

• Of those people that went to Hospital A, what percent died?

• Given that someone went to Hospital B, what is the chance that they died?

• Of those people who died, what percent went to Hospital A?

• What percent of people died and went to Hospital B?

• What percent of people survived or went to Hospital A?
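
Each question above is a different fraction of the same table: the numerator and the denominator change depending on whether the question is about everyone, about one hospital, or about one outcome. The following is a minimal sketch of the arithmetic, assuming the four cell counts in the Hospitals table; the helper `pct` and the dictionary keys are illustrative names, not part of the notes.

```python
# Cell counts copied from the Hospitals table.
counts = {
    ("Died", "A"): 63,
    ("Died", "B"): 16,
    ("Survived", "A"): 2037,
    ("Survived", "B"): 784,
}

# Row, column, and grand totals.
n = sum(counts.values())
died = counts[("Died", "A")] + counts[("Died", "B")]
survived = n - died
hospital_a = counts[("Died", "A")] + counts[("Survived", "A")]
hospital_b = n - hospital_a


def pct(part, whole):
    """Return part/whole as a percent."""
    return 100 * part / whole


answers = {
    # Overall: denominator is everyone.
    "died": pct(died, n),
    # Conditional on hospital: denominator is that hospital's total.
    "died given A": pct(counts[("Died", "A")], hospital_a),
    "died given B": pct(counts[("Died", "B")], hospital_b),
    # Conditional on outcome: denominator is everyone who died.
    "A given died": pct(counts[("Died", "A")], died),
    # "And": one cell over the grand total.
    "died and B": pct(counts[("Died", "B")], n),
    # "Or": add both totals, then subtract the overlap counted twice.
    "survived or A": pct(survived + hospital_a - counts[("Survived", "A")], n),
}

for question, value in answers.items():
    print(f"{question:<14}{value:6.2f}%")
```

The "or" question can also be checked from the other side: the only people who neither survived nor went to Hospital A are the 16 who died at Hospital B.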


2 types of Distributions for Categorical Variables

1) MARGINAL DISTRIBUTIONS

• Example: Hair color vs. Gender


         Brown  Blonde  Black  Red  Total
MALE        26      24     10    3     63
FEMALE      20      35     12    6     73
TOTALS      46      59     22    9    136

- Find the marginal distribution for the HAIR COLOR variable

- Find the marginal distribution for the GENDER variable

• Bar Chart or Pie Graph
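
A marginal distribution uses only the totals in the margins of the table, each divided by the grand total. The following is a minimal sketch for the hair color table, assuming the counts shown above; the variable names are illustrative.

```python
hair_colors = ["Brown", "Blonde", "Black", "Red"]

# Cell counts from the hair color vs. gender table (rows are genders).
table = {
    "MALE": [26, 24, 10, 3],
    "FEMALE": [20, 35, 12, 6],
}

n = sum(sum(row) for row in table.values())

# Column totals give the marginal distribution of hair color.
column_totals = [sum(col) for col in zip(*table.values())]
hair_marginal = {
    color: 100 * total / n for color, total in zip(hair_colors, column_totals)
}

# Row totals give the marginal distribution of gender.
gender_marginal = {gender: 100 * sum(row) / n for gender, row in table.items()}

for color, value in hair_marginal.items():
    print(f"{color:<7}{value:5.1f}%")
for gender, value in gender_marginal.items():
    print(f"{gender:<7}{value:5.1f}%")
```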


2) CONDITIONAL DISTRIBUTIONS

Example: Hair Color vs. Gender

         Brown  Blonde  Black  Red  Total
MALE        26      24     10    3     63
FEMALE      20      35     12    6     73
TOTALS      46      59     22    9    136

- Find the conditional Distribution for the HAIR COLOR variable

Brown Blonde Black Red Total


MALE
FEMALE
TOTALs

- Find the conditional Distribution for the GENDER variable

Brown Blonde Black Red Total


MALE
FEMALE
TOTALs
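
A conditional distribution restricts attention to one row or one column and divides by that row's or column's total instead of the grand total. The notes do not say which direction each blank table is meant to hold, so the sketch below computes both and names them explicitly; the variable names are illustrative.

```python
hair_colors = ["Brown", "Blonde", "Black", "Red"]

# Cell counts from the hair color vs. gender table (rows are genders).
table = {
    "MALE": [26, 24, 10, 3],
    "FEMALE": [20, 35, 12, 6],
}

# Hair color within each gender: divide each cell by its ROW total.
hair_given_gender = {
    gender: [100 * count / sum(row) for count in row]
    for gender, row in table.items()
}

# Gender within each hair color: divide each cell by its COLUMN total.
column_totals = [sum(col) for col in zip(*table.values())]
gender_given_hair = {
    gender: [100 * count / total for count, total in zip(row, column_totals)]
    for gender, row in table.items()
}

print("Hair color within each gender (each row sums to 100%)")
for gender, row in hair_given_gender.items():
    print(f"{gender:<7}" + "".join(f"{value:7.1f}%" for value in row))

print("Gender within each hair color (each column sums to 100%)")
for gender, row in gender_given_hair.items():
    print(f"{gender:<7}" + "".join(f"{value:7.1f}%" for value in row))
```

Checking which direction sums to 100% (rows or columns) is a quick way to confirm which variable was conditioned on.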

Represented visually: Segmented Bar Graph


o Each bar = 100%
o Values of variable on the x-axis
o Bars are segmented into parts of each value

Represented Visually: Comparative Pie Charts


o Each Circle = 100%
o Pie pieces in the same order
o Circles are segmented in parts of each value
Independence: If two categorical variables are independent, then the value of one variable does not
change the probability distribution of the other.

Independent: Dependent:
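
One way to apply this definition to a table is to compare each conditional distribution with the marginal distribution: if the variables were independent, the hair color percents for males and for females would both match the overall hair color percents. The following is a minimal sketch for the hair color table; the notes give no cutoff for how large a difference matters, so the code only reports the gaps. Variable names are illustrative.

```python
hair_colors = ["Brown", "Blonde", "Black", "Red"]

# Cell counts from the hair color vs. gender table (rows are genders).
table = {
    "MALE": [26, 24, 10, 3],
    "FEMALE": [20, 35, 12, 6],
}

n = sum(sum(row) for row in table.values())
column_totals = [sum(col) for col in zip(*table.values())]

# Marginal distribution of hair color (percent of everyone).
marginal = [100 * total / n for total in column_totals]

# Gap = conditional percent within a gender minus the overall percent.
gaps = {}
for gender, row in table.items():
    row_total = sum(row)
    for color, count, overall in zip(hair_colors, row, marginal):
        conditional = 100 * count / row_total
        gaps[(gender, color)] = conditional - overall

largest_gap = max(abs(gap) for gap in gaps.values())

for (gender, color), gap in gaps.items():
    print(f"{gender:<7}{color:<7}{gap:+6.1f} points")
print(f"Largest gap: {largest_gap:.1f} percentage points")
```

Gaps of exactly zero would match the definition of independence; with real sample counts the gaps are rarely exactly zero, so the size of the gaps has to be judged in context.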

Categorical Variable practice

A 4-year study of men more than 70 years old, reported in The New York Times, analyzed blood
cholesterol and noted how many men with different cholesterol levels suffered nonfatal or fatal heart
attacks.

                        Low          Medium       High
                        cholesterol  cholesterol  cholesterol
Nonfatal heart attacks      29           17           18
Fatal heart attacks         19           20            9

a. Calculate the marginal distribution for cholesterol level and make a pie graph.
b. Calculate the marginal distribution for severity of heart attack and make a bar graph.
c. Calculate three conditional distributions for the three levels of cholesterol and make a stacked bar
graph.
d. Calculate the conditional distributions for the type of heart attack and make a stacked bar graph.
e. Are the two variables independent?
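
For part (c), a stacked (segmented) bar graph can be roughed out in plain text before drawing it by hand. The following is a minimal sketch, assuming the counts in the table above; the bar width of 50 characters and the symbols `#` (nonfatal) and `.` (fatal) are arbitrary choices, and `segmented_bar` is an illustrative helper name.

```python
levels = ["Low", "Medium", "High"]

# Counts from the cholesterol table, one entry per cholesterol level.
counts = {
    "Nonfatal": [29, 17, 18],
    "Fatal": [19, 20, 9],
}

BAR_WIDTH = 50  # every bar has the same length, because each bar = 100%


def segmented_bar(nonfatal, fatal, width=BAR_WIDTH):
    """Return a text bar and the percent of heart attacks that were nonfatal."""
    share = nonfatal / (nonfatal + fatal)
    filled = round(share * width)
    return "#" * filled + "." * (width - filled), 100 * share


bars = {}
for i, level in enumerate(levels):
    bar, pct_nonfatal = segmented_bar(counts["Nonfatal"][i], counts["Fatal"][i])
    bars[level] = (bar, pct_nonfatal)
    print(
        f"{level:<7}|{bar}| "
        f"{pct_nonfatal:.1f}% nonfatal, {100 - pct_nonfatal:.1f}% fatal"
    )
```

Because each bar is conditioned on one cholesterol level, the bars have equal length even though the three groups have different numbers of men.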
