Analysis of Categorical Data

This document discusses the analysis of categorical data. It defines categorical data as data that classifies observations into categories. Some common methods for analyzing categorical data discussed include goodness-of-fit tests, contingency tables, and odds ratios. The chi-square test and Fisher's exact test are presented as methods for analyzing contingency tables. Examples are provided to demonstrate how to perform chi-square tests and calculate odds ratios.

Uploaded by

Malik Shabbir Ali


STAT-7213
BIO-STATISTICS

M.PHIL STATISTICS
YEAR-1 (SEMESTER-II)

Submitted to:

Dr. Jamal Abdul Nasir

PRESENTATION
PRESENTED BY

MUBEEN ASGHAR (0557)

LAIBA SUBHANI (0559)
NOOR-E-AMNA (0561)
RABIA SAIF (0563)

TOPIC

ANALYSIS OF CATEGORICAL DATA

CATEGORICAL DATA

Categorical data are data that classify an observation as belonging
to one or more categories. For example, an item might be judged as
good or bad, or a response to a survey might include categories
such as agree, disagree, or no opinion.

ANALYSIS OF CATEGORICAL DATA

Categorical data analysis is the analysis of data where the
response variable has been grouped into a set of mutually exclusive
ordered (such as age group) or unordered (such as eye color)
categories.

BASIC ANALYSIS

• The Goodness-of-Fit Test
• The 2 by 2 Contingency Table
• The r by c Contingency Table
• Multiple 2 by 2 Contingency Tables

GOODNESS-OF-FIT TEST

A goodness-of-fit test is a statistical hypothesis test of how well
sample data fit a hypothesized distribution, for example whether a
sample plausibly comes from a population with a normal distribution.

USES OF GOF TEST

• Goodness-of-fit tests are statistical methods often used to make
inferences about observed values.
• These tests measure how closely the observed values agree with the
values predicted by a model. When used in decision-making,
goodness-of-fit tests can help predict future trends and patterns.
• Goodness-of-fit tests are commonly used to test for the normality of
residuals or to determine whether two samples are gathered from
identical distributions.

METHODS OF GOODNESS-OF-FIT TEST

There are multiple methods for determining goodness-of-fit. Some
of the most popular methods used in statistics include:
• The chi-square test
• The Kolmogorov-Smirnov test
• The Anderson-Darling test
• The Shapiro-Wilk test

KEY POINTS

• Goodness-of-fit tests are statistical tests that aim to determine whether a set of
observed values matches those expected under the applicable model.
• There are multiple types of goodness-of-fit tests, but the most common is the
chi-square test.
• The chi-square test determines whether a relationship exists between categorical variables.
• The Kolmogorov-Smirnov test, used for large samples, determines whether a
sample comes from a specific distribution of a population.
• Goodness-of-fit tests can show whether sample data fit the data expected
from a population with a given distribution, such as the normal distribution.

CHI-SQUARE TEST

The chi-square independence test is a procedure for
testing whether two categorical variables are related in some
population.

TEST STATISTIC

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O is an observed frequency and E is an estimated expected frequency.

DEGREES OF FREEDOM

The degrees of freedom (df) is a number that determines the exact
shape of the chi-square distribution. The figure illustrates this
point. For a table with r rows and c columns, the degrees of freedom
are calculated as
df = (r - 1)(c - 1)

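PROCEDURE

1. State the null and alternative hypotheses, H0 and H1.
2. Choose the level of significance, α.
3. Test statistic: $\chi^2 = \sum (O - E)^2 / E$
4. Computation.
5. Critical region: reject H0 if the computed chi-square value is greater
than or equal to the tabulated value for the chosen α and df.
6. Conclusion: if the computed value falls in the critical region, reject H0;
otherwise, do not reject H0.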
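The six steps can be sketched in a few lines of Python for a goodness-of-fit test. This is a minimal illustration, not the presenters' own computation: the counts are hypothetical, the function name `chi_square_gof` is made up for this sketch, and the degrees of freedom are taken as k - 1 for k categories (the usual choice when no parameters are estimated from the data).

```python
def chi_square_gof(observed, expected):
    """Return the chi-square statistic sum((O - E)^2 / E) and df = k - 1."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return statistic, df

# Hypothetical counts (not from the slides): 100 students in 4 categories.
observed = [30, 20, 35, 15]
# Under H0 every category is equally likely, so each expected count is 25.
expected = [sum(observed) / len(observed)] * len(observed)

statistic, df = chi_square_gof(observed, expected)

# Tabulated upper 5% point of the chi-square distribution with 3 df.
critical_value = 7.815
reject_h0 = statistic > critical_value
print(statistic, df, reject_h0)
```

Comparing the statistic with a tabulated critical value mirrors step 5; a p-value from statistical software would lead to the same decision.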
GOVERNMENT COLLEGE UNIVERSITY
EXAMPLE

Popularity of psychology professors: given the number of students
enrolled with each professor at a college, test at the 0.05
significance level whether students enroll at random.

SOLUTION

• State the null and alternative hypotheses.
• Level of significance.
• Test statistic: $\chi^2 = \sum (O - E)^2 / E$
• Computation.
• Critical region.
• Conclusion: the computed chi-square value falls in the critical region,
so we reject H0 and conclude that students do not enroll at random.
CONTINGENCY TABLE

A contingency table (also known as a cross tabulation or crosstab)
is a type of table in a matrix format that displays the
(multivariate) frequency distribution of the variables.

TYPES OF CONTINGENCY TABLE

• 2×2 contingency table
• r × c contingency table
• Multiple 2×2 contingency tables

2×2 CONTINGENCY TABLE

The two by two or fourfold contingency
table represents two classifications of a set of counts or frequencies.
The rows represent two classifications of one variable (e.g.,
outcome positive/outcome negative) and the columns
represent two classifications of another variable (e.g.,
intervention/no intervention).

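TEST STATISTIC

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where, for r rows and c columns of n observations, O is an observed frequency and E
is an estimated expected frequency:

$$E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{n}$$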
FISHER EXACT TEST

Fisher's exact test is a statistical significance test used in the
analysis of contingency tables.
In practice it is employed when sample sizes are small, although it is
valid for all sample sizes.

CRITERIA FOR FISHER EXACT TEST

• Both variables are dichotomous and qualitative (a 2×2 table).
• The overall total of the table (sample size) is less than 30.
• Any expected cell value is less than 5.

ASSUMPTIONS

• The data consist of A sample observations from population 1
and B sample observations from population 2.
• The samples are random and independent.
• Each observation can be categorized as one of two mutually
exclusive types.
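Under these assumptions, with the table margins held fixed, the count in the first cell follows a hypergeometric distribution, and the exact p-value is a sum of such probabilities. The sketch below assumes the common two-sided convention of summing every table that is no more probable than the observed one; the counts and the function name `fisher_exact_two_sided` are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in cell (1,1) with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    smallest, largest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed
    # one (the small factor guards against floating-point ties).
    return sum(
        prob(x)
        for x in range(smallest, largest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# Hypothetical small-sample table (not from the slides).
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```

Because the p-value is computed exactly rather than from the chi-square approximation, the method does not rely on expected cell counts being large.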

ODDS RATIO

DIFFERENCE BETWEEN ODDS AND ODDS RATIO

Odds: The odds for success are the ratio of the probability of success
to the probability of failure.
The odds of being a case (having the disease) to being a control (not
having the disease) among subjects with the risk factor are
[a/(a+b)]/[b/(a+b)] = a/b.
The odds of being a case (having the disease) to being a control (not
having the disease) among subjects without the risk factor are
[c/(c+d)]/[d/(c+d)] = c/d.

Odds ratio: The odds ratio is a measure that we may compute from the
data of a retrospective study. We use the symbol OR to indicate that
the measure is computed from sample data and used as an estimate of
the population odds ratio.
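Dividing the first odds by the second gives the sample odds ratio. With a, b, c, and d as in the two odds above, the standard form is:

```latex
\[
\mathrm{OR} \;=\; \frac{a/b}{c/d} \;=\; \frac{ad}{bc}
\]
```

The cross-product form ad/bc is convenient because it uses only the four cell counts.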

PROPERTIES

• The odds ratio can equal any non-negative number.
• When OR > 1, the odds of success are higher in row 1 than in row 2.
• When one cell has zero probability, OR equals 0 or ∞.

INTERPRETATION

A value of 1 indicates no association between the risk factor and disease status.
A value less than 1 indicates reduced odds of the disease among subjects with
the risk factor.
A value greater than 1 indicates increased odds of having the disease among
subjects in whom the risk factor is present.

EXAMPLE

To compute the odds of receiving the death penalty for each group:

The odds of a death sentence if the defendant was black = 28/45 = 0.6222

The odds of a death sentence if the defendant was non-black = 22/52 = 0.4231

The impact of being black on receiving the death penalty is measured by the odds ratio:

OR = 0.6222/0.4231 = 1.47

INTERPRETATION
The odds of a death sentence are 47% higher for black defendants than for
non-black defendants.
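The arithmetic of this example can be checked with a short script. The four counts are the ones quoted in the example; the variable names are made up for this sketch.

```python
# Counts quoted in the example: death sentence vs. no death sentence.
black_death, black_other = 28, 45
non_black_death, non_black_other = 22, 52

odds_black = black_death / black_other              # 28/45
odds_non_black = non_black_death / non_black_other  # 22/52

# The odds ratio is the ratio of the two odds (equivalently ad/bc).
odds_ratio = odds_black / odds_non_black
percent_higher = (odds_ratio - 1) * 100

print(round(odds_black, 4), round(odds_non_black, 4))
print(round(odds_ratio, 2), round(percent_higher))
```

Subtracting 1 from the odds ratio and multiplying by 100 gives the percentage by which the first group's odds exceed the second group's.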

YATES' METHOD

Cochran suggests that the chi-square test should not be used if n is small and
an expected frequency is less than 5.
Yates (1934) proposed a continuity correction for the 2×2 table, that is,

$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$
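For a 2×2 table, the Yates-corrected statistic is often written in a shortcut form that uses only the cell counts. The notation here is an assumption for illustration: a and b are the first-row counts, c and d the second-row counts, and n = a + b + c + d.

```latex
\[
\chi^2_{\text{Yates}} \;=\;
\frac{n\left(\lvert ad - bc\rvert - \tfrac{n}{2}\right)^{2}}
     {(a+b)(c+d)(a+c)(b+d)}
\]
```

This is algebraically the same as subtracting 0.5 from each |O - E| before squaring and summing over the four cells.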

CRITERIA FOR YATES' CORRECTION

• Both variables are dichotomous and qualitative (a 2×2 table).
• The overall total of the table (sample size) is less than 30.
• Any expected cell value is less than 5.

MATCHED-PAIR STUDIES

A matched pairs design is an experimental design that is used when
an experiment has only two treatment conditions. The subjects in
the experiment are grouped together into pairs based on some
variable they “match” on, such as age or gender. Then, within each
pair, subjects are randomly assigned to different treatments.

EXAMPLE

Pairs with the same exposure status for both case and control (the diagonal cells)
are called concordant pairs (c1 and c2), and pairs with different exposures (the
off-diagonal cells) are called discordant pairs (d1 and d2).
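EXAMPLE

Let π be the probability that a discordant pair has an exposed case. Then, from the
preceding table, π can be estimated by the following proportion,

$$p = \frac{d_1}{d_1 + d_2}$$

taking d1 here to be the number of discordant pairs in which the case is exposed.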

HYPOTHESIS

Under the null hypothesis of no association between the risk factor and the
disease, each discordant pair is just as likely to have a case exposed as to have a
control exposed. Thus, writing π for the probability that a discordant pair has an
exposed case, the null hypothesis can be written as

$$H_0: \pi = \tfrac{1}{2}$$

APPROXIMATION

For large samples, we can use the normal approximation.
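A minimal sketch of that normal approximation follows. It assumes that, under the null hypothesis, the number of discordant pairs with an exposed case is binomial with success probability 1/2. The counts are hypothetical, the function name `matched_pair_z` is made up, and no continuity correction is applied.

```python
import math

def matched_pair_z(exposed_case_pairs, exposed_control_pairs):
    """z test of H0: pi = 1/2 using only the discordant pairs."""
    m = exposed_case_pairs + exposed_control_pairs  # number of discordant pairs
    p_hat = exposed_case_pairs / m                  # estimate of pi
    # Under H0 the standard error of p_hat is sqrt(0.5 * 0.5 / m).
    z = (p_hat - 0.5) / math.sqrt(0.25 / m)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_hat, z, p_value

# Hypothetical discordant-pair counts (not from the slides).
p_hat, z, p_value = matched_pair_z(25, 15)
print(round(p_hat, 3), round(z, 3), round(p_value, 3))
```

The z statistic simplifies to the difference of the two discordant counts divided by the square root of their sum, which is why the concordant pairs play no role in the test.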

R × C CONTINGENCY TABLE

We now consider the more general situation where two
classification variables have more than two categories. First, we
consider the situation where both variables are nominal, followed by
the situation when one of the variables is ordinal.

R × C CONTINGENCY TABLE

Testing Hypothesis of No Association

The same ideas used in the 2 by 2 table still apply to the r by c
contingency table. If there is no association between a row variable
and a column variable, the ratio of the expected cell frequency in
the ith row and jth column, mij, to the ith row total, ni⋅, should
equal the ratio of the jth column total, n⋅j, to the overall total.
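Written out, this equality of ratios gives the expected cell frequency directly. The symbol n for the overall total is an assumption, since the text names the total without a symbol.

```latex
\[
\frac{m_{ij}}{n_{i\cdot}} = \frac{n_{\cdot j}}{n}
\quad\Longrightarrow\quad
m_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n}
\]
```

Each expected frequency is therefore the row total times the column total, divided by the overall total.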

R × C CONTINGENCY TABLE

There are (r - 1)(c - 1) degrees of freedom for the r by c table because once we
know the frequencies of any (r - 1)(c - 1) cells, we can find the values of the
other frequencies by subtraction from the row and column totals. The hypothesis
of no association between the row and column variables is tested using the
chi-square goodness-of-fit statistic. Most statisticians perform no adjustment to the
test statistic when used with tables other than the 2 by 2 table. If the test statistic
is greater than the critical value of the chi-square distribution with (r - 1)(c - 1)
degrees of freedom, we reject the hypothesis of no association in favor
of the alternative that the row and column variables are related. If the test statistic
is less than the critical value, we fail to reject the null hypothesis.
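The test can be sketched for a table of any size. The sketch assumes expected counts of (row total × column total) / overall total; the 2 by 3 table is hypothetical and the function name `chi_square_independence` is made up for illustration.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and df for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the hypothesis of no association.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical 2 x 3 table (not from the slides).
table = [[20, 30, 50],
         [30, 30, 40]]
statistic, df = chi_square_independence(table)

# Tabulated upper 5% point of the chi-square distribution with 2 df.
critical_value = 5.991
reject_h0 = statistic > critical_value
print(round(statistic, 3), df, reject_h0)
```

No continuity correction is applied here, in line with the remark that most statisticians do not adjust the statistic for tables larger than 2 by 2.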

MULTIPLE 2×2 CONTINGENCY TABLE

Here we focus on the relationship between 2 factors in the
presence of a third factor. Previously, we examined the relationship
between 2 categorical variables (factors).

EXAMPLE

For example, we might be interested in the relationship between
smoking and lung cancer, and how this relationship may change
with gender (a third factor). We observe that the apparent
(aggregated) relationship between 2 factors may switch or change
its direction and magnitude depending on the third factor.

EXAMINE THE RELATIONSHIP

We will test for such a dependency, and, if we don’t
seem to find one, we will analyze the aggregated data; if we do find
such a dependency, then it is appropriate to examine the relationship
of the 2 factors of interest separately for each of the levels of the
third factor (don’t aggregate).
We will focus on 2 factors each with 2 levels, including a third
factor with possibly several (g) levels; thus, we will be working
with multiple 2×2 contingency tables.

A study was conducted to determine whether there is any association between the occurrence of upper respiratory infections (URI) in young children and outdoor
air pollution. Several variables could affect the relationship between the occurrence of infections and outdoor air pollution
(e.g., dust, traffic, smoke). Hypothetical data for this situation, based on an article by Jaakkola et al. (1991), are shown in the table.

SOLUTION

One way of taking the passive smoke variable into account is to analyze each 2 by 2 table
separately. We then have two tables: one for homes in which someone smoked and one for
homes in which no one smoked.

Table 1.

Passive smoke   City        URI    URI    Total
in the home     pollution   some   none
yes             high        100     20     120
yes             low         124     40     164
total                       224     60     284
SOLUTION

Calculations:
Using the chi-square and odds ratio formulas, the Yates-corrected chi-square statistic is 2.039
and its p-value is 0.1533 for homes in which someone smoked. The odds ratio for these data
is 1.613. The 95 percent confidence interval for the odds ratio is from 0.887 to 2.933.

Table 2.

Passive smoke   City        URI    URI    Total
in the home     pollution   some   none
no              high        128     62     190
no              low         166    119     285
total                       294    181     475


SOLUTION

Calculations
The Yates-corrected chi-square value is 3.645, and its p-value is 0.0562 for those without passive smoke
in the home. The odds ratio for these data is 1.480. The 95 percent confidence interval for
the odds ratio is from 1.007 to 2.171.

Interpretation
The first confidence interval, which is much wider than the second, includes the
value of one, which suggests that there is no relation between the two variables. The second
interval barely misses including one. The second interval’s smaller width reflects the larger
sample size associated with the homes in which there was no passive smoke. Neither of these
tables shows a statistically significant association between outdoor air pollution and the
occurrence of URI at the 0.05 level based on the test statistics. The conclusion from the
analyses of the separate tables is different from that of the combined table.
A problem with the use of the separate tables is that the analyses are based on the smaller
sample sizes associated with each sub-table, not on the sample size of the combined table.
This makes it difficult to find the presence of small but consistent trends across tables.
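The figures quoted for the two sub-tables can be reproduced approximately with the sketch below. It makes two assumptions the slides do not state: the chi-square value is the Yates-corrected one, and the confidence interval is the usual log-based (Woolf) interval. Under those assumptions the first table's figures are reproduced closely, while the second interval comes out within about 0.002 of the quoted limits. The function name `analyze_2x2` is made up for illustration.

```python
import math

def analyze_2x2(a, b, c, d):
    """Yates chi-square, p-value, odds ratio, and 95% CI for [[a, b], [c, d]]."""
    n = a + b + c + d
    # Yates-corrected chi-square in its 2x2 shortcut form.
    chi_sq = n * (abs(a * d - b * c) - n / 2) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi_sq / 2))
    odds_ratio = (a * d) / (b * c)
    # Assumed method: log-based (Woolf) interval for the odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return chi_sq, p_value, odds_ratio, (lower, upper)

# Rows: high / low pollution; columns: some URI / no URI.
smoke = analyze_2x2(100, 20, 124, 40)       # passive smoke in the home
no_smoke = analyze_2x2(128, 62, 166, 119)   # no passive smoke in the home

for label, result in (("smoke", smoke), ("no smoke", no_smoke)):
    chi_sq, p_value, odds_ratio, (lower, upper) = result
    print(label, round(chi_sq, 3), round(p_value, 4),
          round(odds_ratio, 3), round(lower, 3), round(upper, 3))
```

Both p-values exceed 0.05, which matches the statement that neither sub-table shows a significant association on its own.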
COCHRAN-MANTEL-HAENSZEL TEST

Two biostatisticians, Nathan Mantel and William Haenszel, developed a method in 1959
for examining the relation between two categorical variables while controlling for another
categorical variable (Mantel and Haenszel 1959).
This method, like a method published by William Cochran in 1954, uses all the data in the
combined table and produces one overall test statistic. The test is designed to detect the
consistent effect of the independent variable on the dependent variable across the levels of
the extraneous variable.
Thus, this method should only be used when the estimated odds ratios in the sub-tables
are similar to one another. One very attractive feature of this test is that it can be used with
extremely small sample sizes.

PROPERTIES

• For large samples, when H0 is true, CMH has a chi-squared distribution with df = 1.
• If all θ(AB(k)) = 1, then CMH is close to zero.
• If some or all θ(AB(k)) > 1, then CMH is large.
• If some or all θ(AB(k)) < 1, then CMH is large.
• If some θ(AB(k)) < 1 and others θ(AB(k)) > 1, then CMH is NOT an appropriate test;
that is, the test works well if the conditional odds ratios are in the same direction and
comparable in size.
This test has also been generalized for application to three-way tables of size other than 2
by 2 by k (Landis, Heyman, and Koch 1978).

WHEN TO USE

Use the Cochran–Mantel–Haenszel test (which is sometimes called the
Mantel–Haenszel test) for repeated tests of independence. The most common situation is
that you have multiple 2×2 tables of independence; we're analyzing the kind of
experiment that we had to analyze with a test of independence, and we have done
the experiment multiple times or at multiple locations. There are three nominal
variables: the two variables of the 2×2 test of independence, and the third
nominal variable that identifies the repeats (such as different times, different
locations, or different studies).

CMH

We have one Z* test statistic, but because we are dealing with discrete variables, we should use
the continuity correction with Z*. However, instead of using the continuity-corrected
Z* statistic, we would prefer to use a chi-square statistic, since all the other tests
associated with contingency tables use a chi-square statistic. This poses no problem, since
the square of a standard normal variable follows a chi-square distribution with one degree
of freedom. Thus, the statistic to be used to test the hypothesis of no association between
air pollution and the occurrence of upper respiratory problems is the
Cochran-Mantel-Haenszel chi-square statistic. Also called the Mantel-Haenszel statistic, it is defined by

$$X^2_{CMH} = \frac{(|O - E| - 0.5)^2}{V}$$

where O, E, and V are sums over the sub-tables, and Oi and Ei are the observed and expected
values in the (1,1) cell in the ith sub-table.
In terms of the entries in the ith table, writing its cells as a_i, b_i (first row) and
c_i, d_i (second row) with total n_i, Ei is defined as

$$E_i = \frac{(a_i + b_i)(a_i + c_i)}{n_i}$$

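VARIANCE

Vi, the variance of Oi minus Ei, can be written as

$$V_i = \frac{(a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i)}{n_i^2 (n_i - 1)}$$

where a_i, b_i, c_i, and d_i are the cells of the ith sub-table and n_i is its total.

In the CMH chi-square statistic, O, E, and V are defined as the sums of the Oi, the Ei and the Vi
over the k sub-tables. If the CMH chi-square statistic is greater than the chi-square table value, we
reject the hypothesis of no association between air
pollution and the occurrence of upper respiratory infections. Otherwise we fail to
reject the null hypothesis.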
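A minimal sketch of this statistic, applied to the two sub-tables shown earlier, is given below. It assumes the continuity-corrected form (|O - E| - 0.5)² / V with the hypergeometric variance for each sub-table; the function name `cmh_chi_square` and the tuple layout (a, b, c, d) are choices made for this sketch.

```python
def cmh_chi_square(tables):
    """Continuity-corrected CMH chi-square for a list of (a, b, c, d) tables."""
    o_sum = e_sum = v_sum = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        o_sum += a                           # observed (1,1) cell
        e_sum += (a + b) * (a + c) / n       # expected (1,1) cell
        # Hypergeometric variance of the (1,1) cell with margins fixed.
        v_sum += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    return (abs(o_sum - e_sum) - 0.5) ** 2 / v_sum

# Sub-tables from the slides: rows high / low pollution, columns some / no URI.
tables = [
    (100, 20, 124, 40),    # passive smoke in the home
    (128, 62, 166, 119),   # no passive smoke in the home
]
cmh = cmh_chi_square(tables)

# Tabulated upper 5% point of the chi-square distribution with 1 df.
critical_value = 3.841
reject_h0 = cmh > critical_value
print(round(cmh, 3), reject_h0)
```

Because the sums run over both sub-tables before the statistic is formed, the test draws on the full sample while still controlling for passive smoke.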

MANTEL-HAENSZEL COMMON ODDS RATIO

Mantel and Haenszel also showed how to combine the data from the separate sub-tables to
form a common odds ratio for the data. Again, this should only be done when the
estimated odds ratios in the sub-tables are similar. If the estimated odds ratios for the
sub-tables are not similar (for example, some are less than one and some are greater than
one), the common odds ratio would not be very useful. The relation between the independent
and dependent variable would depend on the level of the extraneous variable, and the use
of a common odds ratio would mask this. The Mantel-Haenszel
estimator of the common odds ratio, θ, is

$$\hat{\theta}_{MH} = \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i}$$

where a_i, b_i, c_i, and d_i are the cells of the ith sub-table and n_i is its total.
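The estimator can be sketched with the two sub-tables shown earlier. The sketch assumes the standard form, in which each sub-table contributes ad/n to the numerator and bc/n to the denominator; the function name `mantel_haenszel_odds_ratio` is made up for illustration.

```python
def mantel_haenszel_odds_ratio(tables):
    """Mantel-Haenszel common odds ratio for a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Sub-tables from the slides: rows high / low pollution, columns some / no URI.
tables = [
    (100, 20, 124, 40),    # passive smoke in the home
    (128, 62, 166, 119),   # no passive smoke in the home
]
common_or = mantel_haenszel_odds_ratio(tables)
print(round(common_or, 3))
```

The result lies between the two sub-table odds ratios (1.613 and 1.480), as expected for a weighted combination of similar odds ratios.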

DISADVANTAGES

• There is a limit to the kind of statistical analysis that can be performed on
categorical data.
• The options in categorical data do not have a standardized interval scale.
Therefore, respondents are not able to effectively gauge their options before
responding.
• Arithmetic operations such as addition or averaging cannot be performed on
the category labels themselves.
