Inferential Statistics Project2

The document examines racial discrimination in the US job market by randomly assigning identical resumes to black-sounding or white-sounding names and observing callback rates from employers. It finds that resumes with white-sounding names received a higher callback rate of 9.7% than resumes with black-sounding names, which received a callback rate of 6.4%. A two-sample t-test is used to analyze the data and finds a statistically significant difference between the callback rates, indicating racial discrimination. However, the analysis does not definitively conclude that name/race is the most important factor, and further regression or PCA analysis could help determine the impact of other variables.

Uploaded by

api-362845526

inferential_statistics_project2

July 5, 2017

1 Examining Racial Discrimination in the US Job Market

1.0.1 Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers
examined the level of racial discrimination in the United States labor market by randomly assigning
identical résumés to black-sounding or white-sounding names and observing the impact on
requests for interviews from employers.

1.0.2 Data
In the dataset provided, each row represents a resume. The race column has two values, b and
w, indicating black-sounding and white-sounding. The column call has two values, 1 and 0,
indicating whether the resume received a call from employers or not.
Note that the b and w values in race are assigned randomly to the resumes when presented
to the employer.

In [79]: import pandas as pd
import numpy as np
from scipy import stats
import statsmodels.stats.api as sms

In [80]: data = pd.io.stata.read_stata('data/us_job_market_discrimination.dta')

In [81]: # number of callbacks for black-sounding names
sum(data[data.race == 'b'].call)

Out[81]: 157.0

In [82]: data.head()

Out[82]:   id ad  education  ofjobs  yearsexp  honors  volunteer  military  emphole
         0  b  1          4       2         6       0          0         0
         1  b  1          3       3         6       0          1         1
         2  b  1          4       1         6       0          0         0
         3  b  1          3       4         6       0          1         0
         4  b  1          3       3        22       0          0         0

            occupspecific  ...  compreq  orgreq  manuf  transcom  bankreal  tr
         0             17  ...      1.0     0.0    1.0       0.0       0.0
         1            316  ...      1.0     0.0    1.0       0.0       0.0
         2             19  ...      1.0     0.0    1.0       0.0       0.0
         3            313  ...      1.0     0.0    1.0       0.0       0.0
         4            313  ...      1.0     1.0    0.0       0.0       0.0

            busservice  othservice  missind  ownership
         0         0.0         0.0      0.0
         1         0.0         0.0      0.0
         2         0.0         0.0      0.0
         3         0.0         0.0      0.0
         4         0.0         1.0      0.0  Nonprofit

         [5 rows x 65 columns]

1.0.3 1. What test is appropriate for this problem? Does CLT apply?
In [83]: # data size
len(data.id)

Out[83]: 4870

In [84]: # number of resumes with black-sounding names
len(data[data.race == "b"])

Out[84]: 2435

In [85]: # number of resumes with white-sounding names
len(data[data.race == "w"])

Out[85]: 2435

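Both groups are large (n = 2435 each), so the CLT applies: the sampling distribution of each
group's mean callback rate is approximately normal even though call itself is binary.
Since we are comparing two independent samples, a two-sample t-test is appropriate. With samples
this large it gives nearly the same result as a two-proportion z-test.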
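
One way to make the CLT argument concrete is the common rule of thumb that a sample proportion is approximately normal when both the number of successes and the number of failures are at least 10. The sketch below applies that rule to the black-sounding group using the counts reported above (157 callbacks out of 2435 resumes). The helper name `normal_approx_ok` and the threshold of 10 are illustrative assumptions, not part of the original notebook.

```python
def normal_approx_ok(successes, n, threshold=10):
    """Rule-of-thumb check: both successes and failures must reach `threshold`.

    `threshold=10` is a conventional choice, assumed here for illustration.
    """
    failures = n - successes
    return successes >= threshold and failures >= threshold


# Counts for the black-sounding group, taken from Out[81] and Out[84].
b_calls, b_n = 157, 2435
print(normal_approx_ok(b_calls, b_n))  # True
```

The same check can be repeated for the white-sounding group once its callback count has been computed from the data.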

1.0.4 2. What are the null and alternate hypotheses?

H0 : The mean callback rates of the two groups are equal.
HA : The mean callback rates of the two groups are different.
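
Written symbolically, with \mu_w and \mu_b denoting the population callback rates for white-sounding and black-sounding names (notation introduced here for illustration, not taken from the notebook), the hypotheses and the pooled two-sample statistic can be stated as:

```latex
H_0:\ \mu_w = \mu_b \qquad H_A:\ \mu_w \neq \mu_b

t = \frac{\bar{x}_w - \bar{x}_b}{s_p \sqrt{\dfrac{1}{n_w} + \dfrac{1}{n_b}}},
\qquad
s_p^2 = \frac{(n_w - 1)\,s_w^2 + (n_b - 1)\,s_b^2}{n_w + n_b - 2}
```

When n_w = n_b, as in this dataset, s_p^2 reduces to the simple average of the two sample variances.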

1.0.5 3. Compute margin of error, confidence interval, and p-value.

In [86]: b_data = data[data.race == "b"]
w_data = data[data.race == "w"]
b_call_mean = b_data.call.mean()
w_call_mean = w_data.call.mean()
b_data_var = b_data.call.var()
w_data_var = w_data.call.var()
avg_var = (b_data_var + w_data_var) / 2  # since n1 = n2, the weighted (pooled) variance is the simple average

print("The mean for the callback of black race is %.3f," % round(b_call_mean, 3),
      "variance is %.3f." % round(b_data_var, 3))
print("The mean for the callback of white race is %.3f," % round(w_call_mean, 3),
      "variance is %.3f." % round(w_data_var, 3))
print("The difference between the means of calls is %.3f," % round(w_call_mean - b_call_mean, 3),
      "variance is %.3f." % round(avg_var, 3))

The mean for the callback of black race is 0.064, variance is 0.060.
The mean for the callback of white race is 0.097, variance is 0.087.
The difference between the means of calls is 0.032, variance is 0.074.

In [87]: result = stats.ttest_ind(w_data.call, b_data.call, equal_var=True)  # pooled-variance two-sample t-test
print("The t score is %.3f, the p-value is %.7f." % (result.statistic, result.pvalue))

The t score is 4.115, the p-value is 0.0000394.
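
Because call is binary, each group mean is a proportion, so a two-proportion z-test offers a cross-check on the t-test. The sketch below uses only the standard library and the rounded callback rates printed above (0.064 and 0.097), so its statistic differs slightly from the 4.115 computed on the raw data. The function name `two_proportion_ztest` is hypothetical.

```python
import math


def two_proportion_ztest(p1, n1, p2, n2):
    """Two-sided z-test for the difference of two proportions (pooled SE)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Rounded rates from the output above (assumed inputs); n = 2435 resumes per group.
z, p_value = two_proportion_ztest(0.097, 2435, 0.064, 2435)
print("z = %.2f, p-value = %.6f" % (z, p_value))
```

The z statistic lands close to the t score, which is what one would expect with samples of this size.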

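In [97]: cm = sms.CompareMeans(sms.DescrStatsW(w_data.call), sms.DescrStatsW(b_data.call))
# margin of error = critical value * standard error of the difference in means
ME = 1.96 * np.sqrt(b_data_var / len(b_data) + w_data_var / len(w_data))
# the usevar argument is cut off in the source; "pooled" is assumed to match equal_var=True above
print("The confidence interval is (%.3f, %.3f)" % cm.tconfint_diff(usevar="pooled"))
print("The margin of error is %.3f." % round(ME, 3))

The confidence interval is (0.017, 0.047)
The margin of error is 0.015.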
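
The margin of error for a difference in means is the critical value times the standard error of that difference, which is also the half-width of the confidence interval. A minimal standard-library sketch, assuming the rounded summary values printed earlier (variances 0.060 and 0.087, difference 0.032, n = 2435 per group) and the normal critical value 1.96:

```python
import math

# Rounded summary statistics from the notebook output (assumed inputs).
n_b = n_w = 2435
var_b, var_w = 0.060, 0.087
diff = 0.032

# Standard error of the difference between two independent sample means.
se_diff = math.sqrt(var_b / n_b + var_w / n_w)

# Margin of error at the 95% level, using the normal critical value 1.96.
margin = 1.96 * se_diff
ci = (diff - margin, diff + margin)

print("margin of error = %.3f" % margin)   # margin of error = 0.015
print("95%% CI = (%.3f, %.3f)" % ci)        # 95% CI = (0.017, 0.047)
```

The interval matches the (0.017, 0.047) reported by tconfint_diff, as it should if the margin of error is its half-width.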

1.0.6 4. Story & Explanation

Since the p-value of the two-sample t-test is below 0.05 and the confidence interval does not
include 0, we reject the null hypothesis: the mean callback rates of the two groups differ.
Because the names were assigned randomly to the résumés, this difference points to racial
discrimination against job seekers with similar backgrounds. However, there are many factors we
have not considered in this analysis, so the sound of a name might not be the only factor
contributing to the difference in callbacks between the two groups.

1.0.7 5. Does your analysis mean that race/name is the most important factor in callback success?
Why or why not? If not, how would you amend your analysis?
Not necessarily. The analysis above indicates that the racial sound of a name has a significant
effect on the callback rate. However, it does not tell us whether other variables are also
significant or whether race is the most important factor. To understand the relationship between
callbacks and the other variables, we could fit a regression (possibly with LASSO) or run PCA as
further analysis.
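
As a rough illustration of the regression idea, the sketch below fits a logistic regression of callback on two binary predictors by plain gradient ascent, using only the standard library. The grouped counts are made up for illustration and are NOT taken from the study; `honors` is borrowed from the dataset's column list, while `white_name` and the function `fit_logistic` are hypothetical names. An actual analysis would fit the model to the raw rows of data with a statistics library.

```python
import math

# Made-up grouped data for illustration only (NOT from the study):
# (white_name, honors, number of resumes, number of callbacks)
cells = [
    (0, 0, 400, 24),
    (1, 0, 400, 36),
    (0, 1, 100, 12),
    (1, 1, 100, 18),
]


def fit_logistic(cells, lr=1.0, epochs=5000):
    """Fit call ~ intercept + white_name + honors by gradient ascent
    on the average log-likelihood. Returns [intercept, b_white, b_honors]."""
    total = sum(n for _, _, n, _ in cells)
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for white, honors, n, calls in cells:
            x = (1.0, white, honors)
            z = sum(wj * xj for wj, xj in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            # Gradient contribution of a cell: (observed - expected callbacks) * x.
            resid = calls - n * p
            for j in range(3):
                grad[j] += resid * x[j]
        w = [wj + lr * gj / total for wj, gj in zip(w, grad)]
    return w


intercept, b_white, b_honors = fit_logistic(cells)
print("intercept=%.2f  white_name=%.2f  honors=%.2f" % (intercept, b_white, b_honors))
```

Comparing the fitted coefficients (on the log-odds scale) gives a first sense of how the name effect compares with another résumé attribute when both are in the model.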
