inferential_statistics_project2
July 5, 2017
1 Examining Racial Discrimination in the US Job Market
1.0.1 Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers
examined the level of racial discrimination in the United States labor market by randomly
assigning identical résumés to black-sounding or white-sounding names and observing the impact
on requests for interviews from employers.
1.0.2 Data
In the dataset provided, each row represents a resume. The race column has two values, b and
w, indicating black-sounding and white-sounding. The column call has two values, 1 and 0,
indicating whether the resume received a call from employers or not.
Note that the b and w values in race are assigned randomly to the resumes when presented
to the employer.
In [79]: import pandas as pd
import numpy as np
from scipy import stats
import statsmodels.stats.api as sms
In [80]: data = pd.read_stata('data/us_job_market_discrimination.dta')
In [81]: # number of callbacks for black-sounding names
sum(data[data.race=='b'].call)
Out[81]: 157.0
In [82]: data.head()
Out[82]: id ad education ofjobs yearsexp honors volunteer military emphole
0 b 1 4 2 6 0 0 0
1 b 1 3 3 6 0 1 1
2 b 1 4 1 6 0 0 0
3 b 1 3 4 6 0 1 0
4 b 1 3 3 22 0 0 0
occupspecific ... compreq orgreq manuf transcom bankreal tr
0 17 ... 1.0 0.0 1.0 0.0 0.0
1 316 ... 1.0 0.0 1.0 0.0 0.0
2 19 ... 1.0 0.0 1.0 0.0 0.0
3 313 ... 1.0 0.0 1.0 0.0 0.0
4 313 ... 1.0 1.0 0.0 0.0 0.0
busservice othservice missind ownership
0 0.0 0.0 0.0
1 0.0 0.0 0.0
2 0.0 0.0 0.0
3 0.0 0.0 0.0
4 0.0 1.0 0.0 Nonprofit
[5 rows x 65 columns]
1.0.3 1. What test is appropriate for this problem? Does CLT apply?
In [83]: # data size
len(data.id)
Out[83]: 4870
In [84]: # number of resumes with black-sounding names
len(data[data.race=="b"])
Out[84]: 2435
In [85]: # number of resumes with white-sounding names
len(data[data.race=="w"])
Out[85]: 2435
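Both groups (black-sounding and white-sounding names) contain 2,435 resumes, which is large
enough for the CLT to apply to the callback rate of each group.
Since we are comparing two independent samples, a two-sample t-test is appropriate. Because call
is binary, each group mean is a proportion, and at this sample size the t-test gives nearly the
same answer as a two-proportion z-test.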
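A quick way to back up the CLT claim for a binary outcome is the usual success-failure rule of thumb: each group should contain at least about 10 callbacks and 10 non-callbacks. The sketch below is only an illustration; the helper name `normal_approx_ok` and the threshold of 10 are assumptions, not part of the original analysis.

```python
def normal_approx_ok(successes, n, threshold=10):
    """Rule of thumb: the normal approximation to a sample proportion is
    reasonable when both successes and failures reach the threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold

# 157 callbacks out of 2435 resumes with black-sounding names (counts from the cells above)
print(normal_approx_ok(157, 2435))
```

The same check can be run for the white-sounding group once its callback count is computed with `sum(data[data.race=='w'].call)`.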
1.0.4 2. What are the null and alternate hypotheses?
H0 : The mean callback rate is the same for the two groups.
HA : The mean callback rate differs between the two groups.
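In symbols, the hypotheses can be written as below. Here p_b and p_w denote the callback probabilities for resumes with black-sounding and white-sounding names; these symbols are introduced only for illustration and do not appear elsewhere in the notebook.

```latex
H_0 : p_w - p_b = 0
\qquad
H_A : p_w - p_b \neq 0
```

The alternative is two-sided, which matches the two-sided p-value that a two-sample t-test reports by default.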
1.0.5 3. Compute margin of error, confidence interval, and p-value.
In [86]: b_data = data[data.race=="b"]
w_data = data[data.race=="w"]
b_call_mean = b_data.call.mean()
w_call_mean = w_data.call.mean()
b_data_var = b_data.call.var()
w_data_var = w_data.call.var()
avg_var = (b_data_var + w_data_var)/2 # since n1 = n2, weighted avg var th
print("The mean for the callback of black race is %.3f," % round(b_call_me
"variance is %.3f." % round(b_data_var, 3))
print("The mean for the callback of white race is %.3f," % round(w_call_me
"variance is %.3f." % round(w_data_var, 3))
print("The difference between the means of calls is %.3f," % round(w_call_
"variance is %.3f." % round(avg_var, 3))
The mean for the callback of black race is 0.064, variance is 0.060.
The mean for the callback of white race is 0.097, variance is 0.087.
The difference between the means of calls is 0.032, variance is 0.074.
In [87]: result = stats.ttest_ind(w_data.call, b_data.call, equal_var=True) # since
print("The t score is %.3f, the p-value is %.7f." % result)
The t score is 4.115, the p-value is 0.0000394.
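Because call is binary, the same comparison can be cross-checked with a two-proportion z-test computed directly from the counts. This is a minimal sketch using only the standard library, not the method used in the cell above. The function name is hypothetical, and the count of 235 callbacks for white-sounding names is an assumption inferred from the rounded mean of 0.097 reported above; it should be confirmed with `sum(data[data.race=='w'].call)`.

```python
import math

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 == p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# 235 (assumed, see above) and 157 callbacks out of 2435 resumes each
z, p = two_proportion_ztest(235, 2435, 157, 2435)
print("z = %.3f, p-value = %.7f" % (z, p))
```

With groups this large, the z statistic should land close to the t score above, since the t distribution with several thousand degrees of freedom is practically normal.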
In [97]: cm = sms.CompareMeans(sms.DescrStatsW(w_data.call), sms.DescrStatsW(b_data
ME = 1.96 * np.sqrt(b_data_var/len(b_data) + w_data_var/len(w_data)) # critical value times the standard error of the difference
print("The confidence interval is (%.3f, %.3f)" % cm.tconfint_diff(usevar=
print("The margin of error is %.3f." % round(ME, 3))
The confidence interval is (0.017, 0.047)
The margin of error is 0.015.
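The margin of error is the half-width of the confidence interval: the critical value multiplied by the standard error of the difference in callback rates. The sketch below works this out from counts alone, as an illustration rather than the notebook's own computation. The function name is hypothetical, 1.96 is the normal critical value for a 95% interval, and the count of 235 callbacks for white-sounding names is an assumption inferred from the rounded mean reported earlier.

```python
import math

def diff_proportion_ci(x1, n1, x2, n2, z_crit=1.96):
    """Difference in proportions with its margin of error and interval."""
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error of the difference between two proportions
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z_crit * se
    diff = p1 - p2
    return diff, margin, (diff - margin, diff + margin)

# 235 (assumed) and 157 callbacks out of 2435 resumes each
diff, margin, (low, high) = diff_proportion_ci(235, 2435, 157, 2435)
print("difference = %.3f, margin of error = %.3f" % (diff, margin))
print("confidence interval = (%.3f, %.3f)" % (low, high))
```

The resulting interval should agree with the (0.017, 0.047) reported above up to rounding, and the margin of error should equal half the width of that interval.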
1.0.6 4. Story & Explanation
Since the p-value of the two-sample t-test is below 0.05 and the confidence interval does not
include 0, we reject the null hypothesis: the callback rates of the two samples differ. Because
the names were assigned to the resumes at random, the gap points to racial discrimination against
job seekers with similar backgrounds. However, this analysis does not consider the other variables
in the dataset, so the sound of a name might not be the only factor contributing to the difference
in callbacks between the two groups.
1.0.7 5. Does your analysis mean that race/name is the most important factor in callback
success? Why or why not? If not, how would you amend your analysis?
Not necessarily. The analysis above indicates that the sound of a name (race) has a significant
effect on the callback rate. However, it says nothing about whether other variables are also
significant or whether race is the most important factor. To understand the relationship between
callbacks and the other variables, we could fit a regression model (possibly with a LASSO penalty)
or run PCA as further analysis.
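As one possible starting point for that follow-up, the sketch below fits a logistic regression of call on race plus a few resume attributes using the statsmodels formula API. It is an assumption-laden illustration, not part of the original analysis: the function name is hypothetical, logistic regression is chosen here because call is binary, and the default covariates (education, yearsexp, honors) are simply columns visible in the data.head() output.

```python
import pandas as pd
import statsmodels.formula.api as smf

def fit_callback_logit(df: pd.DataFrame,
                       formula: str = "call ~ C(race) + education + yearsexp + honors"):
    """Fit a logistic regression for the binary callback outcome.

    The default formula is only an illustrative choice of covariates.
    """
    model = smf.logit(formula, data=df)
    return model.fit(disp=0)  # disp=0 silences the optimizer output
```

With the notebook's DataFrame this would be called as `fit_callback_logit(data)`, and `.summary()` on the result lists the coefficient and p-value for each term, so the race effect can be compared against the other variables in the model.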