MULTIPLE REGRESSION
WEEK3
FALL 2024
ESTIMATION
While sea waves might look like an almost
random movement, in every moment and location
the basic laws of hydrodynamics and gravity
hold without change.
PARALLELS WITH SIMPLE
REGRESSION
y = b0 + b1x1 + b2x2 + … + bkxk + u
➢ b0 is still the intercept
➢ b1 through bk are all called slope parameters
➢ u is still the error term (or disturbance)
➢ Still need to make a zero conditional mean
assumption, so now assume that
E(u|x1,x2, …,xk) = 0
➢ Still minimizing the sum of squared
residuals.
INTERPRETING MULTIPLE
REGRESSION COEFFICIENTS
𝑦𝑖 = 𝛼 + 𝛽1 𝑥1𝑖 + 𝛽2 𝑥2𝑖 + 𝑢𝑖
S = Σ (yi − a − b1 x1i − b2 x2i)²
∂S/∂a = 0,   ∂S/∂b1 = 0,   ∂S/∂b2 = 0
First we write out S, the sum of squared residuals, and then we use the first-order conditions for
minimizing it.
MULTIPLE REGRESSION WITH TWO EXPLANATORY
VARIABLES: DERIVING COEFFICIENTS
a = ȳ − b1 x̄1 − b2 x̄2
b1 = [Cov(x1, y) Var(x2) − Cov(x2, y) Cov(x1, x2)] / [Var(x1) Var(x2) − Cov(x1, x2)²]
(the expression for b2 is the same with the subscripts 1 and 2 interchanged)
We thus obtain three equations in three unknowns. Solving for a, b1, and b2,
we obtain the expressions shown above.
The expression for a is a straightforward extension of the expression for it in
simple regression analysis.
However, the expressions for the slope coefficients are considerably more
complex than that for the slope coefficient in simple regression analysis.
MULTIPLE REGRESSION WITH TWO EXPLANATORY
VARIABLES: DERIVING COEFFICIENTS
a = ȳ − b1 x̄1 − b2 x̄2
𝑌 = 𝑋𝛽 + 𝑢
where
Y = (y1, y2, …, yn)′ is the n×1 vector of observations on the dependent variable,
X is the n×(k+1) matrix whose i-th row is (1, x1i, x2i, …, xki),
β = (β0, β1, …, βk)′ is the (k+1)×1 vector of parameters, and
u = (u1, u2, …, un)′ is the n×1 vector of disturbances.
OLS ESTIMATES
β̂ = (X′X)⁻¹ X′Y        Var(β̂) = σ²(X′X)⁻¹
OLS ESTIMATION
MATRIX APPROACH
X′X = [  n      Σx1     Σx2    …  Σxk
         Σx1    Σx1²    Σx1x2  …  Σx1xk
         Σx2    Σx1x2   Σx2²   …  Σx2xk
         ⋮
         Σxk    Σx1xk   Σx2xk  …  Σxk²  ]

X′Y = ( Σy,  Σx1y,  Σx2y,  …,  Σxky )′

OLS ESTIMATES
β̂ = (X′X)⁻¹ X′Y        Var(β̂) = σ²(X′X)⁻¹
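The following sketch (not part of the original lecture) illustrates these matrix formulas numerically with numpy; the data are simulated and all variable names are illustrative.

import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # first column of ones for the intercept
beta = np.array([1.0, 0.5, -0.3])
y = X @ beta + rng.normal(scale=2.0, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y                  # beta-hat = (X'X)^(-1) X'Y
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - k - 1)      # s^2 = SSR / (n - k - 1)
var_beta_hat = sigma2_hat * XtX_inv           # estimated Var(beta-hat) = s^2 (X'X)^(-1)
se = np.sqrt(np.diag(var_beta_hat))
print(beta_hat, se)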
MULTIPLE REGRESSION WITH TWO
EXPLANATORY VARIABLES: EXAMPLE
Hourly earnings, EARNINGS, depend on highest grade completed,
HGC, and a measure of ability, ASVABC.
Here is the regression output for the earnings function, estimated using the data set.
. reg earnings hgc asvabc
------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | .7390366 .1606216 4.601 0.000 .4235506 1.054523
asvabc | .1545341 .0429486 3.598 0.000 .0701764 .2388918
_cons | -4.624749 2.0132 -2.297 0.022 -8.578989 -.6705095
------------------------------------------------------------------------------
EARNINGS = −4.62 + 0.74 HGC + 0.15 ASVABC   (fitted equation)
It indicates that hourly earnings increase by $0.74 for every extra year of
schooling and by $0.15 for every additional point on the ASVABC score.
MULTIPLE REGRESSION WITH TWO
EXPLANATORY VARIABLES: EXAMPLE
. reg earnings hgc asvabc
------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | .7390366 .1606216 4.601 0.000 .4235506 1.054523
asvabc | .1545341 .0429486 3.598 0.000 .0701764 .2388918
_cons | -4.624749 2.0132 -2.297 0.022 -8.578989 -.6705095
------------------------------------------------------------------------------
EARNINGS = −4.62 + 0.74 HGC + 0.15 ASVABC   (fitted equation)
Literally, the intercept indicates that an individual who had no schooling and an
ASVABC score of zero would have hourly earnings of -$4.62.
Obviously, this is impossible. The lowest value of HGC in the sample was 6, and
the lowest ASVABC score was 22. We have obtained a nonsense estimate
because we have extrapolated too far from the data range.
A “PARTIALLING OUT”
INTERPRETATION
Consider the case where k = 2, i.e. ŷ = β̂0 + β̂1 x1 + β̂2 x2.
β̂1 can then be obtained in two steps: regress x1 on x2 and save the residuals; then regress y on
those residuals. The slope from this second regression equals β̂1, so β̂1 measures the effect of x1
after the influence of x2 has been "partialled out". The following slides illustrate this with the
earnings data.
GRAPHING A RELATIONSHIP IN A MULTIPLE
REGRESSION MODEL
. cor hgc asvabc
(obs=570)
        |      hgc   asvabc
--------+------------------
    hgc |   1.0000
 asvabc |   0.5779   1.0000

[Scatter diagram: hourly earnings ($) plotted against highest grade completed]
Suppose that you were particularly interested in the relationship between EARNINGS
and HGC and wished to represent it graphically, using the sample data.
A simple plot, like the one above, would be misleading.
There appears to be a strong positive relationship, but it is distorted by the fact that
HGC is positively correlated with ASVABC, which also has a positive effect on
EARNINGS.
GRAPHING A RELATIONSHIP IN A
MULTIPLE REGRESSION MODEL
------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .2687432 .035666 7.535 0.000 .1986898 .3387966
_cons | -.359883 1.818571 -0.198 0.843 -3.931829 3.212063
------------------------------------------------------------------------------
To eliminate the distortion, you purge both EARNINGS and HGC of their
components related to ASVABC and then draw a scatter diagram using the
purged variables.
We start by regressing EARNINGS on ASVABC, as shown above. The residuals
are the part of EARNINGS which is not related to ASVABC. The "predict"
command is the Stata command for saving the residuals from the most recent
regression. We name them EEARN.
GRAPHING A RELATIONSHIP IN A MULTIPLE
REGRESSION MODEL
. reg hgc asvabc
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1545378 .0091559 16.879 0.000 .1365543 .1725213
_cons | 5.770845 .4668473 12.361 0.000 4.853888 6.687803
------------------------------------------------------------------------------
Similarly, we regress HGC on ASVABC, as shown above, and save the residuals as EHGC.

[Scatter diagram: EEARN (EARNINGS residuals) plotted against EHGC (HGC residuals)]
As you would expect, the trend line is flatter than in the scatter diagram that did not control for
ASVABC (reproduced in the figure as the gray line).
GRAPHING A RELATIONSHIP IN A MULTIPLE
REGRESSION MODEL
Here is the regression of EEARN on EHGC.
. reg eearn ehgc
Source | SS df MS Number of obs = 570
---------+------------------------------ F( 1, 568) = 21.21
Model | 1256.44239 1 1256.44239 Prob > F = 0.0000
Residual | 33651.2873 568 59.2452241 R-squared = 0.0360
---------+------------------------------ Adj R-squared = 0.0343
Total | 34907.7297 569 61.3492613 Root MSE = 7.6971
------------------------------------------------------------------------------
eearn | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
ehgc | .7390366 .1604802 4.605 0.000 .4238296 1.054244
_cons | -5.99e-09 .3223957 0.000 1.000 -.6332333 .6332333
------------------------------------------------------------------------------
From multiple regression:
------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | .7390366 .1606216 4.601 0.000 .4235506 1.054523
asvabc | .1545341 .0429486 3.598 0.000 .0701764 .2388918
_cons | -4.624749 2.0132 -2.297 0.022 -8.578989 -.6705095
------------------------------------------------------------------------------
A mathematical proof that the technique works requires matrix algebra. We will content
ourselves with verifying that the estimate of the slope coefficient and, equally importantly,
its standard error and t statistic are the same as in the multiple regression (the standard error
differs only marginally because the residual regression does not adjust the degrees of freedom
for the first-stage regressions).
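Below is a minimal sketch (not from the lecture) of the partialling-out procedure just described, written in Python with simulated data standing in for the EARNINGS data set; the variable names mimic those in the Stata example but are otherwise arbitrary.

import numpy as np

rng = np.random.default_rng(1)
n = 570
asvabc = rng.normal(50, 10, n)
hgc = 6 + 0.15 * asvabc + rng.normal(0, 2, n)          # hgc correlated with asvabc
earnings = -4 + 0.7 * hgc + 0.15 * asvabc + rng.normal(0, 8, n)

def ols(y, *xs):
    X = np.column_stack([np.ones(len(y))] + list(xs))
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    return b, y - X @ b                                 # coefficients and residuals

b_multi, _ = ols(earnings, hgc, asvabc)                 # multiple regression
_, eearn = ols(earnings, asvabc)                        # purge earnings of asvabc
_, ehgc = ols(hgc, asvabc)                              # purge hgc of asvabc
b_fwl, _ = ols(eearn, ehgc)                             # regress residuals on residuals

print(b_multi[1], b_fwl[1])                             # the two slope estimates coincide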
SIMPLE VS MULTIPLE
REGRESSION ESTIMATE
Compare the simple regression
ỹ = β̃0 + β̃1 x1
with the multiple regression
ŷ = β̂0 + β̂1 x1 + β̂2 x2
In general, β̃1 = β̂1 + β̂2 δ̃1, where δ̃1 is the slope coefficient from a regression of x2 on x1.
GOODNESS OF FIT
GOODNESS-OF-FIT
We can think of each observation as being made up of an
explained part, and an unexplained part,
yi = ŷi + ûi.  We then define the following:
SST = Σ (yi − ȳ)²    the total sum of squares
SSE = Σ (ŷi − ȳ)²    the explained sum of squares
SSR = Σ ûi²          the residual sum of squares
GOODNESS-OF-FIT
How do we think about how well our
sample regression line fits our sample data?
R2 = SSE/SST = 1 – SSR/SST
GOODNESS-OF-FIT
We can also think of 𝑅2 as being equal to
the squared correlation coefficient between
the actual 𝑦𝑖 and the values 𝑦ො𝑖
R² = [ Σ (yi − ȳ)(ŷi − ȳ̂) ]² / [ Σ (yi − ȳ)² · Σ (ŷi − ȳ̂)² ]
where ȳ̂ is the mean of the ŷi (and equals ȳ when the regression contains an intercept).
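The following sketch (not from the lecture) checks, on simulated data, that SSE/SST, 1 − SSR/SST, and the squared correlation between y and ŷ all give the same R².

import numpy as np

rng = np.random.default_rng(2)
n = 100
x1, x2 = rng.normal(size=(2, n))
y = 1 + 2 * x1 - x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
y_hat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
u_hat = y - y_hat

SST = np.sum((y - y.mean()) ** 2)
SSE = np.sum((y_hat - y.mean()) ** 2)      # explained sum of squares
SSR = np.sum(u_hat ** 2)                   # residual sum of squares

r2_a = SSE / SST
r2_b = 1 - SSR / SST
r2_c = np.corrcoef(y, y_hat)[0, 1] ** 2    # squared correlation of y and y-hat
print(r2_a, r2_b, r2_c)                    # all three agree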
GOODNESS-OF-FIT: EXAMPLE
. reg earnings hgc asvabc
------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | .7390366 .1606216 4.601 0.000 .4235506 1.054523
asvabc | .1545341 .0429486 3.598 0.000 .0701764 .2388918
_cons | -4.624749 2.0132 -2.297 0.022 -8.578989 -.6705095
------------------------------------------------------------------------------
R2 COMPARISON OF DIFFERENT
REGRESSION MODELS
1. y = b0 + b1x1 + u                   R1²
2. y = b0 + b1x1 + b2x2 + u            R2²
3. y = b0 + b2x2 + u                   R3²
4. ln(y) = b0 + b2x2 + u               R4²
▪ For R² comparisons,
  • the number of explanatory variables in the models and
  • the dependent variables of the models must be the same.
▪ Only R1² and R3² are comparable.
▪ Adjusted R², AIC, SC, or other criteria may be used to compare models 1, 2, and 3.
Multiple Regression Analysis
y = b0 + b1x1 + b2x2 + … + bkxk + u
MODEL MISSPECIFICATION
ASSUMPTIONS FOR
UNBIASEDNESS
Population model is linear in parameters:
y = b0 + b1x1 + b2x2 +…+ bkxk + u
We can use a random sample of size n, {(xi1,
xi2,…, xik, yi): i=1, 2, …, n}, from the
population model, so that the sample model
is yi = b0 + b1xi1 + b2xi2 +…+ bkxik + ui
E(u|x1, x2,… xk) = 0, implying that all of the
explanatory variables are exogenous
None of the x’s is constant, and there are no
exact linear relationships among them.
TOO MANY OR TOO FEW VARIABLES
[Illustration: four scatter diagrams with fitted lines that are, respectively, unbiased and efficient;
unbiased but inefficient; biased but efficient; and biased and inefficient]
VARIABLE MISSPECIFICATION I: OMISSION OF A
RELEVANT VARIABLE
Consequences of Variable Misspecification
                          True model
Fitted model              y = α + β1x1 + u            y = α + β1x1 + β2x2 + u
ŷ = a + b1x1
ŷ = a + b1x1 + b2x2

To keep the analysis simple, we will assume that there are only two
possibilities: either y depends only on x1, or it depends on both x1 and x2.
VARIABLE MISSPECIFICATION I: OMISSION OF A
RELEVANT VARIABLE
                          True model
Fitted model              y = α + β1x1 + u            y = α + β1x1 + β2x2 + u
ŷ = a + b1x1 + b2x2       Coefficients are unbiased    Correct specification,
                          but inefficient              no problems
MULTIPLE REGRESSION
ANALYSIS
y = b0 + b1x1 + b2x2 + … + bkxk + u
MODEL MISSPECIFICATION I:
OMITTED VARIABLE BIAS
VARIABLE MISSPECIFICATION I: OMISSION OF A
RELEVANT VARIABLE
True model:   y = α + β1x1 + β2x2 + u        Fitted model:   ŷ = a + b1x1

E(b1) = β1 + β2 · Cov(x1, x2) / Var(x1)

[Path diagram: x1 has a direct effect b1 on y, holding x2 constant; x2 has an effect b2 on y;
because x1 is correlated with x2, x1 also has an apparent effect on y, acting as a mimic for x2]

In the present case, the omission of x2 causes b1 to be biased by an amount
β2 · Cov(x1, x2) / Var(x1). We will demonstrate this first intuitively and then mathematically.
The intuitive reason is that, in addition to its direct effect β1, x1 has an apparent indirect effect
as a consequence of acting as a proxy for the missing x2.
VARIABLE MISSPECIFICATION I: OMISSION OF A
RELEVANT VARIABLE
The strength of the proxy effect depends on two factors: the strength of the effect of
x2 on y, which is given by β2, and the ability of x1 to mimic x2.
VARIABLE MISSPECIFICATION I: OMISSION OF A
RELEVANT VARIABLE
True model:   y = α + β1x1 + β2x2 + u        Fitted model:   ŷ = a + b1x1

Since x2 is omitted, b1 = Cov(x1, y) / Var(x1). Substituting for y,

b1 = β1 + β2 · Cov(x1, x2) / Var(x1) + Cov(x1, u) / Var(x1)

E(b1) = E[ β1 + β2 · Cov(x1, x2) / Var(x1) + Cov(x1, u) / Var(x1) ]
      = E(β1) + E[ β2 · Cov(x1, x2) / Var(x1) ]
      = β1 + β2 · Cov(x1, x2) / Var(x1)

(The Cov(x1, u) term has expectation zero, and the other terms are unaffected by taking
expectations because x1 and x2 are treated as nonstochastic.)
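A small simulation (not from the lecture) can illustrate the bias formula: across many replications, the average of b1 from the misspecified regression is close to β1 + β2 · Cov(x1, x2)/Var(x1). All numbers below are purely illustrative.

import numpy as np

rng = np.random.default_rng(3)
n, reps = 200, 2000
beta1, beta2 = 0.5, 0.8
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(scale=0.8, size=n)          # x2 correlated with x1, fixed across replications

b1_omit = np.empty(reps)
for r in range(reps):
    y = 1 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    b1_omit[r] = np.cov(x1, y, bias=True)[0, 1] / np.var(x1)   # slope when x2 is omitted

implied = beta1 + beta2 * np.cov(x1, x2, bias=True)[0, 1] / np.var(x1)
print(b1_omit.mean(), implied)             # the two numbers are close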
VARIABLE MISSPECIFICATION I: OMISSION OF A RELEVANT VARIABLE: An Example
In this example the dependent variable is educational attainment, HGC, and the explanatory
variables are ability, ASVABC, and mother's education, HGCM. The two explanatory variables are
positively correlated, so omitting either one will bias the coefficient of the other. (The multiple
regression shown below has R-squared = 0.3561, F(2, 567) = 156.81, and Root MSE = 1.9805.)

. cor hgcm asvabc
(obs=570)
        |    hgcm  asvabc
--------+------------------
   hgcm |  1.0000
 asvabc |  0.3819  1.0000
. reg hgc asvabc hgcm
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1381062 .0097494 14.166 0.000 .1189567 .1572556
hgcm | .154783 .0350728 4.413 0.000 .0858946 .2236715
_cons | 4.791277 .5102431 9.390 0.000 3.78908 5.793475
------------------------------------------------------------------------------
. reg hgc asvabc
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1545378 .0091559 16.879 0.000 .1365543 .1725213
_cons | 5.770845 .4668473 12.361 0.000 4.853888 6.687803
------------------------------------------------------------------------------

. reg hgc hgcm
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgcm | .3445198 .0376833 9.142 0.000 .2705041 .4185354
_cons | 9.506491 .4495754 21.145 0.000 8.623458 10.38952
------------------------------------------------------------------------------
In this case, the bias is quite dramatic. The coefficient of HGCM has
more than doubled. (The reason for the bigger effect is that Var(HGCM)
is much smaller than Var(ASVABC), while b1 and b2 are similar in size,
judging by their estimates.)
SUMMARY OF DIRECTION OF BIAS
True model:   y = α + β1x1 + u        or        y = α + β1x1 + β2x2 + u
Fitted model: ŷ = a + b1x1 + b2x2
  – If the true model is y = α + β1x1 + β2x2 + u: correct specification, no problems.
  – If the true model is y = α + β1x1 + u: the coefficients are unbiased (in general) but
    inefficient, and the standard errors are valid (in general).

Rewriting the true model as y = α + β1x1 + 0·x2 + u,

σ²b1 = [ σ²u / (n Var(x1)) ] × [ 1 / (1 − r²x1,x2) ]
These results can be demonstrated quickly.
Rewrite the true model, adding x2 as an explanatory variable with a coefficient
of 0. Now the true model and the fitted model coincide. Hence b1 will be an
unbiased estimator of β1, and b2 will be an unbiased estimator of 0.
However, the population variance of b1 will be larger than it would have been if
the correct simple regression had been run, because it includes the factor
1 / (1 − r²x1,x2).
Therefore, the estimator of β1 using the multiple regression model will be less
efficient than the alternative using the simple regression model.
VARIABLE MISSPECIFICATION II: INCLUSION OF AN
IRRELEVANT VARIABLE
True model:   y = α + β1x1 + u
Fitted model: ŷ = a + b1x1 + b2x2
Rewriting the true model as y = α + β1x1 + 0·x2 + u,
σ²b1 = [ σ²u / (n Var(x1)) ] × [ 1 / (1 − r²x1,x2) ]
The intuitive reason for this is that the simple regression model exploits the
information that x2 should not be in the regression. In contrast, with the
multiple regression model, you find this out from the regression results.
The standard errors remain valid because the model is formally correctly
specified. Still, they will tend to be larger than those obtained in a simple
regression, reflecting the loss of efficiency.
These are the results in general. Note that if x1 and x2 are uncorrelated, there
will be no loss of efficiency after all.
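The following simulation sketch (not from the lecture) illustrates these two results: with an irrelevant x2 that is correlated with x1, b1 remains unbiased but its sampling variance is larger than in the correctly specified simple regression. The numbers are illustrative only.

import numpy as np

rng = np.random.default_rng(4)
n, reps, beta1 = 200, 2000, 0.5
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(scale=0.7, size=n)          # irrelevant but correlated with x1
X_simple = np.column_stack([np.ones(n), x1])
X_multi = np.column_stack([np.ones(n), x1, x2])

b_simple = np.empty(reps)
b_multi = np.empty(reps)
for r in range(reps):
    y = 1 + beta1 * x1 + rng.normal(size=n)            # true model excludes x2
    b_simple[r] = np.linalg.lstsq(X_simple, y, rcond=None)[0][1]
    b_multi[r] = np.linalg.lstsq(X_multi, y, rcond=None)[0][1]

print(b_simple.mean(), b_multi.mean())                 # both close to 0.5 (unbiased)
print(b_simple.var(), b_multi.var())                   # multiple-regression variance is larger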
VARIABLE MISSPECIFICATION II: INCLUSION OF AN
IRRELEVANT VARIABLE
Here the dependent variable is LGEARN, the logarithm of hourly earnings.
. reg lgearn hgc asvabc
------------------------------------------------------------------------------
lgearn | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | .0544266 .0099018 5.497 0.000 .034978 .0738753
asvabc | .0114733 .0026476 4.333 0.000 .0062729 .0166736
_cons | 1.118832 .124107 9.015 0.000 .8750665 1.362598
------------------------------------------------------------------------------
VARIABLE MISSPECIFICATION II: INCLUSION OF AN
IRRELEVANT VARIABLE
. reg lgearn hgc asvabc hgcm hgcf
------------------------------------------------------------------------------
lgearn | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | .0511811 .0101812 5.027 0.000 .0311835 .0711788
asvabc | .010444 .0027481 3.800 0.000 .0050463 .0158417
hgcm | .0071835 .0102695 0.699 0.485 -.0129876 .0273547
hgcf | .004794 .0076389 0.628 0.531 -.0102101 .0197981
_cons | 1.073972 .1324621 8.108 0.000 .8137933 1.33415
------------------------------------------------------------------------------
Now add the parental education variables, HGCM and HGCF. These
variables are determinants of educational attainment and indirectly
affect earnings, but there is no evidence that they have any
additional direct effect on earnings.
The fact that the t statistics of both variables are low is evidence that
they are probably irrelevant.
VARIABLE MISSPECIFICATION II: INCLUSION OF AN
IRRELEVANT VARIABLE
. reg lgearn hgc asvabc
------------------------------------------------------------------------------
lgearn | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | .0544266 .0099018 5.497 0.000 .034978 .0738753
asvabc | .0114733 .0026476 4.333 0.000 .0062729 .0166736
_cons | 1.118832 .124107 9.015 0.000 .8750665 1.362598
------------------------------------------------------------------------------
------------------------------------------------------------------------------
lgearn | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | .0511811 .0101812 5.027 0.000 .0311835 .0711788
asvabc | .010444 .0027481 3.800 0.000 .0050463 .0158417
hgcm | .0071835 .0102695 0.699 0.485 -.0129876 .0273547
hgcf | .004794 .0076389 0.628 0.531 -.0102101 .0197981
_cons | 1.073972 .1324621 8.108 0.000 .8137933 1.33415
------------------------------------------------------------------------------
VARIABLE MISSPECIFICATION II: INCLUSION OF AN
IRRELEVANT VARIABLE
. cor hgc asvabc hgcm hgcf
(obs=570)
        |    hgc  asvabc    hgcm    hgcf
--------+------------------------------------
    hgc | 1.0000
 asvabc | 0.5779  1.0000
   hgcm | 0.3582  0.3819  1.0000
   hgcf | 0.4066  0.4179  0.6391  1.0000

. reg lgearn hgc asvabc
------------------------------------------------------------------------------
lgearn | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | .0544266 .0099018 5.497 0.000 .034978 .0738753
asvabc | .0114733 .0026476 4.333 0.000 .0062729 .0166736
_cons | 1.118832 .124107 9.015 0.000 .8750665 1.362598
------------------------------------------------------------------------------

. reg lgearn hgc asvabc hgcm hgcf
------------------------------------------------------------------------------
lgearn | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | .0511811 .0101812 5.027 0.000 .0311835 .0711788
asvabc | .010444 .0027481 3.800 0.000 .0050463 .0158417
hgcm | .0071835 .0102695 0.699 0.485 -.0129876 .0273547
hgcf | .004794 .0076389 0.628 0.531 -.0102101 .0197981
_cons | 1.073972 .1324621 8.108 0.000 .8137933 1.33415
------------------------------------------------------------------------------
Because HGCM and HGCF are correlated with HGC and ASVABC, including them increases the
standard errors of the HGC and ASVABC coefficients, reflecting the loss of efficiency noted above.
MODEL MISSPECIFICATION:
PROXY VARIABLES
PROXY VARIABLES
SOLVING THE OMITTED VARIABLE BIAS PROBLEM
y = α + β1x1 + β2x2 + … + βk xk + u
x1 = λ + μz
Suppose that a variable y is hypothesized to depend on a set of explanatory
variables x1, ..., xk as shown above, and suppose there are no data on x1
for some reason.
Suppose, however, that x1 is exactly related to an observable variable z, as shown, so that z can
act as a proxy for x1. Substituting for x1,
y = α + β1(λ + μz) + β2x2 + … + βk xk + u
  = (α + β1λ) + β1μz + β2x2 + … + βk xk + u
PROXY VARIABLES
y = α + β1x1 + β2x2 + … + βk xk + u        x1 = λ + μz
y = (α + β1λ) + β1μz + β2x2 + … + βk xk + u
1. The estimates of the coefficients of x2, ..., xk will be the same as those
that would have been obtained if it had been possible to regress y on
x1, ..., xk.
2. The standard errors and t statistics of the coefficients of x2, ..., xk will
be the same as those that would have been obtained if it had been
possible to regress y on x1, ..., xk.
3. R2 will be the same as it would have been if it had been possible to
regress y on x1, ..., xk.
4. The coefficient of z will be an estimate of β1μ, so it will not be possible to
obtain an estimate of β1 unless you can guess the value of μ.
5. However, the t statistic for z will be the same as that which would
have been obtained for x1 if it had been possible to regress y on x1, ..., xk,
and so you can assess the significance of x1, even if you are not able to
estimate its coefficient.
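A minimal simulation sketch (not from the lecture) of points 1-5, assuming an exact proxy relationship x1 = λ + μz; all parameter values are illustrative.

import numpy as np

rng = np.random.default_rng(5)
n = 300
lam, mu, beta1, beta2 = 2.0, 1.5, 0.8, -0.4
z = rng.normal(size=n)
x1 = lam + mu * z                                      # exact proxy relationship
x2 = rng.normal(size=n)
y = 1 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)

def fit(y, *xs):
    X = np.column_stack([np.ones(len(y))] + list(xs))
    b, ssr = np.linalg.lstsq(X, y, rcond=None)[:2]
    return b, ssr[0]

b_true, ssr_true = fit(y, x1, x2)                      # regression on x1, x2
b_proxy, ssr_proxy = fit(y, z, x2)                     # regression on z, x2

print(b_true[2], b_proxy[2])                           # coefficient of x2 is identical
print(ssr_true, ssr_proxy)                             # identical fit, hence identical R-squared
print(b_proxy[1], beta1 * mu)                          # slope on z estimates beta1 * mu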
PROXY VARIABLES
. reg hgc asvabc hgcm hgcf
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcm | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .1102684 .0311948 3.535 0.000 .0489967 .1715401
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
PROXY VARIABLES
. reg hgc asvabc
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1545378 .0091559 16.879 0.000 .1365543 .1725213
_cons | 5.770845 .4668473 12.361 0.000 4.853888 6.687803
------------------------------------------------------------------------------
. reg hgc asvabc hgcm hgcf library siblings
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1277852 .010054 12.710 0.000 .1080373 .147533
hgcm | .0619975 .0427558 1.450 0.148 -.0219826 .1459775
hgcf | .1045035 .0314928 3.318 0.001 .042646 .166361
library | .1151269 .1969844 0.584 0.559 -.2717856 .5020394
siblings | -.0509486 .039956 -1.275 0.203 -.1294293 .027532
_cons | 5.236995 .5665539 9.244 0.000 4.124181 6.349808
------------------------------------------------------------------------------
There is a tendency for parents who are ambitious for their children
to limit their number, so SIBLINGS should be expected to have a
negative coefficient. It does, but it is also insignificant.
EXAMPLE
Coffeebc = β0 + β1 Pbc + β2 Ptea + β3 Yd + e
Expected signs: β1 < 0, β2 > 0, β3 > 0
where Coffeebc is the demand for Brazilian coffee, Pbc is the price of Brazilian coffee, Ptea is the
price of tea, and Yd is disposable income.
EXAMPLE
Coffeebc = 9.1 + 7.8 Pbc + 2.4 Ptea + 0.0035 Yd
                (15.6)    (1.2)      (0.0010)
           t =   0.5       2.0        3.5
R̄² = 0.60      N = 25
EXAMPLE
➢ If you think there is a possibility that the demand
for Brazilian coffee is price-inelastic (that is, that the
coefficient of Pbc is zero), you might decide to run the
same equation without the price variable and compare the two sets of results.
EXAMPLE
➢ By comparing two equations, we can apply our four
specification criteria for the inclusion of a variable in
an equation
1. Theory: If it’s possible that the demand for coffee could be
price-inelastic, the theory behind dropping the variable seems
plausible.
2. t-test: The t-score of the possibly irrelevant variable is 0.5,
insignificant at any conventional level.
3. R̄²: R̄² increases when the variable is dropped, indicating
that the variable is irrelevant.
4. Bias: The remaining coefficients change only slightly when
Pbc is dropped, suggesting that at most slight bias is caused by excluding
the variable.
EXAMPLE
➢ Based upon this analysis, you might conclude that the
demand for Brazilian coffee is indeed price-inelastic and
that the variable is irrelevant and should be dropped from
the model.
➢ However, this conclusion would be unwarranted.
➢ The elasticity of demand for coffee, in general, might be
pretty low (the evidence suggests that it is inelastic only
over a particular range of prices); it is hard to believe that
Brazilian coffee is immune to price competition from other
kinds of coffee.
➢ Indeed, one would expect quite a bit of sensitivity in the
demand for Brazilian coffee with respect to the price of, for
example, Colombian coffee.
EXAMPLE
➢ To test this hypothesis, Pcc, the price of Colombian coffee, should
be added to the first equation; the demand for Brazilian coffee is
expected to be a positive function of Pcc.
EXAMPLE
➢ By comparing the first and last equations, we can once again
apply our four specification criteria:
1. Theory: The model should always have included both
prices; their logical justification is quite strong.
2. t-Test: The t-score of the new variable, the price of
Colombian coffee, is 2.0, significant at most levels.
3. R̄²: R̄² increases when the variable is added, indicating that
the variable was incorrectly omitted.
4. Bias: Although two of the coefficients remain virtually
unchanged, indicating that the correlations between these
variables and the price of Colombian coffee variable are
low, the coefficient for the price of Brazilian coffee does
change significantly, indicating bias in the original result.
EXAMPLE
➢ Theoretical considerations should never be discarded,
even in the face of statistical insignificance.
➢ If a variable known to be extremely important from a
theoretical point of view turns out to be statistically
insignificant in a particular sample, that variable
should be left in the equation even though it makes
the results look bad.
➢ The more thinking done before the first regression is
run, and the fewer alternative specifications
estimated, the better the regression results will likely
be.
Multiple Regression Analysis
y = b0 + b1x1 + b2x2 + … + bkxk + u
MULTICOLLINEARITY
OLS ASSUMPTIONS
1. Assumptions on regressors
a. Fixed - nonstochastic regressors
b. No multicollinearity
2. Assumptions on the disturbances
a. Random disturbances have zero mean E[ui] = 0
b. Homoskedasticity: Var(ui) = σ²
c. No serial correlation: Cov(ui, uj) = 0 for i ≠ j
3. Assumptions on model and its parameters
a. Constant parameters
b. Linear model
4. Assumption on the probability distribution
a. Normal distribution: u ~ N(0, σ²)
THE GAUSS-MARKOV THEOREM
➢ Given Gauss-Markov Assumptions it can
be shown that OLS is “BLUE”
➢ Best
➢ Linear
➢ Unbiased
➢ Estimator
VARIANCE OF OLS
Given the Gauss-Markov assumptions,

Var(β̂j) = σ² / [ SSTj (1 − R²j) ],   where

SSTj = Σ (xij − x̄j)²  and  R²j is the R² from regressing xj on all of the other x's.

σ̂² = Σ ûi² / (n − k − 1) ≡ SSR / df

Thus  se(β̂j) = σ̂ / [ SSTj (1 − R²j) ]^(1/2)
➢ df = n – (k + 1), or df = n – k – 1
➢ df (i.e. degrees of freedom) is the
(number of observations) – (number of
estimated parameters)
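The following sketch (not from the lecture) verifies numerically, on simulated data, that σ̂²/[SSTj(1 − R²j)] reproduces the corresponding diagonal element of σ̂²(X′X)⁻¹.

import numpy as np

rng = np.random.default_rng(6)
n, k = 150, 2
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)                       # correlated regressors
y = 1 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

b = np.linalg.lstsq(X, y, rcond=None)[0]
u = y - X @ b
s2 = u @ u / (n - k - 1)
var_matrix = s2 * np.linalg.inv(X.T @ X)                 # matrix formula

# formula for the variance of the coefficient of x1
SST1 = np.sum((x1 - x1.mean()) ** 2)
r = np.corrcoef(x1, x2)[0, 1]                            # with one other regressor, R_1^2 is the squared correlation
var_formula = s2 / (SST1 * (1 - r ** 2))
print(var_matrix[1, 1], var_formula)                     # the two agree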
COMPONENTS OF OLS
VARIANCES
➢ The error variance: a larger s2 implies a
larger variance for the OLS estimators
➢ The total sample variation: a larger SSTj
implies a smaller variance for the
estimators
➢ Linear relationships among the
independent variables: a larger Rj2
implies a larger variance for the estimators
MULTICOLLINEARITY
𝑦 = 𝛼 + 𝛽1 𝑥1 + 𝛽2 𝑥2 + 𝑢 𝑥2 = 𝜆 + 𝜇𝑥1
b1 = [ Cov(x1, y) Var(x2) − Cov(x2, y) Cov(x1, x2) ] / [ Var(x1) Var(x2) − Cov(x1, x2)² ]
What would happen if you tried to run a regression when there is an exact linear
relationship among the explanatory variables?
We will investigate, using the model with two explanatory variables shown above.
MULTICOLLINEARITY
𝑦 = 𝛼 + 𝛽1 𝑥1 + 𝛽2 𝑥2 + 𝑢 𝑥2 = 𝜆 + 𝜇𝑥1
Substituting x2 = λ + μx1 gives Cov(x1, x2) = μ Var(x1), Var(x2) = μ² Var(x1), and
Cov(x2, y) = μ Cov(x1, y). It turns out that both the numerator and the denominator are then
equal to zero, so the regression coefficient is not defined.
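A short numerical sketch (not from the lecture), using simulated data, of what goes wrong: with an exact linear relationship between the regressors, X′X is singular and cannot be inverted.

import numpy as np

rng = np.random.default_rng(7)
n = 50
x1 = rng.normal(size=n)
x2 = 3.0 + 2.0 * x1                                      # exact linear relationship
X = np.column_stack([np.ones(n), x1, x2])

print(np.linalg.matrix_rank(X.T @ X))                    # 2 rather than 3: X'X is singular
print(np.linalg.cond(X.T @ X))                           # condition number is astronomically large
# attempting np.linalg.inv(X.T @ X) would either raise LinAlgError or return meaningless,
# explosively large numbers; either way the OLS coefficients are not defined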
DETECTING
MULTICOLLINEARITY
➢ It is unusual for there to be an exact relationship among the
explanatory variables in a regression. When this occurs, it is
typically because of a logical error in the specification.
➢ How can we measure the multicollinearity in the regression
equation?
One set of diagnostics is based on the eigenvalues λ1 > λ2 > … > λk of the correlation matrix R
of the regressors:
det(R) = λ1 · λ2 · … · λk        trace(R) = λ1 + λ2 + … + λk
If the regressors are nearly collinear, the smallest eigenvalues are close to zero:
λs ≈ λs+1 ≈ … ≈ λk ≈ 0
Condition Index: κ = (λmax / λmin)^(1/2)

Common symptoms of multicollinearity:
1. High R² and F values but insignificant coefficient estimates
2. Unexpected coefficient signs and values
3. Condition index > 30
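The sketch below (not from the lecture) computes these diagnostics on simulated data: the eigenvalues of the correlation matrix of the regressors, the condition index (using the square-root definition assumed above), and, as a closely related diagnostic, the variance inflation factors 1/(1 − R²j).

import numpy as np

rng = np.random.default_rng(8)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)                  # severe but not exact collinearity
x3 = rng.normal(size=n)
Z = np.column_stack([x1, x2, x3])

R = np.corrcoef(Z, rowvar=False)                         # correlation matrix of the regressors
eigvals = np.linalg.eigvalsh(R)
kappa = np.sqrt(eigvals.max() / eigvals.min())           # condition index
print(np.sort(eigvals)[::-1], kappa)

# VIF for each regressor: regress x_j on the others and compute 1/(1 - R_j^2)
for j in range(Z.shape[1]):
    others = np.column_stack([np.ones(n), np.delete(Z, j, axis=1)])
    fitted = others @ np.linalg.lstsq(others, Z[:, j], rcond=None)[0]
    r2_j = 1 - np.sum((Z[:, j] - fitted) ** 2) / np.sum((Z[:, j] - Z[:, j].mean()) ** 2)
    print(j, 1 / (1 - r2_j))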
ALLEVIATING MULTICOLLINEARITY
PROBLEM
pop.var(b1) = [ σ²u / (n Var(x1)) ] × [ 1 / (1 − r²x1,x2) ]
What can you do about this problem if you encounter it? We will look at the
model with two explanatory variables, whose slope-coefficient variance is shown above.
Before doing anything, two important points should be emphasized.
First, multicollinearity does not cause the regression coefficients to be
biased. Their probability distributions are still centered over the actual values
if the regression specification is correct, but they have unsatisfactorily
large variances.
Second, the standard errors and t-tests remain valid. The standard errors
are larger than they would have been without multicollinearity, warning us
that the regression estimates are erratic.
Since the problem of multicollinearity is caused by the population variances
of the coefficients being unsatisfactorily large, we will seek ways of reducing
the variances.
ALLEVIATING
MULTICOLLINEARITY PROBLEM
Possible measures for alleviating multicollinearity follow from the components of
pop.var(b1) = [ σ²u / (n Var(x1)) ] × [ 1 / (1 − r²x1,x2) ]:
1. Reduce σ²u, for example by including further relevant variables in the model.
2. Increase the number of observations, n.
3. Increase Var(x1), the variation in the explanatory variable.
4. Reduce r x1,x2, the correlation between the explanatory variables.
(Further measures, including imposing theoretical or empirical restrictions on the parameters,
are discussed below.)
ALLEVIATING
MULTICOLLINEARITY PROBLEM
Possible measures for alleviating multicollinearity (continued)

7. Empirical restriction. Suppose the model is
   y = α + β1 x + β2 p + u
   and x and p are highly correlated. If an estimate b1′ of β1 can be obtained from another
   data set, for example from the regression
   y′ = α′ + β1′ x′ + u        ŷ′ = a′ + b1′ x′,
   then the restriction can be imposed by forming
   z = y − b1′ x = α + β2 p + u
   and regressing z on p, removing x from the equation to be estimated.
. reg hgc asvabc hgcm hgcf
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcm | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .1102684 .0311948 3.535 0.000 .0489967 .1715401
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
A one-point increase in ASVABC increases HGC by 0.13 years.
HGC increases by 0.07 years for every extra year of schooling of the mother and
0.11 years for every additional year of schooling of the father.
Mother's education is generally held to be at least, if not more, important than
father's education for educational attainment, so this outcome is unexpected.
ALLEVIATING MULTICOLLINEARITY PROBLEM
. reg hgc asvabc hgcm hgcf
Source | SS df MS Number of obs = 570
---------+------------------------------ F( 3, 566) = 110.83
Model | 1278.24153 3 426.080508 Prob > F = 0.0000
Residual | 2176.00584 566 3.84453329 R-squared = 0.3700
---------+------------------------------ Adj R-squared = 0.3667
Total | 3454.24737 569 6.07073351 Root MSE = 1.9607
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcm | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .1102684 .0311948 3.535 0.000 .0489967 .1715401
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
Theoretical restriction 𝛽2 = 𝛽3
ALLEVIATING MULTICOLLINEARITY PROBLEM
. g hgcp=hgcm+hgcf
. reg hgc asvabc hgcp
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295653 .0099485 13.024 0.000 .1100249 .1491057
hgcp | .093741 .0165688 5.658 0.000 .0611973 .1262847
_cons | 4.823123 .4844829 9.955 0.000 3.871523 5.774724
------------------------------------------------------------------------------
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcm | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .1102684 .0311948 3.535 0.000 .0489967 .1715401
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
The standard error of HGCP is much smaller than those of HGCM and
HGCF. The restriction has led to a large gain in efficiency, and the
multicollinearity problem has been eliminated.
The t statistic is very high. Thus, imposing the restriction has improved
the regression results. However, the restriction may not be valid. We
should test it. Testing theoretical restrictions is one of the topics discussed later.
Multiple Regression Analysis
y = b0 + b1x1 + b2x2 + … + bkxk + u
pop.var(b1) = [ σ²u / (n Var(x1)) ] × [ 1 / (1 − r²x1,x2) ]
This sequence investigates the population variances and standard errors of the
slope coefficients in a model with two explanatory variables.
The expression for the population variance of b1 is shown above. The
expression for b2 is the same, with the subscripts 1 and 2 interchanged.
The first factor in the expression is identical to that for the population variance
of the slope coefficient in a simple regression model.
The population variance of b1 depends on the population variance of the
disturbance term, the number of observations, and the variance of x1 for the
same reasons as in a simple regression model.
PRECISION OF THE MULTIPLE
REGRESSION COEFFICIENTS
y = α + β1x1 + β2x2 + u        ŷi = a + b1x1i + b2x2i
pop.var(b1) = [ σ²u / (n Var(x1)) ] × [ 1 / (1 − r²x1,x2) ]

The population variance of u has to be estimated. The sample variance of the
residuals, Var(e), provides a consistent estimator, but it is biased downwards by a factor
(n − k − 1)/n in a finite sample, where k is the number of explanatory variables.
An unbiased estimator is therefore

s²u = [ n / (n − k − 1) ] · Var(e)

Thus the expression

s.e.(b1) = { [ s²u / (n Var(x1)) ] × [ 1 / (1 − r²x1,x2) ] }^(1/2)

estimates the standard deviation of the probability distribution of b1, known as the standard
error of b1 for short.
ASSUMPTIONS OF THE CLASSICAL
LINEAR MODEL (CLM)
➢ So far, we know that given the Gauss-
Markov assumptions, OLS is BLUE,
➢ In order to do classical hypothesis testing,
we need to add another assumption (beyond
the Gauss-Markov assumptions)
➢ Assume that u is independent of x1, x2,…, xk
and u is normally distributed with zero
mean and variance σ²: u ~ Normal(0, σ²)
CLM ASSUMPTIONS (cont)
➢ Under CLM, OLS is not only BLUE, but is
the minimum variance unbiased estimator
➢ We can summarize the population
assumptions of CLM as follows
➢ y|x ~ Normal(b0 + b1x1 +…+ bkxk, σ²)
➢ While for now we just assume normality, it is clear that this is sometimes not the case
➢ Large samples will let us drop the normality assumption
The homoskedastic normal distribution with
a single explanatory variable
[Figure: the conditional distribution f(y|x) is normal around the population regression line
E(y|x) = b0 + b1x, with the same variance at every value of x (illustrated at x1 and x2)]
NORMAL SAMPLING
DISTRIBUTIONS
Under the CLM assumptions, conditional on
the sample values of the independent
variables
β̂j ~ Normal( βj, Var(β̂j) ),  so that
(β̂j − βj) / sd(β̂j) ~ Normal(0, 1)
β̂j is distributed normally because it is a linear combination of the errors.
The t Test
Under the CLM assumptions
(β̂j − βj) / se(β̂j) ~ t(n − k − 1)
Note this is a t distribution (vs normal) because we have to estimate σ² by σ̂².
Note the degrees of freedom: n − k − 1
The t Test
➢ Knowing the sampling distribution for the
standardized estimator allows us to carry
out hypothesis tests
➢ Start with a null hypothesis
➢ For example, H0: bj=0
➢ If we accept the null, then we accept that xj does not
affect y, controlling for the other x’s
The t Test
To perform our test we first need to form
"the" t statistic for β̂j:   t(β̂j) ≡ β̂j / se(β̂j)
We will then use our t statistic along with a rejection rule to determine whether to
accept the null hypothesis, H0.
t Test: ONE-SIDED ALTERNATIVES
➢ Besides our null, H0, we need an
alternative hypothesis, H1, and a
significance level
➢ H1 may be one-sided, or two-sided
◼ H1: bj > 0 and H1: bj < 0 are one-sided
◼ H1: bj ≠ 0 is a two-sided alternative
➢ If we want to have only a 5% probability
of rejecting H0 if it is really true, then we
say our significance level is 5%.
ONE-SIDED ALTERNATIVES (cont)
➢ Having picked a significance level, α, we
look up the (1 – α)th percentile in a t
distribution with n – k – 1 df and call this c,
the critical value
➢ We can reject the null hypothesis if the t
statistic is greater than the critical value
➢ If the t statistic is less than the critical
value then we fail to reject the null
ONE-SIDED ALTERNATIVES (cont)
yi = b0 + b1xi1 + … + bkxik + ui
[Figure: t distribution with a rejection region of area α in the right tail, beyond the critical value c;
the remaining area 1 − α is the fail-to-reject region]
ONE-SIDED vs TWO-SIDED
➢ Because the t distribution is symmetric,
testing H1: bj < 0 is straightforward. The
critical value is just the negative of the one from before
➢ We can reject the null if the t statistic < –c,
and if the t statistic > –c then we fail
to reject the null
➢ For a two-sided test, we set the critical
value based on α/2 and reject H0: bj = 0 in favor of H1: bj ≠ 0 if
the absolute value of the t statistic > c
TWO-SIDED ALTERNATIVES
yi = b0 + b1Xi1 + … + bkXik + ui
H0: bj = 0        H1: bj ≠ 0
[Figure: t distribution with rejection regions of area α/2 in each tail, beyond −c and c;
the central area 1 − α is the fail-to-reject region]
SUMMARY FOR H0: bj = 0
➢ Unless otherwise stated, the alternative is
assumed to be two-sided
➢ If we reject the null, we typically say “xj is
statistically significant at the α% level”
➢ If we fail to reject the null, we typically
say “xj is statistically insignificant at the
α% level”
COEFFICIENT HYPOTHESIS TEST:
EXAMPLE
. reg earnings hgc asvabc
------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | .7390366 .1606216 4.601 0.000 .4235506 1.054523
asvabc | .1545341 .0429486 3.598 0.000 .0701764 .2388918
_cons | -4.624749 2.0132 -2.297 0.022 -8.578989 -.6705095
------------------------------------------------------------------------------
Both slope coefficients have high t statistics and p-values of 0.000, so HGC and ASVABC are each
significantly different from zero at the 1% level (and indeed at the 0.1% level).
TESTING OTHER HYPOTHESES
A more general form of the t statistic
recognizes that we may want to test
something like H0: bj = aj
In this case, the appropriate t statistic is
t = (β̂j − aj) / se(β̂j),   where aj = 0 for the standard test
Computing p-values for t tests
➢ An alternative to the classical approach is
to ask, “what is the smallest significance
level at which the null would be
rejected?”
➢ So, compute the t statistic, and then look
up what percentile it is in the appropriate
t distribution – this is the p-value
➢ p-value is the probability we would
observe the t statistic we did, if the null
were true
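A minimal sketch (not from the lecture) of the calculation: the t statistic and its two-sided p-value from the t distribution with n − k − 1 degrees of freedom, using scipy; the numbers are illustrative (they mimic the HGC coefficient in the earlier output).

import numpy as np
from scipy import stats

beta_hat_j, se_j, a_j = 0.739, 0.161, 0.0       # illustrative values in the spirit of the HGC coefficient
n, k = 570, 2
t_stat = (beta_hat_j - a_j) / se_j
p_two_sided = 2 * stats.t.sf(abs(t_stat), df=n - k - 1)
p_one_sided = p_two_sided / 2                   # one-sided alternative in the direction of the estimate
print(t_stat, p_two_sided, p_one_sided)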
Stata and p-values, t tests, etc.
➢ Most computer packages will compute the
p-value for you, assuming a two-sided test
➢ If you really want a one-sided alternative,
just divide the two-sided p-value by 2
➢ Stata provides the t statistic, p-value, and
95% confidence interval for H0: bj = 0 for
you, in columns labeled “t”, “P > |t|” and
“[95% Conf. Interval]”, respectively
TESTING A LINEAR
COMBINATION
Suppose instead of testing whether b1 is
equal to a constant, you want to test if it is
equal to another parameter,
that is H0 : b1 = b2
Use same basic procedure for forming a t
statistic
t = (β̂1 − β̂2) / se(β̂1 − β̂2)
TESTING LINEAR COMBO
Since se(β̂1 − β̂2) = √Var(β̂1 − β̂2),  and
Var(β̂1 − β̂2) = Var(β̂1) + Var(β̂2) − 2 Cov(β̂1, β̂2),  then
se(β̂1 − β̂2) = [ se(β̂1)² + se(β̂2)² − 2 s12 ]^(1/2)
where s12 is an estimate of Cov(β̂1, β̂2)
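The sketch below (not from the lecture) carries out this test on simulated data, taking s12 from the estimated covariance matrix σ̂²(X′X)⁻¹ rather than from the standard regression output.

import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
n, k = 200, 2
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1 + 0.6 * x1 + 0.6 * x2 + rng.normal(size=n)        # true beta1 = beta2
X = np.column_stack([np.ones(n), x1, x2])

b = np.linalg.lstsq(X, y, rcond=None)[0]
u = y - X @ b
V = (u @ u / (n - k - 1)) * np.linalg.inv(X.T @ X)      # estimated covariance matrix of b

se_diff = np.sqrt(V[1, 1] + V[2, 2] - 2 * V[1, 2])      # se(b1 - b2)
t_stat = (b[1] - b[2]) / se_diff
p_value = 2 * stats.t.sf(abs(t_stat), df=n - k - 1)
print(t_stat, p_value)                                   # H0: beta1 = beta2 should not be rejected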
TESTING A LINEAR COMBO
➢ So, to use formula, need s12, which
standard output does not have
➢ Many packages will have the option to get
it or will perform the test for you
➢ In Stata, after reg y x1 x2 … xk you would
type test x1 = x2 to get a p-value for the test
➢ More generally, you can always restate the
problem to get the test you want
EXAMPLE:
➢ Suppose you are interested in the effect of campaign
expenditures on outcomes
➢ Model is
voteA = b0+b1log(expendA)+b2log(expendB)+b3prtystrA + u
➢ H0: b1 = - b2, or H0: q1 = b1 + b2 = 0
b1 = q1 – b2, so substitute in and rearrange the model:
voteA = b0 + q1log(expendA) + b2[log(expendB) – log(expendA)] + b3prtystrA + u
EXAMPLE:
➢ This is the same model as originally, but
now you get a standard error for b1 – b2 = q1
directly from the basic regression
➢ Any linear combination of parameters could
be tested in a similar manner
➢ Other examples of hypotheses about a
single linear combination of parameters:
◼ b1 = 1 + b2 ; b1 = 5b2 ; b1 = -1/2b2 ; etc
MULTIPLE LINEAR
RESTRICTIONS
➢ Everything we’ve done so far has involved
testing a single linear restriction (e.g., b1 = 0
or b1 = b2 )
➢ However, we may want to test multiple
hypotheses about our parameters jointly
➢ A typical example is testing “exclusion
restrictions” – we want to know if a group
of parameters are all equal to zero
TESTING EXCLUSION
RESTRICTIONS
➢ Now the null hypothesis might be
something like H0: bk-q+1 = 0, ... , bk = 0
➢ The alternative is just H1: H0 is not true
➢ Can’t just check each t statistic separately
because we want to know if the q
parameters are jointly significant at a given
level – it is possible for none to be
individually significant at that level
EXCLUSION RESTRICTIONS (cont)
To do the test, we need to estimate the “restricted model” without xk-q+1, …, xk
included, as well as the “unrestricted model” with all the x’s included.
Intuitively, we want to know if the change in SSR is big enough to warrant the
inclusion of xk-q+1, …, xk.
F ≡ [ (SSRr − SSRur) / q ] / [ SSRur / (n − k − 1) ]
where r denotes the restricted model and ur the unrestricted model
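A minimal sketch (not from the lecture) of the F test for exclusion restrictions on simulated data, computing SSRr and SSRur directly and the p-value from the F distribution with scipy.

import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
n, k, q = 200, 3, 2                                      # k regressors in the unrestricted model, q exclusions tested
x1, x2, x3 = rng.normal(size=(3, n))
y = 1 + 0.5 * x1 + 0.3 * x2 + 0.0 * x3 + rng.normal(size=n)

def ssr(y, *xs):
    X = np.column_stack([np.ones(len(y))] + list(xs))
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    u = y - X @ b
    return u @ u

ssr_ur = ssr(y, x1, x2, x3)                              # unrestricted: all regressors
ssr_r = ssr(y, x1)                                       # restricted: x2 and x3 excluded
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))
p_value = stats.f.sf(F, q, n - k - 1)
print(F, p_value)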
The F statistic
➢ The F statistic is always positive since the
SSR from the restricted model can’t be less
than the SSR from the unrestricted
➢ Essentially, the F statistic measures the
relative increase in SSR when moving from
the unrestricted to a restricted model
▪ q = number of restrictions, or dfr – dfur
▪ n – k – 1 = dfur
The F statistic
➢ To decide if the increase in SSR when we
move to a restricted model is “big enough”
to reject the exclusions, we need to know
about the sampling distribution of our F stat
➢ Not surprisingly, F ~ Fq,n-k-1, where q is
referred to as the numerator degrees of
freedom and n – k – 1 as the denominator
degrees of freedom
The F statistic
[Figure: F distribution with a rejection region of area α to the right of the critical value c;
reject H0 at significance level α if F > c, otherwise fail to reject]
The R² form of the F statistic
Because the SSR’s may be large and unwieldy, an
alternative form of the formula is useful
We use the fact that SSR = SST(1 – R2) for any
regression, so we can substitute in for SSRr and SSRur:
F = [ (R²ur − R²r) / q ] / [ (1 − R²ur) / (n − k − 1) ]
where again r denotes the restricted model and ur the unrestricted model
OVERALL SIGNIFICANCE
𝑦 = 𝛼 + 𝛽1 𝑥1 +. . . +𝛽𝑘 𝑥𝑘 + 𝑢
A special case of exclusion restrictions is to test
◼ H0: b1 = b2 =…= bk = 0
◼ H1: At least one of the b ≠ 0
Since the R2 from a model with only an intercept
will be zero, the F statistic is simply
F = [ R² / k ] / [ (1 − R²) / (n − k − 1) ]
OVERALL SIGNIFICANCE
➢ In the multiple regression model, the roles of
the F and t tests differ.
▪ The F test tests the joint explanatory power
of the variables,
▪ while the t-tests test their explanatory
power individually.
F TESTS OF OVERALL SIGNIFICANCE
𝐻𝐺𝐶 = 𝛼 + 𝛽1 𝐴𝑆𝑉𝐴𝐵𝐶 + 𝛽2 𝐻𝐺𝐶𝑀 + 𝛽3 𝐻𝐺𝐶𝐹 + 𝑢
𝐻0 : 𝛽1 = 𝛽2 = 𝛽3 = 0
Source | SS df MS Number of obs = 570
---------+------------------------------ F( 3, 566) = 110.83
Model | 1278.24153 3 426.080508 Prob > F = 0.0000
Residual | 2176.00584 566 3.84453329 R-squared = 0.3700
---------+------------------------------ Adj R-squared = 0.3667
Total | 3454.24737 569 6.07073351 Root MSE = 1.9607
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcm | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .1102684 .0311948 3.535 0.000 .0489967 .1715401
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
F(k, n − k − 1) = [ ESS/k ] / [ RSS/(n − k − 1) ]        F(3, 566) = (1278/3) / (2176/566) = 110.8
Hence, the F statistic is 110.8. All serious regression packages compute it for you
as part of the diagnostics in the regression output.
F TESTS OF OVERALL SIGNIFICANCE
𝐻𝐺𝐶 = 𝛼 + 𝛽1 𝐴𝑆𝑉𝐴𝐵𝐶 + 𝛽2 𝐻𝐺𝐶𝑀 + 𝛽3 𝐻𝐺𝐶𝐹 + 𝑢
𝐻0 : 𝛽1 = 𝛽2 = 𝛽3 = 0
. reg hgc asvabc hgcm hgcf
Source | SS df MS Number of obs = 570
---------+------------------------------ F( 3, 566) = 110.83
Model | 1278.24153 3 426.080508 Prob > F = 0.0000
Residual | 2176.00584 566 3.84453329 R-squared = 0.3700
---------+------------------------------ Adj R-squared = 0.3667
Total | 3454.24737 569 6.07073351 Root MSE = 1.9607
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcm | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .1102684 .0311948 3.535 0.000 .0489967 .1715401
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
Fcrit, 0.1% (3, 120) = 5.78        F(3, 566) = (1278/3) / (2176/566) = 110.8
This result could have been anticipated because ASVABC and HGCF have highly
significant t statistics. So, we knew in advance that both β1 and β3 were non-zero.
F TESTS OF OVERALL SIGNIFICANCE
𝐻𝐺𝐶 = 𝛼 + 𝛽1 𝐴𝑆𝑉𝐴𝐵𝐶 + 𝛽2 𝐻𝐺𝐶𝑀 + 𝛽3 𝐻𝐺𝐶𝐹 + 𝑢
𝐻0 : 𝛽1 = 𝛽2 = 𝛽3 = 0
. reg hgc asvabc hgcm hgcf
Source | SS df MS Number of obs = 570
---------+------------------------------ F( 3, 566) = 110.83
Model | 1278.24153 3 426.080508 Prob > F = 0.0000
Residual | 2176.00584 566 3.84453329 R-squared = 0.3700
---------+------------------------------ Adj R-squared = 0.3667
Total | 3454.24737 569 6.07073351 Root MSE = 1.9607
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcm | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .1102684 .0311948 3.535 0.000 .0489967 .1715401
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
𝐹crit, 0.1%(3, 120) = 5.78          F(3, 566) = (1278/3) / (2176/566) = 110.8
It is unusual for the F statistic to be insignificant if some of the t statistics are
significant. In principle, it could happen, however. Suppose you ran a regression
with 40 explanatory variables, none being a true determinant of the dependent
variable. Purely by chance, about two of the 40 t statistics would then be expected
to exceed the 5% critical value, even though the overall F statistic would almost
certainly be insignificant. 153
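The sketch below illustrates this point with a small simulation (illustrative only, not the original data; it assumes numpy and statsmodels are available): with 40 pure-noise regressors, a couple of t statistics are typically "significant" at the 5% level by chance, while the overall F statistic is usually insignificant.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, k = 200, 40
X = rng.normal(size=(n, k))            # 40 regressors, none a true determinant of y
y = rng.normal(size=n)                 # the dependent variable is pure noise

res = sm.OLS(y, sm.add_constant(X)).fit()
print("t tests significant at 5%:", int((res.pvalues[1:] < 0.05).sum()))
print("overall F p-value:", round(res.f_pvalue, 3))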
F TESTS OF OVERALL SIGNIFICANCE
𝐻𝐺𝐶 = 𝛼 + 𝛽1 𝐴𝑆𝑉𝐴𝐵𝐶 + 𝛽2 𝐻𝐺𝐶𝑀 + 𝛽3 𝐻𝐺𝐶𝐹 + 𝑢
𝐻0 : 𝛽1 = 𝛽2 = 𝛽3 = 0
. reg hgc asvabc hgcm hgcf
Source | SS df MS Number of obs = 570
---------+------------------------------ F( 3, 566) = 110.83
Model | 1278.24153 3 426.080508 Prob > F = 0.0000
Residual | 2176.00584 566 3.84453329 R-squared = 0.3700
---------+------------------------------ Adj R-squared = 0.3667
Total | 3454.24737 569 6.07073351 Root MSE = 1.9607
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcm | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .1102684 .0311948 3.535 0.000 .0489967 .1715401
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
𝐹crit, 0.1%(3, 120) = 5.78          F(3, 566) = (1278/3) / (2176/566) = 110.8
The opposite can easily happen, however. Suppose you have a multiple
regression model which is correctly specified and the R2 is high. You would
expect to have a highly significant F statistic.
However, if the explanatory variables are highly correlated and the model is
subject to severe multicollinearity, the standard errors of the slope coefficients
could all be so large that none of the t statistics is significant. 154
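A small simulation of this opposite case (again illustrative only, assuming numpy and statsmodels): two nearly collinear regressors that both genuinely matter give a high R² and a highly significant F statistic, yet the individual t statistics can easily both be insignificant.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)    # x2 is almost collinear with x1
y = 1 + x1 + x2 + rng.normal(size=n)   # both regressors genuinely belong in the model

res = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
print("R-squared:", round(res.rsquared, 2))       # high
print("overall F p-value:", res.f_pvalue)         # typically extremely small
print("individual t p-values:", res.pvalues[1:])  # often both well above 0.05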
GENERAL LINEAR RESTRICTIONS
➢ The basic form of the F statistic will work
for any set of linear restrictions
➢ First estimate the unrestricted model and
then estimate the restricted model
➢ In each case, make a note of the SSR
➢ Imposing the restrictions can be tricky –
we will likely have to redefine variables
again
155
GENERAL LINEAR RESTRICTIONS
𝑦 = 𝛼 + 𝛽1 𝑥1 + 𝑢   (RSS1)
𝑦 = 𝛼 + 𝛽1 𝑥1 + 𝛽2 𝑥2 + 𝛽3 𝑥3 + 𝑢   (RSS2),   RSS1 > RSS2
𝐻0 : 𝛽2 = 𝛽3 = 0
𝐻1 : 𝛽2 ≠ 0 or 𝛽3 ≠ 0 or both 𝛽2 and 𝛽3 ≠ 0
We now come to the other F test of goodness of fit. This is a test of the joint
explanatory power of a group of variables when they are added to a regression
model.
For example, y may be written as a simple function of x1 in the original
specification. In the second, we add x2 and x3.
The null hypothesis for the F test is that neither x2 nor x3 belongs in the model.
The alternative hypothesis is that at least one does, perhaps both.
156
GENERAL LINEAR RESTRICTIONS
Unrestricted (full) equation: 𝑦 = 𝛼 + 𝛽1 𝑥1 + 𝛽2 𝑥2 + 𝛽3 𝑥3 + 𝑢   (RSS2, R2²)
𝐻0 : 𝛽2 = 𝛽3 = 0
𝐻1 : 𝛽2 ≠ 0 or 𝛽3 ≠ 0 or both 𝛽2 and 𝛽3 ≠ 0
Restricted equation: 𝑦 = 𝛼 + 𝛽1 𝑥1 + 𝑢   (RSS1, R1²)
q = number of restrictions,
k = number of explanatory variables in the unrestricted equation.
157
GENERAL LINEAR RESTRICTIONS
𝑦 = 𝛼 + 𝛽1 𝑥1 + 𝛽2 𝑥2 + 𝛽3 𝑥3 + 𝑢
F = [(RSS1 − RSS2)/q] / [RSS2/(n − k − 1)] = [(R2² − R1²)/q] / [(1 − R2²)/(n − k − 1)]
𝐻0 : 𝛽2 = 𝛽3 = 0
𝐻1 : 𝛽2 ≠ 0 or 𝛽3 ≠ 0 or both 𝛽2 and 𝛽3 ≠ 0
▪ For this F test and several others we will encounter, it is helpful to think of the F
statistic as having the structure indicated above.
▪ The “improvement” is the reduction in the residual sum of squares when the
change is made, in this case, when the group of new variables is added.
▪ The “cost” is the reduction in the number of degrees of freedom remaining after
making the change. In the present case, it is equal to the number of new variables
added because that number of new parameters is estimated.
▪ The "remaining unexplained" is the residual sum of squares after making the
change. The "degrees of freedom remaining" is the number of degrees of
freedom remaining after making the change. 158
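Putting this "improvement over cost" structure into code, here is a minimal helper for the group F test, assuming you already have the residual sums of squares from the restricted and unrestricted fits (the p-value uses scipy's F distribution).

from scipy.stats import f

def group_f_test(rss_restricted, rss_unrestricted, q, n, k):
    """F test of the joint significance of a group of q added variables.

    q : number of restrictions (variables added)
    k : number of explanatory variables in the unrestricted model
    """
    F = ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / (n - k - 1))
    p = f.sf(F, q, n - k - 1)          # upper-tail probability of F(q, n - k - 1)
    return F, p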
GENERAL LINEAR RESTRICTIONS
. reg hgc asvabc
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1545378 .0091559 16.879 0.000 .1365543 .1725213
_cons | 5.770845 .4668473 12.361 0.000 4.853888 6.687803
------------------------------------------------------------------------------
. reg hgc asvabc hgcm hgcf
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcm | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .1102684 .0311948 3.535 0.000 .0489967 .1715401
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
𝑦 = 𝛼 + 𝛽1 𝑥1 + 𝛽2 𝑥2 + 𝛽3 𝑥3 + 𝑢 𝑅𝑆𝑆2
𝐻0 : 𝛽2 = 𝛽3 = 0
𝐻1 : 𝛽2 ≠ 0 or 𝛽3 ≠ 0 or both 𝛽2 and 𝛽3 ≠ 0
162
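Since the slide above shows only the coefficient tables, the restricted RSS is not visible there; the sketch below therefore uses simulated data (hypothetical stand-ins, loosely mimicking the HGC example) to show how the two fits and the joint F test would be computed in practice with numpy and statsmodels.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 570
asvabc = rng.normal(50, 10, n)                     # hypothetical stand-ins for the
hgcm = rng.normal(12, 2, n)                        # variables in the example
hgcf = rng.normal(12, 2, n)
hgc = 5 + 0.13 * asvabc + 0.07 * hgcm + 0.11 * hgcf + rng.normal(0, 2, n)

restricted = sm.OLS(hgc, sm.add_constant(asvabc)).fit()
unrestricted = sm.OLS(hgc, sm.add_constant(np.column_stack([asvabc, hgcm, hgcf]))).fit()

q, k = 2, 3
F = ((restricted.ssr - unrestricted.ssr) / q) / (unrestricted.ssr / (n - k - 1))
print(round(F, 2))                                 # joint test of the hgcm and hgcf coefficients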
TESTING A LINEAR RESTRICTION
An Example
164
TESTING A LINEAR RESTRICTION
An Example
. reg hgc asvabc hgcm hgcf
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcm | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .1102684 .0311948 3.535 0.000 .0489967 .1715401
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
165
TESTING A LINEAR RESTRICTION
An Example
. reg hgc asvabc hgcm hgcf
Source | SS df MS Number of obs = 570
---------+------------------------------ F( 3, 566) = 110.83
Model | 1278.24153 3 426.080508 Prob > F = 0.0000
Residual | 2176.00584 566 3.84453329 R-squared = 0.3700
---------+------------------------------ Adj R-squared = 0.3667
Total | 3454.24737 569 6.07073351 Root MSE = 1.9607
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcm | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .1102684 .0311948 3.535 0.000 .0489967 .1715401
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
. cor hgcm hgcf
(obs=570)
| hgcm hgcf
--------+------------------
hgcm| 1.0000
hgcf| 0.6391 1.0000
167
TESTING A LINEAR RESTRICTION
An Example
𝐻𝐺𝐶 = 𝛼 + 𝛽1 𝐴𝑆𝑉𝐴𝐵𝐶 + 𝛽2 𝐻𝐺𝐶𝑀 + 𝛽3 𝐻𝐺𝐶𝐹 + 𝑢
Restriction: 𝛽3 = 𝛽2. Imposing it gives the restricted model
𝐻𝐺𝐶 = 𝛼 + 𝛽1 𝐴𝑆𝑉𝐴𝐵𝐶 + 𝛽2 𝐻𝐺𝐶𝑃 + 𝑢, where 𝐻𝐺𝐶𝑃 = 𝐻𝐺𝐶𝑀 + 𝐻𝐺𝐶𝐹,
whose output is shown below.
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295653 .0099485 13.024 0.000 .1100249 .1491057
hgcp | .093741 .0165688 5.658 0.000 .0611973 .1262847
_cons | 4.823123 .4844829 9.955 0.000 3.871523 5.774724
------------------------------------------------------------------------------
169
TESTING A LINEAR RESTRICTION
An Example
. reg hgc asvabc hgcm hgcf
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcm | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .1102684 .0311948 3.535 0.000 .0489967 .1715401
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295653 .0099485 13.024 0.000 .1100249 .1491057
hgcp | .093741 .0165688 5.658 0.000 .0611973 .1262847
_cons | 4.823123 .4844829 9.955 0.000 3.871523 5.774724
------------------------------------------------------------------------------
173
TESTING A LINEAR RESTRICTION
An Example
. reg hgc asvabc hgcm hgcf
Source | SS df MS Number of obs = 570
---------+------------------------------ F( 3, 566) = 110.83
Model | 1278.24153 3 426.080508 Prob > F = 0.0000
Residual | 2176.00584 566 3.84453329 R-squared = 0.3700
---------+------------------------------ Adj R-squared = 0.3667
Total | 3454.24737 569 6.07073351 Root MSE = 1.9607
If the restriction is valid, the deterioration in the fit should be a small, random
amount. However, if the restriction is invalid, the distortion caused by its
imposition will significantly deteriorate the fit.
In the present case, the increase in RSS when the restriction is imposed turns out
to be very small, so we are unlikely to reject the restriction.
174
TESTING A LINEAR RESTRICTION
An Example
b3 = b2
𝐻0 : 𝛽3 = 𝛽2 , 𝐻1 : 𝛽3 ≠ 𝛽2
175
TESTING A LINEAR RESTRICTION
An Example
𝐻𝐺𝐶 = 𝛼 + 𝛽1 𝐴𝑆𝑉𝐴𝐵𝐶 + 𝛽2 𝐻𝐺𝐶𝑀 + 𝛽3 𝐻𝐺𝐶𝐹 + 𝑢
Restriction: 𝛽3 = 𝛽2
𝐻𝐺𝐶 = 𝛼 + 𝛽1 𝐴𝑆𝑉𝐴𝐵𝐶 + 𝛽2 (𝐻𝐺𝐶𝑀 + 𝐻𝐺𝐶𝐹) + 𝑢
= 𝛼 + 𝛽1 𝐴𝑆𝑉𝐴𝐵𝐶 + 𝛽2 𝐻𝐺𝐶𝑃 + 𝑢
178
TESTING A LINEAR RESTRICTION
An Example
𝐻𝐺𝐶 = 𝛼 + 𝛽1 𝐴𝑆𝑉𝐴𝐵𝐶 + 𝛽2 𝐻𝐺𝐶𝑀 + 𝛽3 𝐻𝐺𝐶𝐹 + 𝑢
= 𝛼 + 𝛽1 𝐴𝑆𝑉𝐴𝐵𝐶 + 𝛽2 𝐻𝐺𝐶𝑃 + (𝛽3 − 𝛽2) 𝐻𝐺𝐶𝐹 + 𝑢
𝐻0 : 𝛽3 − 𝛽2 = 0, 𝐻1 : 𝛽3 − 𝛽2 ≠ 0
The null hypothesis is that the coefficient of the conversion term is 0, and the
alternative hypothesis is that it is different from 0.
Of course, the null hypothesis is that the restriction is valid. If it is valid, the
conversion term is unnecessary, and the restricted version adequately
represents the data. 182
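The Stata output on the next slide applies this to the actual data set. As a generic illustration (simulated data, numpy and statsmodels assumed), the sketch below regresses the dependent variable on ASVABC, HGCP = HGCM + HGCF, and the conversion term HGCF; the coefficient on HGCF then estimates β3 − β2 and its t statistic tests the restriction.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 570
asvabc = rng.normal(50, 10, n)
hgcm = rng.normal(12, 2, n)
hgcf = rng.normal(12, 2, n)
hgc = 5 + 0.13 * asvabc + 0.07 * hgcm + 0.11 * hgcf + rng.normal(0, 2, n)

hgcp = hgcm + hgcf                                  # total parental education
X = sm.add_constant(np.column_stack([asvabc, hgcp, hgcf]))
res = sm.OLS(hgc, X).fit()

# The coefficient on the conversion term (hgcf) estimates b3 - b2;
# its t statistic is the test of H0: b3 = b2.
print(res.params[3], res.tvalues[3], res.pvalues[3])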
TESTING A LINEAR RESTRICTION
An Example
. reg hgc asvabc hgcp hgcf
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcp | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .0408654 .0653386 0.625 0.532 -.0874704 .1692012
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
183
TESTING A LINEAR RESTRICTION
An Example
. reg hgc asvabc hgcp hgcf
Source | SS df MS Number of obs = 570
---------+------------------------------ F( 3, 566) = 110.83
Model | 1278.24153 3 426.080508 Prob > F = 0.0000
Residual | 2176.00584 566 3.84453329 R-squared = 0.3700
---------+------------------------------ Adj R-squared = 0.3667
Total | 3454.24737 569 6.07073351 Root MSE = 1.9607
------------------------------------------------------------------------------
hgc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
asvabc | .1295006 .0099544 13.009 0.000 .1099486 .1490527
hgcp | .069403 .0422974 1.641 0.101 -.013676 .152482
hgcf | .0408654 .0653386 0.625 0.532 -.0874704 .1692012
_cons | 4.914654 .5063527 9.706 0.000 3.920094 5.909214
------------------------------------------------------------------------------
184
MULTIPLE REGRESSION
ANALYSIS
MULTIPLE RESTRICTIONS
MULTIPLE RESTRICTIONS
𝑌 = 𝛽1 + 𝛽2 𝑋2 + 𝛽3 𝑋3 + 𝛽4 𝑋4 + 𝛽5 𝑋5 + 𝑢
𝛽3 = 𝛽2 , 𝛽4 + 𝛽5 = 0
Each restriction is used to replace one of the original parameters with a new
parameter whose t statistic provides a test of that restriction.
186
MULTIPLE RESTRICTIONS
𝑌 = 𝛽1 + 𝛽2 𝑋2 + 𝛽3 𝑋3 + 𝛽4 𝑋4 + 𝛽5 𝑋5 + 𝑢
Restrictions: 𝛽3 = 𝛽2 , 𝛽4 + 𝛽5 = 0
Define 𝜃 = 𝛽3 − 𝛽2 and 𝜑 = 𝛽4 + 𝛽5 , so that 𝛽3 = 𝛽2 + 𝜃 and 𝛽5 = 𝜑 − 𝛽4 .
Substituting into the model:
𝑌 = 𝛽1 + 𝛽2 𝑋2 + (𝛽2 + 𝜃) 𝑋3 + 𝛽4 𝑋4 + (𝜑 − 𝛽4 ) 𝑋5 + 𝑢
= 𝛽1 + 𝛽2 (𝑋2 + 𝑋3 ) + 𝛽4 (𝑋4 − 𝑋5 ) + 𝜃𝑋3 + 𝜑𝑋5 + 𝑢
= 𝛽1 + 𝛽2 𝑍 + 𝛽4 𝑊 + 𝜃𝑋3 + 𝜑𝑋5 + 𝑢
where 𝑍 = 𝑋2 + 𝑋3 and 𝑊 = 𝑋4 − 𝑋5 .
189
MULTIPLE RESTRICTIONS
𝑌 = 𝛽1 + 𝛽2 𝑋2 + 𝛽3 𝑋3 + 𝛽4 𝑋4 + 𝛽5 𝑋5 + 𝑢
Restrictions: 𝛽3 = 𝛽2 , 𝛽4 + 𝛽5 = 0
𝜃 = 𝛽3 − 𝛽2 , 𝜑 = 𝛽4 + 𝛽5 , so 𝛽3 = 𝛽2 + 𝜃 and 𝛽5 = 𝜑 − 𝛽4
Unrestricted (reparameterized): 𝑌 = 𝛽1 + 𝛽2 𝑍 + 𝛽4 𝑊 + 𝜃𝑋3 + 𝜑𝑋5 + 𝑢   Fit and save RSSU
Restricted: 𝑌 = 𝛽1 + 𝛽2 𝑍 + 𝛽4 𝑊 + 𝑢   Fit and save RSSR
F(2, n − k) = [(RSSR − RSSU)/2] / [RSSU/(n − k)]
The test statistic is as shown, where RSSU is the residual sum of squares in the
unrestricted model, RSSR is the residual sum of squares in the model with both
restrictions imposed, and k is the number of parameters in the original, unrestricted
version.
192
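To make the procedure concrete, here is an illustrative end-to-end sketch (simulated data in which both restrictions are true; numpy, statsmodels, and scipy assumed): reparameterize, fit the unrestricted and restricted versions, and form F(2, n − k).

import numpy as np
import statsmodels.api as sm
from scipy.stats import f

rng = np.random.default_rng(4)
n = 500
X2, X3, X4, X5 = rng.normal(size=(4, n))
# Data generated so that both restrictions hold: b3 = b2 and b4 + b5 = 0
Y = 1 + 0.5 * X2 + 0.5 * X3 + 0.8 * X4 - 0.8 * X5 + rng.normal(size=n)

Z, W = X2 + X3, X4 - X5                             # reparameterized regressors

unrestricted = sm.OLS(Y, sm.add_constant(np.column_stack([Z, W, X3, X5]))).fit()
restricted = sm.OLS(Y, sm.add_constant(np.column_stack([Z, W]))).fit()

k = 5                                               # parameters in the original model
F = ((restricted.ssr - unrestricted.ssr) / 2) / (unrestricted.ssr / (n - k))
print(round(F, 2), f.sf(F, 2, n - k))               # F should usually be small here,
                                                    # since both restrictions are true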