CHAPTER 10: SPECIFICATION ERROR

10.1 INTRODUCTION

In the previous chapters we introduced the problems of estimation,
prediction and hypothesis testing in simple and multiple linear
regression models. For the associated hypothesis tests to be strictly
valid, the regression model under consideration is assumed to be
correctly specified, in which case there is no specification error. In
this chapter, we will examine the impacts of the following
specification errors:

1. Omission of relevant independent variables (e.g. price from a
demand equation).
2. Inclusion of irrelevant independent variables (e.g. temperature in
a supply equation).

Specifically, we will demonstrate that omission of relevant independent
variables will, in general, lead to biased estimates of the parameters of
the model unless the excluded and included independent variables are
uncorrelated. We will also demonstrate that inclusion of irrelevant
independent variables does not, in general, affect the bias of the
estimated coefficients. In addition, we will show that omission of
relevant independent variables will, in general, lead to larger t-ratios,
while inclusion of irrelevant independent variables will, in general,
lead to smaller t-ratios.

In summary, the effect of omitting relevant independent variables is
relatively more serious than that of including irrelevant independent
variables (in terms of bias and variance). The impacts of other
specification errors, such as incorrect functional form and
mis-specification of the nature of the disturbance term, will be
discussed elsewhere.

It must be pointed out at the outset that the choice of independent
variables is based on theory. However, not all independent variables
can be included in a given regression model, since the number of
independent variables must be less than the number of observations.

10.2 EFFECTS OF THE OMISSION OF RELEVANT
INDEPENDENT VARIABLES

Suppose the true model representing the expectations augmented
Phillips curve is given by:

    yt = β0 + β1 x1t + β2 x2t + εt                                  (1)

where yt, x1t and x2t denote the actual inflation rate, the
unemployment rate and expected inflation, respectively; εt denotes the
disturbance term.

Furthermore, suppose that the estimated model is:

    yt = β0 + β1 x1t + εt*                                          (2)

In model (2) a relevant variable, x2t, is omitted, in which case εt* is, in
fact, εt* = β2 x2t + εt.

Since (2) is a simple linear regression model, the OLS estimator of β1
in equation (2) is given by:

    β̂1 = Σ(x1t − x̄1)(yt − ȳ) / Σ(x1t − x̄1)²                        (3)

Substituting equation (1) in equation (3), we obtain:

    β̂1 = Σ(x1t − x̄1)[β1(x1t − x̄1) + β2(x2t − x̄2) + (εt − ε̄)] / Σ(x1t − x̄1)²    (4)

Equation (4) can be rewritten as:

    β̂1 = β1 + β2 Σ(x1t − x̄1)(x2t − x̄2) / Σ(x1t − x̄1)² + Σ(x1t − x̄1)(εt − ε̄) / Σ(x1t − x̄1)²    (5)

Taking expectations in equation (5), since E(εt) = 0 by assumption,
yields:

    E(β̂1) = β1 + β2 Σ(x1t − x̄1)(x2t − x̄2) / Σ(x1t − x̄1)²           (6)
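Equation (6) can be checked numerically. The sketch below uses hypothetical data and coefficient values of my own choosing, with the disturbance set to zero so that the identity holds exactly; it regresses y on x1 alone and compares the slope with β1 + β2·Σ(x1t − x̄1)(x2t − x̄2)/Σ(x1t − x̄1)²:

```python
import numpy as np

# Hypothetical data: x1 and x2 are correlated, so omitting x2 shifts the slope on x1.
rng = np.random.default_rng(42)
n = 5000
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)

beta0, beta1, beta2 = 1.0, -2.0, 3.0
y = beta0 + beta1 * x1 + beta2 * x2   # disturbance set to zero: the identity is exact

# Short regression: y on an intercept and x1 only (x2 omitted)
X_short = np.column_stack([np.ones(n), x1])
b_short, *_ = np.linalg.lstsq(X_short, y, rcond=None)

# Slope predicted by equation (6): beta1 + beta2 * Cov(x1, x2) / Var(x1)
S = np.cov(x1, x2)
predicted_slope = beta1 + beta2 * S[0, 1] / S[0, 0]

print(b_short[1], predicted_slope)    # the two values agree
```

Because the disturbance is switched off, the agreement is exact up to floating-point error; with a nonzero disturbance the short-regression slope equals the predicted value only in expectation.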


Since the second term in equation (6) is not, in general, equal to zero,
it follows that omission of relevant variables will, in general, lead to
biased estimates of the parameters of the model. However, there are
certain cases in which the omission of relevant variables leads to
unbiased estimates of the parameters of the model. To see this, we note
that equation (6) is, in fact,

    E(β̂1) = β1 + β2 Cov(x1, x2) / Var(x1)

Thus, if the covariance between x1 and x2 is equal to zero (i.e., x1 and
x2 are uncorrelated), then the second term in equation (6) will be equal
to zero, implying that β̂1 is an unbiased estimator of β1. Also, if the
true β2 is equal to zero, the second term in (6) will be equal to zero, in
which case β̂1 is again an unbiased estimator of β1.

These results may be summarized as follows:

1. Omission of relevant variables will in general lead to biased
estimates of parameters. The bias is called specification bias or
omitted variable bias. The direction of the bias depends on the
covariance between x1 and x2 as well as the sign of β2.

2. Omission of relevant variables will lead to unbiased estimates of
the parameters of the model only in the following two cases:

Case 1: The true value of β2 = 0 (i.e. x2 is not a relevant variable in
the model), in which case model (2) is correctly specified.

Case 2: The correlation coefficient between x1 and x2 = 0, i.e. the
excluded and included independent variables are uncorrelated.

3. Omission of relevant variables generally leads to invalid standard
errors and t-ratios. Specifically, the standard errors of the OLS
estimators will be smaller than what they should be, in which case the
t-ratios will be larger, unless the correlation coefficient between x1
and x2 = 0.

10.3 EFFECTS OF THE INCLUSION OF IRRELEVANT
INDEPENDENT VARIABLES

In the estimation of the traditional Phillips curve relationship, let us
suppose that the true model is:

    yt = β0 + β1 x1t + εt                                           (7)

where yt and x1t denote the actual inflation rate and the unemployment
rate, and x2t denotes the number of registered students at UNBC (i.e.
variable x2 is irrelevant and should be excluded, since its coefficient
is zero).

Furthermore, suppose that the estimated model is:

    yt = β0 + β1 x1t + β2 x2t + εt                                  (8)

in which the irrelevant variable x2 is, in fact, included.

Equation (8) is a multiple regression model with k = 2, which was
introduced in chapter 3. From chapter 3, section 3.3, we note that the
OLS estimator of β1 in equation (8) is the usual multiple regression
estimator. In general, inclusion of irrelevant variables does not cause
any problem with bias, even though β̂1 has been computed from an
over-specified model: E(β̂1) = β1, since the true coefficient of the
irrelevant variable (i.e. β2) is equal to zero.

The inclusion of irrelevant variables, however, increases the variance
of the OLS estimator, in which case the OLS estimator will be
inefficient, reducing the precision of the regression, unless the
correlation coefficient between x1 (the included and relevant
independent variable) and x2 (the included and irrelevant independent
variable) is equal to zero. To see this, we note that:

    Var(β̂1) = σ² / [Σ(x1t − x̄1)² (1 − r12²)]

Implication: Unless r12 = 0 (i.e. the correlation coefficient between
x1 and x2 is equal to 0), including irrelevant variables reduces the
precision of estimation of the relevant coefficients.

These results may be summarized as follows:

1. Inclusion of irrelevant variables will in general not affect the bias
of the estimated coefficients.

2. Inclusion of irrelevant variables will in general lead to imprecise
estimates of the relevant coefficients (due to larger standard errors).

3. Inclusion of irrelevant variables will not lead to imprecise
estimates of the relevant coefficients if the correlation between x1 and
x2 is equal to 0, i.e. the irrelevant variable and the other included
variables are uncorrelated.
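The variance claim can be checked directly. For a regression on an intercept, x1 and x2, the identity Var(β̂1) = σ²/[Σ(x1t − x̄1)²(1 − r12²)] holds exactly for any fixed design; the sketch below (hypothetical data, σ² assumed known) confirms it against the (X′X)⁻¹ formula:

```python
import numpy as np

# Hypothetical fixed design: x1 is the relevant regressor, x2 is irrelevant
# (true beta2 = 0) but correlated with x1.
rng = np.random.default_rng(7)
n = 200
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

sigma2 = 1.0                                         # disturbance variance (assumed known)
var_long = sigma2 * np.linalg.inv(X.T @ X)[1, 1]     # Var(beta1-hat) with x2 included

x1c, x2c = x1 - x1.mean(), x2 - x2.mean()
r12 = (x1c @ x2c) / np.sqrt((x1c @ x1c) * (x2c @ x2c))
var_formula = sigma2 / ((x1c @ x1c) * (1.0 - r12**2))

var_short = sigma2 / (x1c @ x1c)                     # variance when x2 is (correctly) left out
print(var_long / var_short)                          # inflation factor 1 / (1 - r12^2)
```

The printed ratio is the variance inflation factor 1/(1 − r12²), which exceeds 1 whenever r12 ≠ 0; with r12 = 0 the two variances coincide, matching summary point 3 above.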


The effects of the two types of specification errors are summarized in
the diagram below.

[Diagram: summary of the effects of omitting relevant variables versus
including irrelevant variables]

10.4 SPECIFICATION ERROR TESTS

Having introduced two types of specification error and their effects
above, an issue arises as to whether or not it is possible to conduct
more formal tests for specification error.

It should be pointed out that numerous specification error tests have
been proposed in the literature. One of the most popular and most
widely discussed tests for specification errors is Ramsey's REgression
Specification Error Test, commonly referred to as the RESET test.

To aid in the description of the RESET test, let us consider the model:

    yt = α0 + α1 x1t + α2 x2t + εt,    t = 1, 2, ..., n

Testing for specification error in the above model entails the following
steps:

Step 1: Estimate the above model by OLS and obtain the predicted
values ŷt = α̂0 + α̂1 x1t + α̂2 x2t, where α̂0, α̂1 and α̂2 are the OLS
estimators of α0, α1 and α2, respectively.

Step 2: Use the predicted values in step 1 to generate the following
new variables: ŷt², ŷt³, ..., ŷt^s, where s is the maximum degree of the
polynomial in ŷt that the econometrician chooses. Note that in
choosing the maximum order of the polynomial, the number of
independent variables cannot exceed the number of observations. To
illustrate the rest of the steps, we will assume that the econometrician
has chosen s = 4, and has generated ŷt², ŷt³ and ŷt⁴.

Step 3: Use the generated polynomials in the predicted values of the
dependent variable in step 2 to run an augmented regression that
includes the independent variables in the model being tested for
specification error as well as ŷt², ŷt³ and ŷt⁴, i.e. we run the
regression:

    yt = α0 + α1 x1t + α2 x2t + α3 ŷt² + α4 ŷt³ + α5 ŷt⁴ + εt

In the absence of any specification error, α3 = α4 = α5 = 0, in which
case testing for specification error amounts to testing the joint
significance of a subset of coefficients, which we described in chapter
3, section 3.8. This is achieved by computing the F-statistic as
described generally in section 3.8 and specifically, in the present
context, in step 4.

Step 4: Compute the F-statistic as follows:

    F = [(SSRR − SSRU)/q] / [SSRU/(n − k − 1)]

where
q is the number of restrictions (q = 3, since we are testing whether
α3 = α4 = α5 = 0 in the present context);
SSRU is the sum of squared residuals in the unrestricted (augmented)
regression;
SSRR is the sum of squared residuals in the restricted regression (the
original model);
k is the number of independent variables (excluding the intercept) in
the unrestricted regression (k = 5 in the present context, i.e. the two
independent variables in the model being tested for specification error,
and the three polynomials in the predicted values of the dependent
variable, ŷt², ŷt³ and ŷt⁴).

Compare the F statistic with the F distribution with q (= 3) degrees of
freedom in the numerator and n − (k + 1) (= n − 6) degrees of freedom
in the denominator. Specifically, if the F statistic is less than
Fα,3,n−6, we conclude, at level of significance α, that there is no
specification error. Otherwise, we conclude that specification error
exists in the model.

Clearly, one problem with the RESET test is the choice of the
maximum order of the polynomial in the predicted value of the
dependent variable. Fortunately, most software packages (e.g.
SHAZAM) automatically report the F-statistics corresponding to
different orders of the polynomial, which helps in checking the
robustness of the results.

Example 10-1

Consider the model yt = β0 + β1 xt + εt, where y and x denote
earnings and age, respectively.

(a) Use the following data to obtain OLS estimates of the parameters
of the model.

(b) Perform Ramsey's RESET test using the square, cube and fourth
powers of the predicted value of the dependent variable. Use a 5%
level of significance.

Earnings (y)  Age (x)    Earnings (y)  Age (x)
 9.0          16.0        18.0         35.0
10.0          26.0        12.0         37.0
 8.0          20.0        10.0         38.0
 5.0          24.0        20.0         39.0
 9.0          26.0        17.0         41.0
11.0          28.0        10.0         43.0
14.0          29.0        16.0         50.0
 5.0          30.0        20.0         51.0
 7.0          32.0        17.0         55.0
10.0          32.0        10.0         65.0
12.0          33.0        14.0         62.0
15.0          34.0        15.0         65.0
20.0          34.0

Solution 10-1

Step 1: The OLS estimates of the parameters of the model, their
standard errors (SE) and t-ratios (TR), are given below:

SE   2.5083    0.062655
TR   2.7472    2.3937
Sum of squared residuals (SSRR) = 392.40
Number of observations (n) = 25

Step 2: Using the estimated model in step 1, we generate the predicted
values (ŷt), their squares (ŷt²), cubes (ŷt³) and fourth powers (ŷt⁴).

Step 3: We run the augmented regression of yt on xt, ŷt², ŷt³ and ŷt⁴.
The results are given below:

SE   1593.0    63.417    50.100    2.6072    0.050287
TR   -0.70516  -0.72906  0.69510   -0.64928  0.59616
Sum of squared residuals (SSRU) = 281.73
Number of observations (n) = 25

Step 4: We then compute the test statistic:

    F = [(392.40 − 281.73)/3] / (281.73/20) = 2.6188

Since the computed value of the F-statistic (2.6188) is less than
F0.05,3,20 = 3.10, we conclude that α2 = α3 = α4 = 0, in which case
there is no evidence of specification error at the 5 percent level of
significance.

Remarks: One potential problem associated with Ramsey's RESET
test arises when the test results provide evidence of specification
error. In these cases, the econometrician has the daunting task of
trying to figure out the source of the specification error and getting
rid of the problem.

10.5 MODEL SELECTION CRITERIA

In light of the specification error issues raised in this chapter, another
issue arises: given a wide variety of models, how does the
econometrician choose the best model specification? In an attempt to
address this issue, several model selection criteria have been suggested
in the literature. These criteria include the following:

1. The conventional coefficient of determination criterion (R² or
adjusted R²)
2. Akaike's final prediction error (FPE)
3. Amemiya's prediction criterion
4. Akaike's information criterion
5. The Schwarz criterion
6. The Shibata criterion
7. The Rice criterion

The first model selection criterion (the conventional coefficient of
determination) involves selecting the model that yields the highest R².
Recall, from chapter 3, that a problem with the unadjusted coefficient
of determination is that its value increases when more independent
variables are added. Thus, it is commonplace for econometricians to
pick models with the highest adjusted coefficient of determination.
This criterion is commonly referred to as the adjusted R² criterion.

The second to seventh model selection criteria involve introducing a
penalty for having more independent variables, in terms of loss of
degrees of freedom. Specifically, these criteria involve minimizing


weighted values of the squared residuals, taking into account the
number of independent variables included in the various model
specifications, as well as the number of observations. It is noteworthy
that these criteria differ only in the manner in which the penalty
functions are defined. For an excellent discussion of the model
selection criteria see, for example, Maddala, G.S., 1992, Introduction
to Econometrics, Second Edition, MacMillan Publishing Company.
Most software packages, e.g. SHAZAM, automatically provide the
values of the weighted sums of squared residuals based on most of the
aforementioned model selection criteria, to aid in the model selection
process.
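To make the penalty idea concrete, here is one common way the adjusted R², Akaike's information criterion and the Schwarz criterion are computed from a model's sum of squared residuals. The exact functional forms vary across textbooks, so treat these as an illustrative variant rather than this chapter's definitions:

```python
import math

def adjusted_r2(r2, n, k):
    """Adjusted R^2; k is the number of regressors excluding the intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def aic(ssr, n, k):
    """A common log form of Akaike's information criterion (smaller is better)."""
    return math.log(ssr / n) + 2.0 * (k + 1) / n

def schwarz(ssr, n, k):
    """Schwarz criterion; its per-parameter penalty is ln(n)/n rather than 2/n."""
    return math.log(ssr / n) + (k + 1) * math.log(n) / n
```

Because ln(n) exceeds 2 for n ≥ 8, the Schwarz criterion imposes the heavier penalty and therefore tends to select more parsimonious models than AIC; both reward a smaller sum of squared residuals while charging for each additional regressor.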
KEY CONCEPTS AND TERMS

Specification error
Specification bias (omitted variable bias)
Specification error test
Ramsey's regression specification error (RESET) test
The conventional coefficient of determination criterion (R² or adjusted R²)
Akaike's final prediction error (FPE)
Amemiya's prediction criterion
Akaike's information criterion
The Schwarz criterion
Shibata criterion
Rice criterion

EXERCISES

1. Explain the econometric concept of specification error.

2. What is the impact of omission of a relevant independent variable
on the bias and standard errors of the OLS estimators?

3. What is the impact of inclusion of an irrelevant independent
variable on the bias and standard errors of the OLS estimators?

SELF REVIEW TEST

1. Which of the following is a specification error?
a. Omission of a relevant independent variable
b. Inclusion of an irrelevant independent variable
c. Exclusion of an irrelevant independent variable
d. None of the above

2. Which of the following specification errors affects the bias of the
OLS estimator in general?
a. Omission of a relevant independent variable
b. Inclusion of an irrelevant independent variable
c. Both a and b
d. None of the above

3. Which of the following specification errors does not affect the
bias of the OLS estimator in general?
a. Omission of a relevant independent variable
b. Inclusion of an irrelevant independent variable
c. Both a and b
d. None of the above

4. Omission of a relevant independent variable will
a. Give rise to valid t-ratios
b. Give rise to invalid t-ratios
c. Not affect the t-ratios
d. None of the above

5. Which of the following specification errors is relatively more
serious?
a. Omission of a relevant independent variable
b. Inclusion of an irrelevant independent variable
c. Both a and b
d. None of the above

6. Suppose that the true model to be estimated is yi = β0 + β1x1i +
β2x2i + εi but the econometrician erroneously estimates the model
yi = β0 + β1x1i + εi (i.e. a relevant independent variable x2 is
omitted). The OLS estimator of β1 will be biased except when
a. The covariance between x1 and x2 is positive
b. The covariance between x1 and x2 is negative
c. The covariance between x1 and x2 is zero
d. None of the above

7. Refer to question 6. The estimates of the standard error of β̂1 will
be
a. Smaller
b. Bigger
c. The same
d. None of the above

8. Refer to question 6. If β2 is, in fact, zero, the OLS estimator of β1
will be
a. Unbiased
b. Biased
c. Neither unbiased nor biased
d. None of the above

9. Suppose that the true model to be estimated is yi = β0 + β1x1i + εi
but the econometrician erroneously estimates the model yi = β0 +
β1x1i + β2x2i + εi (i.e. an irrelevant independent variable x2 is
included). The OLS estimator of β1 will be
a. Unbiased
b. Biased
c. Neither unbiased nor biased
d. None of the above

10. Refer to question 9. The estimates of the standard error of β̂1 will
be
a. Smaller
b. Bigger
c. The same
d. None of the above