Extending the Multiple Regression Model

ECN326: Basic Econometrics


Dr Arshad Ali Bhatti
Spring 2019


Introduction
• Sometimes we need to include explanatory variables in our
regression model that are qualitative in nature

• Dummy variables allow us to incorporate such variables into
the regression model

• This lecture shows how dummy variables can be used to
– Allow the intercept in the regression model to differ across different
groups in the sample
– Allow the slope(s) in a regression model to vary across groups
– Perform hypothesis tests on several coefficients jointly in the
regression model


Dummy Explanatory Variables


• Allow the incorporation of qualitative variables into
the regression model
• For example
– Gender—a qualitative variable taking on only two values:
male or female
• A dummy variable will convert gender into a quantitative variable,
perhaps taking on the value 0 for men and 1 for women

• The categories encoded by a dummy variable must be mutually
exclusive and exhaustive
– It must be possible to assign each observation a single
value for the dummy variable
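As a minimal sketch of the coding step described above, the snippet below maps a two-category variable to 0/1. The sample labels and the helper name `to_dummy` are assumptions made for illustration; the error check is one simple way to enforce that every observation falls into exactly one category.

```python
# Hypothetical sample: each observation belongs to exactly one category.
genders = ["male", "female", "female", "male", "female"]

def to_dummy(labels, one_label, zero_label):
    """Map a two-category qualitative variable to 0/1.

    Raises ValueError for any label outside the two categories, so the
    coding stays mutually exclusive and exhaustive.
    """
    mapping = {zero_label: 0, one_label: 1}
    dummy = []
    for label in labels:
        if label not in mapping:
            raise ValueError(f"unclassifiable observation: {label!r}")
        dummy.append(mapping[label])
    return dummy

# Gender = 1 for a woman, Gender = 0 for a man.
gender_dummy = to_dummy(genders, one_label="female", zero_label="male")
print(gender_dummy)
```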


Defining and Interpreting Dummy Variables

• Dummy variables allow the intercept of the
regression line to vary for different groups in
the population

– The variable that defines the groups is qualitative in nature
• Examples include race, gender, union status, region of
country, bank ownership (private or public), etc.


Defining and Interpreting Dummy Variables

• Consider the following regression equation
that relates earnings to gender

$$\text{Earnings}_i = \beta_0 + \beta_1 \text{Gender}_i + \varepsilon_i$$

Gender = 1 for a woman
Gender = 0 for a man


Defining and Interpreting Dummy Variables

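• How should the coefficients in this simple
model be interpreted?
– If we take the expected value of the equation for
women (i.e., Gender = 1) we have the following
conditional expectation:

$$E[\text{Earnings}_i \mid \text{Gender}_i = 1] = \beta_0 + \beta_1$$

• Giving us the mean earnings for women
• For men it’s

$$E[\text{Earnings}_i \mid \text{Gender}_i = 0] = \beta_0$$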


Defining and Interpreting Dummy Variables

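• Subtracting the mean male earnings from the
mean female earnings gives us the difference
in mean earnings between women and men

$$E[\text{Earnings}_i \mid \text{Gender}_i = 1] - E[\text{Earnings}_i \mid \text{Gender}_i = 0] = \beta_1$$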
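A small numerical check can make this interpretation concrete. The sketch below regresses earnings on the gender dummy alone and compares the estimates with the two group means. The earnings figures are hypothetical, and `simple_ols` is an illustrative helper using the textbook one-regressor formulas.

```python
from statistics import mean

# Hypothetical earnings (in thousands); gender = 1 for a woman, 0 for a man.
gender = [0, 0, 0, 1, 1, 1]
earnings = [50.0, 54.0, 58.0, 44.0, 48.0, 52.0]

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

b0, b1 = simple_ols(gender, earnings)

mean_men = mean(e for g, e in zip(gender, earnings) if g == 0)
mean_women = mean(e for g, e in zip(gender, earnings) if g == 1)

# The intercept matches the men's mean; the slope matches the gap in means.
print(b0, mean_men)
print(b1, mean_women - mean_men)
```

With a single dummy regressor, the fitted intercept is the sample mean of the group coded 0 and the slope is the difference between the two sample means.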


Defining and Interpreting Dummy Variables

• Let’s enhance the previous example by adding Education
to the equation

• A scatter plot of men’s and women’s wages in relation to
their level of education is shown below


Defining and Interpreting Dummy Variables

[Figure: scatter plot of men’s and women’s earnings against education]

Defining and Interpreting Dummy Variables

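• The regression equation is

$$\text{Earnings}_i = \beta_0 + \beta_1 \text{Gender}_i + \beta_2 \text{Education}_i + \varepsilon_i$$

• β1 measures the difference in mean
earnings between women and men
holding education constant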
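The sketch below illustrates the "holding education constant" reading on made-up data. Earnings are generated exactly as 10 − 4·Gender + 3·Education, with no error term, so that the fit recovers these numbers and the result is easy to check. The helper `ols` is an illustrative normal-equations solver written in plain Python, not a library routine.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical data; gender = 1 for a woman, 0 for a man.
gender = [0, 0, 0, 1, 1, 1]
education = [10, 12, 16, 10, 14, 16]
earnings = [40.0, 46.0, 58.0, 36.0, 48.0, 54.0]

# Columns: constant, Gender, Education.
X = [[1.0, g, e] for g, e in zip(gender, education)]
b0, b1, b2 = ols(X, earnings)
print(b0, b1, b2)
```

Here b1 is the vertical gap between the women's and men's regression lines at any given level of education, and b2 is the common slope.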


Interaction Variables
• We may wish to allow the slope coefficient(s)
to vary across groups as well

– Done via an interaction variable (or term)


Interaction Variables
• In the previous example, the coefficient on Gender
tells us the difference between female and male
earnings, holding education constant
– β1 measures the difference in earnings between women
and men, holding education constant
– β2—the slope coefficient on education—measures the
increment in earnings resulting from an additional year of
schooling
• The return to education is assumed to be the same for men and
women
• But this may not be true; Figure 2 may be more representative of
the real world


Interaction Variables

[Figure 2: earnings against education, with different slopes for men and women]

Interaction Variables
• An interaction variable or term
– Will capture any interaction between gender and
the impact of education on earnings
– Included via a new variable in the model
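• The education variable multiplied by the gender
variable

$$\text{Earnings}_i = \beta_0 + \beta_1 \text{Gender}_i + \beta_2 \text{Education}_i + \beta_3 (\text{Gender}_i \times \text{Education}_i) + \varepsilon_i$$

• Where β3 represents the difference in the slope between
men and women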
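To illustrate how the interaction term separates the two slopes, the sketch below fits the model on made-up data. Earnings are generated exactly as 10 − 2·Gender + 3·Education − 1·(Gender × Education), so the men's slope is 3 and the women's slope is 2. The data and the plain-Python `ols` helper are assumptions for illustration only.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical data; gender = 1 for a woman, 0 for a man.
gender = [0, 0, 0, 1, 1, 1]
education = [10, 12, 16, 10, 14, 16]
earnings = [40.0, 46.0, 58.0, 28.0, 36.0, 40.0]

# Columns: constant, Gender, Education, Gender x Education.
X = [[1.0, g, e, g * e] for g, e in zip(gender, education)]
b0, b1, b2, b3 = ols(X, earnings)

slope_men = b2          # Gender = 0: the interaction term drops out
slope_women = b2 + b3   # Gender = 1: the slope shifts by b3
print(slope_men, slope_women)
```

The dummy alone shifts the intercept by b1; the interaction term shifts the slope on education by b3.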


Interaction Variables
• Dummy variables allow the intercept to differ across groups
• Interaction terms allow the slope to differ across groups
• A model might contain
– Neither dummy variables nor interaction terms
• All groups have the same intercept and slopes
– Both dummy variables and interaction terms
• Intercept and slopes vary across groups
– Only dummy variables
• Only the intercept varies across groups
– Only interaction terms
• Only the slopes vary across groups

• You may interact a dummy variable with some, but not all,
explanatory variables
– Only the interacted variables are allowed to have a different effect, by
group, on the dependent variable


Experimental Vs Observational Studies


• Dummy variables can summarize the key differences
between experimental and observational studies

• Experimental studies may involve a treatment group
and a control group
– Where a dummy variable is used to label items as to
whether they are in the treatment or control group

• Observational studies are those in which differences
between groups of interest are observed rather than
assigned by the researcher


Dummy Variables When There Are More Than Two Groups
• The model can be extended to the case where
the qualitative variable takes on more than
two possible values

• Consider a model relating the earnings of a
person (Y) to the education of the person’s
parents


Dummy Variables When There Are More Than Two Groups

• We assume the education of the parents is
classified into four categories

• Creating three dummy variables will
incorporate the parents’ education into the
model

Dummy Variables When There Are More Than Two Groups
• Only three dummy variables are created even though
there are four categories
– Allows us to estimate one intercept for each group

• If a qualitative variable assumes J outcomes, J − 1
dummy variables are included in the model
– Knowing a person is not in any of the J − 1 categories tells
us they must be in the Jth category
– A dummy for the Jth category is redundant: including all J dummy
variables alongside the intercept would create perfect
multicollinearity, known as the dummy variable trap
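The dummy variable trap can be seen directly in the data matrix. In the sketch below, the category labels "A" to "D" and the sample are hypothetical stand-ins for a four-category variable. With all J dummies, every row of dummies sums to 1, which is exactly the constant column, so the regressors are perfectly collinear. Dropping one category removes that particular dependence.

```python
categories = ["A", "B", "C", "D"]                 # hypothetical labels, J = 4
sample = ["A", "C", "D", "B", "C", "A", "D"]      # hypothetical observations

def dummies(labels, included):
    """One 0/1 column per included category, returned as a list of rows."""
    return [[1 if lab == cat else 0 for cat in included] for lab in labels]

all_j = dummies(sample, categories)               # all J dummies
j_minus_1 = dummies(sample, categories[:-1])      # "D" omitted as the base group

# True when the dummy columns add up to the constant column in every row.
sums_to_constant_all = all(sum(row) == 1 for row in all_j)
sums_to_constant_dropped = all(sum(row) == 1 for row in j_minus_1)

print(sums_to_constant_all)      # dependence present with J dummies
print(sums_to_constant_dropped)  # dependence gone with J - 1 dummies
```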

Dummy Variables When There Are More Than Two Groups
• Now, we can structure the above model as
follows:


Dummy Variables When There Are More Than Two Groups
• A dummy variable indicating those parents
with college education or higher was left out

• Any group might have been omitted
• But the interpretation of the regression
coefficients is affected by the omitted group


Dummy Variables When There Are More Than Two Groups
• The coefficients on the included dummy
variables measure the impact of belonging to the
corresponding category compared
to the excluded category

• The results are always compared to the one
category that is omitted
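The role of the omitted category can be checked numerically. The sketch below uses three hypothetical groups ("low", "mid", "high") and made-up earnings, and fits the same dummy-only regression twice with a different base group each time. The `ols` and `fit_with_base` helpers are illustrative, not library functions.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical groups and earnings; group means are 22, 32 and 42.
groups = ["low", "low", "mid", "mid", "high", "high"]
earnings = [20.0, 24.0, 30.0, 34.0, 40.0, 44.0]

def fit_with_base(groups, y, base):
    """Regress y on an intercept plus a dummy for every category except base."""
    included = [c for c in sorted(set(groups)) if c != base]
    X = [[1.0] + [1.0 if g == c else 0.0 for c in included] for g in groups]
    return dict(zip(["intercept"] + included, ols(X, y)))

base_high = fit_with_base(groups, earnings, base="high")
base_low = fit_with_base(groups, earnings, base="low")
print(base_high)
print(base_low)
```

In each fit the intercept is the mean of the omitted group and every dummy coefficient is that group's mean minus the omitted group's mean, so the numbers change with the choice of base even though the fitted group means do not.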


Hypothesis Tests on Several Regression Coefficients: F Tests
• We have learned to perform hypothesis tests on
individual regression coefficients
– For example, in the model $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i$
we could test whether
• β1 = 0 or β3 = 5
• This section shows how to test more complicated
hypotheses
– For example, whether both β1 = 0 and β3 = 0 or whether
β1 = 0 and β3 = 5 jointly
– Such joint tests are called F tests


Joint Tests on Several Regression Coefficients

• Suppose we have this regression model

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i$$

• To test whether both β1 = 0 and β3 = 0, the null
and alternative hypotheses are specified as
follows:

$$H_0: \beta_1 = 0 \text{ and } \beta_3 = 0$$
$$H_1: H_0 \text{ is not true}$$

• H1 indicates only that the null hypothesis is false, without
necessarily indicating why it is false
• The null hypothesis is very specific in indicating that both
coefficients are equal to zero


Joint Tests on Several Regression Coefficients
• We then form the restricted
regression, the model that holds if the null is true
– Imposes the null hypothesis under consideration
– Embodies the null hypothesis: $Y_i = \beta_0 + \beta_2 X_{2i} + \varepsilon_i$

• An unrestricted model does not impose the
restrictions embodied in the null hypothesis
– Does not restrict the coefficients in any way; it is
given by the original model


Joint Tests on Several Regression Coefficients
• Does imposing the null hypothesis have much
of an impact on how well the model fits the
data?
– If the null hypothesis is true, both models should
“fit” the data about equally well
• Even if the null hypothesis were true, the unrestricted
model would fit at least as well, because its extra
coefficients capture random variation in the
sample
• But, does the unrestricted model provide a sufficiently
better fit that we are willing to reject the null
hypothesis?


Joint Tests on Several Regression Coefficients
• If the null hypothesis is true
– The residual (unexplained) sum of squares (RSS)
and the R² would be approximately the same in both the
restricted and unrestricted models

– In any given sample they will differ somewhat
• A relevant test statistic would compare the RSS or the
R² in both models to determine if the difference is large
enough to be statistically significant


Joint Tests on Several Regression Coefficients

• To test for statistical significance, the statistic is

$$F = \frac{(RSS_{\text{restricted}} - RSS_{\text{unrestricted}}) / q}{RSS_{\text{unrestricted}} / (n - k - 1)}$$

• Where
• $RSS_{\text{unrestricted}}$ = the unexplained sum of squared residuals for
the unrestricted regression
• $RSS_{\text{restricted}}$ = the unexplained sum of squared residuals for
the restricted regression
• q = the number of restrictions
• n − k − 1 = the number of observations minus the number of
explanatory variables in the unrestricted model, minus one for the intercept
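As a rough sketch, the statistic defined above can be computed as follows. The function name `f_stat_rss` and all numbers are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def f_stat_rss(rss_restricted, rss_unrestricted, q, n, k):
    """F statistic from restricted and unrestricted residual sums of squares.

    q is the number of restrictions, n the number of observations and
    k the number of explanatory variables in the unrestricted model.
    """
    numerator = (rss_restricted - rss_unrestricted) / q
    denominator = rss_unrestricted / (n - k - 1)
    return numerator / denominator

# Hypothetical values: n = 24 observations, k = 3 regressors, q = 2 restrictions.
f_star = f_stat_rss(rss_restricted=150.0, rss_unrestricted=100.0, q=2, n=24, k=3)
print(f_star)  # (50 / 2) / (100 / 20) = 5.0
```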

Joint Tests on Several Regression Coefficients
• We can convert the format of the F-statistic to contain the R²

$$F = \frac{(R^2_{\text{unrestricted}} - R^2_{\text{restricted}}) / q}{(1 - R^2_{\text{unrestricted}}) / (n - k - 1)}$$

• F will be close to zero when the null is true
• Because $R^2_{\text{restricted}} \approx R^2_{\text{unrestricted}}$
• Values of F that are “far” from zero would provide
evidence favoring H1
• Under the null, F follows the F distribution with q and n − k − 1 degrees of
freedom
• Can compare the computed statistic F* to the critical value $F_{q,\,n-k-1}$ (as reported in F-tables)
• Reject the null hypothesis if F* > $F_{q,\,n-k-1}$
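The R² form gives the same number as the RSS form whenever both regressions have the same dependent variable, because R² = 1 − RSS/TSS with a common total sum of squares (TSS). The sketch below checks this on hypothetical values and then applies the rejection rule. The 5% critical value 3.49 for 2 and 20 degrees of freedom is taken from a standard F-table.

```python
def f_from_rss(rss_r, rss_u, q, n, k):
    """F statistic in its residual-sum-of-squares form."""
    return ((rss_r - rss_u) / q) / (rss_u / (n - k - 1))

def f_from_r2(r2_u, r2_r, q, n, k):
    """F statistic in its R-squared form."""
    return ((r2_u - r2_r) / q) / ((1 - r2_u) / (n - k - 1))

# Hypothetical values: common TSS, n = 24, k = 3 regressors, q = 2 restrictions.
tss, rss_u, rss_r = 200.0, 100.0, 150.0
n, k, q = 24, 3, 2

r2_u = 1 - rss_u / tss   # 0.50
r2_r = 1 - rss_r / tss   # 0.25

f_star = f_from_r2(r2_u, r2_r, q, n, k)
same_as_rss_form = abs(f_star - f_from_rss(rss_r, rss_u, q, n, k)) < 1e-9

critical_5pct = 3.49     # F(2, 20) at the 5% level, from a standard F-table
reject_null = f_star > critical_5pct
print(f_star, same_as_rss_form, reject_null)
```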


Joint Tests on Several Regression Coefficients
• In summary, implementing an F test involves four
separate steps
a. Run the unrestricted regression and calculate the
resulting R²
b. Run the restricted regression, again calculating the
resulting R²
c. Form the F statistic (F*)
d. Find the critical value $F_{q,\,n-k-1}$ in F-tables for
the significance level you think
appropriate
• If F* > $F_{q,\,n-k-1}$, reject the null hypothesis
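The four steps can be sketched end to end as below, testing H0: β1 = 0 and β3 = 0 in a model with three regressors. Everything here is an assumption for illustration: the data are made up (Y is built to depend strongly on X1 and X3), `ols` and `r_squared` are plain-Python helpers rather than library routines, and the 5% critical value 5.14 for 2 and 6 degrees of freedom comes from a standard F-table.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def r_squared(X, y):
    """R^2 = 1 - RSS/TSS for the least-squares fit of y on the columns of X."""
    b = ols(X, y)
    fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
    y_bar = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - rss / tss

# Hypothetical data: n = 10 observations, k = 3 explanatory variables.
x1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x2 = [5, 3, 8, 1, 9, 2, 7, 4, 10, 6]
x3 = [2, 7, 1, 8, 3, 9, 4, 10, 5, 6]
noise = [0.5, -0.5, 0.3, -0.3, 0.2, -0.2, 0.4, -0.4, 0.1, -0.1]
y = [1 + 2 * a + 0.5 * b + 3 * c + e for a, b, c, e in zip(x1, x2, x3, noise)]
n, k, q = len(y), 3, 2

# a. Unrestricted regression: Y on X1, X2 and X3.
r2_u = r_squared([[1.0, a, b, c] for a, b, c in zip(x1, x2, x3)], y)
# b. Restricted regression under H0 (beta1 = beta3 = 0): Y on X2 only.
r2_r = r_squared([[1.0, b] for b in x2], y)
# c. Form the F statistic in its R^2 form.
f_star = ((r2_u - r2_r) / q) / ((1 - r2_u) / (n - k - 1))
# d. Compare with the 5% critical value F(2, 6) = 5.14 from a standard F-table.
critical = 5.14
reject = f_star > critical
print(r2_u, r2_r, f_star, reject)
```

Because Y was constructed to depend on X1 and X3, dropping them worsens the fit sharply and the test rejects the null.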


Testing Whether All of the Regression Slope Coefficients are Zero: The F Test
• We have discussed the use of F tests to test
hypotheses about a subset of coefficients
• But one specific F test is so common that it is
sometimes called “The F Test”
– Tests the hypothesis that all of the slope coefficients in a
model are jointly zero
• Does not require the intercept to be zero
• “The F Test” is not “The Only F Test”
– But a specific example of the more general form of F tests
described earlier


Testing Whether All of the Regression Slope Coefficients are Zero: The F Test

• Consider the regression model:

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i$$

• “The F Test” will test whether all three
slope coefficients are equal to zero
• The null and alternative hypotheses are

$$H_0: \beta_1 = \beta_2 = \beta_3 = 0$$
$$H_1: H_0 \text{ is not true}$$

• H0 states that none of the variables in the model
(excluding the intercept) has any explanatory power

Testing Whether All of the Regression Slope Coefficients are Zero: The F Test
• In the special case of “The F Test”, the restricted model
(the model when the null is true) is a regression of Y
on a constant alone
• A constant cannot explain any of the variation
in the dependent variable
– Thus the R² for the restricted model is zero
– The number of restrictions equals the number
of slope coefficients set to zero, or k


Testing Whether All of the Regression Slope Coefficients are Zero: The F Test

• The F statistic would simplify to

$$F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}$$

where R² is the R² of the unrestricted regression
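A short sketch of this special case: setting the restricted R² to zero and q = k in the general R² form gives the overall F statistic. The function names and the values R² = 0.5, n = 24, k = 3 are hypothetical.

```python
def general_f(r2_u, r2_r, q, n, k):
    """General F statistic in its R-squared form."""
    return ((r2_u - r2_r) / q) / ((1 - r2_u) / (n - k - 1))

def overall_f(r2, n, k):
    """F statistic for H0: all k slope coefficients are zero."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# Hypothetical values: R^2 = 0.5 from a regression with n = 24 and k = 3.
r2, n, k = 0.5, 24, 3

f_overall = overall_f(r2, n, k)
# Same number from the general formula with restricted R^2 = 0 and q = k.
f_general = general_f(r2_u=r2, r2_r=0.0, q=k, n=n, k=k)
print(f_overall, f_general)
```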


Applications


References
• See Course outline plus
• Ashenfelter (2003), Statistics and
Econometrics, John Wiley
