L1090 Lecture5 AU24

This document discusses inference and hypothesis testing in multiple regression analysis, focusing on the statistical significance of estimated coefficients and the testing of population parameters. It outlines the assumptions required for normal sampling distributions, the use of the t-distribution for hypothesis tests, and the steps involved in testing hypotheses about population parameters. Additionally, it covers the F-test for testing multiple linear restrictions and the overall significance of regression models.

L1090 Intro to Econometrics

Lecture Five:
Inference and Hypothesis Testing

Shilan Dargahi
Learning Outcomes
• Discuss the statistical significance of estimated coefficients
• Test predictions about population parameters
• Test exclusion restrictions on OLS coefficients

2
Multiple Regression Analysis: Inference
• Statistical inference in the regression model
– Hypothesis tests about population parameters
– Construction of confidence intervals

• Sampling distributions of the OLS estimators


– The OLS estimates are random variables with expected value and variance
– However, for hypothesis tests we also need to know their distribution
– In order to derive their distribution we need additional assumptions
– Assumption about distribution of errors: normal distribution
Normality of the Error Term
Assumption MLR.6

It is assumed that the unobserved factors (errors) are normally distributed around the population regression function.
4
Multiple Regression Analysis: Inference
Theorem: Normal sampling distributions
• Under assumptions MLR.1 – MLR.6:

    β̂_j ~ Normal(β_j, Var(β̂_j))

  The estimators are normally distributed around the true parameters, with the variance that was derived earlier.

• Moreover,

    (β̂_j − β_j) / sd(β̂_j) ~ Normal(0, 1)

  The standardised estimators follow a standard normal distribution.

5
Testing hypotheses about a single population parameter
• Replacing sd(β̂_j) with its sample estimate se(β̂_j) (why? sd(β̂_j) depends on the unknown error variance, which must itself be estimated):

    (β̂_j − β_j) / se(β̂_j) ~ t_{n−k−1}

  If the standardisation is done using the estimated standard errors, the normal distribution is replaced by a t-distribution.

• The t-distribution is close to the standard normal distribution if n−k−1 is large.
• In practice we always use se(β̂_j).
t-distribution and standard normal distribution (z)
• At higher degrees of freedom, the t-distribution converges towards the z-distribution.

• When d.f. > 120 we can use the two distributions interchangeably.

7
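This convergence can be checked numerically. A minimal sketch using scipy (the code and numbers are illustrative, not part of the lecture):

```python
from scipy import stats

# Two-sided 5% critical values from the t-distribution for growing
# degrees of freedom; the z-distribution gives 1.96.
for df in (10, 30, 120, 1000):
    c = stats.t.ppf(0.975, df)   # 97.5th percentile -> 2.5% in each tail
    print(df, round(c, 3))
# 10 -> 2.228, 30 -> 2.042, 120 -> 1.98, 1000 -> 1.962

print("z:", round(stats.norm.ppf(0.975), 3))   # 1.96
```

By 120 degrees of freedom the t critical value is already within 0.02 of the z value, which is why the two tables can be used interchangeably beyond that point.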
Inference and Hypothesis testing
We can use the fact that (β̂_j − β_j) / se(β̂_j) ~ t_{n−k−1} to make inferences about the effect of x_j on the expected value of y in the population.

➢ Recall that 𝛽𝑗 is unknown


➢ We can hypothesise its value and use the evidence from the sample to
test our hypothesis

8
Inference and Hypothesis testing
In a nutshell, the procedure is as follows:

• Make a hypothesis about the value of 𝛽𝑗 . For example, H0: 𝛽𝑗 = a against H1: 𝛽𝑗 > a

• Given the observed value of 𝛽መ𝑗 in the sample, how likely is it that the hypothesis about
𝛽𝑗 is true?

• Since E(β̂_j) = β_j, if the hypothesis is true, the difference between β̂_j and β_j, weighted by the sampling error of β̂_j, i.e. (β̂_j − β_j) / se(β̂_j), should be small.

• How small does the ratio need to be for us to be convinced that the true population parameter is equal to the hypothesised value? (significance level)

• Decide the level of significance, which is the margin of error we are willing to accept
9
Inference and Hypothesis testing Contd.
• Suppose the hypothesis (H0) is that 𝛽𝑗 = 0 and the alternative is 𝛽𝑗 > 0
• If the hypothesis is true, the t-ratio (β̂_j − β_j) / se(β̂_j) follows a t-distribution and its value is close to zero

• But if the ratio is sufficiently larger than zero then we have evidence against our hypothesis
• By comparing this t-ratio against the t-distribution we can check the probability that our
hypothesised value is correct
• In our inference we may commit an error. We might reject our hypothesis when it is in fact true
(level of significance / type I error).
Example: if we settle on a significance level of 5%, we are willing to mistakenly reject H0 in 5% of the cases in which the hypothesis is in fact true

10
Significance level and the decision rule
[Figure: t-distribution with the critical value marked. 95% of the time, values from the t-distribution fall in the acceptance region (do not reject the hypothesis); 5% of the time the value falls beyond the critical value, in the rejection region. Is our t-ratio smaller or larger than the critical value?]
Steps in Testing Hypothesis about Population Parameters
• Write the null and alternative
  – e.g. H0: β = 1 against H1: β < 1
• Choose a significance level: e.g. 5%
• Look up the critical value, c, in the relevant statistical tables
• Calculate the test statistic
• Decision: compare the test statistic with the critical value; reject H0 or do not reject H0

Tip: sketch the distribution and mark on the critical value and the reject/non-reject regions to avoid confusion.

[Figure: sketch of the distribution with critical value c separating the "Do not reject H0" region from the "Reject H0" tail.]
12
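The steps above can be sketched in code. The estimate, standard error, and degrees of freedom below are made-up numbers for illustration, following the slide's example hypotheses:

```python
from scipy import stats

# Hypothetical sample results (not from the lecture)
beta_hat = 0.85   # estimated coefficient
se = 0.06         # its standard error
df = 50           # n - k - 1

# Step 1: H0: beta = 1 against H1: beta < 1 (left-tailed)
# Step 2: significance level 5%
alpha = 0.05
# Step 3: critical value (negative, since the rejection region is the left tail)
c = stats.t.ppf(alpha, df)            # ~ -1.68
# Step 4: test statistic - subtract the hypothesised value
t_stat = (beta_hat - 1) / se          # = -2.5
# Step 5: decision
reject = t_stat < c                   # True: reject H0 at the 5% level
```

Note that the hypothesised value (here 1, not 0) is subtracted in the numerator, and that for a left-tailed test we reject when the statistic falls below the negative critical value.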
Possible alternatives in hypothesis
about single population parameter

• One-sided Hypothesis tests


• H1: β_i > a [right-tailed test] or
• H1: β_i < a [left-tailed test]
• Two-sided Hypothesis tests
• H1: β_i ≠ a

13
Poll 5.1) Which of the following are the correct hypotheses to test that population parameter β_i is smaller than 1?
a) H0: β_i ≤ 1 ; H1: β_i < 1
b) H0: β_i = 1 ; H1: β_i < 1
c) H0: β̂_i = 1 ; H1: β̂_i < 1
d) H0: β_i > 1 ; H1: β_i ≤ 1
e) H0: β_i ≥ 1 ; H1: β_i < 1
14
Test of significance
• Null hypothesis H0: β_j = 0
  This null tests the hypothesis that the population parameter is equal to zero, i.e. after controlling for the other independent variables there is no effect of x_j on y.
• The alternative is usually two-sided: H1: β_j ≠ 0
• t-statistic (or t-ratio) and its distribution if the null is true:

    t_{β̂_j} = β̂_j / se(β̂_j) ~ t_{n−k−1}

• Define a rejection rule (Type I error): H0 is rejected in favour of H1 at the chosen significance level if |t_{β̂_j}| > c
15
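The two-sided rejection rule can be illustrated with made-up numbers (beta_hat, se, and df are assumptions, not from the lecture):

```python
from scipy import stats

beta_hat, se, df = 0.52, 0.21, 60     # hypothetical estimate, se, and n - k - 1

# t-ratio for H0: beta_j = 0
t_ratio = beta_hat / se               # ~ 2.48

# Two-sided 5% critical value: 2.5% in each tail
c = stats.t.ppf(0.975, df)            # ~ 2.00
reject = abs(t_ratio) > c             # True: the variable is statistically significant

# Equivalently, the two-sided p-value: the probability of a t-ratio at least
# this large in absolute value when the null is true
p_value = 2 * stats.t.sf(abs(t_ratio), df)
```

Rejecting whenever |t| > c at the 5% level is the same decision as rejecting whenever the two-sided p-value is below 0.05.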
“Statistically significant” variables in a regression
– If a regression coefficient is found to be statistically different from zero, based on a two-sided test, the corresponding variable is said to be "statistically significant".
– If the degrees of freedom are large enough that the normal approximation applies, we have the following rules of thumb:

16
Testing against two-sided alternatives
• Reject the null hypothesis in favour of the
alternative hypothesis if the absolute value of t-
ratio is too large.

• Construct the critical value: e.g. a 5% significance level means there is a 5% chance that the null hypothesis is true but the t-ratio falls in the rejection area.

• Note that for a two-sided test the rejection area is ½ of the chosen significance level at each tail of the distribution.

• Reject if the absolute value of the t-statistic is greater than the critical value (2.06 in this example).

17
Example: Wage equation
Question: What are the null and alternative hypotheses for the test that, after controlling for education and tenure, the effect of experience on earnings is zero, against the alternative that higher work experience leads to higher hourly wages?

18
Example: Student performance and school size
Test whether smaller school size leads to better student performance

19
Example: Student performance and school size
• Null and alternative
  H0: β̂_enroll = 0 ; H1: β̂_enroll < 0 [left-tailed test] (spot the mistake?!)
• Significance level is 5%
• Critical value t_c = 1.65 (n = 408)
• t-ratio = (β̂_j − β_j) / se(β̂_j) = (−0.0002 − 0) / 0.00022 = −0.91
• Decision: reject the null if t_{β̂_enroll} < −t_c
• Conclusion: we cannot reject the null. At the 5% level of significance we do not have evidence that smaller school size leads to better performance.

20
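The slide's numbers can be reproduced directly. The degrees of freedom below assume k = 3 regressors in the model, which is an assumption; the slide only gives n = 408:

```python
from scipy import stats

beta_hat = -0.0002    # coefficient on enroll (from the slide)
se = 0.00022          # its standard error (from the slide)
n = 408
k = 3                 # assumed number of regressors; not stated on the slide
df = n - k - 1

t_ratio = (beta_hat - 0) / se        # ~ -0.91
c = stats.t.ppf(0.05, df)            # ~ -1.65 (left-tail 5% critical value)
reject = t_ratio < c                 # False: cannot reject H0
```

Because −0.91 does not fall below −1.65, the t-ratio lands in the non-rejection region, matching the slide's conclusion.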
Testing more general hypotheses about a
regression coefficient
• Null hypothesis
• Alternative can be one or two sided
• t-statistic

• The test works exactly as before, except that the hypothesised value is subtracted from the estimate when forming the statistic.
21
Confidence intervals

• Interpretation of the confidence interval


– The bounds of the interval are random.
– In repeated samples, the interval that is constructed in the above way
will cover the population regression coefficient in 95% of the cases.
22
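A 95% interval of this kind can be constructed as β̂_j ± c·se(β̂_j), where c is the 97.5th percentile of t_{n−k−1}. A sketch with made-up numbers (the estimate, standard error, and df are assumptions):

```python
from scipy import stats

beta_hat, se, df = 0.52, 0.21, 60    # hypothetical estimate, standard error, df

c = stats.t.ppf(0.975, df)           # ~ 2.00
ci = (beta_hat - c * se, beta_hat + c * se)

# Link to hypothesis tests: H0: beta_j = a is rejected at the 5% level
# (two-sided) exactly when a lies outside this interval.
significant = not (ci[0] <= 0 <= ci[1])   # True here: 0 is outside the interval
```

In repeated samples, intervals constructed this way cover the population coefficient 95% of the time; the bounds, not the parameter, are the random quantities.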
Confidence intervals for typical confidence levels
• Confidence intervals for typical confidence levels

• Relationship between confidence intervals and hypotheses tests

23
Testing multiple linear restrictions: The F-test
• Testing exclusion restrictions

24
Testing multiple linear restrictions: The F-test
• Estimation of the unrestricted model

25
Testing multiple linear restrictions: The F-test
• Estimation of the restricted model

• Test statistic:

26
Testing multiple linear restrictions: The F-test
Rejection rule
• F-distributed variables only take on positive values. This corresponds to the fact that the sum of squared residuals can only increase if one moves from H1 to H0.
• Choose the significance level to fix how often we may incorrectly reject the null hypothesis, for example in 5% of the cases.
• Look up the critical value from the F-distribution (2.6 in this example).
27
Testing multiple linear restrictions: The F-test

𝐹~𝐹3,347 ⟹ Critical value at 5% level of significance with 3 and 347 degrees of freedom is 2.6
Since 𝐹 > 𝑐0.05 the null hypothesis is rejected

• Discussion
– The three variables are “jointly significant”
– They were not significant when tested individually
– The likely reason is multicollinearity between them
28
Testing multiple linear restrictions: The F-test
• The null H0: β_1 = β_2 = ⋯ = β_j = 0
• The alternative: H0 is not true
• Under the null we construct the following test statistic:

    F = [(SSR_r − SSR_ur) / q] / [SSR_ur / (n − k − 1)] ~ F_{q, n−k−1}

• Where SSR_r is the SSR of the restricted model under the null
• This test can be done with the R-squared of the regressions too
• The test statistic based on the R-squared of the restricted and unrestricted models:

    F = [(R²_ur − R²_r) / q] / [(1 − R²_ur) / (n − k − 1)] ~ F_{q, n−k−1}

  (n − k − 1 are the degrees of freedom of the unrestricted model)
29
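The SSR form of the statistic can be computed directly. The SSR values below are made up; q = 3 and n − k − 1 = 347 match the degrees of freedom in the slides' example:

```python
from scipy import stats

ssr_r, ssr_ur = 198.3, 183.2   # hypothetical restricted / unrestricted SSR
q = 3                          # number of exclusion restrictions
df_denom = 347                 # n - k - 1 of the unrestricted model

F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_denom)   # ~ 9.53
c = stats.f.ppf(0.95, q, df_denom)                 # ~ 2.6, as in the slides
reject = F > c                                     # True: jointly significant
```

Since SSR_r ≥ SSR_ur always, the numerator is non-negative, which is why F takes only positive values.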
Test of overall significance of a regression
Model:

Since the restricted model has no explanatory variables, its R² is zero.

30
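Because R²_r = 0 in the overall significance test, the R²-form statistic simplifies to F = (R²/k) / ((1 − R²)/(n − k − 1)). A sketch with assumed numbers (R², n, and k are made up):

```python
from scipy import stats

r2 = 0.30        # hypothetical R-squared of the full model
n, k = 100, 4    # hypothetical sample size and number of regressors

# Restricted model has no regressors, so its R-squared is zero and q = k
F = (r2 / k) / ((1 - r2) / (n - k - 1))   # ~ 10.18
c = stats.f.ppf(0.95, k, n - k - 1)       # ~ 2.47
reject = F > c                            # True: regressors jointly significant
```

This is the F statistic routinely reported in regression output as the test of overall significance.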
Testing general linear restrictions with the F-test
• Example: Test whether house price assessments are rational

31
Testing general linear restrictions with the
F-test
• Unrestricted regression

• Restricted regression

• Test statistic

32
Summary of hypothesis testing
• Tests about a single parameter: use t-test
– Use the t-distribution to obtain critical values if the degrees of freedom are less than 120
– Use the z-distribution for critical values when the degrees of freedom are 120 and above
• Tests about multiple restrictions
– Exclusion restrictions (or overall significance of
model): use F test (where possible the R2 form)
– General restrictions: use F test (SSR form only)
33
