CHAPTER TWO
SIMPLE LINEAR REGRESSION
2.1 Definition
Regression analysis is the process of estimating the relationship between two or more
variables. In any regression there is a dependent variable and one or more explanatory
(independent) variables; regression is therefore used to study the dependence of one
variable (the dependent variable) on one or more explanatory (independent) variables.
We regress the dependent variable on the explanatory variables ("regress Y on X"), and
we estimate or predict the expected (mean) value of the dependent variable in terms of
the known (fixed) values of the independent variables.
The Population Regression Function (PRF), which gives the conditional expected value
of the dependent variable (conditional upon X, the independent variable), is written as:

E(Y|Xi) = β0 + β1Xi

Where: E(Y|Xi) is the conditional mean of Y at a given value of X;
Y is the dependent variable and X is the independent variable;
β0 and β1 are the regression coefficients: β0 is the intercept coefficient and β1 is the slope coefficient.
Consider, for example, the relation between household consumption expenditure and income,
where consumption expenditure (Y) is the dependent variable and income (X) is the explanatory
variable. That is, consumption expenditure increases as income increases.
Using the population regression line, the PRF is graphically shown as follows
[Figure: the population regression line E(Y|X) = β0 + β1X; points A, B, C and D mark the conditional mean of consumption at successive income levels.]
The population regression line is the locus of the conditional means of the dependent
variable for the fixed values of the explanatory variable(s). It traces the average
consumption (Y) at each given level of income (X): on average, consumption expenditure
increases as income increases.
Note that at any given X, expenditure is random; it could be above or below the
regression line. That is, Y is randomly distributed while X is statistically fixed. However,
the conditional mean of consumption expenditure, E(Y|Xi), of households at each income
level is predictable, and it is denoted by a point on the regression line, such as points A,
B, C and D. For example, at income 100, the expected expenditure is birr 70 (point A on the
PRF line); but the actual consumption expenditure of a particular household could be anywhere
above or below point A (it could be 50, 55, 60, … or 75, 80, 90, etc.).
Our objective in regression analysis is to find out how the average value of the
dependent variable varies with the given value of the explanatory variable.
Since we do not have data on the entire population, we rely on sample data to
estimate the population mean values. The sample counterpart of the population
regression function is referred to as the Sample Regression Function (SRF), which is given as:

Ŷi = β̂0 + β̂1Xi

If we draw the line of the estimated mean values (SRF), it will not necessarily coincide
with the PRF line; it only lies approximately close to it (in the figure below, the broken
line shows the estimated SRF).
[Figure: the SRF (broken line) plotted against the PRF, with conditional means marked at points A, B, C and D.]
The simple linear regression model is a regression model that contains only two
variables: one dependent and one explanatory variable.
Ordinary Least Squares (OLS) is one of the most widely used estimation methods in regression.
OLS estimators are used to estimate the population parameters. However, OLS estimates are
valid only if certain key assumptions are satisfied; these are referred to as the assumptions
of the Classical Linear Regression Model (CLRM) and are discussed below.
The model must be linear in the parameters, the β's; however, it may or may not be
linear with respect to the explanatory variable, X.
Example:

Y = β0 + β1X + u
Y = β0 + β1X² + u          are all linear-in-parameters regression functions.
lnY = β0 + β1 lnX + u

But

Y = β0 + β1²X + u
Y = β0 + (1/β1)X + u       are non-linear-in-parameters regression functions.
Assumption 3: The expected value of the error terms (mean value) is zero.
Zero mean value of the error terms means that, at each given value of the
explanatory variable, the errors average out to zero; that is, E(ui | Xi) = 0.
[Figure: error terms scattered symmetrically around zero at each value of X.]
Assumption 4: Homoscedasticity, or constant variance of the error terms, ui. The
conditional variance of the error term (conditional upon the explanatory variable, X) is
the same for all observations; more specifically:

Var(ui | Xi) = E[ui − E(ui | Xi)]² = E(ui² | Xi) = σ²,   i = 1, 2, …, n
Even if we vary the value of the explanatory variable (X), the variance of the error terms
corresponding to each value of the explanatory variable remains the same. The opposite of
homoscedasticity is heteroscedasticity, which means the variance of the error term is
not constant.
Assumption 5: No autocorrelation between the disturbance or error terms.
Given any two X values, Xi and Xj (i ≠ j), the correlation between the corresponding error
terms ui and uj (i ≠ j) is zero. There should be no correlation (covariance) between any two
error terms. That is:

Cov(ui, uj) = E{[ui − E(ui)][uj − E(uj)]} = E(ui uj) = 0, since E(ui) = E(uj) = 0.
Assumption 6: Zero covariance between the error terms and the explanatory
variable, Xi. That is: E(ui Xi) = Cov(ui, Xi) = 0.
Assumption 7: The number of observations, n, must be greater than the number of
explanatory variables. In other words, the number of observations n must be greater
than the number of parameters to be estimated.
When the error term is normally distributed, the dependent variable Y and the estimated
parameters of the regression are also normally distributed, so that tests of the statistical
significance of the parameters can be conducted.
The OLS regression method is based on the assumptions of the CLRM. OLS estimates are
acceptable only if the CLRM assumptions are satisfied; if those assumptions are not
satisfied, the OLS results cannot be relied upon. The OLS method has some very attractive
statistical properties, discussed later, that have made it one of the most powerful and
popular methods of regression analysis.
Since the PRF is not observable, we estimate it from the SRF (Sample Regression
Function):
Ŷi = β̂0 + β̂1Xi

Yi = Ŷi + ûi, so that the residual is

ûi = Yi − Ŷi = Yi − (β̂0 + β̂1Xi)

[Figure: the SRF Ŷi = β̂0 + β̂1Xi with sample points scattered around it; the vertical distance from each point to the line is the residual ûi.]
Now, given data (observations) on Y and X, we would like to determine the SRF in such a
manner that the estimated value Ŷi is as close as possible to the actual Yi. To this end,
the OLS method determines (estimates) β̂0 and β̂1 in such a way that the Residual Sum of
Squares (RSS) is as small as possible:

min over β̂0, β̂1 of  Σûi² = Σ(Yi − β̂0 − β̂1Xi)²
Then take the partial derivatives with respect to β̂0 and β̂1 and set them to zero (the
first-order conditions):

∂(Σûi²)/∂β̂0 = −2Σ(Yi − β̂0 − β̂1Xi) = −2Σûi = 0   …(1)

∂(Σûi²)/∂β̂1 = −2Σ(Yi − β̂0 − β̂1Xi)Xi = −2ΣXiûi = 0   …(2)

From (1): ΣYi/n − β̂0 − β̂1(ΣXi/n) = 0; thus,

β̂0 = Ȳ − β̂1X̄
β̂0 is the least squares point estimator of β0, the intercept term, where Ȳ and X̄ are
the sample means of Y and X respectively.
Substituting β̂0 = Ȳ − β̂1X̄ into equation (2):

Σ(Yi − β̂0 − β̂1Xi)Xi = 0
ΣXiYi − β̂0ΣXi − β̂1ΣXi² = 0
ΣXiYi − (Ȳ − β̂1X̄)ΣXi − β̂1ΣXi² = 0
ΣXiYi − nX̄Ȳ + β̂1nX̄² − β̂1ΣXi² = 0

β̂1 = (ΣXiYi − nX̄Ȳ) / (ΣXi² − nX̄²) = Σxiyi / Σxi²

where xi = Xi − X̄ and yi = Yi − Ȳ are deviations from the sample means. β̂1 is the least
squares point estimator of β1, the slope coefficient.
Numerical Example 1: Consider hypothetical data on output (Y) produced and labour
input (X) used by a firm, given as follows:

Obs. (firm):  1    2    3    4    5    6    7    8    9    10
Y:           11   10   12    6   10    7    9   10   11   10
X:           10    7   10    5    8    8    6    7    9   10
Then we have two variables, Y (the dependent variable) and X (the explanatory variable),
and sample size n = 10, with:

ΣXi = 80, ΣYi = 96, ΣXiYi = 789, ΣXi² = 668, ΣYi² = 952, X̄ = 8, Ȳ = 9.6

β̂1 = (ΣXiYi − nȲX̄) / (ΣXi² − nX̄²) = (789 − 10(9.6)(8)) / (668 − 10(8)²) = 21/28 = 0.75

β̂0 = Ȳ − β̂1X̄ = 9.6 − 0.75(8) = 3.6
where β̂0 = 3.6 and β̂1 = 0.75 are point estimates of the true parameters β0 and β1. The value
of β̂1 (0.75) is interpreted as the marginal product of labour: for a one-unit increase in
labour employment, total output increases by 0.75 units, on average.
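As a quick check (not part of the original notes), the following minimal Python sketch applies the deviation-form OLS formulas derived above to the data of Numerical Example 1; it assumes numpy is available, and the variable names are illustrative.

```python
import numpy as np

# Data from Numerical Example 1: output (Y) and labour input (X)
Y = np.array([11, 10, 12, 6, 10, 7, 9, 10, 11, 10], dtype=float)
X = np.array([10, 7, 10, 5, 8, 8, 6, 7, 9, 10], dtype=float)
n = len(Y)
x_bar, y_bar = X.mean(), Y.mean()

# beta1_hat = (Sum(XiYi) - n*Xbar*Ybar) / (Sum(Xi^2) - n*Xbar^2)
beta1_hat = (np.sum(X * Y) - n * x_bar * y_bar) / (np.sum(X ** 2) - n * x_bar ** 2)
# beta0_hat = Ybar - beta1_hat * Xbar
beta0_hat = y_bar - beta1_hat * x_bar

print(beta1_hat, beta0_hat)  # 0.75 and 3.6, as computed above
```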
The estimated regression line and the residuals have the following numerical properties:
1. The sum, and hence the mean, of the residuals is zero: Σûi = 0 and û̄ = Σûi/n = 0. This
follows directly from the first normal equation, −2Σ(Yi − β̂0 − β̂1Xi) = 0. Note also that the
estimated regression line passes through the sample means (X̄, Ȳ), since β̂0 = Ȳ − β̂1X̄.
2. The mean of the estimated (fitted) values Ŷi is equal to the mean of the actual Y.
Since Yi = Ŷi + ûi, summing over the sample and dividing by n gives
Ȳ = (ΣŶi)/n + (Σûi)/n = (ΣŶi)/n + 0; hence the mean of Ŷi equals Ȳ.
3. The residuals and the estimated values of the dependent variable are uncorrelated,
that is, Cov(Ŷi, ûi) = 0:

Cov(Ŷi, ûi) = E[(Ŷi − Ȳ)(ûi − û̄)] = E(ŷi ûi) = E(β̂1 xi ûi) = β̂1 E(xi ûi) = 0

since E(xi ûi) = Cov(xi, ûi) = 0 (from the second normal equation), where ŷi = Ŷi − Ȳ = β̂1xi.
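The three numerical properties above can be verified directly on Numerical Example 1 with a short Python sketch (an illustrative check, not part of the original notes; numpy assumed available):

```python
import numpy as np

Y = np.array([11, 10, 12, 6, 10, 7, 9, 10, 11, 10], dtype=float)
X = np.array([10, 7, 10, 5, 8, 8, 6, 7, 9, 10], dtype=float)

# Fitted values and residuals from the estimates obtained above
Y_hat = 3.6 + 0.75 * X
u_hat = Y - Y_hat

print(u_hat.sum())                              # ~0: the residuals sum to zero
print(Y_hat.mean(), Y.mean())                   # both 9.6: mean of fitted = mean of actual
print(np.sum((Y_hat - Y_hat.mean()) * u_hat))   # ~0: fitted values and residuals uncorrelated
```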
The coefficient of determination is a summary measure that tells how well the
sample regression line fits the observations (data). Using the sample observations we
obtain the SRF; the measure of 'goodness of fit', denoted by r² in the simple regression
model, helps us to see how close the estimated sample regression line is to the data and,
thereby, to the population regression line.
Recall that Yi = Ŷi + ûi. Written in deviation form, yi = ŷi + ûi, where yi = Yi − Ȳ and
ŷi = Ŷi − Ȳ. Squaring and summing over the sample:

TSS = Σyi² = Σŷi² + Σûi²
ESS = Σŷi² = β̂1²Σxi²
RSS = Σûi² = Σyi² − β̂1²Σxi²

TSS = ESS + RSS
Where:
TSS: Total Sum of Squares (the total variation of the dependent variable);
ESS: Explained Sum of Squares (the variation explained, or accounted for, by the explanatory variable);
RSS: Residual Sum of Squares (the unexplained variation, i.e. the variation in the dependent
variable that is not explained by the explanatory variable in the model).
The coefficient of determination (r²) is computed as the ratio of ESS to TSS obtained
from the data:

r² = ESS/TSS = 1 − RSS/TSS
From Numerical Example 1 above, we compute TSS, ESS and RSS as follows:

a) TSS = Σyi² = ΣYi² − nȲ² = 952 − 10(9.6)² = 30.4
b) ESS = β̂1²Σxi² = (0.75)²(28) = 15.75
c) RSS = TSS − ESS = 30.4 − 15.75 = 14.65

r² = ESS/TSS = 15.75/30.4 = 0.52
Interpretation: r² = 0.52 means that about 52% of the variation in output (Y) is explained
by the variation in labour input (X).
If r² = 0, the model explains nothing: the explanatory variable does not explain any of the
changes in the dependent variable.
If r² = 1, the fit is perfect: Ŷi = Yi for every observation.
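The decomposition and r² for Numerical Example 1 can be reproduced with the following illustrative Python sketch (not part of the original notes; numpy assumed available):

```python
import numpy as np

Y = np.array([11, 10, 12, 6, 10, 7, 9, 10, 11, 10], dtype=float)
X = np.array([10, 7, 10, 5, 8, 8, 6, 7, 9, 10], dtype=float)
beta0_hat, beta1_hat = 3.6, 0.75

u_hat = Y - (beta0_hat + beta1_hat * X)
TSS = np.sum((Y - Y.mean()) ** 2)                   # total variation in Y
ESS = beta1_hat ** 2 * np.sum((X - X.mean()) ** 2)  # variation explained by X
RSS = np.sum(u_hat ** 2)                            # unexplained variation

print(TSS, ESS, RSS)   # 30.4, 15.75, 14.65
print(ESS / TSS)       # r-squared, about 0.52
```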
According to the Gauss–Markov theorem, given the assumptions of the CLRM, the OLS
estimators are BLUE (Best Linear Unbiased Estimators); that is:
1) They are linear, that is, linear functions of a random variable such as the dependent
variable Y.
2) They are unbiased: the expected value of each estimator is equal to its true value,
E(β̂1) = β1 and E(β̂0) = β0.
3) They have minimum variance (they are efficient estimators) in the class of all such
linear unbiased estimators.
Proof

1) β̂1 is linear in Yi.

β̂1 = Σxiyi/Σxi² = ΣxiYi/Σxi²   (since Σxi(−Ȳ) = −ȲΣxi = 0)

Let wi = xi/Σxi². Then

β̂1 = ΣwiYi = w1Y1 + w2Y2 + … + wnYn,

where the wi's are fixed since the xi's are fixed. Hence β̂1 is a linear function of the Yi's.

2) β̂1 is unbiased: E(β̂1) = β1.

Substituting Yi = β0 + β1Xi + ui into β̂1 = ΣwiYi, and using Σwi = 0 and ΣwiXi = 1, gives
β̂1 = β1 + Σwiui. Taking expectations (the wi's are non-stochastic):

E(β̂1) = E(β1 + Σwiui) = β1 + w1E(u1) + w2E(u2) + … + wnE(un) = β1 + 0

Therefore, E(β̂1) = β1.
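Unbiasedness can also be illustrated by simulation. The sketch below (not from the original notes) repeatedly draws samples from an assumed model with illustrative "true" values β0 = 2, β1 = 0.5 and σ = 1, re-estimates β̂1 each time, and shows that the average of the estimates is close to the true β1:

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1, sigma = 2.0, 0.5, 1.0   # illustrative "true" parameter values (assumed)
X = np.linspace(1, 10, 20)            # fixed regressor values, as the CLRM assumes

estimates = []
for _ in range(5000):
    u = rng.normal(0.0, sigma, size=X.size)   # error term with zero mean
    Y = beta0 + beta1 * X + u
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    estimates.append(b1)

print(np.mean(estimates))   # close to the true beta1 = 0.5, illustrating E(beta1_hat) = beta1
```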
Because the OLS estimates are obtained from a single sample and vary from sample to sample,
we rely on the precision of these estimates in representing the true parameters (β0, β1).
The measure of such precision is the standard error.
The variances and standard errors of the OLS estimators are computed as follows.

a) σ̂² = Σûi²/(n − k) = RSS/(n − k), and σ̂ = √σ̂² is the standard error of the regression,

Where:
σ̂² is the estimator of the actual variance of the error term (σ²);
RSS is the Residual Sum of Squares; and
n − k is the degrees of freedom, where n is the sample size and k is the number of
parameters estimated (k = 2 in the two-variable model).

From Numerical Example 1: σ̂² = 14.65/8 = 1.83.

b) var(β̂0) = σ̂²·ΣXi²/(n·Σxi²) and se(β̂0) = √var(β̂0)

From our example: var(β̂0) = 1.83(668)/(10 × 28) = 4.366 and se(β̂0) = √4.366 = 2.09

c) var(β̂1) = σ̂²/Σxi² and se(β̂1) = σ̂/√Σxi²

From our example: var(β̂1) = 1.83/28 = 0.065 and se(β̂1) = √0.065 = 0.256

d) cov(β̂0, β̂1) = −X̄·var(β̂1)
More variation in X (the explanatory variable) and a larger sample size (n) increase the
precision of the estimators β̂0 and β̂1; this is so because they reduce the variances of the
estimators.
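These variance and standard-error formulas can be checked numerically for Numerical Example 1 with the following illustrative Python sketch (not part of the original notes; numpy assumed available):

```python
import numpy as np

Y = np.array([11, 10, 12, 6, 10, 7, 9, 10, 11, 10], dtype=float)
X = np.array([10, 7, 10, 5, 8, 8, 6, 7, 9, 10], dtype=float)
n, k = len(Y), 2
beta0_hat, beta1_hat = 3.6, 0.75

u_hat = Y - (beta0_hat + beta1_hat * X)
sigma2_hat = np.sum(u_hat ** 2) / (n - k)             # RSS/(n - k), about 1.83

sum_x2 = np.sum((X - X.mean()) ** 2)                  # sum of xi^2 in deviation form = 28
var_b1 = sigma2_hat / sum_x2                          # about 0.065
var_b0 = sigma2_hat * np.sum(X ** 2) / (n * sum_x2)   # about 4.37

print(np.sqrt(var_b0), np.sqrt(var_b1))               # standard errors, about 2.09 and 0.256
```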
2.5. Implications of the Normality Assumption
The classical normal linear regression model assumes that ui ~ N(0, σ²): the error
term is normally distributed with mean zero and variance σ². This implies that the
dependent variable and the estimated coefficients are also normally distributed:

Yi ~ N(β0 + β1Xi, σ²), because Yi is a linear function of ui;

and similarly β̂0 and β̂1 are normally distributed with the following means and variances:

β̂0 ~ N(β0, σ²ΣXi²/(nΣxi²))  and  β̂1 ~ N(β1, σ²/Σxi²)
When σ² is known, the standardized estimator follows the standard normal distribution:

Z = (β̂1 − β1)/se(β̂1) ~ N(0, 1)

When σ² is replaced by its estimator σ̂², the statistic follows the t distribution with
n − k degrees of freedom:

t = (β̂1 − β1)/se(β̂1) ~ t(n − k), and similarly t = (β̂0 − β0)/se(β̂0) ~ t(n − k)
Symbolically, an interval estimate is constructed such that P(β̂ − δ ≤ β ≤ β̂ + δ) = 1 − α,
where α (0 < α < 1) is known as the level of significance (the probability of committing a
Type I error) and 1 − α is the confidence coefficient.
The confidence interval for a true parameter β (either β0 or β1) is constructed as follows,
given the value of α and the degrees of freedom for the t-critical value:

P[−t(α/2) ≤ (β̂ − β)/se(β̂) ≤ t(α/2)] = 1 − α

where n − k denotes the degrees of freedom for the t-critical value; since we have two
parameters in the two-variable model, k = 2, hence df = n − 2 = 8, and if α = 5%, then
1 − α, the confidence level, will be 0.95 or 95%. A 95% confidence interval is then given
by rearranging the above statement:

P[β̂ − t(α/2)·se(β̂) ≤ β ≤ β̂ + t(α/2)·se(β̂)] = 95%
Using the previous firm example, we estimated Ŷi = 3.6 + 0.75Xi, with β̂0 = 3.6 and
se(β̂0) = 2.09. Since we intend to construct a 95% confidence interval, α = 0.05 and
α/2 = 0.025, so the t-critical value with 8 degrees of freedom is 2.306, and the interval
for β0 is 3.6 ± 2.306(2.09), i.e. (−1.22, 8.42).
Interpretation: given the confidence coefficient of 95%, in the long run, in 95 out of
100 cases intervals like (−1.22, 8.42) will contain the true β0.
Similarly, P[β̂1 − t(α/2)·se(β̂1) ≤ β1 ≤ β̂1 + t(α/2)·se(β̂1)] = 95% gives 0.75 ± 2.306(0.256),
i.e. (0.16, 1.34), the interval estimate of the true β1 at the 95% confidence level.
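A small Python sketch (not part of the original notes; scipy assumed available) reproduces both 95% confidence intervals from the estimates and standard errors above:

```python
from scipy.stats import t

# Estimates and standard errors from Numerical Example 1
beta0_hat, se_b0 = 3.6, 2.09
beta1_hat, se_b1 = 0.75, 0.256
df, alpha = 8, 0.05

t_crit = t.ppf(1 - alpha / 2, df)   # about 2.306

print(beta0_hat - t_crit * se_b0, beta0_hat + t_crit * se_b0)   # about (-1.22, 8.42)
print(beta1_hat - t_crit * se_b1, beta1_hat + t_crit * se_b1)   # about (0.16, 1.34)
```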
Hypothesis testing may use a two-tail or a one-tail test. Whether one uses a two-tail or a
one-tail test depends upon how the alternative hypothesis is formulated.
[Figure: the t distribution with the 95% acceptance region in the middle and critical (rejection) regions of α/2 = 0.025 in each tail, bounded by −t(α/2) and t(α/2).]
Decision rule:
Reject H0 if |t computed| > t(α/2, n − k), the critical value from the t table.
For Numerical Example 1, testing H0: β1 = 0 against H1: β1 ≠ 0:

t = (β̂1 − 0)/se(β̂1) = 0.75/0.256 = 2.93

The t-critical value from the table at α/2 = 0.025 and df = 8 is 2.306, so the acceptance
region is [−2.306, 2.306]. The computed t-value of 2.93 lies outside the acceptance region;
in other words, the estimated value lies in the critical (rejection) region, so we reject H0
and conclude that β̂1 is statistically significantly different from zero.
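The same decision can be expressed in a short, illustrative Python sketch (not from the original notes; scipy assumed available):

```python
from scipy.stats import t

beta1_hat, se_b1 = 0.75, 0.256   # estimates from Numerical Example 1
beta1_H0 = 0.0                   # null hypothesis: beta1 = 0
df, alpha = 8, 0.05

t_computed = (beta1_hat - beta1_H0) / se_b1   # about 2.93
t_critical = t.ppf(1 - alpha / 2, df)         # about 2.306 (two-tail)

print(abs(t_computed) > t_critical)   # True -> reject H0: the slope is statistically significant
```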
The overall significance of the regression can be tested using the decomposition of the
total variation:

Σyi² = Σŷi² + Σûi²,  that is,  TSS = ESS + RSS

Thus,

F = (ESS/df of ESS)/(RSS/df of RSS) = [ESS/(k − 1)] / [RSS/(n − k)] ~ F(k − 1, n − k)
Decision Rule
Reject H0 if F computed > F critical(k − 1, n − k) and conclude that the regression
coefficients are jointly statistically significantly different from zero; the joint effect of
the explanatory variable(s) on the dependent variable is significant, and the model is
reliable for prediction.
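For Numerical Example 1 the F test can be reproduced with the following illustrative Python sketch (not part of the original notes; scipy assumed available):

```python
from scipy.stats import f

ESS, RSS = 15.75, 14.65   # from Numerical Example 1
n, k = 10, 2

F_computed = (ESS / (k - 1)) / (RSS / (n - k))   # about 8.6
F_critical = f.ppf(0.95, k - 1, n - k)           # F(1, 8) critical value at the 5% level, about 5.32

print(F_computed > F_critical)   # True -> the regression is jointly statistically significant
```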
In reporting regression results we should follow a standard format in which all relevant
results are included. The most commonly reported results include: the coefficient
estimates, standard errors, computed t-statistics, sample size, r², F-values, the level of
significance (α), and other test results.
Ŷi = 3.6 + 0.75Xi
se = (2.09) (0.256)    r² = 0.52    n = 10    RSS = 14.65
t  = (1.72) (2.93)     df = 8      F(1, 8) = 8.6
Numerical Example 2: Consider the following data on a dependent variable Y and an
explanatory variable X for 10 observations:

Obs.:  1    2    3    4    5    6    7    8    9    10
Y:    70   65   90   95  110  115  120  140  155  150
X:    80  100  120  140  160  180  200  220  240  260
We have two variables, Y (the dependent variable) and X (the explanatory variable), and
sample size n = 10, with X̄ = 170 and Ȳ = 111. In deviation form: Σxiyi = 16,800 and
Σxi² = 33,000.

The model is specified as: Yi = β0 + β1Xi + ui

a) Estimate the two regression coefficients:

β̂1 = Σxiyi/Σxi² = 16,800/33,000 = 0.5091

β̂0 = Ȳ − β̂1X̄ = 111 − 0.5091(170) = 24.45
d) Compute the estimated variance of the error term, σ̂², and the standard error of β̂0:

σ̂² = RSS/(n − 2) = Σûi²/8 ≈ 42.16 and σ̂ = √σ̂² ≈ 6.49

var(β̂0) = σ̂²·ΣXi²/(n·Σxi²) = 41.104, so se(β̂0) = √var(β̂0) = √41.104 ≈ 6.41
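The estimates for this second example can likewise be reproduced with a short Python sketch (an illustrative check, not part of the original notes; numpy assumed available):

```python
import numpy as np

# Data from Numerical Example 2
Y = np.array([70, 65, 90, 95, 110, 115, 120, 140, 155, 150], dtype=float)
X = np.array([80, 100, 120, 140, 160, 180, 200, 220, 240, 260], dtype=float)
n, k = len(Y), 2

x = X - X.mean()
y = Y - Y.mean()
beta1_hat = np.sum(x * y) / np.sum(x ** 2)    # about 0.5091
beta0_hat = Y.mean() - beta1_hat * X.mean()   # about 24.45

u_hat = Y - (beta0_hat + beta1_hat * X)
sigma2_hat = np.sum(u_hat ** 2) / (n - k)                    # estimated error variance
var_b0 = sigma2_hat * np.sum(X ** 2) / (n * np.sum(x ** 2))  # variance of the intercept estimator

print(beta1_hat, beta0_hat, sigma2_hat, np.sqrt(var_b0))     # slope, intercept, sigma^2_hat, se(beta0_hat)
```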
From the results above, we can conclude that both coefficients are statistically
significantly different from zero at the 0.05 level of significance, and the claim of the
null hypothesis is rejected. This is because, for both coefficients, the absolute value of
the computed t-statistic is greater than the t-critical value at the given degrees of
freedom (df = n − k = 8) and level of significance (α = 0.05).