Meeting 2 - Simple Linear Regression
Population Model
• Cross-sectional analysis
• Assume that the sample is collected randomly from the population.
• We want to know how y varies with changes in x.
• What if y is affected by factors other than x?
• What is the functional form?
• How can we distinguish causality from correlation?
• Consider the following model, which holds in the population:
𝑦 = β0 + β1 𝑥 + 𝑢
Population Model
• We allow for other factors to affect y by including u (error term).
• If the other factors in u are held fixed, so that ∆u = 0, then x has a linear
effect on y: ∆y = β1∆x.
• Linearity: a one-unit change in x has the same effect on y, namely β1,
regardless of the initial value of x.
[Figure: the population regression line E(y|x) = β0 + β1x, with the conditional density f(y) drawn at x1 and x2]
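The idea that E(y|x) is a linear function of x can be illustrated with a small simulation. This is only a sketch: the parameter values, the sample size, and the choice to draw u independently of x (so that E(u|x) = 0) are assumptions made for illustration, not part of the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population parameters, chosen only for illustration.
beta0, beta1 = 1.0, 0.5

n = 200_000
x = rng.integers(1, 6, size=n).astype(float)  # x takes the values 1, ..., 5
u = rng.normal(0.0, 1.0, size=n)              # drawn independently of x, so E(u|x) = 0
y = beta0 + beta1 * x + u

# Average y within each value of x and compare with beta0 + beta1 * x.
cond_means = {v: y[x == v].mean() for v in np.unique(x)}
for v, m in cond_means.items():
    print(f"x = {v:.0f}: mean of y = {m:.3f}, beta0 + beta1*x = {beta0 + beta1 * v:.3f}")
```

With a sample this large, the group averages of y should lie close to the line β0 + β1x at every value of x.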
Ordinary Least Squares
● The basic idea of regression is to estimate the population parameters
from a sample.
● Let {(xi,yi): i = 1, …,n} denote a random sample of size n from the
population.
● For each observation in this sample, it will be the case that:
yi = β0 + β1xi + ui
● ui is unobserved.
Deriving OLS Estimates
● To derive the OLS estimates, we need to realize that our
main assumption, E(u|x) = E(u) = 0, also implies that
Cov(x,u) = E(xu) = 0.
● Why? Remember from basic probability that Cov(X,Y) =
E(XY) – E(X)E(Y).
● Derive this!
● We can write our 2 restrictions just in terms of x, y, β0 and
β1, since u = y – β0 – β1x
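The derivation requested above can be sketched in two steps. It relies on the law of iterated expectations, which is a standard result but is not stated in the slides, so treat it as an assumed tool here.

```latex
\begin{aligned}
E(xu) &= E\big[E(xu \mid x)\big] = E\big[x\,E(u \mid x)\big] = E(x \cdot 0) = 0, \\
\operatorname{Cov}(x,u) &= E(xu) - E(x)\,E(u) = 0 - E(x)\cdot 0 = 0 .
\end{aligned}
```

The first line uses E(u|x) = 0, and the second uses E(u) = 0 together with the covariance identity Cov(X,Y) = E(XY) – E(X)E(Y).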
Deriving OLS continued
● Written in terms of x, y, β0 and β1, the 2 restrictions are:
E(y – β0 – β1x) = 0
E[x(y – β0 – β1x)] = 0
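● Replacing the population moments with their sample counterparts gives
the two conditions that define the OLS estimates:
$\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$
$\sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$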
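Solving the two sample conditions for the estimates is a short exercise. The steps below are a sketch of the standard textbook solution, which the slides do not spell out, and they assume the x_i are not all equal so that the denominator is positive.

```latex
\begin{aligned}
&\text{Dividing the first condition by } n: \quad
  \bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}
  \;\Longrightarrow\; \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}. \\
&\text{Substituting into the second condition:} \quad
  \sum_{i=1}^{n} x_i (y_i - \bar{y})
  = \hat{\beta}_1 \sum_{i=1}^{n} x_i (x_i - \bar{x}). \\
&\text{Since } \sum_{i=1}^{n} x_i (y_i - \bar{y}) = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
  \text{ and } \sum_{i=1}^{n} x_i (x_i - \bar{x}) = \sum_{i=1}^{n} (x_i - \bar{x})^2 : \\
&\qquad \hat{\beta}_1
  = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}.
\end{aligned}
```

The slope estimate is therefore the sample covariance of x and y divided by the sample variance of x.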
A short simulation
Residuals and fitted values are uncorrelated, by construction!
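A simulation along these lines might look like the sketch below. The data-generating process (intercept 2, slope 3, normal x and u) and the sample size are hypothetical choices for illustration; the estimates use the standard closed-form solution for simple regression.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data-generating process, for illustration only.
n = 500
x = rng.normal(5.0, 2.0, size=n)
u = rng.normal(0.0, 1.0, size=n)
y = 2.0 + 3.0 * x + u

# Closed-form OLS estimates for simple regression:
# slope = sample covariance of (x, y) / sample variance of x.
x_bar, y_bar = x.mean(), y.mean()
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

fitted = beta0_hat + beta1_hat * x
resid = y - fitted

# Sample correlation between fitted values and residuals.
corr = np.corrcoef(fitted, resid)[0, 1]
print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
print(f"corr(fitted, residuals) = {corr:.2e}")
```

The printed correlation should be zero up to floating-point error, whatever seed or parameter values are used, because it follows from how the estimates are computed rather than from the particular sample.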
Algebraic Properties of OLS
● The sum of the OLS residuals is zero: the coefficients were
chosen to ensure that the residuals sum to zero.
● Thus, the sample average of the OLS residuals is zero as
well.
● The sample covariance (correlation) between the regressors
and the OLS residuals is zero.
● Because fitted values are linear functions of the xi, fitted
values and residuals are uncorrelated too.
● The OLS regression line always goes through the mean of
the sample.
● If we plug in $\bar{x}$, we predict $\bar{y}$; that is, the point
$(\bar{x}, \bar{y})$ is on the OLS regression line:
$\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}$
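The properties listed above can be checked numerically on any sample. The sketch below uses a small made-up data set and a hypothetical helper, `ols_fit`, that applies the standard closed-form estimates for simple regression; neither comes from the slides.

```python
import numpy as np

def ols_fit(x, y):
    """Return (beta0_hat, beta1_hat) for a simple regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat

# Made-up sample, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.7])

beta0_hat, beta1_hat = ols_fit(x, y)
fitted = beta0_hat + beta1_hat * x
resid = y - fitted

# 1. Residuals sum to zero.
sum_resid = resid.sum()
# 2. Sample covariance between the regressor and the residuals is zero.
cov_x_resid = np.mean((x - x.mean()) * (resid - resid.mean()))
# 3. Sample covariance between fitted values and residuals is zero.
cov_fitted_resid = np.mean((fitted - fitted.mean()) * (resid - resid.mean()))
# 4. The regression line passes through (x_bar, y_bar).
line_at_x_bar = beta0_hat + beta1_hat * x.mean()

print(f"sum of residuals:        {sum_resid:.2e}")
print(f"cov(x, residuals):       {cov_x_resid:.2e}")
print(f"cov(fitted, residuals):  {cov_fitted_resid:.2e}")
print(f"line at x_bar vs y_bar:  {line_at_x_bar:.6f} vs {y.mean():.6f}")
```

The first three quantities should be zero up to floating-point error, and the last two numbers should agree.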
Algebraic Properties of OLS
• Residuals sum to zero!
• $\bar{y} = \bar{\hat{y}}$, since $y_i = \hat{y}_i + \hat{u}_i$ and the residuals sum to zero.
• The OLS regression line always goes through the mean of the sample.