Simple Linear Regression

Juvaria Tariq

Fundamentals of Econometrics — Fall 2025


Class Objectives

▶ Introduce the simple linear regression model


▶ Define regression-related terms and notation
▶ Understand residuals and the idea of the “best fit” line
▶ Derive OLS estimators
▶ Interpret estimated coefficients and visualize the fitted model
The Simple Linear Regression Model

▶ The population model:

y = β0 + β1 x + u

▶ y : dependent variable (outcome we want to explain)


▶ x: independent variable (explanatory factor)
▶ β0 , β1 : unknown parameters
▶ u: error term (other unobserved factors)
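To make the model concrete, here is a minimal simulation sketch in Python (NumPy assumed; the parameter values β0 = 2 and β1 = 0.5 and the error distribution are illustrative choices, not from the lecture):

import numpy as np

rng = np.random.default_rng(0)

beta0, beta1 = 2.0, 0.5            # illustrative values; unknown in practice
n = 100
x = rng.uniform(0, 10, size=n)     # independent variable
u = rng.normal(0, 1, size=n)       # error term: unobserved factors with E[u|x] = 0
y = beta0 + beta1 * x + u          # dependent variable generated by the population model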
Terminology: Model vs Estimates
▶ Population model:

y = β0 + β1 x + u

▶ Data: (xi , yi )
▶ Estimated equation (sample):

ŷ = β̂0 + β̂1 x

▶ Residual:
ûi = yi − ŷi

Population              Sample (Estimate)
Parameter: β0 , β1      Coefficient: β̂0 , β̂1
Error: u                Residual: û
What is Regression?

▶ Regression is a statistical method for modeling relationships between variables.
▶ In simple linear regression, we ask: how much does y change when x changes?
▶ We fit a line that best represents the relationship between x and y.
Basic Assumptions of the Linear Model

▶ Linearity in parameters
▶ E[u|x] = 0: zero conditional mean
▶ Random sampling
Sample Dataset (Experience vs. Income)

Obs   Experience (x, years)   Income (y, $)
 1              1                  19
 2              2                  22
 3              3                  21
 4              4                  26
 5              5                  24
Scatter Plot

[Figure: scatter plot of Experience (years) vs Income ($)]
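A short sketch to reproduce this plot, using the dataset from the table above (assumes Python with NumPy and matplotlib):

import numpy as np
import matplotlib.pyplot as plt

x = np.array([1, 2, 3, 4, 5])        # experience (years)
y = np.array([19, 22, 21, 26, 24])   # income ($)

plt.scatter(x, y)
plt.xlabel("Experience (years)")
plt.ylabel("Income ($)")
plt.title("Scatter Plot of Experience vs Income")
plt.show()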
Finding Best Fit

[Figure: the same scatter plot of Experience (years) vs Income ($)]
Multiple Possible Lines

[Figure: scatter plot with two candidate lines, y = 18 + 1.2x and y = 17 + 1.6x]
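We can score the two candidate lines by their sum of squared residuals (SSR); a small sketch, reusing the dataset above:

import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([19, 22, 21, 26, 24])

def ssr(b0, b1):
    # sum of squared vertical distances from the points to the line b0 + b1*x
    return np.sum((y - (b0 + b1 * x)) ** 2)

print(ssr(18, 1.2))  # SSR ≈ 13.2 for y = 18 + 1.2x
print(ssr(17, 1.6))  # SSR ≈ 11.8 for y = 17 + 1.6x

Of these two, y = 17 + 1.6x fits better (SSR 11.8 vs 13.2), but neither is guaranteed to be the best possible line; least squares finds that line directly.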
What is the “Line of Best Fit”?

▶ Visualize data as points in a scatterplot


▶ A line attempts to capture the general trend in the data
▶ But which line is “best”?
▶ The line that minimizes the sum of squared vertical distances
(errors) between actual and predicted values
Residuals: The Vertical Gaps
▶ Residual: the prediction error for each observation
▶ ûi = yi − ŷi
▶ These are the vertical distances from the data points to the
fitted line
The Least Squares Idea

▶ We estimate ŷi = β̂0 + β̂1 xi


▶ Goal: minimize the sum of squared residuals:
∑ᵢ₌₁ⁿ (yi − (β̂0 + β̂1 xi))²

▶ Why square?
▶ Avoid cancelling out positive/negative errors
▶ Penalize large errors more
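As a numerical illustration of the idea (a brute-force sketch, before deriving the closed form), we can search a grid of candidate intercepts and slopes for the pair with the smallest SSR:

import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([19, 22, 21, 26, 24])

best = (np.inf, None, None)
for b0 in np.arange(15.0, 21.0, 0.1):       # candidate intercepts
    for b1 in np.arange(0.5, 2.5, 0.1):     # candidate slopes
        ssr = np.sum((y - (b0 + b1 * x)) ** 2)
        if ssr < best[0]:
            best = (ssr, b0, b1)

print(best)  # the minimum is near b0 ≈ 18.2, b1 ≈ 1.4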
Key Assumption: E[u|x] = 0

▶ Model: y = β0 + β1 x + u
▶ If E[u|x] = 0, then:

E[y |x] = β0 + β1 x

▶ This is the population regression function (PRF)
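The step from the assumption to the PRF is one line: taking expectations of the model conditional on x,

E[y|x] = E[β0 + β1 x + u | x] = β0 + β1 x + E[u|x] = β0 + β1 x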


From Model to Data: Sample Version

▶ We observe (xi , yi ), and assume:

yi = β0 + β1 xi + ui
OLS Estimators

β̂1 = ∑(xi − x̄)(yi − ȳ) / ∑(xi − x̄)²

β̂0 = ȳ − β̂1 x̄

▶ These are your slope and intercept estimates from data


▶ Easy to compute and interpret!
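A direct implementation of these formulas (a sketch in Python with NumPy):

import numpy as np

def ols(x, y):
    # closed-form OLS estimates for y-hat = b0 + b1*x
    x_bar, y_bar = x.mean(), y.mean()
    b1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    b0_hat = y_bar - b1_hat * x_bar
    return b0_hat, b1_hat

x = np.array([1, 2, 3, 4, 5])
y = np.array([19, 22, 21, 26, 24])
print(ols(x, y))  # ≈ (18.2, 1.4) for the example dataset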
Worked Example (Using Our Dataset)

▶ Compute x̄, ȳ , then apply formulas


▶ Calculate fitted values ŷi and residuals ûi = yi − ŷi (worked out below)
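Carrying this out for the five observations:

x̄ = (1 + 2 + 3 + 4 + 5)/5 = 3,  ȳ = (19 + 22 + 21 + 26 + 24)/5 = 22.4
∑(xi − x̄)(yi − ȳ) = 14,  ∑(xi − x̄)² = 10
β̂1 = 14/10 = 1.4,  β̂0 = 22.4 − 1.4 × 3 = 18.2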
Visualizing the Fitted Line

▶ This is the line: ŷi = 18.2 + 1.4 xi
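A sketch of this plot, overlaying the fitted line on the data (matplotlib assumed; x, y, and the estimates are from the example above):

import numpy as np
import matplotlib.pyplot as plt

x = np.array([1, 2, 3, 4, 5])
y = np.array([19, 22, 21, 26, 24])
y_hat = 18.2 + 1.4 * x                  # fitted values from the OLS estimates

plt.scatter(x, y, label="data")
plt.plot(x, y_hat, label="fitted: ŷ = 18.2 + 1.4x")
plt.xlabel("Experience (years)")
plt.ylabel("Income ($)")
plt.legend()
plt.show()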


Terminology Recap

▶ Model: population equation y = β0 + β1 x + u


▶ Estimates: fitted equation ŷi = β̂0 + β̂1 xi
▶ Residuals: ûi = yi − ŷi
▶ Line of best fit: minimizes squared residuals
Interpretation of Coefficients

▶ Slope β̂1 : expected change in y for a one-unit increase in x


▶ Intercept β̂0 : value of y when x = 0 (may not always be meaningful)
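For the fitted line ŷ = 18.2 + 1.4x from our dataset: β̂1 = 1.4 says that one additional year of experience is associated with an estimated 1.4-unit increase in income, and β̂0 = 18.2 is the predicted income at zero experience (here x = 0 sits just outside the observed range of 1 to 5 years, so the intercept should be read with care).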
Wrap-Up and What’s Next

▶ Today’s focus: simple regression, derivation, interpretation


▶ Key concepts: model vs. estimates, residuals, least squares
▶ Next time: multiple regression and assumption testing
