LINEAR REGRESSION
(NUMERICAL EXAMPLE)
Lecture # 02a
Dr. Imran Khalil
Contents
• Linear Regression with One Variable
• Cost Function
• Squared Error Cost Function
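Linear Regression with One Variable
❑ How to represent \( f \)?
\[ f_{w,b}(x) = wx + b \]
We can write this more compactly as
\[ f(x) = wx + b \]
❑ \( w \) = slope of the line
❑ \( b \) = y-intercept
❑ A linear model with a single input variable \( x \) is called univariate linear regression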
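As a minimal sketch of the model above, the function below evaluates \( f_{w,b}(x) = wx + b \) for one input. The name `predict` and the parameter values in the usage line are hypothetical, chosen only for illustration.

```python
def predict(x: float, w: float, b: float) -> float:
    """Return f_{w,b}(x) = w*x + b for a single input x."""
    return w * x + b


# Hypothetical parameters, used only to show the call: slope 3, intercept 1.
print(predict(2.0, w=3.0, b=1.0))  # 3*2 + 1
```

Setting `x = 0` returns `b`, which is why \( b \) is read as the y-intercept.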
Advertising Expenditures and Sales Revenues of the Firm in Each of 10 Years (in $ millions)
[Figure: Scatter Diagram]
Year   Adv. \( x_i \)  Sales \( y_i \)
1      10              44
2      9               40
3      11              42
4      12              46
5      11              48
6      12              52
7      13              54
8      13              58
9      14              56
10     15              60
Regression Analysis
Regression line (line of best fit): by visual inspection, draw the positively sloped straight line that “best” fits the data points, so that the points lie about equally distant on either side of the line.
Cost Function Intuition
❑ Model
\[ f_{w,b}(x) = wx + b \]
❑ Parameters
\[ w, b \]
❑ Cost Function
\[ J_{w,b} = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{w,b}(x_i) - y_i \right)^2 \]
❑ Objective
\[ \min_{w,b} J_{w,b} \]
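The squared error cost can be sketched in a few lines of Python. This is an illustrative implementation under the definitions above, not code from the lecture; the function name `compute_cost` and the toy data are assumptions.

```python
from typing import Sequence


def compute_cost(xs: Sequence[float], ys: Sequence[float], w: float, b: float) -> float:
    """Squared error cost J = (1 / 2m) * sum((w*x_i + b - y_i)^2)."""
    m = len(xs)
    if m == 0 or m != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    total = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Toy data (hypothetical) lying exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
print(compute_cost(xs, ys, w=2.0, b=0.0))  # perfect fit, so the cost is zero
print(compute_cost(xs, ys, w=0.0, b=0.0))  # (4 + 16 + 36) / 6
```

The cost is zero only when every prediction matches its target, so minimizing \( J_{w,b} \) over \( w \) and \( b \) pushes the line toward the data.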
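Example
Year   Adv. \( x_i \)  Sales \( y_i \)
1      10              44
2      9               40
3      11              42
4      12              46
5      11              48
6      12              52
7      13              54
8      13              58
9      14              56
10     15              60
Total  120             500
• \( w = \frac{\sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{m} (x_i - \bar{x})^2} \)
• \( b = \bar{y} - w\bar{x} \)
• \( m = 10 \)
• \( \bar{x} = \frac{\sum_{i=1}^{m} x_i}{m} = \frac{120}{10} = 12 \)
• \( \bar{y} = \frac{\sum_{i=1}^{m} y_i}{m} = \frac{500}{10} = 50 \)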
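The formulas for \( w \) and \( b \) are stated without derivation. One standard way to obtain them, sketched here as a supplement rather than as part of the original slides, is to set both partial derivatives of the cost \( J_{w,b} \) to zero:

```latex
\begin{aligned}
\frac{\partial J_{w,b}}{\partial b}
  &= \frac{1}{m}\sum_{i=1}^{m}\bigl(wx_i + b - y_i\bigr) = 0
  &&\Longrightarrow\quad b = \bar{y} - w\bar{x}, \\
\frac{\partial J_{w,b}}{\partial w}
  &= \frac{1}{m}\sum_{i=1}^{m}\bigl(wx_i + b - y_i\bigr)x_i = 0
  &&\Longrightarrow\quad w\sum_{i=1}^{m}(x_i - \bar{x})\,x_i = \sum_{i=1}^{m}(y_i - \bar{y})\,x_i .
\end{aligned}
```

Because \( \sum_{i=1}^{m}(x_i - \bar{x}) = 0 \) and \( \sum_{i=1}^{m}(y_i - \bar{y}) = 0 \), replacing the trailing \( x_i \) by \( (x_i - \bar{x}) \) on both sides changes nothing, which gives \( w = \frac{\sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{m} (x_i - \bar{x})^2} \).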
Example
Year   Adv. \( x_i \)  Sales \( y_i \)  \( x_i - \bar{x} \)  \( y_i - \bar{y} \)  \( (x_i - \bar{x})(y_i - \bar{y}) \)  \( (x_i - \bar{x})^2 \)
1      10              44               -2                   -6                   12                                    4
2      9               40               -3                   -10                  30                                    9
3      11              42               -1                   -8                   8                                     1
4      12              46               0                    -4                   0                                     0
5      11              48               -1                   -2                   2                                     1
6      12              52               0                    2                    0                                     0
7      13              54               1                    4                    4                                     1
8      13              58               1                    8                    8                                     1
9      14              56               2                    6                    12                                    4
10     15              60               3                    10                   30                                    9
Total  120             500                                                        106                                   30
• \( w = \frac{\sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{m} (x_i - \bar{x})^2} \)
• \( w = \frac{106}{30} \)
• \( w = 3.533 \)
• \( b = \bar{y} - w\bar{x} \)
• \( b = 50 - 3.533(12) \)
• \( b = 7.60 \)
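The column totals and the resulting \( w \) and \( b \) can be checked with a short script. This is a sketch of the closed-form calculation on the ten observations from the table; the names `adv`, `sales`, and `fit_line` are illustrative assumptions.

```python
# Advertising (x_i) and sales (y_i) for the 10 years, copied from the table.
adv = [10, 9, 11, 12, 11, 12, 13, 13, 14, 15]
sales = [44, 40, 42, 46, 48, 52, 54, 58, 56, 60]


def fit_line(xs, ys):
    """Return (w, b) from the closed-form least squares formulas."""
    m = len(xs)
    x_bar = sum(xs) / m
    y_bar = sum(ys) / m
    # Numerator: sum of cross-deviations; denominator: sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    w = s_xy / s_xx
    b = y_bar - w * x_bar
    return w, b


w, b = fit_line(adv, sales)
print(round(w, 3), round(b, 2))
```

The intermediate sums inside `fit_line` correspond to the totals 106 and 30 in the last two columns of the table.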
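Example
• \( w = \frac{\sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{m} (x_i - \bar{x})^2} \)
• \( w = \frac{106}{30} \)
• \( w = 3.533 \)
• \( b = \bar{y} - w\bar{x} \)
• \( b = 50 - 3.533(12) \)
• \( b = 7.60 \)
Substituting \( w \) and \( b \) into the model (with \( w \) rounded to 3.53):
\[ f_{w,b}(x_i) = wx_i + b \]
\[ \hat{y}_i = wx_i + b \]
\[ \hat{y}_i = 3.53x_i + 7.60 \]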
Ordinary Least Squares Method (OLS)
\[ \hat{Y}_i = 7.60 + 3.53X_i \]
• This regression line indicates that with zero advertising expenditures (i.e., with \( X_i = 0 \)), the expected sales revenue of the firm is $7.60 million:
\[ \hat{Y}_i = 7.60 + 3.53(0) = \$7.60 \text{ million} \]
• With advertising of $10 million, as in the first observation year (\( X_1 = \$10 \) million):
\[ \hat{Y}_1 = 7.60 + 3.53(10) = \$42.90 \text{ million} \]
• On the other hand, with \( X_{10} = \$15 \) million:
\[ \hat{Y}_{10} = 7.60 + 3.53(15) = \$60.55 \text{ million} \]
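The three evaluations of the regression line can be reproduced as follows. This sketch assumes the rounded coefficients 3.53 and 7.60 used on the slide; `predict_sales` is a hypothetical helper name.

```python
# Rounded coefficients as they appear in the fitted line (in $ millions).
W = 3.53
B = 7.60


def predict_sales(advertising: float) -> float:
    """Predicted sales for a given advertising expenditure."""
    return B + W * advertising


for advertising in (0, 10, 15):
    print(advertising, round(predict_sales(advertising), 2))
```

Note that \( X = 0 \) lies outside the observed range of 9 to 15, so the intercept is an extrapolation of the fitted line rather than an observed outcome.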
Ordinary Least Squares Method (OLS)
Plotting these last two points, (10, 42.90) and (15, 60.55), and joining them by a straight line, we obtain the regression line.
Year   Adv. \( x_i \)  Sales \( y_i \)  \( x_i - \bar{x} \)  \( y_i - \bar{y} \)  \( (x_i - \bar{x})(y_i - \bar{y}) \)  \( (x_i - \bar{x})^2 \)  \( \hat{y}_i = 3.53x_i + 7.60 \)
1      10              44               -2                   -6                   12                                    4                        42.90
2      9               40               -3                   -10                  30                                    9                        39.37
3      11              42               -1                   -8                   8                                     1                        46.43
4      12              46               0                    -4                   0                                     0                        49.96
5      11              48               -1                   -2                   2                                     1                        46.43
6      12              52               0                    2                    0                                     0                        49.96
7      13              54               1                    4                    4                                     1                        53.49
8      13              58               1                    8                    8                                     1                        53.49
9      14              56               2                    6                    12                                    4                        57.02
10     15              60               3                    10                   30                                    9                        60.55
Total  120             500                                                        106                                   30
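Mean Square Error Cost Function
Minimize
\[ J_{w,b} = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{w,b}(x_i) - y_i \right)^2 \]
\[ J_{w,b} = \frac{1}{2m} \sum_{i=1}^{m} \left( (wx_i + b) - y_i \right)^2 \]
\[ J_{w,b} = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}_i - y_i \right)^2 \]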
Year   Adv. \( x_i \)  Sales \( y_i \)  \( x_i - \bar{x} \)  \( y_i - \bar{y} \)  \( (x_i - \bar{x})(y_i - \bar{y}) \)  \( (x_i - \bar{x})^2 \)  \( \hat{y}_i = 3.53x_i + 7.60 \)  \( (\hat{y}_i - y_i)^2 \)
1      10              44               -2                   -6                   12                                    4                        42.90                             1.2100
2      9               40               -3                   -10                  30                                    9                        39.37                             0.3969
3      11              42               -1                   -8                   8                                     1                        46.43                             19.6249
4      12              46               0                    -4                   0                                     0                        49.96                             15.6816
5      11              48               -1                   -2                   2                                     1                        46.43                             2.4649
6      12              52               0                    2                    0                                     0                        49.96                             4.1616
7      13              54               1                    4                    4                                     1                        53.49                             0.2601
8      13              58               1                    8                    8                                     1                        53.49                             20.3401
9      14              56               2                    6                    12                                    4                        57.02                             1.0404
10     15              60               3                    10                   30                                    9                        60.55                             0.3025
Total  120             500                                                        106                                   30                                                         65.4830
Mean Square Error Cost Function
\[ J_{w,b} = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}_i - y_i \right)^2 \]
\[ J_{w,b} = \frac{1}{2(10)} (65.4830) \]
\[ J_{w,b} = 3.27 \]
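The cost value can be reproduced from the data. The sketch below assumes the rounded line \( \hat{y}_i = 3.53x_i + 7.60 \); the variable names are illustrative.

```python
# Data and rounded coefficients from the worked example.
adv = [10, 9, 11, 12, 11, 12, 13, 13, 14, 15]
sales = [44, 40, 42, 46, 48, 52, 54, 58, 56, 60]
W, B = 3.53, 7.60

# Squared error of each prediction: (y_hat_i - y_i)^2.
sq_errors = [(B + W * x - y) ** 2 for x, y in zip(adv, sales)]
total_sq_error = sum(sq_errors)

# Cost with the 1/(2m) scaling used in the lecture.
m = len(adv)
J = total_sq_error / (2 * m)
print(round(total_sq_error, 4), round(J, 2))
```

The factor \( \frac{1}{2m} \) only rescales the sum of squared errors, so the \( w \) and \( b \) that minimize \( J_{w,b} \) are the same ones that minimize the plain sum.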
Coefficient of determination (\( R^2 \))
\[ R^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2} \]
\( R^2 \) is the proportion of the total variation in the firm’s sales that is accounted for by the variation in the firm’s advertising expenditures.
Year   Adv. \( x_i \)  Sales \( y_i \)  \( y_i - \bar{y} \)  \( (y_i - \bar{y})^2 \)  \( \hat{y}_i \)  \( (\hat{y}_i - \bar{y})^2 \)
1      10              44               -6                   36                       42.90            50.4100
2      9               40               -10                  100                      39.37            112.9969
3      11              42               -8                   64                       46.43            12.7449
4      12              46               -4                   16                       49.96            0.0016
5      11              48               -2                   4                        46.43            12.7449
6      12              52               2                    4                        49.96            0.0016
7      13              54               4                    16                       53.49            12.1801
8      13              58               8                    64                       53.49            12.1801
9      14              56               6                    36                       57.02            49.2804
10     15              60               10                   100                      60.55            111.3025
Total  120             500                                   440                                       373.8430
Coefficient of determination (\( R^2 \))
\[ R^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2} \]
\[ R^2 = \frac{373.84}{440.00} = 0.85 \]
This means that 85% of the total variation in the firm’s sales is accounted for by the variation in the firm’s advertising expenditures.
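The explained and total variation can be recomputed directly. This sketch again assumes the rounded coefficients 3.53 and 7.60; the variable names are illustrative.

```python
# Data and rounded coefficients from the worked example.
adv = [10, 9, 11, 12, 11, 12, 13, 13, 14, 15]
sales = [44, 40, 42, 46, 48, 52, 54, 58, 56, 60]
W, B = 3.53, 7.60

y_bar = sum(sales) / len(sales)

# Explained variation: squared distance of each prediction from the mean of y.
explained = sum((B + W * x - y_bar) ** 2 for x in adv)
# Total variation: squared distance of each observation from the mean of y.
total = sum((y - y_bar) ** 2 for y in sales)

r_squared = explained / total
print(round(explained, 2), total, round(r_squared, 2))
```

Because the coefficients are rounded, the explained variation differs slightly from what the unrounded slope \( w = 106/30 \) would give, but \( R^2 \) still rounds to 0.85 either way.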
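Coefficient of correlation (\( r \))
\[ r = \sqrt{R^2} \]
This is simply a measure of the degree of association or covariation that exists between variables \( X \) and \( Y \). For our advertising-sales example,
\[ r = \sqrt{R^2} = \sqrt{0.85} = 0.92 \]
This indicates a strong positive linear association between \( X \) and \( Y \); \( r \) takes the sign of the slope \( w \), which is positive here.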
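As a cross-check, \( r \) can also be computed directly from the deviations using the standard Pearson formula, which is not shown in the slides and is brought in here as an assumption. For a one-variable least squares fit it agrees in magnitude with \( \sqrt{R^2} \).

```python
import math

# Data from the worked example.
adv = [10, 9, 11, 12, 11, 12, 13, 13, 14, 15]
sales = [44, 40, 42, 46, 48, 52, 54, 58, 56, 60]

m = len(adv)
x_bar = sum(adv) / m
y_bar = sum(sales) / m

# Sums of cross-deviations and squared deviations.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(adv, sales))
s_xx = sum((x - x_bar) ** 2 for x in adv)
s_yy = sum((y - y_bar) ** 2 for y in sales)

# Pearson correlation coefficient; its sign follows the sign of s_xy.
r = s_xy / math.sqrt(s_xx * s_yy)
print(round(r, 2))
```

Unlike \( \sqrt{R^2} \), this form keeps the sign, so a negatively sloped line would yield a negative \( r \).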
Reference
• Dominick Salvatore, Managerial Economics in a Global Economy, 4th Edition
Acknowledgment
• Material presented in these lecture slides is drawn from Prof. Andrew Ng’s course on Machine Learning.
• Dr. Iftikhar Ahmad’s lecture slides were consulted for assistance.