LINEAR REGRESSION (NUMERICAL EXAMPLE)
Lecture # 02a
Dr. Imran Khalil
Contents
• Linear Regression with One Variable
• Cost Function
• Squared Error Cost Function
Linear Regression with One Variable
❑ How to represent \( f \)?
\[ f_{w,b}(x) = wx + b \]
We can write this as
\[ f(x) = wx + b \]
❑ \( w \) = slope of the function
❑ \( b \) = y-intercept
❑ Univariate linear regression
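As a small illustration of the model above, the sketch below evaluates \( f_{w,b}(x) = wx + b \) in Python. The function name `predict` and the sample values of `w` and `b` are assumptions chosen for illustration; they are not values from the slides.

```python
def predict(x, w, b):
    """Univariate linear model: f_{w,b}(x) = w * x + b."""
    return w * x + b


# Hypothetical parameters: slope w = 2, y-intercept b = 1.
print(predict(3, w=2.0, b=1.0))  # 2 * 3 + 1 = 7.0
print(predict(0, w=2.0, b=1.0))  # at x = 0 the model returns the intercept
```

Changing `w` tilts the line and changing `b` shifts it up or down, which is all a univariate model can do.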
Advertising Expenditures and Sales Revenues of the Firm in Each of 10 Years

Year    Adv. (x_i)   Sale (y_i)
1       10           44
2       9            40
3       11           42
4       12           46
5       11           48
6       12           52
7       13           54
8       13           58
9       14           56
10      15           60

[Figure: Scatter Diagram]
Regression Analysis
Regression line (line of best fit): draw, by visual inspection, the positively sloped straight line that "best" fits the data points, so that the points are about equally distant on either side of the line.
Cost Function Intuition
❑ Model
\[ f_{w,b}(x) = wx + b \]
❑ Parameters
\[ w, b \]
❑ Cost Function
\[ J(w,b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{w,b}(x_i) - y_i \right)^2 \]
❑ Objective
\[ \underset{w,b}{\text{minimize}} \; J(w,b) \]
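To make the cost function concrete, here is a minimal Python sketch of \( J(w,b) \) with the \( 1/(2m) \) factor. The function name `squared_error_cost` and the three-point toy data set are assumptions made up for illustration.

```python
def squared_error_cost(xs, ys, w, b):
    """J(w, b) = (1 / (2m)) * sum of (w * x_i + b - y_i)^2 over all m points."""
    m = len(xs)
    total = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Toy data lying exactly on y = 2x + 1 (hypothetical values).
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]

print(squared_error_cost(xs, ys, w=2.0, b=1.0))  # perfect fit, cost 0.0
print(squared_error_cost(xs, ys, w=2.0, b=0.0))  # every residual is -1, cost 0.5
```

The objective is to choose the pair `(w, b)` that makes this value as small as possible.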
Example

Year    Adv. (x_i)   Sale (y_i)
1       10           44
2       9            40
3       11           42
4       12           46
5       11           48
6       12           52
7       13           54
8       13           58
9       14           56
10      15           60
Total   120          500

• \( w = \dfrac{\sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{m} (x_i - \bar{x})^2} \)
• \( b = \bar{y} - w\bar{x} \)
• \( m = 10 \)
• \( \bar{x} = \dfrac{\sum_{i=1}^{m} x_i}{m} = \dfrac{120}{10} = 12 \)
• \( \bar{y} = \dfrac{\sum_{i=1}^{m} y_i}{m} = \dfrac{500}{10} = 50 \)
Example

With \( \bar{x} = 12 \) and \( \bar{y} = 50 \):

Year    x_i   y_i   x_i − x̄   y_i − ȳ   (x_i − x̄)(y_i − ȳ)   (x_i − x̄)²
1       10    44    -2        -6        12                   4
2       9     40    -3        -10       30                   9
3       11    42    -1        -8        8                    1
4       12    46    0         -4        0                    0
5       11    48    -1        -2        2                    1
6       12    52    0         2         0                    0
7       13    54    1         4         4                    1
8       13    58    1         8         8                    1
9       14    56    2         6         12                   4
10      15    60    3         10        30                   9
Total   120   500                       106                  30
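The column totals in the deviation table can be checked with a few lines of Python. This is only a sketch: the variable names (`X`, `Y`, `s_xy`, `s_xx`) are illustrative assumptions, while the data are the ten observations from the slides.

```python
X = [10, 9, 11, 12, 11, 12, 13, 13, 14, 15]   # advertising expenditures
Y = [44, 40, 42, 46, 48, 52, 54, 58, 56, 60]   # sales revenues

m = len(X)
x_bar = sum(X) / m
y_bar = sum(Y) / m

# Sum of cross-deviations and sum of squared x-deviations.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
s_xx = sum((x - x_bar) ** 2 for x in X)

print(x_bar, y_bar, s_xy, s_xx)  # 12.0 50.0 106.0 30.0
```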
Example

Using the column totals 106 and 30 from the deviation table:

• \( w = \dfrac{\sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{m} (x_i - \bar{x})^2} \)
• \( w = \dfrac{106}{30} \)
• \( w = 3.533 \)
• \( b = \bar{y} - w\bar{x} \)
• \( b = 50 - 3.533(12) \)
• \( b = 7.60 \)
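The closed-form calculation of \( w \) and \( b \) can be sketched in Python as follows. The helper name `fit_ols` is an assumption for illustration; the formulas are the two shown in the example.

```python
def fit_ols(xs, ys):
    """Return (w, b) for the least-squares line y = w * x + b."""
    m = len(xs)
    x_bar = sum(xs) / m
    y_bar = sum(ys) / m
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    w = s_xy / s_xx          # slope
    b = y_bar - w * x_bar    # intercept
    return w, b


X = [10, 9, 11, 12, 11, 12, 13, 13, 14, 15]
Y = [44, 40, 42, 46, 48, 52, 54, 58, 56, 60]

w, b = fit_ols(X, Y)
print(round(w, 3), round(b, 2))  # 3.533 7.6
```

Note that the slides round the slope to 3.53 when computing predictions, so later values are based on 3.53 and 7.60 rather than the unrounded slope.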
Example

• \( \hat{y}_i = w x_i + b \)
• \( \hat{y}_i = 3.53\,x_i + 7.60 \)
• \( f_{w,b}(x) = 3.53\,x + 7.60 \)
Ordinary Least Squares Method (OLS)
\[ \hat{Y}_t = 7.60 + 3.53 X_t \]
• This regression line indicates that with zero advertising expenditures (i.e., with \( X_t = 0 \)), the expected sales revenue of the firm \( \hat{Y}_t \) is $7.60 million:
\[ \hat{Y}_t = 7.60 + 3.53(0) = 7.60 \text{ (\$ million)} \]
• With advertising of $10 million, as in the first observation year (\( X_1 = 10 \)):
\[ \hat{Y}_1 = 7.60 + 3.53(10) = 42.90 \text{ (\$ million)} \]
• On the other hand, with \( X_{10} = 15 \), i.e. $15 million:
\[ \hat{Y}_{10} = 7.60 + 3.53(15) = 60.55 \text{ (\$ million)} \]
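The three predictions above can be reproduced with a short Python sketch. The function name `predict_sales` and its default arguments are illustrative assumptions; the coefficients 7.60 and 3.53 are the rounded values used on the slide.

```python
def predict_sales(x, intercept=7.60, slope=3.53):
    """Fitted line from the slide: Y_hat = 7.60 + 3.53 * X (in $ millions)."""
    return intercept + slope * x


for x in (0, 10, 15):
    print(f"X = {x:>2}: Y_hat = {predict_sales(x):.2f}")
```

The printed values should agree with the $7.60, $42.90 and $60.55 million figures in the bullets.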
Ordinary Least Squares Method (OLS)
Plotting these last two points, (10, 42.90) and (15, 60.55), and joining them by a straight line, we obtain the regression line.
Year    x_i   y_i   x_i − x̄   y_i − ȳ   (x_i − x̄)(y_i − ȳ)   (x_i − x̄)²   ŷ_i = 3.53x_i + 7.60
1       10    44    -2        -6        12                   4            42.90
2       9     40    -3        -10       30                   9            39.37
3       11    42    -1        -8        8                    1            46.43
4       12    46    0         -4        0                    0            49.96
5       11    48    -1        -2        2                    1            46.43
6       12    52    0         2         0                    0            49.96
7       13    54    1         4         4                    1            53.49
8       13    58    1         8         8                    1            53.49
9       14    56    2         6         12                   4            57.02
10      15    60    3         10        30                   9            60.55
Total   120   500                       106                  30
Mean Square Error Cost Function
\[ J(w,b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{w,b}(x_i) - y_i \right)^2 = \frac{1}{2m} \sum_{i=1}^{m} \left( (w x_i + b) - y_i \right)^2 \]
\[ J(w,b) = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}_i - y_i \right)^2 \]
Year    x_i   y_i   x_i − x̄   y_i − ȳ   (x_i − x̄)(y_i − ȳ)   (x_i − x̄)²   ŷ_i = 3.53x_i + 7.60   (ŷ_i − y_i)²
1       10    44    -2        -6        12                   4            42.90                  1.2100
2       9     40    -3        -10       30                   9            39.37                  0.3969
3       11    42    -1        -8        8                    1            46.43                  19.6249
4       12    46    0         -4        0                    0            49.96                  15.6816
5       11    48    -1        -2        2                    1            46.43                  2.4649
6       12    52    0         2         0                    0            49.96                  4.1616
7       13    54    1         4         4                    1            53.49                  0.2601
8       13    58    1         8         8                    1            53.49                  20.3401
9       14    56    2         6         12                   4            57.02                  1.0404
10      15    60    3         10        30                   9            60.55                  0.3025
Total   120   500                       106                  30                                  65.4830
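The squared-error column and its total can be recomputed with the sketch below, which then divides the total by \( 2m \) to obtain the cost. Variable names are illustrative assumptions, and the \( 1/(2m) \) scaling is assumed to follow the cost function defined earlier in the lecture.

```python
X = [10, 9, 11, 12, 11, 12, 13, 13, 14, 15]
Y = [44, 40, 42, 46, 48, 52, 54, 58, 56, 60]
w, b = 3.53, 7.60  # rounded coefficients used in the table

# Squared error (y_hat_i - y_i)^2 for each year.
sq_errors = [(w * x + b - y) ** 2 for x, y in zip(X, Y)]
sse = sum(sq_errors)
cost = sse / (2 * len(X))

print(round(sse, 4))   # total of the last column
print(round(cost, 2))  # J(w, b) with the 1/(2m) factor
```

Years 3, 4 and 8 contribute most of the total, which matches the largest entries in the last column of the table.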
Mean Square Error Cost Function
\[ J(w,b) = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}_i - y_i \right)^2 \]
\[ J(w,b) = \frac{1}{2 \times 10} \times 65.4830 \]
\[ J(w,b) = 3.27 \]
Coefficient of determination (\( R^2 \))
\[ R^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} = \frac{\sum (\hat{Y}_t - \bar{Y})^2}{\sum (Y_t - \bar{Y})^2} \]
\( R^2 \) is the proportion of the total variation in the firm's sales that is accounted for by the variation in the firm's advertising expenditures.
Year    Adv. (X_t)   Sale (Y_t)   Y_t − Ȳ   (Y_t − Ȳ)²   Ŷ_t     (Ŷ_t − Ȳ)²
1       10           44           -6        36           42.90   50.41
2       9            40           -10       100          39.37   112.99
3       11           42           -8        64           46.43   12.744
4       12           46           -4        16           49.96   0.0016
5       11           48           -2        4            46.43   12.744
6       12           52           2         4            49.96   0.0016
7       13           54           4         16           53.49   12.180
8       13           58           8         64           53.49   12.180
9       14           56           6         36           57.02   49.280
10      15           60           10        100          60.55   111.30
Total   120          500                    440                  373.83
Coefficient of determination (\( R^2 \))
\[ R^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} = \frac{\sum (\hat{Y}_t - \bar{Y})^2}{\sum (Y_t - \bar{Y})^2} = \frac{373.84}{440.00} = 0.85 \]
This means that 85% of the total variation in the firm's sales is accounted for by the variation in the firm's advertising expenditures.
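As a check on the arithmetic, the explained and total variation can be recomputed from the data. This is a sketch under the assumption that predictions use the rounded line \( \hat{Y}_t = 7.60 + 3.53 X_t \); the variable names are illustrative.

```python
X = [10, 9, 11, 12, 11, 12, 13, 13, 14, 15]
Y = [44, 40, 42, 46, 48, 52, 54, 58, 56, 60]
intercept, slope = 7.60, 3.53

y_bar = sum(Y) / len(Y)
y_hat = [intercept + slope * x for x in X]

explained = sum((yh - y_bar) ** 2 for yh in y_hat)  # sum of (Y_hat - Y_bar)^2
total = sum((y - y_bar) ** 2 for y in Y)            # sum of (Y - Y_bar)^2
r_squared = explained / total

print(round(explained, 2), total, round(r_squared, 2))  # 373.84 440.0 0.85
```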
Coefficient of correlation (\( r \))
\[ r = \sqrt{R^2} \]
This is simply a measure of the degree of association or covariation that exists between variables \( X \) and \( Y \). For our advertising-sales example,
\[ r = \sqrt{R^2} = \sqrt{0.85} = 0.92 \]
\( r \) takes the sign of the slope, which is positive here, so a value of 0.92 indicates a strong positive linear association between \( X \) and \( Y \).
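The value of \( r \) can be cross-checked in two ways: from \( R^2 \) as on the slide, and directly from the deviation sums 106, 30 and 440 found in the earlier tables. The direct formula \( r = S_{xy} / \sqrt{S_{xx} S_{yy}} \) is the standard Pearson definition and is an addition here, not something shown on the slides; the variable names are illustrative.

```python
import math

# Route 1: square root of the coefficient of determination (slide value).
r_from_r2 = math.sqrt(0.85)

# Route 2: Pearson formula using sums from the tables:
# S_xy = 106, S_xx = 30, S_yy = 440.
r_direct = 106 / math.sqrt(30 * 440)

print(round(r_from_r2, 2), round(r_direct, 2))  # 0.92 0.92
```

Both routes agree to two decimal places, and the positive sign matches the positive slope of the fitted line.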
Reference
• Dominick Salvatore, Managerial Economics in a Global Economy, 4th Edition
Acknowledgment
• Material presented in these lecture slides is obtained from Prof. Andrew Ng's course on Machine Learning
• Dr. Iftikhar Ahmad's lecture slides were consulted for assistance.