Chapter 2 SLRM

Regression analysis is used to describe the relationship between a dependent variable and one or more independent variables. It aims to estimate or predict the average value of the dependent variable based on the independent variables. Simple linear regression involves one dependent and one independent variable, where the relationship is modeled with a straight line. The regression line is estimated by minimizing the sum of squared residuals, with the goal of making the sample regression function as close as possible to the true population regression function.
Chapter 2

A brief overview of the classical linear regression model

Regression

• Regression is probably the single most important tool at the econometrician's disposal.

But what is regression analysis?

• It is concerned with describing and evaluating the relationship between a given variable (usually called the dependent variable) and one or more other variables (usually known as the independent variable(s)).
• Regression analysis is concerned with the study of the dependence of one variable, the dependent variable, on one or more other variables, the explanatory variables, with a view to estimating and/or predicting the (population) mean or average value of the dependent variable in terms of the known or fixed (in repeated sampling) values of the latter.
Regression vs. correlation

• In correlation analysis, the primary objective is to measure the strength or degree of linear association between two variables.
• The correlation coefficient measures this strength of (linear) association.
  – Smoking vs. lung cancer
  – Advertising expenditure vs. sales volume, etc.
• In correlation analysis there is no distinction between the dependent and explanatory variables.
• Regression analysis, however, tries to estimate or predict the average value of one variable (the dependent variable) on the basis of fixed values of the other variable(s).
Simple Linear Regression (SLR)

• SLR is sometimes called two-variable (bivariate) regression: one variable is independent and the other is dependent.
• It assumes that there is one affecting (independent) variable and one affected (dependent) variable.
• It is the technique used to develop the equation of the straight line and to predict values from it.
• Linear: the relationship between the two variables is a straight line (with a positive or negative slope).
 Simple linear equation: Y = β0 + β1X
 Simple linear regression model: Y = β0 + β1X + u
NB: the two variables are Y (dependent) and X (independent).
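As a concrete illustration (my own addition, not from the slides), the minimal sketch below simulates data from the SLR model with assumed, made-up values β0 = 2 and β1 = 0.5 and a random disturbance u, using numpy:

```python
import numpy as np

rng = np.random.default_rng(0)

beta0, beta1 = 2.0, 0.5            # assumed (made-up) true intercept and slope
x = rng.uniform(0, 10, size=50)    # the single independent variable X
u = rng.normal(0, 1, size=50)      # disturbance term u with zero mean

y = beta0 + beta1 * x + u          # the SLR model: Y = beta0 + beta1*X + u
print(y[:5])
```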
SLR…

• This is the situation where y depends on only one x variable.
• Examples of the kind of relationship that may be of interest include:
  – How asset returns vary with their level of market risk
  – Measuring the long-term relationship between stock prices and dividends
  – Constructing an optimal hedge ratio
The Meaning of Linearity

1. Linear both in the parameters and in the variables:
   Y = β0 + β1X
2. Linear in the parameters but non-linear in the variables:
   Y = β0 + β1X²
3. Non-linear in the parameters:
   Y = β0 + β1²X or Y = β0 + β1²X²
NB: Linear regression means a regression that is linear in the parameters; equations 1 and 2 above are linear regressions.
A model is linear in a parameter if that parameter is not divided, squared, multiplied by another parameter, etc.
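A small sketch (my own illustration, with simulated data and assumed values b0 = 1, b1 = 3) showing that case 2 above, Y = β0 + β1X², is still a linear regression: it can be fitted by ordinary least squares on the transformed regressor X², here using numpy.polyfit for convenience:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 5, size=100)
y = 1.0 + 3.0 * x**2 + rng.normal(0, 1, size=100)   # assumed b0 = 1, b1 = 3

# Y = b0 + b1*X^2 is linear in the parameters, so OLS applies
# once we regress Y on the transformed variable X^2.
slope, intercept = np.polyfit(x**2, y, deg=1)        # returns [slope, intercept]
print(intercept, slope)                              # should be close to 1 and 3
```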
Some Notation

• If Y is the dependent variable affected by the independent variable X, the following alternative names for the y and x variables can be used:

y                      x
dependent variable     independent variable
regressand             regressor
effect variable        causal variable
explained variable     explanatory variable
endogenous             exogenous
response               stimulus/control
predictand             predictor
SLR Model explained
Y = β0 + β1X + u
β0 + β1X = systematic (explained) part of Y
u = unsystematic (unexplained) factors that affect Y, called the error term, disturbance term or residual
 It accounts for unobserved influences on Y
 It accounts for all factors other than X that affect Y
Why not introduce these variables into the model explicitly? We usually leave some variables out because of:
• Vagueness of theory
• Unavailability of data, or because they are not core variables
Why do we include a disturbance term?
• The disturbance term can capture a number of features:
  - We always leave out some determinants of yt
  - There may be errors in the measurement of yt that cannot be modelled
  - Random outside influences on yt which we cannot model

In regression analysis our interest is in estimating the values of the unknowns β0 and β1 on the basis of observations on y and x.
- The value of Y is then estimated from the estimated values of β0 and β1.
β1 is the SLOPE PARAMETER. It captures the ceteris paribus effect of X on Y:
Δy = β1ΔX, if Δu = 0
For example, if β1 = 3, a 2-unit increase in x would cause a 6-unit change in y (2 × 3 = 6).
If x and y are positively (negatively) correlated, β1 will be positive (negative).
β0 is the INTERCEPT PARAMETER or CONSTANT TERM.
Determining the Regression Coefficients

• So how do we determine what β0 and β1 are?
• Choose β̂0 and β̂1 so that the (vertical) distances from the data points to the fitted line are minimised (so that the line fits the data as closely as possible).
• [Figure: scatter plot of y against x with a fitted straight line, illustrating how β̂0 and β̂1 are chosen, in addition to the OLS computations shown in the following example.]
SLR Model…

Example
• Assume that a sample of 10 salespersons is taken from the sales department. Sales calls and unit sales are observed for the sample.
• Assume that the relationship between unit sales (Y) and sales calls (X) is linear: Y = β0 + β1X.

Sample    X     Y
1         14    28
2         35    66
3         22    38
4         29    70
5          6    22
6         15    27
7         17    28
8         20    47
9         12    14
10        29    68
n = 10    ΣX = 199    ΣY = 408

In this example:
n (sample size) = 10
Mean of X, X̄ = 199/10 = 19.9
Mean of Y, Ȳ = 408/10 = 40.8
Example…

(X−X̄)    (X−X̄)²    (Y−Ȳ)    (X−X̄)(Y−Ȳ)
-5.9      34.81     -12.8      75.52
15.1     228.01      25.2     380.52
 2.1       4.41      -2.8      -5.88
 9.1      82.81      29.2     265.72
-13.9    193.21     -18.8     261.32
-4.9      24.01     -13.8      67.62
-2.9       8.41     -12.8      37.12
 0.1       0.01       6.2       0.62
-7.9      62.41     -26.8     211.72
 9.1      82.81      27.2     247.52
Sum      720.90               1,541.80
Example…

β̂1 = 1,541.80 / 720.90 ≈ 2.139 (≈ 2.14)
β̂0 = 40.8 − (2.139 × 19.9) ≈ −1.76
Therefore, Ŷ = −1.76 + 2.14X (approximately)
Estimated Y is represented by Ŷ (Y hat) and actual Y is represented by Y; thus Ŷ = −1.76 + 2.14X.
For an X value of 14, Ŷ is about 28.18 (−1.76 + 2.139 × 14), slightly different from the actual Y of 28. This difference is due to the error term, indicating that there are other factors, represented by u (the residual), that are not explained by the model.
The following table shows the estimated Y for all observations of X.
Example…

Sample    X     Y     XY      Ŷ
1         14    28     392    28.1817
2         35    66    2310    73.0944
3         22    38     836    45.2913
4         29    70    2030    60.2622
5          6    22     132    11.0721
6         15    27     405    30.3204
7         17    28     476    34.5978
8         20    47     940    41.0139
9         12    14     168    23.9043
10        29    68    1972    60.2622

As you can see in the table above, the estimated Y (Ŷ) for every given value of X is slightly different from the actual Y. These differences are represented by u (the residual).
The residual means that there are remaining factors other than X that are not explained (not modelled) in the equation.
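The calculations above can be reproduced with a short script. The following is a minimal sketch (my own addition) that computes β̂1, β̂0 and the fitted values Ŷ from the sales-call data using the least-squares formulas:

```python
import numpy as np

# Sales calls (X) and unit sales (Y) for the 10 salespersons in the example
x = np.array([14, 35, 22, 29, 6, 15, 17, 20, 12, 29], dtype=float)
y = np.array([28, 66, 38, 70, 22, 27, 28, 47, 14, 68], dtype=float)

x_bar, y_bar = x.mean(), y.mean()        # 19.9 and 40.8

# Least-squares estimators of the slope and intercept
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)   # ~ 2.139
b0 = y_bar - b1 * x_bar                                             # ~ -1.76

y_hat = b0 + b1 * x          # fitted values, e.g. ~ 28.18 for X = 14
u_hat = y - y_hat            # residuals

print(round(b0, 3), round(b1, 3))
print(np.round(y_hat, 4))
```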
Population regression function vs. sample regression function

The population regression function (PRF), or data generating process (DGP), is an idealized concept; in practice one rarely has access to the entire population of interest.
The PRF is a description of the model that is thought to be generating the actual data and of the true relationship between the variables (i.e. the true values of β0 and β1).
The primary objective in regression analysis is to estimate the PRF on the basis of the sample regression function (SRF) as accurately as possible.
How should the SRF be constructed so that β̂0 is as close as possible to the true β0 and β̂1 is as close as possible to the true β1, even though we will never know the true β0 and β1?
PRF vs SRF

PRF: Yi = β0 + β1Xi + ui
SRF: Yi = β̂0 + β̂1Xi + ûi, with fitted values Ŷi = β̂0 + β̂1Xi

Errors arise when population parameters are estimated on the basis of parameters measured from sample data.
The objective is to determine the SRF in such a manner that it is as close as possible to the actual Y.
This is done by making the sum of squared residuals of the SRF as small as possible; hence the method is called ordinary least squares (OLS). The sum of squared residuals is a function of the estimators in the SRF.
Ordinary Least Squares (OLS)
• The most common method used to fit a straight line to the data is known as OLS (ordinary least squares).
 It is one of the most powerful and popular methods of regression analysis.
 We use the OLS method to determine the estimated values of β0 and β1 and the residuals ût.
 Least squares stands for the minimum sum of squared errors (SSE).
• What we actually do is take each distance and square it (i.e. take the area of each of the squares in the diagram) and minimise the total sum of the squares (hence least squares).
• Tightening up the notation, let
  yt denote the actual data point t,
  ŷt denote the fitted value from the regression line, and
  ût denote the residual, yt − ŷt.
Actual and Fitted Value

[Figure: scatter of actual values yi around the fitted regression line, with the residual ûi shown as the vertical distance between an actual point yi and the corresponding fitted value ŷi.]

 The values on the straight line are the estimated values of Y for given values of X.
 A point on the line is the estimated Y, while a point above (or below) the line is the actual Y.
 The difference between the two points (the error) is the residual.
 Thus the OLS method is a method used to minimise the sum of the squared values of these errors.
Estimator or Estimate?

• Estimators are the formulae used to calculate the coefficients.
• Estimates are the actual numerical values of the coefficients.
• We use the SRF to make inferences about the PRF.
Numerical properties of the estimators

• The OLS estimators are expressed solely in terms of the sample data.
• They are point estimators; that is, given the sample, each estimator provides only a single (point) value of the relevant population parameter.
• Properties of the regression line include:
  • The mean value of the estimated Y equals the mean value of the actual Y.
  • The mean value of the residuals is zero.
  • The residuals ûi are uncorrelated with the predicted Ŷi: cov(ûi, Ŷi) = 0.
  • The residuals ûi are uncorrelated with Xi: cov(ûi, Xi) = 0.
  • The line passes through the sample means of Y and X.
  • The OLS line gives equal weight to residuals that are equal in magnitude. Consider the two residuals −4 and 4: in both observations the estimated y-value is the same distance (4 units) from the observed y-value; it just happens that y was overestimated in the first case and underestimated in the second.
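These properties can be checked numerically for the sales-call example. A brief sketch (my own addition, recomputing the OLS fit from the data above):

```python
import numpy as np

x = np.array([14, 35, 22, 29, 6, 15, 17, 20, 12, 29], dtype=float)
y = np.array([28, 66, 38, 70, 22, 27, 28, 47, 14, 68], dtype=float)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
u_hat = y - y_hat

print(np.isclose(y_hat.mean(), y.mean()))        # mean of fitted Y equals mean of actual Y
print(np.isclose(u_hat.mean(), 0.0))             # residuals have zero mean
print(np.isclose(np.sum(u_hat * y_hat), 0.0))    # residuals uncorrelated with fitted Y
print(np.isclose(np.sum(u_hat * x), 0.0))        # residuals uncorrelated with X
print(np.isclose(b0 + b1 * x.mean(), y.mean()))  # line passes through (X-bar, Y-bar)
```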
How OLS Works

• So we minimise û1² + û2² + û3² + û4² + û5², or, more compactly, minimise ∑ ût² (summing over the observations, t = 1, …, 5 in this illustration).
• This is known as the residual sum of squares.
• But what was ût? It was the difference between the actual point and the line, yt − ŷt.
• So minimising ∑(yt − ŷt)² is equivalent to minimising ∑ût² with respect to β̂0 and β̂1.
• In order to use OLS, we need a model which is linear in the parameters (β0 and β1). It does not necessarily have to be linear in the variables (y and x).
• Linear in the parameters means that the parameters are not multiplied together, divided, squared or cubed, etc.

Continuing with the preceding example for X (sales calls) and Y (sales in units), see the following computation of ∑(yt − ŷt)².
Example…

Assumptions about the error term:
1. It is a random variable with a mean or expected value of zero.
2. The variance of u (σ²) is the same for all values of X, which means that the variance of Y about the regression line equals σ² and is the same for all values of X.
3. The values of the error are independent: the value of the error for a particular value of X is not related to the value of the error for any other value of X; thus the value of Y for a particular value of X is not related to the value of Y for any other value of X.
4. The error is a normally distributed random variable. Because Y is a linear function of the error, Y is also a normally distributed random variable.

Y       Ŷ            (Y−Ŷ)       (Y−Ŷ)²
28      28.1817      -0.1817      0.033015
66      73.0944      -7.0944     50.33051
38      45.2913      -7.2913     53.16306
70      60.2622       9.7378     94.82475
22      11.0721      10.9279    119.419
27      30.3204      -3.3204     11.02506
28      34.5978      -6.5978     43.53096
47      41.0139       5.9861     35.83339
14      23.9043      -9.9043     98.09516
68      60.2622       7.7378     59.87355
Sum: 408.00           (0.00)    566.13
Residuals and Goodness-of-Fit
• Residuals: residuals in linear regression are the differences between the actual values and the predicted values.
• Goodness of fit (GOF): goodness-of-fit measures for linear regression attempt to capture how well a model fits a given set of data.
• A well-fitting regression model results in predicted values close to the observed data values.
• Models almost never describe exactly the process that generated a dataset.
• Models approximate reality. However, even models that only approximate reality can be used to draw useful inferences or to predict future observations.
GOF…
The TOTAL SUM OF SQUARES (TSS) measures the total variation of the actual Y about its mean, with n − 1 degrees of freedom. This total variation is divided into two parts:
1. The variation explained by the regression (explained by the independent variable), with degrees of freedom equal to the number of independent variables (one here); this is the sum of squares due to regression (SSR).
2. The error, or unexplained variation, with n − 2 degrees of freedom (two parameters are estimated); this is called the sum of squared errors (SSE).
SSR = TSS − SSE
GOF…
Estimating the error variance σ²
Recall that the deviations of the Y values about the estimated regression line are called residuals.
 Sum of squared errors (SSE): the squared error due to the residuals,
  SSE = ∑(yt − ŷt)²
The mean squared error (MSE) provides the estimate of σ². It is the SSE divided by its degrees of freedom, n − 2 (two degrees of freedom are lost because two parameters, β0 and β1, are estimated).
 From the preceding example we can compute the MSE as
  SSE = 566.13, MSE = 566.13 / (10 − 2) = 70.766
  σ̂ = √70.766 = 8.41
σ̂ = 8.41 is referred to as the standard error of the estimate.
The smaller σ̂ is, the closer the estimated Y is to the actual Y.
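A short continuation of the earlier sketch (my own illustration) computing SSE, MSE and the standard error of the estimate for the sales-call data:

```python
import numpy as np

x = np.array([14, 35, 22, 29, 6, 15, 17, 20, 12, 29], dtype=float)
y = np.array([28, 66, 38, 70, 22, 27, 28, 47, 14, 68], dtype=float)
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

sse = np.sum((y - y_hat) ** 2)   # sum of squared errors, ~ 566.1
mse = sse / (n - 2)              # mean squared error, estimate of sigma^2, ~ 70.8
s = np.sqrt(mse)                 # standard error of the estimate, ~ 8.41
print(round(sse, 2), round(mse, 3), round(s, 2))
```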
GOF…
R², or the Coefficient of Determination:
R² measures the proportion of the variation in y that is explained by the regression. Equivalently, it can be obtained as one minus the fraction of the variation that is unexplained. For example, if R² = 0.89, then 89% of the total variation in y can be explained by the linear relationship between X and y, and the remaining 11% is unexplained.

R² ranges from 0 to 1. Zero indicates that the model does not improve the prediction at all, and 1 indicates perfect prediction.

The square of the correlation coefficient r describes the strength of a straight-line relationship; in simple linear regression R² = r².
GOF…
Compute R²
SST = SSR + SSE, thus SSR = SST − SSE

Y       (Y−Ȳ)     (Y−Ȳ)²
28      -12.8     163.84
66       25.2     635.04
38       -2.8       7.84
70       29.2     852.64
22      -18.8     353.44
27      -13.8     190.44
28      -12.8     163.84
47        6.2      38.44
14      -26.8     718.24
68       27.2     739.84
408.00            SST = 3,863.60

SSE was computed previously: SSE = 566.13
SST = 3,863.60
SSR = 3,863.60 − 566.13 = 3,297.47
R² = SSR/SST = 3,297.47 / 3,863.60 ≈ 0.853 (85.3%)
So about 85.3% of the total variation in y can be explained by the linear relationship between X and y, and the remaining 14.7% is unexplained.
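These goodness-of-fit quantities can be computed directly from the data; a minimal sketch (my own addition) for the sales-call example:

```python
import numpy as np

x = np.array([14, 35, 22, 29, 6, 15, 17, 20, 12, 29], dtype=float)
y = np.array([28, 66, 38, 70, 22, 27, 28, 47, 14, 68], dtype=float)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)   # total sum of squares, 3,863.60
sse = np.sum((y - y_hat) ** 2)      # unexplained variation, ~ 566.1
ssr = sst - sse                     # explained variation, ~ 3,297.5
r_squared = ssr / sst               # ~ 0.853
print(round(sst, 2), round(sse, 2), round(ssr, 2), round(r_squared, 3))
```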
Correlation Coefficient

• The correlation coefficient (r) is a descriptive measure of the strength of the linear association between two variables, x and y.
• Values of the correlation coefficient are always between −1 and +1.
• A value of +1 indicates that the two variables x and y are perfectly related in a positive linear sense; that is, all data points lie on a straight line with positive slope.
• A value of −1 indicates that x and y are perfectly related in a negative linear sense, with all data points on a straight line with negative slope.
• Values of the correlation coefficient close to zero indicate that x and y are not linearly related.
• In simple linear regression r = √R² (taking the sign of the slope); thus for this example r = √0.853 ≈ 0.92.
R²…

• Here is the basic idea.
• Think about trying to predict a new value of y. With no other information than our sample of values of y, a reasonable choice is ȳ.
• Now consider how your prediction would change if you had an explanatory variable.
• If we use the regression equation to predict, we would use Ŷ = b0 + b1x. This prediction takes into account the value of the explanatory variable x.
• Let's compare our two choices for predicting y. With the explanatory variable x, we use Ŷ; without this information, we use ȳ. How can we compare these two choices?
• When we use ȳ to predict, our prediction error is y − ȳ.
• If, instead, we use Ŷ, our prediction error is y − ŷ.
• The use of x in our prediction changes our prediction error from y − ȳ to y − ŷ.
• The difference between the two predictions is ŷ − ȳ. Our comparison uses the sums of squares of these differences, ∑(ŷ − ȳ)² and ∑(y − ȳ)².
• The ratio of these two quantities is the square of the correlation: R² = ∑(ŷ − ȳ)² / ∑(y − ȳ)².
The Assumptions Underlying the Classical Linear Regression Model (CLRM)
• In order to achieve a ceteris paribus analysis of the effect of x on y, and for valid interpretation of the regression estimates, we need assumptions about the Xi variable(s) and the error term.
• The model which we have used is known as the classical linear regression model.
• We observe data for xt, but since yt also depends on ut, we must be specific about how the ut are generated.
• We usually make the following set of assumptions about the ut's (the unobservable error terms):

Technical notation          Interpretation
1. E(ut) = 0                The errors have zero mean
2. Var(ut) = σ²             The variance of the errors is constant and finite over all values of xt
3. Cov(ui, uj) = 0          The errors are statistically independent of one another
4. Cov(ut, xt) = 0          There is no relationship between the error and the corresponding x variable
The Assumptions Underlying the CLRM, Again
• An alternative assumption to 4., which is slightly stronger, is that the xt's are non-stochastic or fixed in repeated samples.
• A fifth assumption is required if we want to make inferences about the population parameters (the actual β0 and β1) from the sample parameters (β̂0 and β̂1).
• Additional assumption:
5. ut is normally distributed.

Stochastic means that there is probability (randomness) in the occurrence of events; outcomes can be predicted with statistical approaches but not precisely, e.g. the number of phone calls at a customer centre. In regression the dependent variable is stochastic.
Non-stochastic: the explanatory variables are non-stochastic (fixed).
Properties of the OLS Estimator: the Gauss-Markov Theorem

• If assumptions 1. through 4. hold, then the estimators β̂0 and β̂1 determined by OLS are known as Best Linear Unbiased Estimators (BLUE).
What does the acronym stand for?
• "Estimator" – β̂ is an estimator of the true value of β.
• "Linear" – β̂ is a linear estimator (a linear function of the data).
• "Unbiased" – on average, the actual values of β̂0 and β̂1 will be equal to the true values.
• "Best" – the OLS estimator β̂ has minimum variance among the class of linear unbiased estimators; the Gauss-Markov theorem proves that the OLS estimator is best.
Consistency/Unbiasedness/Efficiency

• Consistency
The least squares estimators β̂0 and β̂1 are consistent: the estimates converge to their true values as the sample size increases to infinity. The assumptions E(xt ut) = 0 and Var(ut) = σ² < ∞ are needed to prove this. Consistency implies that, for any δ > 0,
  lim (T→∞) Pr[ |β̂1 − β1| > δ ] = 0

• Unbiasedness
The least squares estimates of β0 and β1 are unbiased: E(β̂0) = β0 and E(β̂1) = β1. Thus, on average, the estimated values will equal the true values. Proving this also requires the assumption that E(ut) = 0. Unbiasedness is a stronger condition than consistency.

• Efficiency
An estimator β̂ of a parameter β is said to be efficient if it is unbiased and no other unbiased estimator has a smaller variance. If the estimator is efficient, we are minimising the probability that it is a long way off from the true value of β.
Hypothesis Testing for Significance
In the linear equation, if β1 is 0, the estimated value of Y equals β0. This means that the value of Y does not depend on the value of X, and hence we would conclude that X and Y are not linearly related. Conversely, if the value of β1 is not 0, we conclude that the two variables are related.
Thus, to test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of β1 is zero.
Hypothesis testing is used to confirm whether the estimated regression coefficients bear any statistical significance. Either the confidence-interval approach or the t-test approach can be used.
t-test
• The standard error of the estimate determined previously is used to test for a significant relationship between X and Y.
• The purpose of the t-test is to see whether we can conclude that β1 ≠ 0. We use sample data to test the following hypotheses about β1:
  H0: β1 = 0
  H1: β1 ≠ 0
If H0 is rejected, we conclude that β1 ≠ 0 and say that there is a significant relationship between X and Y.
t-test…

• Consider what would happen if we used a different random sample for the same regression study.
• A regression analysis of this new sample might result in an estimated regression equation similar to our previous estimated regression equation, Ŷ = −1.76 + 2.14X.
• However, it is doubtful that we would obtain exactly the same equation (with exactly the same intercept and slope).
• Indeed, b0 and b1, the least squares estimators, are sample statistics with their own sampling distributions.
• The properties of the sampling distribution of b1 follow.
Expected value: E(b1) = β1
Standard deviation: σ_b1 = σ / √∑(X − X̄)²
Note that the expected value of b1 is equal to β1, so b1 is an unbiased estimator of β1.
Because we do not know the value of σ, we develop an estimate of σ_b1, denoted s_b1, by estimating σ with s. Thus we obtain the following estimate:
s_b1 = s / √∑(X − X̄)²
t-test…

X      (X−X̄)    (X−X̄)²
14     -5.9      34.81
35     15.1     228.01
22      2.1       4.41
29      9.1      82.81
 6    -13.9     193.21
15     -4.9      24.01
17     -2.9       8.41
20      0.1       0.01
12     -7.9      62.41
29      9.1      82.81
199.00           720.90

s (= σ̂) = 8.41 and ∑(X − X̄)² = 720.9, so
s_b1 = 8.41 / √720.9 ≈ 0.313

The t-test for a significant relationship is based on the fact that the test statistic
  t = (b1 − β1) / s_b1
follows a t distribution with n − 2 degrees of freedom.
If the null hypothesis is true, then β1 = 0 and the test statistic becomes t = b1 / s_b1.
Thus t = 2.14 / 0.313 ≈ 6.8.
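The standard error of b1 and the t statistic can be computed as below (a continuation of the earlier sketch, my own addition):

```python
import numpy as np

x = np.array([14, 35, 22, 29, 6, 15, 17, 20, 12, 29], dtype=float)
y = np.array([28, 66, 38, 70, 22, 27, 28, 47, 14, 68], dtype=float)
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
s = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))   # standard error of estimate, ~ 8.41

s_b1 = s / np.sqrt(np.sum((x - x.mean()) ** 2))   # standard error of b1, ~ 0.313
t_stat = b1 / s_b1                                # test statistic under H0: beta1 = 0, ~ 6.8
print(round(s_b1, 3), round(t_stat, 2))
```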
Let us conduct this test of significance.
• The p-value, or probability value, tells you how likely it is that your data could have occurred under the null hypothesis. It does this by calculating the likelihood of your test statistic, which is the number produced by a statistical test (here, t) applied to your data.
• The significance level (alpha, α) refers to a pre-chosen probability, while the term "p-value" indicates a probability that you calculate after a given study.
• If your p-value is less than the chosen significance level, then you reject the null hypothesis.
• The most common significance level is 0.05 (95% confidence).
• A small p-value thus does not provide much support for the null hypothesis; its importance depends on the level of significance chosen for the test.
Decision rule: Reject H0 if p-value ≤ α.
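As an illustration (my own addition, assuming scipy is available), the two-sided p-value for the t statistic above can be computed and compared with α = 0.05:

```python
from scipy import stats

t_stat = 6.8      # t statistic from the sales-call example
df = 10 - 2       # n - 2 degrees of freedom

# Two-sided p-value: probability of a |t| at least this large if H0 (beta1 = 0) were true
p_value = 2 * stats.t.sf(abs(t_stat), df)
print(p_value)             # on the order of 1e-4, far below alpha = 0.05
print(p_value <= 0.05)     # True: reject H0; X and Y are significantly related
```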
