Linear Regression
Regression analysis is a statistical technique that attempts to explore and model the relationship between two
or more variables. For example, an analyst may want to know if there is a relationship between road accidents
and the age of the driver. Regression analysis forms an important part of the statistical analysis of the data
obtained from designed experiments and is discussed briefly in this chapter.
A linear regression model attempts to explain the relationship between two or more variables using a straight
line. Consider the data obtained from a chemical process where the yield of the process is thought to be related
to the reaction temperature (see the table below).
A scatter plot of these data can be obtained as shown in the following figure. In the scatter plot, the yield, Y, is plotted against the different temperature values, x.
It is clear that no line can be found to pass through all points of the plot. Thus no functional relation exists between the two variables x and Y. However, the scatter plot does give an indication that a straight line may exist such that all the points on the plot are scattered randomly around this line. A statistical relation is said to exist in this case. The statistical relation between x and Y may be expressed as follows:

Y = β0 + β1x + ε

The above equation is the linear regression model that can be used to explain the relation between x and Y that is seen on the scatter plot above. In this model, the mean value of Y (abbreviated as E(Y)) is assumed to follow the linear relation:

E(Y) = β0 + β1x

The actual values of Y (which are observed as yield from the chemical process from time to time and are random in nature) are assumed to be the sum of the mean value, E(Y), and a random error term, ε:

Y = E(Y) + ε = β0 + β1x + ε
The regression model here is called a simple linear regression model because there is just one independent variable, x, in the model. In regression models, the independent variables are also referred to as regressors or predictor variables. The dependent variable, Y, is also referred to as the response. The slope, β1, and the intercept, β0, of the line are called regression coefficients. The slope, β1, can be interpreted as the change in the mean value of Y for a unit change in x.
The random error term, ε, is assumed to follow the normal distribution with a mean of 0 and a variance of σ². Since Y is the sum of this random term and the mean value, E(Y), which is a constant, the variance of Y at any given value of x is also σ². Therefore, at any given value of x, say xi, the dependent variable Y follows a normal distribution with a mean of β0 + β1xi and a standard deviation of σ. This is illustrated in the following figure.
In practice the true values of the regression coefficients β0 and β1 are unknown and must be estimated from an observed data set. The estimates, b0 and b1, are calculated using least squares. (For details on least squares estimates, refer to Hahn & Shapiro (1967).) The estimated regression line, obtained using the values of b0 and b1, is called the fitted line. The least squares estimates are:

b1 = Σ [ (xi - x̄)(yi - ȳ) ] / Σ [ (xi - x̄)² ]
b0 = ȳ - b1x̄

where ȳ is the mean of all the observed values of Y and x̄ is the mean of all values of the predictor variable x.
Once b0 and b1 are known, the fitted regression line can be written as:

ŷ = b0 + b1x

where ŷ is the fitted or estimated value based on the fitted regression model. It is an estimate of the mean value, E(Y). The fitted value, ŷi, for a given value of the predictor variable, xi, may be different from the corresponding observed value, yi. The difference between the two values is called the residual, ei:

ei = yi - ŷi

The least squares estimates of the regression coefficients can be obtained for the data in the preceding table using these formulas.
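As a rough sketch of how these estimates are computed, the following Python snippet applies the least squares formulas above; the temperature and yield values are hypothetical, chosen only for illustration.

# Least squares estimates for a simple linear regression (illustrative sketch).
# The temperature (x) and yield (y) values below are hypothetical.
x = [50, 53, 54, 55, 56, 59, 62, 65, 67, 71]            # reaction temperature
y = [122, 118, 128, 121, 125, 136, 144, 142, 149, 161]  # process yield

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# b1 = sum((xi - x_bar)(yi - y_bar)) / sum((xi - x_bar)^2),  b0 = y_bar - b1 * x_bar
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Fitted values and residuals (ei = yi - yhat_i)
y_hat = [b0 + b1 * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

print(b0, b1)
print(sum(residuals))   # sums to (numerically) zero for a least squares fit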
Example
In this lesson, we apply regression analysis to some fictitious data, and we show how to
interpret the results of our analysis.
Problem Statement
Last year, five randomly selected students took a math aptitude test before they began their
statistics course. The Statistics Department has three questions.
1. What linear regression equation best predicts statistics performance, based on math aptitude scores?
2. If a student made an 80 on the aptitude test, what grade would we expect her to make in statistics?
3. How well does the regression equation fit the data?
In the table below, the xi column shows scores on the aptitude test. Similarly, the yi column
shows statistics grades. The last two rows show sums and mean scores that we will use to
conduct the regression analysis.
Student   xi     yi     (xi - x̄)   (yi - ȳ)   (xi - x̄)²   (yi - ȳ)²   (xi - x̄)(yi - ȳ)
1         95     85     17          8           289          64           136
2         85     95     7           18          49           324          126
3         80     70     2           -7          4            49           -14
4         70     65     -8          -12         64           144          96
5         60     70     -18         -7          324          49           126
Sum       390    385                            730          630          470
Mean      78     77
b1 = Σ [ (xi - x̄)(yi - ȳ) ] / Σ [ (xi - x̄)² ]
b1 = 470/730 = 0.644

b0 = ȳ - b1 * x̄
b0 = 77 - (0.644)(78) = 26.768
Once you have the regression equation, using it is a snap. Choose a value for the
independent variable (x), perform the computation, and you have an estimated value (ŷ) for
the dependent variable.
In our example, the independent variable is the student's score on the aptitude test. The dependent variable is the student's statistics grade. If a student made an 80 on the aptitude test, the estimated statistics grade would be:

ŷ = b0 + b1x = 26.768 + 0.644(80) = 26.768 + 51.52 = 78.288
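The hand calculations above can be double-checked with a few lines of Python; the x and y values are the aptitude scores and statistics grades from the table.

# Reproduce the worked example: b1 = 470/730 = 0.644, b0 = about 26.77
x = [95, 85, 80, 70, 60]   # aptitude test scores
y = [85, 95, 70, 65, 70]   # statistics grades

n = len(x)
x_bar = sum(x) / n   # 78
y_bar = sum(y) / n   # 77

num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))   # 470
den = sum((xi - x_bar) ** 2 for xi in x)                         # 730
b1 = num / den                                                   # about 0.644
b0 = y_bar - b1 * x_bar                                          # about 26.78

# Predicted statistics grade for an aptitude score of 80
print(b0 + b1 * 80)   # about 78.3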
Warning: When you use a regression equation, do not use values for the independent
variable that are outside the range of values used to create the equation. That is
called extrapolation, and it can produce unreasonable estimates.
In this example, the aptitude test scores used to create the regression equation ranged from
60 to 95. Therefore, only use values inside that range to estimate statistics grades. Using
values outside that range (less than 60 or greater than 95) is problematic.
Whenever you use a regression equation, you should ask how well the equation fits the data.
One way to assess fit is to check the coefficient of determination, which can be computed from the following formula:

R² = { ( 1 / N ) * Σ [ (xi - x̄)(yi - ȳ) ] / ( σx * σy ) }²

where N is the number of observations used to fit the model, Σ is the summation symbol, xi is the x value for observation i, x̄ is the mean x value, yi is the y value for observation i, ȳ is the mean y value, σx is the standard deviation of x, and σy is the standard deviation of y. Computations for the sample problem of this lesson are shown below.
σx = sqrt [ Σ (xi - x̄)² / N ] = sqrt( 730/5 ) = sqrt(146) = 12.083
σy = sqrt [ Σ (yi - ȳ)² / N ] = sqrt( 630/5 ) = sqrt(126) = 11.225

R² = { ( 1/5 ) * 470 / ( 12.083 * 11.225 ) }² = ( 94 / 135.63 )² = ( 0.693 )² = 0.48
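These figures can be verified with a short script; note that the formula uses population standard deviations (dividing by N), matching the definitions above.

import math

x = [95, 85, 80, 70, 60]
y = [85, 95, 70, 65, 70]
N = len(x)
x_bar, y_bar = sum(x) / N, sum(y) / N

sigma_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / N)   # sqrt(146) = 12.083
sigma_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / N)   # sqrt(126) = 11.225

cov_term = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / N   # 470/5 = 94
r = cov_term / (sigma_x * sigma_y)   # correlation, about 0.693
r_squared = r ** 2                   # coefficient of determination, about 0.48
print(round(r_squared, 2))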
A coefficient of determination equal to 0.48 indicates that about 48% of the variation in
statistics grades (the dependent variable) can be explained by the relationship to math
aptitude scores (the independent variable). This would be considered a good fit to the data,
in the sense that it would substantially improve an educator's ability to predict student
performance in statistics class.
Residual Analysis in Regression
Because a linear regression model is not always appropriate for the data, you should assess
the appropriateness of the model by defining residuals and examining residual plots.
Residuals
The difference between the observed value of the dependent variable (y) and the predicted
value (ŷ) is called the residual (e). Each data point has one residual.
Both the sum and the mean of the residuals are equal to zero. That is, Σ e = 0 and e = 0.
Residual Plots
A residual plot is a graph that shows the residuals on the vertical axis and the independent
variable on the horizontal axis. If the points in a residual plot are randomly dispersed around
the horizontal axis, a linear regression model is appropriate for the data; otherwise, a non-
linear model is more appropriate.
Below, the table shows inputs and outputs from a simple linear regression analysis, and the accompanying chart displays the residual (e) plotted against the independent variable (x) as a residual plot. The fitted values use the coefficients from the earlier example (ŷ = 26.768 + 0.644x).

x    60       70       80       85       95
y    70       65       70       95       85
ŷ    65.408   71.848   78.288   81.508   87.948
e    4.592    -6.848   -8.288   13.492   -2.948
The residual plot shows a fairly random pattern - the first residual is positive, the next two
are negative, the fourth is positive, and the last residual is negative. This random pattern
indicates that a linear model provides a decent fit to the data.
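A residual plot like the one described here can be sketched in a few lines of Python (matplotlib assumed); the fitted values use the rounded coefficients from the earlier example.

import matplotlib.pyplot as plt

x = [60, 70, 80, 85, 95]
y = [70, 65, 70, 95, 85]
b0, b1 = 26.768, 0.644   # coefficients from the earlier example

y_hat = [b0 + b1 * xi for xi in x]                  # fitted values
residuals = [yi - yh for yi, yh in zip(y, y_hat)]   # e = y - yhat

print(round(sum(residuals), 3))   # close to zero, as expected

plt.scatter(x, residuals)
plt.axhline(0, linestyle="--")    # horizontal reference line at e = 0
plt.xlabel("x (independent variable)")
plt.ylabel("residual (e)")
plt.show()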
Below, the residual plots show three typical patterns. The first plot shows a random pattern,
indicating a good fit for a linear model. The other plot patterns are non-random (U-shaped
and inverted U), suggesting a better fit for a non-linear model.
In the next lesson, we will work on a problem where the residual plot shows a non-random pattern, and we will show how to "transform" the data so that a linear model can be used with nonlinear data.
There are many ways to transform variables to achieve linearity for regression analysis.
Some common methods are summarized below.
Method                       Transformation(s)                  Regression equation       Predicted value (ŷ)
Standard linear regression   None                               y = b0 + b1x              ŷ = b0 + b1x
Exponential model            Dependent variable = log(y)        log(y) = b0 + b1x         ŷ = 10^(b0 + b1x)
Quadratic model              Dependent variable = sqrt(y)       sqrt(y) = b0 + b1x        ŷ = ( b0 + b1x )²
Logarithmic model            Independent variable = log(x)      y = b0 + b1log(x)         ŷ = b0 + b1log(x)
Power model                  Dependent variable = log(y),       log(y) = b0 + b1log(x)    ŷ = 10^(b0 + b1log(x))
                             independent variable = log(x)
Each row shows a different nonlinear transformation method. The second column shows the
specific transformation applied to dependent and/or independent variables. The third
column shows the regression equation used in the analysis. And the last column shows the
"back transformation" equation used to restore the dependent variable to its original, non-
transformed measurement scale.
In practice, these methods need to be tested on the data to which they are applied to be
sure that they increase rather than decrease the linearity of the relationship. Testing the
effect of a transformation method involves looking at residual plots and correlation
coefficients, as described in the following sections.
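As a rough illustration of that kind of check, the sketch below compares the correlation of x with y before and after a log transformation of the dependent variable; the data values are made up and roughly exponential.

import math

# Hypothetical data with a roughly exponential trend
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 4.3, 8.8, 17.5, 36.0, 71.0]

def pearson_r(a, b):
    # Pearson correlation coefficient between two equal-length lists
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    sa = math.sqrt(sum((ai - ma) ** 2 for ai in a))
    sb = math.sqrt(sum((bi - mb) ** 2 for bi in b))
    return cov / (sa * sb)

log_y = [math.log10(yi) for yi in y]   # exponential model: log(y) = b0 + b1*x

print(pearson_r(x, y))       # linearity before the transformation
print(pearson_r(x, log_y))   # linearity after the transformation (closer to 1 here)

# A prediction on the original scale would be back-transformed: yhat = 10 ** (b0 + b1 * x_new)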
Logistic Regression
What is the logistic curve? What is the base of the natural logarithm? Why do
statisticians prefer logistic regression to ordinary linear regression when the DV is
binary? How are probabilities, odds and logits related? What is an odds ratio? How
can logistic regression be considered a linear regression? What is a loss function?
What is a maximum likelihood estimate? How is the b weight in logistic regression for
a categorical variable related to the odds ratio of its constituent categories?
For this chapter only, we are going to deal with a dependent variable that is binary (a
categorical variable that has two values such as "yes" and "no") rather than
continuous.
Suppose we want to predict whether someone is male or female (DV, M=1, F=0)
using height in inches (IV). We could plot the relations between the two variables as
we customarily do in regression. The plot might look something like this:
Points to notice about the graph (data are fictional):
1. The regression line is a rolling average, just as in linear regression. The Y-axis is P, which indicates the proportion of 1s at any given value of height.
2. The regression line is nonlinear.
3. None of the observations --the raw data points-- actually fall on the regression line. They all fall on zero or one.
When I was in graduate school, people didn't use logistic regression with a binary DV.
They just used ordinary linear regression instead. Statisticians won the day, however,
and now most psychologists use logistic regression with a binary DV for the
following reasons:
1. If you use linear regression, the predicted values will become greater than one
and less than zero if you move far enough on the X-axis. Such values are
theoretically inadmissible.
2. One of the assumptions of regression is that the variance of Y is constant across
values of X (homoscedasticity). This cannot be the case with a binary variable,
because the variance is PQ. When 50 percent of the people are 1s, then the
variance is .25, its maximum value. As we move to more extreme values, the
variance decreases. When P=.10, the variance is .1*.9 = .09, so as P approaches
1 or zero, the variance approaches zero.
3. The significance testing of the b weights rests upon the assumption that errors of
prediction (Y-Y') are normally distributed. Because Y only takes the values 0
and 1, this assumption is pretty hard to justify, even approximately. Therefore,
the tests of the regression weights are suspect if you use linear regression with a
binary DV.
The logistic regression model writes P, the probability of a 1, as a function of X:

P = e^(a + bX) / ( 1 + e^(a + bX) )

or, equivalently,

P = 1 / ( 1 + e^-(a + bX) )
where P is the probability of a 1 (the proportion of 1s, the mean of Y), e is the base of
the natural logarithm (about 2.718) and a and b are the parameters of the model. The
value of a yields P when X is zero, and b adjusts how quickly the probability changes when X changes by a single unit (we can have standardized and unstandardized b weights in logistic regression, just as in ordinary linear regression). Because the
relation between X and P is nonlinear, b does not have a straightforward interpretation
in this model as it does in ordinary linear regression.
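A minimal sketch of this curve, with made-up values of a and b chosen only to produce an S-shape over plausible heights, is:

import math

def logistic_p(x, a, b):
    # Probability of a 1 at value x under the logistic curve
    return math.exp(a + b * x) / (1 + math.exp(a + b * x))

# Hypothetical parameters for predicting male (1) vs. female (0) from height in inches
a, b = -45.0, 0.65

for height in (62, 66, 70, 74):
    print(height, round(logistic_p(height, a, b), 3))   # rises from near 0 toward 1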
Loss Function
A loss function is a measure of fit between a mathematical model of data and the
actual data. We choose the parameters of our model to minimize the badness-of-fit or
to maximize the goodness-of-fit of the model to the data. With least squares (the only
loss function we have used thus far), we minimize SSres, the sum of squares residual.
This also happens to maximize SSreg, the sum of squares due to regression. With linear
or curvilinear models, there is a mathematical solution to the problem that will
minimize the sum of squares, that is,
b = (X'X)^-1 X'y

or, for standardized weights,

β = R^-1 r
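In code, this matrix solution can be sketched as follows (numpy assumed; the numbers are the small data set from the earlier regression example, with a column of ones added for the intercept).

import numpy as np

x = np.array([60., 70., 80., 85., 95.])
y = np.array([70., 65., 70., 95., 85.])

# Design matrix X with a leading column of ones so that b includes the intercept
X = np.column_stack([np.ones_like(x), x])

# b = (X'X)^-1 X'y; np.linalg.solve is the numerically safer way to apply it
b = np.linalg.solve(X.T @ X, X.T @ y)
print(b)   # intercept and slope, about [26.78, 0.644] for these numbers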
With some models, like the logistic curve, there is no mathematical solution that will
produce least squares estimates of the parameters. For many of these models, the loss
function chosen is called maximum likelihood. A likelihood is a conditional
probability (e.g., P(Y|X), the probability of Y given X). We can pick the parameters of
the model (a and b of the logistic curve) at random or by trial-and-error and then
compute the likelihood of the data given those parameters (actually, we do better than
trial-and-error, but not perfectly). We will choose as our parameters those that result in the greatest likelihood computed. The estimates are called maximum likelihood
because the parameters are chosen to maximize the likelihood (conditional probability
of the data given parameter estimates) of the sample data. The techniques actually
employed to find the maximum likelihood estimates fall under the general
label numerical analysis. There are several methods of numerical analysis, but they all
follow a similar series of steps. First, the computer picks some initial estimates of the
parameters. Then it will compute the likelihood of the data given these parameter
estimates. Then it will improve the parameter estimates slightly and recalculate the
likelihood of the data. It will do this forever until we tell it to stop, which we usually
do when the parameter estimates do not change much (usually a change of .01 or .001 is
small enough to tell the computer to stop). [Sometimes we tell the computer to stop
after a certain number of tries or iterations, e.g., 20 or 250. This usually indicates a
problem in estimation.]
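The iterative idea can be sketched as simple gradient ascent on the log likelihood with a stopping rule based on how much the estimates change; this is only an illustration of the general procedure, not the particular algorithm a package such as SAS uses.

import math

def log_likelihood(a, b, xs, ys):
    # Log of the probability of the observed 0/1 data given parameters a and b
    ll = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(a + b * x)))
        ll += y * math.log(p) + (1 - y) * math.log(1 - p)
    return ll

def fit_logistic(xs, ys, step=0.01, tol=0.001, max_iter=10000):
    a, b = 0.0, 0.0   # initial parameter estimates
    for _ in range(max_iter):
        # Gradient of the log likelihood with respect to a and b
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += (y - p)
            grad_b += (y - p) * x
        new_a, new_b = a + step * grad_a, b + step * grad_b
        # Stop when the estimates barely change between iterations
        if abs(new_a - a) < tol and abs(new_b - b) < tol:
            return new_a, new_b
        a, b = new_a, new_b
    return a, b   # hit the iteration limit, which may indicate an estimation problem

# Hypothetical 0/1 outcomes and a single 0/1 predictor
xs = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
ys = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
a, b = fit_logistic(xs, ys)
print(a, b, -2 * log_likelihood(a, b, xs, ys))   # -2 Log L for the fitted model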
The odds of an event are the probability of the event divided by the probability of the non-event, P/(1-P). Suppose, for example, that the probability of being male at a given height is .90, so the probability of being female is .10. (Odds can also be found by counting the number of people in each group and dividing one number by the other. Clearly, the probability is not the same as the odds.) In our example, the odds of being male would be .90/.10 or 9 to one. The odds of being female would be .10/.90 or 1/9 or .11. This asymmetry is unappealing, because the odds of being a male should be the opposite of the odds of being a female. We can take care of this asymmetry through the natural logarithm, ln. The natural log of 9 is 2.197 (ln(.9/.1) = 2.197). The natural log of 1/9 is -2.197 (ln(.1/.9) = -2.197), so the log odds of
being male is exactly opposite to the log odds of being female. The natural log
function looks like this:
Note that the natural log is zero when X is 1. When X is larger than one, the log
curves up slowly. When X is less than one, the natural log is less than zero, and
decreases rapidly as X approaches zero. When P = .50, the odds are .50/.50 or 1, and
ln(1) = 0. If P is greater than .50, ln(P/(1-P)) is positive; if P is less than .50, ln(odds) is negative. [A number taken to a negative power is one divided by that number, e.g. e^-10 = 1/e^10. A logarithm is an exponent from a given base, for example ln(e^10) = 10.]
In logistic regression, the dependent variable is a logit, which is the natural log of the
odds, that is,
logit(P) = a + bX,
which is assumed to be linear; that is, the log odds (logit) is assumed to be linearly related to X, our IV. So there's an ordinary regression hidden in there. We could in
theory do ordinary regression with logits as our DV, but of course, we don't have
logits in there, we have 1s and 0s. Then, too, people have a hard time understanding
logits. We could talk about odds instead. Of course, people like to talk about
probabilities more than odds. To get there (from logits to probabilities), we first take the log out of both sides of the equation, which gives the odds:

P/(1-P) = e^(a + bX)

Then we convert the odds to a simple probability:

P = e^(a + bX) / ( 1 + e^(a + bX) )
The simple probability is this ugly equation that you saw earlier. If log odds are
linearly related to X, then the relation between X and P is nonlinear, and has the form
of the S-shaped curve you saw in the graph and the function form (equation) shown
immediately above.
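The round trip from probabilities to odds to logits and back can be written directly; the .90 example from above comes out as odds of 9 and a logit of about 2.197.

import math

def logit(p):
    # Natural log of the odds
    return math.log(p / (1 - p))

def prob_from_logit(log_odds):
    # Invert the logit: convert a log odds back to a simple probability
    return math.exp(log_odds) / (1 + math.exp(log_odds))

p = 0.90
odds = p / (1 - p)                 # 9.0
print(odds, round(logit(p), 3))    # 9.0 and about 2.197
print(prob_from_logit(logit(p)))   # back to 0.90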
An Example
Suppose that we are working with some doctors on heart attack patients. The
dependent variable is whether the patient has had a second heart attack within 1 year
(yes = 1). We have two independent variables. One is whether the patient completed a treatment consisting of anger control practices (yes = 1). The other IV is a score on a
trait anxiety scale (a higher score means more anxious).
Our data:
Note that half of our patients have had a second heart attack. Knowing nothing else
about a patient, and following the best in current medical practice, we would flip a
coin to predict whether they will have a second attack within 1 year. According to our
correlation coefficients, those in the anger treatment group are less likely to have
another attack, but the result is not significant. Greater anxiety is associated with a
higher probability of another attack, and the result is significant (according to r).
Now let's look at the logistic regression, for the moment examining the treatment of
anger by itself, ignoring the anxiety test scores. SAS prints this:
Response Levels: 2
Number of Observations: 20

Response Profile

Ordered Value   Value   Count
1               0       10
2               1       10
SAS tells us what it understands us to model, including the name of the DV, and its
distribution.
Then we calculate probabilities with and without including the treatment variable.
Criterion   Intercept Only   Intercept and Covariates   Chi-Square for Covariates
-2 LOG L    27.726           25.878                     1.848 with 1 df (p=.17)
The computer calculates the likelihood of the data. Because there are equal numbers
of people in the two groups, the probability of group membership initially (without
considering anger treatment) is .50 for each person. Because the people are
independent, the probability of the entire set of people is .50^20, a very small number.
Because the number is so small, it is customary to first take the natural log of the
probability and then multiply the result by -2. The latter step makes the result positive.
The statistic -2LogL (minus 2 times the log of the likelihood) is a badness-of-fit
indicator, that is, large numbers mean poor fit of the model to the data. SAS prints the
result as -2 LOG L. For the initial model (intercept only), our result is the value
27.726. This is a baseline number indicating model fit. This number has no direct
analog in linear regression. It is roughly analogous to generating some random
numbers and finding R2 for these numbers as a baseline measure of fit in ordinary
linear regression. By including a term for treatment, the loss function reduces to
25.878, a difference of 1.848, shown in the chi-square column. The difference
between the two values of -2LogL is known as the likelihood ratio test.
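The baseline value of 27.726 follows directly from the arithmetic described above, and the chi-square in the printout is just the drop in -2 Log L.

import math

# Intercept-only model: each of the 20 patients is assigned probability .50
n = 20
likelihood = 0.50 ** n                 # probability of the whole sample, a very small number
minus_2_log_l = -2 * math.log(likelihood)
print(round(minus_2_log_l, 3))         # 27.726

# Adding the treatment term reduces -2 Log L to 25.878
chi_square = 27.726 - 25.878
print(round(chi_square, 3))            # 1.848, with 1 df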
When taken from large samples, the difference between two values of -2LogL is
distributed as chi-square:

χ² = (-2LogL for the restricted model) - (-2LogL for the full model) = -2 ln( L_restricted / L_full )

This says that the -2LogL for a restricted (smaller) model minus the -2LogL for a full (larger) model is the same as -2 times the log of the ratio of the two likelihoods, which is distributed as chi-square. The full or larger model has all the parameters of interest in
it. The restricted model is said to be nested in the larger model. The restricted model has one or more of the parameters in the full model restricted to some value (usually zero). The
parameters in the nested model must be a proper subset of the parameters in the full
model. For example, suppose we have two IVs, one categorical and one continuous,
and we are looking at an ATI design. A full model could have included terms for the
continuous variable, the categorical variable and their interaction (3 terms). Restricted
models could delete the interaction or one or more main effects (e.g., we could have a
model with only the categorical variable). A nested model cannot have as a single IV,
some other categorical or continuous variable not contained in the full model. If it
does, then it is no longer nested, and we cannot compare the two values of -2LogL to
get a chi-square value. The chi-square is used to statistically test whether including a
variable reduces the badness-of-fit measure. This is analogous to producing an increment
in R-square in hierarchical regression. If chi-square is significant, the variable is
considered to be a significant predictor in the equation, analogous to the significance
of the b weight in simultaneous regression.
For our example with anger treatment only, SAS produces the following:
The intercept is the value of a, in this case -.5596. As usual, we are not terribly
interested in whether a is equal to zero. The value of b given for Anger Treatment is
1.2528. The chi-square associated with this b is not significant, just as the chi-square
for covariates was not significant. Therefore we cannot reject the hypothesis that b is
zero in the population. Our equation can be written either:
Logit(P) = -.5596 + 1.2528X

or

P = e^(-.5596 + 1.2528X) / ( 1 + e^(-.5596 + 1.2528X) )
Now the odds for another group would also be P/(1-P) for that group. Suppose we
arrange our data in the following way:
Anger Treatment
Heart Attack Yes (1) No (0) Total
Yes (1) 3 (a) 7 (b) 10 (a+b)
No (0) 6 (c) 4 (d) 10 (c+d)
Total 9 (a+c) 11 (b+d) 20 (a+b+c+d)
Now we can compute the odds of having a heart attack for the treatment group and the
no treatment group. For the treatment group, the odds are 3/6 = 1/2. The probability of
a heart attack is 3/(3+6) = 3/9 = .33. The odds from this probability are .33/(1-.33)
= .33/.66 = 1/2. The odds for the no treatment group are 7/4 or 1.75. The odds ratio is
calculated to compare the odds across groups.
If the odds are the same across groups, the odds ratio (OR) will be 1.0. If not, the OR
will be larger or smaller than one. People like to see the ratio phrased in the larger direction. In our case, this would be 1.75/.5 or 1.75*2 = 3.50.
Now if we go back up to the last column of the printout where it says odds ratio in the treatment column, you will see that the odds ratio is 3.50, which is what we got by finding the odds ratio for the odds from the two treatment conditions. It also happens that e^1.2528 = 3.50. Note that the exponent is our value of b for the logistic curve.
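The agreement between the 2 x 2 table and the regression output can be checked in a couple of lines.

import math

# Counts from the 2 x 2 table above
a, b, c, d = 3, 7, 6, 4   # a, c = treatment column; b, d = no-treatment column

odds_treatment = a / c        # 3/6 = 0.5
odds_no_treatment = b / d     # 7/4 = 1.75
odds_ratio = odds_no_treatment / odds_treatment
print(odds_ratio)             # 3.5

# The b weight for anger treatment reproduces the same ratio
b_weight = 1.2528
print(round(math.exp(b_weight), 2))   # about 3.50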