CSE 261: NUMERICAL METHODS
LECTURE 11
REGRESSION ANALYSIS
PRESENTED BY
MD NAZRUL ISLAM, LECTURER, CSE, SEU
WHAT IS REGRESSION
ANALYSIS?
Regression analysis gives information on the
relationship between a response (dependent) variable
and one or more predictor (independent) variables
The goal of regression analysis is to express the
response variable as a function of the predictor
variables
The goodness of fit and the accuracy of the conclusions depend on the data used.
Hence, non-representative or improperly compiled data result in poor fits and conclusions.
A REGRESSION MODEL
An example of a regression model is the linear regression model, which is a linear relationship between the response variable y and the predictor variables xi, i = 1, 2, ..., n, of the form

y = β0 + β1x1 + β2x2 + ... + βnxn + ε   (1)

where
β0, β1, ..., βn are the regression coefficients (unknown model parameters), and
ε is the error due to variability in the observed responses.
EXAMPLE 1
In the transformation of raw or uncooked potato to
cooked potato, heat is applied for some specific
time.
One might postulate that the amount of
untransformed portion of the starch (y) inside the
potato is a linear function of time (t) and
temperature (θ) of cooking. This is represented as
y = β0 + β1t + β2θ

Linear regression refers to finding the unknown parameters β1 and β2, which are simple linear multipliers of the predictor variables.
USES OF REGRESSION
ANALYSIS
Three uses of regression analysis are
• model specification
• parameter estimation
• prediction
Model specification
Accurate prediction and model specification
require that
• all relevant variables be accounted for in the
data
• the prediction equation be defined in the correct
functional form for all predictor variables.
PARAMETER ESTIMATION
Parameter estimation is the most difficult use to perform well, because not only must the model be correctly specified, the predictions must also be accurate and the data must allow for good estimation.
For example, collinearity among the predictor variables creates a problem and may require that some variables not be used.
Thus, limitations of the data and the inability to measure all predictor variables relevant in a study restrict the use of prediction equations.
PREDICTION
Regression analysis equations are designed
only to make predictions.
Good predictions will not be possible if the model is not correctly specified and the accuracy of the parameters is not ensured.
CONSIDERATIONS FOR EFFECTIVE USE OF REGRESSION ANALYSIS
For effective use of regression analysis, one
should
• investigate the data collection process,
• discover any limitations in data collected
• restrict conclusions accordingly
LINEAR REGRESSION
Linear regression is the most popular regression model. In this model, we wish to predict the response to n data points (x1, y1), (x2, y2), ..., (xn, yn) by a regression model given by

y = a0 + a1x   (1)

where a0 and a1 are the constants of the regression model.
MEASURE OF GOODNESS OF FIT
A measure of goodness of fit, that is, of how well the model predicts the response variable, is the magnitude of the residual at each of the n data points:

Ei = yi - (a0 + a1xi)   (2)

Ideally, if all the residuals are zero, one has found an equation in which all the points lie on the model.
Thus, minimization of the residuals is an objective of obtaining the regression coefficients.
The most popular method to minimize the residuals is the least squares method, where the estimates of the constants of the model are chosen such that the sum of the squared residuals, E1² + E2² + ... + En², is minimized.
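As a small illustration of this criterion, the C++ sketch below (an added helper for illustration, not part of the lecture program) computes the sum of the squared residuals for a candidate pair of constants a0 and a1.

#include <vector>

// Sum of squared residuals Ei = yi - (a0 + a1*xi) for a candidate line y = a0 + a1*x.
// A minimal sketch; x and y are assumed to have the same length.
double sumSquaredResiduals(const std::vector<double>& x,
                           const std::vector<double>& y,
                           double a0, double a1) {
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); i++) {
        double e = y[i] - (a0 + a1 * x[i]);  // residual at data point i
        sum += e * e;                        // accumulate its square
    }
    return sum;
}

The least squares estimates derived next are exactly the values of a0 and a1 that make this quantity as small as possible.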
MINIMIZATION OF THE ERROR
Let us use the least squares criterion, where we minimize

R^2 = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2   (3)

where R² (also written Sr) is the sum of the squares of the residuals.
Differentiating Equation (3) with respect to a0 and a1, we get

\frac{\partial R^2}{\partial a_0} = \sum_{i=1}^{n} 2 (y_i - a_0 - a_1 x_i)(-1) = 0   (4)

\frac{\partial R^2}{\partial a_1} = \sum_{i=1}^{n} 2 (y_i - a_0 - a_1 x_i)(-x_i) = 0   (5)
MINIMIZATION OF THE ERROR (CONTINUED)
Using equations (4) and (5), we get

\sum_{i=1}^{n} y_i - \sum_{i=1}^{n} a_0 - \sum_{i=1}^{n} a_1 x_i = 0   (6)

\sum_{i=1}^{n} y_i x_i - \sum_{i=1}^{n} a_0 x_i - \sum_{i=1}^{n} a_1 x_i^2 = 0   (7)

Noting that \sum_{i=1}^{n} a_0 = a_0 + a_0 + \dots + a_0 = n a_0, these become

n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i   (8)

a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i   (9)
MINIMIZATION OF THE ERROR (CONTINUED)
Solving the above equations (8) and (9) gives

a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}   (10)

a_0 = \frac{\sum_{i=1}^{n} y_i}{n} - a_1 \frac{\sum_{i=1}^{n} x_i}{n} = \bar{y} - a_1 \bar{x}   (11)
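Equations (10) and (11) translate directly into code. The following short C++ sketch (a hypothetical helper added for illustration, separate from the full program given at the end of the lecture) computes a1 and a0 from the needed summations.

#include <vector>

// Fit y = a0 + a1*x by least squares using equations (10) and (11).
// A minimal sketch: x and y are assumed non-empty and of equal length,
// and the x values are assumed not all identical (non-zero denominator).
void fitLine(const std::vector<double>& x, const std::vector<double>& y,
             double& a0, double& a1) {
    int n = static_cast<int>(x.size());
    double sumX = 0.0, sumY = 0.0, sumXY = 0.0, sumX2 = 0.0;
    for (int i = 0; i < n; i++) {
        sumX  += x[i];
        sumY  += y[i];
        sumXY += x[i] * y[i];
        sumX2 += x[i] * x[i];
    }
    a1 = (n * sumXY - sumX * sumY) / (n * sumX2 - sumX * sumX);  // equation (10)
    a0 = sumY / n - a1 * sumX / n;                               // equation (11)
}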
EXAMPLE 2
The torque T needed to turn the torsional spring of a mousetrap through an angle θ is given below.

Angle θ (radians)    Torque T (N·m)
0.698132             0.188224
0.959931             0.209138
1.134464             0.230052
1.570796             0.250965
1.919862             0.313707

Find the constants k1 and k2 of the regression model

T = k1 + k2θ
TABULATION OF DATA FOR CALCULATION OF NEEDED SUMMATIONS
i    θ (radians)    T (N·m)     θ² (radians²)    Tθ (N·m·radians)
1    0.698132       0.188224    0.487388         0.131405
2    0.959931       0.209138    0.921468         0.200758
3    1.134464       0.230052    1.287009         0.260986
4    1.570796       0.250965    2.467401         0.394215
5    1.919862       0.313707    3.685870         0.602274
Sum (i = 1 to 5)    6.2831      1.1921           8.8491           1.5896
THE VALUES OF CONSTANTS
k_2 = \frac{n \sum_{i=1}^{5} \theta_i T_i - \sum_{i=1}^{5} \theta_i \sum_{i=1}^{5} T_i}{n \sum_{i=1}^{5} \theta_i^2 - \left( \sum_{i=1}^{5} \theta_i \right)^2} = 9.6091 \times 10^{-2} N·m/rad

\bar{T} = \frac{1}{n} \sum_{i=1}^{5} T_i = 2.3842 \times 10^{-1} N·m

\bar{\theta} = \frac{1}{n} \sum_{i=1}^{5} \theta_i = 1.2566 radians

k_1 = \bar{T} - k_2 \bar{\theta} = 1.1767 \times 10^{-1} N·m
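As a quick check (an added worked step, not on the original slide), substituting the four-decimal sums from the table into equations (10) and (11), with a1 = k2 and a0 = k1:

k_2 = \frac{5(1.5896) - (6.2831)(1.1921)}{5(8.8491) - (6.2831)^2} = \frac{0.4579}{4.7682} \approx 9.60 \times 10^{-2} \ \text{N·m/rad}

k_1 = \bar{T} - k_2 \bar{\theta} = 0.23842 - (0.096091)(1.2566) \approx 1.1767 \times 10^{-1} \ \text{N·m}

The small difference in the last digits of k2 comes from rounding the summations to four decimals; carrying full precision reproduces 9.6091 × 10⁻².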
LINEAR REGRESSION OF TORQUE VS. ANGLE DATA
(Figure: the five torque vs. angle data points with the fitted line T = k1 + k2θ)
A CLASS EXERCISE
For the following points, find a regression for
• (a) 1st order
• (b) 2nd order

x    y
1    0.11
2    0.20
3    0.32
4    0.38
5    0.53
LEAST SQUARES FITTING - POLYNOMIAL
Generalizing from a straight line (i.e., a first-degree polynomial) to a kth-degree polynomial

y = a0 + a1x + a2x² + a3x³ + ... + akx^k

the sum of the squared residuals is given by

R^2 = \sum_{i=1}^{n} \left[ y_i - (a_0 + a_1 x_i + a_2 x_i^2 + \dots + a_k x_i^k) \right]^2
LEAST SQUARES FITTING - POLYNOMIAL (CONTINUED)
The partial derivatives are:

\frac{\partial R^2}{\partial a_0} = -2 \sum_{i=1}^{n} [y_i - (a_0 + a_1 x_i + \dots + a_k x_i^k)] = 0

\frac{\partial R^2}{\partial a_1} = -2 \sum_{i=1}^{n} [y_i - (a_0 + a_1 x_i + \dots + a_k x_i^k)] \, x_i = 0

\vdots

\frac{\partial R^2}{\partial a_m} = -2 \sum_{i=1}^{n} [y_i - (a_0 + a_1 x_i + \dots + a_k x_i^k)] \, x_i^m = 0

\vdots

\frac{\partial R^2}{\partial a_k} = -2 \sum_{i=1}^{n} [y_i - (a_0 + a_1 x_i + \dots + a_k x_i^k)] \, x_i^k = 0
IN MATRIX FORM
\begin{bmatrix}
n & \sum_{i=1}^{n} x_i & \cdots & \sum_{i=1}^{n} x_i^k \\
\sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 & \cdots & \sum_{i=1}^{n} x_i^{k+1} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^{n} x_i^k & \sum_{i=1}^{n} x_i^{k+1} & \cdots & \sum_{i=1}^{n} x_i^{2k}
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_k \end{bmatrix}
=
\begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \\ \vdots \\ \sum_{i=1}^{n} x_i^k y_i \end{bmatrix}

[C] [A] = [B]
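To make the matrix form concrete, here is a short C++ sketch (an illustrative addition, not part of the original lecture program) that builds the normal equations [C][A] = [B] for a kth-degree polynomial and solves them with Gaussian elimination. It assumes the data determine the polynomial uniquely, so the matrix is non-singular.

#include <cmath>
#include <iostream>
#include <utility>
#include <vector>

// Fit y = a0 + a1*x + ... + ak*x^k by least squares.
// Builds the (k+1)x(k+1) normal equations [C][A] = [B] and solves them
// by Gaussian elimination with partial pivoting. A minimal sketch.
std::vector<double> polyFit(const std::vector<double>& x,
                            const std::vector<double>& y, int k) {
    int m = k + 1;  // number of unknown coefficients a0..ak
    std::vector<std::vector<double>> C(m, std::vector<double>(m, 0.0));
    std::vector<double> B(m, 0.0);

    // Fill C[r][c] = sum(x_i^(r+c)) and B[r] = sum(x_i^r * y_i).
    for (std::size_t i = 0; i < x.size(); i++) {
        for (int r = 0; r < m; r++) {
            B[r] += std::pow(x[i], r) * y[i];
            for (int c = 0; c < m; c++)
                C[r][c] += std::pow(x[i], r + c);
        }
    }

    // Forward elimination with partial pivoting.
    for (int p = 0; p < m; p++) {
        int best = p;
        for (int r = p + 1; r < m; r++)
            if (std::fabs(C[r][p]) > std::fabs(C[best][p])) best = r;
        std::swap(C[p], C[best]);
        std::swap(B[p], B[best]);
        for (int r = p + 1; r < m; r++) {
            double f = C[r][p] / C[p][p];
            for (int c = p; c < m; c++) C[r][c] -= f * C[p][c];
            B[r] -= f * B[p];
        }
    }

    // Back substitution.
    std::vector<double> a(m, 0.0);
    for (int r = m - 1; r >= 0; r--) {
        double s = B[r];
        for (int c = r + 1; c < m; c++) s -= C[r][c] * a[c];
        a[r] = s / C[r][r];
    }
    return a;
}

int main() {
    // Data from the class exercise; a 2nd-order (k = 2) fit as an example.
    std::vector<double> x = {1, 2, 3, 4, 5};
    std::vector<double> y = {0.11, 0.20, 0.32, 0.38, 0.53};
    std::vector<double> a = polyFit(x, y, 2);
    for (std::size_t j = 0; j < a.size(); j++)
        std::cout << "a" << j << " = " << a[j] << "\n";
    return 0;
}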
PROGRAM (CPP)
#include <iostream>
#include <iomanip>
#include <vector>
using namespace std;

int main() {
    cout << fixed << setprecision(4);

    int n;
    cout << "Enter number of data points: ";
    cin >> n;

    // Use vectors rather than variable-length arrays, which are not standard C++.
    vector<double> x(n), y(n);
    cout << "Enter x and y values:\n";
    for (int i = 0; i < n; i++) {
        cout << "x[" << i << "]: ";
        cin >> x[i];
        cout << "y[" << i << "]: ";
        cin >> y[i];
    }
PROGRAM (CPP) (CONTINUED)
    double sumX = 0.0, sumY = 0.0, sumXY = 0.0, sumX2 = 0.0;
    for (int i = 0; i < n; i++) {
        sumX += x[i];
        sumY += y[i];
        sumXY += x[i] * y[i];
        sumX2 += x[i] * x[i];
    }

    // Compute slope (b) and intercept (a)
    double b = (n * sumXY - sumX * sumY) / (n * sumX2 - sumX * sumX);
    double a = (sumY - b * sumX) / n;

    cout << "\nLinear Regression Equation:\n";
    cout << "y = " << a << " + " << b << " * x" << endl;

    // Predict y for a given x
    double x_value;
    cout << "\nEnter x value to predict y: ";
    cin >> x_value;
    double y_pred = a + b * x_value;
    cout << "Predicted y at x = " << x_value << " is: " << y_pred << endl;

    return 0;
}
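As a usage check (an added note), feeding the program the five points from the class exercise (x = 1, 2, 3, 4, 5 and y = 0.11, 0.20, 0.32, 0.38, 0.53) gives sumX = 15, sumY = 1.54, sumXY = 5.64 and sumX2 = 55, so b = (5 × 5.64 - 15 × 1.54) / (5 × 55 - 15²) = 0.102 and a = (1.54 - 0.102 × 15) / 5 = 0.002. The program should therefore print approximately y = 0.0020 + 0.1020 * x.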
Thanks