Previous Lecture
Linear systems
Review example on Gauss-Seidel method
Nonlinear systems
Review example on Successive substitution
Review example on Newton-Raphson method
Least Squares Regression (Linear Regression)
Polynomial Regression
Multiple Regression
Nonlinear Regression
CURVE FITTING
Describes techniques to fit curves (curve fitting) to
discrete data to obtain intermediate estimates.
There are two general approaches for curve fitting:
1. The data exhibit a significant degree of scatter. The strategy is to
derive a single curve that represents the general trend of the data.
2. The data are very precise. The strategy is to pass a curve or a series
of curves through each of the points.
In engineering two types of applications are
encountered:
Trend analysis. Predicting values of the dependent variable; this may
include extrapolation beyond the data points or interpolation
between them.
Hypothesis testing. Comparing an existing mathematical model
with measured data.
Figure part4.1: least-squares regression; linear interpolation; curvilinear interpolation.
Mathematical Background
Simple Statistics
In the course of an engineering study, if several
measurements are made of a particular
quantity, additional insight can be gained by
summarizing the data in one or more well
chosen statistics that convey as much
information as possible about specific
characteristics of the data set.
These descriptive statistics are most often
selected to represent
The location of the center of the distribution of the
data,
The degree of spread of the data.
Arithmetic mean. The sum of the individual data
points (y_i) divided by the number of points (n):
    ȳ = (Σ y_i) / n,   i = 1, …, n
Standard deviation. The most common measure of
a spread for a sample.
St 2 2
Sy or 2
y
i yi / n
n 1 S y
St ( yi y ) 2 n 1
Variance. Representation of spread by the square of
the standard deviation:
    S_y² = Σ (y_i − ȳ)² / (n − 1),   where n − 1 = degrees of freedom
Coefficient of variation. Quantifies the spread of the
data relative to the mean:
    c.v. = (S_y / ȳ) × 100%
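A short Python sketch of the descriptive statistics above (mean, standard deviation, variance, coefficient of variation); the data values are made up for illustration:

```python
# Descriptive statistics for a sample, following the formulas above.
def describe(y):
    n = len(y)
    mean = sum(y) / n                         # ybar = (sum of yi) / n
    St = sum((yi - mean) ** 2 for yi in y)    # total sum of squares around the mean
    var = St / (n - 1)                        # Sy^2, with n - 1 degrees of freedom
    sd = var ** 0.5                           # Sy
    cv = sd / mean * 100                      # c.v. = (Sy / ybar) * 100%
    return mean, sd, var, cv

mean, sd, var, cv = describe([2, 4, 4, 4, 6])   # illustrative sample
```

For this sample the mean is 4, the variance is 2, and the coefficient of variation is about 35%.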
Mode?? Median?? Range??
Figure 13.3
A histogram used to depict the distribution of data.
Least Squares Regression
Linear Regression
Fitting a straight line to a set of paired
observations: (x₁, y₁), (x₂, y₂), …, (xₙ, yₙ).
y = a₀ + a₁x + e
a₁ – slope
a₀ – intercept
e – error, or residual, between the model
and the observations
Criteria for a "Best" Fit
Minimize the sum of the residual errors for all
available data:
    Σ e_i = Σ (y_i − a₀ − a₁x_i),   i = 1, …, n
n = total number of points
However, this is an inadequate criterion, and so is the
sum of the absolute values:
    Σ |e_i| = Σ |y_i − a₀ − a₁x_i|
Best strategy is to minimize the sum of the squares
of the residuals between the measured y and the y
calculated with the linear model:
    S_r = Σ e_i² = Σ (y_i,measured − y_i,model)² = Σ (y_i − a₀ − a₁x_i)²
Yields a unique line for a given set of data.
Least-Squares Fit of a Straight Line
    ∂S_r/∂a₀ = −2 Σ (y_i − a₀ − a₁x_i) = 0
    ∂S_r/∂a₁ = −2 Σ (y_i − a₀ − a₁x_i) x_i = 0
Expanding, and noting that Σ a₀ = n a₀:
    0 = Σ y_i − n a₀ − a₁ Σ x_i
    0 = Σ x_i y_i − a₀ Σ x_i − a₁ Σ x_i²
These are the normal equations, which can be solved simultaneously:
    n a₀ + (Σ x_i) a₁ = Σ y_i
    (Σ x_i) a₀ + (Σ x_i²) a₁ = Σ x_i y_i
giving
    a₁ = (n Σ x_i y_i − Σ x_i Σ y_i) / (n Σ x_i² − (Σ x_i)²)
    a₀ = ȳ − a₁ x̄    (x̄ and ȳ are the mean values of x and y)
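The closed-form solution of the normal equations can be sketched in plain Python; the four data points below are made up so that they fall exactly on the line y = 1 + 2x:

```python
# Least-squares fit of a straight line y = a0 + a1*x via the normal equations.
def linear_fit(x, y):
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope
    a0 = sy / n - a1 * sx / n                        # intercept: a0 = ybar - a1*xbar
    return a0, a1

a0, a1 = linear_fit([1, 2, 3, 4], [3, 5, 7, 9])      # exact line y = 1 + 2x
```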
Figure 13.7
The residual in linear regression
Figure 13.8
(a) The spread of the data around the mean value; (b) the spread of the data around the best-fit line.
Figure 13.9: Fits with small and large residual errors.
If:
Total sum of the squares around the mean for the
dependent variable, y, is St
Sum of the squares of residuals around the
regression line is Sr
St − Sr quantifies the improvement or error reduction
due to describing the data in terms of a straight line
rather than as an average value.
    S_t = Σ (y_i − ȳ)²
    S_r = Σ (y_i − a₀ − a₁x_i)²
    r² = (S_t − S_r) / S_t
r² – coefficient of determination
r = √r² – correlation coefficient
For a perfect fit:
Sr = 0 and r = r² = 1, signifying that the line
explains 100 percent of the variability of the data.
For r = r² = 0, Sr = St, and the fit represents no
improvement.
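A minimal sketch of computing r² = (St − Sr)/St for a fitted line; the data and the coefficients a₀ = 1, a₁ = 2 are illustrative (the points lie exactly on the line, so Sr = 0 and r² = 1):

```python
# Coefficient of determination for a straight-line fit y = a0 + a1*x.
def r_squared(x, y, a0, a1):
    ybar = sum(y) / len(y)
    St = sum((yi - ybar) ** 2 for yi in y)                       # spread around the mean
    Sr = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))   # spread around the line
    return (St - Sr) / St

r2 = r_squared([1, 2, 3, 4], [3, 5, 7, 9], 1.0, 2.0)   # perfect fit: r2 = 1.0
```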
Polynomial Regression
Some engineering data is poorly
represented by a straight line.
For these cases a curve is better suited to fit
the data.
The least-squares method can readily be
extended to fit the data to higher-order
polynomials.
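As a hedged sketch of that extension, the normal equations for a second-order polynomial y = a₀ + a₁x + a₂x² can be assembled and solved with NumPy; the data points below are made up so that they lie exactly on y = 1 − x + 2x²:

```python
import numpy as np

# Polynomial least squares: solve the normal equations (Z^T Z) a = Z^T y,
# where the columns of Z are 1, x, x^2, ..., x^degree.
def poly_fit(x, y, degree):
    x, y = np.asarray(x, float), np.asarray(y, float)
    Z = np.vander(x, degree + 1, increasing=True)   # columns 1, x, x^2, ...
    return np.linalg.solve(Z.T @ Z, Z.T @ y)        # coefficients a0, a1, ..., a_degree

a = poly_fit([0, 1, 2, 3, 4], [1, 2, 7, 16, 29], 2)   # exact: y = 1 - x + 2x^2
```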
Using Excel
You may also use the Excel built-in functions SLOPE and INTERCEPT.
    n a₀ + (Σ x_i) a₁ = Σ y_i
    (Σ x_i) a₀ + (Σ x_i²) a₁ = Σ x_i y_i
    a₁ = (n Σ x_i y_i − Σ x_i Σ y_i) / (n Σ x_i² − (Σ x_i)²)
    a₀ = ȳ − a₁ x̄
Using MATLAB
General Linear Least Squares
    y = a₀z₀ + a₁z₁ + a₂z₂ + … + a_m z_m + e
z₀, z₁, …, z_m are the m + 1 basis functions
    {Y} = [Z]{A} + {E}
[Z] – matrix of the calculated values of the basis functions
at the measured values of the independent variable
{Y} – observed values of the dependent variable
{A} – unknown coefficients
{E} – residuals
    S_r = Σ_{i=1..n} ( y_i − Σ_{j=0..m} a_j z_{ji} )²
S_r is minimized by taking its partial derivative w.r.t. each of the
coefficients and setting the resulting equation equal to zero.
For two independent variables (x₁ and x₂), the normal equations are:
    n a₀ + (Σ x₁ᵢ) a₁ + (Σ x₂ᵢ) a₂ = Σ y_i
    (Σ x₁ᵢ) a₀ + (Σ x₁ᵢ²) a₁ + (Σ x₁ᵢ x₂ᵢ) a₂ = Σ x₁ᵢ y_i
    (Σ x₂ᵢ) a₀ + (Σ x₁ᵢ x₂ᵢ) a₁ + (Σ x₂ᵢ²) a₂ = Σ x₂ᵢ y_i
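The general linear least-squares formulation {Y} = [Z]{A} + {E} can be sketched in Python with NumPy: each column of [Z] holds one basis function evaluated at the data points. Here the basis z₀ = 1, z₁ = x₁, z₂ = x₂ gives multiple linear regression; the function name `general_lsq` and the data are made up for illustration, with y generated from the exact plane y = 5 + 4x₁ − 3x₂ so the recovered coefficients can be checked:

```python
import numpy as np

# General linear least squares: solve the normal equations (Z^T Z) A = Z^T Y.
def general_lsq(Z, y):
    Z, y = np.asarray(Z, float), np.asarray(y, float)
    return np.linalg.solve(Z.T @ Z, Z.T @ y)

x1 = np.array([0.0, 2.0, 2.5, 1.0, 4.0, 7.0])   # illustrative data
x2 = np.array([0.0, 1.0, 2.0, 3.0, 6.0, 2.0])
y = 5.0 + 4.0 * x1 - 3.0 * x2                   # exact plane, for checking

# Basis functions z0 = 1, z1 = x1, z2 = x2 as columns of [Z].
Z = np.column_stack([np.ones_like(x1), x1, x2])
a = general_lsq(Z, y)                           # recovers a0 = 5, a1 = 4, a2 = -3
```

The same function handles polynomial regression or any other linear-in-the-coefficients model simply by changing the columns of [Z].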