9/29/2022
CE 206/CSE 274
Engineering Computation Sessional
Curve Fitting
Md. Shaharul Islam
Lecturer, CE Dept., MIST
Constructing new data
Data are often given for discrete values along a continuum. However, you
may require estimates at points between the discrete values.
If x = 2.5, y = ?
Interpolation is a method of constructing new data points from a discrete
set of known data points.
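As a small illustration, linear interpolation between tabulated points can be sketched with the MATLAB built-in interp1 function. The data values below are hypothetical, chosen only to show the call; they are not the data from the slide's figure.

```matlab
% Hypothetical data (assumed for illustration only)
x = [1 2 3 4];
y = [2 4 8 16];

% Estimate y at a point that lies between two known x values
xq = 2.5;
yq = interp1(x, y, xq);   % default method is 'linear'

fprintf('Estimated y at x = %.1f is %.2f\n', xq, yq);
```

With these assumed values the estimate lies halfway between y = 4 and y = 8, because x = 2.5 is halfway between x = 2 and x = 3.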
Interpolation
If the data are known to be very precise, the basic approach is to fit a curve or a
series of curves that pass directly through each of the points.
[Figure: two panels, linear interpolation and curvilinear interpolation, each marking the interpolated point.]
Curve fitting
If the data exhibit a significant degree of error or “scatter,” the strategy is to derive a
single curve that represents the general trend of the data.
In engineering and science one often
has a number of data points, as
obtained by sampling or some
experiment, and tries to construct a
function which closely fits those data
points. This is called curve fitting.
Least Square Regression
Where substantial error is associated with data, the best curve-fitting strategy is to derive an
approximating function that fits the shape or general trend of the data without necessarily matching the
individual points.
One approach to do this is to visually inspect the plotted data and then sketch a “best” line through the
points.
Some criterion must be devised to establish a basis for the fit.
One way to do this is to derive a curve that minimizes the discrepancy between the data points and the
curve.
One strategy for fitting a “best” line through the data is to minimize the sum of the squared errors
between the data points and the curve.
The technique is called least squares regression.
Least square linear regression
Error in Regression
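If x = x0, the modeled value from the equation is
ym = a0 + a1 x0
e is the error, or residual, between the observation y0 at the data point (x0, y0) and the model value ym:
e = y0 - ym = y0 - a0 - a1 x0
So the sum of the squares of the errors over all n data points is
$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$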
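A brief sketch of how the residuals and their sum of squares could be computed for a trial line. The data are the four-point set used in a later example of these notes; the trial coefficients a0 and a1 are arbitrary assumptions, picked only to show the calculation, and are not the best-fit values.

```matlab
x = [0 2 5 7];
y = [-1 5 12 20];

% Trial coefficients (assumed, not the least squares solution)
a0 = -1;
a1 = 3;

ym = a0 + a1.*x;    % modeled values: [-1 5 14 20]
e  = y - ym;        % residuals:      [0 0 -2 0]
Sr = sum(e.^2);     % sum of the squares of the errors: 4
```

Least squares regression looks for the a0 and a1 that make Sr as small as possible.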
Optimizing error in Linear Regression
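The sum of the squares of the errors is
$$S_r = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$
To determine the values of a0 and a1, the equation for Sr is differentiated with respect to each coefficient:
$$\frac{\partial S_r}{\partial a_0} = -2\sum (y_i - a_0 - a_1 x_i)$$
$$\frac{\partial S_r}{\partial a_1} = -2\sum \left[(y_i - a_0 - a_1 x_i)\,x_i\right]$$
Setting these derivatives equal to zero will result in a minimum Sr. If this is done, the equations can be expressed as
$$0 = \sum y_i - \sum a_0 - \sum a_1 x_i$$
$$0 = \sum x_i y_i - \sum a_0 x_i - \sum a_1 x_i^2$$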
Condition for error Optimization
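We can express the above equations as a set of two simultaneous
linear equations with two unknowns a0 and a1:
$$n\,a_0 + \left(\sum x_i\right) a_1 = \sum y_i$$
$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 = \sum x_i y_i$$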
Steps in curve fitting
1. Identify the dependent and independent variables. Do they maintain a linear
relationship? If not, transform the data to obtain a linear relationship
between the dependent and independent variables.
2. Prepare the co-efficient matrix, i.e. calculate ∑x, ∑y, ∑xy, etc. You can
use a for loop or the MATLAB built-in sum function.
3. Develop the simultaneous equations. Solve them using Cramer’s rule,
the substitution/elimination method, or the MATLAB built-in backslash
operator.
Example -1
∑x = 360, ∑y = 5135, ∑xy = 312850, ∑x² = 20400
Here, n = 8
MATLAB is a calculator.
You can do this calculation in MATLAB using a FOR loop/sum function.
Co-efficient matrix with sum function
>> x = [10 20 30 40 50 60 70 80];
>> y = [25 70 380 550 610 1220 830 1450];
>> sx = sum(x);
>> sy = sum(y);
>> sxy = sum(x.*y);
>> sx2 = sum(x.*x);
Co-efficient matrix with For loop
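A minimal sketch of the loop-based approach, assuming the same x and y vectors as in the sum-function version, might accumulate the four sums like this. The variable names follow the sum-function code; the loop index i is an arbitrary choice.

```matlab
x = [10 20 30 40 50 60 70 80];
y = [25 70 380 550 610 1220 830 1450];
n = length(x);

% Start every running total at zero
sx = 0; sy = 0; sxy = 0; sx2 = 0;

% Add the contribution of each data point in turn
for i = 1:n
    sx  = sx  + x(i);
    sy  = sy  + y(i);
    sxy = sxy + x(i)*y(i);
    sx2 = sx2 + x(i)^2;
end
```

The loop should give the same totals as the sum function; it is longer to write but makes each step of the accumulation explicit.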
Substitution method/ Elimination method
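We can express the above equations as a set of two simultaneous linear
equations with two unknowns a0 and a1.
$$n\,a_0 + \left(\sum x_i\right) a_1 = \sum y_i \qquad (1)$$
$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 = \sum x_i y_i \qquad (2)$$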
Substitution/ Elimination method
a1 = (n*sxy - sx*sy)/(n*sx2 - sx^2);
a0 = sy/n - a1*sx/n;
Polynomial Regression
Some data are poorly represented by a straight line.
For these cases, a curve would be better suited to fit the data.
2nd order polynomial regression
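For the model y = a0 + a1 x + a2 x², the simultaneous equations are
$$n\,a_0 + \left(\sum x_i\right) a_1 + \left(\sum x_i^2\right) a_2 = \sum y_i$$
$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 + \left(\sum x_i^3\right) a_2 = \sum x_i y_i$$
$$\left(\sum x_i^2\right) a_0 + \left(\sum x_i^3\right) a_1 + \left(\sum x_i^4\right) a_2 = \sum x_i^2 y_i$$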
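The three simultaneous equations for a second-order polynomial y = a0 + a1 x + a2 x² can be assembled and solved in the same way as for the straight line. The sketch below uses hypothetical data that lie exactly on y = 1 + 2x + 3x², so the expected coefficients are known in advance; the cross-check with the built-in polyfit function is an addition for verification only.

```matlab
% Hypothetical data lying exactly on y = 1 + 2x + 3x^2 (assumed for illustration)
x = [0 1 2 3 4];
y = [1 6 17 34 57];
n = length(x);

% Co-efficient matrix and right-hand side of the simultaneous equations
A = [n          sum(x)     sum(x.^2);
     sum(x)     sum(x.^2)  sum(x.^3);
     sum(x.^2)  sum(x.^3)  sum(x.^4)];
b = [sum(y); sum(x.*y); sum(x.^2.*y)];

a = A\b;                % a(1) = a0, a(2) = a1, a(3) = a2

% Cross-check: polyfit returns the highest power first, i.e. [a2 a1 a0]
p = polyfit(x, y, 2);
```

Because the assumed data contain no scatter, both a and p should recover the coefficients 1, 2 and 3 to within round-off.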
Cramer’s Rule
Determination of a0 and a1
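For reference, Cramer's rule applied to the two simultaneous equations of the straight-line fit can be written as below. This is the standard form of the rule; the symbol D for the determinant of the co-efficient matrix follows the MATLAB code in these notes.

```latex
D = \begin{vmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{vmatrix}, \qquad
a_0 = \frac{1}{D}\begin{vmatrix} \sum y_i & \sum x_i \\ \sum x_i y_i & \sum x_i^2 \end{vmatrix}, \qquad
a_1 = \frac{1}{D}\begin{vmatrix} n & \sum y_i \\ \sum x_i & \sum x_i y_i \end{vmatrix}
```

Each unknown is obtained by replacing its column of the co-efficient matrix with the right-hand side and dividing the resulting determinant by D.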
Solution using Cramer’s Rule
n = 8; sumx = 360; sumy = 5135;
sumxy = 312850; sumx2 = 20400;
D = det([n sumx; sumx sumx2]);          % determinant of the co-efficient matrix
a0 = det([sumy sumx; sumxy sumx2])/D;   % intercept
a1 = det([n sumy; sumx sumxy])/D;       % slope
Solution with backslash operator
We can express the above equations as a set of two simultaneous linear equations with
two unknowns a0 and a1.
Ax = b, where A is the co-efficient matrix, x = [a0; a1] holds the unknowns, and b is the right-hand side.
x = A\b or x = inv(A)*b
A = [n sumx ; sumx sumx2] ;
b = [sumy ; sumxy];
x = A\b;
Solution with backslash operator
n = 8; sumx = 360; sumy = 5135;
sumxy = 312850; sumx2 = 20400;
A = [n sumx; sumx sumx2];
b = [sumy; sumxy];
x = A\b;    % x(1) = a0, x(2) = a1
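As a quick consistency check, the backslash solution can be compared with the Cramer's rule solution for the same sums. The names x_bs and x_cr below are hypothetical, introduced only for this comparison.

```matlab
n = 8; sumx = 360; sumy = 5135;
sumxy = 312850; sumx2 = 20400;

A = [n sumx; sumx sumx2];
b = [sumy; sumxy];

% Backslash operator
x_bs = A\b;

% Cramer's rule, using the same determinants as in the notes
D = det(A);
x_cr = [det([sumy sumx; sumxy sumx2]); det([n sumy; sumx sumxy])]/D;

% Largest difference between the two solutions (expected to be round-off only)
disp(max(abs(x_bs - x_cr)))
```

Both approaches should return a0 of about -234.2857 and a1 of about 19.4702, the values reported for this data set in the linregr example of these notes.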
Example -1
Fit a first-order linear equation to the given data set.
a0 = -1.1379
a1 = 2.8966
ym = a0 + a1.*x;
plot(x, y, 'o')
hold on
plot(x, ym, 'r')
Or, in a single call:
plot(x, y, 'o', x, ym, 'r')
MATLAB Code using sum function
Fit a first-order linear equation to the given data set.
x = [0 2 5 7];
y = [-1 5 12 20];
n = length(x);
sx = sum(x);
sy = sum(y);
sx2 = sum(x.*x);
sxy = sum(x.*y);
sy2 = sum(y.*y);
a1 = (n*sxy - sx*sy)/(n*sx2 - sx^2);   % slope
a0 = sy/n - a1*sx/n;                   % intercept
xm = linspace(0,7);
ym = a1.*xm + a0;
plot(x, y, 'o', xm, ym, 'r')
Developed model
How well is the curve fitted?
Does the curve properly represent the trend of the given data?
y = -1.1379 + 2.8966 x
Correlation Co-efficient, r
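r² is called the coefficient of determination and r is the correlation coefficient:
$$r^2 = \frac{S_t - S_r}{S_t}$$
where St = ∑(yi - ȳ)² is the total sum of the squares around the mean of y.
For a perfect fit, Sr = 0 and r² = 1, signifying that the line explains 100% of the
variability of the data.
For r² = 0, Sr = St and the fit represents no improvement over the mean.
An alternative formulation for r that is more convenient for computer implementation is
$$r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}}$$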
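The two ways of obtaining r² can be compared numerically. The sketch below uses the four-point data set from the earlier example; St is taken as the sum of squared deviations of y from its mean, and the names r2_a and r2_b are hypothetical labels for the two routes.

```matlab
x = [0 2 5 7];
y = [-1 5 12 20];
n = length(x);

sx = sum(x);      sy = sum(y);
sx2 = sum(x.*x);  sxy = sum(x.*y);  sy2 = sum(y.*y);

% Least squares line
a1 = (n*sxy - sx*sy)/(n*sx2 - sx^2);
a0 = sy/n - a1*sx/n;

% Route 1: from the sums of squares St and Sr
St = sum((y - mean(y)).^2);        % spread of y around its mean
Sr = sum((y - a0 - a1.*x).^2);     % spread of y around the fitted line
r2_a = (St - Sr)/St;

% Route 2: the alternative formulation based on the running sums
r2_b = ((n*sxy - sx*sy)/sqrt(n*sx2 - sx^2)/sqrt(n*sy2 - sy^2))^2;
```

For a straight-line fit the two values should agree to within round-off, which is a useful check on either implementation.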
Rainfall-runoff relation
Best curve for a set of data
MATLAB Code using sum function
n = length(x);
sx = sum(x);
sy = sum(y);
sx2 = sum(x.*x);
sxy = sum(x.*y);
sy2 = sum(y.*y);
a(1) = (n*sxy - sx*sy)/(n*sx2 - sx^2);   % slope
a(2) = sy/n - a(1)*sx/n;                 % intercept
r2 = ((n*sxy - sx*sy)/sqrt(n*sx2 - sx^2)/sqrt(n*sy2 - sy^2))^2
M-file to implement linear regression
function [a, r2] = linregr(x,y)
% a(1) is the slope, a(2) is the intercept, r2 is the coefficient of determination
n = length(x);
sx = sum(x);
sy = sum(y);
sx2 = sum(x.*x);
sxy = sum(x.*y);
sy2 = sum(y.*y);
a(1) = (n*sxy - sx*sy)/(n*sx2 - sx^2);
a(2) = sy/n - a(1)*sx/n;
r2 = ((n*sxy - sx*sy)/sqrt(n*sx2 - sx^2)/sqrt(n*sy2 - sy^2))^2;
% Plot the data and the fitted line
xp = linspace(min(x),max(x));
yp = a(1)*xp + a(2);
plot(x,y,'o',xp,yp)

Command window:
>> x = [10 20 30 40 50 60 70 80];
>> y = [25 70 380 550 610 1220 830 1450];
>> [a, r2] = linregr(x,y)
a =
   19.4702 -234.2857
r2 =
    0.8805
Modulus of Elasticity, E
Up to the yield strength, stress (y) varies linearly with strain (x):
y = mx + c
m = Young’s Modulus (the slope of the line)
Modulus of Elasticity of Mild Steel
Stress (ksi)    Strain
0               0
5.9             0.0002
11.7            0.0004
17.5            0.0006
23.3            0.0008
29.1            0.0010
34.9            0.0012
40.7            0.0014
Fitted line: y = 29042 x
Young's Modulus = 29,042 ksi
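The slope quoted above can be reproduced with the linear regression formulas from these notes. In this sketch the strain is treated as the independent variable and the stress as the dependent variable; the variable names strain, stress, m and c are assumptions made for readability.

```matlab
strain = [0 0.0002 0.0004 0.0006 0.0008 0.0010 0.0012 0.0014];
stress = [0 5.9 11.7 17.5 23.3 29.1 34.9 40.7];   % ksi
n = length(strain);

sx  = sum(strain);
sy  = sum(stress);
sx2 = sum(strain.^2);
sxy = sum(strain.*stress);

m = (n*sxy - sx*sy)/(n*sx2 - sx^2);   % slope = Young's modulus, ksi
c = sy/n - m*sx/n;                    % intercept, ksi (close to zero here)

fprintf('E = %.0f ksi\n', m);
```

The fitted intercept is small compared with the stresses in the table, which is consistent with a stress-strain line that passes close to the origin.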
Shear Strength of Soil
Shear strength is defined as the
maximum shear stress that the soil may
sustain without experiencing failure.
Shear strength is a critical parameter in
geotechnical projects. It is needed to
derive the bearing capacity, design
retaining walls, evaluate the stability of
slopes and embankments, etc.
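τ = σ tanϕ + c, where τ is the shear strength, σ is the normal stress, ϕ is the friction angle, and c is the cohesion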
Shear Strength of Soil
Normal Stress (kPa)    Shear Strength (kPa)
100                    99.80
150                    135.03
200                    170.04
250                    205.15
300                    240.00
Fitted model: τ = σ tan35° + 30
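A sketch of how the shear strength parameters could be recovered from the tabulated values by linear regression, taking normal stress as the independent variable. The slope of the fitted line is tanϕ and the intercept is c; the variable names are assumptions, and atand is the MATLAB inverse tangent that returns degrees.

```matlab
sigma = [100 150 200 250 300];               % normal stress, kPa
tau   = [99.80 135.03 170.04 205.15 240.00]; % shear strength, kPa
n = length(sigma);

sx  = sum(sigma);
sy  = sum(tau);
sx2 = sum(sigma.^2);
sxy = sum(sigma.*tau);

slope = (n*sxy - sx*sy)/(n*sx2 - sx^2);   % tan(phi)
c     = sy/n - slope*sx/n;                % cohesion, kPa
phi   = atand(slope);                     % friction angle, degrees

fprintf('phi = %.1f deg, c = %.1f kPa\n', phi, c);
```

The fitted values come out close to the ϕ = 35° and c = 30 kPa of the model quoted with the table.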
Home Assignment
1. Write down the MATLAB code to fit a second-order polynomial to the given data. Also write down the
code to determine the value of correlation coefficient from Sr and St.
x 0 1 2 3 4 5
y 2.2 7.7 13.6 27.2 40.9 61.1
2. The following data was obtained during a tensile test on a mild steel specimen having an initial
diameter of 12.8 mm. Determine the Modulus of Elasticity of the mild steel.
Axial load (N) 0 12,700 25,400 38,100 50,800 76,200 89,100
Length (mm) 50.800 50.825 50.851 50.876 50.902 50.952 51.003
3. Before using a tacheometer for surveying work, it is required to determine the constants K and C. A
tacheometer is used to calculate the horizontal distance D using the relation D = KS + C, where S is the
staff intercept. Determine the tacheometric constants of the theodolite.
D 50 60 70 80 90
S 0.496 0.596 0.696 0.796 0.896
Home Assignment (Contd.)
4. A direct shear test was conducted for an over-consolidated clay. The size of the sample was 4 in. x 4 in.
in plan and 1 in. in height. The test result is presented in the following table. Determine the shear
strength parameters, cohesion and friction angle.
Normal load (lb) 160 320 480 640
Shear force (lb) 320 304 480 608
5. For the following rainfall-runoff data, develop a model to predict the runoff for a certain
rainfall. Which one will be the suitable model for the given data set- linear or parabolic curve?
Rainfall (mm) 400 800 1200 1400
Runoff (mm) 100 150 350 500
6. Write two MATLAB codes (M-files) that fit the equations y = a e^(bx) and y = a x^b. There is no need
to determine r²; only write the code to obtain the constants a and b. The code should plot
the given data and the modeled data in a single graph.