CSE 274 - Linear Regression


9/29/2022

CE 206/CSE 274
Engineering Computation Sessional

Curve Fitting

Md. Shaharul Islam


Lecturer, CE Dept., MIST

Constructing new data


• Data are often given for discrete values along a continuum. However, you
may require estimates at points between the discrete values.

• If x = 2.5, y = ?
• Interpolation is a method of constructing new data points from a discrete
set of known data points.


Interpolation
If the data are known to be very precise, the basic approach is to fit a curve or a
series of curves that pass directly through each of the points.

[Figure: linear interpolation (left) and curvilinear interpolation (right), each with the interpolated point marked]
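
As a minimal sketch of both approaches, assuming hypothetical data points (the x and y values below are illustrative and not from the lecture), MATLAB's built-in interp1 function can estimate y at x = 2.5:

```matlab
% Hypothetical, precisely known data points (illustrative values only)
x = [1 2 3 4];
y = [1 4 9 16];

xq = 2.5;                               % point at which an estimate is required

% Linear interpolation: straight line between the neighbours (2,4) and (3,9)
y_lin = interp1(x, y, xq, 'linear');

% Curvilinear interpolation: piecewise cubic spline through all the points
y_spl = interp1(x, y, xq, 'spline');

fprintf('linear: %.4f, spline: %.4f\n', y_lin, y_spl);
```

The linear estimate is 6.5, the midpoint of the two neighbouring y values, while the spline estimate follows the curvature of the data.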

Curve fitting
• If the data exhibit a significant degree of error or “scatter,” the strategy is to derive a
single curve that represents the general trend of the data.

• In engineering and science one often has a number of data points, as
obtained by sampling or some experiment, and tries to construct a
function which closely fits those data points. This is called curve fitting.


Least Squares Regression

• Where substantial error is associated with the data, the best curve-fitting strategy is to derive an
approximating function that fits the shape or general trend of the data without necessarily matching the
individual points.
• One approach is to visually inspect the plotted data and then sketch a “best” line through the
points.
• Because such a sketch is subjective, some criterion must be devised to establish a basis for the fit.
• One way to do this is to derive a curve that minimizes the discrepancy between the data points and the
curve.
• One strategy for fitting a “best” line through the data is to minimize the sum of the squared errors
between the data points and the curve.
• This technique is called least squares regression.

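Least squares linear regression

Error in Regression
• If x = x0, the modeled value from the equation is
ym = a0 + a1 x0
• e is the error, or residual, between the model and the observation (x0, y0):
e = y0 − ym = y0 − a0 − a1 x0

[Figure: observed point (x0, y0), modeled value ym on the fitted line at x0, and the residual e between them]

So the sum of the squares of the errors over all n data points is
$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$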
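
To make the residual and Sr concrete, here is a minimal sketch assuming the eight data points used in Example 1 of these notes and a trial line whose coefficients (a0 = -200, a1 = 19) are chosen arbitrarily for illustration:

```matlab
% Data from Example 1 of these notes
x = [10 20 30 40 50 60 70 80];
y = [25 70 380 550 610 1220 830 1450];

% Trial line (arbitrary, illustrative coefficients; not a fitted result)
a0 = -200;
a1 = 19;

ym = a0 + a1*x;        % modeled values at each x
e  = y - ym;           % residuals between observation and model
Sr = sum(e.^2);        % sum of the squares of the errors

fprintf('Sr for the trial line = %.1f\n', Sr);
```

Least squares regression looks for the a0 and a1 that make this Sr as small as possible.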


Optimizing error in Linear Regression


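The sum of the squares of the errors is
$$S_r = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$

• To determine the values of a0 and a1, the equation for Sr is differentiated
with respect to each coefficient:
$$\frac{\partial S_r}{\partial a_0} = -2 \sum (y_i - a_0 - a_1 x_i)$$
$$\frac{\partial S_r}{\partial a_1} = -2 \sum (y_i - a_0 - a_1 x_i)\, x_i$$

• Setting these derivatives equal to zero will result in a minimum Sr. If this is done, the
equations can be expressed as
$$0 = \sum y_i - \sum a_0 - \sum a_1 x_i$$
$$0 = \sum x_i y_i - \sum a_0 x_i - \sum a_1 x_i^2$$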

Condition for error Optimization


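We can express the above equations as a set of two simultaneous
linear equations with two unknowns a0 and a1:
$$n\,a_0 + \left(\sum x_i\right) a_1 = \sum y_i$$
$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 = \sum x_i y_i$$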


Steps in curve fitting


• Identify the dependent and independent variables. Do they have a linear
relationship? If not, transform the data to obtain a linear relationship
between the dependent and independent variables.

• Prepare the co-efficient matrix, i.e. calculate ∑x, ∑y, ∑xy, etc. You can
use a for loop or MATLAB's built-in sum function.

• Develop the simultaneous equations. Solve them using Cramer’s rule,
the substitution/elimination method, or MATLAB's built-in backslash
operator.

Example -1

∑x = 360, ∑y = 5135, ∑xy = 312850, ∑x² = 20400

Here, n = 8
• MATLAB can be used as a calculator.
• You can do this calculation in MATLAB using a for loop or the sum function.


Co-efficient matrix with sum function


>> x = [10 20 30 40 50 60 70 80];
>> y = [25 70 380 550 610 1220 830 1450];
>> sx = sum(x);
>> sy = sum(y);
>> sxy = sum(x.*y);
>> sx2 = sum(x.*x);

Co-efficient matrix with For loop
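
The code for this slide is not reproduced in the text. The following is a minimal sketch of one way to accumulate the same sums with a for loop, assuming the Example 1 data and the variable names used with the sum function:

```matlab
% Data from Example 1
x = [10 20 30 40 50 60 70 80];
y = [25 70 380 550 610 1220 830 1450];

n = length(x);
sx = 0; sy = 0; sxy = 0; sx2 = 0;   % running totals start at zero

for i = 1:n
    sx  = sx  + x(i);          % accumulates the sum of x
    sy  = sy  + y(i);          % accumulates the sum of y
    sxy = sxy + x(i)*y(i);     % accumulates the sum of x*y
    sx2 = sx2 + x(i)^2;        % accumulates the sum of x^2
end
```

After the loop, sx = 360, sy = 5135, sxy = 312850 and sx2 = 20400, the same values the sum function gives.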




Substitution method/ Elimination method


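We can express the above equations as a set of two simultaneous linear
equations with two unknowns a0 and a1.
$$n\,a_0 + \left(\sum x_i\right) a_1 = \sum y_i \qquad (1)$$
$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 = \sum x_i y_i \qquad (2)$$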

Substitution/ Elimination method


a1 = (n*sxy - sx*sy)/(n*sx2 - sx^2);
a0 = sy/n - a1*sx/n;
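
The two MATLAB expressions above follow from eliminating a0 between the two simultaneous equations. As a sketch of the algebra (sums run over all n data points): the first equation is solved for a0, that result is substituted into the second, and the second is then solved for a1.

```latex
\begin{aligned}
n\,a_0 + a_1 \sum x_i = \sum y_i
  \quad &\Rightarrow\quad a_0 = \frac{\sum y_i}{n} - a_1 \frac{\sum x_i}{n} \\
a_0 \sum x_i + a_1 \sum x_i^2 = \sum x_i y_i
  \quad &\Rightarrow\quad \left(\frac{\sum y_i}{n} - a_1 \frac{\sum x_i}{n}\right)\sum x_i + a_1 \sum x_i^2 = \sum x_i y_i \\
  &\Rightarrow\quad a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}
\end{aligned}
```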


Polynomial Regression
Some data are poorly represented by a straight line.
For these cases, a curve would be better suited to fit the data.

2nd order polynomial regression


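For the model y = a0 + a1 x + a2 x², the least squares conditions give three simultaneous linear equations:
$$n\,a_0 + \left(\sum x_i\right) a_1 + \left(\sum x_i^2\right) a_2 = \sum y_i$$
$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 + \left(\sum x_i^3\right) a_2 = \sum x_i y_i$$
$$\left(\sum x_i^2\right) a_0 + \left(\sum x_i^3\right) a_1 + \left(\sum x_i^4\right) a_2 = \sum x_i^2 y_i$$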


Cramer’s Rule

Determination of a0 and a1


Determination of a0 and a1
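
The slide content for this step is not reproduced in the text. As a sketch of what Cramer's rule gives for the two simultaneous equations in a0 and a1 (standard algebra, not copied from the slides), each unknown is a ratio of determinants, with the right-hand-side column replacing the column of the unknown being solved for:

```latex
D = \begin{vmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{vmatrix},
\qquad
a_0 = \frac{1}{D}\begin{vmatrix} \sum y_i & \sum x_i \\ \sum x_i y_i & \sum x_i^2 \end{vmatrix},
\qquad
a_1 = \frac{1}{D}\begin{vmatrix} n & \sum y_i \\ \sum x_i & \sum x_i y_i \end{vmatrix}
```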

Solution using Cramer’s Rule


n = 8, sumx = 360, sumy = 5135,
sumxy = 312850, sumx2 = 20400

D = det ( [n sumx ; sumx sumx2] );
a0 = det ( [sumy sumx ; sumxy sumx2] )/D;
a1 = det ( [n sumy ; sumx sumxy] )/D;


Solution with backslash operator


We can express the above equations as a set of two simultaneous linear equations with
two unknowns a0 and a1.

Ax = b, where A = [n ∑x ; ∑x ∑x²], x = [a0 ; a1] and b = [∑y ; ∑xy]
x = A\b or inv(A) * b

A = [n sumx ; sumx sumx2] ;
b = [sumy ; sumxy];
x = A\b;

Solution with backslash operator


n = 8, sumx = 360, sumy = 5135,
sumxy = 312850, sumx2 = 20400

A = [n sumx ; sumx sumx2] ;
b = [sumy ; sumxy];
x = A\b;
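
As a quick check, the following sketch runs the backslash solution with the sums above and compares it with Cramer's rule. The names x_bs and x_cr are introduced here for illustration only.

```matlab
% Sums for the Example 1 data (n = 8 points)
n = 8; sumx = 360; sumy = 5135; sumxy = 312850; sumx2 = 20400;

A = [n sumx; sumx sumx2];      % co-efficient matrix
b = [sumy; sumxy];             % right-hand side

x_bs = A\b;                    % backslash solution: x_bs(1) = a0, x_bs(2) = a1

% Cramer's rule for comparison
D = det(A);
x_cr = [det([sumy sumx; sumxy sumx2]); det([n sumy; sumx sumxy])] / D;

fprintf('a0 = %.4f, a1 = %.4f\n', x_bs(1), x_bs(2));
```

Both routes give a0 ≈ -234.2857 and a1 ≈ 19.4702. For larger systems, A\b is generally preferred over inv(A)*b because it avoids forming the inverse explicitly.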


Example -1
Fit a first order linear equation for the given data set-

a0 = -1.1379
a1 = 2.8966

ym = a0+a1.*x;

plot(x, y, 'o')
hold on
plot(x, ym, 'r')

Alternatively, in a single call:
plot(x, y, 'o', x, ym, 'r')

MATLAB Code using sum function


Fit a first order linear equation for the given data set-
x = [0 2 5 7];
y = [-1 5 12 20];

n = length(x);
sx = sum(x);
sy = sum(y);
sx2 = sum(x.*x);
sxy = sum(x.*y);
sy2 = sum(y.*y);

a1 = (n*sxy - sx*sy)/(n*sx2 - sx^2);
a0 = sy/n - a1*sx/n;

xm = linspace(0,7);
ym = a1.*xm+a0;
plot(x, y, 'o', xm, ym, 'r')


Developed model
y = -1.1379 + 2.8966 x

• How well does the curve fit the data?
• Does the curve properly represent the trend of the given data?

Correlation Co-efficient, r
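$$r^2 = \frac{S_t - S_r}{S_t}, \qquad S_t = \sum (y_i - \bar{y})^2$$

• St is the total sum of the squares of the deviations of the data from their mean, and Sr is the sum of the squares of the errors around the fitted line.
• r² is called the coefficient of determination and r is the correlation coefficient.
• For a perfect fit, Sr = 0 and r² = 1, signifying that the line explains 100% of the
variability of the data.
• For r² = 0, Sr = St and the fit represents no improvement over simply using the mean of the data.

An alternative formulation for r that is more convenient for computer implementation is-
$$r = \frac{n \sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{n \sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n \sum y_i^2 - \left(\sum y_i\right)^2}}$$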
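
As a minimal sketch, assuming the Example 1 data from these notes, r2 can be computed both from its definition, r2 = (St - Sr)/St, and from the sum-based formulation for r, to confirm that the two agree. The names r2_def and r2_alt are introduced here for illustration only.

```matlab
% Example 1 data
x = [10 20 30 40 50 60 70 80];
y = [25 70 380 550 610 1220 830 1450];
n = length(x);

% Least squares line y = a0 + a1*x
sx = sum(x); sy = sum(y); sxy = sum(x.*y); sx2 = sum(x.*x); sy2 = sum(y.*y);
a1 = (n*sxy - sx*sy)/(n*sx2 - sx^2);
a0 = sy/n - a1*sx/n;

% Definition: r2 = (St - Sr)/St
St = sum((y - mean(y)).^2);        % spread of the data around its mean
Sr = sum((y - a0 - a1*x).^2);      % spread of the data around the fitted line
r2_def = (St - Sr)/St;

% Alternative formulation using only the running sums
r = (n*sxy - sx*sy)/(sqrt(n*sx2 - sx^2)*sqrt(n*sy2 - sy^2));
r2_alt = r^2;

fprintf('r2 (definition) = %.4f, r2 (alternative) = %.4f\n', r2_def, r2_alt);
```

Both give r2 ≈ 0.8805 for these data.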


Rainfall-runoff relation

Best curve for a set of data




MATLAB Code using sum function


n = length(x);
sx = sum(x);
sy = sum(y);
sx2 = sum(x.*x);
sxy = sum(x.*y);
sy2 = sum(y.*y);

a(1) = (n*sxy - sx*sy)/(n*sx2 - sx^2);
a(2) = sy/n - a(1)*sx/n;
r2 = ((n*sxy - sx*sy)/sqrt(n*sx2 - sx^2)/sqrt(n*sy2 - sy^2))^2

M-file to implement linear regression


function [a, r2] = linregr (x,y)
% a(1) = slope, a(2) = intercept, r2 = coefficient of determination
n = length(x);
sx = sum(x);
sy = sum(y);
sx2 = sum(x.*x);
sxy = sum(x.*y);
sy2 = sum(y.*y);

a(1) = (n*sxy - sx*sy)/(n*sx2 - sx^2);
a(2) = sy/n - a(1)*sx/n;
r2 = ((n*sxy - sx*sy)/sqrt(n*sx2 - sx^2)/sqrt(n*sy2 - sy^2))^2;

% plot the data and the fitted line
xp = linspace(min(x),max(x));
yp = a(1)*xp+a(2);
plot(x,y,'o',xp,yp)

Command window:
>> x = [10 20 30 40 50 60 70 80];
>> y = [25 70 380 550 610 1220 830 1450];
>> [a, r2] = linregr(x,y)

a =
   19.4702 -234.2857
r2 =
    0.8805


Modulus of Elasticity, E

Up to the yield strength, the stress-strain relationship is linear:
y = mx + c
where y is the stress, x is the strain, and m = Young’s Modulus

Modulus of Elasticity of Mild Steel


Stress (ksi)    Strain
0               0
5.9             0.0002
11.7            0.0004
17.5            0.0006
23.3            0.0008
29.1            0.0010
34.9            0.0012
40.7            0.0014

Fitted line: y = 29042 x
Young's Modulus = 29,042 ksi
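
The slope reported above can be reproduced with the same sum-based formulas used for linear regression. This is a minimal sketch, assuming strain is the independent variable and stress the dependent variable; the variable names E and c are chosen here for illustration.

```matlab
% Stress-strain data for mild steel from the table above
strain = [0 0.0002 0.0004 0.0006 0.0008 0.0010 0.0012 0.0014];
stress = [0 5.9 11.7 17.5 23.3 29.1 34.9 40.7];    % ksi

n   = length(strain);
sx  = sum(strain);
sy  = sum(stress);
sxy = sum(strain.*stress);
sx2 = sum(strain.^2);

E = (n*sxy - sx*sy)/(n*sx2 - sx^2);   % slope a1 = Young's modulus, ksi
c = sy/n - E*sx/n;                    % intercept a0, ksi

fprintf('E = %.0f ksi, intercept = %.4f ksi\n', E, c);
```

The slope comes out at about 29,042 ksi, and the fitted intercept is small (about 0.06 ksi), so the line passes almost exactly through the origin.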


Shear Strength of Soil


Shear strength is defined as the maximum shear stress that the soil may
sustain without experiencing failure.
Shear strength is a critical parameter in geotechnical projects. It is needed to
derive the bearing capacity, design retaining walls, evaluate the stability of
slopes and embankments, etc.

τ = σ tanϕ + c
where τ is the shear strength, σ is the normal stress, ϕ is the friction angle and c is the cohesion.

Shear Strength of Soil


Normal Stress (kPa)    Shear Strength (kPa)
100                    99.80
150                    135.03
200                    170.04
250                    205.15
300                    240.00

Fitted line: τ = σ tan35° + 30
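
The fitted line for these data can be recovered by regressing shear strength on normal stress: the slope equals tanϕ and the intercept equals c. This is a minimal sketch under that assumption; the variable names are chosen here for illustration.

```matlab
sigma = [100 150 200 250 300];                  % normal stress, kPa
tau   = [99.80 135.03 170.04 205.15 240.00];    % shear strength, kPa

n   = length(sigma);
ss  = sum(sigma);
st  = sum(tau);
sst = sum(sigma.*tau);
ss2 = sum(sigma.^2);

a1 = (n*sst - ss*st)/(n*ss2 - ss^2);   % slope = tan(phi)
a0 = st/n - a1*ss/n;                   % intercept = cohesion c, kPa

phi = atand(a1);                       % friction angle in degrees
c   = a0;

fprintf('phi = %.1f deg, c = %.1f kPa\n', phi, c);
```

The result is a friction angle of approximately 35° and a cohesion of approximately 30 kPa.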


Home Assignment
1. Write down the MATLAB code to fit a second-order polynomial to the given data. Also write down the
code to determine the value of the correlation coefficient from Sr and St.

x    0      1      2       3       4       5
y    2.2    7.7    13.6    27.2    40.9    61.1

2. The following data were obtained during a tensile test on a mild steel specimen having an initial
diameter of 12.8 mm. Determine the Modulus of Elasticity of the mild steel.

Axial load (N)    0         12,700    25,400    38,100    50,800    76,200    89,100
Length (mm)       50.800    50.825    50.851    50.876    50.902    50.952    51.003

3. Before using a tacheometer for surveying work, it is required to determine the constants K and C. A
tacheometer is used to calculate the horizontal distance D using the relation D = KS + C, where S is the
staff intercept. Determine the tacheometric constants of the theodolite.

D    50       60       70       80       90
S    0.496    0.596    0.696    0.796    0.896

Home Assignment (Contd.)


4. A direct shear test was conducted for an over-consolidated clay. The size of the sample was 4 in. x 4 in.
in plan and 1 in. in height. The test result is presented in the following table. Determine the shear
strength parameters, cohesion and friction angle.

Normal load (lb)    160    320    480    640
Shear force (lb)    320    304    480    608

5. For the following rainfall-runoff data, develop a model to predict the runoff for a certain
rainfall. Which will be the more suitable model for the given data set: a linear or a parabolic curve?

Rainfall (mm)    400    800    1200    1400
Runoff (mm)      100    150    350     500

6. Write two MATLAB codes (M-files) that will fit the equations y = a e^(bx) and y = a x^b. There is no need to
determine r2. Only write down the code to obtain the constants a and b. The code should plot
the given data and the modeled data in a single graph.
