0% found this document useful (0 votes)
65 views71 pages

Solving Nonlinear Equations with MATLAB

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
65 views71 pages

Solving Nonlinear Equations with MATLAB

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

UNIT IV

Nonlinear Equations System of Non - linear equations , Solving System of Equations Using MATLAB
function fsolve , Interpolation Lagrange Interpolation . Two dimensional Interpolation . Straight line
fit using Least Square Method , Curve fitting using built - in functions ployval and polyfit , cubic fit
using least square method . Finding roots of a polynomial - roots function , Newton - Raphson
Method .

Nonlinear Equations System of Non - linear equations

Systems of Nonlinear Equations


Solve systems of nonlinear equations in serial or parallel

Find a solution to a multivariable nonlinear equation F(x) = 0. You can also solve a scalar equation or
linear system of equations, or a system represented by F(x) = G(x) in the problem-based approach
(equivalent to F(x) – G(x) = 0 in the solver-based approach). For nonlinear systems, solvers convert
the equation-solving problem to the optimization problem of minimizing the sum of squares of the
components of F, namely min(∑Fi2(x)). Linear and scalar equations have different solution
algorithms; see Equation Solving Algorithms.
Before you begin to solve an optimization problem, you must choose the appropriate approach:
problem-based or solver-based. For details, see First Choose Problem-Based or Solver-Based
Approach.
For the problem-based approach, create problem variables, and then represent the equations in terms
of these variables. For the problem-based steps to take, see Problem-Based Workflow for Solving
Equations. To solve the resulting problem, use solve.
For the solver-based steps to take, including defining the objective function and choosing the
appropriate solver, see Solver-Based Optimization Problem Setup.

Functions
collapse all
Solve and Analyze, Problem-Based

eqnproblem Create equation problem

evaluate Evaluate optimization expression

infeasibility Constraint violation at a point

optimeq Create empty optimization equality array

optimvar Create optimization variables

Convert optimization problem or equation problem to


prob2struct
solver form

show Display information about optimization object


solve Solve optimization problem or equation problem

write Save optimization object description

Solve Equations, Solver-Based

fsolve Solve system of nonlinear equations

fzero Root of nonlinear function

lsqlin Solve constrained linear least-squares problems

Solve nonlinear least-squares (nonlinear data-fitting)


lsqnonlin
problems

Live Editor Tasks

Optimize Optimize or solve equations in the Live Editor


Objects

EquationProblem System of nonlinear equations

OptimizationEquality Equalities and equality constraints

OptimizationExpression
Arithmetic or functional expression in terms of
optimization variables
OptimizationVariable Variable for optimization

Solving System of Equations Using MATLAB function fsolve

fsolve
Solve system of nonlinear equations

collapse all in page


Syntax
x = fsolve(fun,x0)
x = fsolve(fun,x0,options)
x = fsolve(problem)
[x,fval] = fsolve(___)
[x,fval,exitflag,output] = fsolve( ___)
[x,fval,exitflag,output,jacobian] = fsolve( ___)

Description
Nonlinear system solver
Solves a problem specified by
F(x) = 0
for x, where F(x) is a function that returns a vector value.
x is a vector or a matrix; see Matrix Arguments.

example
x = fsolve(fun,x0) starts at x0 and tries to solve the equations fun(x) = 0, an array of zeros.
Note
Passing Extra Parameters explains how to pass extra parameters to the vector function fun(x), if
necessary. See Solve Parameterized Equation.

example
x = fsolve(fun,x0,options) solves the equations with the optimization options specified
in options. Use optimoptions to set these options.
example
x = fsolve(problem) solves problem, a structure described in problem.
example
[x,fval] = fsolve(___), for any syntax, returns the value of the objective function fun at the
solution x.
example
[x,fval,exitflag,output] = fsolve(___) additionally returns a value exitflag that describes
the exit condition of fsolve, and a structure output with information about the optimization process.
[x,fval,exitflag,output,jacobian] = fsolve(___) returns the Jacobian of fun at the
solution x.
Examples
collapse all
Solution of 2-D Nonlinear System
Try This ExampleCopy Command Copy Code
This example shows how to solve two nonlinear equations in two variables. The equations are

−(x +x )
e−e 1 2 =x2(1+x21)x1cos(x2)+x2sin(x1)=12.
Convert the equations to the form F(x)=0.

−(x +x )
e−e 1 2 −x2(1+x21)=0x1cos(x2)+x2sin(x1)−12=0.
The root2d.m function, which is available when you run this example, computes the values.
type root2d.m
function F = root2d(x)

F(1) = exp(-exp(-(x(1)+x(2)))) - x(2)*(1+x(1)^2);


F(2) = x(1)*cos(x(2)) + x(2)*sin(x(1)) - 0.5;
Solve the system of equations starting at the point [0,0].
fun = @root2d;
x0 = [0,0];
x = fsolve(fun,x0)
Equation solved.

fsolve completed because the vector of function values is near zero


as measured by the value of the function tolerance, and
the problem appears regular as measured by the gradient.
x = 1×2

0.3532 0.6061

Solution with Nondefault Options


Try This ExampleCopy Command Copy Code
Examine the solution process for a nonlinear system.
Set options to have no display and a plot function that displays the first-order optimality, which should
converge to 0 as the algorithm iterates.
options =
optimoptions('fsolve','Display','none','PlotFcn',@optimplotfirstorderopt);
The equations in the nonlinear system are

−(x +x )
e−e 1 2 =x2(1+x21)x1cos(x2)+x2sin(x1)=12.

Convert the equations to the form F(x)=0.

−(x +x )
e−e 1 2 −x2(1+x21)=0x1cos(x2)+x2sin(x1)−12=0.
The root2d function computes the left-hand side of these two equations.
type root2d.m
function F = root2d(x)

F(1) = exp(-exp(-(x(1)+x(2)))) - x(2)*(1+x(1)^2);


F(2) = x(1)*cos(x(2)) + x(2)*sin(x(1)) - 0.5;
Solve the nonlinear system starting from the point [0,0] and observe the solution process.
fun = @root2d;
x0 = [0,0];
x = fsolve(fun,x0,options)
x = 1×2

0.3532 0.6061

Solve Parameterized Equation


Try This ExampleCopy Command Copy Code
You can parameterize equations as described in the topic Passing Extra Parameters. For example,
the paramfun helper function at the end of this example creates the following equation system
parameterized by c:

2x1+x2=exp(cx1)−x1+2x2=exp(cx2).

To solve the system for a particular value, in this case c=−1, set c in the workspace and create an
anonymous function in x from paramfun.
c = -1;
fun = @(x)paramfun(x,c);
Solve the system starting from the point x0 = [0 1].
x0 = [0 1];
x = fsolve(fun,x0)
Equation solved.

fsolve completed because the vector of function values is near zero


as measured by the value of the function tolerance, and
the problem appears regular as measured by the gradient.
x = 1×2
0.1976 0.4255

To solve for a different value of c, enter c in the workspace and create the fun function again, so it
has the new c value.
c = -2;
fun = @(x)paramfun(x,c); % fun now has the new c value
x = fsolve(fun,x0)
Equation solved.

fsolve completed because the vector of function values is near zero


as measured by the value of the function tolerance, and
the problem appears regular as measured by the gradient.
x = 1×2

0.1788 0.3418

Helper Function
This code creates the paramfun helper function.
function F = paramfun(x,c)
F = [ 2*x(1) + x(2) - exp(c*x(1))
-x(1) + 2*x(2) - exp(c*x(2))];
end
Solve a Problem Structure
Try This ExampleCopy Command Copy Code
Create a problem structure for fsolve and solve the problem.
Solve the same problem as in Solution with Nondefault Options, but formulate the problem using a
problem structure.
Set options for the problem to have no display and a plot function that displays the first-order
optimality, which should converge to 0 as the algorithm iterates.
[Link] =
optimoptions('fsolve','Display','none','PlotFcn',@optimplotfirstorderopt);
The equations in the nonlinear system are

−(x +x )
e−e 1 2 =x2(1+x21)x1cos(x2)+x2sin(x1)=12.

Convert the equations to the form F(x)=0.

−(x +x )
e−e 1 2 −x2(1+x21)=0x1cos(x2)+x2sin(x1)−12=0.
The root2d function computes the left-hand side of these two equations.
type root2d
function F = root2d(x)

F(1) = exp(-exp(-(x(1)+x(2)))) - x(2)*(1+x(1)^2);


F(2) = x(1)*cos(x(2)) + x(2)*sin(x(1)) - 0.5;
Create the remaining fields in the problem structure.
[Link] = @root2d;
problem.x0 = [0,0];
[Link] = 'fsolve';
Solve the problem.
x = fsolve(problem)

x = 1×2

0.3532 0.6061

Solution Process of Nonlinear System


Try This ExampleCopy Command Copy Code
This example returns the iterative display showing the solution process for the system of two
equations and two unknowns

−x −x
2x1−x2=e 1−x1+2x2=e 2.

Rewrite the equations in the form F(x) = 0:


−x −x
2x1−x2−e 1=0−x1+2x2−e 2=0.

Start your search for a solution at x0 = [-5 -5].


First, write a function that computes F, the values of the equations at x.
F = @(x) [2*x(1) - x(2) - exp(-x(1));
-x(1) + 2*x(2) - exp(-x(2))];
Create the initial point x0.
x0 = [-5;-5];
Set options to return iterative display.
options = optimoptions('fsolve','Display','iter');
Solve the equations.
[x,fval] = fsolve(F,x0,options)
Norm of First-order Trust-region
Iteration Func-count f(x) step optimality radius
0 3 47071.2 2.29e+04 1
1 6 12003.4 1 5.75e+03 1
2 9 3147.02 1 1.47e+03 1
3 12 854.452 1 388 1
4 15 239.527 1 107 1
5 18 67.0412 1 30.8 1
6 21 16.7042 1 9.05 1
7 24 2.42788 1 2.26 1
8 27 0.032658 0.759511 0.206 2.5
9 30 7.03149e-06 0.111927 0.00294 2.5
10 33 3.29525e-13 0.00169132 6.36e-07 2.5

Equation solved.

fsolve completed because the vector of function values is near zero


as measured by the value of the function tolerance, and
the problem appears regular as measured by the gradient.
x = 2×1

0.5671
0.5671

fval = 2×1
10-6 ×

-0.4059
-0.4059

The iterative display shows f(x), which is the square of the norm of the function F(x). This value
decreases to near zero as the iterations proceed. The first-order optimality measure likewise
decreases to near zero as the iterations proceed. These entries show the convergence of the
iterations to a solution. For the meanings of the other entries, see Iterative Display.
The fval output gives the function value F(x), which should be zero at a solution (to within
the FunctionTolerance tolerance).
Examine Matrix Equation Solution
Try This ExampleCopy Command Copy Code

Find a matrix X that satisfies

X∗X∗X=[1324],
starting at the point x0 = [1,1;1,1]. Create an anonymous function that calculates the matrix
equation and create the point x0.
fun = @(x)x*x*x - [1,2;3,4];
x0 = ones(2);
Set options to have no display.
options = optimoptions('fsolve','Display','off');
Examine the fsolve outputs to see the solution quality and process.
[x,fval,exitflag,output] = fsolve(fun,x0,options)
x = 2×2

-0.1291 0.8602
1.2903 1.1612

fval = 2×2
10-9 ×

-0.2742 0.1258
0.1876 -0.0864

exitflag = 1
output = struct with fields:
iterations: 11
funcCount: 52
algorithm: 'trust-region-dogleg'
firstorderopt: 4.0197e-10
message: 'Equation solved....'

The exit flag value 1 indicates that the solution is reliable. To verify this manually, calculate the
residual (sum of squares of fval) to see how close it is to zero.
sum(sum(fval.*fval))
ans = 1.3367e-19
This small residual confirms that x is a solution.
You can see in the output structure how many iterations and function evaluations fsolve performed
to find the solution.

Input Arguments
collapse all
fun — Nonlinear equations to solve
function handle | function name
Nonlinear equations to solve, specified as a function handle or function name. fun is a function that
accepts a vector x and returns a vector F, the nonlinear equations evaluated at x. The equations to
solve are F = 0 for all components of F. The function fun can be specified as a function handle for a
file
x = fsolve(@myfun,x0)

where myfun is a MATLAB® function such as


function F = myfun(x)
F = ... % Compute function values at x
fun can also be a function handle for an anonymous function.
x = fsolve(@(x)sin(x.*x),x0);
fsolve passes x to your objective function in the shape of the x0 argument. For example, if x0 is a 5-
by-3 array, then fsolve passes x to fun as a 5-by-3 array.
If the Jacobian can also be computed and the 'SpecifyObjectiveGradient' option is true, set by
options = optimoptions('fsolve','SpecifyObjectiveGradient',true)
the function fun must return, in a second output argument, the Jacobian value J, a matrix, at x.
If fun returns a vector (matrix) of m components and x has length n, where n is the length of x0, the
Jacobian J is an m-by-n matrix where J(i,j) is the partial derivative of F(i) with respect to x(j).
(The Jacobian J is the transpose of the gradient of F.)
Example: fun = @(x)x*x*x-[1,2;3,4]
Data Types: char | function_handle | string
x0 — Initial point
real vector | real array
Initial point, specified as a real vector or real array. fsolve uses the number of elements in and size
of x0 to determine the number and size of variables that fun accepts.
Example: x0 = [1,2,3,4]
Data Types: double
options — Optimization options
output of optimoptions | structure as optimset returns
Optimization options, specified as the output of optimoptions or a structure such
as optimset returns.
Some options apply to all algorithms, and others are relevant for particular algorithms.
See Optimization Options Reference for detailed information.
Some options are absent from the optimoptions display. These options appear in italics in the
following table. For details, see View Options.

All Algorithms

Algorithm Choose between 'trust-region-dogleg' (default), 'trust-region',


and 'levenberg-marquardt'.
The Algorithm option specifies a preference for which algorithm to use.
It is only a preference because for the trust-region algorithm, the
nonlinear system of equations cannot be underdetermined; that is, the
number of equations (the number of elements of F returned by fun) must
be at least as many as the length of x. Similarly, for the trust-region-
dogleg algorithm, the number of equations must be the same as the
length of x. fsolve uses the Levenberg-Marquardt algorithm when the
selected algorithm is unavailable. For more information on choosing the
algorithm, see Choosing the Algorithm.
To set some algorithm options using optimset instead of optimoptions:
• Algorithm — Set the algorithm to 'trust-region-
reflective' instead of 'trust-region'.
• InitDamping — Set the initial Levenberg-Marquardt
parameter λ by setting Algorithm to a cell array such
as {'levenberg-marquardt',.005}.
CheckGradients Compare user-supplied derivatives (gradients of objective or
constraints) to finite-differencing derivatives. The choices are true or
the default false.
For optimset, the name is DerivativeCheck and the values
are 'on' or 'off'. See Current and Legacy Option Names.
Diagnostics Display diagnostic information about the function to be minimized or
solved. The choices are 'on' or the default 'off'.
DiffMaxChange Maximum change in variables for finite-difference gradients (a positive
scalar). The default is Inf.
DiffMinChange Minimum change in variables for finite-difference gradients (a positive
scalar). The default is 0.
Display Level of display (see Iterative Display):
• 'off' or 'none' displays no output.
• 'iter' displays output at each iteration, and gives the default
exit message.
• 'iter-detailed' displays output at each iteration, and gives
the technical exit message.
• 'final' (default) displays just the final output, and gives the
default exit message.
• 'final-detailed' displays just the final output, and gives the
technical exit message.
FiniteDifferenceStepSize Scalar or vector step size factor for finite differences. When you
set FiniteDifferenceStepSize to a vector v, the forward finite
differences delta are
delta = v.*sign′(x).*max(abs(x),TypicalX);
where sign′(x) = sign(x) except sign′(0) = 1. Central finite
differences are
delta = v.*max(abs(x),TypicalX);
Scalar FiniteDifferenceStepSize expands to a vector. The default
is sqrt(eps) for forward finite differences, and eps^(1/3) for central
finite differences.
For optimset, the name is FinDiffRelStep. See Current and Legacy
Option Names.
FiniteDifferenceType Finite differences, used to estimate gradients, are
either 'forward' (default), or 'central' (centered). 'central' takes
twice as many function evaluations, but should be more accurate.
The algorithm is careful to obey bounds when estimating both types of
finite differences. So, for example, it could take a backward, rather than
a forward, difference to avoid evaluating at a point outside bounds.
For optimset, the name is FinDiffType. See Current and Legacy Option
Names.
FunctionTolerance Termination tolerance on the function value, a positive scalar. The
default is 1e-6. See Tolerances and Stopping Criteria.
For optimset, the name is TolFun. See Current and Legacy Option
Names.
FunValCheck Check whether objective function values are valid. 'on' displays an
error when the objective function returns a value that is complex, Inf,
or NaN. The default, 'off', displays no error.
MaxFunctionEvaluations Maximum number of function evaluations allowed, a positive integer.
The default is 100*numberOfVariables for the 'trust-region-
dogleg' and 'trust-region' algorithms,
and 200*numberOfVariables for the 'levenberg-marquardt' algorithm.
See Tolerances and Stopping Criteria and Iterations and Function
Counts.
For optimset, the name is MaxFunEvals. See Current and Legacy Option
Names.
MaxIterations Maximum number of iterations allowed, a positive integer. The default
is 400. See Tolerances and Stopping Criteria and Iterations and Function
Counts.
For optimset, the name is MaxIter. See Current and Legacy Option
Names.
OptimalityTolerance Termination tolerance on the first-order optimality (a positive scalar).
The default is 1e-6. See First-Order Optimality Measure.
Internally, the 'levenberg-marquardt' algorithm uses an optimality
tolerance (stopping criterion) of 1e-4 times FunctionTolerance and does
not use OptimalityTolerance.
OutputFcn Specify one or more user-defined functions that an optimization
function calls at each iteration. Pass a function handle or a cell array of
function handles. The default is none ([]). See Output Function and Plot
Function Syntax.
PlotFcn Plots various measures of progress while the algorithm executes; select
from predefined plots or write your own. Pass a built-in plot function
name, a function handle, or a cell array of built-in plot function names
or function handles. For custom plot functions, pass function handles.
The default is none ([]):
• 'optimplotx' plots the current point.
• 'optimplotfunccount' plots the function count.
• 'optimplotfval' plots the function value.
• 'optimplotstepsize' plots the step size.
• 'optimplotfirstorderopt' plots the first-order optimality
measure.
Custom plot functions use the same syntax as output functions.
See Output Functions for Optimization Toolbox and Output Function
and Plot Function Syntax.
For optimset, the name is PlotFcns. See Current and Legacy Option
Names.
SpecifyObjectiveGradient If true, fsolve uses a user-defined Jacobian (defined in fun), or
Jacobian information (when using JacobianMultiplyFcn), for the
objective function. If false (default), fsolve approximates the Jacobian
using finite differences.
For optimset, the name is Jacobian and the values are 'on' or 'off'.
See Current and Legacy Option Names.
StepTolerance Termination tolerance on x, a positive scalar. The default is 1e-6.
See Tolerances and Stopping Criteria.
For optimset, the name is TolX. See Current and Legacy Option Names.
TypicalX Typical x values. The number of elements in TypicalX is equal to the
number of elements in x0, the starting point. The default value
is ones(numberofvariables,1). fsolve uses TypicalX for scaling finite
differences for gradient estimation.
The trust-region-dogleg algorithm uses TypicalX as the diagonal
terms of a scaling matrix.
UseParallel When true, fsolve estimates gradients in parallel. Disable by setting to
the default, false. See Parallel Computing.
trust-region Algorithm
JacobianMultiplyFcn Jacobian multiply function, specified as a function handle. For large-
scale structured problems, this function computes the Jacobian matrix
product J*Y, J'*Y, or J'*(J*Y) without actually forming J. The function
is of the form
W = jmfun(Jinfo,Y,flag)

where Jinfo contains a matrix used to compute J*Y (or J'*Y,


or J'*(J*Y)). The first argument Jinfo must be the same as the second
argument returned by the objective function fun, for example, in
[F,Jinfo] = fun(x)

Y is a matrix that has the same number of rows as there are dimensions
in the problem. flag determines which product to compute:
• If flag == 0, W = J'*(J*Y).
• If flag > 0, W = J*Y.
• If flag < 0, W = J'*Y.
In each case, J is not formed explicitly. fsolve uses Jinfo to compute
the preconditioner. See Passing Extra Parameters for information on
how to supply values for any additional parameters jmfun needs.
Note
'SpecifyObjectiveGradient' must be set to true for fsolve to
pass Jinfo from fun to jmfun.

See Minimization with Dense Structured Hessian, Linear Equalities for


a similar example.
For optimset, the name is JacobMult. See Current and Legacy Option
Names.
JacobPattern Sparsity pattern of the Jacobian for finite differencing.
Set JacobPattern(i,j) = 1 when fun(i) depends on x(j). Otherwise,
set JacobPattern(i,j) = 0. In other words, JacobPattern(i,j) =
1 when you can have ∂fun(i)/∂x(j) ≠ 0.

Use JacobPattern when it is inconvenient to compute the Jacobian


matrix J in fun, though you can determine (say, by inspection)
when fun(i) depends on x(j). fsolve can approximate J via sparse
finite differences when you give JacobPattern.
In the worst case, if the structure is unknown, do not set JacobPattern.
The default behavior is as if JacobPattern is a dense matrix of ones.
Then fsolve computes a full finite-difference approximation in each
iteration. This can be very expensive for large problems, so it is usually
better to determine the sparsity structure.
MaxPCGIter Maximum number of PCG (preconditioned conjugate gradient)
iterations, a positive scalar. The default
is max(1,floor(numberOfVariables/2)). For more information,
see Equation Solving Algorithms.
PrecondBandWidth Upper bandwidth of preconditioner for PCG, a nonnegative integer. The
default PrecondBandWidth is Inf, which means a direct factorization
(Cholesky) is used rather than the conjugate gradients (CG). The direct
factorization is computationally more expensive than CG, but produces
a better quality step towards the solution. Set PrecondBandWidth to 0 for
diagonal preconditioning (upper bandwidth of 0). For some problems,
an intermediate bandwidth reduces the number of PCG iterations.
SubproblemAlgorithm Determines how the iteration step is calculated. The
default, 'factorization', takes a slower but more accurate step
than 'cg'. See Trust-Region Algorithm.
TolPCG Termination tolerance on the PCG iteration, a positive scalar. The
default is 0.1.
Levenberg-Marquardt Algorithm
InitDamping Initial value of the Levenberg-Marquardt parameter, a positive scalar.
Default is 1e-2. For details, see Levenberg-Marquardt Method.
ScaleProblem 'jacobian' can sometimes improve the convergence of a poorly scaled
problem. The default is 'none'.
Example: options = optimoptions('fsolve','FiniteDifferenceType','central')
problem — Problem structure
structure
Problem structure, specified as a structure with the following fields:

Field Name Entry

objective Objective function


Field Name Entry

x0 Initial point for x


solver 'fsolve'
options Options created with optimoptions
Data Types: struct

Output Arguments
collapse all
x — Solution
real vector | real array
Solution, returned as a real vector or real array. The size of x is the same as the size of x0.
Typically, x is a local solution to the problem when exitflag is positive. For information on the
quality of the solution, see When the Solver Succeeds.
fval — Objective function value at the solution
real vector
Objective function value at the solution, returned as a real vector. Generally, fval = fun(x).
exitflag — Reason fsolve stopped
integer
Reason fsolve stopped, returned as an integer.

1 Equation solved. First-order optimality is small.


2 Equation solved. Change in x smaller than the specified tolerance, or
Jacobian at x is undefined.
3 Equation solved. Change in residual smaller than the specified
tolerance.
4 Equation solved. Magnitude of search direction smaller than specified
tolerance.
0 Number of iterations exceeded [Link] or number of
function evaluations exceeded [Link].
-1 Output function or plot function stopped the algorithm.
-2 Equation not solved. The exit message can have more information.
-3 Equation not solved. Trust region radius became too small (trust-
region-dogleg algorithm).
output — Information about the optimization process
structure
Information about the optimization process, returned as a structure with fields:

iterations Number of iterations taken

funcCount Number of function evaluations


algorithm Optimization algorithm used
cgiterations Total number of PCG iterations ('trust-region' algorithm only)
stepsize Final displacement in x (not in 'trust-region-dogleg')
firstorderopt Measure of first-order optimality
message Exit message
jacobian — Jacobian at the solution
real matrix
Jacobian at the solution, returned as a real matrix. jacobian(i,j) is the partial derivative
of fun(i) with respect to x(j) at the solution x.

Limitations
• The function to be solved must be continuous.
• When successful, fsolve only gives one root.
• The default trust-region dogleg method can only be used when the system of equations is
square, i.e., the number of equations equals the number of unknowns. For the Levenberg-
Marquardt method, the system of equations need not be square.

Tips
• For large problems, meaning those with thousands of variables or more, save memory
(and possibly save time) by setting the Algorithm option to 'trust-region' and
the SubproblemAlgorithm option to 'cg'.

Algorithms
The Levenberg-Marquardt and trust-region methods are based on the nonlinear least-squares
algorithms also used in lsqnonlin. Use one of these methods if the system may not have a zero. The
algorithm still returns a point where the residual is small. However, if the Jacobian of the system is
singular, the algorithm might converge to a point that is not a solution of the system of equations
(see Limitations).
• By default fsolve chooses the trust-region dogleg algorithm. The algorithm is a variant of
the Powell dogleg method described in [8]. It is similar in nature to the algorithm
implemented in [7]. See Trust-Region-Dogleg Algorithm.
• The trust-region algorithm is a subspace trust-region method and is based on the interior-
reflective Newton method described in [1] and [2]. Each iteration involves the approximate
solution of a large linear system using the method of preconditioned conjugate gradients
(PCG). See Trust-Region Algorithm.
• The Levenberg-Marquardt method is described in references [4], [5], and [6].
See Levenberg-Marquardt Method.

Interpolation Lagrange Interpolation

Derivation of Lagrange Interpolation:


Consider a given set of k+1 points, (x0, y0) , (x1, y1), ( x2, y2)….. (xk, yk) where each points are
distinct.

Let’s assume a function L(xj) such that L(xj) = yj , j = 0, 1 , 3 , 3 . .. k

Observing the following points

• Lj(x) contains k factors in product and each factor has x



Now, consider what happens when this product is expanded.

Since the product skips m = j, when i = j then all terms are [xj – xm ] / [xj – xm] =1

Also, when i ≠ j, m ≠ j does not produce it and one term in the product will be for m = I, that
is, [xi – xi]/[xj – xi] = 0

Zeroing the entire product,

where, δij is Kronecker’s delta.

So, it can be written that:

Therefore, the considered function L(x) is a polynomial with degree at most k and where
L(xj) = yj

So, for all i ≠ j, lj(x) includes the term ( x – xi ) in the numerator, therefore the entire product
will be found to be zero at x = xj

This is the required formula which will also be used in the program code for Lagrange
Interpolation in MATLAB.

NOTE:
The Lagrange Interpolation formula can also be derived from Newton’s divided difference
formula.

When a polynomial function f(x) is be approximated with an n th degree polynomial, nth


divided difference of f(x) is constant and the (n+1)th divided difference is zero.

Mathematically, f[x0, x1, x2, x3, . . . . . .. . xn] = 0

By using the second property of divided difference, it can be written that

Simplifying this equation, we get

This can be represented as:

Lagrange Interpolation in MATLAB Code:


% Lagrange Interpolation MATLAB Program

function [P,R,S] = lagrangepoly(X,Y,XX)

X = [1 2 3 4 5 6 7 8]; % inputting the values of given x

Y = [0 1 0 1 0 1 0 1]; % inputting the values of given y

%[P,R,S] = lagrangepoly(X,Y);
xx = 0.5 : 0.01 : 8.5;

%plot(xx,polyval(P,xx),X,Y,'or',R,S,'.b',xx,spline(X,Y,xx),'--
g')

%grid

%axis([0.5 8.5 -5 5])

if size(X,1) > 1; X = X'; end % checking for parameters

if size(Y,1) > 1; Y = Y'; end

if size(X,1) > 1 || size(Y,1) > 1 || size(X,2) ~= size(Y,2)

error('both inputs must be equal-length vectors') %


displaying error

end % end of scope of if

N = length(X);

pvals = zeros(N,N);

% for evaluating the polynomial weights for each order

for i = 1:N

% the polynomial with roots may be values of X other than


this one

pp = poly(X( (1:N) ~= i));

pvals(i,:) = pp ./ polyval(pp, X(i));

end

P = Y*pvals;

if nargin==3

YY = polyval(P,XX); % output is YY with given XX

P = YY; % assigning to output


end

% end of scope of if

if nargout > 1 % checking for conndtions

R = roots( ((N-1):-1:1) .* P(1:(N-1)) );

if nargout > 2

% the evalustion of acual value at the poins of zero


derivative

S = polyval(P,R);

end

end
The above Matlab code for Lagrange method is written for interpolation of polynomials
fitting a set of points. The program uses a user-defined function named LAGRANGE(X, Y)
with two input parameters which are required to be row vectors.

The row vectors X and Y define a set of n points which are used in Lagrange method for the
determination of (n-1)th order polynomial in X which passes through these points. The
function of P in the program is to return the n coefficients which define the polynomial in
the same order as used by POLY and POLYVAL.

Similarly, R and S are defined to return x-coordinates and y-values at n-1 extreme of the
resulting polynomial. YY returns the value of the polynomial sampled at the points which
are specified in XX.

All the inputs which are required to give to the program are embedded in the source code.
The values of X and Y initially set in the program are:

X = [1 2 3 4 5 6 7 8] and Y = [0 1 0 1 0 1 0 1]

A sample output of this MATLAB program is given below:


Numerical Example in Lagrange Interpolation:
Now, let’s analyze Lagrange Interpolation and its Matlab code mathematically using a
different set of parameters. The question here is:

From the following sets of data, find the value of x corresponding to y=15 by using
Lagrange Interpolation.

(5,12), (6,13), (9,14), 11,16)

Solution

Given value of x and y are:

X: 5 6 9 11

Y: 12 13 14 16

By using the Lagrange Interpolation Formula:

x(y)=(y-13)(y-14)(y-16)*5/(12-13)(12-14)(12-16) + (y-12)(y-14)(y-16)*6/(13-12)(13-14)(13-16)
+ (y-12)(y-13)(y-16)*9/(14-12)(14-13)(14-16) + (y-12)(y-13)(y-14)*11/(16-12)(16-13)(16-14)

By substituting y= 15, we get x = 11.5; which is the required answer.

If you have any questions regarding Lagrange interpolation, its MATLAB code, or its
derivation, bring them up to us from the comments section. You can find more Numerical
methods tutorial using Matlab here.

Two dimensional Interpolation

interp2
Interpolation for 2-D gridded data in meshgrid format

collapse all in page

Syntax
Vq = interp2(X,Y,V,Xq,Yq)

Vq = interp2(V,Xq,Yq)

Vq = interp2(V)

Vq = interp2(V,k)
Vq = interp2(___,method)

Vq = interp2(___,method,extrapval)

Description
example
Vq = interp2(X,Y,V,Xq,Yq) returns interpolated values of a function of two variables at specific
query points using linear interpolation. The results always pass through the original sampling of the
function. X and Y contain the coordinates of the sample points. V contains the corresponding function
values at each sample point. Xq and Yq contain the coordinates of the query points.
Vq = interp2(V,Xq,Yq) assumes a default grid of sample points. The default grid points cover the
rectangular region, X=1:n and Y=1:m, where [m,n] = size(V). Use this syntax when you want to
conserve memory and are not concerned about the absolute distances between points.
Vq = interp2(V) returns the interpolated values on a refined grid formed by dividing the interval
between sample values once in each dimension.
example
Vq = interp2(V,k) returns the interpolated values on a refined grid formed by repeatedly halving
the intervals k times in each dimension. This results in 2^k-1 interpolated points between sample
values.
example
Vq = interp2(___,method) specifies an alternative interpolation
method: 'linear', 'nearest', 'cubic', 'makima', or 'spline'. The default method is 'linear'.
example
Vq = interp2(___,method,extrapval) also specifies extrapval, a scalar value that is assigned
to all queries that lie outside the domain of the sample points.
If you omit the extrapval argument for queries outside the domain of the sample points, then based
on the method argument interp2 returns one of the following:
• Extrapolated values for the 'spline' and 'makima' methods
• NaN values for other interpolation methods
Examples
collapse all
Interpolate over a Grid Using Default Method
Try This ExampleCopy Command Copy Code

Coarsely sample the peaks function.


[X,Y] = meshgrid(-3:3);
V = peaks(X,Y);
Plot the coarse sampling.
figure
surf(X,Y,V)
title('Original Sampling');
Create the query grid with spacing of 0.25.
[Xq,Yq] = meshgrid(-3:0.25:3);
Interpolate at the query points.
Vq = interp2(X,Y,V,Xq,Yq);
Plot the result.
figure
surf(Xq,Yq,Vq);
title('Linear Interpolation Using Finer Grid');
Interpolate over a Grid Using Cubic Method
Try This ExampleCopy Command Copy Code

Coarsely sample the peaks function.


[X,Y] = meshgrid(-3:3);
V = peaks(7);
Plot the coarse sampling.
figure
surf(X,Y,V)
title('Original Sampling');
Create the query grid with spacing of 0.25.
[Xq,Yq] = meshgrid(-3:0.25:3);
Interpolate at the query points, and specify cubic interpolation.
Vq = interp2(X,Y,V,Xq,Yq,'cubic');
Plot the result.
figure
surf(Xq,Yq,Vq);
title('Cubic Interpolation Over Finer Grid');
Refine Grayscale Image
Try This ExampleCopy Command Copy Code

Load some image data into the workspace.


load [Link]
colormap gray
Isolate a small region of the image and cast it to single-precision.
V = single(X(200:300,1:25));
Display the image region.
imagesc(V);
axis off
title('Original Image')
Insert interpolated values by repeatedly dividing the intervals between points of the refined grid five
times in each dimension.
Vq = interp2(V,5);
Display the result.
imagesc(Vq);
axis off
title('Linear Interpolation')
Evaluate Outside the Domain of X and Y
Try This ExampleCopy Command Copy Code

Coarsely sample a function over the range, [-2, 2] in both dimensions.


[X,Y] = meshgrid(-2:0.75:2);
R = sqrt(X.^2 + Y.^2)+ eps;
V = sin(R)./(R);
Plot the coarse sampling.
figure
surf(X,Y,V)
xlim([-4 4])
ylim([-4 4])
title('Original Sampling')
Create the query grid that extends beyond the domain of X and Y.
[Xq,Yq] = meshgrid(-3:0.2:3);
Perform cubic interpolation within the domain of X and Y, and assign all queries that fall outside to
zero.
Vq = interp2(X,Y,V,Xq,Yq,'cubic',0);
Plot the result.
figure
surf(Xq,Yq,Vq)
title('Cubic Interpolation with Vq=0 Outside Domain of X and Y');
Input Arguments
collapse all
X,Y — Sample grid points
matrices | vectors
Sample grid points, specified as real matrices or vectors. The sample grid points must be unique.
• If X and Y are matrices, then they contain the coordinates of a full grid (in meshgrid
format). Use the meshgrid function to create the X and Y matrices together. Both matrices
must be the same size.
• If X and Y are vectors, then they are treated as grid vectors. The values in both vectors
must be strictly monotonic, either increasing or decreasing.
Example: [X,Y] = meshgrid(1:30,-10:10)
Data Types: single | double
V — Sample values
matrix
Sample values, specified as a real or complex matrix. The size requirements for V depend on the size
of X and Y:
• If X and Y are matrices representing a full grid (in meshgrid format), then V must be the
same size as X and Y.
• If X and Y are grid vectors, then V must be a matrix containing length(Y) rows
and length(X) columns.
If V contains complex numbers, then interp2 interpolates the real and imaginary parts separately.
Example: rand(10,10)
Data Types: single | double
Complex Number Support: Yes
Xq,Yq — Query points
scalars | vectors | matrices | arrays
Query points, specified as a real scalars, vectors, matrices, or arrays.
• If Xq and Yq are scalars, then they are the coordinates of a single query point.
• If Xq and Yq are vectors of different orientations, then Xq and Yq are treated as grid
vectors.
• If Xq and Yq are vectors of the same size and orientation, then Xq and Yq are treated
as scattered points in 2-D space.
• If Xq and Yq are matrices, then they represent either a full grid of query points
(in meshgrid format) or scattered points.
• If Xq and Yq are N-D arrays, then they represent scattered points in 2-D space.
Example: [Xq,Yq] = meshgrid((1:0.1:10),(-5:0.1:0))
Data Types: single | double
k — Refinement factor
1 (default) | real, nonnegative, integer scalar
Refinement factor, specified as a real, nonnegative, integer scalar. This value specifies the number of
times to repeatedly divide the intervals of the refined grid in each dimension. This results in 2^k-
1 interpolated points between sample values.
If k is 0, then Vq is the same as V.
interp2(V,1) is the same as interp2(V).
The following illustration shows the placement of interpolated values (in red) among nine sample
values (in black) for k=2.

Example: interp2(V,2)
Data Types: single | double
method — Interpolation method
'linear' (default) | 'nearest' | 'cubic' | 'spline' | 'makima'
Interpolation method, specified as one of the options in this table.

Method Description Continuity Comments

'linear' The interpolated value at a query C0 • Requires at least two


point is based on linear grid points in each
interpolation of the values at dimension
neighboring grid points in each
Method Description Continuity Comments

respective dimension. This is the • Requires more


default interpolation method. memory
than 'nearest'
'nearest' The interpolated value at a query Discontinuous • Requires two grid
point is the value at the nearest points in each
sample grid point. dimension.
• Fastest computation
with modest memory
requirements
'cubic' The interpolated value at a query C1 • Grid must have
point is based on a cubic uniform spacing in
interpolation of the values at each dimension, but
neighboring grid points in each the spacing does not
respective dimension. The have to be the same
interpolation is based on a cubic for all dimensions
convolution. • Requires at least four
points in each
dimension
• Requires more
memory and
computation time
than 'linear'
'makima' Modified Akima cubic Hermite C1 • Requires at least 2
interpolation. The interpolated points in each
value at a query point is based on dimension
a piecewise function of • Produces fewer
polynomials with degree at most undulations
three evaluated using the values than 'spline'
of neighboring grid points in each • Computation time is
respective dimension. The Akima typically less
formula is modified to avoid than 'spline', but the
overshoots. memory requirements
are similar
'spline' The interpolated value at a query C2 • Requires four points
point is based on a cubic in each dimension
interpolation of the values at • Requires more
neighboring grid points in each memory and
respective dimension. The computation time
interpolation is based on a cubic than 'cubic'
spline using not-a-knot end
conditions.
extrapval — Function value outside domain of X and Y
scalar
Function value outside domain of X and Y, specified as a real or complex scalar. interp2 returns this
constant value for all points outside the domain of X and Y.
Example: 5
Example: 5+1i
Data Types: single | double
Complex Number Support: Yes

Output Arguments
collapse all
Vq — Interpolated values
scalar | vector | matrix
Interpolated values, returned as a real or complex scalar, vector, or matrix. The size and shape
of Vq depends on the syntax you use and, in some cases, the size and value of the input arguments.

Special
Syntaxes Size of Vq Example
Conditions

interp2(X,Y,V,Xq,Yq) Xq, Yq are Scalar size(Vq) = [1


interp2(V,Xq,Yq) scalars 1] when you
and variations of these syntaxes pass Xq and Yq as
that scalars.
include method or extrapval

Same as above Xq, Yq are Vector of same size If size(Xq) = [100


vectors of the and orientation 1]
same size and as Xq and Yq and size(Yq) =
orientation [100 1],
then size(Vq) =
[100 1].

Same as above Xq, Yq are Matrix in which the If size(Xq) = [1


vectors of number of rows 100]
mixed is length(Yq), and the and size(Yq) = [50
orientation number of columns 1],
is length(Xq) then size(Vq) =
[50 100].

Same as above Xq, Yq are Matrix or array of the If size(Xq) = [50


matrices or same size as Xq and Yq 25]
arrays of the and size(Yq) = [50
same size 25],
then size(Vq) =
[50 25].
Special
Syntaxes Size of Vq Example
Conditions

interp2(V,k) None Matrix in which the If size(V) = [10


and variations of this syntax that number of rows is: 20]
include method or extrapval 2^k * (size(V,1)- and k = 2,
1)+1, then size(Vq) =
[37 77].
and the number of
columns is:
2^k * (size(V,2)-
1)+1
More About
collapse all
Strictly Monotonic
A set of values that are always increasing or decreasing, without reversals. For example, the
sequence, a = [2 4 6 8] is strictly monotonic and increasing. The sequence, b = [2 4 4 6 8] is
not strictly monotonic because there is no change in value between b(2) and b(3). The sequence, c
= [2 4 6 8 6] contains a reversal between c(4) and c(5), so it is not monotonic at all.
Full Grid (in meshgrid Format)
For interp2, the full grid is a pair of matrices whose elements represent a grid of points over a
rectangular region. One matrix contains the x-coordinates, and the other matrix contains the y-
coordinates. The values in the x-matrix are strictly monotonic and increasing along the rows. The
values along its columns are constant. The values in the y-matrix are strictly monotonic and
increasing along the columns. The values along its rows are constant. Use the meshgrid function to
create a full grid that you can pass to interp2.
For example, the following code creates a full grid for the region, –1 ≤ x ≤ 3 and 1 ≤ y ≤ 4:
[X,Y] = meshgrid(-1:3,(1:4))
X =

-1 0 1 2 3
-1 0 1 2 3
-1 0 1 2 3
-1 0 1 2 3

Y =

1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 4 4 4
Grid vectors are a more compact format to represent a grid than the full grid. The relation between the
two formats and the matrix of sample values V is
Grid Vectors
For interp2, grid vectors consist of a pair of vectors that define the x- and y-coordinates in a grid. The
row vector defines x-coordinates, and the column vector defines y-coordinates.

For example, the following code creates the grid vectors that specify the region, –1 ≤ x ≤ 3 and 1 ≤ y ≤
4:
x = -1:3;
y = (1:4)';
Scattered Points
For interp2, scattered points consist of a pair of arrays that define a collection of points scattered in
2-D space. One array contains the x-coordinates, and the other contains the y-coordinates.
For example, the following code specifies the points, (2,7), (5,3), (4,1), and (10,9):
x = [2 5; 4 10];
y = [7 3; 1 9];
Straight line fit using Least Square Method

lsqr
Solve system of linear equations — least-squares method

collapse all in page


Syntax
x = lsqr(A,b)

x = lsqr(A,b,tol)

x = lsqr(A,b,tol,maxit)

x = lsqr(A,b,tol,maxit,M)

x = lsqr(A,b,tol,maxit,M1,M2)

x = lsqr(A,b,tol,maxit,M1,M2,x0)

[x,flag] = lsqr(___)

[x,flag,relres] = lsqr(___)

[x,flag,relres,iter] = lsqr(___)

[x,flag,relres,iter,resvec] = lsqr(___)

[x,flag,relres,iter,resvec,lsvec] = lsqr(___)

Description
example
x = lsqr(A,b) attempts to solve the system of linear equations A*x = b for x using the Least
Squares Method. lsqr finds a least squares solution for x that minimizes norm(b-A*x). When A is
consistent, the least squares solution is also a solution of the linear system. When the attempt is
successful, lsqr displays a message to confirm convergence. If lsqr fails to converge after the
maximum number of iterations or halts for any reason, it displays a diagnostic message that includes
the relative residual norm(b-A*x)/norm(b) and the iteration number at which the method stopped.
example
x = lsqr(A,b,tol) specifies a tolerance for the method. The default tolerance is 1e-6.
example
x = lsqr(A,b,tol,maxit) specifies the maximum number of iterations to use. lsqr displays a
diagnostic message if it fails to converge within maxit iterations.
example
x = lsqr(A,b,tol,maxit,M) specifies a preconditioner matrix M and computes x by effectively
−1
solving the system AM y=b for y, where y=Mx. Using a preconditioner matrix can improve the
numerical properties of the problem and the efficiency of the calculation.
example
x = lsqr(A,b,tol,maxit,M1,M2) specifies factors of the preconditioner matrix M such that M =
M1*M2.
example
x = lsqr(A,b,tol,maxit,M1,M2,x0) specifies an initial guess for the solution vector x. The default
is a vector of zeros.
example
[x,flag] = lsqr(___) returns a flag that specifies whether the algorithm successfully converged.
When flag = 0, convergence was successful. You can use this output syntax with any of the
previous input argument combinations. When you specify the flag output, lsqr does not display any
diagnostic messages.
example
[x,flag,relres] = lsqr(___) also returns the residual error of the computed solution x.
If flag is 0, then x is a least-squares solution that minimizes norm(b-A*x). If relres is small,
then x is also a consistent solution, since relres represents norm(b-A*x)/norm(b).
example
[x,flag,relres,iter] = lsqr(___) also returns the iteration number iter at which x was
computed.
example
[x,flag,relres,iter,resvec] = lsqr(___) also returns a vector of the residual norms at each
iteration, including the first residual norm(b-A*x0).
example
[x,flag,relres,iter,resvec,lsvec] = lsqr(___) also returns lsvec, which is an estimate of
the scaled normal equation error at each iteration.
Examples
collapse all
Iterative Solution to Linear System
Try This ExampleCopy Command Copy Code

Solve a rectangular linear system using lsqr with default settings, and then adjust the tolerance and
number of iterations used in the solution process.
Create a random sparse matrix A with 50% density. Also create a random vector b for the right-hand
side of Ax=b.
rng default
A = sprand(400,300,.5);
b = rand(400,1);
Solve Ax=b using lsqr. The output display includes the value of the relative residual
error .
‖b−Ax‖‖b‖
x = lsqr(A,b);
lsqr stopped at iteration 20 without converging to the desired tolerance 1e-06
because the maximum number of iterations was reached.
The iterate returned (number 20) has relative residual 0.26.
By default lsqr uses 20 iterations and a tolerance of 1e-6, but the algorithm is unable to converge in
those 20 iterations for this matrix. Since the residual is still large, it is a good indicator that more
iterations (or a preconditioner matrix) are needed. You also can use a larger tolerance to make it
easier for the algorithm to converge.
Solve the system again using a tolerance of 1e-4 and 70 iterations. Specify six outputs to return the
relative residual relres of the calculated solution, as well as the residual history resvec and the
least-squares residual history lsvec.
[x,flag,relres,iter,resvec,lsvec] = lsqr(A,b,1e-4,70);
flag
flag = 0
Since flag is 0, the algorithm was able to meet the desired error tolerance in the specified number of
iterations. You can generally adjust the tolerance and number of iterations together to make trade-
offs between speed and precision in this manner.
Examine the relative residual and least-squares residual of the calculated solution.
relres
relres = 0.2625
lsres = lsvec(end)
lsres = 2.7640e-04
These residual norms indicate that x is a least-squares solution, because relres is not smaller than
the specified tolerance of 1e-4. Since no consistent solution to the linear system exists, the best the
solver can do is to make the least-squares residual satisfy the tolerance.
Plot the residual histories. The relative residual resvec quickly reaches a minimum and cannot make
further progress, while the least-squares residual lsvec continues to be minimized on subsequent
iterations.
N = length(resvec);
semilogy(0:N-1,lsvec,'--o',0:N-1,resvec,'-o')
legend("Least-squares residual","Relative residual")

Using lsqr with Preconditioner


Try This ExampleCopy Command Copy Code

Examine the effect of using a preconditioner matrix with lsqr to solve a linear system.
Load west0479, a real 479-by-479 nonsymmetric sparse matrix.
load west0479
A = west0479;
Define b so that the true solution to Ax=b is a vector of all ones.
b = sum(A,2);
Set the tolerance and maximum number of iterations.
tol = 1e-12;
maxit = 20;
Use lsqr to find a solution at the requested tolerance and number of iterations. Specify six outputs to
return information about the solution process:
• x is the computed solution to A*x = b.
• fl is a flag indicating whether the algorithm converged.
• rr is the relative residual of the computed answer x.
• it is the iteration number when x was computed.
• rv is a vector of the residual history for ‖b−Ax‖.
• lsrv is a vector of the least squares residual history.
[x,fl,rr,it,rv,lsrv] = lsqr(A,b,tol,maxit);
fl
fl = 1
rr
rr = 0.0017
it
it = 20
Since fl = 1, the algorithm did not converge to the specified tolerance within the maximum number
of iterations.
To aid with the slow convergence, you can specify a preconditioner matrix. Since A is nonsymmetric,
use ilu to generate the preconditioner M=L U in factorized form. Specify a drop tolerance to ignore
nondiagonal entries with values smaller than 1e-6. Solve the preconditioned
−1(M x)=b for y=Mx by specifying L and U as the M1 and M2 inputs to lsqr.
system AM
setup = struct('type','ilutp','droptol',1e-6);
[L,U] = ilu(A,setup);
[x1,fl1,rr1,it1,rv1,lsrv1] = lsqr(A,b,tol,maxit,L,U);
fl1
fl1 = 0
rr1
rr1 = 7.0954e-14
it1
it1 = 13
The use of an ilu preconditioner produces a relative residual less than the prescribed tolerance of 1e-
12 at the 13th iteration. The output rv1(1) is norm(b), and the output rv1(end) is norm(b-A*x1).
You can follow the progress of lsqr by plotting the relative residuals at each iteration. Plot the
residual history of each solution with a line for the specified tolerance.
semilogy(0:length(rv)-1,rv/norm(b),'-o')
hold on
semilogy(0:length(rv1)-1,rv1/norm(b),'-o')
yline(tol,'r--');
legend('No preconditioner','ILU preconditioner','Tolerance','Location','East')
xlabel('Iteration number')
ylabel('Relative residual')
Supplying Initial Guess
Try This ExampleCopy Command Copy Code

Examine the effect of supplying lsqr with an initial guess of the solution.
Create a random rectangular sparse matrix. Use the sum of each row as the vector for the right-hand
side of Ax=b so that the expected solution for x is a vector of ones.
A = sprand(700,900,0.1);
b = sum(A,2);
Use lsqr to solve Ax=b twice: one time with the default initial guess, and one time with a good initial
guess of the solution. Use 75 iterations and the default tolerance for both solutions. Specify the initial
guess in the second solution as a vector with all elements equal to 0.99.
maxit = 75;
x1 = lsqr(A,b,[],maxit);
lsqr converged at iteration 64 to a solution with relative residual 8.7e-07.
x0 = 0.99*ones(size(A,2),1);
x2 = lsqr(A,b,[],maxit,[],[],x0);
lsqr converged at iteration 26 to a solution with relative residual 9.6e-07.
With an initial guess close to the expected solution, lsqr is able to converge in fewer iterations.
Returning Intermediate Results
You also can use the initial guess to get intermediate results by calling lsqr in a for-loop. Each call to
the solver performs a few iterations and stores the calculated solution. Then you use that solution as
the initial vector for the next batch of iterations.
For example, this code performs 100 iterations four times and stores the solution vector after each
pass in the for-loop:
x0 = zeros(size(A,2),1);
tol = 1e-8;
maxit = 100;
for k = 1:4
[x,flag,relres] = lsqr(A,b,tol,maxit,[],[],x0);
X(:,k) = x;
R(k) = relres;
x0 = x;
end
X(:,k) is the solution vector computed at iteration k of the for-loop, and R(k) is the relative residual
of that solution.
Using Function Handle Instead of Numeric Matrix
Try This ExampleCopy Command Copy Code

Solve a linear system by providing lsqr with a function handle that computes A*x and A'*x in place
of the coefficient matrix A.
Create a nonsymmetric tridiagonal matrix. Preview the matrix.
A = gallery('wilk',21) + diag(ones(20,1),1)
A = 21×21

10 2 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
1 9 2 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 1 8 2 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 1 7 2 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 1 6 2 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 1 5 2 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 1 4 2 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 3 2 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 2 2 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 1 2 0 0
0 0 0 0 0 0 0 0

Since this tridiagonal matrix has a special structure, you can represent the operation A*x with a
function handle. When A multiplies a vector, most of the elements in the resulting vector are zeros.
The nonzero elements in the result correspond with the nonzero tridiagonal elements of A.

The expression A x becomes:


A x= 1010⋮⋮02910⋯02⋱10⋯020⋱⋱⋯0⋱1⋱0⋯⋱⋱⋱10⋮⋮0210 x1x2x

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎣ ⎤⎥⎥⎥⎥⎥⎥⎥⎥⎦⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣
3⋮⋮x21 = 10x1+2x2x1+9x2+2x3⋮⋮x19+9x20+2x21x20+10x21 .

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦ ⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣ ⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦


The resulting vector can be written as the sum of three vectors:

A x= 10x1+2x2x1+9x2+2x3⋮⋮x19+9x20+2x21x20+10x21 = 0x1x2⋮x2

⎡⎢⎢⎢⎢⎢⎢⎢⎣
⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣ ⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦
0 + 10x19x2⋮9x2010x21 +2⋅ x2x3⋮x210 .

⎤⎥⎥⎥⎥⎥⎥⎥⎦ ⎡⎢⎢⎢⎢⎢⎢⎢⎣ ⎤⎥⎥⎥⎥⎥⎥⎥⎦ ⎡⎢⎢⎢⎢⎢⎢⎢⎣ ⎤⎥⎥⎥⎥⎥⎥⎥⎦

T x becomes:
Likewise, the expression for A

AT x= 1020⋮⋮01920⋯01⋱20⋯010⋱⋱⋯0⋱1⋱0⋯⋱⋱⋱20⋮⋮0110 x1x2

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎣ ⎤⎥⎥⎥⎥⎥⎥⎥⎥⎦⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣
x3⋮⋮x21 = 10x1+x22x1+9x2+x3⋮⋮2x19+9x20+x212x20+10x21 .

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦ ⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣ ⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦


AT x= 10x1+x22x1+9x2+x3⋮⋮2x19+9x20+x212x20+10x21 =2⋅ 0x1x2

⎡⎢⎢⎢⎢⎢⎢⎢⎣
⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣ ⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦
⋮x20 + 10x19x2⋮9x2010x21 + x2x3⋮x210 .

⎤⎥⎥⎥⎥⎥⎥⎥⎦ ⎡⎢⎢⎢⎢⎢⎢⎢⎣ ⎤⎥⎥⎥⎥⎥⎥⎥⎦ ⎡⎢⎢⎢⎢⎢⎢⎢⎣ ⎤⎥⎥⎥⎥⎥⎥⎥⎦


In MATLAB®, write a function that creates these vectors and adds them together, thus giving the
value of A*x or A'*x, depending on the flag input:
function y = afun(x,flag)
if strcmp(flag,'notransp') % Compute A*x
y = [0; x(1:20)] ...
+ [(10:-1:0)'; (1:10)'].*x ...
+ 2*[x(2:end); 0];
elseif strcmp(flag,'transp') % Compute A'*x
y = 2*[0; x(1:20)] ...
+ [(10:-1:0)'; (1:10)'].*x ...
+ [x(2:end); 0];
end
end
(This function is saved as a local function at the end of the example.)

Now, solve the linear system Ax=b by providing lsqr with the function handle that
calculates A*x and A'*x. Use a tolerance of 1e-6 and 25 iterations. Specify b as the row sums
of A so that the true solution for x is a vector of ones.
b = full(sum(A,2));
tol = 1e-6;
maxit = 25;
x1 = lsqr(@afun,b,tol,maxit)
lsqr converged at iteration 21 to a solution with relative residual 5.4e-13.
x1 = 21×1

1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000

Local Functions
function y = afun(x,flag)
if strcmp(flag,'notransp') % Compute A*x
y = [0; x(1:20)] ...
+ [(10:-1:0)'; (1:10)'].*x ...
+ 2*[x(2:end); 0];
elseif strcmp(flag,'transp') % Compute A'*x
y = 2*[0; x(1:20)] ...
+ [(10:-1:0)'; (1:10)'].*x ...
+ [x(2:end); 0];
end
end
Input Arguments
collapse all
A — Coefficient matrix
matrix | function handle
Coefficient matrix, specified as a matrix or function handle. This matrix is the coefficient matrix in the
linear system A*x = b. Generally, A is a large sparse matrix or a function handle that returns the
product of a large sparse matrix and column vector.
Specifying A as a Function Handle
You can optionally specify the coefficient matrix as a function handle instead of a matrix. The
function handle returns matrix-vector products instead of forming the entire coefficient matrix,
making the calculation more efficient.
To use a function handle, use the function signature function y = afun(x,opt). Parameterizing
Functions explains how to provide additional parameters to the function afun, if necessary. The
function afun must satisfy these conditions:

• afun(x,'notransp') returns the product A*x.


• afun(x,'transp') returns the product A'*x.
An example of an acceptable function is:
function y = afun(x,opt,B,C,n)
if strcmp(opt,'notransp')
y = [B*x(n+1:end); C*x(1:n)];
else
y = [C'*x(n+1:end); B'*x(1:n)];
end
The function afun uses the values in B and C to compute either A*x or A'*x (depending on the
specified flag) without actually forming the entire matrix.

Data Types: double | function_handle


Complex Number Support: Yes
b — Right-hand side of linear equation
column vector
Right-hand side of linear equation, specified as a column vector. The length of b must be equal
to size(A,1).
Data Types: double
Complex Number Support: Yes
tol — Method tolerance
[] or 1e-6 (default) | positive scalar
Method tolerance, specified as a positive scalar. Use this input to trade-off accuracy and runtime in
the calculation. lsqr must meet the tolerance within the number of allowed iterations to be
successful. A smaller value of tol means the answer must be more precise for the calculation to be
successful.
Data Types: double
maxit — Maximum number of iterations
[] or min(size(A,1),20) (default) | positive scalar integer
Maximum number of iterations, specified as a positive scalar integer. Increase the value of maxit to
allow more iterations for lsqr to meet the tolerance tol. Generally, a smaller value of tol means
more iterations are required to successfully complete the calculation.
M, M1, M2 — Preconditioner matrices (as separate arguments)
eye(size(A)) (default) | matrices | function handles
Preconditioner matrices, specified as separate arguments of matrices or function handles. You can
specify a preconditioner matrix M or its matrix factors M = M1*M2 to improve the numerical aspects of
the linear system and make it easier for lsqr to converge quickly. For square coefficient matrices,
you can use the incomplete matrix factorization functions ilu and ichol to generate preconditioner
matrices. You also can use equilibrate prior to factorization to improve the condition number of the
coefficient matrix. For more information on preconditioners, see Iterative Methods for Linear
Systems.
lsqr treats unspecified preconditioners as identity matrices.
Specifying M as a Function Handle
You can optionally specify any of M, M1, or M2 as function handles instead of matrices. The function
handle performs matrix-vector operations instead of forming the entire preconditioner matrix, making
the calculation more efficient.
To use a function handle, first create a function with the signature function y =
mfun(x,opt). Parameterizing Functions explains how to provide additional parameters to the
function mfun, if necessary. The function mfun must satisfy these conditions:

• mfun(x,'notransp') returns the value of M\x or M2\(M1\x).


• mfun(x,'transp') returns the value of M'\x or M1'\(M2'\x).
An example of an acceptable function is:
function y = mfun(x,opt,a,b)
if strcmp(opt,'notransp')
y = x.*a;
else
y = x.*b;
end
end
In this example the function mfun uses a and b to compute either M\x = x*a or M'\x =
x*b (depending on the specified flag) without actually forming the entire matrix M.

Data Types: double | function_handle


Complex Number Support: Yes
x0 — Initial guess
[] or a column vector of zeros (default) | column vector
Initial guess, specified as a column vector with length equal to size(A,2). If you can
provide lsqr with a more reasonable initial guess x0 than the default vector of zeros, then it can save
computation time and help the algorithm converge faster.
Data Types: double
Complex Number Support: Yes

Output Arguments
collapse all
x — Linear system solution
column vector
Linear system solution, returned as a column vector. This output gives the approximate solution to
the linear system A*x = b.

• If flag is 0 and relres <= tol, then x is a consistent solution to A*x = b.


• If flag is 0 but relres > tol, then x is the least squares solution that
minimizes norm(b-A*x). In this case, the lsvec output contains the scaled normal
equation error of x.
Whenever the calculation is not successful (flag ~= 0), the solution x returned by lsqr is the one
with minimal norm residual computed over all the iterations.
flag — Convergence flag
scalar
Convergence flag, returned as one of the scalar values in this table. The convergence flag indicates
whether the calculation was successful and differentiates between several different forms of failure.
Flag Value Convergence
0 Success — lsqr converged to the desired tolerance tol within maxit iterations.
1 Failure — lsqr iterated maxit iterations but did not converge.
2 Failure — The preconditioner matrix M or M = M1*M2 is ill conditioned.
3 Failure — lsqr stagnated after two consecutive iterations were the same.
4 Failure — One of the scalar quantities calculated by the lsqr algorithm became too small or to
relres — Relative residual error
scalar
Relative residual error, returned as a scalar. The relative residual error is an indication of how accurate
the returned answer x is. lsqr tracks the relative residual and least-squares residual at each iteration
in the solution process, and the algorithm converges when either residual meets the specified
tolerance tol. The relres output contains the value of the residual that converged, either the relative
residual or the least-squares residual:

• The relative residual error is equal to norm(b-A*x)/norm(b) and is generally the residual
that meets the tolerance tol when lsqr converges. The resvec output tracks the history
of this residual over all iterations.
• The least-squares residual error is equal to norm((A*inv(M))'*(B-
A*X))/norm(A*inv(M),'fro'). This residual causes lsqr to converge less frequently
than the relative residual. The lsvec output tracks the history of this residual over all
iterations.
iter — Iteration number
scalar
Iteration number, returned as a scalar. This output indicates the iteration number at which the
computed answer for x was calculated.
Data Types: double
resvec — Residual error
vector
Residual error, returned as a vector. The residual error norm(b-A*x) reveals how close the algorithm
is to converging for a given value of x. The number of elements in resvec is equal to the number of
iterations. You can examine the contents of resvec to help decide whether to change the values
of tol or maxit.
Data Types: double
lsvec — Scaled normal equation error
vector
Scaled normal equation error, returned as a vector. For each iteration, lsvec contains an estimate of
the scaled normal equation residual norm((A*inv(M))'*(B-A*X))/norm(A*inv(M),'fro'). The
number of elements in lsvec is equal to the number of iterations.

More About
collapse all
Least Squares Method
The least squares (LSQR) algorithm is an adaptation of the conjugate gradients (CG) method for
rectangular matrices. Analytically, LSQR for A*x = b produces the same residuals as CG for the
normal equations A'*A*x = A'*b, but LSQR possesses more favorable numeric properties and is
thus generally more reliable [1].
The least squares method is the only iterative linear system solver that can handle rectangular and
inconsistent coefficient matrices.
Tips
• Convergence of most iterative methods depends on the condition number of the
coefficient matrix, cond(A). When A is square, you can use equilibrate to improve its
condition number, and on its own this makes it easier for most iterative solvers to
converge. However, using equilibrate also leads to better quality preconditioner
matrices when you subsequently factor the equilibrated matrix B = R*P*A*C.
• You can use matrix reordering functions such as dissect and symrcm to permute the
rows and columns of the coefficient matrix and minimize the number of nonzeros when
the coefficient matrix is factored to generate a preconditioner. This can reduce the
memory and time required to subsequently solve the preconditioned linear system.
Curve fitting using built - in functions ployval and polyfit

Polynomial Curve Fitting


Try This ExampleCopy Command Copy Code
This example shows how to fit a polynomial curve to a set of data points using the polyfit function.
You can use polyfit to find the coefficients of a polynomial that fits a set of data in a least-squares
sense using the syntax
p = polyfit(x,y,n),
where:
• x and y are vectors containing the x and y coordinates of the data points
• n is the degree of the polynomial to fit
Create some x-y test data for five data points.
x = [1 2 3 4 5];
y = [5.5 43.1 128 290.7 498.4];
Use polyfit to find a third-degree polynomial that approximately fits the data.
p = polyfit(x,y,3)
p = 1×4

-0.1917 31.5821 -60.3262 35.3400

After you obtain the polynomial for the fit line using polyfit, you can use polyval to evaluate the
polynomial at other points that might not have been included in the original data.
Compute the values of the polyfit estimate over a finer domain and plot the estimate over the real
data values for comparison. Include an annotation of the equation for the fit line.
x2 = 1:.1:5;
y2 = polyval(p,x2);
plot(x,y,'o',x2,y2)
grid on
s = sprintf('y = (%.1f) x^3 + (%.1f) x^2 + (%.1f) x +
(%.1f)',p(1),p(2),p(3),p(4));
text(2,400,s)
polyval
Polynomial evaluation

collapse all in page

Syntax
y = polyval(p,x)

[y,delta] = polyval(p,x,S)

y = polyval(p,x,[],mu)

[y,delta] = polyval(p,x,S,mu)

Description
example
y = polyval(p,x) evaluates the polynomial p at each point in x. The argument p is a vector of
length n+1 whose elements are the coefficients (in descending powers) of an nth-degree polynomial:

p(x)=p1xn+p2xn−1+...+pnx+pn+1.
The polynomial coefficients in p can be calculated for different purposes by functions
like polyint, polyder, and polyfit, but you can specify any vector for the coefficients.
To evaluate a polynomial in a matrix sense, use polyvalm instead.
example
[y,delta] = polyval(p,x,S) uses the optional output structure S produced by polyfit to
generate error estimates. delta is an estimate of the standard error in predicting a future observation
at x by p(x).
example
y = polyval(p,x,[],mu) or [y,delta] = polyval(p,x,S,mu) use the optional
output mu produced by polyfit to center and scale the data. mu(1) is mean(x), and mu(2) is std(x).
Using these values, polyval centers x at zero and scales it to have unit standard deviation,
ˆx= ‾x .
x− σx
This centering and scaling transformation improves the numerical properties of the polynomial.
Examples
collapse all
Evaluate Polynomial at Several Points
Try This ExampleCopy Command Copy Code

2
Evaluate the polynomial p(x)=3x +2x+1 at the points x=5,7,9. The polynomial coefficients can be
represented by the vector [3 2 1].
p = [3 2 1];
x = [5 7 9];
y = polyval(p,x)
y = 1×3

86 162 262

Integrate Quartic Polynomial


Try This ExampleCopy Command Copy Code

Evaluate the definite integral

4 2
I=
∫3−1 (3x −4x +10x−25)dx.
4 2 3
Create a vector to represent the polynomial integrand 3x −4x +10x−25. The x term is absent and
thus has a coefficient of 0.
p = [3 0 -4 10 -25];
Use polyint to integrate the polynomial using a constant of integration equal to 0.
q = polyint(p)
q = 1×6

0.6000 0 -1.3333 5.0000 -25.0000 0

Find the value of the integral by evaluating q at the limits of integration.


a = -1;
b = 3;
I = diff(polyval(q,[a b]))
I = 49.0667
Linear Regression With Error Estimate
Try This ExampleCopy Command Copy Code

Fit a linear model to a set of data points and plot the results, including an estimate of a 95%
prediction interval.
Create a few vectors of sample data points (x,y). Use polyfit to fit a first degree polynomial to the
data. Specify two outputs to return the coefficients for the linear fit as well as the error estimation
structure.
x = 1:100;
y = -0.3*x + 2*randn(1,100);
[p,S] = polyfit(x,y,1);
Evaluate the first-degree polynomial fit in p at the points in x. Specify the error estimation structure as
the third input so that polyval calculates an estimate of the standard error. The standard error
estimate is returned in delta.
[y_fit,delta] = polyval(p,x,S);
Plot the original data, linear fit, and 95% prediction interval y±2Δ.
plot(x,y,'bo')
hold on
plot(x,y_fit,'r-')
plot(x,y_fit+2*delta,'m--',x,y_fit-2*delta,'m--')
title('Linear Fit of Data with 95% Prediction Interval')
legend('Data','Linear Fit','95% Prediction Interval')

Use Centering and Scaling to Improve Numerical Properties


Try This ExampleCopy Command Copy Code

Create a table of population data for the years 1750 - 2000 and plot the data points.
year = ([Link])';
pop = 1e6*[791 856 978 1050 1262 1544 1650 2532 6122 8170 11560]';
T = table(year, pop)
T=11×2 table
year pop
____ _________
1750 7.91e+08
1775 8.56e+08
1800 9.78e+08
1825 1.05e+09
1850 1.262e+09
1875 1.544e+09
1900 1.65e+09
1925 2.532e+09
1950 6.122e+09
1975 8.17e+09
2000 1.156e+10

plot(year,pop,'o')

Use polyfit with three outputs to fit a 5th-degree polynomial using centering and scaling, which
improves the numerical properties of the problem. polyfit centers the data in year at 0 and scales it
to have a standard deviation of 1, which avoids an ill-conditioned Vandermonde matrix in the fit
calculation.
[p,~,mu] = polyfit([Link], [Link], 5);
Use polyval with four inputs to evaluate p with the scaled years, (year-mu(1))/mu(2). Plot the
results against the original years.
f = polyval(p,year,[],mu);
hold on
plot(year,f)
hold off
Input Arguments
collapse all
p — Polynomial coefficients
vector
Polynomial coefficients, specified as a vector. For example, the vector [1 0 1] represents the
2+1, and the vector [3.13 -2.21 5.99] represents the
polynomial x
2
polynomial 3.13x −2.21x+5.99.
For more information, see Create and Evaluate Polynomials.
Data Types: single | double
Complex Number Support: Yes
x — Query points
vector
Query points, specified as a vector. polyval evaluates the polynomial p at the points in x and returns
the corresponding function values in y.
Data Types: single | double
Complex Number Support: Yes
S — Error estimation structure
structure
Error estimation structure. This structure is an optional output from [p,S] = polyfit(x,y,n) that
can be used to obtain error estimates. S contains the following fields:
Field Description

R Triangular factor from a QR decomposition of the Vandermonde matrix of x

df Degrees of freedom

normr Norm of the residuals

If the data in y is random, then an estimate of the covariance matrix


of p is (Rinv*Rinv')*normr^2/df, where Rinv is the inverse of R.
mu — Centering and scaling values
two-element vector
Centering and scaling values, specified as a two-element vector. This vector is an optional output
from [p,S,mu] = polyfit(x,y,n) that is used to improve the numerical properties of fitting and
evaluating the polynomial p. The value mu(1) is mean(x), and mu(2) is std(x). These values are
used to center the query points in x at zero with unit standard deviation.
Specify mu to evaluate p at the scaled points, (x - mu(1))/mu(2).

Output Arguments
collapse all
y — Function values
vector
Function values, returned as a vector of the same size as the query points x. The vector contains the
result of evaluating the polynomial p at each point in x.
delta — Standard error for prediction
vector
Standard error for prediction, returned as a vector of the same size as the query points x. Generally,
an interval of y ± Δ corresponds to a roughly 68% prediction interval for future observations of large
samples, and y ± 2Δ a roughly 95% prediction interval.
If the coefficients in p are least-squares estimates computed by polyfit, and the errors in the data
input to polyfit are independent, normal, and have constant variance, then y ± Δ is at least a 50%
prediction interval.

polyfit
Polynomial curve fitting

collapse all in page

Syntax
p = polyfit(x,y,n)

[p,S] = polyfit(x,y,n)

[p,S,mu] = polyfit(x,y,n)

Description
example
p = polyfit(x,y,n) returns the coefficients for a polynomial p(x) of degree n that is a best fit (in a
least-squares sense) for the data in y. The coefficients in p are in descending powers, and the length
of p is n+1

p(x)=p1xn+p2xn−1+...+pnx+pn+1.
[p,S] = polyfit(x,y,n) also returns a structure S that can be used as an input to polyval to
obtain error estimates.
example
[p,S,mu] = polyfit(x,y,n) performs centering and scaling to improve the numerical properties of
both the polynomial and the fitting algorithm. This syntax additionally returns mu, which is a two-
element vector with centering and scaling values. mu(1) is mean(x), and mu(2) is std(x). Using
these values, polyfit centers x at zero and scales it to have unit standard deviation,
ˆx= ‾x .
x− σx
Examples
collapse all
Fit Polynomial to Trigonometric Function
Try This ExampleCopy Command Copy Code

Generate 10 points equally spaced along a sine curve in the interval [0,4*pi].
x = linspace(0,4*pi,10);
y = sin(x);
Use polyfit to fit a 7th-degree polynomial to the points.
p = polyfit(x,y,7);
Evaluate the polynomial on a finer grid and plot the results.
x1 = linspace(0,4*pi);
y1 = polyval(p,x1);
figure
plot(x,y,'o')
hold on
plot(x1,y1)
hold off
Fit Polynomial to Set of Points
Try This ExampleCopy Command Copy Code

−1 at those
Create a vector of 5 equally spaced points in the interval [0,1], and evaluate y(x)=(1+x)
points.
x = linspace(0,1,5);
y = 1./(1+x);
Fit a polynomial of degree 4 to the 5 points. In general, for n points, you can fit a polynomial of
degree n-1 to exactly pass through the points.
p = polyfit(x,y,4);
Evaluate the original function and the polynomial fit on a finer grid of points between 0 and 2.
x1 = linspace(0,2);
y1 = 1./(1+x1);
f1 = polyval(p,x1);
Plot the function values and the polynomial fit in the wider interval [0,2], with the points used to
obtain the polynomial fit highlighted as circles. The polynomial fit is good in the
original [0,1] interval, but quickly diverges from the fitted function outside of that interval.
figure
plot(x,y,'o')
hold on
plot(x1,y1)
plot(x1,f1,'r--')
legend('y','y1','f1')
Fit Polynomial to Error Function
Try This ExampleCopy Command Copy Code

First generate a vector of x points, equally spaced in the interval [0,2.5], and then
evaluate erf(x) at those points.
x = (0:0.1:2.5)';
y = erf(x);
Determine the coefficients of the approximating polynomial of degree 6.
p = polyfit(x,y,6)
p = 1×7

0.0084 -0.0983 0.4217 -0.7435 0.1471 1.1064 0.0004

To see how good the fit is, evaluate the polynomial at the data points and generate a table showing
the data, fit, and error.
f = polyval(p,x);
T = table(x,y,f,y-f,'VariableNames',{'X','Y','Fit','FitError'})
T=26×4 table
X Y Fit FitError
___ _______ __________ ___________

0 0 0.00044117 -0.00044117
0.1 0.11246 0.11185 0.00060836
0.2 0.2227 0.22231 0.00039189
0.3 0.32863 0.32872 -9.7429e-05
0.4 0.42839 0.4288 -0.00040661
0.5 0.5205 0.52093 -0.00042568
0.6 0.60386 0.60408 -0.00022824
0.7 0.6778 0.67775 4.6383e-05
0.8 0.7421 0.74183 0.00026992
0.9 0.79691 0.79654 0.00036515
1 0.8427 0.84238 0.0003164
1.1 0.88021 0.88005 0.00015948
1.2 0.91031 0.91035 -3.9919e-05
1.3 0.93401 0.93422 -0.000211
1.4 0.95229 0.95258 -0.00029933
1.5 0.96611 0.96639 -0.00028097

In this interval, the interpolated values and the actual values agree fairly closely. Create a plot to show
how outside this interval, the extrapolated values quickly diverge from the actual data.
x1 = (0:0.1:5)';
y1 = erf(x1);
f1 = polyval(p,x1);
figure
plot(x,y,'o')
hold on
plot(x1,y1,'-')
plot(x1,f1,'r--')
axis([0 5 0 2])
hold off

Use Centering and Scaling to Improve Numerical Properties


Try This ExampleCopy Command Copy Code

Create a table of population data for the years 1750 - 2000 and plot the data points.
year = ([Link])';
pop = 1e6*[791 856 978 1050 1262 1544 1650 2532 6122 8170 11560]';
T = table(year, pop)
T=11×2 table
year pop
____ _________

1750 7.91e+08
1775 8.56e+08
1800 9.78e+08
1825 1.05e+09
1850 1.262e+09
1875 1.544e+09
1900 1.65e+09
1925 2.532e+09
1950 6.122e+09
1975 8.17e+09
2000 1.156e+10

plot(year,pop,'o')

Use polyfit with three outputs to fit a 5th-degree polynomial using centering and scaling, which
improves the numerical properties of the problem. polyfit centers the data in year at 0 and scales it
to have a standard deviation of 1, which avoids an ill-conditioned Vandermonde matrix in the fit
calculation.
[p,~,mu] = polyfit([Link], [Link], 5);
Use polyval with four inputs to evaluate p with the scaled years, (year-mu(1))/mu(2). Plot the
results against the original years.
f = polyval(p,year,[],mu);
hold on
plot(year,f)
hold off
Simple Linear Regression
Try This ExampleCopy Command Copy Code

Fit a simple linear regression model to a set of discrete 2-D data points.
Create a few vectors of sample data points (x,y). Fit a first degree polynomial to the data.
x = 1:50;
y = -0.3*x + 2*randn(1,50);
p = polyfit(x,y,1);
Evaluate the fitted polynomial p at the points in x. Plot the resulting linear regression model with the
data.
f = polyval(p,x);
plot(x,y,'o',x,f,'-')
legend('data','linear fit')
Linear Regression With Error Estimate
Try This ExampleCopy Command Copy Code

Fit a linear model to a set of data points and plot the results, including an estimate of a 95%
prediction interval.
Create a few vectors of sample data points (x,y). Use polyfit to fit a first degree polynomial to the
data. Specify two outputs to return the coefficients for the linear fit as well as the error estimation
structure.
x = 1:100;
y = -0.3*x + 2*randn(1,100);
[p,S] = polyfit(x,y,1);
Evaluate the first-degree polynomial fit in p at the points in x. Specify the error estimation structure as
the third input so that polyval calculates an estimate of the standard error. The standard error
estimate is returned in delta.
[y_fit,delta] = polyval(p,x,S);
Plot the original data, linear fit, and 95% prediction interval y±2Δ.
plot(x,y,'bo')
hold on
plot(x,y_fit,'r-')
plot(x,y_fit+2*delta,'m--',x,y_fit-2*delta,'m--')
title('Linear Fit of Data with 95% Prediction Interval')
legend('Data','Linear Fit','95% Prediction Interval')
Input Arguments
collapse all
x — Query points
vector
Query points, specified as a vector. The points in x correspond to the fitted function values contained
in y. If x is not a vector, then polyfit converts it into a column vector x(:).
Warning messages result when x has repeated (or nearly repeated) points or if x might need
centering and scaling.
Data Types: single | double
Complex Number Support: Yes
y — Fitted values at query points
vector
Fitted values at query points, specified as a vector. The values in y correspond to the query points
contained in x. If y is not a vector, then polyfit converts it into a column vector y(:).
Data Types: single | double
Complex Number Support: Yes
n — Degree of polynomial fit
positive integer scalar
Degree of polynomial fit, specified as a positive integer scalar. n specifies the polynomial power of
the left-most coefficient in p.

Output Arguments
collapse all
p — Least-squares fit polynomial coefficients
vector
Least-squares fit polynomial coefficients, returned as a vector. p has length n+1 and contains the
polynomial coefficients in descending powers, with the highest power being n. If
either x or y contain NaN values and n < length(x), then all elements in p are NaN. If you specify
three output arguments to center and scale the data, then polyfit returns different coefficients
in p compared to when the data is not centered and scaled.
Use polyval to evaluate p at query points.
S — Error estimation structure
structure
Error estimation structure. This optional output structure is primarily used as an input to
the polyval function to obtain error estimates. S contains the following fields:

Field Description

R Triangular R factor (possibly permuted) from a QR decomposition of the Vandermonde matrix of x

df Degrees of freedom

normr Norm of the residuals

If the data in y is random, then an estimate of the covariance matrix


of p is (Rinv*Rinv')*normr^2/df, where Rinv is the inverse of R.
If the errors in the data in y are independent and normal with constant variance, then [y,delta] =
polyval(...) produces error bounds that contain at least 50% of the predictions. That
is, y ± delta contains at least 50% of the predictions of future observations at x.
mu — Centering and scaling values
two-element vector
Centering and scaling values, returned as a two-element vector. mu(1) is mean(x),
and mu(2) is std(x). These values center the query points in x at zero with unit standard deviation.
Use mu as the fourth input to polyval to evaluate p at the scaled points, (x - mu(1))/mu(2).

Limitations
• In problems with many points, increasing the degree of the polynomial fit
using polyfit does not always result in a better fit. High-order polynomials can be
oscillatory between the data points, leading to a poorer fit to the data. In those cases, you
might use a low-order polynomial fit (which tends to be smoother between points) or a
different technique, depending on the problem.
• Polynomials are unbounded, oscillatory functions by nature. Therefore, they are not well-
suited to extrapolating bounded data or monotonic (increasing or decreasing) data.
Algorithms
polyfit uses x to form Vandermonde matrix V with n+1 columns and m = length(x) rows, resulting
in the linear system
xn1xn2⋮xnmxn−11xn−12⋮xn−1m⋯⋯⋱⋯11⋮1

⎛⎜⎜⎜⎜
⎛⎜⎜⎜⎜⎜⎜⎜⎝ ⎞⎟⎟⎟⎟⎟⎟⎟⎠
p1p2⋮pn+1 = y1y2⋮ym ,

⎜⎝ ⎞⎟⎟⎟⎟⎟⎠ ⎛⎜⎜⎜⎜⎜⎝ ⎞⎟⎟⎟⎟⎟⎠


which polyfit solves with p = V\y. Since the columns in the Vandermonde matrix are powers of the
vector x, the condition number of V is often large for high-order fits, resulting in a singular coefficient
matrix. In those cases centering and scaling can improve the numerical properties of the system to
produce a more reliable fit.

cubic fit using least square method

Finding roots of a polynomial - roots function

roots
Polynomial roots

collapse all in page

Syntax
r = roots(p)

Description
example
r = roots(p) returns the roots of the polynomial represented by p as a column vector. Input p is a
vector containing n+1 polynomial coefficients, starting with the coefficient of xn. A coefficient
of 0 indicates an intermediate power that is not present in the equation. For example, p = [3 2 -
2+2x−2.
2] represents the polynomial 3x
n+...+p x+p
The roots function solves polynomial equations of the form p1x n n+1=0. Polynomial
equations contain a single variable with nonnegative exponents.
Examples
collapse all
Roots of Quadratic Polynomial
Try This ExampleCopy Command Copy Code

2−2x−4=0.
Solve the equation 3x
Create a vector to represent the polynomial, then find the roots.
p = [3 -2 -4];
r = roots(p)
r = 2×1

1.5352
-0.8685
Roots of Quartic Polynomial
Try This ExampleCopy Command Copy Code

4−1=0.
Solve the equation x
Create a vector to represent the polynomial, then find the roots.
p = [1 0 0 0 -1];
r = roots(p)
r = 4×1 complex

-1.0000 + 0.0000i
0.0000 + 1.0000i
0.0000 - 1.0000i
1.0000 + 0.0000i

Input Arguments
collapse all
p — Polynomial coefficients
vector
Polynomial coefficients, specified as a vector. For example, the vector [1 0 1] represents the
2+1, and the vector [3.13 -2.21 5.99] represents the
polynomial x
2
polynomial 3.13x −2.21x+5.99.
For more information, see Create and Evaluate Polynomials.
Data Types: single | double
Complex Number Support: Yes

Tips
• Use the poly function to obtain a polynomial from its roots: p = poly(r).
The poly function is the inverse of the roots function.
• Use the fzero function to find the roots of nonlinear equations. While the roots function
works only with polynomials, the fzero function is more broadly applicable to different
types of equations.
Algorithms
The roots function considers p to be a vector with n+1 elements representing the nth degree
characteristic polynomial of an n-by-n matrix, A. The roots of the polynomial are calculated by
computing the eigenvalues of the companion matrix, A.
A = diag(ones(n-1,1),-1);
A(1,:) = -p(2:n+1)./p(1);
r = eig(A)
The results produced are the exact eigenvalues of a matrix within roundoff error of the companion
matrix, A. However, this does not mean that they are the exact roots of a polynomial whose
coefficients are within roundoff error of those in p.

Roots of Polynomials
This example shows several different methods to calculate the roots of a polynomial.
• Numeric Roots
• Roots Using Substitution
• Roots in a Specific Interval
• Symbolic Roots
Numeric Roots
The roots function calculates the roots of a single-variable polynomial represented by a
vector of coefficients.

For example, create a vector to represent the polynomial x2−x−6, then calculate the roots.
p = [1 -1 -6];
r = roots(p)
r =

3
-2
By convention, MATLAB® returns the roots in a column vector.
The poly function converts the roots back to polynomial coefficients. When operating on
vectors, poly and roots are inverse functions, such that poly(roots(p)) returns p (up to
roundoff error, ordering, and scaling).
p2 = poly(r)
p2 =

1 -1 -6
When operating on a matrix, the poly function computes the characteristic polynomial of the
matrix. The roots of the characteristic polynomial are the eigenvalues of the matrix.
Therefore, roots(poly(A)) and eig(A) return the same answer (up to roundoff error,
ordering, and scaling).
Roots Using Substitution
Try This ExampleCopy Command Copy Code

You can solve polynomial equations involving trigonometric functions by simplifying the
equation using a substitution. The resulting polynomial of one variable no longer contains
any trigonometric functions.
For example, find the values of θ that solve the equation

3cos2(θ)−sin(θ)+3=0.
2 2
Use the fact that cos (θ)=1−sin (θ) to express the equation entirely in terms of sine
functions:

−3sin2(θ)−sin(θ)+6=0.
Use the substitution x=sin(θ) to express the equation as a simple polynomial equation:

−3x2−x+6=0.
Create a vector to represent the polynomial.
p = [-3 -1 6];
Find the roots of the polynomial.
r = roots(p)
r = 2×1

-1.5907
1.2573

To undo the substitution, use θ=sin−1(x). The asin function calculates the inverse sine.
theta = asin(r)
theta = 2×1 complex

-1.5708 + 1.0395i
1.5708 - 0.7028i

Verify that the elements in theta are the values of θ that solve the original equation (within
roundoff error).
f = @(Z) 3*cos(Z).^2 - sin(Z) + 3;
f(theta)
ans = 2×1 complex
10-14 ×

-0.0888 + 0.0647i
0.2665 + 0.0399i

Roots in a Specific Interval


Try This ExampleCopy Command Copy Code

Use the fzero function to find the roots of a polynomial in a specific interval. Among other
uses, this method is suitable if you plot the polynomial and want to know the value of a
particular root.
For example, create a function handle to represent the
polynomial 3x7+4x6+2x5+4x4+x3+5x2.
p = @(x) 3*x.^7 + 4*x.^6 + 2*x.^5 + 4*x.^4 + x.^3 + 5*x.^2;
Plot the function over the interval [−2,1].
x = -2:0.1:1;
plot(x,p(x))
ylim([-100 50])
grid on
hold on
From the plot, the polynomial has a trivial root at 0 and another near -1.5. Use fzero to
calculate and plot the root that is near -1.5.
Z = fzero(p, -1.5)
Z = -1.6056
plot(Z,p(Z),'r*')
Symbolic Roots
If you have Symbolic Math Toolbox™, then there are additional options for evaluating
polynomials symbolically. One way is to use the solve (Symbolic Math Toolbox) function.
syms x
s = solve(x^2-x-6)
s =

-2
3
Another way is to use the factor (Symbolic Math Toolbox) function to factor the polynomial
terms.
F = factor(x^2-x-6)
F =

[ x + 2, x - 3]
See Solve Algebraic Equations (Symbolic Math Toolbox) for more information.
Newton - Raphson Method
"The Newton - Raphson Method" uses one initial approximation to solve a given equation
y = f(x).In this method the function f(x) , is approximated by a tangent line, whose
equation is found from the value of f(x) and its first derivative at the initial
approximation.

The tangent line then intersects the X - Axis at second point. This second point is again
used as next approximation to find the third point.
The script proceeds in the same way and performs upto 100 iterations. The Accuracy
required
(required no. of decimal places) is taken as input from the user. The error between
solutions of each iteration is checked every time and if found less than required
accuracy, the iterations are stopped.

MATLAB Source Code: Newton-Raphson


Method

% Clearing Screen
clc

% Setting x as symbolic variable


syms x;

% Input Section
y = input('Enter non-linear equations: ');
a = input('Enter initial guess: ');
e = input('Tolerable error: ');
N = input('Enter maximum number of steps: ');
% Initializing step counter
step = 1;

% Finding derivate of given function


g = diff(y,x);

% Finding Functional Value


fa = eval(subs(y,x,a));

while abs(fa)> e
fa = eval(subs(y,x,a));
ga = eval(subs(g,x,a));
if ga == 0
disp('Division by zero.');
break;
end

b = a - fa/ga;
fprintf('step=%d\ta=%f\tf(a)=%f\n',step,a,fa);
a = b;

if step>N
disp('Not convergent');
break;
end
step = step + 1;
end

fprintf('Root is %f\n', a);

Bisection Method MATLAB Output

Enter non-linear equations: cos(x)-x*exp(x)


Enter initial guess: 1
Tolerable error: 0.00001
Enter maximum number of steps: 20
step=1 a=1.000000 f(a)=-2.177980
step=2 a=0.653079 f(a)=-0.460642
step=3 a=0.531343 f(a)=-0.041803
step=4 a=0.517910 f(a)=-0.000464
step=5 a=0.517757 f(a)=-0.000000
Root is 0.517757
What is Newton-Raphson method?
The Newton-Raphson method (also known as Newton's method) is a way to quickly
find a good approximation for the root of a real-valued function f ( x ) = 0 f(x) =
0 f(x)=0. It uses the idea that a continuous and differentiable function can be
approximated by a straight line tangent to it.

You might also like