Solving Nonlinear Equations with MATLAB
Nonlinear Equations: systems of nonlinear equations and solving systems of equations in MATLAB with the fsolve function. Interpolation: Lagrange interpolation and two-dimensional interpolation. Straight-line fit using the least-squares method; curve fitting using the built-in functions polyval and polyfit; cubic fit using the least-squares method. Finding roots of a polynomial with the roots function; the Newton-Raphson method.
Find a solution to a multivariable nonlinear equation F(x) = 0. You can also solve a scalar equation or
linear system of equations, or a system represented by F(x) = G(x) in the problem-based approach
(equivalent to F(x) – G(x) = 0 in the solver-based approach). For nonlinear systems, solvers convert
the equation-solving problem to the optimization problem of minimizing the sum of squares of the
components of F, namely $\min_x \sum_i F_i(x)^2$. Linear and scalar equations have different solution
algorithms; see Equation Solving Algorithms.
Before you begin to solve an optimization problem, you must choose the appropriate approach:
problem-based or solver-based. For details, see First Choose Problem-Based or Solver-Based
Approach.
For the problem-based approach, create problem variables, and then represent the equations in terms
of these variables. For the problem-based steps to take, see Problem-Based Workflow for Solving
Equations. To solve the resulting problem, use solve.
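For instance, a minimal problem-based sketch (the variables and equations here are illustrative, not from the original):
% Problem-based equation solving with an illustrative nonlinear system
x = optimvar('x',2);                      % problem variable (2-element vector)
prob = eqnproblem;                        % equation problem container
prob.Equations.eq1 = x(1)^2 + x(2) == 5;
prob.Equations.eq2 = x(1) - x(2) == 1;
x0.x = [1 1];                             % initial point, needed for nonlinear systems
sol = solve(prob,x0);                     % solve calls fsolve internally
disp(sol.x)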
For the solver-based steps to take, including defining the objective function and choosing the
appropriate solver, see Solver-Based Optimization Problem Setup.
Functions
Solve and Analyze, Problem-Based
OptimizationExpression — Arithmetic or functional expression in terms of optimization variables
OptimizationVariable — Variable for optimization
fsolve
Solve system of nonlinear equations
Description
Nonlinear system solver
Solves a problem specified by
F(x) = 0
for x, where F(x) is a function that returns a vector value.
x is a vector or a matrix; see Matrix Arguments.
x = fsolve(fun,x0) starts at x0 and tries to solve the equations fun(x) = 0, an array of zeros.
Note
Passing Extra Parameters explains how to pass extra parameters to the vector function fun(x), if
necessary. See Solve Parameterized Equation.
x = fsolve(fun,x0,options) solves the equations with the optimization options specified
in options. Use optimoptions to set these options.
x = fsolve(problem) solves problem, a structure described in problem.
[x,fval] = fsolve(___), for any syntax, returns the value of the objective function fun at the
solution x.
[x,fval,exitflag,output] = fsolve(___) additionally returns a value exitflag that describes
the exit condition of fsolve, and a structure output with information about the optimization process.
[x,fval,exitflag,output,jacobian] = fsolve(___) returns the Jacobian of fun at the
solution x.
Examples
Solution of 2-D Nonlinear System
This example shows how to solve two nonlinear equations in two variables. The equations are
$e^{-e^{-(x_1+x_2)}} = x_2(1+x_1^2)$
$x_1\cos(x_2) + x_2\sin(x_1) = \tfrac{1}{2}.$
Convert the equations to the form $F(x) = 0$:
$e^{-e^{-(x_1+x_2)}} - x_2(1+x_1^2) = 0$
$x_1\cos(x_2) + x_2\sin(x_1) - \tfrac{1}{2} = 0.$
The root2d.m function, which is available when you run this example, computes the left-hand side of these two equations.
type root2d.m
function F = root2d(x)
F(1) = exp(-exp(-(x(1)+x(2)))) - x(2)*(1+x(1)^2);
F(2) = x(1)*cos(x(2)) + x(2)*sin(x(1)) - 0.5;
end
Solve the system of equations starting at the point [0,0].
fun = @root2d;
x0 = [0,0];
x = fsolve(fun,x0)
Equation solved.
x = 1×2
0.3532 0.6061
Solve Parameterized Equation
Suppose you want to solve the system of equations
$2x_1 + x_2 = \exp(c x_1)$
$-x_1 + 2x_2 = \exp(c x_2)$
for various values of the parameter c.
To solve the system for a particular value, in this case c=−1, set c in the workspace and create an
anonymous function in x from paramfun.
c = -1;
fun = @(x)paramfun(x,c);
Solve the system starting from the point x0 = [0 1].
x0 = [0 1];
x = fsolve(fun,x0)
Equation solved.
To solve for a different value of c, enter c in the workspace and create the fun function again, so it
has the new c value.
c = -2;
fun = @(x)paramfun(x,c); % fun now has the new c value
x = fsolve(fun,x0)
Equation solved.
x = 1×2
0.1788 0.3418
Helper Function
This code creates the paramfun helper function.
function F = paramfun(x,c)
F = [ 2*x(1) + x(2) - exp(c*x(1))
-x(1) + 2*x(2) - exp(c*x(2))];
end
Solve a Problem Structure
Create a problem structure for fsolve and solve the problem.
Solve the same problem as in Solution with Nondefault Options, but formulate the problem using a
problem structure.
Set options for the problem to have no display and a plot function that displays the first-order
optimality, which should converge to 0 as the algorithm iterates.
problem.options = optimoptions('fsolve','Display','none','PlotFcn',@optimplotfirstorderopt);
The equations in the nonlinear system are
$2x_1 - x_2 = e^{-x_1}$
$-x_1 + 2x_2 = e^{-x_2}.$
Create the remaining fields of the problem structure: the objective function, a start point (here [-5;-5]), and the solver name.
problem.objective = @(x)[2*x(1) - x(2) - exp(-x(1)); -x(1) + 2*x(2) - exp(-x(2))];
problem.x0 = [-5;-5];
problem.solver = 'fsolve';
Solve the problem, returning the solution and the objective function value.
[x,fval] = fsolve(problem)
Equation solved.
x = 2×1
0.5671
0.5671
fval = 2×1
1.0e-06 *
-0.4059
-0.4059
The iterative display shows f(x), which is the square of the norm of the function F(x). This value
decreases to near zero as the iterations proceed. The first-order optimality measure likewise
decreases to near zero as the iterations proceed. These entries show the convergence of the
iterations to a solution. For the meanings of the other entries, see Iterative Display.
The fval output gives the function value F(x), which should be zero at a solution (to within
the FunctionTolerance tolerance).
Examine Matrix Equation Solution
Solve the matrix equation
$X * X * X = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix},$
starting at the point x0 = [1,1;1,1]. Create an anonymous function that calculates the matrix equation and create the point x0.
fun = @(x)x*x*x - [1,2;3,4];
x0 = ones(2);
Set options to have no display.
options = optimoptions('fsolve','Display','off');
Examine the fsolve outputs to see the solution quality and process.
[x,fval,exitflag,output] = fsolve(fun,x0,options)
x = 2×2
-0.1291 0.8602
1.2903 1.1612
fval = 2×2
1.0e-09 *
-0.2742 0.1258
0.1876 -0.0864
exitflag = 1
output = struct with fields:
iterations: 11
funcCount: 52
algorithm: 'trust-region-dogleg'
firstorderopt: 4.0197e-10
message: 'Equation solved....'
The exit flag value 1 indicates that the solution is reliable. To verify this manually, calculate the
residual (sum of squares of fval) to see how close it is to zero.
sum(sum(fval.*fval))
ans = 1.3367e-19
This small residual confirms that x is a solution.
You can see in the output structure how many iterations and function evaluations fsolve performed
to find the solution.
Input Arguments
fun — Nonlinear equations to solve
function handle | function name
Nonlinear equations to solve, specified as a function handle or function name. fun is a function that
accepts a vector x and returns a vector F, the nonlinear equations evaluated at x. The equations to
solve are F = 0 for all components of F. The function fun can be specified as a function handle for a
file
x = fsolve(@myfun,x0)
All Algorithms
With the JacobianMultiplyFcn option, fsolve calls a user-supplied multiply function W = jmfun(Jinfo,Y,flag), where Jinfo is data returned by fun. Y is a matrix that has the same number of rows as there are dimensions in the problem. flag determines which product to compute:
• If flag == 0, W = J'*(J*Y).
• If flag > 0, W = J*Y.
• If flag < 0, W = J'*Y.
In each case, J is not formed explicitly. fsolve uses Jinfo to compute
the preconditioner. See Passing Extra Parameters for information on
how to supply values for any additional parameters jmfun needs.
Note
'SpecifyObjectiveGradient' must be set to true for fsolve to
pass Jinfo from fun to jmfun.
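A minimal sketch of such a multiply function, for the simple case where Jinfo is taken to be the Jacobian matrix J itself (an assumption for illustration):
function W = jmfun(Jinfo,Y,flag)
% Jacobian multiply function; here Jinfo is assumed to be J itself.
if flag == 0
    W = Jinfo'*(Jinfo*Y);   % W = J'*(J*Y)
elseif flag > 0
    W = Jinfo*Y;            % W = J*Y
else
    W = Jinfo'*Y;           % W = J'*Y
end
end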
Output Arguments
x — Solution
real vector | real array
Solution, returned as a real vector or real array. The size of x is the same as the size of x0.
Typically, x is a local solution to the problem when exitflag is positive. For information on the
quality of the solution, see When the Solver Succeeds.
fval — Objective function value at the solution
real vector
Objective function value at the solution, returned as a real vector. Generally, fval = fun(x).
exitflag — Reason fsolve stopped
integer
Reason fsolve stopped, returned as an integer.
Limitations
• The function to be solved must be continuous.
• When successful, fsolve only gives one root.
• The default trust-region dogleg method can only be used when the system of equations is
square, i.e., the number of equations equals the number of unknowns. For the Levenberg-
Marquardt method, the system of equations need not be square.
Tips
• For large problems, meaning those with thousands of variables or more, save memory
(and possibly save time) by setting the Algorithm option to 'trust-region' and
the SubproblemAlgorithm option to 'cg'.
Algorithms
The Levenberg-Marquardt and trust-region methods are based on the nonlinear least-squares
algorithms also used in lsqnonlin. Use one of these methods if the system may not have a zero. The
algorithm still returns a point where the residual is small. However, if the Jacobian of the system is
singular, the algorithm might converge to a point that is not a solution of the system of equations
(see Limitations).
• By default fsolve chooses the trust-region dogleg algorithm. The algorithm is a variant of
the Powell dogleg method described in [8]. It is similar in nature to the algorithm
implemented in [7]. See Trust-Region-Dogleg Algorithm.
• The trust-region algorithm is a subspace trust-region method and is based on the interior-
reflective Newton method described in [1] and [2]. Each iteration involves the approximate
solution of a large linear system using the method of preconditioned conjugate gradients
(PCG). See Trust-Region Algorithm.
• The Levenberg-Marquardt method is described in references [4], [5], and [6].
See Levenberg-Marquardt Method.
Lagrange Interpolation
Given k+1 data points $(x_0,y_0),\ldots,(x_k,y_k)$, the Lagrange interpolation polynomial is
$L(x) = \sum_{j=0}^{k} y_j \,\ell_j(x), \qquad \ell_j(x) = \prod_{\substack{0 \le m \le k \\ m \ne j}} \frac{x - x_m}{x_j - x_m}.$
Since the product skips m = j, evaluating $\ell_j$ at $x = x_j$ makes every factor equal to $(x_j - x_m)/(x_j - x_m) = 1$, so $\ell_j(x_j) = 1$.
Also, for i ≠ j, the product contains the factor for m = i, namely $(x_i - x_i)/(x_j - x_i) = 0$, so the entire product $\ell_j(x_i)$ is zero.
Therefore, the considered function L(x) is a polynomial of degree at most k with
$L(x_j) = y_j.$
This is the required formula, which is also used in the program code for Lagrange interpolation in MATLAB.
NOTE:
The Lagrange Interpolation formula can also be derived from Newton’s divided difference
formula.
function [P,R,S] = lagrangepoly(X,Y,XX)
%LAGRANGEPOLY Lagrange polynomial interpolation through the points (X,Y).
% Example usage:
%   [P,R,S] = lagrangepoly(X,Y);
%   xx = 0.5 : 0.01 : 8.5;
%   plot(xx,polyval(P,xx),X,Y,'or',R,S,'.b',xx,spline(X,Y,xx),'--g')
%   grid
N = length(X);
pvals = zeros(N,N);
% Calculate the coefficients of each Lagrange basis polynomial
for i = 1:N
    % Polynomial whose roots are all the values of X except X(i)
    pp = poly(X((1:N) ~= i));
    % Scale it to equal 1 at X(i) (it is already 0 at the other points)
    pvals(i,:) = pp ./ polyval(pp,X(i));
end
% Combine the basis polynomials, weighted by Y
P = Y*pvals;
if nargout > 2
    % x-coordinates (R) and y-values (S) at the extrema of the polynomial
    R = roots(polyder(P));
    S = polyval(P,R);
end
if nargin == 3
    % Return the polynomial sampled at the points XX
    P = polyval(P,XX);
end  % end of scope of if
The above MATLAB code for the Lagrange method is written for interpolation by polynomials fitting a set of points. The program uses a user-defined function named lagrangepoly(X, Y) whose two input parameters are required to be row vectors.
The row vectors X and Y define a set of n points which are used in the Lagrange method for the determination of the (n-1)th order polynomial in X which passes through these points. The function of P in the program is to return the n coefficients which define the polynomial, in the same order as used by POLY and POLYVAL.
Similarly, R and S are defined to return the x-coordinates and y-values at the n-1 extrema of the resulting polynomial. YY returns the value of the polynomial sampled at the points specified in XX.
All the inputs required by the program are embedded in the source code. The values of X and Y initially set in the program are:
X = [1 2 3 4 5 6 7 8] and Y = [0 1 0 1 0 1 0 1]
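A short driver script based on the commented-out usage lines above (the plotting choices are illustrative):
X = [1 2 3 4 5 6 7 8];
Y = [0 1 0 1 0 1 0 1];
[P,R,S] = lagrangepoly(X,Y);    % 7th-degree interpolating polynomial
xx = 0.5 : 0.01 : 8.5;
plot(xx,polyval(P,xx),X,Y,'or',R,S,'.b')   % curve, data points, extrema
grid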
From the following sets of data, find the value of x corresponding to y=15 by using
Lagrange Interpolation.
Solution
X: 5 6 9 11
Y: 12 13 14 16
By Lagrange inverse interpolation (interchanging the roles of x and y):
$x(y) = \frac{(y-13)(y-14)(y-16)}{(12-13)(12-14)(12-16)}\,5 + \frac{(y-12)(y-14)(y-16)}{(13-12)(13-14)(13-16)}\,6 + \frac{(y-12)(y-13)(y-16)}{(14-12)(14-13)(14-16)}\,9 + \frac{(y-12)(y-13)(y-14)}{(16-12)(16-13)(16-14)}\,11.$
Evaluating at y = 15 gives x = 1.25 − 6 + 13.5 + 2.75 = 11.5.
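A small script sketch that carries out this inverse interpolation numerically (variable names are illustrative):
% Inverse Lagrange interpolation: treat y as the independent variable.
Y = [12 13 14 16];   % given y values
X = [5 6 9 11];      % corresponding x values
yq = 15;             % query point
xq = 0;
n = numel(Y);
for j = 1:n
    L = 1;
    for m = [1:j-1, j+1:n]
        L = L*(yq - Y(m))/(Y(j) - Y(m));
    end
    xq = xq + X(j)*L;
end
xq    % displays 11.5000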
Two-Dimensional Interpolation
interp2
Interpolation for 2-D gridded data in meshgrid format
Syntax
Vq = interp2(X,Y,V,Xq,Yq)
Vq = interp2(V,Xq,Yq)
Vq = interp2(V)
Vq = interp2(V,k)
Vq = interp2(___,method)
Vq = interp2(___,method,extrapval)
Description
Vq = interp2(X,Y,V,Xq,Yq) returns interpolated values of a function of two variables at specific
query points using linear interpolation. The results always pass through the original sampling of the
function. X and Y contain the coordinates of the sample points. V contains the corresponding function
values at each sample point. Xq and Yq contain the coordinates of the query points.
Vq = interp2(V,Xq,Yq) assumes a default grid of sample points. The default grid points cover the
rectangular region, X=1:n and Y=1:m, where [m,n] = size(V). Use this syntax when you want to
conserve memory and are not concerned about the absolute distances between points.
Vq = interp2(V) returns the interpolated values on a refined grid formed by dividing the interval
between sample values once in each dimension.
Vq = interp2(V,k) returns the interpolated values on a refined grid formed by repeatedly halving
the intervals k times in each dimension. This results in 2^k-1 interpolated points between sample
values.
Vq = interp2(___,method) specifies an alternative interpolation
method: 'linear', 'nearest', 'cubic', 'makima', or 'spline'. The default method is 'linear'.
Vq = interp2(___,method,extrapval) also specifies extrapval, a scalar value that is assigned
to all queries that lie outside the domain of the sample points.
If you omit the extrapval argument for queries outside the domain of the sample points, then based
on the method argument interp2 returns one of the following:
• Extrapolated values for the 'spline' and 'makima' methods
• NaN values for other interpolation methods
Examples
Interpolate over a Grid Using Default Method
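A minimal sketch of such an interpolation, using the built-in peaks function as assumed sample data:
[X,Y] = meshgrid(-3:3);
V = peaks(X,Y);                  % sample values on a coarse grid
[Xq,Yq] = meshgrid(-3:0.25:3);   % finer grid of query points
Vq = interp2(X,Y,V,Xq,Yq);       % default 'linear' interpolation
surf(Xq,Yq,Vq)
title('Linear Interpolation Using Default Method')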
Input Arguments
k — Refinement factor
1 (default) | real, nonnegative, integer scalar
Refinement factor, specified as a real, nonnegative, integer scalar.
Example: interp2(V,2)
Data Types: single | double
method — Interpolation method
'linear' (default) | 'nearest' | 'cubic' | 'spline' | 'makima'
Interpolation method, specified as one of the options in this table.
Output Arguments
Vq — Interpolated values
scalar | vector | matrix
Interpolated values, returned as a real or complex scalar, vector, or matrix. The size and shape
of Vq depends on the syntax you use and, in some cases, the size and value of the input arguments.
Full Grid
For example, [X,Y] = meshgrid(-1:3,1:4) produces the full grid
X =
-1 0 1 2 3
-1 0 1 2 3
-1 0 1 2 3
-1 0 1 2 3
Y =
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 4 4 4
Grid vectors are a more compact format to represent a grid than the full grid. The relation between the two formats and the matrix of sample values V is that X and Y are the full-grid (meshgrid) expansion of the grid vectors x and y, and V(i,j) is the sample value at the point (x(j), y(i)).
Grid Vectors
For interp2, grid vectors consist of a pair of vectors that define the x- and y-coordinates in a grid. The
row vector defines x-coordinates, and the column vector defines y-coordinates.
For example, the following code creates the grid vectors that specify the region, –1 ≤ x ≤ 3 and 1 ≤ y ≤
4:
x = -1:3;
y = (1:4)';
Scattered Points
For interp2, scattered points consist of a pair of arrays that define a collection of points scattered in
2-D space. One array contains the x-coordinates, and the other contains the y-coordinates.
For example, the following code specifies the points, (2,7), (5,3), (4,1), and (10,9):
x = [2 5; 4 10];
y = [7 3; 1 9];
Straight line fit using Least Square Method
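As a minimal sketch of a straight-line least-squares fit with the backslash operator (the sample data is illustrative):
x = (1:10)';
y = 2*x + 1 + randn(10,1);    % noisy data around the line y = 2x + 1
A = [x, ones(size(x))];       % design matrix for the model y = m*x + c
coeffs = A\y;                 % least-squares solution [m; c]
plot(x,y,'o',x,A*coeffs,'-')
legend('data','least-squares line')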
lsqr
Solve system of linear equations — least-squares method
x = lsqr(A,b)
x = lsqr(A,b,tol)
x = lsqr(A,b,tol,maxit)
x = lsqr(A,b,tol,maxit,M)
x = lsqr(A,b,tol,maxit,M1,M2)
x = lsqr(A,b,tol,maxit,M1,M2,x0)
[x,flag] = lsqr(___)
[x,flag,relres] = lsqr(___)
[x,flag,relres,iter] = lsqr(___)
[x,flag,relres,iter,resvec] = lsqr(___)
[x,flag,relres,iter,resvec,lsvec] = lsqr(___)
Description
x = lsqr(A,b) attempts to solve the system of linear equations A*x = b for x using the Least
Squares Method. lsqr finds a least squares solution for x that minimizes norm(b-A*x). When A is
consistent, the least squares solution is also a solution of the linear system. When the attempt is
successful, lsqr displays a message to confirm convergence. If lsqr fails to converge after the
maximum number of iterations or halts for any reason, it displays a diagnostic message that includes
the relative residual norm(b-A*x)/norm(b) and the iteration number at which the method stopped.
x = lsqr(A,b,tol) specifies a tolerance for the method. The default tolerance is 1e-6.
x = lsqr(A,b,tol,maxit) specifies the maximum number of iterations to use. lsqr displays a
diagnostic message if it fails to converge within maxit iterations.
x = lsqr(A,b,tol,maxit,M) specifies a preconditioner matrix M and computes x by effectively solving the system $A M^{-1} y = b$ for y, where $y = M x$. Using a preconditioner matrix can improve the numerical properties of the problem and the efficiency of the calculation.
x = lsqr(A,b,tol,maxit,M1,M2) specifies factors of the preconditioner matrix M such that M =
M1*M2.
x = lsqr(A,b,tol,maxit,M1,M2,x0) specifies an initial guess for the solution vector x. The default
is a vector of zeros.
[x,flag] = lsqr(___) returns a flag that specifies whether the algorithm successfully converged.
When flag = 0, convergence was successful. You can use this output syntax with any of the
previous input argument combinations. When you specify the flag output, lsqr does not display any
diagnostic messages.
[x,flag,relres] = lsqr(___) also returns the residual error of the computed solution x.
If flag is 0, then x is a least-squares solution that minimizes norm(b-A*x). If relres is small,
then x is also a consistent solution, since relres represents norm(b-A*x)/norm(b).
[x,flag,relres,iter] = lsqr(___) also returns the iteration number iter at which x was
computed.
[x,flag,relres,iter,resvec] = lsqr(___) also returns a vector of the residual norms at each
iteration, including the first residual norm(b-A*x0).
[x,flag,relres,iter,resvec,lsvec] = lsqr(___) also returns lsvec, which is an estimate of
the scaled normal equation error at each iteration.
Examples
Iterative Solution to Linear System
Solve a rectangular linear system using lsqr with default settings, and then adjust the tolerance and
number of iterations used in the solution process.
Create a random sparse matrix A with 50% density. Also create a random vector b for the right-hand
side of Ax=b.
rng default
A = sprand(400,300,.5);
b = rand(400,1);
Solve Ax=b using lsqr. The output display includes the value of the relative residual error $\|b-Ax\|/\|b\|$.
x = lsqr(A,b);
lsqr stopped at iteration 20 without converging to the desired tolerance 1e-06
because the maximum number of iterations was reached.
The iterate returned (number 20) has relative residual 0.26.
By default lsqr uses 20 iterations and a tolerance of 1e-6, but the algorithm is unable to converge in
those 20 iterations for this matrix. Since the residual is still large, it is a good indicator that more
iterations (or a preconditioner matrix) are needed. You also can use a larger tolerance to make it
easier for the algorithm to converge.
Solve the system again using a tolerance of 1e-4 and 70 iterations. Specify six outputs to return the
relative residual relres of the calculated solution, as well as the residual history resvec and the
least-squares residual history lsvec.
[x,flag,relres,iter,resvec,lsvec] = lsqr(A,b,1e-4,70);
flag
flag = 0
Since flag is 0, the algorithm was able to meet the desired error tolerance in the specified number of
iterations. You can generally adjust the tolerance and number of iterations together to make trade-
offs between speed and precision in this manner.
Examine the relative residual and least-squares residual of the calculated solution.
relres
relres = 0.2625
lsres = lsvec(end)
lsres = 2.7640e-04
These residual norms indicate that x is a least-squares solution, because relres is not smaller than
the specified tolerance of 1e-4. Since no consistent solution to the linear system exists, the best the
solver can do is to make the least-squares residual satisfy the tolerance.
Plot the residual histories. The relative residual resvec quickly reaches a minimum and cannot make
further progress, while the least-squares residual lsvec continues to be minimized on subsequent
iterations.
N = length(resvec);
semilogy(0:N-1,lsvec,'--o',0:N-1,resvec,'-o')
legend("Least-squares residual","Relative residual")
Using a Preconditioner
Examine the effect of using a preconditioner matrix with lsqr to solve a linear system.
Load west0479, a real 479-by-479 nonsymmetric sparse matrix.
load west0479
A = west0479;
Define b so that the true solution to Ax=b is a vector of all ones.
b = sum(A,2);
Set the tolerance and maximum number of iterations.
tol = 1e-12;
maxit = 20;
Use lsqr to find a solution at the requested tolerance and number of iterations. Specify six outputs to
return information about the solution process:
• x is the computed solution to A*x = b.
• fl is a flag indicating whether the algorithm converged.
• rr is the relative residual of the computed answer x.
• it is the iteration number when x was computed.
• rv is a vector of the residual history for ‖b−Ax‖.
• lsrv is a vector of the least squares residual history.
[x,fl,rr,it,rv,lsrv] = lsqr(A,b,tol,maxit);
fl
fl = 1
rr
rr = 0.0017
it
it = 20
Since fl = 1, the algorithm did not converge to the specified tolerance within the maximum number
of iterations.
To aid with the slow convergence, you can specify a preconditioner matrix. Since A is nonsymmetric, use ilu to generate the preconditioner $M = LU$ in factorized form. Specify a drop tolerance to ignore nondiagonal entries with values smaller than 1e-6. Solve the preconditioned system $A M^{-1} (M x) = b$ by specifying L and U as the M1 and M2 inputs to lsqr.
setup = struct('type','ilutp','droptol',1e-6);
[L,U] = ilu(A,setup);
[x1,fl1,rr1,it1,rv1,lsrv1] = lsqr(A,b,tol,maxit,L,U);
fl1
fl1 = 0
rr1
rr1 = 7.0954e-14
it1
it1 = 13
The use of an ilu preconditioner produces a relative residual less than the prescribed tolerance of 1e-12 at the 13th iteration. The output rv1(1) is norm(b), and the output rv1(end) is norm(b-A*x1).
You can follow the progress of lsqr by plotting the relative residuals at each iteration. Plot the
residual history of each solution with a line for the specified tolerance.
semilogy(0:length(rv)-1,rv/norm(b),'-o')
hold on
semilogy(0:length(rv1)-1,rv1/norm(b),'-o')
yline(tol,'r--');
legend('No preconditioner','ILU preconditioner','Tolerance','Location','East')
xlabel('Iteration number')
ylabel('Relative residual')
Supplying Initial Guess
Examine the effect of supplying lsqr with an initial guess of the solution.
Create a random rectangular sparse matrix. Use the sum of each row as the vector for the right-hand
side of Ax=b so that the expected solution for x is a vector of ones.
A = sprand(700,900,0.1);
b = sum(A,2);
Use lsqr to solve Ax=b twice: one time with the default initial guess, and one time with a good initial
guess of the solution. Use 75 iterations and the default tolerance for both solutions. Specify the initial
guess in the second solution as a vector with all elements equal to 0.99.
maxit = 75;
x1 = lsqr(A,b,[],maxit);
lsqr converged at iteration 64 to a solution with relative residual 8.7e-07.
x0 = 0.99*ones(size(A,2),1);
x2 = lsqr(A,b,[],maxit,[],[],x0);
lsqr converged at iteration 26 to a solution with relative residual 9.6e-07.
With an initial guess close to the expected solution, lsqr is able to converge in fewer iterations.
Returning Intermediate Results
You also can use the initial guess to get intermediate results by calling lsqr in a for-loop. Each call to
the solver performs a few iterations and stores the calculated solution. Then you use that solution as
the initial vector for the next batch of iterations.
For example, this code performs 100 iterations four times and stores the solution vector after each
pass in the for-loop:
x0 = zeros(size(A,2),1);
tol = 1e-8;
maxit = 100;
for k = 1:4
[x,flag,relres] = lsqr(A,b,tol,maxit,[],[],x0);
X(:,k) = x;
R(k) = relres;
x0 = x;
end
X(:,k) is the solution vector computed at iteration k of the for-loop, and R(k) is the relative residual
of that solution.
Using Function Handle Instead of Numeric Matrix
Solve a linear system by providing lsqr with a function handle that computes A*x and A'*x in place
of the coefficient matrix A.
Create a nonsymmetric tridiagonal matrix. Preview the matrix.
A = gallery('wilk',21) + diag(ones(20,1),1)
A = 21×21
10  2  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 1  9  2  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 0  1  8  2  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 0  0  1  7  2  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  1  6  2  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  1  5  2  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  1  4  2  0  0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  1  3  2  0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  1  2  2  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  1  1  2  0  0  0  0  0  0  0  0  0  0
⋮
Since this tridiagonal matrix has a special structure, you can represent the operation A*x with a
function handle. When A multiplies a vector, most of the elements in the resulting vector are zeros.
The nonzero elements in the result correspond with the nonzero tridiagonal elements of A.
$$A x = \begin{bmatrix} 10x_1 + 2x_2 \\ x_1 + 9x_2 + 2x_3 \\ \vdots \\ x_{19} + 9x_{20} + 2x_{21} \\ x_{20} + 10x_{21} \end{bmatrix} = \begin{bmatrix} 0 \\ x_1 \\ \vdots \\ x_{20} \end{bmatrix} + \begin{bmatrix} 10x_1 \\ 9x_2 \\ \vdots \\ 9x_{20} \\ 10x_{21} \end{bmatrix} + 2 \cdot \begin{bmatrix} x_2 \\ x_3 \\ \vdots \\ x_{21} \\ 0 \end{bmatrix}.$$
Likewise, the expression for $A^T x$ becomes:
$$A^T x = \begin{bmatrix} 10x_1 + x_2 \\ 2x_1 + 9x_2 + x_3 \\ \vdots \\ 2x_{19} + 9x_{20} + x_{21} \\ 2x_{20} + 10x_{21} \end{bmatrix} = 2 \cdot \begin{bmatrix} 0 \\ x_1 \\ \vdots \\ x_{20} \end{bmatrix} + \begin{bmatrix} 10x_1 \\ 9x_2 \\ \vdots \\ 9x_{20} \\ 10x_{21} \end{bmatrix} + \begin{bmatrix} x_2 \\ x_3 \\ \vdots \\ x_{21} \\ 0 \end{bmatrix}.$$
Now, solve the linear system Ax=b by providing lsqr with the function handle that
calculates A*x and A'*x. Use a tolerance of 1e-6 and 25 iterations. Specify b as the row sums
of A so that the true solution for x is a vector of ones.
b = full(sum(A,2));
tol = 1e-6;
maxit = 25;
x1 = lsqr(@afun,b,tol,maxit)
lsqr converged at iteration 21 to a solution with relative residual 5.4e-13.
x1 = 21×1
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
⋮
Local Functions
function y = afun(x,flag)
if strcmp(flag,'notransp') % Compute A*x
y = [0; x(1:20)] ...
+ [(10:-1:0)'; (1:10)'].*x ...
+ 2*[x(2:end); 0];
elseif strcmp(flag,'transp') % Compute A'*x
y = 2*[0; x(1:20)] ...
+ [(10:-1:0)'; (1:10)'].*x ...
+ [x(2:end); 0];
end
end
Input Arguments
A — Coefficient matrix
matrix | function handle
Coefficient matrix, specified as a matrix or function handle. This matrix is the coefficient matrix in the
linear system A*x = b. Generally, A is a large sparse matrix or a function handle that returns the
product of a large sparse matrix and column vector.
Specifying A as a Function Handle
You can optionally specify the coefficient matrix as a function handle instead of a matrix. The
function handle returns matrix-vector products instead of forming the entire coefficient matrix,
making the calculation more efficient.
To use a function handle, use the function signature function y = afun(x,opt). Parameterizing Functions explains how to provide additional parameters to the function afun, if necessary. The function afun must satisfy these conditions:
• afun(x,'notransp') returns the product A*x.
• afun(x,'transp') returns the product A'*x.
Output Arguments
x — Linear system solution
column vector
Linear system solution, returned as a column vector. This output gives the approximate solution to
the linear system A*x = b.
• The relative residual error is equal to norm(b-A*x)/norm(b) and is generally the residual
that meets the tolerance tol when lsqr converges. The resvec output tracks the history
of this residual over all iterations.
• The least-squares residual error is equal to norm((A*inv(M))'*(B-A*X))/norm(A*inv(M),'fro'). This residual causes lsqr to converge less frequently than the relative residual. The lsvec output tracks the history of this residual over all iterations.
iter — Iteration number
scalar
Iteration number, returned as a scalar. This output indicates the iteration number at which the
computed answer for x was calculated.
Data Types: double
resvec — Residual error
vector
Residual error, returned as a vector. The residual error norm(b-A*x) reveals how close the algorithm
is to converging for a given value of x. The number of elements in resvec is equal to the number of
iterations. You can examine the contents of resvec to help decide whether to change the values
of tol or maxit.
Data Types: double
lsvec — Scaled normal equation error
vector
Scaled normal equation error, returned as a vector. For each iteration, lsvec contains an estimate of
the scaled normal equation residual norm((A*inv(M))'*(B-A*X))/norm(A*inv(M),'fro'). The
number of elements in lsvec is equal to the number of iterations.
More About
Least Squares Method
The least squares (LSQR) algorithm is an adaptation of the conjugate gradients (CG) method for
rectangular matrices. Analytically, LSQR for A*x = b produces the same residuals as CG for the
normal equations A'*A*x = A'*b, but LSQR possesses more favorable numeric properties and is
thus generally more reliable [1].
The least squares method is the only iterative linear system solver that can handle rectangular and
inconsistent coefficient matrices.
Tips
• Convergence of most iterative methods depends on the condition number of the
coefficient matrix, cond(A). When A is square, you can use equilibrate to improve its
condition number, and on its own this makes it easier for most iterative solvers to
converge. However, using equilibrate also leads to better quality preconditioner
matrices when you subsequently factor the equilibrated matrix B = R*P*A*C.
• You can use matrix reordering functions such as dissect and symrcm to permute the
rows and columns of the coefficient matrix and minimize the number of nonzeros when
the coefficient matrix is factored to generate a preconditioner. This can reduce the
memory and time required to subsequently solve the preconditioned linear system.
Curve fitting using built-in functions polyval and polyfit
After you obtain the polynomial for the fit line using polyfit, you can use polyval to evaluate the
polynomial at other points that might not have been included in the original data.
Compute the values of the polyfit estimate over a finer domain and plot the estimate over the real
data values for comparison. Include an annotation of the equation for the fit line.
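The snippet below assumes p, x, and y from an earlier cubic polyfit that is not included in this excerpt; a hypothetical setup would be:
x = [1 2 3 4 5];
y = [5.5 43.1 128 290.7 498.4];   % illustrative sample data
p = polyfit(x,y,3);               % cubic least-squares fit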
x2 = 1:.1:5;
y2 = polyval(p,x2);
plot(x,y,'o',x2,y2)
grid on
s = sprintf('y = (%.1f) x^3 + (%.1f) x^2 + (%.1f) x + (%.1f)',p(1),p(2),p(3),p(4));
text(2,400,s)
polyval
Polynomial evaluation
Syntax
y = polyval(p,x)
[y,delta] = polyval(p,x,S)
y = polyval(p,x,[],mu)
[y,delta] = polyval(p,x,S,mu)
Description
y = polyval(p,x) evaluates the polynomial p at each point in x. The argument p is a vector of
length n+1 whose elements are the coefficients (in descending powers) of an nth-degree polynomial:
$p(x) = p_1 x^n + p_2 x^{n-1} + \ldots + p_n x + p_{n+1}.$
The polynomial coefficients in p can be calculated for different purposes by functions
like polyint, polyder, and polyfit, but you can specify any vector for the coefficients.
To evaluate a polynomial in a matrix sense, use polyvalm instead.
[y,delta] = polyval(p,x,S) uses the optional output structure S produced by polyfit to
generate error estimates. delta is an estimate of the standard error in predicting a future observation
at x by p(x).
y = polyval(p,x,[],mu) or [y,delta] = polyval(p,x,S,mu) use the optional
output mu produced by polyfit to center and scale the data. mu(1) is mean(x), and mu(2) is std(x).
Using these values, polyval centers x at zero and scales it to have unit standard deviation,
$\hat{x} = \frac{x - \bar{x}}{\sigma_x}.$
This centering and scaling transformation improves the numerical properties of the polynomial.
Examples
Evaluate Polynomial at Several Points
Evaluate the polynomial $p(x) = 3x^2 + 2x + 1$ at the points x = 5, 7, 9. The polynomial coefficients can be represented by the vector [3 2 1].
p = [3 2 1];
x = [5 7 9];
y = polyval(p,x)
y = 1×3
86 162 262
Integrate Quartic Polynomial
Evaluate the definite integral
$I = \int_{-1}^{3} \left(3x^4 - 4x^2 + 10x - 25\right)dx.$
Create a vector to represent the polynomial integrand $3x^4 - 4x^2 + 10x - 25$. The $x^3$ term is absent and thus has a coefficient of 0.
p = [3 0 -4 10 -25];
Use polyint to integrate the polynomial using a constant of integration equal to 0.
q = polyint(p)
q = 1×6
0.6000  0  -1.3333  5.0000  -25.0000  0
Find the value of the integral by evaluating q at the limits of integration.
I = diff(polyval(q,[-1 3]))
I = 49.0667
Linear Regression With Error Estimate
Fit a linear model to a set of data points and plot the results, including an estimate of a 95% prediction interval.
Create a few vectors of sample data points (x,y). Use polyfit to fit a first degree polynomial to the
data. Specify two outputs to return the coefficients for the linear fit as well as the error estimation
structure.
x = 1:100;
y = -0.3*x + 2*randn(1,100);
[p,S] = polyfit(x,y,1);
Evaluate the first-degree polynomial fit in p at the points in x. Specify the error estimation structure as
the third input so that polyval calculates an estimate of the standard error. The standard error
estimate is returned in delta.
[y_fit,delta] = polyval(p,x,S);
Plot the original data, linear fit, and 95% prediction interval y±2Δ.
plot(x,y,'bo')
hold on
plot(x,y_fit,'r-')
plot(x,y_fit+2*delta,'m--',x,y_fit-2*delta,'m--')
title('Linear Fit of Data with 95% Prediction Interval')
legend('Data','Linear Fit','95% Prediction Interval')
Use Centering and Scaling to Improve Numerical Properties
Create a table of population data for the years 1750–2000 and plot the data points.
year = (1750:25:2000)';
pop = 1e6*[791 856 978 1050 1262 1544 1650 2532 6122 8170 11560]';
T = table(year, pop)
T=11×2 table
year pop
____ _________
1750 7.91e+08
1775 8.56e+08
1800 9.78e+08
1825 1.05e+09
1850 1.262e+09
1875 1.544e+09
1900 1.65e+09
1925 2.532e+09
1950 6.122e+09
1975 8.17e+09
2000 1.156e+10
plot(year,pop,'o')
Use polyfit with three outputs to fit a 5th-degree polynomial using centering and scaling, which
improves the numerical properties of the problem. polyfit centers the data in year at 0 and scales it
to have a standard deviation of 1, which avoids an ill-conditioned Vandermonde matrix in the fit
calculation.
[p,~,mu] = polyfit(year, pop, 5);
Use polyval with four inputs to evaluate p with the scaled years, (year-mu(1))/mu(2). Plot the
results against the original years.
f = polyval(p,year,[],mu);
hold on
plot(year,f)
hold off
Input Arguments
p — Polynomial coefficients
vector
Polynomial coefficients, specified as a vector. For example, the vector [1 0 1] represents the polynomial $x^2 + 1$, and the vector [3.13 -2.21 5.99] represents the polynomial $3.13x^2 - 2.21x + 5.99$.
For more information, see Create and Evaluate Polynomials.
Data Types: single | double
Complex Number Support: Yes
x — Query points
vector
Query points, specified as a vector. polyval evaluates the polynomial p at the points in x and returns
the corresponding function values in y.
Data Types: single | double
Complex Number Support: Yes
S — Error estimation structure
structure
Error estimation structure. This structure is an optional output from [p,S] = polyfit(x,y,n) that
can be used to obtain error estimates. S contains the following fields:
Field Description
R — Triangular R factor from a QR decomposition of the Vandermonde matrix of x
df — Degrees of freedom
normr — Norm of the residuals
Output Arguments
y — Function values
vector
Function values, returned as a vector of the same size as the query points x. The vector contains the
result of evaluating the polynomial p at each point in x.
delta — Standard error for prediction
vector
Standard error for prediction, returned as a vector of the same size as the query points x. Generally,
an interval of y ± Δ corresponds to a roughly 68% prediction interval for future observations of large
samples, and y ± 2Δ a roughly 95% prediction interval.
If the coefficients in p are least-squares estimates computed by polyfit, and the errors in the data
input to polyfit are independent, normal, and have constant variance, then y ± Δ is at least a 50%
prediction interval.
polyfit
Polynomial curve fitting
Syntax
p = polyfit(x,y,n)
[p,S] = polyfit(x,y,n)
[p,S,mu] = polyfit(x,y,n)
Description
p = polyfit(x,y,n) returns the coefficients for a polynomial p(x) of degree n that is a best fit (in a
least-squares sense) for the data in y. The coefficients in p are in descending powers, and the length
of p is n+1:
$p(x) = p_1 x^n + p_2 x^{n-1} + \ldots + p_n x + p_{n+1}.$
[p,S] = polyfit(x,y,n) also returns a structure S that can be used as an input to polyval to
obtain error estimates.
[p,S,mu] = polyfit(x,y,n) performs centering and scaling to improve the numerical properties of
both the polynomial and the fitting algorithm. This syntax additionally returns mu, which is a two-
element vector with centering and scaling values. mu(1) is mean(x), and mu(2) is std(x). Using
these values, polyfit centers x at zero and scales it to have unit standard deviation,
$\hat{x} = \frac{x - \bar{x}}{\sigma_x}.$
Examples
Fit Polynomial to Trigonometric Function
Generate 10 points equally spaced along a sine curve in the interval [0,4*pi].
x = linspace(0,4*pi,10);
y = sin(x);
Use polyfit to fit a 7th-degree polynomial to the points.
p = polyfit(x,y,7);
Evaluate the polynomial on a finer grid and plot the results.
x1 = linspace(0,4*pi);
y1 = polyval(p,x1);
figure
plot(x,y,'o')
hold on
plot(x1,y1)
hold off
Fit Polynomial to Set of Points
Create a vector of 5 equally spaced points in the interval [0,1], and evaluate $y(x) = (1+x)^{-1}$ at those points.
x = linspace(0,1,5);
y = 1./(1+x);
Fit a polynomial of degree 4 to the 5 points. In general, for n points, you can fit a polynomial of
degree n-1 to exactly pass through the points.
p = polyfit(x,y,4);
Evaluate the original function and the polynomial fit on a finer grid of points between 0 and 2.
x1 = linspace(0,2);
y1 = 1./(1+x1);
f1 = polyval(p,x1);
Plot the function values and the polynomial fit in the wider interval [0,2], with the points used to
obtain the polynomial fit highlighted as circles. The polynomial fit is good in the
original [0,1] interval, but quickly diverges from the fitted function outside of that interval.
figure
plot(x,y,'o')
hold on
plot(x1,y1)
plot(x1,f1,'r--')
legend('y','y1','f1')
Fit Polynomial to Error Function
First generate a vector of x points, equally spaced in the interval [0,2.5], and then
evaluate erf(x) at those points.
x = (0:0.1:2.5)';
y = erf(x);
Determine the coefficients of the approximating polynomial of degree 6.
p = polyfit(x,y,6)
p = 1×7
To see how good the fit is, evaluate the polynomial at the data points and generate a table showing
the data, fit, and error.
f = polyval(p,x);
T = table(x,y,f,y-f,'VariableNames',{'X','Y','Fit','FitError'})
T=26×4 table
X Y Fit FitError
___ _______ __________ ___________
0 0 0.00044117 -0.00044117
0.1 0.11246 0.11185 0.00060836
0.2 0.2227 0.22231 0.00039189
0.3 0.32863 0.32872 -9.7429e-05
0.4 0.42839 0.4288 -0.00040661
0.5 0.5205 0.52093 -0.00042568
0.6 0.60386 0.60408 -0.00022824
0.7 0.6778 0.67775 4.6383e-05
0.8 0.7421 0.74183 0.00026992
0.9 0.79691 0.79654 0.00036515
1 0.8427 0.84238 0.0003164
1.1 0.88021 0.88005 0.00015948
1.2 0.91031 0.91035 -3.9919e-05
1.3 0.93401 0.93422 -0.000211
1.4 0.95229 0.95258 -0.00029933
1.5 0.96611 0.96639 -0.00028097
⋮
In this interval, the interpolated values and the actual values agree fairly closely. Create a plot to show
how outside this interval, the extrapolated values quickly diverge from the actual data.
x1 = (0:0.1:5)';
y1 = erf(x1);
f1 = polyval(p,x1);
figure
plot(x,y,'o')
hold on
plot(x1,y1,'-')
plot(x1,f1,'r--')
axis([0 5 0 2])
hold off
Simple Linear Regression
Fit a simple linear regression model to a set of discrete 2-D data points.
Create a few vectors of sample data points (x,y). Fit a first degree polynomial to the data.
x = 1:50;
y = -0.3*x + 2*randn(1,50);
p = polyfit(x,y,1);
Evaluate the fitted polynomial p at the points in x. Plot the resulting linear regression model with the
data.
f = polyval(p,x);
plot(x,y,'o',x,f,'-')
legend('data','linear fit')
Input Arguments
x — Query points
vector
Query points, specified as a vector. The points in x correspond to the fitted function values contained
in y. If x is not a vector, then polyfit converts it into a column vector x(:).
Warning messages result when x has repeated (or nearly repeated) points or if x might need
centering and scaling.
Data Types: single | double
Complex Number Support: Yes
y — Fitted values at query points
vector
Fitted values at query points, specified as a vector. The values in y correspond to the query points
contained in x. If y is not a vector, then polyfit converts it into a column vector y(:).
Data Types: single | double
Complex Number Support: Yes
n — Degree of polynomial fit
positive integer scalar
Degree of polynomial fit, specified as a positive integer scalar. n specifies the polynomial power of
the left-most coefficient in p.
Output Arguments
p — Least-squares fit polynomial coefficients
vector
Least-squares fit polynomial coefficients, returned as a vector. p has length n+1 and contains the
polynomial coefficients in descending powers, with the highest power being n. If
either x or y contain NaN values and n < length(x), then all elements in p are NaN. If you specify
three output arguments to center and scale the data, then polyfit returns different coefficients
in p compared to when the data is not centered and scaled.
Use polyval to evaluate p at query points.
S — Error estimation structure
structure
Error estimation structure. This optional output structure is primarily used as an input to
the polyval function to obtain error estimates. S contains the following fields:
Field Description
R — Triangular R factor from a QR decomposition of the Vandermonde matrix of x
df — Degrees of freedom
normr — Norm of the residuals
Limitations
• In problems with many points, increasing the degree of the polynomial fit
using polyfit does not always result in a better fit. High-order polynomials can be
oscillatory between the data points, leading to a poorer fit to the data. In those cases, you
might use a low-order polynomial fit (which tends to be smoother between points) or a
different technique, depending on the problem.
• Polynomials are unbounded, oscillatory functions by nature. Therefore, they are not well-
suited to extrapolating bounded data or monotonic (increasing or decreasing) data.
Algorithms
polyfit uses x to form the Vandermonde matrix V with n+1 columns and m = length(x) rows, resulting in the linear system
$$\begin{pmatrix} x_1^n & x_1^{n-1} & \cdots & 1 \\ x_2^n & x_2^{n-1} & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ x_m^n & x_m^{n-1} & \cdots & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ \vdots \\ p_{n+1} \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix},$$
which polyfit solves with p = V\y.
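A minimal sketch of this construction (the sample data is illustrative):
x = (0:0.5:2)';
y = [1; 2; 2.5; 3.5; 4];
n = 2;                   % degree of the fit
V = x.^(n:-1:0);         % Vandermonde matrix, built with implicit expansion
p = V\y;                 % least-squares coefficients, as polyfit computes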
roots
Polynomial roots
Syntax
r = roots(p)
Description
r = roots(p) returns the roots of the polynomial represented by p as a column vector. Input p is a vector containing n+1 polynomial coefficients, starting with the coefficient of $x^n$. A coefficient of 0 indicates an intermediate power that is not present in the equation. For example, p = [3 2 -2] represents the polynomial $3x^2 + 2x - 2$.
The roots function solves polynomial equations of the form $p_1 x^n + \ldots + p_n x + p_{n+1} = 0$. Polynomial equations contain a single variable with nonnegative exponents.
Examples
Roots of Quadratic Polynomial
Solve the equation $3x^2 - 2x - 4 = 0$.
Create a vector to represent the polynomial, then find the roots.
p = [3 -2 -4];
r = roots(p)
r = 2×1
1.5352
-0.8685
Roots of Quartic Polynomial
Solve the equation $x^4 - 1 = 0$.
Create a vector to represent the polynomial, then find the roots.
p = [1 0 0 0 -1];
r = roots(p)
r = 4×1 complex
-1.0000 + 0.0000i
0.0000 + 1.0000i
0.0000 - 1.0000i
1.0000 + 0.0000i
Input Arguments
p — Polynomial coefficients
vector
Polynomial coefficients, specified as a vector. For example, the vector [1 0 1] represents the polynomial $x^2 + 1$, and the vector [3.13 -2.21 5.99] represents the polynomial $3.13x^2 - 2.21x + 5.99$.
For more information, see Create and Evaluate Polynomials.
Data Types: single | double
Complex Number Support: Yes
Tips
• Use the poly function to obtain a polynomial from its roots: p = poly(r).
The poly function is the inverse of the roots function.
• Use the fzero function to find the roots of nonlinear equations. While the roots function
works only with polynomials, the fzero function is more broadly applicable to different
types of equations.
Algorithms
The roots function considers p to be a vector with n+1 elements representing the nth degree
characteristic polynomial of an n-by-n matrix, A. The roots of the polynomial are calculated by
computing the eigenvalues of the companion matrix, A.
A = diag(ones(n-1,1),-1);
A(1,:) = -p(2:n+1)./p(1);
r = eig(A)
The results produced are the exact eigenvalues of a matrix within roundoff error of the companion
matrix, A. However, this does not mean that they are the exact roots of a polynomial whose
coefficients are within roundoff error of those in p.
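For instance, a quick check of this construction against roots, using an illustrative cubic:
p = [1 -6 11 -6];             % (x-1)(x-2)(x-3)
n = numel(p) - 1;
A = diag(ones(n-1,1),-1);     % subdiagonal of ones
A(1,:) = -p(2:n+1)./p(1);     % first row from the coefficients
r = eig(A)                    % the roots 1, 2, 3, up to ordering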
Roots of Polynomials
This example shows several different methods to calculate the roots of a polynomial.
• Numeric Roots
• Roots Using Substitution
• Roots in a Specific Interval
• Symbolic Roots
Numeric Roots
The roots function calculates the roots of a single-variable polynomial represented by a
vector of coefficients.
For example, create a vector to represent the polynomial $x^2 - x - 6$, then calculate the roots.
p = [1 -1 -6];
r = roots(p)
r =
3
-2
By convention, MATLAB® returns the roots in a column vector.
The poly function converts the roots back to polynomial coefficients. When operating on
vectors, poly and roots are inverse functions, such that poly(roots(p)) returns p (up to
roundoff error, ordering, and scaling).
p2 = poly(r)
p2 =
1 -1 -6
When operating on a matrix, the poly function computes the characteristic polynomial of the
matrix. The roots of the characteristic polynomial are the eigenvalues of the matrix.
Therefore, roots(poly(A)) and eig(A) return the same answer (up to roundoff error,
ordering, and scaling).
Roots Using Substitution
You can solve polynomial equations involving trigonometric functions by simplifying the
equation using a substitution. The resulting polynomial of one variable no longer contains
any trigonometric functions.
For example, find the values of θ that solve the equation
$3\cos^2(\theta) - \sin(\theta) + 3 = 0.$
Use the fact that $\cos^2(\theta) = 1 - \sin^2(\theta)$ to express the equation entirely in terms of sine functions:
$-3\sin^2(\theta) - \sin(\theta) + 6 = 0.$
Use the substitution $x = \sin(\theta)$ to express the equation as a simple polynomial equation:
$-3x^2 - x + 6 = 0.$
Create a vector to represent the polynomial.
p = [-3 -1 6];
Find the roots of the polynomial.
r = roots(p)
r = 2×1
-1.5907
1.2573
To undo the substitution, use $\theta = \sin^{-1}(x)$. The asin function calculates the inverse sine.
theta = asin(r)
theta = 2×1 complex
-1.5708 + 1.0395i
1.5708 - 0.7028i
Verify that the elements in theta are the values of θ that solve the original equation (within
roundoff error).
f = @(Z) 3*cos(Z).^2 - sin(Z) + 3;
f(theta)
ans = 2×1 complex
1.0e-14 *
-0.0888 + 0.0647i
0.2665 + 0.0399i
Roots in a Specific Interval
Use the fzero function to find the roots of a polynomial in a specific interval. Among other
uses, this method is suitable if you plot the polynomial and want to know the value of a
particular root.
For example, create a function handle to represent the polynomial $3x^7 + 4x^6 + 2x^5 + 4x^4 + x^3 + 5x^2$.
p = @(x) 3*x.^7 + 4*x.^6 + 2*x.^5 + 4*x.^4 + x.^3 + 5*x.^2;
Plot the function over the interval [−2,1].
x = -2:0.1:1;
plot(x,p(x))
ylim([-100 50])
grid on
hold on
From the plot, the polynomial has a trivial root at 0 and another near -1.5. Use fzero to
calculate and plot the root that is near -1.5.
Z = fzero(p, -1.5)
Z = -1.6056
plot(Z,p(Z),'r*')
Symbolic Roots
If you have Symbolic Math Toolbox™, then there are additional options for evaluating
polynomials symbolically. One way is to use the solve (Symbolic Math Toolbox) function.
syms x
s = solve(x^2-x-6)
s =
-2
3
Another way is to use the factor (Symbolic Math Toolbox) function to factor the polynomial
terms.
F = factor(x^2-x-6)
F =
[ x + 2, x - 3]
See Solve Algebraic Equations (Symbolic Math Toolbox) for more information.
Newton-Raphson Method
The Newton-Raphson method uses one initial approximation to solve a given equation y = f(x). In this method the function f(x) is approximated by a tangent line, whose equation is found from the value of f(x) and its first derivative at the initial approximation.
The tangent line then intersects the x-axis at a second point. This second point is again used as the next approximation to find the third point, giving the iteration
$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$
The script below proceeds in the same way, performing up to a user-specified maximum number of steps (for example, 100). The required accuracy (tolerable error) is taken as input from the user. The error at each iteration is checked, and if it is less than the required accuracy, the iterations are stopped.
% Clearing Screen
clc
% Declaring the symbolic variable used in the input equation
syms x
% Input Section
y = input('Enter non-linear equation: ');
a = input('Enter initial guess: ');
e = input('Tolerable error: ');
N = input('Enter maximum number of steps: ');
% Finding the derivative of the function symbolically
g = diff(y);
% Initializing step counter
step = 1;
% Evaluating f at the initial guess before entering the loop
fa = eval(subs(y,x,a));
while abs(fa) > e
    fa = eval(subs(y,x,a));   % f(a)
    ga = eval(subs(g,x,a));   % f'(a)
    if ga == 0
        disp('Division by zero.');
        break;
    end
    % Newton-Raphson update
    b = a - fa/ga;
    fprintf('step=%d\ta=%f\tf(a)=%f\n',step,a,fa);
    a = b;
    if step > N
        disp('Not convergent');
        break;
    end
    step = step + 1;
end
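A sample session might look like the following (the equation and values are illustrative):
Enter non-linear equation: x^3 - x - 1
Enter initial guess: 1.5
Tolerable error: 1e-6
Enter maximum number of steps: 100
The iteration then converges to the real root near x = 1.3247.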