MINISTRY OF INDUSTRY & TRADE
INDUSTRIAL UNIVERSITY OF HO CHI MINH CITY
Engineering Optimization
Unconstrained Problems:
Multiple-Variable Methods (2nd part)
Lecturer : Dr. Vo Quoc Thang
Contents
● Unconstrained Optimization:
Methods for Multiple Variables
– 0th order
– 1st order methods
– 2nd order methods
– Quasi-Newton Methods
● Constrained Optimization: Optimality Criteria
Multivariate Unconstrained Optimization Algorithms (1)
● 0th order (direct search):
– Random methods: random jumping / random walk / simulated annealing (S.A.)
– Cyclic coordinate search
– Powell’s Conjugate Directions method
– Nelder and Mead Simplex method
– Biologically inspired algorithms (GA, swarms, …)
● Conclusions:
– Simplex method usually best
– Inefficient for N>10
Multivariate Unconstrained Optimization Algorithms (2)
● 1st order methods (descent methods):
– Steepest descent method (with line search)
– Fletcher-Reeves Conjugate Gradient (CG) method
● Conclusions (for now):
– Scaling important for Steepest descent (zig-zag path in the (y1, y2) plane)
– For quadratic problem, CG converges in N steps
Fletcher-Reeves conjugate gradient method
● Based on building a set of conjugate directions, combined with line searches
● Conjugate directions: 𝒅𝑖ᵀ𝑨 𝒅𝑗 = 0 for 𝑖 ≠ 𝑗
● Conjugate directions: guaranteed convergence in N steps for quadratic problems (recall Powell: N cycles of N line searches)
● Quadratic function: 𝑓(𝒙) = ½ 𝒙ᵀ𝑨𝒙 + 𝒃ᵀ𝒙 + 𝑐
CG practical
1. Start with an arbitrary initial point X1
2. Set the first search direction: 𝑺1 = −∇𝑓1 = −∇𝑓(𝑿1 )
3. Find the point 𝑿2 according to the relation: 𝑿2 = 𝑿1 + 𝜆1∗ 𝑺1
where 𝜆1∗ is the optimal step length in the direction 𝑺1 . Set 𝑖 = 2 and go to
the next step.
4. Find ∇𝑓𝑖 = ∇𝑓(𝑿𝑖 ), and set: 𝑺𝑖 = −∇𝑓𝑖 + (‖∇𝑓𝑖 ‖² / ‖∇𝑓𝑖−1 ‖²) 𝑺𝑖−1
5. Compute the optimum step length 𝜆∗𝑖 in the direction 𝑺𝑖 , and find the
new point: 𝑿𝑖+1 = 𝑿𝑖 + 𝜆∗𝑖 𝑺𝑖
6. Test for the optimality of the point 𝑋𝑖+1 . If 𝑋𝑖+1 is optimum, stop the
process. Otherwise, set the value of 𝑖 = 𝑖 + 1 and go to step 4.
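A minimal NumPy sketch of steps 1–6 (not part of the original slides): the function names are illustrative, and the exact line search of steps 3 and 5 is delegated to SciPy's 1-D minimizer as a stand-in for an analytical step length.

```python
import numpy as np
from scipy.optimize import minimize_scalar   # used only for the 1-D line search

def fletcher_reeves(f, grad, x0, eps=1e-6, max_iter=100):
    """Fletcher-Reeves CG following steps 1-6 above."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    s = -g                                                # step 2: first direction = steepest descent
    for _ in range(max_iter):
        lam = minimize_scalar(lambda a: f(x + a * s)).x   # steps 3/5: optimal step length
        x = x + lam * s                                   # new point
        g_new = grad(x)
        if np.linalg.norm(g_new) <= eps:                  # step 6: optimality test
            break
        beta = (g_new @ g_new) / (g @ g)                  # step 4: ||grad_i||^2 / ||grad_{i-1}||^2
        s = -g_new + beta * s
        g = g_new
    return x

# Quadratic example used later in this lecture; CG converges in N = 2 line searches
f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])
print(fletcher_reeves(f, grad, [0.0, 0.0]))   # -> approximately [-1.0, 1.5]
```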
Unconstrained optimization algorithms
● Single-variable methods
● Multiple variable methods
– 0th order
– 1st order
– 2nd order
Newton’s method
● Concept:
– Construct local quadratic approximation
– Minimize approximation
– Repeat
● Local approximation: 2nd order Taylor series:
𝑓(𝒙 + Δ𝒙) ≈ 𝑓(𝒙) + (∇𝑓)ᵀΔ𝒙 + ½ Δ𝒙ᵀ𝐇Δ𝒙 ≡ 𝑓̃(𝒙)
● First-order optimality condition applied to the local approximation to find the minimum:
∇𝑓̃(𝒙) = 𝟎 ⟹ ∇𝑓 + 𝐇Δ𝒙 = 𝟎 ⟹ Δ𝒙 = −𝐇⁻¹∇𝑓
Newton’s method (2)
● Step: 𝐬 = −𝐇⁻¹∇𝑓 (gradient and Hessian evaluated at 𝒙𝑘 )
● Update: 𝒙𝑘+1 = 𝒙𝑘 + 𝒔𝑘
● Note:
– Finds minimum of quadratic functions in 1 step! More iterations for a
general nonlinear function.
– Provides a step length as well, embedded in 𝒔𝑘 (no separate line search needed).
– If H not positive definite, or starting point is not close to the optimum,
divergence occurs.
– The method is impractical for problems involving a complicated
objective function with a large number of variables
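A minimal sketch of the pure Newton iteration above (not from the slides): names and tolerances are illustrative, and the linear system 𝐇𝐬 = −∇𝑓 is solved instead of forming 𝐇⁻¹ explicitly.

```python
import numpy as np

def newton(grad, hess, x0, eps=1e-8, max_iter=50):
    """Pure Newton iteration: x_{k+1} = x_k - H(x_k)^{-1} grad(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            break
        s = np.linalg.solve(hess(x), -g)   # Newton step from the local quadratic model
        x = x + s
    return x

# Quadratic example from this lecture: the minimum (-1, 1.5) is found in a single step
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])
hess = lambda x: np.array([[4.0, 2.0], [2.0, 2.0]])
print(newton(grad, hess, [0.0, 0.0]))
```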
Newton’s method (3)
When the starting point is close to the optimal point, following the gradient can be inefficient. Newton's method instead attempts to search along a direct path to the optimum (solid line). The quadratic predictions can overshoot, because the actual function has more curvature than predicted by the quadratic approximation.
Newton’s method (4)
● To avoid divergence: line search.
Search direction: 𝐬 = −𝐇⁻¹∇𝑓
● Update: 𝒙𝑘+1 = 𝒙𝑘 + 𝛼𝑘 𝒔𝑘
● Newton's method has quadratic convergence (best!) close to the optimum:
‖𝒙𝑘+1 − 𝒙∗ ‖ ≤ 𝑐 ‖𝒙𝑘 − 𝒙∗ ‖²
(figure: error versus iteration 1–4 on a logarithmic scale, falling from 10⁰ to below 10⁻¹⁵)
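A hedged sketch of this safeguarded update 𝒙𝑘+1 = 𝒙𝑘 + 𝛼𝑘 𝒔𝑘 , using a simple backtracking (Armijo) rule to pick 𝛼𝑘 ; the backtracking constants are illustrative choices, not from the slides.

```python
import numpy as np

def damped_newton(f, grad, hess, x0, eps=1e-8, max_iter=50):
    """Newton direction s = -H^{-1} grad, step length alpha_k from backtracking."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            break
        s = np.linalg.solve(hess(x), -g)            # Newton search direction
        alpha = 1.0
        while alpha > 1e-12 and f(x + alpha * s) > f(x) + 1e-4 * alpha * (g @ s):
            alpha *= 0.5                            # shrink until sufficient decrease (Armijo)
        x = x + alpha * s
    return x
```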
Example 1
Show that the Newton’s method finds the minimum of a quadratic
function in one iteration.
Solution: Let the quadratic function be given by
𝑓(𝒙) = ½ 𝒙ᵀ[𝐴]𝒙 + 𝑩ᵀ𝒙 + 𝐶
The minimum of 𝑓(𝒙) is given by
∇𝑓 = [𝐴]𝒙 + 𝑩 = 𝟎 → 𝒙∗ = −[𝐴]⁻¹𝑩
Since the Hessian is 𝐻 = [𝐴], the Newton update gives
𝒙𝑘+1 = 𝒙𝑘 − 𝐻⁻¹∇𝑓𝑘 = 𝒙𝑘 − [𝐴]⁻¹([𝐴]𝒙𝑘 + 𝑩)
where 𝒙𝑘 is the starting point for the 𝑘th iteration. Thus the above equation gives the exact solution in a single step:
𝒙𝑘+1 = 𝒙∗ = −[𝐴]⁻¹𝑩
Example 2
Minimize 𝑓(𝑥1 , 𝑥2 ) = 𝑥1 − 𝑥2 + 2𝑥1² + 2𝑥1 𝑥2 + 𝑥2² with Newton's method, taking the starting point as 𝑿1 = [0; 0].
Solution: To find 𝑿2 , we require 𝐻⁻¹ (matrices below are written row-wise, rows separated by semicolons), where
𝐻 = [𝜕²𝑓/𝜕𝑥1² 𝜕²𝑓/𝜕𝑥1 𝜕𝑥2 ; 𝜕²𝑓/𝜕𝑥2 𝜕𝑥1 𝜕²𝑓/𝜕𝑥2²] = [4 2; 2 2]
Therefore
𝐻⁻¹ = (1/4)[2 −2; −2 4] = [1/2 −1/2; −1/2 1]
Example 2
As
∇𝑓1 = [𝜕𝑓/𝜕𝑥1 ; 𝜕𝑓/𝜕𝑥2 ]|𝑿1 = [1 + 4𝑥1 + 2𝑥2 ; −1 + 2𝑥1 + 2𝑥2 ]|𝑿1 = [1; −1]
Thus
𝑿2 = 𝑿1 − 𝐻⁻¹∇𝑓1 = [0; 0] − [1/2 −1/2; −1/2 1][1; −1] = [−1; 3/2]
To see whether or not 𝑿2 is the optimum point, we evaluate
∇𝑓2 = [1 + 4𝑥1 + 2𝑥2 ; −1 + 2𝑥1 + 2𝑥2 ]|𝑿2 = [0; 0]
Since ∇𝑓2 = 𝟎, 𝑿2 is an optimum point. Thus the method has converged
in one iteration for this quadratic function.
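A quick numerical check of this example (a sketch, not part of the original slides): one Newton step from 𝑿1 = [0; 0] reproduces 𝑿2 = [−1; 3/2] and a zero gradient.

```python
import numpy as np

H = np.array([[4.0, 2.0], [2.0, 2.0]])                    # Hessian of f
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])

x1 = np.array([0.0, 0.0])
x2 = x1 - np.linalg.inv(H) @ grad(x1)                      # single Newton step
print(x2, grad(x2))   # [-1.   1.5] [0. 0.]  -> optimum in one iteration
```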
Marquardt method
● Problems in Newton method:
– Bad convergence / divergence when far from optimum
– What to do when H singular / not positive definite?
● Remedy: use a modified Hessian 𝐇̃ = 𝐇 + 𝛼𝐈, with 𝛼 a positive constant such that 𝐇̃ is positive definite
● Levenberg-Marquardt method: start with a large 𝛼𝑘 and decrease it gradually:
𝛼 large: 𝐬𝑘 = −[𝐇 + 𝛼𝑘 𝐈]⁻¹∇𝑓 ≈ −(1/𝛼𝑘 )∇𝑓 → steepest descent direction
𝛼 small: 𝐬𝑘 = −[𝐇 + 𝛼𝑘 𝐈]⁻¹∇𝑓 ≈ −𝐇⁻¹∇𝑓 → Newton step
Marquardt method : Iterative procedure
1. Start with an arbitrary initial point 𝑿1 and constants 𝛼1 (on the order of 10⁴), 𝑐1 (0 < 𝑐1 < 1), 𝑐2 (𝑐2 > 1), and 𝜀 (on the order of 10⁻²). Set the iteration number as 𝑖 = 1.
2. Compute the gradient of the function, ∇𝑓𝑖 = ∇𝑓(𝑿𝑖 ).
3. Test for optimality of the point 𝑋𝑖 . If ||∇𝑓𝑖 || = ||∇𝑓(𝑋𝑖 )|| ≤ 𝜀, 𝑿𝑖 is
optimum and hence stop the process. Otherwise, go to step 4.
4. Find the new vector 𝑿𝑖+1 as
𝑿𝑖+1 = 𝑿𝑖 + 𝑺𝑖 = 𝑿𝑖 − [𝐻𝑖 + 𝛼𝑖 𝐼]⁻¹∇𝑓𝑖
5. Compare the values of 𝑓𝑖+1 and 𝑓𝑖 . If 𝑓𝑖+1 < 𝑓𝑖 , go to step 6. If 𝑓𝑖+1 ≥ 𝑓𝑖 , go to step 7.
6. Set 𝛼𝑖+1 = 𝑐1 𝛼𝑖 , 𝑖 = 𝑖 + 1, and go to step 2.
7. Set 𝛼𝑖 = 𝑐2 𝛼𝑖 and go to step 4.
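A minimal NumPy sketch of steps 1–7 above (not part of the original slides; function names are illustrative and the constants follow Example 3 below):

```python
import numpy as np

def marquardt(f, grad, hess, x0, alpha=1e4, c1=0.25, c2=2.0, eps=1e-2, max_iter=100):
    """Marquardt iteration: S_i = -(H_i + alpha_i I)^{-1} grad_i with adaptive alpha."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:                      # step 3: optimality test
            break
        while True:
            x_new = x - np.linalg.solve(hess(x) + alpha * np.eye(n), g)   # step 4
            if f(x_new) < f(x):                           # steps 5-6: accept, decrease alpha
                alpha *= c1
                x = x_new
                break
            alpha *= c2                                   # step 7: reject, increase alpha, retry
    return x

f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])
hess = lambda x: np.array([[4.0, 2.0], [2.0, 2.0]])
print(marquardt(f, grad, hess, [0.0, 0.0]))               # tends towards (-1, 1.5)
```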
Example 3
Minimize 𝑓(𝑥1 , 𝑥2 ) = 𝑥1 − 𝑥2 + 2𝑥1² + 2𝑥1 𝑥2 + 𝑥2² from the starting point 𝑿1 = [0; 0] by using the Marquardt method with 𝛼1 = 10⁴, 𝑐1 = 1/4, 𝑐2 = 2, and 𝜀 = 10⁻².
Example 3 (2)
Solution:
Iteration 1 (𝒊 = 𝟏)
Here 𝑓1 = 𝑓(𝑿1 ) = 0.0 and
∇𝑓1 = [𝜕𝑓/𝜕𝑥1 ; 𝜕𝑓/𝜕𝑥2 ]|(0,0) = [1 + 4𝑥1 + 2𝑥2 ; −1 + 2𝑥1 + 2𝑥2 ]|(0,0) = [1; −1]
Since
𝐻1 = [𝜕²𝑓/𝜕𝑥1² 𝜕²𝑓/𝜕𝑥1 𝜕𝑥2 ; 𝜕²𝑓/𝜕𝑥2 𝜕𝑥1 𝜕²𝑓/𝜕𝑥2²]|(0,0) = [4 2; 2 2]
we compute
𝑿2 = 𝑿1 + 𝑺1 = 𝑿1 − [𝐻1 + 𝛼1 𝐼]⁻¹∇𝑓1 = [0; 0] − [4 + 10⁴ 2; 2 2 + 10⁴]⁻¹[1; −1] = 10⁻⁴[−0.9998; 1]
As 𝑓2 = 𝑓(𝑿2 ) = −1.9997 × 10⁻⁴ < 𝑓1 , we set 𝛼2 = 𝑐1 𝛼1 = 2500, 𝑖 = 2, and proceed to the next iteration.
Example 3 (3)
Iteration 2 (𝒊 = 𝟐)
The gradient vector corresponding to 𝑿2 is ∇𝑓2 = [0.9998; −1], with ‖∇𝑓2 ‖ = 1.4141 > 𝜀, and hence we compute
𝑿3 = 𝑿2 − [𝐻2 + 𝛼2 𝐼]⁻¹∇𝑓2 = [−0.9998 × 10⁻⁴; 1 × 10⁻⁴] − [2504 2; 2 2504]⁻¹[0.9998; −1] = [−4.9958 × 10⁻⁴; 5 × 10⁻⁴]
Since 𝑓3 = 𝑓(𝑿3 ) = −0.9993 × 10⁻³ < 𝑓2 , we set 𝛼3 = 𝑐1 𝛼2 = 625, 𝑖 = 3, and proceed to the next iteration. The iterative process is continued until the convergence criterion ‖∇𝑓𝑖 ‖ < 𝜀 is satisfied.
Newton summary
● 2nd order method: Newton's method
● Conclusions:
– Most efficient: solves quadratic problem in 1 step
– Not robust!
● Robustness improvements:
– Marquardt (blend of Steepest descent / Newton)
Quasi-Newton methods
● Drawbacks remain for Newton method:
– Need to evaluate H (often impractical)
– Storage of H, solving system of equations
● Alternatives: quasi-Newton methods:
– Based on approximating H (or H-1)
– Use only first-order derivative information
Newton step: 𝒙𝑘+1 = 𝒙𝑘 − 𝛼𝑘 𝐇⁻¹∇𝑓
Quasi-Newton step: 𝒙𝑘+1 = 𝒙𝑘 − 𝛼𝑘 𝐇̃⁻¹∇𝑓, with the approximation 𝐇̃ (or 𝐇̃⁻¹) updated every iteration
Davidon-Fletcher-Powell (DFP)
Iterative procedure
1. Start with an initial point 𝑿1 and an 𝑛 × 𝑛 positive definite symmetric matrix [𝐵1 ]
to approximate the inverse of the Hessian matrix of 𝑓. Usually, 𝐵1 is taken as
the identity matrix 𝐼 . Set the iteration number as 𝑖 = 1.
2. Compute the gradient of the function, ∇𝑓𝑖 = ∇𝑓(𝑿𝑖 ), and set
𝑺𝑖 = − 𝐵𝑖 ∇𝑓𝑖
3. Find the optimal step length 𝜆∗𝑖 in the direction 𝑺𝑖 and set
𝑿𝑖+1 = 𝑿𝑖 + 𝜆∗𝑖 𝑺𝑖
4. Test the new point 𝑿𝑖+1 for optimality. If 𝑿𝑖+1 is optimal, terminate the iterative
process. Otherwise, go to step 5.
5. Update the matrix 𝐵𝑖 as
𝐵𝑖+1 = 𝐵𝑖 + 𝑀𝑖 + 𝑁𝑖
𝑀𝑖 = 𝜆𝑖∗ (𝑺𝑖 𝑺𝑖ᵀ)/(𝑺𝑖ᵀ𝒈𝑖 ); 𝑁𝑖 = −([𝐵𝑖 ]𝒈𝑖 )([𝐵𝑖 ]𝒈𝑖 )ᵀ/(𝒈𝑖ᵀ[𝐵𝑖 ]𝒈𝑖 ); 𝒈𝑖 = ∇𝑓(𝑿𝑖+1 ) − ∇𝑓(𝑿𝑖 ) = ∇𝑓𝑖+1 − ∇𝑓𝑖
6. Set the new iteration number as 𝑖 = 𝑖 + 1, and go to step 2.
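A minimal sketch of the DFP procedure above ([𝐵] approximates the inverse Hessian); the exact line search of step 3 is delegated to SciPy's 1-D minimizer, an illustrative choice rather than part of the original procedure.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def dfp(f, grad, x0, eps=1e-6, max_iter=100):
    """DFP quasi-Newton method following steps 1-6 above."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                                    # step 1: B_1 = I approximates H^{-1}
    for _ in range(max_iter):
        gf = grad(x)
        if np.linalg.norm(gf) <= eps:                     # step 4: optimality test
            break
        S = -B @ gf                                       # step 2: search direction
        lam = minimize_scalar(lambda a: f(x + a * S)).x   # step 3: optimal step length
        x_new = x + lam * S
        g = grad(x_new) - gf                              # g_i = grad f_{i+1} - grad f_i
        M = lam * np.outer(S, S) / (S @ g)
        N = -np.outer(B @ g, B @ g) / (g @ B @ g)
        B = B + M + N                                     # step 5: DFP update
        x = x_new
    return x
```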
Example 4
Minimize 𝑓(𝑥1 , 𝑥2 ) = 𝑥1 − 𝑥2 + 2𝑥1² + 2𝑥1 𝑥2 + 𝑥2² from the starting point 𝑿1 = [0; 0] using the DFP method with
[𝐵1 ] = [1 0; 0 1], 𝜀 = 0.01
Example 4 (2)
Solution:
Iteration 1 (𝒊 = 𝟏)
Here
∇𝑓1 = ∇𝑓(𝑿1 ) = [𝜕𝑓/𝜕𝑥1 ; 𝜕𝑓/𝜕𝑥2 ]|(0,0) = [1 + 4𝑥1 + 2𝑥2 ; −1 + 2𝑥1 + 2𝑥2 ]|(0,0) = [1; −1]
and hence
𝑺1 = −[𝐵1 ]∇𝑓1 = −[1 0; 0 1][1; −1] = [−1; 1]
To find the minimizing step length 𝜆1∗ along 𝑺1 , we minimize
𝑓(𝑿1 + 𝜆1 𝑺1 ) = 𝑓([0; 0] + 𝜆1 [−1; 1]) = 𝑓(−𝜆1 , 𝜆1 ) = 𝜆1² − 2𝜆1
with respect to 𝜆1 . Since 𝑑𝑓/𝑑𝜆1 = 0 at 𝜆1∗ = 1, we obtain
𝑿2 = 𝑿1 + 𝜆1∗ 𝑺1 = [0; 0] + 1·[−1; 1] = [−1; 1]
Since ∇𝑓2 = ∇𝑓(𝑿2 ) = [−1; −1] and ‖∇𝑓2 ‖ = 1.4142 > 𝜀, we proceed to update the matrix [𝐵𝑖 ] by computing
𝒈1 = ∇𝑓2 − ∇𝑓1 = [−1; −1] − [1; −1] = [−2; 0]
𝑺1ᵀ𝒈1 = [−1 1][−2; 0] = 2; 𝑺1 𝑺1ᵀ = [−1; 1][−1 1] = [1 −1; −1 1]
Example 4 (3)
[𝐵1 ]𝒈1 = [1 0; 0 1][−2; 0] = [−2; 0]; ([𝐵1 ]𝒈1 )ᵀ = [−2 0]
𝒈1ᵀ[𝐵1 ]𝒈1 = [−2 0][1 0; 0 1][−2; 0] = [−2 0][−2; 0] = 4
𝑀1 = 𝜆1∗ (𝑺1 𝑺1ᵀ)/(𝑺1ᵀ𝒈1 ) = 1 · (1/2)[1 −1; −1 1] = [1/2 −1/2; −1/2 1/2]
𝑁1 = −([𝐵1 ]𝒈1 )([𝐵1 ]𝒈1 )ᵀ/(𝒈1ᵀ[𝐵1 ]𝒈1 ) = −(1/4)[−2; 0][−2 0] = −(1/4)[4 0; 0 0] = [−1 0; 0 0]
[𝐵2 ] = [𝐵1 ] + 𝑀1 + 𝑁1 = [0.5 −0.5; −0.5 1.5]
Example 4 (4)
Iteration 2 (𝒊 = 𝟐)
The next search direction is determined as
𝑺2 = −[𝐵2 ]∇𝑓2 = −[0.5 −0.5; −0.5 1.5][−1; −1] = [0; 1]
To find the minimizing step length 𝜆2∗ along 𝑺2 , we minimize
𝑓(𝑿2 + 𝜆2 𝑺2 ) = 𝑓([−1; 1] + 𝜆2 [0; 1]) = 𝑓(−1, 1 + 𝜆2 ) = 𝜆2² − 𝜆2 − 1
with respect to 𝜆2 . Since 𝑑𝑓/𝑑𝜆2 = 0 at 𝜆2∗ = 1/2, we obtain
𝑿3 = 𝑿2 + 𝜆2∗ 𝑺2 = [−1; 1] + (1/2)[0; 1] = [−1; 3/2]
This point can be identified as the optimum since
∇𝑓3 = [0; 0] and ‖∇𝑓3 ‖ = 0 < 𝜀
Broyden-Fletcher-Goldfarb-Shanno (BFGS)
Iterative procedure
1. Start with an initial point 𝑿1 and an 𝑛 × 𝑛 positive definite symmetric matrix [𝐵1 ]
as an initial estimate of the inverse of the Hessian matrix of 𝑓. In the absence
of additional information, 𝐵1 is taken as the identity matrix [𝐼]. Compute the
gradient vector ∇𝑓1 = ∇𝑓 𝑿1 and set the iteration number as 𝑖 = 1.
2. Compute the gradient of the function, ∇𝑓𝑖 = ∇𝑓(𝑿𝑖 ), and set
𝑺𝑖 = − 𝐵𝑖 ∇𝑓𝑖
3. Find the optimal step length 𝜆∗𝑖 in the direction 𝑺𝑖 and set
𝑿𝑖+1 = 𝑿𝑖 + 𝜆∗𝑖 𝑺𝑖
4. Test the new point 𝑿𝑖+1 for optimality. If ‖∇𝑓𝑖+1 ‖ ≤ 𝜀, where 𝜀 is a small quantity, take 𝑿∗ ≈ 𝑿𝑖+1 and terminate the process. Otherwise, go to step 5.
5. Update the approximation of the inverse Hessian as
[𝐵𝑖+1 ] = [𝐵𝑖 ] + (1 + 𝒈𝑖ᵀ[𝐵𝑖 ]𝒈𝑖 /𝒅𝑖ᵀ𝒈𝑖 ) (𝒅𝑖 𝒅𝑖ᵀ/𝒅𝑖ᵀ𝒈𝑖 ) − (𝒅𝑖 𝒈𝑖ᵀ[𝐵𝑖 ]/𝒅𝑖ᵀ𝒈𝑖 ) − ([𝐵𝑖 ]𝒈𝑖 𝒅𝑖ᵀ/𝒅𝑖ᵀ𝒈𝑖 )
𝒅𝑖 = 𝜆𝑖∗ 𝑺𝑖 ; 𝒈𝑖 = ∇𝑓(𝑿𝑖+1 ) − ∇𝑓(𝑿𝑖 ) = ∇𝑓𝑖+1 − ∇𝑓𝑖
6. Set the new iteration number as 𝑖 = 𝑖 + 1 and go to step 2.
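A minimal sketch of the BFGS (inverse-Hessian) update of step 5 above, in the same style as the DFP sketch; the exact line search again uses SciPy's 1-D minimizer as an illustrative stand-in.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def bfgs(f, grad, x0, eps=1e-6, max_iter=100):
    """BFGS quasi-Newton method following steps 1-6 above."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                                    # step 1: B_1 = I approximates H^{-1}
    for _ in range(max_iter):
        gf = grad(x)
        if np.linalg.norm(gf) <= eps:                     # step 4: optimality test
            break
        S = -B @ gf                                       # step 2: search direction
        lam = minimize_scalar(lambda a: f(x + a * S)).x   # step 3: optimal step length
        x_new = x + lam * S
        d = lam * S                                       # d_i = lambda_i* S_i
        g = grad(x_new) - gf                              # g_i = grad f_{i+1} - grad f_i
        dg = d @ g
        B = (B + (1 + (g @ B @ g) / dg) * np.outer(d, d) / dg
               - np.outer(d, g @ B) / dg
               - np.outer(B @ g, d) / dg)                 # step 5: BFGS update
        x = x_new
    return x
```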
Example 5
Minimize 𝑓(𝑥1 , 𝑥2 ) = 𝑥1 − 𝑥2 + 2𝑥1² + 2𝑥1 𝑥2 + 𝑥2² from the starting point 𝑿1 = [0; 0] using the BFGS method with
[𝐵1 ] = [1 0; 0 1], 𝜀 = 0.01
Example 5 (2)
Solution:
Iteration 1 (𝒊 = 𝟏)
Here
∇𝑓1 = ∇𝑓(𝑿1 ) = [𝜕𝑓/𝜕𝑥1 ; 𝜕𝑓/𝜕𝑥2 ]|(0,0) = [1 + 4𝑥1 + 2𝑥2 ; −1 + 2𝑥1 + 2𝑥2 ]|(0,0) = [1; −1]
and hence
𝑺1 = −[𝐵1 ]∇𝑓1 = −[1 0; 0 1][1; −1] = [−1; 1]
To find the minimizing step length 𝜆1∗ along 𝑺1 , we minimize
𝑓(𝑿1 + 𝜆1 𝑺1 ) = 𝑓([0; 0] + 𝜆1 [−1; 1]) = 𝑓(−𝜆1 , 𝜆1 ) = 𝜆1² − 2𝜆1
with respect to 𝜆1 . Since 𝑑𝑓/𝑑𝜆1 = 0 at 𝜆1∗ = 1, we obtain
𝑿2 = 𝑿1 + 𝜆1∗ 𝑺1 = [0; 0] + 1·[−1; 1] = [−1; 1]
Since ∇𝑓2 = ∇𝑓(𝑿2 ) = [−1; −1] and ‖∇𝑓2 ‖ = 1.4142 > 𝜀, we proceed to update the matrix [𝐵𝑖 ] by computing
𝒈1 = ∇𝑓2 − ∇𝑓1 = [−1; −1] − [1; −1] = [−2; 0]; 𝒅1 = 𝜆1∗ 𝑺1 = 1·[−1; 1] = [−1; 1]
𝒅1 𝒅1ᵀ = [−1; 1][−1 1] = [1 −1; −1 1]; 𝒅1ᵀ𝒈1 = [−1 1][−2; 0] = 2
Example 5 (3)
𝒅1 𝒈1ᵀ = [−1; 1][−2 0] = [2 0; −2 0]; 𝒈1 𝒅1ᵀ = [−2; 0][−1 1] = [2 −2; 0 0]
𝒈1ᵀ[𝐵1 ]𝒈1 = [−2 0][1 0; 0 1][−2; 0] = [−2 0][−2; 0] = 4
𝒅1 𝒈1ᵀ[𝐵1 ] = [2 0; −2 0][1 0; 0 1] = [2 0; −2 0]
[𝐵1 ]𝒈1 𝒅1ᵀ = [1 0; 0 1][2 −2; 0 0] = [2 −2; 0 0]
Update the inverse-Hessian approximation as
[𝐵2 ] = [1 0; 0 1] + (1 + 4/2)(1/2)[1 −1; −1 1] − (1/2)[2 0; −2 0] − (1/2)[2 −2; 0 0] = [1/2 −1/2; −1/2 5/2]
Example 5 (4)
Iteration 2 (𝒊 = 𝟐)
The next search direction is determined as
𝑺2 = −[𝐵2 ]∇𝑓2 = −[1/2 −1/2; −1/2 5/2][−1; −1] = [0; 2]
To find the minimizing step length 𝜆2∗ along 𝑺2 , we minimize
𝑓(𝑿2 + 𝜆2 𝑺2 ) = 𝑓([−1; 1] + 𝜆2 [0; 2]) = 𝑓(−1, 1 + 2𝜆2 ) = 4𝜆2² − 2𝜆2 − 1
with respect to 𝜆2 . Since 𝑑𝑓/𝑑𝜆2 = 0 at 𝜆2∗ = 1/4, we obtain
𝑿3 = 𝑿2 + 𝜆2∗ 𝑺2 = [−1; 1] + (1/4)[0; 2] = [−1; 3/2]
This point can be identified as the optimum since
∇𝑓3 = [0; 0] and ‖∇𝑓3 ‖ = 0 < 𝜀
Comparison on the Banana function
(figure: iteration paths of Newton, DFP, Nelder-Mead simplex, Levenberg-Marquardt, Steepest descent, and BFGS on the banana function)
● Nelder-Mead simplex: 210
● Steepest descent: >300 (no convergence)
● BFGS: 53
● DFP: 260
● Newton: 18
● Marquardt: 29
Summary multi-variable algorithms for
unconstrained optimization
● Theoretically nothing beats Newton, but:
– Expensive computations and storage (Hessian)
– Not robust
● Quasi-Newton (BFGS) best if gradients available
● 0th order methods robust and simple, but inefficient for n > 10
● CG inferior to quasi-Newton methods, but calculations simple:
more efficient for n > 1000
● Exact line search important!
● Variable scaling can have large impact