6.
252 NONLINEAR PROGRAMMING
LECTURE 8
OPTIMIZATION OVER A CONVEX SET;
OPTIMALITY CONDITIONS
Problem: minx∈X f (x), where:
(a) X ⊂ n is nonempty, convex, and closed.
(b) f is continuously differentiable over X.
• Local and global minima. If f is convex local
minima are also global.
f(x)
Local Minima Global Minimum
x
X
OPTIMALITY CONDITION
Proposition (Optimality Condition)
(a) If x∗ is a local minimum of f over X, then
∇f (x∗ ) (x − x∗ ) ≥ 0, ∀ x ∈ X.
(b) If f is convex over X, then this condition is
also sufficient for x∗ to minimize f over X.
Constraint set X
∇f(x*)
At a local minimum x∗ ,
x
the gradient ∇f (x∗ ) makes
an angle less than or equal
x* to 90 degrees with all fea-
sible variations x−x∗ , x ∈
Surfaces of equal cost f(x)
X.
∇f(x*)
Constraint set X
Illustration of failure of the
optimality condition when
x* X is not convex. Here x∗
x
is a local min but we have
∇f (x∗ ) (x − x∗ ) < 0 for
the feasible vector x shown.
PROOF
Proof: (a) Suppose that ∇f (x∗ ) (x − x∗ ) < 0 for
some x ∈ X. By the Mean Value Theorem, for
every > 0 there exists an s ∈ [0, 1] such that
f x +(x−x ) = f (x )+∇f x +s(x−x ) (x−x∗ ).
∗ ∗ ∗ ∗ ∗
Since ∇f is continuous, for suff. small > 0,
∇f x∗ + s(x − ∗
x ) (x − x∗ ) < 0
so that f + (x −
x∗ x∗ )
< f (x∗ ). The vector
x∗ + (x − x∗ ) is feasible for all ∈ [0, 1] because
X is convex, so the local optimality of x∗ is con-
tradicted.
(b) Using the convexity of f
f (x) ≥ f (x∗ ) + ∇f (x∗ ) (x − x∗ )
for every x ∈ X. If the condition ∇f (x∗ ) (x−x∗ ) ≥
0 holds for all x ∈ X, we obtain f (x) ≥ f (x∗ ), so
x∗ minimizes f over X. Q.E.D.
OPTIMIZATION SUBJECT TO BOUNDS
• Let X = {x | x ≥ 0}. Then the necessary
condition for x∗ = (x∗1 , . . . , x∗n ) to be a local min is
n
∂f (x∗ )
(xi − x∗i ) ≥ 0, ∀ xi ≥ 0, i = 1, . . . , n.
∂xi
i=1
• Fix i. Let xj = x∗j for j = i and xi = x∗i + 1:
∂f (x∗ )
≥ 0, ∀ i.
∂xi
• If x∗i > 0, let also xj = x∗j for j = i and xi = 12 x∗i .
Then ∂f (x∗ )/∂xi ≤ 0, so
∂f (x∗ )
= 0, if x∗i > 0.
∂xi
∇f(x*) ∇f(x*)
x* x* = 0
OPTIMIZATION OVER A SIMPLEX
n
X = x x ≥ 0, xi = r
i=1
where r > 0 is a given scalar.
• Necessary condition for x∗ = (x∗1 , . . . , x∗n ) to be
a local min:
n
∂f (x∗ )
n
(xi −x∗i ) ≥ 0, ∀ xi ≥ 0 with xi = r.
∂xi
i=1 i=1
• Fix i with x∗i > 0 and let j be any other index.
Use x with xi = 0, xj = x∗j + x∗i , and xm = x∗m for
all m = i, j:
∂f (x∗ ) ∂f (x∗ )
− x∗i ≥ 0,
∂xj ∂xi
∂f (x∗ ) ∂f (x∗ )
x∗i >0 =⇒ ≤ , ∀ j.
∂xi ∂xj
OPTIMAL ROUTING
• Given a data net, and a set W of OD pairs
w = (i, j). Each OD pair w has input traffic rw .
x1
Origin of Destination of
OD pair w x2 OD pair w
rw x3 rw
x4
• Optimal routing problem:
minimize D(x) = Dij xp
(i,j) all paths p
containing (i,j)
subject to xp = rw , ∀ w ∈ W,
p∈Pw
xp ≥ 0, ∀ p ∈ Pw , w ∈ W
• Optimality condition
∂D(x∗ ) ∂D(x∗ )
x∗p > 0 =⇒ ≤ , ∀ p ∈ Pw .
∂xp ∂xp
TRAFFIC ASSIGNMENT
• Transportation network with OD pairs w. Each
w has paths p ∈ Pw and traffic rw . Let xp be the
flow of path p and let Tij x
p: crossing (i,j) p
be the travel time of link (i, j).
• User-optimization principle: Traffic equilibrium
is established when each user of the network chooses,
among all available paths, a path requiring mini-
mum travel time, i.e., for all w ∈ W and paths
p ∈ Pw ,
x∗p > 0 =⇒ tp (x∗ ) ≤ tp (x∗ ), ∀ p ∈ Pw , ∀ w ∈ W
where tp (x), is the travel time of path p
tp (x) = Tij (Fij ), ∀ p ∈ Pw , ∀ w ∈ W.
all arcs (i,j)
on path p
Identical with the optimality condition of the rout-
ing problem if we identify the arc travel time Tij (Fij )
with the cost derivative Dij (F ).
ij
PROJECTION OVER A CONVEX SET
• Let z ∈ n and a closed convex set X be given.
Problem:
minimize f (x) = z − x 2
subject to x ∈ X.
Proposition (Projection Theorem) Problem
has a unique solution [z]+ (the projection of z).
Necessary and sufficient con-
Constraint set X dition for x∗ to be the pro-
jection. The angle between
x - x*
x z − x∗ and x − x∗ should
x*
be greater or equal to 90
z - x*
degrees for all x ∈ X, or
(z − x∗ ) (x − x∗ ) ≤ 0
z
• If X is a subspace, z − x∗ ⊥ X.
• The mapping f : n → X defined by f (x) =
[x]+ is continuous and nonexpansive, that is,
[x]+ − [y]+ ≤ x − y , ∀ x, y ∈ n .