Convexity and Its Application in Constrained Optimization
Dr. B.B. Upadhyay
Assistant Professor
Department of Mathematics
Indian Institute of Technology Patna
E-mail: bhooshan@[Link]
Nonlinear programming problem with set constraints
(NLP)
Minimize f (x )
subject to x ∈ Ω,
where Ω ⊆ Rn and f : Rn → R is a continuously differentiable function.
The set Ω is called the constraint set of the nonlinear programming problem.
If Ω ⊂ Rn, the problem (NLP) is called a constrained nonlinear programming problem.
If Ω = Rn, the problem (NLP) is called an unconstrained nonlinear programming problem.
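To make the distinction concrete, the following Python sketch (illustration only; the objective and the constraint set Ω below are assumed, not taken from the slides) minimizes the same smooth function once over Ω = R² and once over a constrained set using SciPy.

```python
# A minimal sketch of (NLP): unconstrained vs. constrained minimization.
# The objective f and the set Omega are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2   # smooth objective

# Unconstrained case: Omega = R^2
unc = minimize(f, x0=np.zeros(2))
print("unconstrained minimizer:", unc.x)               # approx. (2, -1)

# Constrained case: Omega = {x in R^2 : x1 >= 0, x2 >= 0}
con = minimize(f, x0=np.ones(2), bounds=[(0, None), (0, None)])
print("constrained minimizer:", con.x)                 # approx. (2, 0)
```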
Definition(Local minima)
Let f : Rn → R be a real valued function defined on some set Ω ⊆ Rn. A
point x∗ is a local minimizer of f over Ω if there exists ε > 0 such that
f(x) ≥ f(x∗), ∀x ∈ Ω ∩ B(ε, x∗) \ {x∗}.
Definition(Global minima)
A point x ∗ is a global minimizer of f over Ω, if
f (x ) ≥ f (x ∗ ), ∀x ∈ Ω \ {x ∗ }.
Definition (Feasible directions)
A vector d ∈ Rn, d ≠ 0, is a feasible direction at x ∈ Ω if there exists
α0 > 0, such that
x + αd ∈ Ω, ∀α ∈ [0, α0 ].
Figure: Feasible Direction
Definition (Partial derivatives)
Let Ω be an open subset of Rn . f : Ω → R is said to have a partial
derivative at x̄ ∈ Ω with respect to xi , i = 1, . . . , n, if
[f(x̄1, . . . , x̄i−1, x̄i + δ, x̄i+1, . . . , x̄n) − f(x̄1, . . . , x̄n)] / δ
tends to a finite limit as δ → 0. This limit is called the partial derivative of
f with respect to xi at x̄ and is denoted by ∂f(x̄)/∂xi.
Definition (Gradient of a function)
The n-dimensional vector of the partial derivatives of f with respect to
x1 , x2 , . . . , xn at x̄ is called the gradient of f at x̄ and is denoted by
∇f (x̄ ), that is
∇f(x̄) = ( ∂f(x̄)/∂x1, . . . , ∂f(x̄)/∂xn ).
Definition (Differentiable function)
Let Ω be an open subset of Rn . The function f : Ω → R is said to be
differentiable at x̄ if for all x ∈ Rn such that x̄ + x ∈ Ω, we have
f(x̄ + x) = f(x̄) + ∇f(x̄)^T x + α(x̄, x)‖x‖,
where ∇f(x̄) is an n-dimensional bounded vector and α is a real valued
function such that lim_{x→0} α(x̄, x) = 0.
Definition (Directional derivative)
Let f : Ω → R be a real valued function defined on an open set Ω and let d
be a feasible direction at x ∈ Ω. The directional derivative of f at x in the
direction d, denoted by f′(x, d), is defined by
f′(x, d) = lim_{α→0} [f(x + αd) − f(x)] / α.
Remark: f′(x, d) = (d/dα) f(x + αd) |_{α=0} = ∇f(x)^T d.
Example
Let f(x, y) = x²y + y² + 2x + 3y. Then
∂f/∂x = 2xy + 2,
∂f/∂y = x² + 2y + 3,
so that
∇f(x, y) = ( 2xy + 2, x² + 2y + 3 )^T.
For (x, y) = (1, 1) and d = (1, 2), the directional derivative of f in the
direction d is
f′((x, y), d) = ∇f(1, 1)^T d = 4 · 1 + 6 · 2 = 16.
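A quick numerical cross-check of this example (a sketch, using a finite-difference quotient) confirms that the directional derivative agrees with ∇f(1, 1)^T d:

```python
# Sketch: directional derivative of f(x, y) = x^2*y + y^2 + 2x + 3y
# at (1, 1) in the direction d = (1, 2).
import numpy as np

f = lambda x, y: x**2 * y + y**2 + 2*x + 3*y
grad = lambda x, y: np.array([2*x*y + 2, x**2 + 2*y + 3])

x0, d = np.array([1.0, 1.0]), np.array([1.0, 2.0])

print(grad(*x0) @ d)                           # 16.0 (analytic value)

alpha = 1e-6                                   # finite-difference approximation
print((f(*(x0 + alpha*d)) - f(*x0)) / alpha)   # approx. 16.0
```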
Taylor’s theorem
Theorem
Assume that f : Rn → R is a twice continuously differentiable function.
Then, the Taylor series expansion of f about the point x◦ ∈ Rn is
f(x) = f(x◦) + ∇f(x◦)^T (x − x◦) + (1/2)(x − x◦)^T ∇²f(x◦)(x − x◦) + o(‖x − x◦‖²),
where o(‖x − x◦‖²)/‖x − x◦‖² → 0 as x → x◦.
Remark: o(f(x)) represents a function that goes to zero "faster" than
f(x), in the sense that
lim_{x→0} |o(f(x))| / ‖f(x)‖ = 0.
Example
Consider the minimization problem
(P)  minimize f(x) = x²
     subject to x ∈ Ω = {x ∈ R : x ≥ 1}.
Let x̄ = 2. Then d f′(x̄) < 0 for every feasible direction d ∈ [−1, 0), so x̄ = 2
violates the first-order necessary condition stated next and cannot be a local minimizer.
First order necessary conditions(FONC)
Let Ω be a subset of R n and f ∈ C 1 be a real-valued function on Ω. If x ∗
is a local minimizer of f over Ω, then for any feasible direction d at x ∗ , we
have d T ∇f (x ∗ ) ≥ 0.
Proof. Let d be a feasible direction at x∗, so that x(α) = x∗ + αd ∈ Ω for all
α ∈ [0, α0], with x(0) = x∗.
Define the composite function
φ(α) = f(x(α)).
Then φ′(0) = d^T ∇f(x∗), and by Taylor's theorem,
f(x∗ + αd) − f(x∗) = φ(α) − φ(0) = φ′(0)α + o(α), α ≥ 0.
Suppose, to the contrary, that φ′(0) = d^T ∇f(x∗) < 0. Then, for sufficiently
small α > 0, φ(α) − φ(0) = φ′(0)α + o(α) < 0, i.e., f(x∗ + αd) < f(x∗),
which contradicts the local minimality of x∗.
Thus we have d^T ∇f(x∗) ≥ 0.
Remark: Let Ω be a subset of R n and f ∈ C 1 be a real-valued function on
Ω. If x ∗ is a local minimizer of f over Ω, and if x ∗ is an interior point of
Ω, then
∇f (x ∗ ) = 0.
This result is also known as Fermat's theorem.
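The following sketch (the one-dimensional example is assumed for illustration) contrasts the boundary case of (FONC) with the interior case of Fermat's theorem for f(x) = x²:

```python
# Sketch: FONC at a boundary minimizer vs. the interior (Fermat) condition.
import numpy as np

fprime = lambda x: 2.0 * x          # derivative of f(x) = x^2

# Omega = {x >= 1}: the minimizer is the boundary point x* = 1, where
# feasible directions are d > 0, and d * f'(x*) = 2d >= 0 (FONC holds).
x_star = 1.0
for d in [0.5, 1.0, 3.0]:
    assert d * fprime(x_star) >= 0

# Omega = R: the minimizer x* = 0 is interior, so f'(x*) = 0 (Fermat).
assert np.isclose(fprime(0.0), 0.0)
print("FONC and Fermat conditions verified for f(x) = x^2")
```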
Real quadratic form
Definition (Quadratic form)
An expression of the form
Q(x1, x2, ..., xn) = Σ_{j=1}^{n} Σ_{i=1}^{n} aij xi xj,
where aij are real and aij = aji, 1 ≤ i, j ≤ n, is said to be a real quadratic
form in n variables x1, x2, ..., xn.
For example:
12x² is a quadratic form of one variable x.
x² + 7xy + y² is a quadratic form of two variables x and y.
3x² + 11xy + 17yz + 8xz + 2y² + z² is a quadratic form of three
variables x, y and z.
Matrix notation for a quadratic form
The matrix form of a given quadratic form is Q(x) = x^T Ax, where
x = (x1, x2, ..., xn)^T and A is the n × n matrix given by A = (aij).
A is a real symmetric matrix, as aij = aji.
To every real quadratic form, there exists a real symmetric matrix,
which is called the matrix of the quadratic form.
Examples
5x² + 2xy − y² is a real quadratic form of two variables.
The associated matrix of the quadratic form is:
    A = [ 5   1 ]
        [ 1  −1 ]
x² − y² + 2z² is a real quadratic form of three variables.
The associated matrix of the quadratic form is:
    A = [ 1   0   0 ]
        [ 0  −1   0 ]
        [ 0   0   2 ]
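As a sanity check, the sketch below (a numerical spot-check only) verifies that the symmetric matrix associated with 5x² + 2xy − y² reproduces the quadratic form through x^T Ax:

```python
# Sketch: the quadratic form 5x^2 + 2xy - y^2 equals x^T A x with A symmetric.
import numpy as np

A = np.array([[5.0,  1.0],
              [1.0, -1.0]])
Q = lambda x, y: 5*x**2 + 2*x*y - y**2

rng = np.random.default_rng(0)
for _ in range(5):
    x, y = rng.standard_normal(2)
    v = np.array([x, y])
    assert np.isclose(v @ A @ v, Q(x, y))
print("x^T A x matches the quadratic form at the sampled points")
```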
Types of quadratic form
A real quadratic form Q(x1, x2, ..., xn) assumes the value 0 when x = 0,
but it may assume positive or negative values when x is not zero. Based on
this, we define the following:
A real quadratic form Q = x T Ax is said to be:
Positive definite if Q > 0 for all x ≠ 0.
Positive semi-definite if Q ≥ 0 for all x and Q = 0 for some x ≠ 0.
Negative definite if Q < 0 for all x ≠ 0.
Negative semi-definite if Q ≤ 0 for all x and Q = 0 for some x ≠ 0.
Indefinite if Q > 0 for some nonzero x and Q < 0 for some other
nonzero x.
Example
Example
Let Q = x² + 2y² + 4z² + 2xy − 4yz − 2xz be a real quadratic form. Verify
whether Q is positive definite or not.
Solution: We can rewrite the given quadratic form in the following manner:
Q = (x + y − z)² + (y − z)² + 2z².
Thus, Q > 0 for all (x, y, z) ≠ (0, 0, 0): if Q = 0, then z = 0, y − z = 0 and
x + y − z = 0, which force x = y = z = 0.
This proves that Q is positive definite.
Positive definite and semidefinite matrices
Definition (Positive definite matrix)
The n × n matrix A is said to be positive definite if
y T Ay > 0,
for all vectors y ∈ Rn such that y ≠ 0.
Remark: The matrix A is called a non-negative definite matrix (or
sometimes a positive semidefinite matrix) if
y^T Ay ≥ 0 for all y ∈ Rn.
Properties of positive definite matrices
Given below are some of the properties of Positive definite matrices:
A matrix is positive definite if it is symmetric and all its eigenvalues
are positive.
A matrix is positive definite if it is symmetric and all its pivots are
positive.
A matrix A is positive definite if and only if x^T Ax > 0 for all vectors x ≠ 0.
A matrix A is positive definite if and only if it can be written as
A = R T R for some possibly rectangular matrix R with independent
columns.
Example
Example
Verify whether the matrix
        [  2   2  −2 ]
    A = [  2   5  −4 ]
        [ −2  −4   5 ]
is positive definite or not.
Solution:
The characteristic equation of the matrix A is given by det(A − λI) = 0.
Solving this equation, we obtain λ = 1, 1, 10.
Since all the eigenvalues of this matrix are positive, it is a
positive definite matrix.
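The eigenvalue computation above can be reproduced numerically; the sketch below uses NumPy's symmetric eigenvalue routine:

```python
# Sketch: verify the eigenvalues of A and conclude positive definiteness.
import numpy as np

A = np.array([[ 2.0,  2.0, -2.0],
              [ 2.0,  5.0, -4.0],
              [-2.0, -4.0,  5.0]])

eigvals = np.linalg.eigvalsh(A)      # eigenvalues of a symmetric matrix
print(np.round(eigvals, 6))          # approx. [1, 1, 10]
print(bool(np.all(eigvals > 0)))     # True => A is positive definite
```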
Properties of positive semi-definite matrices
Given below are some of the properties of Positive semi-definite matrices:
A matrix is positive semi-definite if it is symmetric and all its
eigenvalues are non-negative.
A matrix is positive semi-definite if it is symmetric and all its pivots
are non-negative.
A matrix A is positive semi-definite if x^T Ax ≥ 0 for all vectors x.
For a symmetric matrix,
Positive definite
⇓
All eigenvalues positive
⇓
Inverse exists
⇕
Columns (rows) linearly independent
Example
Show that the matrix
        [  2  −2   0 ]
    A = [ −2   1  −2 ]
        [  0  −2   0 ]
is indefinite.
Solution:
The characteristic equation of the matrix A is given by det(A − λI) = 0.
Solving this equation, we obtain λ = 1, −2, 4.
Since the matrix has both positive and negative eigenvalues, it is an
indefinite matrix.
Hessian
The Hessian matrix was developed in the 19th century by the German
mathematician Ludwig Otto Hesse. Hesse originally used the term
"functional determinants".
The Hessian matrix or Hessian is a square matrix of second-order
partial derivatives of a scalar-valued function, or scalar field.
It describes the local curvature of a function of many variables.
Figure: L. O. Hesse (22 April 1811 - 4 August 1874)
Let f : Rn → R be a twice differentiable function.
If all second partial derivatives of f exist and are continuous over the
domain of the function, then the Hessian matrix F (x̄ ) (or ∇2 f (x̄ )) of
f at x̄ ∈ Rn is an n × n matrix, usually defined and arranged as follows:
                     [ ∂²f(x̄)/∂x1²      ∂²f(x̄)/∂x1∂x2   · · ·   ∂²f(x̄)/∂x1∂xn ]
    ∇²f(x̄) = F(x̄) = [ ∂²f(x̄)/∂x2∂x1    ∂²f(x̄)/∂x2²     · · ·   ∂²f(x̄)/∂x2∂xn ]
                     [       ...              ...                      ...      ]
                     [ ∂²f(x̄)/∂xn∂x1    ∂²f(x̄)/∂xn∂x2   · · ·   ∂²f(x̄)/∂xn²   ]
Thus, we can write,
Fij ≡ ∂²f / ∂xi∂xj.    (1)
Example
Let f(x, y) = x²/y. Then
    F(x, y) = [ 2/y       −2x/y² ]
              [ −2x/y²    2x²/y³ ]
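This Hessian can be reproduced symbolically; the following SymPy sketch computes it directly from the definition:

```python
# Sketch: the Hessian of f(x, y) = x^2 / y via symbolic differentiation.
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 / y

H = sp.hessian(f, (x, y))
print(sp.simplify(H))
# Matrix([[2/y, -2*x/y**2], [-2*x/y**2, 2*x**2/y**3]])
```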
Properties of Hessian matrix are:
1 The Hessian matrix is a symmetric matrix, since the hypothesis of
continuity of the second derivatives implies that the order of
differentiation does not matter (Schwarz’s theorem).
2 The determinant of the Hessian matrix is called the Hessian
determinant.
3 The Hessian matrix F of a function f is the Jacobian matrix of the
gradient of the function f , that is: F (x ) = J(∇f (x )).
Second order necessary conditions (SONC)
Let Ω be a subset of R n and f ∈ C 2 be a real-valued function on Ω. Let
x∗ be a local minimizer of f over Ω, and d be a feasible direction at x∗. If
d T ∇f (x ∗ ) = 0,
then we have
d T F (x ∗ )d ≥ 0,
where F is the Hessian of f .
Proof. Suppose, to the contrary, that there is a feasible direction d at x∗
such that
d T ∇f (x ∗ ) = 0
and
d T F (x ∗ )d < 0.
Define
x (α) = x ∗ + αd ∈ Ω
Define the composite function
φ(α) = f (x (α)).
Then, by Taylor's theorem,
φ(α) = φ(0) + φ′′(0) α²/2 + o(α²),
where by assumption
φ′(0) = d^T ∇f(x∗) = 0
and
φ′′(0) = d^T F(x∗)d < 0.
For sufficiently small α > 0,
φ(α) − φ(0) = φ′′(0) α²/2 + o(α²) < 0,
that is,
f (x ∗ + αd) < f (x ∗ ),
which contradicts the assumption that x ∗ is a local minimizer. Thus,
φ′′(0) = d^T F(x∗)d ≥ 0.
Corollary: Interior case Let Ω be a subset of R n and f ∈ C 2 be a
real-valued function on Ω. If x ∗ is a local minimizer of f over Ω, and if x ∗
is an interior point of Ω, then ∇f (x ∗ ) = 0 and F (x ∗ ) is positive
semidefinite, that is
d T F (x ∗ )d ≥ 0, ∀d ∈ R n .
Example
For f(x) = x³, we have f′(0) = 0 and f′′(0) = 0, so the necessary conditions hold at x = 0, but x = 0 is not a minimizer.
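A tiny numerical sketch of this example shows why the necessary conditions are not sufficient:

```python
# Sketch: f(x) = x^3 satisfies f'(0) = 0 and f''(0) = 0, yet 0 is not a
# local minimizer, since f takes smaller values arbitrarily close to 0.
f   = lambda x: x**3
fp  = lambda x: 3*x**2
fpp = lambda x: 6*x

print(fp(0.0), fpp(0.0))      # 0.0 0.0  (FONC and SONC hold at x = 0)
print(f(-1e-3) < f(0.0))      # True     (so x = 0 is not a minimizer)
```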
Definition (Convex set)
A set S ⊆ Rn is a convex set if
x , y ∈ S ⇒ λx + (1 − λ)y ∈ S, ∀ λ ∈ [0, 1].
Properties of convex sets
If S is a convex set and β is a real number, then the set
βS = {x : x = βv , v ∈ S}
is also convex;
If S1 and S2 are convex sets, then the set
S1 + S2 = {x : x = v1 + v2 , v1 ∈ S1 , v2 ∈ S2 }
is also convex;
The intersection of any collection of convex sets is convex.
Figure: Intersection of convex sets
Convex hull of a set
Let S ⊂ Rn. Then the intersection of all convex sets containing S is called
the convex hull of S, denoted by < S >.
Figure: Convex hull
Since the intersection of the members of any family of convex sets is
convex, it follows that < S >, the convex hull of S is a convex set.
The convex hull of a set S is the smallest convex set containing S.
Epigraph of a function
The epigraph of a function f : S → R, S ⊂ Rn , denoted epi(f ), is the set
of points in S × R given by
epi(f ) = {(x , β) : x ∈ S, β ∈ R, β ≥ f (x )}.
Figure: Epigraph of a function
Why we need convexity?
Let Ω ⊆ Rn be a convex set and let f be a convex function on Ω,
then a local minimum point is also global.
Let Ω ⊆ Rn be a convex set and let f be a differentiable convex
function on Rn . Then, a critical point x̄ ∈ Ω is a global minimum
point.
Let f be a continuously differentiable convex function defined on an
open convex set Ω ⊆ Rn. Then the conditions (FONC) and (SONC) for a
point to be a local minimum are also sufficient.
Definition (Convex function)
Let f be a function defined on a convex set S ⊆ Rn , then f is said to be a
convex function on S, if for every x1 , x2 ∈ S
f (λx1 + (1 − λ)x2 ) ≤ λf (x1 ) + (1 − λ)f (x2 ), ∀ λ ∈ [0, 1].
A function f defined on a convex set S ⊆ Rn is concave if and only if −f
is convex on S.
Examples of convex function
x²
|x|
eˣ
x
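The defining inequality can be spot-checked numerically for these examples; the sketch below samples random points and a random λ (a check, not a proof):

```python
# Sketch: check f(t*x1 + (1-t)*x2) <= t*f(x1) + (1-t)*f(x2) at random
# points for the example functions above.
import numpy as np

examples = {'x^2': lambda x: x**2,
            '|x|': np.abs,
            'e^x': np.exp,
            'x':   lambda x: x}

rng = np.random.default_rng(1)
for name, f in examples.items():
    x1, x2 = rng.uniform(-3, 3, size=2)
    t = rng.uniform(0, 1)
    lhs = f(t*x1 + (1 - t)*x2)
    rhs = t*f(x1) + (1 - t)*f(x2)
    assert lhs <= rhs + 1e-12, name
print("the convexity inequality holds at all sampled points")
```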
Theorem
Let f = (f1, f2, . . . , fm) be an m-dimensional vector function defined on
S ⊆ Rn. If f is convex on S, then each nonnegative linear combination of
its components fi,
θ(x) = p^T f(x),  p ≥ 0,
is convex on S.
Theorem
For a real valued function f defined on a convex set S ⊆ Rn to be convex
on S it is necessary and sufficient that its epigraph
Gf = {(x , ζ) | x ∈ S, ζ ∈ R, f (x ) ≤ ζ}
be a convex set in Rn+1 .
Theorem
If (fi )i∈I is a family of convex functions, which are bounded above on a
convex set S ⊆ Rn , then the function
f(x) = sup_{i∈I} fi(x)
is a convex function on S.
Theorem
Let f be a differentiable function defined on a nonempty open convex set
S ⊆ Rn . Then f is convex on S if and only if for every x◦ ∈ S
f (x ) ≥ f (x◦ ) + (x − x◦ )T ∇f (x◦ ), ∀x ∈ S. (2)
Proof. Assume the convexity of f. Then, for every x, x◦ ∈ S with x ≠ x◦,
we have
f (x◦ + λ(x − x◦ )) ≤ f (x◦ ) + λ[f (x ) − f (x◦ )], λ ∈ [0, 1],
or, equivalently,
[f(x◦ + λ(x − x◦)) − f(x◦)] / λ ≤ f(x) − f(x◦),   λ ∈ (0, 1].
It follows that
lim_{λ→0+} [f(x◦ + λ(x − x◦)) − f(x◦)] / λ = (x − x◦)^T ∇f(x◦) ≤ f(x) − f(x◦).
Conversely, let x1 , x2 ∈ S. Setting x = x2 and x◦ = λx1 + (1 − λ)x2 in (2),
we obtain
f (x2 ) ≥ f (λx1 + (1 − λ)x2 ) + λ(x2 − x1 )T ∇f (λx1 + (1 − λ)x2 ). (3)
Setting x = x1 and x◦ = λx1 + (1 − λ)x2 in (2), we obtain
f(x1) ≥ f(λx1 + (1 − λ)x2) + (1 − λ)(x1 − x2)^T ∇f(λx1 + (1 − λ)x2). (4)
Multiplying (3) and (4) by (1 − λ) and λ, respectively, and by adding them
up we obtain
(1 − λ)f (x2 ) + λf (x1 ) ≥ f (λx1 + (1 − λ)x2 ), λ ∈ [0, 1].
Theorem
Let S ⊆ Rn be a convex set and let f be a convex function on S. Then, a
local minimum point is also global.
Proof. Let x◦ ∈ S be a local minimum point of f and assume, by
contradiction, the existence of x ∗ ∈ S such that f (x ∗ ) < f (x◦ ).
The convexity of S implies x = λx ∗ + (1 − λ)x◦ ∈ S, ∀λ ∈ [0, 1] and the
convexity of the function implies
f (x ) ≤ λf (x ∗ ) + (1 − λ)f (x◦ )
< λf (x◦ ) + (1 − λ)f (x◦ ) = f (x◦ ), ∀λ ∈ [0, 1]
In particular, letting λ approach zero, we obtain points x = λx∗ + (1 − λ)x◦,
arbitrarily close to x◦, with f(x) < f(x◦), contradicting the local
minimality assumption.
Theorem
Let f : S → R be a twice differentiable function defined on an open
convex set S ⊆ Rn . Then, f is convex on S if and only if for each x ∈ S,
the Hessian F (x ) of f at x is a positive semidefinite matrix.
Proof. Assume that F (x ) = ∇2 f (x ) is a positive semidefinite matrix for
each x ∈ S.
Let x, y ∈ S. From Taylor's theorem there exists α ∈ (0, 1) such that
f(y) = f(x) + (y − x)^T ∇f(x) + (1/2)(y − x)^T ∇²f(x + α(y − x))(y − x).
Since ∇2 f is positive semidefinite
(y − x )T ∇2 f (x + α(y − x ))(y − x ) ≥ 0.
Therefore, we have
f(y) ≥ f(x) + (y − x)^T ∇f(x), for all x, y ∈ S,
and hence, by the previous theorem, f is convex on S.
Conversely, assume that f is convex on S and suppose, to the contrary, that
there exists x ∈ S such that ∇²f(x) is not positive semidefinite.
Then there exists d ∈ Rn such that d^T ∇²f(x)d < 0.
By the continuity of the Hessian matrix and the openness of S, there exists a
nonzero s ∈ R, sufficiently small, such that y = x + sd ∈ S and, for all
points z lying on the line segment joining x and y,
d^T ∇²f(z)d < 0.
From Taylor’s theorem there exists α ∈ (0, 1), such that
f(y) = f(x) + (y − x)^T ∇f(x) + (1/2)(y − x)^T ∇²f(x + α(y − x))(y − x)
     = f(x) + s d^T ∇f(x) + (1/2) s² d^T ∇²f(x + αsd)d.
Since α ∈ (0, 1), the point x + αsd lies on the line segment joining x and
y, and therefore
d^T ∇²f(x + αsd)d < 0.
Since s ≠ 0, we have s² > 0, and hence
f(y) < f(x) + (y − x)^T ∇f(x).
Therefore, by the previous theorem, f is not a convex function on S, which
contradicts our assumption.
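In practice, this theorem gives a convenient convexity test: compute the Hessian and check positive semidefiniteness. The sketch below (the function f(x, y) = x² + xy + y² is an assumed example) applies it with SymPy:

```python
# Sketch: test convexity of f(x, y) = x^2 + x*y + y^2 on R^2 by checking
# that its (constant) Hessian is positive semidefinite.
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + x*y + y**2

H = sp.hessian(f, (x, y))            # Matrix([[2, 1], [1, 2]])
eigs = list(H.eigenvals())           # eigenvalues 1 and 3
print(H, eigs)
print(all(e >= 0 for e in eigs))     # True => f is convex on R^2
```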
First order sufficient condition
Let Ω be a convex subset of R n and f ∈ C 1 be a real-valued convex
function on Ω. If for every feasible direction d at x∗, we have
d T ∇f (x ∗ ) ≥ 0,
then x ∗ is a local minimizer of f over Ω.
Second order sufficient condition
Let Ω be a convex subset of R n and f ∈ C 2 be a real-valued convex
function on Ω. If for every feasible direction d at x∗, we have
d T ∇f (x ∗ ) = 0
and
d T F (x ∗ )d ≥ 0,
where F is the Hessian of f , then x ∗ is a local minimizer of f over Ω.
Nonlinear programming problem with equality constraints
(NPP)
Minimize f (x )
subject to h(x ) = 0
where x ∈ Rn, f : Rn → R, and h = (h1, . . . , hm) : Rn → Rm are continuously
differentiable functions and m ≤ n.
Example
Minimize f (x , y ) = 2x + 3y − 4
subject to h(x , y ) := xy − 6 = 0
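A numerical sketch of this example (illustration only; the problem is nonconvex, so the starting point determines which stationary point SciPy's SLSQP solver returns):

```python
# Sketch: minimize 2x + 3y - 4 subject to xy = 6, starting from (1, 1).
import numpy as np
from scipy.optimize import minimize

f = lambda v: 2*v[0] + 3*v[1] - 4
eq_con = {'type': 'eq', 'fun': lambda v: v[0]*v[1] - 6}

res = minimize(f, x0=np.array([1.0, 1.0]), constraints=[eq_con],
               method='SLSQP')
print(res.x, res.fun)     # approx. (3, 2) with objective value 8
```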
Definition (Feasible point)
Any point satisfying the constraints is called a feasible point. The set of
feasible points
S := {x ∈ Rn : h(x ) = 0}
is called the feasible set.
Definition (Regular point)
A point x ∗ satisfying the constraints
h1 (x ∗ ) = 0, . . . , hm (x ∗ ) = 0
is said to be a regular point of the constraints if the gradient vectors
∇h1 (x ∗ ), . . . , ∇hm (x ∗ )
are linearly independent.
Remark: Let Dh(x ∗ ) be the Jacobian matrix of h = (h1 , . . . , hm )T at x ∗ ,
given by
            [ Dh1(x∗) ]   [ ∇h1(x∗)^T ]
  Dh(x∗) =  [   ...   ] = [    ...    ]
            [ Dhm(x∗) ]   [ ∇hm(x∗)^T ]
Then x ∗ is regular if and only if rank Dh(x ∗ ) = m, i.e. the Jacobian
matrix is of full rank.
Definition (Surface)
The set of equality constraints h1 (x ) = 0, . . . , hm (x ) = 0, describes a
surface
S = {x ∈ Rn : h1(x) = 0, . . . , hm(x) = 0}.
Assuming the points in S are regular, the dimension of the surface S is
n − m.
Tangent space and normal space
Definition (Tangent space)
The tangent space at a point x ∗ on the surface
S := {x ∈ Rn : h(x ) = 0}
is the set
T (x ∗ ) := {y ∈ Rn : Dh(x ∗ )y = 0}
Remark: The tangent space is equal to the null space of Dh(x∗), i.e.,
T(x∗) = N(Dh(x∗)),
and hence, if x∗ is a regular point, it is a subspace of Rn of dimension
n − m.
Definition (Tangent plane)
The tangent plane at x ∗ is the set
TP(x ∗ ) = T (x ∗ ) + x ∗ = {x + x ∗ : x ∈ T (x ∗ )}
Figure: The tangent plane to the surface S at the point x ∗
Theorem
Suppose x ∗ ∈ S is a regular point, and T (x ∗ ) is the tangent space at x ∗ .
Then y ∈ T(x∗) if and only if there exists a differentiable curve in S
passing through x∗ with derivative y at x∗.
Proof. First, suppose there exists a curve
{x (t) : t ∈ (a, b)}
in S such that
x (t ∗ ) = x ∗
and
ẋ (t ∗ ) = y
for some t ∗ ∈ (a, b). Then,
h(x (t)) = 0
for all t ∈ (a, b). If we differentiate the function h(x (t)) with respect to t
using the chain rule, we obtain
(d/dt) h(x(t)) = Dh(x(t)) ẋ(t) = 0
for all t ∈ (a, b). Therefore, at t ∗ , we get
Dh(x ∗ )y = 0,
and hence y ∈ T (x ∗ ).
For the converse part, we use the implicit function theorem.
Implicit function theorem
Let f : Rn+m → Rm be a C¹ function, and let Rn+m have coordinates (x, y). Fix a
point (a, b) = (a1, . . . , an, b1, . . . , bm) with f(a, b) = 0, where 0 ∈ Rm is
the zero vector. If the Jacobian matrix
Jf,y(a, b) = [ ∂fi/∂yj (a, b) ]
is invertible, then there exists an open set U ⊂ Rn containing a and a
unique continuously differentiable function g : U → Rm such that g(a) = b
and f(x, g(x)) = 0 for all x ∈ U. Moreover, the partial derivatives of g in U
are given by the matrix product
∂g/∂xj (x) = − [ Jf,y(x, g(x)) ]⁻¹ [ ∂f/∂xj (x, g(x)) ],
where the first factor is an m × m matrix and the second an m × 1 vector.
Definition (Normal Space)
The normal space N(x ∗ ) at a point x ∗ on the surface
S = {x ∈ Rn : h(x ) = 0}
is the set
N(x∗) = {x ∈ Rn : x = Dh(x∗)^T z for some z ∈ Rm}.
The normal space can be expressed as the range space of the matrix Dh(x∗)^T, i.e.,
N(x∗) = R(Dh(x∗)^T)
      = {x ∈ Rn : x = z1∇h1(x∗) + · · · + zm∇hm(x∗) for some z1, . . . , zm ∈ R}.
We can say that normal space N(x ∗ ) is the subspace of R n spanned by the
vectors ∇h1 (x ∗ ), . . . , ∇hm (x ∗ ),
that is,
N(x ∗ ) = span[∇h1 (x ∗ ), . . . , ∇hm (x ∗ )]
Assuming x ∗ is regular, the dimension of the normal space N(x ∗ ) is m.
Definition (Normal plane)
Normal plane at x ∗ is the set
NP(x ∗ ) = N(x ∗ ) + x ∗ = {x + x ∗ ∈ Rn : x ∈ N(x ∗ )}.
Figure: Normal space in R3
Lemma
Prove that
T (x ∗ ) = N(x ∗ )⊥
and
T (x ∗ )⊥ = N(x ∗ ).
Proof. From the definition of T (x ∗ ), we have
T(x∗) = { y ∈ Rn : x^T y = 0, ∀x ∈ N(x∗) }.
Hence, from the definition of N(x ∗ ), we get
T (x ∗ ) = N(x ∗ )⊥
T (x ∗ )⊥ = N(x ∗ ).
Lagrange condition
Lagrange’s Theorem
Let x∗ be a local minimizer of (NPP). Assume that x∗ is a regular point.
Then there exists a vector λ∗ ∈ Rm such that
∇f(x∗) + ∇h(x∗)λ∗ = 0,
where ∇h(x∗) = Dh(x∗)^T = [∇h1(x∗), . . . , ∇hm(x∗)].
Proof. We have to prove that
∇f(x∗) = −∇h(x∗)λ∗ = −Dh(x∗)^T λ∗
for some λ∗ ∈ Rm, that is,
∇f(x∗) ∈ R(Dh(x∗)^T) = N(x∗) = T(x∗)⊥,
i.e., ∇f(x∗) ∈ T(x∗)⊥.
Let y ∈ T (x ∗ ). We parameterize the level set
{x : h(x ) = 0}
by the curve
{x (t) : t ∈ (a, b)}
such that for all t ∈ (a, b),
h(x (t)) = 0.
Let x(t∗) = x∗ and ẋ(t∗) = y for some t∗ ∈ (a, b). Define the composite function
φ(t) = f(x(t)).
Since x∗ is a local minimizer of f, t∗ is a local minimizer of φ. Thus,
(d/dt) φ(t∗) = 0.
By using the chain rule, we get
∇f(x(t∗))^T ẋ(t∗) = 0,
i.e., ∇f(x∗)^T y = 0.
Since y ∈ T(x∗) was arbitrary, every y ∈ T(x∗) satisfies
∇f(x∗)^T y = 0,
that is,
∇f(x∗) ∈ T(x∗)⊥.
Note: x ∗ cannot be an extremizer if
∇f(x∗) ∉ N(x∗).
Figure: An example where Lagrange condition does not hold
We introduce the Lagrangian function L : Rn × Rm → R given by
L(x, λ) := f(x) + λ^T h(x).
The Lagrange condition for a local minimizer x ∗ can be represented using
the Lagrangian function as
DL(x ∗ , λ∗ ) = 0T
for some λ∗ , where the derivative operation D is with respect to the entire
argument [x T , λT ]T .
Remark: Lagrange's theorem gives only a necessary condition, not a
sufficient condition.
Example
Consider the problem:
minimize 8x1² − 2x2, x1, x2 ∈ R
subject to x1² + x2² = 1.
Solution: The Lagrangian function is
L(x1, x2, λ) = 8x1² − 2x2 + λ(x1² + x2² − 1).
The necessary conditions for a minimizer are:
Dx1 L(x1, x2, λ) = 0 ⇒ 16x1 + 2λx1 = 0,
Dx2 L(x1, x2, λ) = 0 ⇒ −2 + 2λx2 = 0,
Dλ L(x1, x2, λ) = 0 ⇒ x1² + x2² − 1 = 0.
The point (3√7/8, −1/8)^T with λ = −8 satisfies Lagrange's theorem, but
this point (3√7/8, −1/8)^T is not a minimizer, as shown in the figure given below.
From the graph given below, the point (0, 1)^T is the minimizer, and the
corresponding value of the objective function is
f(0, 1) = −2.
Figure: Graphical solution of the above problem
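The same candidate points can be obtained by solving the Lagrange conditions symbolically; the sketch below uses SymPy:

```python
# Sketch: solve the Lagrange conditions of the example above with SymPy.
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam', real=True)
L = 8*x1**2 - 2*x2 + lam*(x1**2 + x2**2 - 1)

sols = sp.solve([sp.diff(L, x1), sp.diff(L, x2), sp.diff(L, lam)],
                [x1, x2, lam], dict=True)
f = 8*x1**2 - 2*x2
for s in sols:
    print(s, ' f =', f.subs(s))
# Candidates include (0, 1) with f = -2 (the minimizer), (0, -1) with
# f = 2, and (+-3*sqrt(7)/8, -1/8) with f = 65/8 (not minimizers).
```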
Example
Consider the following problem:
Maximize (x^T Qx) / (x^T Px),
where Q = Q^T ≥ 0 and P = P^T > 0.
The objective is unchanged if x is replaced by tx for any nonzero scalar t.
Therefore, to avoid this multiplicity of solutions, we further impose the
constraint
x^T Px = 1.
The optimization problem becomes
Maximize x T Qx
subject to x T Px = 1.
Let us write
f (x ) = x T Qx
h(x ) = 1 − x T Px .
Any feasible point for this problem is regular.
The Lagrangian function is :
L(x , λ) = x T Qx + λ(1 − x T Px )
The Lagrange condition yields
Dx L(x , λ) = 0 ⇒ 2x T Q − 2λx T P = 0T
Dλ L(x , λ) = 0 ⇒ 1 − x T Px = 0.
The first of the above equations can be represented as
(λP − Q)x = 0
Because P = P^T > 0, P⁻¹ exists, and we obtain
(λIn − P −1 Q)x = 0
Therefore, the solution, if it exists, is an eigenvector of P −1 Q and the
Lagrange multiplier is the corresponding eigenvalue.
Let x∗ and λ∗ be the optimal solution. Because x∗^T Px∗ = 1 and
P⁻¹Q x∗ = λ∗ x∗, we have
λ∗ = x∗^T Qx∗.
Hence, λ∗ is the maximum value of the objective function and is, in fact,
the maximal eigenvalue of P⁻¹Q.
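Numerically, this corresponds to the generalized symmetric eigenvalue problem Qx = λPx; the sketch below (with assumed example matrices Q and P) confirms that the constrained maximum equals the largest eigenvalue of P⁻¹Q:

```python
# Sketch: maximize x^T Q x subject to x^T P x = 1 via the generalized
# symmetric eigenproblem Q x = lambda P x (Q, P below are example data).
import numpy as np
from scipy.linalg import eigh

Q = np.array([[3.0, 1.0],
              [1.0, 2.0]])           # Q = Q^T >= 0
P = np.array([[2.0, 0.0],
              [0.0, 1.0]])           # P = P^T > 0

vals, vecs = eigh(Q, P)              # generalized eigenvalues, ascending
lam_max = vals[-1]
x_star = vecs[:, -1]                 # eigh normalizes so that x^T P x = 1

print(lam_max)                       # maximal eigenvalue of P^{-1} Q
print(x_star @ Q @ x_star)           # objective value at x*, equals lam_max
print(x_star @ P @ x_star)           # constraint value, equals 1
```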
Thank You