0% found this document useful (0 votes)
8 views38 pages

Convex Functions

The document provides a comprehensive overview of convex functions, including their definitions, properties, and examples. It discusses concepts such as concavity, strict convexity, norms, dual norms, and conditions for convexity based on first and second derivatives. Additionally, it covers special cases, operations preserving convexity, and applications of Jensen's inequality.

Uploaded by

farah.fairy0107
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
8 views38 pages

Convex Functions

The document provides a comprehensive overview of convex functions, including their definitions, properties, and examples. It discusses concepts such as concavity, strict convexity, norms, dual norms, and conditions for convexity based on first and second derivatives. Additionally, it covers special cases, operations preserving convexity, and applications of Jensen's inequality.

Uploaded by

farah.fairy0107
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Convex functions

Definition
f : R n → R is convex if dom f is a convex set and

f (θx + (1 − θ)y ) ≤ θf (x) + (1 − θ)f (y )

for all x, y ∈ dom f , and θ ∈ [0, 1].


Convex functions

Definition
f : R n → R is convex if dom f is a convex set and

f (θx + (1 − θ)y ) ≤ θf (x) + (1 − θ)f (y )

for all x, y ∈ dom f , and θ ∈ [0, 1].


◮ f is concave if −f is convex
◮ f is strictly convex if dom f is convex and

f (θx + (1 − θ)y ) < θf (x) + (1 − θ)f (y )

for all x, y ∈ dom f , x 6= y , θ ∈ (0, 1).


Convex function: examples on R

◮ affine: ax + b
◮ exponential: e αx , for any α ∈ R
◮ powers:
◮ x a on R++ , for a ≥ 1 or a ≤ 0
◮ −x a on R++ , for 0 ≤ a ≤ 1
◮ powers of absolute value: |x|p on R, for p ≥ 1
◮ negative logarithm: − log x on R++
◮ x log x on R++
Convex function: affine functions

Affine functions are both convex and concave

◮ affine function in R n :
f (x) = aT x + b
◮ affine function in R m×n :
m X
X n
f (X ) = tr(AT X ) + b = hA, X i + b = Aij Xij + b
i =1 j=1
Convex function: norms

All norms are convex

◮ norms in R n : !1/p
n
X
p
kxkp = |xi |
i =1

for all p ≥ 1
◮ norms in R m×n :
◮ Frobenius norm: kX kF = hX , X i1/2
◮ spectral norm: kX k2 = σmax (X ) = (λmax (X T X ))1/2
Operator norms
Let k · ka and k · kb be norms on R m and R n , respectively. The operator
norm of X ∈ R m×n , induced by k · ka and k · kb , is defined to be

kX ka,b = sup {kXuka | kukb ≤ 1}

◮ Spectral norm (ℓ2 -norm):

kX k2 = kX k2,2 = σmax (X ) = (λmax (X T X ))1/2

◮ Max-row-sum norm:
n
X
kX k∞ = kX k∞,∞ = max |Xij |
i =1,··· ,m
j=1

◮ Max-column-sum norm:
m
X
kX k1 = kX k1,1 = max |Xij |
j=1,··· ,n
i =1
Dual norm

Let k · k be a norm on R n . The associated dual norm, denoted k · k∗ , is


defined as
kzk∗ = sup {z T x | kxk ≤ 1}.

◮ z T x ≤ kxk kzk∗ for all x, z ∈ R n


◮ The dual of the ℓp -norm is the ℓq -norm, where 1/p + 1/q = 1
◮ The dual of the ℓ2 -norm on R m×n is the nuclear norm,

kZ k2∗ = sup {tr(Z T X ) | kX k2 ≤ 1}


= σ1 (Z ) + · · · + σr (Z ) = tr(Z T Z )1/2 ,

where r = rank Z .
Restriction of a convex function to a line

f : R n → R is convex iff g : R → R,

g (t) = f (x + tv ) dom g = {t | x + tv ∈ dom f }

is convex for any x ∈ dom f , v ∈ R n

So we can check the convexity of a function with multiple variables by


checking the convexity of functions of one variable
Restriction of a convex function to a line: example

n
Show that f : S++ → R with f (X ) = log det X is concave.
Proof.
Define g : R → R, g (t) = f (X + tV ) with dom g = {t | X + tV ≻ 0},
for any X ≻ 0 and V ∈ S n .

g (t) = log det(X + tV )


= log det(X 1/2 (I + tX −1/2 VX −1/2 )X 1/2 )
X n
= log(1 + tλi ) + log det X
i =1

where λ1 , · · · , λn are the eigenvalues of X −1/2 VX −1/2 . Hence g is


concave for any X ≻ 0 and V ∈ S n , so is f .
Extended-value extensions

Definition
If f : R n → R is convex, we define its extended-value extension
f˜ : R n → R ∪ {∞} by

˜ f (x) x ∈ dom f
f (x) =
∞ x∈ / dom f

By this notation, the condition

f˜(θx + (1 − θ)y ) ≤ θf˜(x) + (1 − θ)f˜(y ) ∀θ ∈ [0, 1]

is equivalent to the two conditions:


◮ dom f is convex

◮ f (θx + (1 − θ)y ) ≤ θf (x) + (1 − θ)f (y ) ∀θ ∈ [0, 1]


First-order conditions

Theorem (first-order condition)


If f : R n → R is differentiable, then f is convex if and only if dom f is
convex and

f (y ) ≥ f (x) + ∇f (x)T (y − x), ∀x, y ∈ dom f


First-order conditions

Theorem (first-order condition)


If f : R n → R is differentiable, then f is convex if and only if dom f is
convex and

f (y ) ≥ f (x) + ∇f (x)T (y − x), ∀x, y ∈ dom f

◮ local information (gradient) leads to global information (convexity)


◮ f is strictly convex if and only if dom f is convex and

f (y ) > f (x) + ∇f (x)T (y − x), ∀x, y ∈ dom f , x 6= y


First-order condition: proof

Proof.
◮ Suppose f (x) is convex. Then

f (x + θ(y − x)) − f (x) ≤ θ(f (y ) − f (x)), ∀θ ∈ [0, 1]


f (x + θ(y − x)) − f (x)
lim ≤ f (y ) − f (x)
θ→0 θ
T
=⇒ ∇f (x) (y − x) ≤ f (y ) − f (x)

◮ Suppose the first-order condition holds. Let z = θx + (1 − θ)y . Then

f (x) ≥ f (z) + ∇f (z)T (x − z)


f (y ) ≥ f (z) + ∇f (z)T (y − z)
=⇒ θf (x) + (1 − θ)f (y ) ≥ f (z)

which is true for any θ ∈ [0, 1], so f is convex.


Second-order conditions

Theorem (second-order condition)


If f : R n → R is twice differentiable, then f is convex if and only if
dom f is convex and its Hessian is positive semidefinite, i.e.,

∇f (x)  0 ∀x ∈ dom f
Second-order conditions

Theorem (second-order condition)


If f : R n → R is twice differentiable, then f is convex if and only if
dom f is convex and its Hessian is positive semidefinite, i.e.,

∇f (x)  0 ∀x ∈ dom f

◮ if ∇f (x) ≻ 0 for all x ∈ dom f , then f is strictly convex


Second-order conditions: proof
Proof.
◮ Suppose f is convex. Because f is twice differentiable, we have
1 T 2
f (x + δx) = f (x) + ∇f (x)T δx + δx ∇ f (x)δx + R(x; δx)kδxk2
2
where R(x; δx) → 0 as δx → 0. Because f is convex, by the first-order
condition, f (x + δx) ≥ f (x) + ∇f (x)T δx. Hence

δx T ∇2 f (x)δx + R(x; δx)kδxk2 ≥ 0

for any δx. Let δx = ǫd. Taking ǫ → 0 yields d T ∇2 f (x)d ≥ 0 for any d,
thus ∇2 f (x)  0.
◮ Suppose ∇f (x)  0 ∀x ∈ dom f . Then for any x, y ∈ dom f and some
z = θx + (1 − θ)y with θ ∈ [0, 1],
1
f (y ) = f (x) + ∇f (x)T (y − x) + (y − x)T ∇2 f (z)(y − x)
2
≥ f (x) + ∇f (x)T (y − x)

By the first-order condition, f is convex.


Some special cases

◮ quadratic function: f (x) = 21 x T Px + q T x + r , where P ∈ S n

∇f (x) = Px + q ∇2 f (x) = P

convex if P  0
◮ least-squares: f (x) = kAx − bk22

∇f (x) = 2AT (Ax − b) ∇2 f (x) = 2AT A

convex for any A


◮ quadratic-over-linear: f (x, y ) = x 2 /y
   T
2 y y
∇2 f (x) = 0
y 3 −x −x

convex for y > 0


Some special cases: log-sum-exp

Pn
log-sum-exp: f (x) = log k=1 exp xk is convex

max{x1 , · · · , xn } ≤ f (x) ≤ max{x1 , · · · , xn } + log n, so f can be viewed as a


differentiable approximation of the max function.
Proof.

1
∇2 f (x) = ((1T z) diag(z) − zz T ) (zk = exp xk )
(1T z)2
n n n
!
1 X X X
=⇒ v T ∇2 f (x)v = T 2 zi zi vi2 − ( zi vi )2
(1 z) i =1 i =1 i =1
1
= (kak22 kbk22 − ha, bi2 ) ≥ 0
(1T z)2
√ √ √ √
where a = ( z1 , · · · , zn ), b = ( z1 v1 , · · · , zn vn ), for any v .
Some special cases: geometric mean

Qn
geometric mean: f (x) = ( k=1 xk )1/n on R++
n
is concave
Proof.
Qn 1/n
xi
∇2 f (x) = − i =1
(n diag2 (q) − qq T ) (qi = 1/xi )
n2 !
Qn 1/n n n n
i =1 xi
T 2
X X 2 2
X 2
=⇒ v ∇ f (x)v = − 1 vi qi − ( qi vi )
n2 i =1 i =1 i =1
Qn 1/n
x
= − i =12 i (kak22 kbk22 − ha, bi2 ) ≤ 0
n
where ai = 1, bi = qi vi for any v , so ∇2 f (x)  0.
Epigraph and sublevel set

α-sublevel set of f : R n → R:

Cα = {x ∈ dom f | f (x) ≤ α}

sublevel sets of convex functions are convex (converse is false)

epigraph of f : R n → R:

epi f = {(x, t) ∈ R n+1 | x ∈ dom f , f (x) ≤ t}

f is convex if and only if epi f is a convex set


Jensen’s inequality and extensions

basic inequality: if f is convex, then for any θ ∈ [0, 1],

f (θx + (1 − θ)y ) ≤ θf (x) + (1 − θ)f (y )

extension: if f is convex, then

f (E[z]) ≤ E[f (z)]

for any random variable z.


Using Jensen’s inequality: deriving Holder’s inequality

θ log a + (1 − θ) log b ≤ log(θa + (1 − θ)b)


=⇒ aθ b 1−θ ≤ θa + (1 − θ)b ∀a, b ≥ 0, θ ∈ [0, 1]

Applying this with

|xi |p |yi |p
a= P p
a= P p
θ = 1/p
j |xj | j |yj |

yields
!1/p !1/q
|x |p |y |p |x |p |yi |p
P i p P i p ≤ Pi +
p j |xj |p q j |yj |p
P
j |xj | j |yj |

and summing over i yields Holder’s inequality.


Operations preserving convexity

◮ nonnegative weighted sum


◮ composition with affine function
◮ pointwise maximum and supremum
◮ composition
◮ minimization
◮ perspective
positive weighted sum & composition with affine function

◮ f is convex =⇒ αf is convex for any α ≥ 0


◮ f1 , f2 are convex =⇒ f1 + f2 is convex (extends to infinite sums,
integrals)
◮ f is convex =⇒ f (Ax + b) is convex
Examples:
◮ log barrier for linear inequalities
m
X
f (x) = − log(bi − aiT x)
i =1

◮ f (x) = kAx + bk
Pointwise maximum

If f1 , · · · , fm are convex, then f (x) = max{f1 (x), · · · , fm (x)} is convex

examples
◮ piecewise-linear function: f (x) = maxi =1,··· ,m (aiT x + bi ) is convex
◮ sum of r largest components of x ∈ R n :

f (x) = x[1] + x[2] + · · · + x[n]

is convex (x[i ] is ith largest component of x)


Pointwise supremum

If f (x, y ) is convex in x for each y ∈ A, then

g (x) = sup f (x, y )


y∈A

is convex

In terms of epigraph:
\
epi g = epi f (·, y )
y∈A

Examples
◮ support function of a set: SC (x) = supy∈C y T x
◮ f (x) = supy∈C kx − y k
◮ λmax (X ) = supkuk2 =1 u T Xu where X ∈ S n is convex
Minimization
If f (x, y ) is convex in (x, y ) and C is a convex nonempty set, then

g (x) = inf f (x, y )


y∈C

is convex

examples
◮ f (x, y ) = x T Ax + 2x T By + y T Cy with

 
A B
0 C ≻0
BT C

Because
g (x) = inf f (x, y ) = x T (A − BC −1 B T )x
y

is convex, the Schur complement A − BC −1 B T  0


◮ distance to a set: dist(x, S) = inf y∈S kx − y k is convex if S is convex
Minimization: proof

Proof.
Suppose x1 , x2 ∈ dom g . Given ǫ > 0, ∃ y1 , y2 ∈ C such that
f (xi , yi ) ≤ g (xi ) + ǫ. Hence for any θ ∈ [0, 1],

g (θx1 + (1 − θ)x2 ) = inf f (θx1 + (1 − θ)x2 , y )


y∈C

≤ f (θx1 + (1 − θ)x2 , θy1 + (1 − θ)y2 )


≤ θf (x1 , y1 ) + (1 − θ)f (x2 , y2 )
≤ θg (x1 ) + (1 − θ)g (x2 ) + ǫ

which holds for any ǫ > 0. Thus

g (θx1 + (1 − θ)x2 ) ≤ θg (x1 ) + (1 − θ)g (x2 )


Composition with scalar functions

Composition of g : R n → R and h : R → R:

f (x) = h(g (x))

Use:
∇2 f (x) = h′ (g (x))∇2 g (x) + h′′ (g (x))∇g (x)∇g (x)T
f is convex if
◮ g convex, h convex, h̃ nondecreasing
◮ example: exp g (x) is convex if g is convex
◮ g concave, h convex, h̃ nonincreasing
◮ example: 1/g (x) is concave if g is concave and positive
Composition with vector functions

Composition of g : R n → R k and h : R k → R:

f (x) = h(g (x) = h(g1 (x), · · · , gk (x))

when n = 1:

f ′′ (x) = g ′ (x)T ∇2 h(g (x))g ′ (x) + ∇h(g (x))T g ′′ (x)

f is convex if
◮ gi convex, h convex, h̃ nondecreasing in each argument
Pm
◮ example: i =1 exp gi (x) is convex if gi is convex
◮ gi concave, h convex, h̃ nonincreasing in each argument
Pm
◮ example: i =1 log gi (x) is concave if gi is concave and positive
Perspective

The perspective of a function f : R n → R is the function g : R n+1 → R,


x
g (x, t) = tf ( ) dom g = {(x, t) | x/t ∈ dom f , t > 0}
t
f is convex =⇒ g is convex.

For t > 0,

(x, t, s) ∈ epi g ⇐⇒ f (x/t) ≤ s/t ⇐⇒ (x/t, s/t) ∈ epi f

Hence epi g is the inverse image of epi f under the perspective mapping

examples:
◮ f (x) = x T x =⇒ g (x, t) = x T x/t is convex for t > 0
2
◮ f (x) = − log x =⇒ g (x, t) = t log t − t log x is convex on R++
The conjugate function
the conjugate of a function f is

f ∗ (y ) = sup y T x − f (x)
x∈dom f

Figure: Conjugate function


The conjugate function: examples

◮ negative logarithm f (x) = − log x



∗ −1 − log(−y ) y <0
f (y ) = sup xy + log x =
x>0 ∞ otherwise

◮ strictly convex quadratic f (x) = 1/2x T Qx with Q ∈ S++


n

f ∗ (y ) = sup y T x − 1/2x T Qx
x
1
= y T Q −1 y
2
The conjugate function: properties

Properties:
◮ f ∗ is convex

◮ f is convex and closed (i.e., epi f is closed) =⇒ f ∗∗ = f .


◮ Frechel’s inequality: f (x) + f ∗ (y ) ≥ x T y
◮ Example: with f (x) = (1/2)x T Qx with Q ∈ S++
n
, we have

x T y ≤ (1/2)x T Qx + (1/2)y T Q −1 y
Convexity with respect to generalized inequalities

f : R n → R m is K -convex if dom f is convex and

f (θx + (1 − θ)y ) K θf (x) + (1 − θ)f (y ) ∀x, y ∈ dom f , θ ∈ [0, 1]


Convexity with respect to generalized inequalities

f : R n → R m is K -convex if dom f is convex and

f (θx + (1 − θ)y ) K θf (x) + (1 − θ)f (y ) ∀x, y ∈ dom f , θ ∈ [0, 1]

example: f : S m → S m with f (X ) = X 2 is S+m -convex


Proof.
for fixed z ∈ R m , z T X 2 z = kXzk22 is convex in X , i.e.,

z T (θX + (1 − θ)Y )2 z ≤ θz T X 2 z + (1 − θ)z T Y 2 z ∀X , Y ∈ S m


=⇒ (θX + (1 − θ)Y )2  θX 2 + (1 − θ)Y 2
Quasiconvex functions

Definition
A function f (x) is quasiconvex if ∀x, y ∈ dom f ,

f (θx + (1 − θ)y ) ≤ max{f (x), f (y )} ∀θ ∈ [0, 1]

Theorem
f (x) is quasiconvex if and only if every level set of f is convex.
Quasiconvex functions: level sets

Theorem
f (x) is quasiconvex if and only if every level set of f is convex.

Proof.
1. Suppose f is quasiconvex. Suppose x, y ∈ dom f belongs to level set
Sa = {x | f (x) ≤ a}. Then

f (θx + (1 − θ)y ) ≤ max{f (x), f (y )} ≤ a

Thus θx + (1 − θ)y ∈ Sa for all θ ∈ [0, 1], so Sa is convex.

2. Suppose every level set of f is convex. For any x, y ∈ dom f , let


a = max{f (x), f (y )}. Clearly x, y ∈ Sa . Because Sa is convex,
θx + (1 − θ)y ∈ Sa for any θ ∈ [0, 1]. Thus f is quasiconvex.

You might also like