Topics in Convex Optimisation (Lent 2023) Lecturer: Hamza Fawzi

3 Smoothness and strong convexity


3.1 Dual norms
Recall that if $\|\cdot\|$ is a norm on $\mathbb{R}^n$, then the dual norm is defined by

\[ \|y\|_* = \sup_{\|x\| = 1} \langle y, x \rangle. \]

In particular we have the generalized Cauchy-Schwarz inequality

\[ \langle x, y \rangle \le \|x\| \, \|y\|_* \quad \forall x, y \in \mathbb{R}^n. \]


Exercise: Show that the dual norm of the Euclidean norm $\|x\|_2 = \sqrt{\langle x, x \rangle}$ is the Euclidean norm itself. More generally, show that the dual of the $p$-norm $\|x\|_p = (\sum_i |x_i|^p)^{1/p}$ is the $q$-norm, where $1/p + 1/q = 1$.
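A quick numerical sanity check of this exercise (a Python sketch, not part of the original notes): for $y \ne 0$, the supremum defining the dual of the $p$-norm is attained at $x_i = \operatorname{sign}(y_i)\,|y_i|^{q-1}/\|y\|_q^{q-1}$, which has unit $p$-norm and achieves $\langle y, x \rangle = \|y\|_q$.

import numpy as np

# Check that the dual of the p-norm is the q-norm (1/p + 1/q = 1) by
# evaluating <y, x> at the closed-form maximizer of the dual-norm sup.
rng = np.random.default_rng(0)
y = rng.standard_normal(5)
p, q = 3.0, 1.5                    # conjugate exponents: 1/3 + 2/3 = 1

yq = np.linalg.norm(y, q)
x = np.sign(y) * np.abs(y) ** (q - 1) / yq ** (q - 1)

print(np.linalg.norm(x, p))        # ~1.0, so x is feasible for the sup
print(y @ x, yq)                   # both print ||y||_q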

3.2 L-smoothness
We say that a differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ is $L$-smooth with respect to a norm $\|\cdot\|$ if, for any $x, y \in \operatorname{int} \operatorname{dom}(f)$,

\[ \|\nabla f(x) - \nabla f(y)\|_* \le L \|x - y\|, \tag{1} \]

where $\|\cdot\|_*$ is the dual norm to $\|\cdot\|$. (We will sometimes omit the reference to the norm, in which case we work with the Euclidean norm.) The following lemma will be important in the analysis of optimization algorithms.

Lemma 1 (Descent lemma). If $f$ is $L$-smooth, then for any $x \in \operatorname{int} \operatorname{dom}(f)$ and $y \in \operatorname{dom}(f)$,

\[ f(y) \le f(x) + \langle \nabla f(x), y - x \rangle + \frac{L}{2} \|y - x\|^2. \tag{2} \]
Remark 1. To appreciate the implication of the inequality above, assume $\|\cdot\| = \|\cdot\|_2$ is the Euclidean norm, and consider taking $y = x - (1/L)\nabla f(x)$, i.e., one step of the gradient method with step size $t = 1/L$. Then we get $f(y) \le f(x) - \frac{1}{2L} \|\nabla f(x)\|_2^2 < f(x)$ (provided $\nabla f(x) \ne 0$), i.e., the function value decreases at each iteration.
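To see Remark 1 in action, here is a minimal sketch (assuming $f$ is a positive definite quadratic, for which $L$ is the largest Hessian eigenvalue, per Remark 3 below): one gradient step with step size $1/L$ decreases $f$ by at least $\frac{1}{2L}\|\nabla f(x)\|_2^2$.

import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = M @ M.T + np.eye(5)                  # positive definite Hessian
b = rng.standard_normal(5)

f = lambda x: 0.5 * x @ A @ x - b @ x    # L-smooth quadratic
grad = lambda x: A @ x - b
L = np.linalg.eigvalsh(A).max()          # smoothness constant

x = rng.standard_normal(5)
y = x - grad(x) / L                      # one gradient step with t = 1/L
decrease = f(x) - f(y)
print(decrease >= np.linalg.norm(grad(x)) ** 2 / (2 * L) - 1e-12)  # True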

Proof of Lemma 1. Let $h = y - x$ and $\varphi(t) = f(x + th) - (f(x) + t \langle \nabla f(x), h \rangle)$. Then $\varphi$ is differentiable with $\varphi(0) = 0$, and

\[ \varphi'(t) = \langle \nabla f(x + th) - \nabla f(x), h \rangle \le \|\nabla f(x + th) - \nabla f(x)\|_* \|h\| \le L t \|h\|^2, \]

where we used the generalized Cauchy-Schwarz inequality and the Lipschitz assumption (1). Thus it follows that $\varphi(1) = \varphi(0) + \int_0^1 \varphi'(t) \, dt \le \frac{L}{2} \|h\|^2$, which gives precisely the desired inequality (2), since $\varphi(1) = f(y) - f(x) - \langle \nabla f(x), y - x \rangle$.

A simple way to check $L$-smoothness is by analyzing the Hessian matrix. One can show the following proposition:

Proposition 3.1. Assume $f : \mathbb{R}^n \to \mathbb{R}$ is such that $\operatorname{dom}(f)$ is open, and $f$ is twice continuously differentiable on its domain. Then $f$ is $L$-smooth if, and only if, for all $x \in \operatorname{dom}(f)$,

\[ \forall u, v \in \mathbb{R}^n, \quad \langle \nabla^2 f(x) u, v \rangle \le L \|u\| \|v\|. \tag{3} \]

Remark 2. Condition (3) can be equivalently written as $\|\nabla^2 f(x) u\|_* \le L \|u\|$ for all $u \in \mathbb{R}^n$. Equivalently, this is saying that the $(\mathbb{R}^n, \|\cdot\|) \to (\mathbb{R}^n, \|\cdot\|_*)$ induced norm of the linear map $\nabla^2 f(x)$ is at most $L$.

Proof. (⇐) Let $x, y \in \operatorname{dom}(f)$. The fundamental theorem of calculus applied to the function $t \mapsto \nabla f(x + th)$ with $h = y - x$ tells us that $\nabla f(y) - \nabla f(x) = \int_0^1 \nabla^2 f(x + th) h \, dt$. Thus we can write

\[ \|\nabla f(y) - \nabla f(x)\|_* \le \int_0^1 \|\nabla^2 f(x + th) h\|_* \, dt \le \int_0^1 L \|h\| \, dt = L \|h\| \]

as desired, where the second inequality uses condition (3) in the form of Remark 2.
(⇒) Assume $f$ is $L$-smooth. Let $u, v$ be arbitrary vectors, and define $\psi(t) = \langle \nabla f(x + tu) - \nabla f(x), v \rangle$. Then by generalized Cauchy-Schwarz and $L$-smoothness, $\psi(t) \le \|\nabla f(x + tu) - \nabla f(x)\|_* \|v\| \le L t \|u\| \|v\|$, and so $\psi'(0) = \lim_{t \to 0} (\psi(t) - \psi(0))/t \le L \|u\| \|v\|$. But $\psi'(0) = \langle \nabla^2 f(x) u, v \rangle$.

Remark 3. If $\|\cdot\| = \|\cdot\|_2$ is the Euclidean norm, then condition (3) is equivalent to saying that the eigenvalues of $\nabla^2 f(x)$ are all in $[-L, L]$.
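As an illustration of Proposition 3.1 and Remark 3, consider the logistic-type function $f(x) = \log(1 + e^{\langle w, x \rangle})$ (an example chosen for this sketch, not taken from the notes). Its Hessian is $\sigma'(\langle w, x \rangle) \, w w^\top$, whose only nonzero eigenvalue is at most $\|w\|_2^2 / 4$, so this value of $L$ should make (1) hold; the sketch below spot-checks this.

import numpy as np

rng = np.random.default_rng(2)
w = rng.standard_normal(4)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
grad = lambda x: sigmoid(w @ x) * w      # gradient of log(1 + exp(<w, x>))
L = np.linalg.norm(w) ** 2 / 4           # bound on the largest Hessian eigenvalue

for _ in range(1000):                    # spot-check the Lipschitz bound (1)
    x, y = rng.standard_normal((2, 4))
    assert np.linalg.norm(grad(x) - grad(y)) <= L * np.linalg.norm(x - y) + 1e-12
print("Lipschitz bound (1) holds on all sampled pairs")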

3.3 Strong convexity


We say that $f$ is $m$-strongly convex (with respect to the norm $\|\cdot\|$) if for any $x, y \in \operatorname{dom}(f)$ and $t \in [0, 1]$,

\[ f(tx + (1-t)y) \le t f(x) + (1-t) f(y) - \frac{m}{2} t(1-t) \|x - y\|^2. \tag{4} \]
• If $f$ is $m$-strongly convex and differentiable at $x$, then for any $y \in \operatorname{dom}(f)$ we have

\[ f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle + \frac{m}{2} \|y - x\|^2. \tag{5} \]

(This can be proved simply by subtracting $f(x)$ from both sides of (4), dividing by $t$, and letting $t \to 0$.) The converse is also true, i.e., if $\operatorname{dom}(f)$ is open, $f$ is differentiable everywhere on its domain, and (5) holds for all $x, y \in \operatorname{dom}(f)$, then $f$ is $m$-strongly convex. (Exercise)

• If $f$ is twice continuously differentiable on its domain (assumed open), then strong convexity is equivalent to $\langle \nabla^2 f(x) h, h \rangle \ge m \|h\|^2$ for all $x \in \operatorname{dom}(f)$ and $h \in \mathbb{R}^n$. (Proof left as an exercise; see the sketch after this list for a numerical illustration.)
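A minimal numerical sketch of the two characterizations above (assuming $f$ is a positive definite quadratic, for which $m$ is the smallest Hessian eigenvalue): the lower bound (5) holds at randomly sampled pairs of points.

import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
A = M @ M.T + 0.5 * np.eye(4)            # positive definite Hessian
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
m = np.linalg.eigvalsh(A).min()          # strong convexity constant

for _ in range(1000):                    # spot-check the lower bound (5)
    x, y = rng.standard_normal((2, 4))
    lower = f(x) + grad(x) @ (y - x) + 0.5 * m * np.linalg.norm(y - x) ** 2
    assert f(y) >= lower - 1e-10
print("lower bound (5) holds on all sampled pairs")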

Remark 4. When considering the Euclidean norm, we see that a convex function $f$ is $L$-smooth if, and only if, $\nabla^2 f(x) \preceq L I$, i.e., $L I - \nabla^2 f(x)$ is positive semidefinite (where $I$ is the identity matrix), i.e., all the eigenvalues of $\nabla^2 f(x)$ are $\le L$. Similarly, a function $f$ is $m$-strongly convex if, and only if, $\nabla^2 f(x) \succeq m I$, i.e., all the eigenvalues of $\nabla^2 f(x)$ are $\ge m$.

To summarize, if a function $f$ is $L$-smooth and $m$-strongly convex, then we can find, at any point $x \in \operatorname{int} \operatorname{dom}(f)$, global quadratic lower and upper bounds on $f$:

\[ \underbrace{f(x) + \langle \nabla f(x), y - x \rangle + \frac{m}{2} \|y - x\|^2}_{\text{strong convexity}} \le f(y) \le \underbrace{f(x) + \langle \nabla f(x), y - x \rangle + \frac{L}{2} \|y - x\|^2}_{\text{$L$-smoothness}}. \tag{6} \]

The ratio $\kappa = L/m$ can be interpreted as a condition number of $f$. This quantity will play a prominent role in the convergence analysis of optimization algorithms for strongly convex functions.
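To preview why $\kappa$ matters, here is a sketch on an ill-conditioned diagonal quadratic (an illustrative example, not from the notes): gradient descent with step size $1/L$ on $f(x) = \frac{1}{2} x^\top A x$ contracts the error along the eigendirection of $m$ by a factor $1 - m/L = 1 - 1/\kappa$ per iteration, so on the order of $\kappa$ iterations are needed to reduce the error by a constant factor.

import numpy as np

L, m = 100.0, 1.0
A = np.diag([L, m])                      # kappa = L/m = 100
x = np.array([1.0, 1.0])
for k in range(1, 6):
    x = x - (A @ x) / L                  # gradient step with t = 1/L; minimizer is 0
    print(k, np.linalg.norm(x))          # slow axis shrinks by 1 - 1/kappa per step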

The inequalities (6) can be expressed more concisely if we introduce the so-called Bregman divergence of $f$, defined as the gap between $f$ and its linear approximation:

\[ D_f(y|x) = f(y) - (f(x) + \langle \nabla f(x), y - x \rangle). \]

The inequalities above can then be written as:

\[ \frac{m}{2} \|y - x\|^2 \le D_f(y|x) \le \frac{L}{2} \|y - x\|^2. \]
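For a quadratic $f(x) = \frac{1}{2} x^\top A x - \langle b, x \rangle$ the Bregman divergence is exactly $D_f(y|x) = \frac{1}{2}(y - x)^\top A (y - x)$, which makes the sandwich bound easy to check numerically (a sketch, not part of the original notes):

import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((3, 3))
A = M @ M.T + np.eye(3)                  # m, L = extreme eigenvalues of A
b = rng.standard_normal(3)
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b
m, L = np.linalg.eigvalsh(A)[[0, -1]]

x, y = rng.standard_normal((2, 3))
D = f(y) - f(x) - grad(x) @ (y - x)      # Bregman divergence D_f(y|x)
d2 = np.linalg.norm(y - x) ** 2
print(0.5 * m * d2 <= D <= 0.5 * L * d2) # True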
