Lecture Notes On Advanced Functional Analysis
Qamrul Hasan Ansari Advanced Functional Analysis Page 3
Recommended Books:
4. M. C. Joshi and R. K. Bose: Some Topics in Nonlinear Functional Analysis, Wiley Eastern
Limited, New Delhi, 1985.
5. E. Kreyszig: Introductory Functional Analysis with Applications, John Wiley and Sons, New
York, 1989.
Throughout these notes, 0 denotes the zero vector of the corresponding vector space, and
⟨·, ·⟩ denotes the inner product on an inner product space.
1.1.1 Orthogonality
One of the major differences between an inner product space and a normed space is that in an
inner product space we can talk about the angle between two vectors.
Definition 1.1.1. The angle θ between two vectors x and y of an inner product space X is
defined by the following relation:
\[ \cos\theta = \frac{\langle x, y\rangle}{\|x\|\,\|y\|}. \tag{1.1} \]
Definition 1.1.2. Let X be an inner product space whose inner product is denoted by ⟨·, ·⟩.
(a) Two vectors x and y in X are said to be orthogonal if ⟨x, y⟩ = 0. When two vectors x
and y are orthogonal, we write x ⊥ y.
(c) Let A be a nonempty subset of X. The set of all vectors orthogonal to A, denoted by
A⊥, is called the orthogonal complement of A, that is,
A⊥ = {x ∈ X : ⟨x, y⟩ = 0 for all y ∈ A}.
A⊥⊥ = (A⊥)⊥ denotes the orthogonal complement of A⊥, that is,
A⊥⊥ = (A⊥)⊥ = {x ∈ X : ⟨x, y⟩ = 0 for all y ∈ A⊥}.
(d) Two subsets A and B of X are said to be orthogonal, denoted by A ⊥ B, if ⟨x, y⟩ = 0
for all x ∈ A and all y ∈ B.
Clearly, x and y are orthogonal if and only if the angle θ between them is 90°, that is, cos θ = 0,
which is equivalent (in view of (1.1)) to ⟨x, y⟩ = 0, that is, x ⊥ y.
Remark 1.1.1. (a) Since ⟨x, y⟩ = $\overline{\langle y, x\rangle}$ (the complex conjugate of ⟨y, x⟩), ⟨x, y⟩ = 0 implies that
$\overline{\langle y, x\rangle}$ = 0, or ⟨y, x⟩ = 0, and vice versa. Hence, x ⊥ y if and only if y ⊥ x, that is,
orthogonality is a symmetric relation.
(b) Since ⟨x, 0⟩ = 0 for all x, x ⊥ 0 for every x belonging to an inner product space. By
the definition of the inner product, 0 is the only vector orthogonal to itself.
(c) Clearly, {0}⊥ = X and X⊥ = {0}.
(d) If A ⊥ B, then A ∩ B ⊆ {0}.
(e) Nonzero mutually orthogonal vectors, x1 , x2 , x3 , . . . , xn , of an inner product space are
linearly independent (Prove it!).
Example 1.1.1. Let A = {(x, 0, 0) ∈ R3 : x ∈ R} be a line in R3 and B = {(0, y, z) ∈ R3 :
y, z ∈ R} be a plane in R3 . Then A⊥ = B and B ⊥ = A.
Example 1.1.2. Let X = R3 and A be its subspace spanned by a non-zero vector x. The
orthogonal complement of A is the plane through the origin and perpendicular to the vector
x.
Example 1.1.3. Let A be a subspace of R3 generated by the set {(1, 0, 1), (0, 2, 3)}. An
element of A can be expressed as
\[ x = (x_1, x_2, x_3) = \lambda(1, 0, 1) + \mu(0, 2, 3) = \lambda i + 2\mu j + (\lambda + 3\mu)k \]
\[ \Rightarrow\quad x_1 = \lambda,\quad x_2 = 2\mu,\quad x_3 = \lambda + 3\mu. \]
Thus, an element of A is of the form (x1, x2, x1 + (3/2)x2). The orthogonal complement of A
can be constructed as follows: Let x = (x1, x2, x3) ∈ A⊥. Then for y = (y1, y2, y3) ∈ A, we
have
\[ \langle x, y\rangle = x_1 y_1 + x_2 y_2 + x_3 y_3 = x_1 y_1 + x_2 y_2 + x_3\Big(y_1 + \tfrac{3}{2}\, y_2\Big) = (x_1 + x_3)\, y_1 + \Big(x_2 + \tfrac{3}{2}\, x_3\Big) y_2 = 0. \]
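Because y1 and y2 are arbitrary in Example 1.1.3, the last identity forces x1 + x3 = 0 and x2 + (3/2)x3 = 0, so A⊥ is a line through the origin. The following minimal Python sketch recovers a spanning vector of that line as the cross product of the two generators of A, a shortcut that is specific to R3. The helper names dot and cross are illustrative and not part of the notes.

```python
def dot(u, v):
    """Usual inner product on R^3."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product; the result is orthogonal to both u and v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

a1 = (1, 0, 1)   # generators of A from Example 1.1.3
a2 = (0, 2, 3)
n = cross(a1, a2)

print(n)                        # a vector spanning the orthogonal complement of A
print(dot(n, a1), dot(n, a2))   # both inner products vanish
```

Any nonzero multiple of n spans the same line, and its coordinates satisfy the two linear conditions derived above.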
Answer. A⊥ is the straight line spanned by the vector (1, −1, 1).
Proof. Let x, y ∈ A⊥. Then, ⟨x, z⟩ = 0 for all z ∈ A and ⟨y, z⟩ = 0 for all z ∈ A. Since for
arbitrary scalars α, β, ⟨αx + βy, z⟩ = α⟨x, z⟩ + β⟨y, z⟩ = 0 for all z ∈ A, we get
αx + βy ∈ A⊥. So A⊥ is a subspace of X.
To show that A⊥ is closed, let {xn} ⊆ A⊥ be such that xn → y. We need to show that y
belongs to A⊥. Since xn ∈ A⊥, ⟨xn, x⟩ = 0 for all x ∈ A and all n. Since ⟨·, ·⟩ is a continuous
function, for every x ∈ A we have
\[ 0 = \lim_{n\to\infty} \langle x_n, x\rangle = \Big\langle \lim_{n\to\infty} x_n,\, x\Big\rangle = \langle y, x\rangle. \]
Hence, y ∈ A⊥.
Exercise 1.1.2. Let X be an inner product space and A and B be subsets of X. Prove the
following assertions:
(b) Let y ∈ A, but y ∉ A⊥⊥. Then there exists an element z ∈ A⊥ such that ⟨y, z⟩ ≠ 0.
Since z ∈ A⊥, ⟨y, z⟩ = 0, which is a contradiction. Hence, y ∈ A⊥⊥.
Exercise 1.1.3. Let X be an inner product space and A and B be subsets of X. Prove the
following assertions:
(b) A⊥ = A⊥⊥⊥ .
Exercise 1.1.5. Let A be a subspace of a Hilbert space X. Show that A is closed
if and only if A = A⊥⊥.
The well-known Pythagorean theorem of plane geometry says that the sum of the squares
of the base and the perpendicular in a right-angled triangle is equal to the square of the
hypotenuse. Its infinite-dimensional analogue is as follows.
Proof. Note that ‖x + y‖² = ⟨x + y, x + y⟩ = ⟨x, x⟩ + ⟨y, x⟩ + ⟨x, y⟩ + ⟨y, y⟩. Since x ⊥ y,
⟨x, y⟩ = 0 and ⟨y, x⟩ = 0, and so ‖x + y‖² = ‖x‖² + ‖y‖².
Exercise 1.1.6. Let K and D be subsets of an inner product space X. Show that
(K + D)⊥ = K ⊥ ∩ D ⊥ ,
where K + D = {x + y : x ∈ K, y ∈ D}.
Exercise 1.1.8. Let X be an inner product space and, for a nonzero vector y ∈ X, let Ky :=
{x ∈ X : ⟨x, y⟩ = 0}. Determine the subspace Ky⊥.
(a) A subset A of nonzero vectors in X is said to be orthogonal if any two distinct elements
in A are orthogonal.
(a) A family of vectors {xα}α∈Λ in an inner product space X is said to be orthogonal if
xα ⊥ xβ for all α, β ∈ Λ, α ≠ β.
(b) A family of vectors {xα}α∈Λ in an inner product space X is said to be orthonormal if
it is orthogonal and ‖xα‖ = 1 for all α ∈ Λ, that is, for all α, β ∈ Λ, we have
\[ \langle x_\alpha, x_\beta\rangle = \delta_{\alpha\beta} = \begin{cases} 0, & \text{if } \alpha \neq \beta, \\ 1, & \text{if } \alpha = \beta. \end{cases} \tag{1.3} \]
Example 1.1.4. The standard (canonical) basis for Rn (with the usual inner product),
e1 = (1, 0, 0, . . . , 0),
e2 = (0, 1, 0, . . . , 0),
⋮
en = (0, 0, 0, . . . , 1),
forms an orthonormal set.
Clearly, c00 ⊆ ℓ∞ .
P[a, b] is not complete with respect to the norm ‖f‖∞ = sup_{x∈[a,b]} |f(x)|.
However, P [a, b] is dense in C[a, b] with k · k∞ .
\[ \|x\| = \langle x, x\rangle^{1/2} = \Big(\sum_{n=1}^{\infty} |x_n|^2\Big)^{1/2}. \]
The space ℓp with p ≠ 2 is not an inner product space, and hence not a Hilbert space.
However, ℓp with p ≠ 2 is a Banach space.
For 1 ≤ p < ∞, Lp[a, b] is a complete normed space with respect to the norm
\[ \|f\|_p = \Big(\int_a^b |f|^p \, d\mu\Big)^{1/p}. \]
Note that ‖f‖p does not define a norm on Lp[a, b] for 0 < p < 1.
Example 1.1.5. Consider ℓ2 space and its subset E = {e1 , e2 , . . .} with en = δnj , that is,
The set E = {e1, e2, . . .} with en = (δnj)j∈N, that is, en = (0, 0, . . . , 0, 1, 0, . . .) (1 is at the nth place),
forms an orthonormal set in c00, and {en} is an orthonormal sequence.
Example 1.1.7. Consider the space C[0, 2π] with the inner product
\[ \langle f, g\rangle = \int_0^{2\pi} f(t)\, g(t)\, dt, \quad \text{for all } f, g \in C[0, 2\pi]. \]
Consider the sets E = {u0, u1, u2, . . .} and G = {v1, v2, . . .}, or the sequences {un} and {vn},
where
un(t) = cos nt, for all n = 0, 1, 2, . . . ,
and
vn(t) = sin nt, for all n = 1, 2, . . . .
Then, E = {u0, u1, u2, . . .} is an orthogonal set and {un} is an orthogonal sequence. Also,
G = {v1, v2, . . .} is an orthogonal set and {vn} is an orthogonal sequence.
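The orthogonality claims of Example 1.1.7 can be checked numerically. The sketch below approximates the inner product on C[0, 2π] by a midpoint rule; the function names inner, u and v and the number of quadrature steps are assumptions made for illustration only.

```python
import math

def inner(f, g, steps=2000):
    """Midpoint-rule approximation of the integral of f(t) g(t) over [0, 2*pi]."""
    h = 2.0 * math.pi / steps
    return h * sum(f((k + 0.5) * h) * g((k + 0.5) * h) for k in range(steps))

def u(n):
    return lambda t: math.cos(n * t)

def v(n):
    return lambda t: math.sin(n * t)

print(inner(u(1), u(2)))   # close to 0: distinct cosines are orthogonal
print(inner(v(2), v(5)))   # close to 0: distinct sines are orthogonal
print(inner(u(3), u(3)))   # close to pi, so u_3 is not a unit vector
print(inner(u(0), u(0)))   # close to 2*pi
```

The last two values show that these families are orthogonal but not orthonormal; dividing un and vn (n ≥ 1) by √π and u0 by √(2π) normalizes them.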
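Proof. We have
\[ \Big\| \sum_{i=1}^{n} x_i \Big\|^2 = \Big\langle \sum_{i=1}^{n} x_i,\ \sum_{i=1}^{n} x_i \Big\rangle = \sum_{i=1}^{n} \langle x_i, x_i\rangle + \sum_{\substack{i,j=1 \\ i \neq j}}^{n} \langle x_i, x_j\rangle = \sum_{i=1}^{n} \langle x_i, x_i\rangle = \sum_{i=1}^{n} \|x_i\|^2. \]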
as ‖uj‖ = 1 since {u1, u2, . . .} is an orthonormal sequence. Therefore, the unknown coefficients
αk in (1.5) can be easily calculated.
The following Gram-Schmidt process shows how to obtain an orthonormal sequence
from an arbitrary linearly independent sequence.
span{u1, u2, . . . , un} = span{x1, x2, . . . , xn}.
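1st Step. Take v1 = x1 and u1 = v1/‖v1‖.
2nd Step. Take
\[ v_2 = x_2 - \langle x_2, u_1\rangle u_1 = x_2 - \Big\langle x_2, \frac{v_1}{\|v_1\|}\Big\rangle \frac{v_1}{\|v_1\|} = x_2 - \frac{\langle x_2, v_1\rangle}{\langle v_1, v_1\rangle}\, v_1 \quad\text{and}\quad u_2 = \frac{v_2}{\|v_2\|}. \]
3rd Step. Take
\[ v_3 = x_3 - \sum_{j=1}^{2} \langle x_3, u_j\rangle u_j = x_3 - \sum_{j=1}^{2} \frac{\langle x_3, v_j\rangle}{\langle v_j, v_j\rangle}\, v_j \quad\text{and}\quad u_3 = \frac{v_3}{\|v_3\|}. \]
⋮
nth Step. Take
\[ v_n = x_n - \sum_{j=1}^{n-1} \langle x_n, u_j\rangle u_j = x_n - \sum_{j=1}^{n-1} \frac{\langle x_n, v_j\rangle}{\langle v_j, v_j\rangle}\, v_j \quad\text{and}\quad u_n = \frac{v_n}{\|v_n\|}. \]
⋮
Then {vn} is an orthogonal sequence of vectors in X and {un} is an orthonormal sequence
in X. Also, for every n:
span{u1, u2, . . . , un} = span{x1, x2, . . . , xn}.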
Theorem 1.1.3. Let {xn} be a linearly independent sequence in an inner product space
X. Let v1 = x1, and
\[ v_n = x_n - \sum_{j=1}^{n-1} \frac{\langle x_n, v_j\rangle}{\langle v_j, v_j\rangle}\, v_j, \quad \text{for } n = 2, 3, \ldots. \]
Then {v1, v2, . . .} is an orthogonal set, {un} is an orthonormal sequence where un = vn/‖vn‖,
and
span{x1, x2, . . . , xk} = span{u1, u2, . . . , uk}, for all k = 1, 2, . . . , n.
Proof. Since {xn} is a sequence of linearly independent vectors, xn ≠ 0 for all n. Define
v1 = x1 and
\[ v_2 = x_2 - \frac{\langle x_2, v_1\rangle}{\langle v_1, v_1\rangle}\, v_1. \]
Clearly, v2 ∈ span{x1, x2} and
\[ \langle v_2, v_1\rangle = \langle x_2, v_1\rangle - \frac{\langle x_2, v_1\rangle}{\langle v_1, v_1\rangle}\, \langle v_1, v_1\rangle = 0, \]
that is, v2 and v1 are orthogonal. Since {x1, x2} is linearly independent, v2 ≠ 0. Then, by
Exercise 1.1.3 (d), {v1, v2} is linearly independent, and hence, it follows that
span{v1, v2} = span{x1, x2}.
Continuing in this way, we define an orthogonal set {v1, v2, . . . , vn−1} such that
Let
\[ v_n = x_n - \sum_{j=1}^{n-1} \frac{\langle x_n, v_j\rangle}{\langle v_j, v_j\rangle}\, v_j. \]
Then we have vk ∈ span{x1, x2, . . . , xk} and ⟨vk, vi⟩ = 0 for i < k. Again, since {x1, x2, . . . , xk}
is linearly independent, vk ≠ 0. Thus, {v1, v2, . . . , vn} is the required orthogonal set and
{u1, u2, . . . , un} is the required orthonormal set.
Exercise 1.1.13. Let Y be the plane in R3 spanned by the vectors x1 = (1, 2, 2) and
x2 = (−1, 0, 2), that is, Y = span{x1, x2}. Find an orthonormal basis for Y and for R3.
Solution. {x1, x2} is a basis for the plane Y. We can extend it to a basis for R3 by adding one
vector from the standard basis. For instance, the vectors x1, x2 and x3 = (0, 0, 1) form a basis
for R3 because
\[ \begin{vmatrix} 1 & 2 & 2 \\ -1 & 0 & 2 \\ 0 & 0 & 1 \end{vmatrix} = \begin{vmatrix} 1 & 2 \\ -1 & 0 \end{vmatrix} = 2 \neq 0. \]
By using the Gram-Schmidt process, we orthogonalize the basis x1 = (1, 2, 2), x2 = (−1, 0, 2)
and x3 = (0, 0, 1):
\[ v_1 = x_1 = (1, 2, 2), \]
\[ v_2 = x_2 - \frac{\langle x_2, v_1\rangle}{\langle v_1, v_1\rangle}\, v_1 = (-1, 0, 2) - \frac{3}{9}\,(1, 2, 2) = (-4/3, -2/3, 4/3), \]
\[ v_3 = x_3 - \frac{\langle x_3, v_1\rangle}{\langle v_1, v_1\rangle}\, v_1 - \frac{\langle x_3, v_2\rangle}{\langle v_2, v_2\rangle}\, v_2 = (0, 0, 1) - \frac{2}{9}\,(1, 2, 2) - \frac{4/3}{4}\,(-4/3, -2/3, 4/3) = (2/9, -2/9, 1/9). \]
Now, v1 = (1, 2, 2), v2 = (−4/3, −2/3, 4/3), v3 = (2/9, −2/9, 1/9) is an orthogonal basis for
R3, while v1, v2 is an orthogonal basis for Y. The orthonormal basis for Y is
\[ u_1 = \frac{v_1}{\|v_1\|} = \frac{1}{3}(1, 2, 2), \quad u_2 = \frac{v_2}{\|v_2\|} = \frac{1}{3}(-2, -1, 2). \]
The orthonormal basis for R3 is
\[ u_1 = \frac{v_1}{\|v_1\|} = \frac{1}{3}(1, 2, 2), \quad u_2 = \frac{v_2}{\|v_2\|} = \frac{1}{3}(-2, -1, 2), \quad u_3 = \frac{v_3}{\|v_3\|} = \frac{1}{3}(2, -2, 1). \]
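The solution of Exercise 1.1.13 can be checked numerically. The following is a minimal sketch of the Gram-Schmidt process for vectors in Rn with the usual inner product; the function name gram_schmidt and the tolerance 1e-12 used to detect linear dependence are assumptions for illustration, not part of the notes.

```python
import math

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return (vs, us): the orthogonal vectors v_n and the orthonormal u_n = v_n / ||v_n||."""
    vs, us = [], []
    for x in vectors:
        v = list(x)
        for w in vs:
            c = dot(x, w) / dot(w, w)          # coefficient <x_n, v_j> / <v_j, v_j>
            v = [vi - c * wi for vi, wi in zip(v, w)]
        norm = math.sqrt(dot(v, v))
        if norm < 1e-12:                       # assumed tolerance
            raise ValueError("input vectors are linearly dependent")
        vs.append(v)
        us.append([vi / norm for vi in v])
    return vs, us

vs, us = gram_schmidt([(1, 2, 2), (-1, 0, 2), (0, 0, 1)])
for u in us:
    print([round(3 * c, 10) for c in u])       # three times each u_n
```

Up to rounding, the printed rows are (1, 2, 2), (−2, −1, 2) and (2, −2, 1), in agreement with the orthonormal basis found above.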
Exercise 1.1.14. Let {un} be an orthonormal sequence in an inner product space X. Prove
the following statements (use the Pythagorean theorem).
(a) If $w = \sum_{n=1}^{\infty} \alpha_n u_n$, then $\|w\|^2 = \sum_{n=1}^{\infty} |\alpha_n|^2$, where the αn's are scalars.
(b) If x ∈ X and $s_N = \sum_{n=1}^{N} \langle x, u_n\rangle u_n$, then ‖x‖² = ‖x − sN‖² + ‖sN‖².
(c) If x ∈ X, $s_N = \sum_{n=1}^{N} \langle x, u_n\rangle u_n$, and XN = span{u1, u2, . . . , uN}, then
$\|x - s_N\| = \min_{y \in X_N} \|x - y\|$ (this is called the best approximation property).
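Parts (b) and (c) of Exercise 1.1.14 can be illustrated (not proved) on a small example. The sketch below works in R3 with N = 2; the orthonormal vectors u1, u2, the vector x and the perturbation grid are sample data chosen for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def combo(coeffs, vectors):
    """Linear combination sum_k coeffs[k] * vectors[k]."""
    return [sum(c * w[i] for c, w in zip(coeffs, vectors))
            for i in range(len(vectors[0]))]

# Assumed sample data: two orthonormal vectors in R^3 and a vector x.
u1 = (1 / 3, 2 / 3, 2 / 3)
u2 = (-2 / 3, -1 / 3, 2 / 3)
x = (1.0, 1.0, 1.0)

coeffs = [dot(x, u1), dot(x, u2)]            # coefficients <x, u_n>
s = combo(coeffs, [u1, u2])                  # s_N with N = 2
residual = sub(x, s)

lhs = dot(x, x)                              # ||x||^2
rhs = dot(residual, residual) + dot(s, s)    # ||x - s_N||^2 + ||s_N||^2

def dist(y):
    d = sub(x, y)
    return math.sqrt(dot(d, d))

best = dist(s)
# Distances from x to other elements of span{u1, u2}, obtained by perturbing the coefficients.
others = [dist(combo([coeffs[0] + da, coeffs[1] + db], [u1, u2]))
          for da in (-0.5, 0.0, 0.5) for db in (-0.5, 0.0, 0.5)]

print(lhs, rhs)            # the two values agree up to rounding
print(best, min(others))   # no perturbed combination is closer to x than s_N
```

The same data also satisfy |⟨x, u1⟩|² + |⟨x, u2⟩|² ≤ ‖x‖², since the left-hand side equals ‖sN‖².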
Theorem 1.1.4 (Bessel's inequality). Let {uk} be an orthonormal set in an inner product
space X. Then for any x ∈ X, we have
\[ \sum_{k=1}^{\infty} |\langle x, u_k\rangle|^2 \le \|x\|^2. \]
Proof. Let $x_n = \sum_{k=1}^{n} \langle x, u_k\rangle u_k$ be the nth partial sum. Then, by using the properties of the
inner product and applying the fact that
\[ \langle u_i, u_j\rangle = \delta_{ij} = \begin{cases} 0, & \text{if } i \neq j, \\ 1, & \text{if } i = j, \end{cases} \]
we have
Exercise 1.1.15. Let {ui} be a countably infinite orthonormal set in a Hilbert space X.
Then prove the following statements:
(a) The infinite series $\sum_{n=1}^{\infty} \alpha_n u_n$, where the αn's are scalars, converges if and only if the series
$\sum_{n=1}^{\infty} |\alpha_n|^2$ converges, that is, $\sum_{n=1}^{\infty} |\alpha_n|^2 < \infty$.
(b) If $\sum_{n=1}^{\infty} \alpha_n u_n$ converges and
\[ x = \sum_{n=1}^{\infty} \alpha_n u_n = \sum_{n=1}^{\infty} \beta_n u_n, \]
then αn = βn for all n and $\|x\|^2 = \sum_{n=1}^{\infty} |\alpha_n|^2$.
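Proof. (a) Let $\sum_{n=1}^{\infty} \alpha_n u_n$ be convergent and assume that
\[ x = \sum_{n=1}^{\infty} \alpha_n u_n, \quad \text{or equivalently,} \quad \lim_{N\to\infty} \Big\| x - \sum_{n=1}^{N} \alpha_n u_n \Big\|^2 = 0. \]
Now, for m = 1, 2, . . . ,
\[ \langle x, u_m\rangle = \Big\langle \sum_{n=1}^{\infty} \alpha_n u_n,\ u_m \Big\rangle = \sum_{n=1}^{\infty} \alpha_n \langle u_n, u_m\rangle = \alpha_m \quad (\text{as } \{u_i\} \text{ is orthonormal}). \]
which shows that $\sum_{n=1}^{\infty} |\alpha_n|^2$ converges.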
To prove the converse, assume that $\sum_{n=1}^{\infty} |\alpha_n|^2$ is convergent. Consider the finite sum
$s_n = \sum_{i=1}^{n} \alpha_i u_i$. Then, for n > m, we have
\[ \|s_n - s_m\|^2 = \Big\langle \sum_{i=m+1}^{n} \alpha_i u_i,\ \sum_{i=m+1}^{n} \alpha_i u_i \Big\rangle = \sum_{i=m+1}^{n} |\alpha_i|^2 \to 0 \quad \text{as } n, m \to \infty. \]
This means that {sn} is a Cauchy sequence. Since X is complete, the sequence of partial
sums {sn} is convergent in X, and therefore, the series $\sum_{n=1}^{\infty} \alpha_n u_n$ converges.
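(b) We first prove that $\|x\|^2 = \sum_{n=1}^{\infty} |\alpha_n|^2$. We have
\[ \|x\|^2 - \sum_{n=1}^{N} |\alpha_n|^2 = \langle x, x\rangle - \sum_{n=1}^{N} \sum_{m=1}^{N} \langle \alpha_n u_n, \alpha_m u_m\rangle \]
\[ = \Big\langle x,\ x - \sum_{n=1}^{N} \alpha_n u_n \Big\rangle + \Big\langle \sum_{n=1}^{N} \alpha_n u_n,\ x - \sum_{n=1}^{N} \alpha_n u_n \Big\rangle \]
\[ \le \Big\| x - \sum_{n=1}^{N} \alpha_n u_n \Big\| \Big( \|x\| + \Big\| \sum_{n=1}^{N} \alpha_n u_n \Big\| \Big) = M. \]
Since $\sum_{n=1}^{N} \alpha_n u_n$ converges to x, M converges to zero, proving the result.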
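If $x = \sum_{n=1}^{\infty} \alpha_n u_n = \sum_{n=1}^{\infty} \beta_n u_n$, then
\[ 0 = \lim_{N\to\infty} \Big[ \sum_{n=1}^{N} (\alpha_n - \beta_n)\, u_n \Big] \quad\Rightarrow\quad 0 = \sum_{n=1}^{\infty} |\alpha_n - \beta_n|^2, \quad \text{by (a)}, \]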
Prove that
\[ u = \sum_{n=1}^{\infty} \alpha_n u_n \quad\text{and}\quad v = \sum_{n=1}^{\infty} \beta_n u_n \]
are convergent series with respect to the norm of X and $\langle u, v\rangle = \sum_{n=1}^{\infty} \alpha_n \overline{\beta_n}$.
Proof. Let
\[ u_N = \sum_{n=1}^{N} \alpha_n u_n \quad\text{and}\quad v_N = \sum_{n=1}^{N} \beta_n u_n. \]
and so, {uN} is a Cauchy sequence in a complete space X and thus converges to some
u ∈ X. Similarly, {vN} is a Cauchy sequence in a complete space X that converges to some
v ∈ X. Finally,
\[ \langle u_N, v_N\rangle = \sum_{j,k=1}^{N} \langle \alpha_j u_j, \beta_k u_k\rangle = \sum_{j,k=1}^{N} \alpha_j \overline{\beta_k}\, \langle u_j, u_k\rangle = \sum_{j=1}^{N} \alpha_j \overline{\beta_j}, \]
Recall that if {u1, u2, . . . , un} is a basis of a linear space X, then for every x ∈ X, there
exist scalars α1, α2, . . . , αn such that x = α1u1 + α2u2 + · · · + αnun.
Definition 1.1.4. (a) An orthogonal set of vectors {ui} in an inner product space X is
called an orthogonal basis if for any x ∈ X, there exist scalars αi such that
\[ x = \sum_{i=1}^{\infty} \alpha_i u_i. \]
(b) An orthonormal basis {ui} in a Hilbert space X is called maximal or complete if there
is no unit vector u0 in X such that {u0, u1, u2, . . .} is an orthonormal set. In other
words, the orthonormal sequence {ui} in X is complete if and only if the only
vector orthogonal to each of the ui's is the null vector.
In general, an orthonormal set E in an inner product space X is complete or maximal
if it is a maximal orthonormal set in X, that is, E is an orthonormal set, and for every
orthonormal set Ẽ satisfying E ⊆ Ẽ, we have Ẽ = E.
(c) Let {ui} be an orthonormal basis in a Hilbert space X. Then the numbers αi = ⟨x, ui⟩
are called the Fourier coefficients of the element x with respect to the system {ui}, and
$\sum_{i=1}^{\infty} \alpha_i u_i$ is called the Fourier series of the element x.
Example 1.1.8. The set {ei : i ∈ N}, where ei = (0, 0, . . . , 0, 1, 0, . . .) with 1 in the ith
place, forms an orthonormal basis for ℓ2(C).
Example 1.1.9. Let X = L2(−π, π) be a complex Hilbert space and un be the element of
X defined by
\[ u_n(t) = \frac{1}{\sqrt{2\pi}} \exp(i n t), \quad \text{for } n = 0, \pm 1, \pm 2, \ldots. \]
Then
\[ \Big\{ \frac{1}{\sqrt{2\pi}},\ \frac{\cos nt}{\sqrt{\pi}},\ \frac{\sin nt}{\sqrt{\pi}} : n = 1, 2, \ldots \Big\} \]
forms an orthonormal basis for X as exp(i n t) = cos nt + i sin nt.
Theorem 1.1.5. Let {ui : i ∈ N} be an orthonormal set in a Hilbert space X. Then the
following assertions are equivalent:
(c) For all x ∈ X, $\|x\|^2 = \sum_{i=1}^{\infty} |\langle x, u_i\rangle|^2$.
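Proof. (a) ⇔ (b): Let {ui : i ∈ N} be an orthonormal basis for X. Then we can write
\[ x = \sum_{i=1}^{\infty} \alpha_i u_i, \quad \text{that is,} \quad x = \lim_{n\to\infty} \sum_{i=1}^{n} \alpha_i u_i. \]
For k ≤ n in N, we have
\[ \Big\langle \sum_{i=1}^{n} \alpha_i u_i,\ u_k \Big\rangle = \sum_{i=1}^{n} \alpha_i \langle u_i, u_k\rangle = \alpha_k. \]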
The same argument shows that if (b) holds, then this expansion is unique and so {ui : i ∈ N}
is an orthonormal basis for X.
(b) ⇒ (c): By the Pythagorean theorem and continuity of the inner product, we have
\[ \|x\|^2 = \Big\| \sum_{i=1}^{\infty} \langle x, u_i\rangle u_i \Big\|^2 = \sum_{i=1}^{\infty} |\langle x, u_i\rangle|^2. \]
(c) ⇒ (d): Let ⟨x, ui⟩ = 0 for all i. Then $\|x\|^2 = \sum_{i=1}^{\infty} |\langle x, u_i\rangle|^2 = 0$, which implies that
x = 0.
(d) ⇒ (b): Take any x ∈ X and let $y = x - \sum_{i=1}^{\infty} \langle x, u_i\rangle u_i$. Then for each k ∈ N, we have
\[ \langle y, u_k\rangle = \langle x, u_k\rangle - \lim_{n\to\infty} \Big\langle \sum_{i=1}^{n} \langle x, u_i\rangle u_i,\ u_k \Big\rangle = 0 \]
since eventually n ≥ k. It follows from (d) that y = 0, and hence $x = \sum_{i=1}^{\infty} \langle x, u_i\rangle u_i$.
Theorem 1.1.6 (Fourier Series Representation). Let Y be the closed subspace spanned
by a countable orthonormal set {ui} in a Hilbert space X. Then every element x ∈ Y
can be written uniquely as
\[ x = \sum_{i=1}^{\infty} \langle x, u_i\rangle u_i. \tag{1.6} \]
Proof. Uniqueness of (1.6) is a consequence of Exercise 1.1.15 (b). For any x ∈ Y, we can
write
\[ x = \lim_{N\to\infty} \sum_{i=1}^{M} \alpha_i u_i, \quad \text{for } M \ge N \]
Theorem 1.1.7 (Fourier Series Theorem). For any orthonormal set {un } in a separable
Hilbert space X, the following statements are equivalent:
where αi = ⟨x, ui⟩ are the Fourier coefficients of x, and βi = ⟨y, ui⟩ are the Fourier
coefficients of y.
Proof. (a) ⇒ (b). It follows from (1.6) and the fact that {ui} is orthonormal.
(a) ⇒ (d). The statement (d) is equivalent to the statement that the orthogonal projection
onto $\overline{S}$, the closure of S, is the identity. In view of Theorem 1.1.6, statement (d) is equivalent
to statement (a).
Exercise 1.1.17. Let X be a Hilbert space and E be an orthonormal basis of X. Prove
that E is countable if and only if X is separable.
Let K be a nonempty subset of a normed space X. Recall that distance from an element
x ∈ X to the set K is defined by
One can see in the following figures that even in the simple space R2 , there may be no z
satisfying (1.12), or precisely one such z, or more than one z.
To get the existence and uniqueness of such z, we recall the concept of a convex set.
Definition 1.2.1. A subset K of a vector space X is said to be a convex set if for all
x, y ∈ K and α, β ≥ 0 such that α + β = 1, we have αx + βy ∈ K, that is, for all x, y ∈ K
and α ∈ [0, 1], we have αx + (1 − α)y ∈ K.
Theorem 1.2.1. Let K be a nonempty closed convex subset of a Hilbert space X. Then
for any given x ∈ X, there exists a unique z ∈ K such that
Proof. Existence. Let $\rho := \inf_{y \in K} \|x - y\|$. By the definition of the infimum, there exists a
sequence {yn} in K such that ‖x − yn‖ → ρ as n → ∞. We will prove that {yn} is a Cauchy
sequence.
Figure 1.4: Existence and uniqueness of z that minimizes the distance from K
Therefore, $\lim_{n,m\to\infty} \|y_n - y_m\|^2 = 0$, and thus, {yn} is a Cauchy sequence. Since X is complete,
there exists z ∈ X such that $\lim_{n\to\infty} y_n = z$. Since yn ∈ K and K is closed, z ∈ K. In
conclusion, we have
\[ \|x - z\| = \inf_{y \in K} \|x - y\|. \]
By using the parallelogram law and $\big\| x - \frac{z + \hat z}{2} \big\| \ge \rho$ (since ½(z + ẑ) ∈ K), we have
\[ \|z - \hat z\|^2 + 4 \Big\| x - \frac{z + \hat z}{2} \Big\|^2 = 2\big( \|x - z\|^2 + \|x - \hat z\|^2 \big) = 4\rho^2, \]
that is,
\[ 0 \le \|z - \hat z\|^2 = 4\rho^2 - 4 \Big\| x - \frac{z + \hat z}{2} \Big\|^2 \le 0. \]
Thus, ‖z − ẑ‖ = 0, and hence, z = ẑ.
Remark 1.2.1. Theorem 1.2.1 does not hold in the setting of Banach spaces. For example, c0
is a closed subspace of ℓ∞, but there is no unique closest sequence in c0 to the sequence {1, 1, 1, . . .}.
In fact, the distance between c0 and the sequence {1, 1, 1, . . .} is 1, and this is achieved by
any sequence {xn} in c0 with xn ∈ [0, 2] for all n.
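that is,
\[ 0 \le |\lambda|^2 \|y\|^2 - \lambda\, \overline{\langle x - z, y\rangle} - \overline{\lambda}\, \langle x - z, y\rangle. \]
Putting $\lambda = \frac{\langle x - z, y\rangle}{\|y\|^2}$ in the above inequality, we obtain
\[ \frac{|\langle x - z, y\rangle|^2}{\|y\|^2} \le 0, \]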
Lemma 1.2.1. If K is a proper closed subspace of a Hilbert space X, then there exists
a nonzero vector x ∈ X such that x ⊥ K.
Proof. Let u ∉ K and $\rho = \inf_{y \in K} \|u - y\|$, the distance from u to K. By Theorem 1.2.1, there
exists a unique element z ∈ K such that ‖u − z‖ = ρ. Let x = u − z. Then x ≠ 0 as ρ > 0.
(If x = 0, then u − z = 0 and ‖u − z‖ = 0 implies that ρ = 0.)
Now, we show that x ⊥ K. For this, we show that for arbitrary y ∈ K, ⟨x, y⟩ = 0. For
any scalar α, we have ‖x − αy‖ = ‖u − z − αy‖ = ‖u − (z + αy)‖. Since K is a subspace,
z + αy ∈ K whenever z, y ∈ K. Thus, z + αy ∈ K implies that ‖x − αy‖ ≥ ρ = ‖x‖ or
‖x − αy‖² − ‖x‖² ≥ 0 or ⟨x − αy, x − αy⟩ − ‖x‖² ≥ 0. Since
we have,
\[ -\overline{\alpha}\, \langle x, y\rangle - \alpha\, \overline{\langle x, y\rangle} + |\alpha|^2 \|y\|^2 \ge 0. \]
Putting α = β⟨x, y⟩ in the above inequality, β being an arbitrary real number, and writing
a = |⟨x, y⟩|² and b = ‖y‖², we get
−2βa + β²ab ≥ 0,
or
βa(βb − 2) ≥ 0, for all real β.
If a > 0, the above inequality is false for all sufficiently small positive β. Hence, a must be
zero, that is, a = |⟨x, y⟩|² = 0 or ⟨x, y⟩ = 0 for all y ∈ K.
Lemma 1.2.2. If M and N are closed subspaces of a Hilbert space X such that M ⊥ N,
then the subspace M + N = {x + y ∈ X : x ∈ M and y ∈ N} is also closed.
(It is clear that (xm − xn) ⊥ (ym − yn) for all m, n.) Since {zn} is convergent, it is a Cauchy
sequence and so ‖zm − zn‖² → 0. Hence, from (1.13), we see that ‖xm − xn‖ → 0 and
‖ym − yn‖ → 0 as m, n → ∞. Hence, {xm} and {yn} are Cauchy sequences in M and N,
respectively. Being closed subspaces of a complete space, M and N are also complete. Thus,
{xm} and {yn} are convergent in M and N, respectively, say xm → x ∈ M and yn → y ∈ N,
and x + y ∈ M + N as x ∈ M and y ∈ N. Then
Definition 1.2.2. A vector space X is said to be the direct sum of two subspaces Y and Z
of X, denoted by X = Y ⊕ Z, if each x ∈ X has a unique representation x = y + z for y ∈ Y
and z ∈ Z.
Proof. Since every subspace is a convex set, by the previous two results, for every x ∈ X, there
is a z ∈ K such that x − z ∈ K⊥, that is, there is a y ∈ K⊥ such that y = x − z, which is
equivalent to x = z + y for z ∈ K and y ∈ K⊥.
To prove the uniqueness, assume that there is also ŷ ∈ K⊥ such that x = ŷ + ẑ for ẑ ∈ K.
Then x = y + z = ŷ + ẑ, and therefore, y − ŷ = ẑ − z. Since y − ŷ ∈ K⊥ whereas ẑ − z ∈ K,
we have y − ŷ ∈ K ∩ K⊥ = {0}. This implies that y = ŷ, and hence also z = ẑ.
[Figure: orthogonal decomposition of x, with z = PK(x) and y = PK⊥(x)]
Example 1.2.1. (a) Let X = L2 (−1, 1). Then X = K ⊕ K ⊥ , where K is the space of
even functions, that is,
K = {f ∈ L2 (−1, 1) : f (−t) = f (t) for all t ∈ (−1, 1)},
and K ⊥ is the space of odd functions, that is,
K ⊥ = {f ∈ L2 (−1, 1) : f (−t) = −f (t) for all t ∈ (−1, 1)}.
² Let X be a vector space. A linear operator P : X → X is called a projection operator if P ∘ P = P² = P.
Theorem 1.2.5. If a vector space X is expressed as the direct sum of its subspaces Y and Z, then
there is a uniquely determined projection P : X → X such that Y = R(P) and Z = N(P) = R(I − P),
where I is the identity mapping on X.
Similarly, it can be verified that the orthogonal projection I − PK corresponds to the case
R(I − PK ) = K ⊥ and N (I − PK ) = K.
Exercise 1.2.3. Let K be a closed subspace of a Hilbert space X and I be the identity
mapping on X. Then prove that there exists a unique mapping PK from X onto K such
that I − PK maps X onto K ⊥ .
(e) If K1 and K2 are closed subspaces of X such that K1 ⊆ K2 , then PK1 (PK2 (x)) =
PK1 (x).
(f) PK is a linear mapping, that is, for all α, β ∈ R and all x, y ∈ X, PK (αx + βy) =
αPK (x) + βPK (y).
(b) The null space N (PK ) and the range set R(PK ) are closed subspaces of X.
(c) N (PK ) = (R(PK ))⊥ and R(PK ) = N (PK )⊥ .
(d) PK is idempotent.
Exercise 1.2.6. Let K1 and K2 be closed subspaces of a Hilbert space X and PK1 and PK2
be orthogonal projections onto K1 and K2, respectively. If ⟨x, y⟩ = 0 for all x ∈ K1 and
y ∈ K2, then prove that
We discuss here the concepts of projection and projection operator on convex sets which are
of vital importance in such diverse fields as optimization, optimal control and variational
inequalities.
Definition 1.2.4. Let K be a nonempty closed convex subset of a Hilbert space X. For
x ∈ X, by projection of x on K, we mean the element z ∈ K, denoted by PK (x), such that
equivalently,
\[ \|x - z\| = \inf_{y \in K} \|x - y\|. \tag{1.16} \]
In order to prove the converse, let (1.17) be satisfied for some element z ∈ K. This implies
that g ′(0) is non-negative, and by (1.19), g ′′ (α) is non-negative. Hence, g(0) ≤ g(1) for all
y ∈ K such that (1.16) is satisfied.
Remark 1.2.2. The inequality (1.17) shows that x − z and y − z subtend a non-acute angle
between them. The projection PK (x) of x on K can be interpreted as the result of applying
to x the operator PK : X → K, which is called projection operator. Note that PK (x) = x
for all x ∈ K.
Theorem 1.2.8. The projection operator PK defined on a Hilbert space X into its
nonempty closed convex subset K has the following properties:
(a) PK is nonexpansive, that is, ‖PK(x) − PK(y)‖ ≤ ‖x − y‖ for all x, y ∈ X, which
implies that PK is continuous.
Since PK(x2) and PK(x1) ∈ K, choosing y = PK(x2) and y = PK(x1), respectively, in (1.21)
and (1.22), we obtain
or
⟨PK(x1) − PK(x2), PK(x1) − PK(x2)⟩ ≤ ⟨x1 − x2, PK(x1) − PK(x2)⟩,
equivalently,
‖PK(x1) − PK(x2)‖² ≤ ⟨x1 − x2, PK(x1) − PK(x2)⟩. (1.23)
Therefore, by the Cauchy-Schwarz-Bunyakovsky inequality, we get
and hence,
‖PK(x1) − PK(x2)‖ ≤ ‖x1 − x2‖. (1.25)
Figure 1.7: The nonexpansiveness of the projection operator
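The nonexpansiveness of PK, and the non-acute angle between x − PK(x) and y − PK(x) described in Remark 1.2.2, can be observed numerically. The sketch below assumes that K is the closed unit ball of R2, for which the projection has the closed form x / max(1, ‖x‖); the function name project_ball, the random sample and the tolerance 1e-12 are illustrative choices.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def project_ball(x, radius=1.0):
    """Projection of x onto the closed ball {y : ||y|| <= radius}."""
    n = norm(x)
    if n <= radius:
        return tuple(x)
    return tuple(radius * xi / n for xi in x)

random.seed(0)
points = [(random.uniform(-3, 3), random.uniform(-3, 3)) for _ in range(200)]
images = [project_ball(p) for p in points]   # these all lie in K

# ||P(x1) - P(x2)|| <= ||x1 - x2|| for all sampled pairs.
nonexpansive = all(
    norm(sub(images[i], images[j])) <= norm(sub(points[i], points[j])) + 1e-12
    for i in range(len(points)) for j in range(len(points))
)

# <x - P(x), y - P(x)> <= 0 for every sampled y in K.
non_acute = all(
    dot(sub(points[i], images[i]), sub(y, images[i])) <= 1e-12
    for i in range(len(points)) for y in images
)

print(nonexpansive, non_acute)
```

A finite random sample is only a sanity check of the two properties, not a substitute for the proof.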
Let X and Y be inner product spaces over the same field K (= R or C). A functional
a(·, ·) : X × Y → K will be called a form.
Definition 1.3.1. Let X and Y be inner product spaces over the same field K (= R or C).
A form a(·, ·) : X × Y → K is called a sesquilinear functional or sesquilinear form if the
following conditions are satisfied for all x, x1 , x2 ∈ X, y, y1, y2 ∈ Y and all α, β ∈ K:
Remark 1.3.1. (a) The sesquilinear functional is linear in the first variable but not so
in the second variable. A sesquilinear functional which is also linear in the second
variable is called a bilinear form or a bilinear functional. Thus, a bilinear form a(·, ·) is
a mapping defined from X × Y into K which satisfies conditions (i) - (iii) of the above
definition and a(x, βy) = βa(x, y).
(b) If X and Y are real inner product spaces, then the concepts of sesquilinear functional
and bilinear form coincide.
(c) An inner product is an example of a sesquilinear functional. The real inner product is
an example of a bilinear form.
Indeed,
|a(xn, yn) − a(x, y)| ≤ |a(xn − x, yn)| + |a(x, yn − y)|
≤ ‖a‖ (‖xn − x‖ ‖yn‖ + ‖x‖ ‖yn − y‖).
Definition 1.3.3. Let X be an inner product space. A form a(·, ·) : X × X → K is called:
where x and y are considered as column vectors and y⊤ denotes the transpose of y. By the
Cauchy-Schwarz inequality, we have
|a(x, y)| = |y⊤Ax| = |⟨y, Ax⟩|
≤ ‖y‖ ‖Ax‖ ≤ ‖A‖ ‖x‖ ‖y‖.
If A is a symmetric and positive definite matrix, then the bilinear form is symmetric and
coercive since we know that
\[ \sum_{i,j=1}^{n} a_{ij}\, y_j y_i \ge \alpha \|y\|^2, \]
and ‖a‖ = ‖T‖.
that is,
‖fx‖ ≤ ‖a‖ ‖x‖.
Thus, fx is a bounded linear functional on Y. By the Riesz representation theorem³, there exists
a unique vector y* ∈ Y such that
The vector y* depends on the choice of the vector x. Therefore, we can write y* = T(x), where
T : X → Y. We observe that
The operator T is linear in view of the following relations: For all y ∈ Y , x, x1 , x2 ∈ X and
α ∈ K, we have
Moreover, T is continuous as
To prove that ‖T‖ = ‖a‖, it is enough to show that ‖T‖ ≥ ‖a‖, which follows from the
following relation:
\[ \|a\| = \sup_{x \neq 0,\, y \neq 0} \frac{|a(x, y)|}{\|x\|\,\|y\|} = \sup_{x \neq 0,\, y \neq 0} \frac{|\langle T(x), y\rangle|}{\|x\|\,\|y\|} \le \sup_{x \neq 0,\, y \neq 0} \frac{\|T(x)\|\,\|y\|}{\|x\|\,\|y\|} = \|T\|. \]
To prove the uniqueness of T , let us assume that there is another linear operator S : X → Y
such that
a(x, y) = ⟨S(x), y⟩, for all (x, y) ∈ X × Y.
Then, for every x ∈ X and y ∈ Y, we have
equivalently, ⟨(T − S)(x), y⟩ = 0. This implies that (T − S)(x) = 0 for all x ∈ X, that
is, T ≡ S. This proves that there exists a unique bounded linear operator T such that
a(x, y) = ⟨T(x), y⟩.
Remark 1.3.3 (Converse of above theorem). Let X and Y be Hilbert spaces and T : X → Y
be a bounded linear operator. Then the form a(·, ·) : X × Y → K defined by
Proof. Since T is a bounded linear operator from X into Y and the inner product is a sesquilinear
mapping, a(x, y) = ⟨T(x), y⟩ is sesquilinear.
Since |a(x, y)| = |⟨T(x), y⟩| ≤ ‖T‖ ‖x‖ ‖y‖ by the Cauchy-Schwarz-Bunyakovsky inequality,
we have $\sup_{\|x\| = \|y\| = 1} |a(x, y)| \le \|T\|$, and hence a(·, ·) is bounded.
By Theorem 1.3.1, a(x, y) is a bounded bilinear form on X and ‖a‖ = ‖T‖. Since we have
b(x, y) = a(y, x), b is also a bounded bilinear form on X and
Therefore, we have b(x, y) = a(y, x) = ⟨T(y), x⟩ = ⟨x, T(y)⟩ for all (x, y) ∈ X × X.
Proof. By Theorem 1.3.1, for every bounded linear operator T on X, there is a bounded bilinear
form a such that a(x, y) = ⟨T(x), y⟩ and ‖a‖ = ‖T‖. Then,
Definition 1.3.4. Let X be a Hilbert space and a(·, ·) : X × X → K be a form. Then the
operator F : X → K is called a quadratic form associated with a(·, ·) if F (x) = a(x, x) for
all x ∈ X.
Remark 1.3.4. (a) We immediately observe that F(αx) = |α|²F(x) and |F(x)| ≤ ‖a‖ ‖x‖².
(b) The norm of F is defined as
\[ \|F\| = \sup_{x \neq 0} \frac{|F(x)|}{\|x\|^2} = \sup_{\|x\| = 1} |F(x)|. \]
Remark 1.3.5. Let a(·, ·) be any fixed sesquilinear form and F the associated quadratic
form on a Hilbert space X. Then
(a) ½[a(x, y) + a(y, x)] = F((x + y)/2) − F((x − y)/2);
(b) a(x, y) = ¼[F(x + y) − F(x − y) + iF(x + iy) − iF(x − iy)].
and
F(x − y) = a(x − y, x − y) = a(x, x) − a(y, x) − a(x, y) + a(y, y).
By subtracting the second of the above equations from the first, we get
or
F(x + iy) − F(x − iy) = −2i a(x, y) + 2i a(y, x). (1.31)
Multiplying (1.31) by i and adding it to (1.30), we get the result.
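The two identities of Remark 1.3.5 can be checked on a concrete sesquilinear form. The sketch below assumes the form a(x, y) = ⟨Ax, y⟩ on C2, linear in x and conjugate-linear in y; the matrix A, the vectors x and y, and the helper names are sample choices for illustration.

```python
# Assumed example: a(x, y) = <Ax, y> on C^2, linear in x and conjugate-linear in y.
A = [[2 + 1j, 1 - 1j],
     [0.5j, 3 + 0j]]

def a(x, y):
    Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
    return sum(Ax[i] * y[i].conjugate() for i in range(2))

def F(x):
    """Quadratic form associated with a."""
    return a(x, x)

def axpy(x, c, y):
    """Return the vector x + c*y."""
    return [xi + c * yi for xi, yi in zip(x, y)]

x = [1 + 2j, -1j]
y = [0.5 - 1j, 2 + 1j]

# Identity (b): the polarization identity.
polar = 0.25 * (F(axpy(x, 1, y)) - F(axpy(x, -1, y))
                + 1j * F(axpy(x, 1j, y)) - 1j * F(axpy(x, -1j, y)))

# Identity (a): symmetric part of a in terms of F.
half_plus = [0.5 * c for c in axpy(x, 1, y)]
half_minus = [0.5 * c for c in axpy(x, -1, y)]
sym = F(half_plus) - F(half_minus)

print(abs(a(x, y) - polar))                    # close to 0
print(abs(0.5 * (a(x, y) + a(y, x)) - sym))    # close to 0
```

Since A is not Hermitian here, a(x, y) and the conjugate of a(y, x) differ, which shows that the polarization identity does not require symmetry of the form.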
\[ \overline{F(x)} = \overline{a(x, x)} = a(x, x) = F(x). \]
Conversely, let F (x) be real, then by Remark 1.3.5 (d) and in view of the relation
On the other hand, suppose F is bounded. From Remark 1.3.5 (d) and the parallelogram
law, we get
\[ |a(x, y)| \le \tfrac{1}{4} \|F\| \big( \|x + y\|^2 + \|x - y\|^2 + \|x + iy\|^2 + \|x - iy\|^2 \big) \]
\[ = \tfrac{1}{4} \|F\| \cdot 2 \big( \|x\|^2 + \|y\|^2 + \|x\|^2 + \|y\|^2 \big) = \|F\| \big( \|x\|^2 + \|y\|^2 \big), \]
or
\[ \sup_{\|x\| = \|y\| = 1} |a(x, y)| \le 2\|F\|. \]
(a) T is self-adjoint.
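Proof. (a) ⇒ (b): $F(x) = \langle T(x), x\rangle = \langle x, T(x)\rangle = \overline{\langle T(x), x\rangle} = \overline{F(x)}$. In view of Lemma
1.3.1, we obtain the result.
(b) ⇒ (a): $\langle T(x), y\rangle = a(x, y) = \overline{a(y, x)} = \overline{\langle T(y), x\rangle} = \langle x, T(y)\rangle$. This shows that T* ≡ T,
that is, T is self-adjoint.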
The following theorem, known as the Lax-Milgram lemma and proved by P. D. Lax and A. N. Milgram
in 1954, has important applications in different fields.
Theorem 1.3.4 (Lax-Milgram Lemma). Let X be a Hilbert space, a(·, ·) : X × X → R
be a coercive bounded bilinear form, and f : X → R be a bounded linear functional. Then
there exists a unique element x ∈ X such that
Proof. Since a(·, ·) is bounded, there exists a constant M > 0 such that
|a(x, y)| ≤ M‖x‖ ‖y‖. (1.33)
By Theorem 1.3.1, there exists a bounded linear operator T : X → X such that
a(x, y) = ⟨T(x), y⟩, for all (x, y) ∈ X × X.
By the Riesz representation theorem⁴, the bounded linear functional f can be represented by a
vector in X, again denoted by f, such that f(y) = ⟨f, y⟩ for all y ∈ X. Hence, the equation
a(x, y) = f(y) can be rewritten as, for all λ > 0,
⟨λT(x), y⟩ = λ⟨f, y⟩, (1.34)
or
⟨λT(x) − λf, y⟩ = 0, for all y ∈ X.
This implies that
λT(x) = λf. (1.35)
We will show that (1.35) has a unique solution by showing that, for appropriate values of the
parameter ρ > 0, the affine mapping y ↦ y − ρ(λT(y) − λf) from X into X is a contraction
mapping. For this, we observe that
‖y − ρλT(y)‖² = ⟨y − ρλT(y), y − ρλT(y)⟩
= ‖y‖² − 2ρ⟨λT(y), y⟩ + ρ²‖λT(y)‖² (by applying inner product axioms)
≤ ‖y‖² − 2ρα‖y‖² + ρ²M²‖y‖²,
⁴ Riesz Representation Theorem. If f is a bounded linear functional on a Hilbert space X, then there
exists a unique vector y ∈ X such that f(x) = ⟨x, y⟩ for all x ∈ X and ‖f‖ = ‖y‖.
as
a(y, y) = ⟨λT(y), y⟩ ≥ α‖y‖² (by the coercivity), (1.36)
and
‖λT(y)‖ ≤ M‖y‖ (by boundedness of T).
Therefore,
‖y − ρλT(y)‖² ≤ (1 − 2ρα + ρ²M²)‖y‖², (1.37)
or
‖y − ρλT(y)‖ ≤ (1 − 2ρα + ρ²M²)^{1/2} ‖y‖. (1.38)
Let S(y) = y − ρ(λT(y) − λf). Then
This implies that S is a contraction mapping if 0 < 1 − 2ρα + ρ²M² < 1, which is equivalent to
the condition that ρ ∈ (0, 2α/M²). Hence, by the Banach contraction fixed point theorem,
S has a unique fixed point, which is the unique solution.
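The contraction argument of the proof is constructive: iterating S converges to the solution. The sketch below is an illustration in R2 under stated assumptions, namely λ = 1, a(x, y) = ⟨Tx, y⟩ with a sample non-symmetric matrix T for which α = 2 and M = √5, and a sample right-hand side f. None of these data come from the notes.

```python
import math

# Assumed example data: a(x, y) = <T x, y> on R^2 with a non-symmetric T.
T = [[2.0, 1.0],
     [-1.0, 2.0]]
f = [3.0, 1.0]

alpha = 2.0            # coercivity constant: <T y, y> = 2 ||y||^2
M = math.sqrt(5.0)     # boundedness constant: ||T y|| = sqrt(5) ||y||, since T^T T = 5 I
rho = alpha / M ** 2   # a step size inside the interval (0, 2 * alpha / M^2)

def apply_T(y):
    return [sum(T[i][j] * y[j] for j in range(2)) for i in range(2)]

def S(y):
    """One step of the map S(y) = y - rho * (T y - f), taking lambda = 1."""
    Ty = apply_T(y)
    return [y[i] - rho * (Ty[i] - f[i]) for i in range(2)]

y = [0.0, 0.0]
for _ in range(60):
    y = S(y)

residual = [t - fi for t, fi in zip(apply_T(y), f)]
print(y)          # approximate fixed point of S, i.e. solution of T y = f
print(residual)   # close to the zero vector
```

With this choice of ρ the contraction factor (1 − 2ρα + ρ²M²)^{1/2} equals √0.2, so the error shrinks by more than half at every step.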
This problem is known as the abstract variational problem. In view of the Lax-Milgram lemma,
it has a unique solution.
2
Let X and Y be linear spaces and T : X → Y be a linear operator. Recall that the range
R(T) and null space N(T) of T are defined, respectively, as
R(T) = {T(x) : x ∈ X} and N(T) = {x ∈ X : T(x) = 0}.
The dimension of R(T ) is called the rank of T and the dimension of N (T ) is called the
nullity of T .
It can be easily seen that a linear operator T : X → Y is one-one if and only if N (T ) = {0}.
Recall that
Clearly, c00 ⊆ c0 ⊆ c ⊆ ℓ∞ .
Recall that the set {T(x) : ‖x‖ ≤ 1} is bounded if T : X → Y is a bounded linear
operator from a normed space X to another normed space Y, so its closure is closed and
bounded. Moreover, if T : X → Y is a bounded linear operator of finite rank, then the closure
of {T(x) : ‖x‖ ≤ 1} is compact, as every closed and bounded subset of a finite dimensional
normed space is compact. But this is not true if the rank of T (the dimension of R(T)) is
infinite. For example, consider the identity operator I : X → X on an infinite dimensional
normed space X; then the above set reduces to the closed unit ball {x ∈ X : ‖x‖ ≤ 1}, which
is not compact.
Definition 2.1.1 (Compact linear operator). Let X and Y be normed spaces. A linear
operator T : X → Y is said to be compact or completely continuous if the image T(M) of every
bounded subset M of X is relatively compact, that is, the closure of T(M) is compact for
every bounded subset M of X.
Proof. (a) Since the unit sphere S = {x ∈ X : ‖x‖ = 1} is bounded and T is a compact linear
operator, the closure of T(S) is compact, and hence is bounded¹. Therefore,
sup {‖T(x)‖ : ‖x‖ = 1} < ∞,
and so T is bounded.
(b) Note that the closed unit ball B = {x ∈ X : ‖x‖ ≤ 1} is bounded. If dim X = ∞, then
B cannot be compact². Therefore, I(B) = B, which equals its own closure, is not relatively
compact.
1
Every compact subset of a normed space is closed and bounded
2
The normed space X is finite dimensional if and only if the closed unit ball is compact
Exercise 2.1.1. Let X and Y be normed spaces and T : X → Y be a linear operator. Then
prove that the following statements are equivalent.
Proof. Clearly, (a) implies (b) and (c). Assume that (c) holds, that is, {T(x) : ‖x‖ ≤ 1}
is compact in Y. Let M be a bounded subset of X. Then, there exists r > 0 such that
M ⊆ {x ∈ X : ‖x‖ ≤ r}. Since
and the fact that a closed subset of a compact set is compact, it follows that (c) implies (b)
and (a), and (b) implies (a).
Proof. Suppose that T is compact and {xn} is bounded. Then we can assume that ‖xn‖ ≤ c for
every n ∈ N and some constant c > 0. Let M = {x ∈ X : ‖x‖ ≤ c}. Then {T(xn)} is a
sequence in the closure of T(M) in Y, which is compact, and hence it contains a convergent
subsequence.
Conversely, assume that every bounded sequence {xn} contains a subsequence {xnk} such
that {T(xnk)} converges in Y. Let B be a bounded subset of X. To show that T(B) is
relatively compact, it is enough to prove that every sequence in T(B) has a convergent
subsequence. Let {yn} be any sequence in T(B). Then yn = T(xn) for some xn ∈ B, and {xn} is
bounded since B is bounded. By assumption, {T(xn)} contains a convergent subsequence.
Hence T(B) is relatively compact³ because the sequence {yn} in T(B) was arbitrary. This
shows that T is compact.
Exercise 2.1.2. Let X and Y be normed spaces. Prove that K(X, Y ) is a subspace of
B(X, Y ) the space of all bounded linear operators from X to Y .
Proof. Let B be any bounded subset of X. Since S is bounded, S(B) is a bounded set,
and T(S(B)) = TS(B) is relatively compact because T is compact. Hence TS is a compact
linear operator.
To prove that ST is also compact, let {xn} be any bounded sequence in X. Then {T(xn)} has
a convergent subsequence {T(xnk)} by Theorem 2.1.1, and {ST(xnk)} is convergent because
S is continuous. Hence ST is compact, again by Theorem 2.1.1.
Example 2.1.1. Let 1 ≤ p ≤ ∞ and X = ℓp. Let T : X → X be the right shift operator on
X defined by
\[
(T(x))(i) := \begin{cases} 0, & \text{if } i = 1, \\ x(i-1), & \text{if } i > 1. \end{cases}
\]
Since
\[
T(e_n) = e_{n+1}, \qquad \|e_n - e_m\| = \begin{cases} 2^{1/p}, & \text{if } 1 \le p < \infty, \\ 1, & \text{if } p = \infty, \end{cases}
\]
for all n, m ∈ N with n ≠ m, it follows that, corresponding to the bounded sequence {en},
the sequence {T(en)} does not have a convergent subsequence. Hence, by Theorem 2.1.1, the
operator T is not compact.
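The distances used in this example can be checked on finite truncations of the sequences. The sketch below is an illustration only: vectors are cut off at a hypothetical length of 12 coordinates, and the helper names `lp_norm`, `basis_vector`, `right_shift` and `shifted_distance` are not from the notes.

```python
import math

def lp_norm(x, p):
    """The l^p norm of a finite vector; p may be math.inf."""
    if p == math.inf:
        return max(abs(v) for v in x)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def basis_vector(n, length):
    """Truncation of e_n (1-based index) to the first `length` coordinates."""
    return [1.0 if i == n else 0.0 for i in range(1, length + 1)]

def right_shift(x):
    """Truncated right shift: (x1, x2, ...) -> (0, x1, x2, ...)."""
    return [0.0] + x[:-1]

def shifted_distance(n, m, p, length=12):
    """Distance between T(e_n) and T(e_m); requires n, m < length."""
    Ten = right_shift(basis_vector(n, length))
    Tem = right_shift(basis_vector(m, length))
    return lp_norm([a - b for a, b in zip(Ten, Tem)], p)
```

Since any two distinct terms of {T(en)} stay at the fixed distance 2^{1/p} (or 1 for p = ∞), no subsequence can be Cauchy, which is the point of the example.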
Exercise 2.1.4. Prove that the left shift operator on ℓp space is not compact for any p with
1 ≤ p ≤ ∞.
Definition 2.1.2. An operator T ∈ B(X, Y) with dim T(X) < ∞ is called an operator of
finite rank.
Theorem 2.1.2 (Finite dimensional domain or range). Let X and Y be normed spaces
and T : X → Y be a linear operator.
(a) If T is bounded and dim T(X) < ∞, then the operator T is compact. That is, every
bounded linear operator of finite rank is compact.
(b) If dim X < ∞, then the operator T is compact.
Proof. (a) Let {xn} be any bounded sequence in X. Then the inequality ‖T(xn)‖ ≤ ‖T‖ ‖xn‖
shows that the sequence {T(xn)} is bounded. Hence {T(xn)} is relatively compact⁴ since
dim T(X) < ∞. It follows that {T(xn)} has a convergent subsequence. Since {xn} was an
arbitrary bounded sequence in X, the operator T is compact by Theorem 2.1.1.
(b) It follows from (a) by noting that dim X < ∞ implies the boundedness of T⁵ and
dim T(X) ≤ dim X < ∞.
Exercise 2.1.5. Prove that the identity operator on a normed space is compact if and only
if the space is of finite dimension.
4
In a finite dimensional space, a set is compact if and only if it is closed and bounded
5
Every linear operator is bounded on a finite dimensional normed space X
Let {xn} be a bounded sequence in X, and ε > 0 be given. Since {Tn} is a sequence in
K(X, Y) with ‖Tn − T‖ → 0 as n → ∞, there exists N ∈ N such that
‖T − TN‖ < ε.
Since TN ∈ K(X, Y), there exists a subsequence {x̃n} of {xn} such that {TN(x̃n)} is
convergent. In particular, there exists n0 ∈ N such that
‖TN(x̃n) − TN(x̃m)‖ < ε, for all n, m ≥ n0.
Hence, for all n, m ≥ n0, we have
‖T(x̃n) − T(x̃m)‖ ≤ ‖T(x̃n) − TN(x̃n)‖ + ‖TN(x̃n) − TN(x̃m)‖ + ‖TN(x̃m) − T(x̃m)‖
≤ ‖T − TN‖ ‖x̃n‖ + ‖TN(x̃n) − TN(x̃m)‖ + ‖TN − T‖ ‖x̃m‖
≤ (2c + 1)ε,
where c > 0 is such that ‖xn‖ ≤ c for all n ∈ N. This shows that {T(x̃n)} is a Cauchy
sequence and hence converges since Y is complete. Remembering that {x̃n} is a subsequence
of the arbitrary bounded sequence {xn}, we see that Theorem 2.1.1 implies compactness of
the operator T.
Remark 2.1.2. The above theorem does not hold if we replace uniform operator convergence
by strong operator convergence ‖Tn(x) − T(x)‖ → 0. For example, consider the sequence
Tn : ℓ2 → ℓ2 defined by Tn(x) = (ξ1, . . . , ξn, 0, 0, . . .), where x = {ξj} ∈ ℓ2. Since Tn is
linear and bounded with finite dimensional range, Tn is compact by Theorem 2.1.2 (a).
Clearly, Tn(x) → x = I(x), but I is not compact since dim ℓ2 = ∞ (see Lemma 2.1.1 (b)).
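The gap between strong and uniform convergence in this remark can be seen on finite truncations. The following sketch is illustrative only: sequences are cut off at a hypothetical length of 2000, the test element x = (1, 1/2, 1/3, . . .) is an arbitrary choice, and the helper names are not from the notes.

```python
import math

def truncate(x, n):
    """T_n: keep the first n coordinates and replace the rest by zero."""
    return [v if j < n else 0.0 for j, v in enumerate(x)]

def l2_norm(x):
    return math.sqrt(sum(v * v for v in x))

def tail_error(x, n):
    """The norm of T_n(x) - x."""
    return l2_norm([a - b for a, b in zip(x, truncate(x, n))])

length = 2000
x = [1.0 / (j + 1) for j in range(length)]   # truncated element of l^2

# Strong convergence: for this fixed x the error shrinks as n grows.
strong_errors = [tail_error(x, n) for n in (10, 100, 1000)]

def unit_vector_error(n):
    """Error on the unit vector e_{n+1} (index n when counting from 0)."""
    e = [1.0 if j == n else 0.0 for j in range(length)]
    return tail_error(e, n)
```

For every n the unit vector e_{n+1} is mapped to zero by Tn, so ‖Tn − I‖ ≥ 1 and the convergence is not uniform.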
Remark 2.1.3. As a particular case of the above theorem, we can say that if X is a Banach
space, and if {Tn } is a sequence of finite rank operators in B(X) such that kTn − T k → 0 as
n → ∞ for some T ∈ B(X), then T is a compact operator.
Then Tn is linear and bounded, and is compact by Theorem 2.1.2 (a). Furthermore,
\[
\|(T - T_n)(x)\|^2 = \sum_{j=n+1}^{\infty} |\eta_j|^2 = \sum_{j=n+1}^{\infty} \frac{1}{j^2} |\xi_j|^2 \le \frac{1}{(n+1)^2} \sum_{j=n+1}^{\infty} |\xi_j|^2 \le \frac{\|x\|^2}{(n+1)^2}.
\]
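The estimate above can be checked on finite vectors. This is a sketch under an assumption read off from the displayed computation, namely that T(x) = (ξj / j) and that Tn keeps the first n coordinates of T(x); the names `apply_T`, `apply_Tn` and `bound_holds` are hypothetical.

```python
import math
import random

def apply_T(x):
    """Assumed operator T(x) = (xi_j / j), with j counted from 1."""
    return [v / (j + 1) for j, v in enumerate(x)]

def apply_Tn(x, n):
    """Assumed truncation T_n(x): first n coordinates of T(x), then zeros."""
    return [v / (j + 1) if j < n else 0.0 for j, v in enumerate(x)]

def l2_norm(x):
    return math.sqrt(sum(v * v for v in x))

def bound_holds(x, n, tol=1e-12):
    """Check ||(T - T_n)(x)|| <= ||x|| / (n + 1) up to rounding."""
    diff = [a - b for a, b in zip(apply_T(x), apply_Tn(x, n))]
    return l2_norm(diff) <= l2_norm(x) / (n + 1) + tol

random.seed(0)
sample = [random.uniform(-1.0, 1.0) for _ in range(50)]
all_bounds_hold = all(bound_holds(sample, n) for n in range(1, 50))
```

Taking the supremum over ‖x‖ ≤ 1 gives ‖T − Tn‖ ≤ 1/(n + 1), so T is a norm limit of finite rank operators.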
Since λn → 0 as n → ∞, we obtain
\[
\|T - T_n\| \le \sup_{i > n} |\lambda_i| \to 0 \quad \text{as } n \to \infty.
\]
Since T(en) = λn en for all n ∈ N, T is of infinite rank whenever λn ≠ 0 for infinitely many
n.
Exercise 2.1.6. Prove that the operator T defined in Example 2.1.3 is not compact if
λn → λ ≠ 0 as n → ∞.
Exercise 2.1.7. Show that the zero operator on any normed space is compact.
Proof. We write yn = T (xn ) and y = T (x). We first show that yn ⇀ y and then yn → y.
Now we prove yn → y. Assume that it does not hold. Then {yn} has a subsequence {ynk}
such that
‖ynk − y‖ ≥ δ, (2.1)
for some δ > 0. Since {xn} is weakly convergent, {xn} is bounded, and so is {xnk}.
Compactness of T implies (by Theorem 2.1.1) that {T(xnk)} has a convergent subsequence, say
{ỹj}. Let ỹj → ỹ. Then of course ỹj ⇀ ỹ. Hence ỹ = y because yn ⇀ y. Consequently,
‖ỹj − y‖ → 0, but ‖ỹj − y‖ ≥ δ > 0 by (2.1).
This is a contradiction, so yn → y.
Remark 2.1.4. In general, the converse of the above theorem does not hold. For example,
consider the space X = ℓ1. By Schur’s lemma⁶, every weakly convergent sequence in ℓ1 is
convergent. Thus, every bounded operator on ℓ1 maps every weakly convergent sequence to
a convergent sequence. However, not every bounded operator on ℓ1 is compact; for instance,
the identity operator on ℓ1 is not compact. If the space X is reflexive, then the converse of
Theorem 2.1.4 does hold.
6
Schur’s Lemma. Every weakly convergent sequence in ℓ1 is convergent
Theorem 2.1.5. Let X and Y be normed spaces such that X is reflexive, and T : X → Y
be a linear operator such that for any sequence {xn } in X,
Proof. It is enough to show that for every bounded sequence {xn} in X, {T(xn)} has a
convergent subsequence.
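(a) 1 ≤ p ≤ ∞, 1 ≤ r ≤ ∞ and $\sum_{j=1}^{\infty} |a_{ij}| \to 0$ as i → ∞.
(b) 1 ≤ p ≤ ∞, 1 ≤ r < ∞ and $\sum_{i=1}^{\infty} \bigl( \sum_{j=1}^{\infty} |a_{ij}| \bigr)^{r} < \infty$.
(c) 1 < p ≤ ∞, 1 ≤ r ≤ ∞ and $\sum_{j=1}^{\infty} |a_{ij}|^{q} \to 0$ as i → ∞.
(d) 1 < p ≤ ∞, 1 ≤ r < ∞ and $\sum_{i=1}^{\infty} \bigl( \sum_{j=1}^{\infty} |a_{ij}|^{q} \bigr)^{r/q} < \infty$.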
7
Eberlein-Shmulyan Theorem. Every bounded sequence in reflexive space has a weakly convergent
subsequence
Let X and Y be linear spaces and T : X → Y be a linear operator. Recall that the range
R(T) and null space N(T) of T are defined, respectively, as
R(T) = {T(x) : x ∈ X} and N(T) = {x ∈ X : T(x) = 0}.
The dimension of R(T ) is called the rank of T and the dimension of N (T ) is called the
nullity of T .
It can be easily seen that a linear operator T : X → Y is one-one if and only if N (T ) = {0}.
T (x) = λx.
Remark 2.2.2. A linear operator may not have any eigenvalue at all. For example, the
linear operator T : R2 → R2 defined by
has no eigenvalue.
(a) Let {λn } be a bounded sequence of scalars. Let T : X → X be the diagonal operator
defined by
T (x)(j) = λj x(j), for all x ∈ X and j ∈ N.
Then it is easy to see that, for λ ∈ K, the equation T (x) = λx is satisfied for a nonzero
x ∈ X if and only if λ = λj for some j ∈ N. Hence,
σeig (T ) = {λ1 , λ2 , . . .}.
In fact, for n ∈ N, en ∈ X defined by en(j) = δnj is an eigenvector of T corresponding to the
eigenvalue λn.
Next consider the cases of X = c0 or X = ℓp for 1 ≤ p < ∞. In these cases, we see that
(1, λ, λ², λ³, . . .) ∈ X if and only if |λ| < 1, so that σeig(T) = {λ : |λ| < 1}.
For the case of X = c, we see that (1, λ, λ², λ³, . . .) ∈ X if and only if either |λ| < 1 or λ = 1.
Thus, in this case
σeig(T) = {λ : |λ| < 1} ∪ {1}.
Proof. Since
\[
\sigma_{\mathrm{eig}}(T) \setminus \{0\} = \bigcup_{n=1}^{\infty} \bigl\{ \lambda \in \sigma_{\mathrm{eig}}(T) : |\lambda| \ge 1/n \bigr\},
\]
it is enough to show that the set Er := {λ ∈ σeig(T) : |λ| ≥ r} is finite for each r > 0.
Assume that there is an r > 0 such that Er is an infinite set. Let {λn} be a sequence of
distinct elements in Er, that is, {λn} is a sequence of distinct eigenvalues of T such that
|λn| ≥ r. For n ∈ N, let xn be an eigenvector of T corresponding to the eigenvalue λn, and let
Xn := span{x1, x2, . . . , xn}, n ∈ N. Since eigenvectors corresponding to distinct eigenvalues
are linearly independent, each Xn is a proper closed subspace of Xn+1. By the
Riesz Lemma⁸, there exists a sequence {un} in X such that un ∈ Xn, ‖un‖ = 1 for all n ∈ N
and
dist(un, Xm) ≥ 1/2, for all m < n.
Therefore, for every m, n ∈ N with m < n, we have
Therefore, we have
\[
\|T(u_n) - T(u_m)\| \ge |\lambda_n| \, \mathrm{dist}(u_n, X_{n-1}) \ge \frac{|\lambda_n|}{2} \ge \frac{r}{2}.
\]
Thus, {T (un )} has no convergent subsequence, contradicting the fact that T is a compact
operator.
T (x) − λx = y
can have at most one solution which depends continuously on y. In other words, one would
like to know that the inverse operator
8
Riesz Lemma. Let X0 be a proper closed subspace of a normed space X. Then for every r ∈ (0, 1),
there exists xr ∈ X such that ‖xr‖ = 1 and dist(xr, X0) ≥ r.
is continuous, which is equivalent to saying that the operator T − λI is bounded below, that is,
there exists c > 0 such that
‖T(x) − λx‖ ≥ c‖x‖, for all x ∈ X.
Proof. If λ ∉ σapp(T), that is, if there exists c > 0 such that ‖T(x) − λx‖ ≥ c‖x‖ for all
x ∈ X, then there would not exist any sequence {xn} in X such that ‖xn‖ = 1 for all n ∈ N
and ‖T(xn) − λxn‖ → 0 as n → ∞.
Conversely, assume that λ ∈ σapp(T), that is, there does not exist any c > 0 such that
‖T(x) − λx‖ ≥ c‖x‖ for all x ∈ X. Then for every n ∈ N, there exists un ∈ X such that
\[
\|T(u_n) - \lambda u_n\| < \frac{1}{n} \|u_n\|.
\]
Clearly, un ≠ 0 for all n ∈ N. Taking xn = un/‖un‖ for all n ∈ N, we have
\[
\|x_n\| = 1 \ \text{for all } n \in \mathbb{N} \quad \text{and} \quad \|T(x_n) - \lambda x_n\| < \frac{1}{n} \to 0 \ \text{as } n \to \infty.
\]
This completes the proof.
9
Let X and Y be normed spaces and T : X → Y be a linear operator. Then there exists γ > 0 such that
‖T(x)‖ ≥ γ‖x‖ for all x ∈ X if and only if T is injective and T⁻¹ : R(T) → X is continuous, and in that
case, ‖T⁻¹(y)‖ ≤ (1/γ)‖y‖ for all y ∈ R(T).
σeig (T ) ⊆ σapp (T ).
σeig (T ) = σapp (T ).
The following example illustrates that the strict inclusion in σeig(T) ⊆ σapp(T) can occur
if the space X is infinite dimensional.
Example 2.2.2. Let X be any of the sequence spaces c00, c0, c, ℓp with any norm satisfying
‖en‖ = 1 for all n ∈ N. Let T : X → X be defined by
Thus, we can conclude that λ ∈ σapp(T). Note that if λ ≠ λn for every n ∈ N, then
λ ∉ σeig(T).
Proof. (a) We have already observed that σeig(T) ⊆ σapp(T). Now, suppose that 0 ≠ λ ∈
σapp(T). We show that λ ∈ σeig(T).
Let {xn} be a sequence in X such that ‖xn‖ = 1 for every n ∈ N and ‖T(xn) − λxn‖ → 0 as
n → ∞. Since T is a compact operator, there exists a subsequence {x̃n} of {xn} and y ∈ X
such that T(x̃n) → y. Hence,
(b) Suppose that T is a finite rank operator. In view of (a), it is enough to show that 0 ∈
σapp(T) implies 0 ∈ σeig(T). Suppose that 0 ∉ σeig(T). Then T is injective, so that, by the
hypothesis that T is of finite rank, X is finite dimensional. Therefore, σapp(T) = σeig(T),
and consequently, 0 ∉ σapp(T).
(c) Let X be infinite dimensional. Suppose that 0 ∉ σapp(T), that is, T is bounded below.
We show that every bounded sequence in X has a Cauchy subsequence so that X would be
of finite dimension, contradicting the assumption.
From part (c) of the above theorem, we can observe that an operator on an infinite
dimensional space which is bounded below is not compact. The following example illustrates
this point of view.
T : (α1, α2, . . .) ↦ (λ1α1, λ2α2, . . .)
associated with a sequence {λn} of nonzero scalars which converges to a nonzero scalar. We
know that T is not a compact operator but is bounded below. Hence 0 ∉ σapp(T). Thus, the
fact that T is not compact also follows from Theorem 2.2.4 (c).
We know that the range of an infinite rank compact operator on a Banach space is not closed.
Does Theorem 2.2.4 (c) hold for every bounded operator with nonclosed range as well? The
answer is in the affirmative if X is a Banach space, as the following theorem shows.
Theorem 2.2.5. Let X be a Banach space and T : X → X be a bounded linear operator.
If the range R(T ) of T is not closed in X, then 0 ∈ σapp (T ).
Proof. The proof follows from result “Let T : X → Y be a bounded linear operator from a
Banach space X to a normed space Y . If T is bounded below, then the range R(T ) of T is
a closed subspace of Y .”
Proof. Let {λn} be a sequence in σapp(T) such that λn → λ for some λ ∈ K. Suppose that
λ ∉ σapp(T). Let c > 0 be such that
‖T(x) − λx‖ ≥ c‖x‖, for all x ∈ X.
Observe that, for every x ∈ X, n ∈ N,
‖T(x) − λn x‖ = ‖(T(x) − λx) − (λn − λ)x‖
≥ ‖T(x) − λx‖ − |λn − λ| ‖x‖
≥ (c − |λn − λ|)‖x‖.
Thus, for all large enough n, T − λn I is bounded below. More precisely, let N ∈ N be such
that |λn − λ| ≤ c/2 for all n ≥ N. Then we have
‖T(x) − λn x‖ ≥ (c/2)‖x‖, for all x ∈ X and all n ≥ N,
which shows that λn ∉ σapp(T) for all n ≥ N. Thus, we arrive at a contradiction.
The above result, in particular, shows that if {λn } is a sequence of eigenvalues of T ∈ B(X)
(T : X → X is bounded linear operator) such that λn → λ, then λ is an approximate
eigenvalue. One may ask whether every approximate eigenvalue arises in this manner. The
answer is, in general, negative, as the following example shows.
Example 2.2.4. Let X = ℓ1 and T be the right shift operator on ℓ1. Then we know that
σeig(T) = ∅. We show that σapp(T) ≠ ∅.
Then we see that ‖xn‖ = 1 for all n ∈ N, and ‖T(xn) − xn‖₁ = 2/n → 0 as n → ∞, so that
1 ∈ σapp(T).
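The value 2/n can be reproduced with a concrete choice of xn. The definition of xn is not visible in the text above, so the sketch below assumes xn = (1/n, . . . , 1/n, 0, 0, . . .) with n nonzero entries, which is one sequence satisfying both stated properties; the helper names are hypothetical.

```python
def right_shift(x):
    """Right shift on finitely supported sequences: prepend a zero."""
    return [0.0] + list(x)

def l1_norm(x):
    return sum(abs(v) for v in x)

def residual(n):
    """||T(x_n) - x_n||_1 for the assumed x_n = (1/n, ..., 1/n, 0, ...)."""
    x = [1.0 / n] * n              # n equal entries, so ||x_n||_1 = 1
    Tx = right_shift(x)            # (0, 1/n, ..., 1/n), length n + 1
    x_padded = x + [0.0]           # pad x_n to the same length
    return l1_norm([a - b for a, b in zip(Tx, x_padded)])
```

Only the first and last coordinates of T(xn) − xn are nonzero, each of size 1/n, which gives the residual 2/n and hence 1 ∈ σapp(T), although T has no eigenvalue.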
T (x) − λx = y,
and the map y ↦ x is continuous. Thus, if x and y are as above, and if {yn} is a sequence
in R(T − λI) such that yn → y, and {xn } in X satisfies T (xn ) − λxn = yn , then xn → x.
One would like to have the above situation not only for every y ∈ R(T − λI), but also for
every y ∈ X. Motivated by this requirement, we have the concept of spectrum of T .
where B(X) denotes the set of all bounded linear operators from X into itself.
and, in that case, S = T −1 . If 0 ∈ ρ(T ), then we say that T is invertible in B(X). We note
that if T, S ∈ B(X) are invertible, then T S is invertible, and
(T S)−1 = S −1 T −1 .
σapp (T ) ⊆ σ(T ).
10
Proposition A: Let X and Y be normed spaces and T : X → Y be a linear operator. Then there exists
γ > 0 such that ‖T(x)‖ ≥ γ‖x‖ for all x ∈ X if and only if T is injective and T⁻¹ : R(T) → X is continuous,
and in that case, ‖T⁻¹(y)‖ ≤ (1/γ)‖y‖ for all y ∈ R(T).
We have seen examples of infinite rank operators T for which σeig(T) ≠ σapp(T). The
following example shows that strict inclusion is possible in σapp(T) ⊆ σ(T) as well.
continuous, and R(T − λI) is closed. Hence, R(T − λI) is not dense in X; otherwise, T − λI
would become bijective and (T − λI)⁻¹ ∈ B(X), which is a contradiction to the assumption
that λ ∈ σ(T).
11
Proposition A: Let X and Y be normed spaces and T : X → Y be a linear operator. Then there exists
γ > 0 such that ‖T(x)‖ ≥ γ‖x‖ for all x ∈ X if and only if T is injective and T⁻¹ : R(T) → X is continuous,
and in that case, ‖T⁻¹(y)‖ ≤ (1/γ)‖y‖ for all y ∈ R(T).
12
Proposition B: Let T : X → Y be a bounded linear operator from a Banach space X to a normed space
Y . If T is bounded below, then the range R(T ) of T is a closed subspace of Y
Proof. We prove it by showing that the closed unit ball B = {x ∈ N(Tλ) : ‖x‖ ≤ 1} is
compact, as a normed space is finite dimensional if the closed unit ball in it is compact.
Let {xn} be in B. Then {xn} is bounded as ‖xn‖ ≤ 1. Since T is compact, by Theorem 2.1.1,
{T(xn)} has a convergent subsequence {T(xnk)}. Now xn ∈ B ⊂ N(Tλ) implies Tλ(xn) =
T(xn) − λxn = 0, so that xn = λ⁻¹T(xn) because λ ≠ 0. Consequently, {xnk} = {λ⁻¹T(xnk)}
also converges, and its limit lies in B as B is closed. Since {xn} was arbitrary, every
sequence in B has a convergent subsequence, and therefore, B is compact. This implies
that dim N(Tλ) < ∞.
Assume that it does not hold. Then {xn − zn} has a bounded subsequence. Since T is compact,
it follows from Theorem 2.1.1 that {T(xn − zn)} has a convergent subsequence. Now from
Tλ = T − λI and λ ≠ 0, we have I = λ⁻¹(T − Tλ). Since zn ∈ N(Tλ), we have Tλ(zn) = 0
and thus we obtain
\[
x_n - z_n = \frac{1}{\lambda} (T - T_\lambda)(x_n - z_n) = \frac{1}{\lambda} \bigl[ T(x_n - z_n) - T_\lambda(x_n) \bigr].
\]
The sequence {T(xn − zn)} has a convergent subsequence and {Tλ(xn)} converges by (2.2);
hence {xn − zn} has a convergent subsequence, say, xnk − znk → v. Since T is compact, T is
continuous and so is Tλ. Hence
Tλ(xnk − znk) → Tλ(v).
Here Tλ(znk) = 0 because zn ∈ N(Tλ), so by (2.2) we also have
\[
w_n = \frac{1}{a_n} (x_n - z_n), \tag{2.5}
\]
we have ‖wn‖ = 1. Since an → ∞, whereas Tλ(zn) = 0 and {Tλ(xn)} converges, it follows
that
\[
T_\lambda(w_n) = \frac{1}{a_n} T_\lambda(x_n) \to 0. \tag{2.6}
\]
Using again I = λ⁻¹(T − Tλ), we obtain
\[
w_n = \frac{1}{\lambda} \bigl( T(w_n) - T_\lambda(w_n) \bigr). \tag{2.7}
\]
Since T is compact and {wn} is bounded, {T(wn)} has a convergent subsequence. Furthermore,
{Tλ(wn)} converges by (2.6). Hence (2.7) shows that {wn} has a convergent subsequence,
say
wnj → w. (2.8)
A comparison with (2.6) implies that Tλ(w) = 0. Hence w ∈ N(Tλ). Since zn ∈ N(Tλ), also
un = zn + an w ∈ N(Tλ). Hence for the distance from xn to un, we must have
‖xn − un‖ ≥ δn.
δn ≤ ‖xn − zn − an w‖
= ‖an wn − an w‖
= an ‖wn − w‖
< 2δn ‖wn − w‖.
Dividing by 2δn > 0, we have 1/2 < ‖wn − w‖. This contradicts (2.8) and proves the result.
Exercise 2.4.4. Give an example of a bijective operator T on a normed space X such that
0 ∈ σ(T).
3
Throughout this section, unless otherwise specified, we assume that X is a real vector space
and f : X → R ∪ {±∞} is an extended real-valued function. In this section, we discuss
directional derivatives of f and present some of their basic properties.
Since
\[
f'_+(x; -d) = \lim_{t \to 0^+} \frac{f(x - td) - f(x)}{t} = \lim_{\tau \to 0^-} \frac{f(x + \tau d) - f(x)}{-\tau} = -f'_-(x; d),
\]
we have
−f+′(x; −d) = f−′(x; d).
If f+′(x; d) exists and f+′(x; d) = f−′(x; d), then it is called the directional derivative of f at x
in the direction d. Thus, the directional derivative of f at x in the direction d ∈ X is defined
by
\[
f'(x; d) = \lim_{t \to 0} \frac{f(x + td) - f(x)}{t},
\]
provided the limit exists in [−∞, +∞], that is, finite or not.
Remark 3.1.1. (a) If f ′ (x; d) exists, then f ′ (x; −d) = −f ′ (x; d).
For an extended convex function¹ f : X → R ∪ {±∞}, the following proposition shows that
the function t ↦ (f(x + td) − f(x))/t is monotonically nondecreasing on (0, ∞).
Proposition 3.1.1. Let f : X → R ∪ {±∞} be an extended real-valued convex function
and x be a point in X where f is finite. Then, for each direction d ∈ X, the function
t ↦ (f(x + td) − f(x))/t is monotonically nondecreasing on (0, ∞).
Proof. Let x ∈ X be any point such that f(x) is finite, and s, t ∈ (0, ∞) with s ≤ t. Then,
by convexity of f, we have
\[
\begin{aligned}
f(x + sd) &= f\Bigl( \frac{s}{t} (x + td) + \Bigl( 1 - \frac{s}{t} \Bigr) x \Bigr) \\
&\le \frac{s}{t} f(x + td) + \Bigl( 1 - \frac{s}{t} \Bigr) f(x).
\end{aligned}
\]
It follows that
\[
\frac{f(x + sd) - f(x)}{s} \le \frac{f(x + td) - f(x)}{t}.
\]
Thus, the function t ↦ (f(x + td) − f(x))/t is monotonically nondecreasing on (0, ∞).
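The monotonicity of the difference quotient is easy to observe numerically. The sketch below uses the convex function f(x) = eˣ on X = R at x = 0 in the direction d = 1; this choice of f, the grid `ts` and the helper names are illustrative assumptions.

```python
import math

def difference_quotient(f, x, d, t):
    """The quotient (f(x + t d) - f(x)) / t for a function on the real line."""
    return (f(x + t * d) - f(x)) / t

def is_nondecreasing(values, tol=1e-12):
    return all(a <= b + tol for a, b in zip(values, values[1:]))

f = math.exp                      # a convex function on R
ts = [0.001, 0.01, 0.1, 0.5, 1.0, 2.0]
quotients = [difference_quotient(f, 0.0, 1.0, t) for t in ts]
```

The quotients increase with t, and the values for small t approach 1, the one-sided directional derivative of f at 0, which is the infimum of the quotients over t > 0.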
1
A function f : X → R ∪ {±∞} is said to be convex if for all x, y ∈ X with f(x), f(y) ≠ ±∞, and all
α ∈ [0, 1], f(αx + (1 − α)y) ≤ αf(x) + (1 − α)f(y).
The following result ensures the existence of f+′(x; d) and f−′(x; d) when f is a convex
function.
Proposition 3.1.2. Let f : X → R ∪ {±∞} be an extended real-valued convex function
and x be a point in X where f is finite. Then, f+′(x; d) and f−′(x; d) exist for every
direction d ∈ X. Also,
\[
f'_+(x; d) = \inf_{t > 0} \frac{f(x + td) - f(x)}{t}, \tag{3.1}
\]
and
\[
f'_-(x; d) = \sup_{t < 0} \frac{f(x + td) - f(x)}{t}. \tag{3.2}
\]
Proof. Let x ∈ X be any point such that f(x) is finite. For given t > 0, by the convexity of
f, we have
\[
\begin{aligned}
f(x) &= f\Bigl( \frac{t}{1+t} (x - d) + \frac{1}{1+t} (x + td) \Bigr) \\
&\le \frac{t}{1+t} f(x - d) + \frac{1}{1+t} f(x + td) \\
&= \frac{1}{1+t} \bigl( t f(x - d) + f(x + td) \bigr).
\end{aligned}
\]
It follows that (1 + t)f(x) ≤ tf(x − d) + f(x + td), and so,
\[
\frac{f(x + td) - f(x)}{t} \ge f(x) - f(x - d).
\]
Hence the quotients (f(x + td) − f(x))/t, which are nonincreasing as t → 0+ by
Proposition 3.1.1, are bounded below by the constant f(x) − f(x − d). Thus, the limit in the
definition of f+′(x; d) exists and is given by
\[
f'_+(x; d) = \lim_{t \to 0^+} \frac{f(x + td) - f(x)}{t} = \inf_{t > 0} \frac{f(x + td) - f(x)}{t}.
\]
Since f+′ (x; d) exists in every direction d, the equality −f+′ (x; −d) = f−′ (x; d) implies that
f−′ (x; d) exists in every direction d.
The relation (3.2) can be established on the lines of the proof given to derive (3.1).
Next, we show that f+′ (x; ·) is convex. Let d1 , d2 ∈ X and λ1 , λ2 ≥ 0 be such that λ1 +λ2 = 1.
From the convexity of f , we have
f (x + t(λ1 d1 + λ2 d2 )) − f (x)
= f ((λ1 + λ2 )x + t(λ1 d1 + λ2 d2 )) − (λ1 + λ2 )f (x)
= f (λ1 (x + td1 ) + λ2 (x + td2 )) − λ1 f (x) − λ2 f (x)
≤ λ1 f (x + td1 ) + λ2 f (x + td2 ) − λ1 f (x) − λ2 f (x)
= λ1 (f (x + td1 ) − f (x)) + λ2 (f (x + td2 ) − f (x))
for all sufficiently small t. Dividing by t > 0 and letting t → 0+ , we obtain
f+′ (x; λ1 d1 + λ2 d2 ) ≤ λ1 f+′ (x; d1 ) + λ2 f+′ (x; d2 ).
Hence f+′ (x; d) is convex in d.
By subadditivity of f+′ (x; d) in d with f+′ (x; d) < +∞ and f+′ (x; −d) < +∞, we obtain
f+′ (x; d) + f+′ (x; −d) ≥ f+′ (x; 0) = 0,
and thus,
f+′ (x; d) ≥ −f+′ (x; −d) = f−′ (x; d).
If f+′ (x; d) = +∞ or f+′ (x; −d) = +∞, then the inequality (3.3) holds trivially.
\[
f'(x; d) = \inf_{t \in (0, \infty)} \frac{f(x + td) - f(x)}{t}.
\]
a
A function f : X → R is said to be sublinear if f (λx) = λf (x) and f (x + y) ≤ f (x) + f (y) for all
x, y ∈ X and all λ ≥ 0.
(b) If y is not in Dom(f), then the inequality (3.4) trivially holds. So, let y ∈ Dom(f). For
t ∈ (0, 1), we have
f((1 − t)x + ty) − f(x) ≤ t(f(y) − f(x)),
which implies that
\[
\frac{f((1 - t)x + ty) - f(x)}{t} \le f(y) - f(x).
\]
Letting t → 0+, we obtain
and
f−′ (y; y − x) ≥ f−′ (x; y − x). (3.6)
In particular, if f : Rn → R is differentiable at x and y, then
and
f (x) ≥ f (y) + f+′ (y; x − y). (3.9)
Since −f+′ (x; −d) = f−′ (x; d), by using inequality (3.3), we get
Hence, the inequality (3.5) holds. Similarly, we can establish the inequality (3.6). The
inequality (3.7) holds using Remark 3.1.1 (b).
The continuous linear functional fG′ (x) : X → R is called the Gâteaux derivative of f at x.
fG′ (x; d) is called the value of the Gâteaux derivative of f at x in the direction d.
The continuous linear operator TG′ (x) : X → Y is called the Gâteaux derivative of T at x.
TG′ (x; d) is called the value of the Gâteaux derivative of T at x in the direction d.
\[
\lim_{t \to 0} \left\| \frac{T(x + td) - T(x)}{t} - T'_G(x; d) \right\| = 0. \tag{3.12}
\]
Remark 3.2.1. If fG′ (x; d) exists, then fG′ (x; −d) = −fG′ (x; d).
Remark 3.2.2. Let X = Rn be the Euclidean space with the standard inner product. If
f : Rn → R has continuous partial derivatives of order 1, then f is Gâteaux differentiable at
x = (x1, x2, . . . , xn) ∈ Rn, and its Gâteaux derivative in the direction
d = (d1, d2, . . . , dn) ∈ Rn is given by
\[
f'_G(x; d) = \sum_{k=1}^{n} \frac{\partial f(x)}{\partial x_k} \, d_k,
\]
2
René Gâteaux (1889-1914) died in the First World War, and his work was published by Lévy in 1919
with some improvements.
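where ∂f(x)/∂xk denotes the partial derivative of f at the point x with respect to xk. Thus,
\[
\nabla_G f(x) = \left( \frac{\partial f(x)}{\partial x_1}, \frac{\partial f(x)}{\partial x_2}, \ldots, \frac{\partial f(x)}{\partial x_n} \right)
\]
is the gradient of f at the point x.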
Remark 3.2.3. Let X = Rn and Y = Rm be Euclidean spaces with the standard inner
product. Let T : Rn → Rm be given by T = (f1, f2, . . . , fm), where fi : Rn → R is a function
for each i = 1, 2, . . . , m, and let A = (aij) be an m × n matrix. Let
d = ej = (0, 0, . . . , 1, . . . , 0, 0), where 1 is at the jth place. Then
\[
\lim_{t \to 0} \left\| \frac{T(x + td) - T(x)}{t} - Ad \right\| = 0
\]
implies that
\[
\lim_{t \to 0} \left| \frac{f_i(x + te_j) - f_i(x)}{t} - a_{ij} \right| = 0,
\]
for all i = 1, 2, . . . , m and all j = 1, 2, . . . , n. This shows that fi has partial derivatives at x
and
\[
\frac{\partial f_i(x)}{\partial x_j} = a_{ij}, \quad \text{for } i = 1, 2, \ldots, m \text{ and } j = 1, 2, \ldots, n.
\]
Hence
\[
T'_G(x) = \begin{pmatrix}
\dfrac{\partial f_1(x)}{\partial x_1} & \cdots & \dfrac{\partial f_1(x)}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f_m(x)}{\partial x_1} & \cdots & \dfrac{\partial f_m(x)}{\partial x_n}
\end{pmatrix}.
\]
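The column-by-column description of the matrix in this remark suggests a simple numerical approximation: each column is a Gâteaux difference quotient in a coordinate direction. The sketch below is an illustration only; the map `T`, the step size `t` and the helper names are hypothetical choices.

```python
def gateaux_quotient(T, x, d, t):
    """The vector (T(x + t d) - T(x)) / t for a map T from R^n to R^m."""
    Tx = T(x)
    Txt = T([xi + t * di for xi, di in zip(x, d)])
    return [(a - b) / t for a, b in zip(Txt, Tx)]

def numerical_jacobian(T, x, t=1e-6):
    """Approximate the matrix (a_ij): column j is the quotient in direction e_j."""
    n = len(x)
    columns = []
    for j in range(n):
        e_j = [1.0 if k == j else 0.0 for k in range(n)]
        columns.append(gateaux_quotient(T, x, e_j, t))
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def T(x):
    """Hypothetical smooth map T = (f1, f2) from R^2 to R^2."""
    return [x[0] ** 2 * x[1], x[0] + 3.0 * x[1]]

# The exact matrix of partial derivatives at (1, 2) is [[4, 1], [1, 3]].
jacobian = numerical_jacobian(T, [1.0, 2.0])
```

The forward quotient has an error of order t, so the entries agree with the exact partial derivatives only up to roughly the step size.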
Proof. Assume that there exist two continuous linear operators TG′(x) and TG∗′(x) which satisfy
(3.12). Then, for all d ∈ X, and for sufficiently small t ≠ 0, we have
\[
\begin{aligned}
\|T'_G(x; d) - T^{*\prime}_G(x; d)\|
&= \left\| \left( \frac{T(x + td) - T(x)}{t} - T'_G(x; d) \right) - \left( \frac{T(x + td) - T(x)}{t} - T^{*\prime}_G(x; d) \right) \right\| \\
&\le \left\| \frac{T(x + td) - T(x)}{t} - T'_G(x; d) \right\| + \left\| \frac{T(x + td) - T(x)}{t} - T^{*\prime}_G(x; d) \right\| \\
&\to 0 \quad \text{as } t \to 0.
\end{aligned}
\]
Therefore, ‖TG′(x; d) − TG∗′(x; d)‖ = 0 for all d ∈ X. Hence, TG′(x; d) = TG∗′(x; d), and thus,
TG′(x) ≡ TG∗′(x).
Theorem 3.2.1. Let K be a nonempty open convex subset of a normed space X and
f : K → R be a convex function. If f is Gâteaux differentiable at x ∈ K, then fG′ (x; d)
is linear in d. Conversely, if f+′ (x; d) is linear in d, then f is Gâteaux differentiable at
x.
and thus,
f+′ (x; d + u) = f+′ (x; d) + f+′ (x; u).
Since fG′ (x; d) = f+′ (x; d) = f−′ (x; d), we have
Then,
\[
f'_G((0, 0); d) = \lim_{t \to 0} \frac{f((0, 0) + t(d_1, d_2)) - f(0, 0)}{t} = \frac{d_1^2 d_2}{d_1^2 + d_2^2}.
\]
Therefore, f is Gâteaux differentiable at (0, 0), but fG′((0, 0); d) is not linear in d.
(b) For a real valued function f defined on Rn, the partial derivatives may exist at a point
but f may not be Gâteaux differentiable at that point. For example, consider the
function f : R2 → R defined by
\[
f(x) = \begin{cases} \dfrac{x_1 x_2}{x_1^2 + x_2^2}, & \text{if } x \ne (0, 0), \\ 0, & \text{if } x = (0, 0). \end{cases}
\]
Then the limit
\[
\lim_{t \to 0} \frac{f((0, 0) + t(d_1, d_2)) - f(0, 0)}{t} = \lim_{t \to 0} \frac{d_1 d_2}{t (d_1^2 + d_2^2)}
\]
exists only if d = (d1, 0) or d = (0, d2). That is, fG′(0; d) does not exist for every direction d,
but ∂f(0, 0)/∂x1 = 0 = ∂f(0, 0)/∂x2, where 0 = (0, 0) is the zero vector in R2.
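The behaviour of the difference quotient in part (b) can be observed directly. The sketch below evaluates the quotient at the origin for a few directions; the function name `quotient_at_origin` and the chosen step sizes are illustrative.

```python
def f(x1, x2):
    """f(x) = x1 x2 / (x1^2 + x2^2) away from the origin, and 0 at the origin."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 * x2 / (x1 ** 2 + x2 ** 2)

def quotient_at_origin(d, t):
    """The difference quotient (f(0 + t d) - f(0)) / t."""
    return (f(t * d[0], t * d[1]) - f(0.0, 0.0)) / t

# Along the coordinate axes the quotient vanishes, so both partial derivatives are 0.
along_e1 = quotient_at_origin((1.0, 0.0), 1e-3)
along_e2 = quotient_at_origin((0.0, 1.0), 1e-3)

# Along d = (1, 1) the quotient equals 1 / (2 t), which is unbounded as t -> 0.
diagonal = [quotient_at_origin((1.0, 1.0), t) for t in (1e-1, 1e-2, 1e-3)]
```

The diagonal quotients grow like 1/(2t), so the limit defining the Gâteaux derivative fails to exist in that direction.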
(c) The existence, linearity and continuity of fG′(x; d) in d do not imply the continuity of
the function f. For example, consider the function f : R2 → R defined by
\[
f(x) = \begin{cases} \dfrac{x_1^3}{x_2}, & \text{if } x_1 \ne 0 \text{ and } x_2 \ne 0, \\ 0, & \text{if } x_1 = 0 \text{ or } x_2 = 0. \end{cases}
\]
Then
\[
f'_G((0, 0); d) = \lim_{t \to 0} \frac{t^3 d_1^3}{t^2 d_2} = 0,
\]
for all d = (d1, d2) ∈ R2 with d1 ≠ 0 and d2 ≠ 0, while the difference quotient vanishes
identically if d1 = 0 or d2 = 0. Thus, fG′(0; d) exists and it is continuous
and linear in d, but f is discontinuous at (0, 0). Hence a Gâteaux differentiable function
is not necessarily continuous.
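Both halves of the claim in part (c) can be checked numerically: the difference quotients at the origin tend to zero, while f stays equal to 1 along a curve through the origin. The curve x2 = x1³ and the helper names below are illustrative choices, not taken from the notes.

```python
def f(x1, x2):
    """f(x) = x1^3 / x2 if both coordinates are nonzero, and 0 otherwise."""
    if x1 == 0.0 or x2 == 0.0:
        return 0.0
    return x1 ** 3 / x2

def quotient_at_origin(d, t):
    """The difference quotient (f(0 + t d) - f(0)) / t."""
    return (f(t * d[0], t * d[1]) - f(0.0, 0.0)) / t

# In the direction d = (1, 1) the quotient equals t, which tends to 0.
quotients = [quotient_at_origin((1.0, 1.0), t) for t in (1e-2, 1e-3, 1e-4)]

# Along the curve x2 = x1^3 the function is identically 1, although f(0, 0) = 0.
curve_values = [f(s, s ** 3) for s in (1e-1, 1e-2, 1e-3)]
```

The points (s, s³) converge to the origin while f remains 1 there, which exhibits the discontinuity at (0, 0).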
(d) The Gâteaux derivative fG′ (x; d) of a function f is positively homogeneous in the second
argument, that is, fG′ (x; rd) = rfG′ (x; d) for all r > 0. But, as we have seen in part (a),
in general, fG′ (x; d) is not linear in d.
Remark 3.2.5. The Gâteaux derivative of a linear operator T : X → Y is also a linear
operator. Indeed, if T : X → Y is a linear operator, then we have
\[
T'_G(x; d) = \lim_{t \to 0} \frac{T(x + td) - T(x)}{t} = \lim_{t \to 0} \frac{T(x) + tT(d) - T(x)}{t} = T(d).
\]
Hence TG′ (x; d) = T (d) for all x ∈ X and d ∈ X.
The following theorem shows that the partial derivatives and Gâteaux derivative are the
same if the function f defined on X is convex.
Theorem 3.2.2. Let K be a nonempty convex subset of Rn and f : K → R be a convex
function. If the partial derivatives of f at x ∈ K exist, then f is Gâteaux differentiable
at x.
Proof. Suppose that the partial derivatives of f at x ∈ K exist. Then, the Gâteaux derivative
of f at x is the linear functional
\[
f'_G(x; d) = \sum_{k=1}^{n} \frac{\partial f(x)}{\partial x_k} \, d_k, \quad \text{for } d = (d_1, d_2, \ldots, d_n) \in \mathbb{R}^n.
\]
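So,
\[
\frac{g(\lambda d)}{\lambda} \le \sum_{k=1}^{n} \frac{g(n\lambda d_k e_k)}{n\lambda}, \quad \text{for } \lambda > 0,
\]
and
\[
\frac{g(\lambda d)}{\lambda} \ge \sum_{k=1}^{n} \frac{g(n\lambda d_k e_k)}{n\lambda}, \quad \text{for } \lambda < 0.
\]
Since
\[
\lim_{\lambda \to 0} \frac{g(n\lambda d_k e_k)}{n\lambda} = \frac{\partial g(0)}{\partial d_k} = 0, \quad \text{for all } k = 1, 2, \ldots, n,
\]
we have
\[
\lim_{\lambda \to 0} \frac{g(\lambda d)}{\lambda} = 0,
\]
and so, f is Gâteaux differentiable at x.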
Proof. Since K is an open subset of X, we can select an open interval I of real numbers,
which contains the numbers 0 and 1, such that x + λd belongs to K for all λ ∈ I. For all
λ ∈ I, define
ϕ(λ) = T (x + λd).
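Then,
\[
\begin{aligned}
\varphi'(\lambda) &= \lim_{\tau \to 0} \frac{\varphi(\lambda + \tau) - \varphi(\lambda)}{\tau} \\
&= \lim_{\tau \to 0} \frac{T(x + \lambda d + \tau d) - T(x + \lambda d)}{\tau} \\
&= T'_G(x + \lambda d; d).
\end{aligned} \tag{3.14}
\]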
By applying the mean value theorem for real-valued functions of one variable to the restric-
tion of the function ϕ : I → R to the closed interval [0, 1], we obtain
ϕ(1) − ϕ(0) = ϕ′ (s), for some s ∈ ]0, 1[.
By using (3.14) and the definition of ϕ : [0, 1] → R, we obtain the desired result.
For differentiable functions, we have the following result, which follows from the above
theorem.
Corollary 3.2.1. If in the above theorem T is a differentiable function from Rn to R,
then there exists s ∈ ]0, 1[ such that T(x + d) − T(x) = TG′(x + sd; d) = ⟨∇T(x + sd), d⟩.
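The intermediate point s in the corollary can be located numerically for a concrete function. The sketch below is an illustration under assumptions: the function T(x) = x1³ + x2², the points x = (0, 0) and d = (1, 1), and the bisection routine `find_s` are hypothetical choices, and the bisection relies on the gap changing sign from negative to positive on [0, 1], which holds for this example.

```python
def T(x):
    """Hypothetical differentiable function from R^2 to R."""
    return x[0] ** 3 + x[1] ** 2

def grad_T(x):
    return [3.0 * x[0] ** 2, 2.0 * x[1]]

def mean_value_gap(x, d, s):
    """<grad T(x + s d), d> - (T(x + d) - T(x)); it vanishes at the point s."""
    point = [xi + s * di for xi, di in zip(x, d)]
    directional = sum(g * di for g, di in zip(grad_T(point), d))
    shifted = [xi + di for xi, di in zip(x, d)]
    return directional - (T(shifted) - T(x))

def find_s(x, d, tol=1e-12):
    """Bisection on [0, 1]; assumes the gap is negative at 0 and positive at 1."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_value_gap(x, d, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Here the gap is 3 s^2 + 2 s - 2, whose root in ]0, 1[ is (sqrt(28) - 2) / 6.
s = find_s([0.0, 0.0], [1.0, 1.0])
```

The corollary only asserts that some s in ]0, 1[ exists; for other functions the gap need not change sign at the endpoints, and a different root-finding strategy would be needed.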
(a) f is convex on K.
In this case, T′(x), also denoted by DT(x), is called the Fréchet derivative of T at the point x.
The operator T′ : X → B(X, Y) which assigns a continuous linear operator T′(x) to a vector
x is known as the Fréchet derivative³ of T.
The domain of the operator T ′ contains naturally all vectors in X at which the Fréchet
derivative can be defined.
The meaning of the relation (3.16) is that for each ε > 0, there exists a δ > 0 (depending on
ε) such that
\[ \frac{\|T(x + d) - T(x) - T'(x)(d)\|}{\|d\|} < \varepsilon, \]
for all d ∈ X satisfying the condition kdk < δ.
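Example 3.3.1. Let $X = \mathbb{R}^n$ and $Y = \mathbb{R}^m$ be Euclidean spaces with the standard inner product, and let $T : \mathbb{R}^n \to \mathbb{R}^m$ be Fréchet differentiable at a point $x \in \mathbb{R}^n$. Then T is represented by $T(x) = (f_1(x_1, \ldots, x_n), \ldots, f_m(x_1, \ldots, x_n))$, where $f_j : \mathbb{R}^n \to \mathbb{R}$ is a function for each j = 1, 2, . . . , m. Let $\{e_i : i = 1, 2, \ldots, n\}$ denote the standard basis in $\mathbb{R}^n$. Then the vector $d \in \mathbb{R}^n$ can be represented as $d = \sum_{i=1}^{n} d_i e_i$, and by linearity $T'(x)(d) = \sum_{i=1}^{n} d_i T'(x)(e_i)$. Therefore we find that
\[
\lim_{t \to 0} \frac{(f_1(\cdot, x_i + t, \cdot), \ldots, f_m(\cdot, x_i + t, \cdot)) - (f_1(\cdot, x_i, \cdot), \ldots, f_m(\cdot, x_i, \cdot))}{t}
= \left( \frac{\partial f_1(x)}{\partial x_i}, \ldots, \frac{\partial f_m(x)}{\partial x_i} \right) = T'(x)(e_i).
\]
Thus the Fréchet derivative T′ is expressed in the following form:
\[
\begin{aligned}
T'(x)(d) &= \sum_{i=1}^{n} d_i \left( \frac{\partial f_1(x)}{\partial x_i}, \ldots, \frac{\partial f_m(x)}{\partial x_i} \right) \\
&= \sum_{i=1}^{n} \left( \frac{\partial f_1(x)}{\partial x_i}\, d_i, \ldots, \frac{\partial f_m(x)}{\partial x_i}\, d_i \right) \\
&= \begin{pmatrix}
\frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_1(x)}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\frac{\partial f_m(x)}{\partial x_1} & \cdots & \frac{\partial f_m(x)}{\partial x_n}
\end{pmatrix}
\begin{pmatrix} d_1 \\ \vdots \\ d_n \end{pmatrix}.
\end{aligned}
\]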
³ The Fréchet derivative was introduced by the French mathematician Maurice Fréchet in 1925.
This shows that the Fréchet derivative T ′ (x) at a point x is a linear operator represented by
the Jacobian matrix.
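The identification of T′(x) with the Jacobian matrix can be checked numerically. The following is a minimal sketch, not part of the notes: the sample map T(x1, x2) = (x1² x2, x1 + sin x2) and all helper names are assumptions chosen only for illustration. It evaluates the quotient ‖T(x + d) − T(x) − T′(x)(d)‖ / ‖d‖ for shrinking d.

```python
import math

def T(x):
    """Hypothetical sample map T : R^2 -> R^2 (chosen only for illustration)."""
    x1, x2 = x
    return [x1 * x1 * x2, x1 + math.sin(x2)]

def jacobian(x):
    """Jacobian matrix of T at x, computed by hand from the partial derivatives."""
    x1, x2 = x
    return [[2.0 * x1 * x2, x1 * x1],
            [1.0, math.cos(x2)]]

def apply(matrix, d):
    """Matrix-vector product, i.e. T'(x)(d) in coordinates."""
    return [sum(row[i] * d[i] for i in range(len(d))) for row in matrix]

def frechet_quotient(x, d):
    """||T(x + d) - T(x) - T'(x)(d)|| / ||d|| in the Euclidean norm."""
    shifted = [xi + di for xi, di in zip(x, d)]
    lin = apply(jacobian(x), d)
    residual = [a - b - c for a, b, c in zip(T(shifted), T(x), lin)]
    return math.hypot(residual[0], residual[1]) / math.hypot(d[0], d[1])

x = [1.0, 0.5]
quotients = [frechet_quotient(x, [h, -h]) for h in (1e-1, 1e-2, 1e-3)]
print(quotients)  # the quotients shrink roughly in proportion to h
```

The quotient decreasing with ‖d‖ is what the ε-δ description of Fréchet differentiability predicts when the Jacobian is used as the linear operator.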
Remark 3.3.1. If the operators T and S are Fréchet differentiable at x and λ is a scalar, then λT and T + S are Fréchet differentiable at x, and for all d ∈ X,
\[ (\lambda T)'(x)(d) = \lambda T'(x)(d) \quad \text{and} \quad (T + S)'(x)(d) = T'(x)(d) + S'(x)(d). \]
The following example shows that the converse of Proposition 3.3.1 is not true, that is, if an
operator T : X → Y is Gâteaux differentiable, then it may not be Fréchet differentiable.
Example 3.3.2. Let $X = \mathbb{R}^2$ with the Euclidean norm ‖·‖ and f : X → R be the function defined by
\[
f(x, y) = \begin{cases} \dfrac{x^3 y}{x^4 + y^2}, & \text{if } (x, y) \neq (0, 0), \\[1ex] 0, & \text{if } (x, y) = (0, 0). \end{cases}
\]
It can be easily seen that f is Gâteaux differentiable at (0, 0) with Gâteaux derivative $f_G'(0, 0) = 0$.
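The gap between the two notions of differentiability in this example can be seen numerically. The sketch below is only an illustration and assumes the denominator x⁴ + y² (the usual form of this counterexample); the helper names are hypothetical. Along a fixed direction the difference quotient tends to 0, while along the parabola y = x² the Fréchet quotient for the candidate derivative 0 stays near 1/2.

```python
import math

def f(x, y):
    """The function of Example 3.3.2 (assumed denominator x^4 + y^2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x**3 * y / (x**4 + y**2)

def directional_quotient(d, t):
    """(f(0 + t d) - f(0)) / t for a direction d = (d1, d2)."""
    return f(t * d[0], t * d[1]) / t

def frechet_quotient(x, y):
    """|f(x, y) - f(0, 0) - 0| / ||(x, y)||, using the candidate derivative 0."""
    return abs(f(x, y)) / math.hypot(x, y)

# Along a fixed direction the difference quotient tends to 0 ...
line_values = [directional_quotient((1.0, 2.0), t) for t in (1e-1, 1e-2, 1e-3)]
# ... but along the parabola y = x^2 the Frechet quotient does not tend to 0.
parabola_values = [frechet_quotient(x, x * x) for x in (1e-1, 1e-2, 1e-3)]
print(line_values)
print(parabola_values)
```

Since the Fréchet quotient does not tend to 0 along the parabola, the zero functional cannot be a Fréchet derivative at the origin, although it is the Gâteaux derivative there.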
Proof. Since T has a Fréchet derivative at x ∈ X, for each ε1 > 0, there exists a δ1 > 0
(depending on ε1 ) such that
Choose δ = min{δ1 , ε/(ε1 + kT ′ (x)k)} for each ε > 0. Then for all y ∈ X, we have
Theorem 3.3.3 (Mean Value Theorem). Let K be an open convex subset of a normed space X, a, b ∈ K, and let T : K → X be Fréchet differentiable at each x ∈ (a, b) (the open line segment joining a and b) and continuous on the closed line segment [a, b]. Then
\[ \|T(b) - T(a)\| \le \sup_{y \in (a,b)} \|T'(y)\|\, \|b - a\|. \tag{3.17} \]
By Classical Mean Value Theorem of Calculus for ϕ, we have that for some λ̂ ∈ [0, 1] and
x = (1 − λ̂)a + λ̂b,
where we have used the Chain Rule and the fact that a bounded linear functional is its own
derivative. Therefore, for each continuous linear functional F on X, we have
Now, if we define a functional G on the subspace [T(b) − T(a)] of X by $G(\alpha(T(b) - T(a))) = \alpha$,
then $\|G\| = \|T(b) - T(a)\|^{-1}$. If F is a Hahn-Banach extension of G to the entire space X, we find by
substitution in (3.18) that
\[ \lim_{\|d\| \to 0} \frac{\|f'(x + d) - f'(x) - A(d)\|}{\|d\|} = 0. \]
The second derivative of f at x is $f''(x) = A$.
Theorem 3.3.5 (Taylor’s Formula for Twice Fréchet Differentiable Functions). Let T :
Ω ⊂ X → Y and [a, a + h] be any closed segment lying in Ω. If T is differentiable in Ω
For proofs of these two theorems and other related results, we refer to the book by H. Cartan,
Differential Calculus, Hermann, 1971.
Let X be a Hilbert space and f : X → (−∞, ∞] be a proper functional such that f is Gâteaux
differentiable at a point x ∈ int(Dom(f)). Then, by the Riesz representation theorem⁴, there
exists exactly one vector, denoted by $\nabla_G f(x)$, in X such that
\[ f_G'(x)(d) = \langle \nabla_G f(x), d \rangle \ \text{for all } d \in X \quad \text{and} \quad \|f_G'(x)\|_* = \|\nabla_G f(x)\|. \tag{3.19} \]
\[ f_G'(x)(d) = \langle \nabla_G f(x), d \rangle = \lim_{t \in \mathbb{R},\, t \to 0} \frac{f(x + td) - f(x)}{t}, \quad \text{for all } d \in X. \]
\[ f_G'(x)(d) = \lim_{t \to 0} \frac{f(x + td) - f(x)}{t} = \frac{1}{\|x\|} \langle x, d \rangle = \langle \nabla_G f(x), d \rangle, \]
where $\nabla_G f(x) = \frac{1}{\|x\|}\, x$.
\[ f(y) - f(x) \le \frac{\beta}{2} \|y - x\|^2 + \langle y - x, \nabla f(x) \rangle. \]
Noticing that
Hence
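\[
\begin{aligned}
f(y) &= f(x) + \int_0^1 \phi'(t)\, dt \\
&= f(x) + \int_0^1 \langle y - x, \nabla f(x + t(y - x)) \rangle\, dt \\
&= f(x) + \int_0^1 \langle y - x, \nabla f(x + t(y - x)) - \nabla f(x) \rangle\, dt + \langle y - x, \nabla f(x) \rangle \\
&\le f(x) + \int_0^1 \|y - x\|\, \|\nabla f(x + t(y - x)) - \nabla f(x)\|\, dt + \langle y - x, \nabla f(x) \rangle \\
&\le f(x) + \frac{\beta}{2} \|y - x\|^2 + \langle y - x, \nabla f(x) \rangle.
\end{aligned}
\]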
(b) Replacing y by $x - \frac{1}{2\beta} \nabla f(x)$ in (a), we get (b).
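The inequality in part (a) can be checked numerically on a concrete function. The sketch below assumes the one-dimensional softplus function f(t) = log(1 + e^t), which is not taken from the notes; its derivative (the logistic sigmoid) is Lipschitz continuous with constant β = 1/4. The code evaluates the gap between the quadratic upper bound and f(y) on a grid of pairs (x, y).

```python
import math

def f(t):
    """Softplus log(1 + e^t), an assumed example with a (1/4)-Lipschitz derivative."""
    return math.log1p(math.exp(t))

def grad_f(t):
    """Derivative of softplus: the logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-t))

beta = 0.25
points = [k / 4.0 for k in range(-12, 13)]

# gap = f(x) + <y - x, grad f(x)> + (beta / 2) * |y - x|^2 - f(y)
gaps = [
    f(x) + grad_f(x) * (y - x) + 0.5 * beta * (y - x) ** 2 - f(y)
    for x in points
    for y in points
]
print(min(gaps))  # nonnegative up to rounding, as the inequality predicts
```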
Definition 3.4.1. Let X be an inner product space. An operator T : X → X is said to be
γ-inverse strongly monotone or γ-cocoercive if there exists γ > 0 such that
Note
g(x) = 0 ≤ f (z) − f (x) − h∇f (x), z − xi = g(z), for all z ∈ X
and
\[ \nabla g(z) = \nabla f(z) - \nabla f(x), \quad \text{for all } z \in X. \]
Clearly, $\inf_{z \in X} g(z) = 0$. One can see that
(a) nonexpansive if
kT (x) − T (y)k ≤ kx − yk, for all x, y ∈ X;
It can be easily seen that every firmly nonexpansive mapping is nonexpansive, but the converse
may not hold. For example, consider the negative of the identity operator, that is, −I.
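The counterexample −I can be verified on a pair of points. The sketch below is only an illustration; it assumes the standard definition of firm nonexpansiveness, ‖T(x) − T(y)‖² ≤ ⟨T(x) − T(y), x − y⟩, and the helper names are hypothetical.

```python
def neg_identity(x):
    """The operator -I on R^2."""
    return [-xi for xi in x]

def inner(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def diff(u, v):
    """Componentwise difference u - v."""
    return [a - b for a, b in zip(u, v)]

x, y = [1.0, 2.0], [-0.5, 3.0]
Tx, Ty = neg_identity(x), neg_identity(y)

lhs = inner(diff(Tx, Ty), diff(Tx, Ty))      # ||Tx - Ty||^2
dist_sq = inner(diff(x, y), diff(x, y))      # ||x - y||^2
pairing = inner(diff(Tx, Ty), diff(x, y))    # <Tx - Ty, x - y>

is_nonexpansive_here = lhs <= dist_sq          # holds with equality
is_firmly_nonexpansive_here = lhs <= pairing   # fails: pairing = -||x - y||^2
print(is_nonexpansive_here, is_firmly_nonexpansive_here)
```

For T = −I the pairing equals −‖x − y‖², so the firm nonexpansiveness inequality fails for every pair x ≠ y.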
Exercise 3.4.1. Let X be a Hilbert space and Y be an inner product space, A ∈ B(X, Y )
and b ∈ Y . Define a functional f : X → R by
\[ f(x) = \frac{1}{2} \|A(x) - b\|^2, \quad \text{for all } x \in X. \]
Then prove that f is Fréchet differentiable on X with $\nabla f(x) = A^*(Ax - b)$ and $\nabla^2 f(x) = A^* A$
for each x ∈ X.
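\[
\begin{aligned}
f(x + y) - f(x)
&= \frac{1}{2} \langle A(x) - b + A(y), A(x) - b + A(y) \rangle - f(x) \\
&= \frac{1}{2} \bigl[ \langle A(x) - b, A(x) - b \rangle + \langle A(x) - b, A(y) \rangle + \langle A(y), A(x) - b \rangle + \langle A(y), A(y) \rangle \bigr] - f(x) \\
&= \langle A(x) - b, A(y) \rangle + \frac{1}{2} \|A(y)\|^2 \\
&= \langle A^*(Ax - b), y \rangle + \frac{1}{2} \|A(y)\|^2.
\end{aligned}
\]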
Thus,
\[ |f(x + y) - f(x) - \langle A^*(Ax - b), y \rangle| = \frac{1}{2} \|A(y)\|^2 \le \frac{\|A\|^2}{2} \|y\|^2, \quad \text{for all } y \in X. \]
Therefore, f is Fréchet differentiable on X with
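The identity behind this computation, f(x + y) − f(x) − ⟨A*(Ax − b), y⟩ = ½‖A(y)‖², can be checked in the finite-dimensional case. The following sketch assumes X = R², Y = R³ with the standard inner products, so that A* is the transpose; the matrix, vectors and helper names are hypothetical sample data.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    """Transpose of A; plays the role of the adjoint A* in this setting."""
    return [list(col) for col in zip(*A)]

def f(A, b, x):
    """f(x) = (1/2) ||Ax - b||^2."""
    r = [ai - bi for ai, bi in zip(matvec(A, x), b)]
    return 0.5 * sum(ri * ri for ri in r)

def grad_f(A, b, x):
    """Gradient A^T (Ax - b)."""
    r = [ai - bi for ai, bi in zip(matvec(A, x), b)]
    return matvec(transpose(A), r)

A = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
b = [1.0, 0.0, 2.0]
x = [0.5, -1.0]
y = [0.01, 0.02]

x_plus_y = [xi + yi for xi, yi in zip(x, y)]
linear_part = sum(g * yi for g, yi in zip(grad_f(A, b, x), y))
remainder = f(A, b, x_plus_y) - f(A, b, x) - linear_part
half_norm_Ay_sq = 0.5 * sum(v * v for v in matvec(A, y))
print(remainder, half_norm_Ay_sq)  # the two agree up to rounding
```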
When X = RN is finite dimensional, then the above operator A coincides with a positive
definite matrix. Then ∇2 f (x) = A and
where λmin and λmax are the minimum and maximum eigenvalues of A, respectively. Hence
α = λmin ≤ λmax = kAk.
The inequality (3.20) motivates us to introduce the notion of another kind of differentiability
when f is not Gâteaux differentiable at x, but the inequality (3.20) holds.
Clearly, ∂f(x) may be the empty set even if f(x) ∈ R. For x ∉ Dom(f), we
set ∂f(x) = ∅. Thus, the subdifferential of a proper convex function f is a set-valued
mapping ∂f : X ⇒ X∗ defined by
Note that f is convex and continuous, but not differentiable at 0. Clearly, f is subdifferen-
tiable at 0 with ∂f (0) = [−1, 1]. Also Dom(∂f ) = Dom(f ) = R.
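The claim ∂f(0) = [−1, 1] can be probed numerically. The sketch below assumes that the function in question is f(x) = |x| (the standard example with this subdifferential) and tests the subgradient inequality f(y) ≥ f(0) + u·(y − 0) on a finite grid; a finite grid can refute a candidate u but only suggests, not proves, that u is a subgradient.

```python
def f(x):
    """Assumed example function f(x) = |x|."""
    return abs(x)

def is_subgradient_at_zero(u, test_points):
    """Check f(y) >= f(0) + u * (y - 0) on a finite set of test points."""
    return all(f(y) >= f(0.0) + u * y for y in test_points)

test_points = [k / 10.0 for k in range(-50, 51)]
inside = [is_subgradient_at_zero(u, test_points) for u in (-1.0, -0.3, 0.0, 1.0)]
outside = [is_subgradient_at_zero(u, test_points) for u in (-1.5, 1.01)]
print(inside, outside)
```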
Example 3.5.2. Define f : R → (−∞, ∞] by
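\[ f(x) = \begin{cases} 0, & \text{if } x = 0, \\ \infty, & \text{otherwise.} \end{cases} \]
Then
\[ \partial f(x) = \begin{cases} \emptyset, & \text{if } x \neq 0, \\ \mathbb{R}, & \text{if } x = 0. \end{cases} \]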
Note that f is not continuous at 0, but f is subdifferentiable at 0 with ∂f (0) = R.
Example 3.5.3. Define f : R → (−∞, ∞] by
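\[ f(x) = \begin{cases} \infty, & \text{if } x < 0, \\ -\sqrt{x}, & \text{if } x \ge 0. \end{cases} \]
Then
\[ \partial f(x) = \begin{cases} \emptyset, & \text{if } x \le 0, \\ \left\{ -\dfrac{1}{2\sqrt{x}} \right\}, & \text{if } x > 0. \end{cases} \]
Note that Dom(f) = [0, ∞) and f is not continuous at 0. Moreover, ∂f(0) = ∅ and
Dom(∂f) = (0, ∞). Thus, f is not subdifferentiable at 0 even though 0 ∈ Dom(f).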
Then
∂iK (x) = {j ∈ X ∗ : hx − y, ji ≥ 0 for all y ∈ K} , for x ∈ K.
Proof. Since the indicator function is a proper lower semicontinuous convex function on X,
from (3.21), we have
Remark 3.5.2. Dom(iK ) = Dom(∂iK ) = K and ∂iK (x) = {0} for each x ∈ int(K).
and
f (y) ≤ f (w) + hy − w, vi, for all w ∈ X. (3.24)
Taking z = y in (3.23) and w = x in (3.24) and adding the resultants, we get
Proof. Note that R(I + ∂f) ⊆ X, so it suffices to show that X ⊆ R(I + ∂f). For this,
let x0 ∈ X and define
\[ \psi(x) = \frac{1}{2} \|x\|^2 + f(x) - \langle x, x_0 \rangle, \quad \text{for all } x \in X. \]
Note that ψ has an affine lower bound and $\lim_{\|x\| \to \infty} \psi(x) = \infty$. Hence, from Theorem A⁵, there
exists z ∈ Dom(f) such that
\[ \psi(z) = \inf_{x \in X} \psi(x). \]
and
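\[ \frac{1}{2} \|z\|^2 + f(z) - \langle z, x_0 \rangle \le \frac{1}{2} \|x\|^2 + f(x) - \langle x, x_0 \rangle, \]
which imply that
\[
\begin{aligned}
f(z) &\le f(x) + \frac{1}{2} \left( \|x\|^2 - \|z\|^2 \right) + \langle z - x, x_0 \rangle \\
&\le f(x) + \langle x - z, x \rangle + \langle z - x, x_0 \rangle \\
&= f(x) + \langle x - z, x - x_0 \rangle.
\end{aligned}
\]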
Let u ∈ X. Define zt = (1 − t)z + tu for t ∈ (0, 1). Hence, for t ∈ (0, 1), we obtain
f (z) ≤ f (u) + hu − z, z − x0 i.
⁵ Theorem A. Let K be a nonempty closed convex subset of a Hilbert space X and f : K → (−∞, +∞]
be a proper lower semicontinuous function such that $f(x_n) \to \infty$ as $\|x_n\| \to \infty$. Then there exists x̄ ∈ K
such that $f(\bar{x}) = \inf_{x \in K} f(x)$.
⁶ Let X be an inner product space. Then for any x, y ∈ X, $\|x\|^2 \le \|y\|^2 - 2\langle y - x, x \rangle$.
Proof. Exercise.
Proof. Exercise.
Therefore,
∂(λf ) = λ∂f, for all λ ∈ (0, ∞).
∂(f + g) = ∂f + ∂g.
Hence
\[ \langle y, u \rangle \le \frac{f(x + ty) - f(x)}{t}, \quad \text{for all } t \in (0, \infty). \]
Letting $t \to 0^+$, we get
hy, ui ≤ f ′ (x; y).
\[ \frac{d}{dt} f(x_0 + ty) \Big|_{t=0} = \langle y, \partial f(x_0) \rangle = \langle y, f_G'(x_0) \rangle, \quad \text{for all } y \in X. \]
f (x0 + λ(y − x0 )) = f ((1 − λ)x0 + λy) ≤ (1 − λ)f (x0 ) + λf (y), for all y ∈ X and λ ∈ (0, 1),
i.e.,
\[ \frac{f(x_0 + \lambda(y - x_0)) - f(x_0)}{\lambda} \le f(y) - f(x_0), \quad \text{for all } y \in X \text{ and } \lambda \in (0, 1), \]
It follows that
hy − x0 , fG′ (x0 )i ≤ f (y) − f (x0 ), for all y ∈ X,
i.e., fG′ (x0 ) ∈ ∂f (x0 ). This shows that x0 ∈ Dom(∂f ).
\[ \frac{d}{dt} f(x_0 + ty) \Big|_{t=0} = \langle y, \partial f(x_0) \rangle = \langle y, \nabla_G f(x_0) \rangle, \quad \text{for all } y \in X. \]
Hint 3.5.1. It is easy to see that f is differentiable with ∇f (x) = x − a for all x ∈ X by
Proposition 3.4.1.
Among all infinite dimensional Banach spaces, Hilbert spaces have the most important and
useful geometric properties. Namely, the norm induced by the inner product on an inner product space satisfies
the parallelogram law. It is well known that a normed space is an inner product space if
and only if its norm satisfies the parallelogram law. The geometric properties of an inner
product space make numerous problems posed in inner product space more manageable than
those in normed spaces. Consequently, to extend some of inner product techniques and inner
product properties, we study the geometric properties of normed spaces. In this chapter, we
study strict convexity, modulus of convexity, uniform convexity and smoothness of normed
spaces. Most of the results presented in this chapter are given in the standard books on
functional analysis, convex analysis and geometry of Banach spaces, namely, recommended
books 1 and 3.
It is well known that the norm of a normed space X is convex, that is,
kλx + (1 − λ)yk ≤ λkxk + (1 − λ)kyk, for all x, y ∈ X and λ ∈ [0, 1].
There are several norms of normed spaces which are strictly convex, that is,
kλx + (1 − λ)yk < λkxk + (1 − λ)kyk, for all x, y ∈ X with x 6= y and λ ∈ (0, 1). (4.1)
Geometrically speaking, the normed space X is strictly convex if the unit sphere $S_X$
(the boundary of the unit ball) in X contains no line segments.
Clearly, if k · k is strictly convex, then X is strictly convex. Also, kλx + (1 − λ)yk < 1 =
λkxk + (1 − λ)kyk (because kxk = kyk = 1) implies that k · k is strictly convex.
Before giving the examples of strictly convex normed spaces, we present the following char-
acterizations.
Proposition 4.1.1. The following assertions are equivalent:
Proof. (a) ⇒ (b): Assume that X is strictly convex. Then for any $x, y \in S_X$ with $x \neq y$, we have
$\|x\| = \|y\| = 1$ and therefore, by strict convexity of X, we have $\|\lambda x + (1 - \lambda) y\| < 1$ for all
$\lambda \in (0, 1)$. Taking $\lambda = \frac{1}{2}$, we obtain $\|x + y\| < 2$, that is, (b) holds.
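(b) ⇒ (a): Suppose, to the contrary, that there exist x, y ∈ X with $x \neq y$, $\|x\| = \|y\| = 1$, and $\lambda_0 \in (0, 1)$
such that $\|\lambda_0 x + (1 - \lambda_0) y\| = 1$, that is, $\lambda_0 x + (1 - \lambda_0) y \in S_X$. Take $\lambda_0 < \lambda < 1$. Then
\[ \lambda_0 x + (1 - \lambda_0) y = \frac{\lambda_0}{\lambda} \bigl[ \lambda x + (1 - \lambda) y \bigr] + \left( 1 - \frac{\lambda_0}{\lambda} \right) y, \]
as $1 - \lambda_0 = \frac{\lambda_0 (1 - \lambda)}{\lambda} + \left( 1 - \frac{\lambda_0}{\lambda} \right)$, and hence
\[ 1 = \|\lambda_0 x + (1 - \lambda_0) y\| \le \frac{\lambda_0}{\lambda} \|\lambda x + (1 - \lambda) y\| + \left( 1 - \frac{\lambda_0}{\lambda} \right) \|y\|. \]
This implies that
\[ \frac{\lambda_0}{\lambda} \|\lambda x + (1 - \lambda) y\| \ge 1 - \left( 1 - \frac{\lambda_0}{\lambda} \right) = \frac{\lambda_0}{\lambda}, \]
that is, $\|\lambda x + (1 - \lambda) y\| \ge 1$.
Similarly, for $0 < \lambda < \lambda_0$, we can have $\|\lambda x + (1 - \lambda) y\| \ge 1$. So for the particular value $\lambda = \frac{1}{2}$, we have
$\frac{1}{2} \|x + y\| \ge 1$, that is, $\|x + y\| \ge 2$, a contradiction of the condition of strict convexity.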
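\[ \left\| \frac{1}{2} \cdot \frac{x - z}{\|x - z\|} + \frac{1}{2} \cdot \frac{z - y}{\|z - y\|} \right\| < 1. \]
Hence,
\[ \left\| \frac{x - z}{\|x - z\|} + \frac{z - y}{\|z - y\|} \right\| = 2. \]
Therefore,
\[ \frac{x - z}{\|x - z\|} = \frac{z - y}{\|z - y\|}, \quad \text{by (b)} \]
and this yields
\[ z = \frac{\|z - y\|}{\|x - z\| + \|z - y\|} \cdot x + \frac{\|x - z\|}{\|x - z\| + \|z - y\|} \cdot y. \]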
(b) The assertion (c) in Proposition 4.1.1 says that any three points x, y, z ∈ X satisfying
$\|x - y\| = \|x - z\| + \|z - y\|$ must lie on a line; specifically,
if $\|x - z\| = r_1$, $\|y - z\| = r_2$
and $\|x - y\| = r = r_1 + r_2$, then $z = \frac{r_2}{r} x + \frac{r_1}{r} y$.
\[ \|x\|_2 = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}, \quad x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n. \]
Then X is not strictly convex. To see this, let x = (1, 0, 0, . . . , 0) ∈ Rn and y = (0, 1, 0, . . . , 0) ∈
Rn . Then x 6= y, kxk1 = 1 = kyk1, but kx + yk1 = 2.
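The contrast between the two norms on $\mathbb{R}^n$ can be reproduced with the same pair of vectors. The sketch below is a minimal illustration for n = 3 (the dimension and helper names are assumptions): both vectors have norm 1 in either norm, but only the 1-norm gives ‖x + y‖ = 2.

```python
import math

def norm1(v):
    """The 1-norm: sum of absolute values of the coordinates."""
    return sum(abs(t) for t in v)

def norm2(v):
    """The Euclidean norm."""
    return math.sqrt(sum(t * t for t in v))

x = [1.0, 0.0, 0.0]
y = [0.0, 1.0, 0.0]
s = [a + b for a, b in zip(x, y)]

# Both vectors lie on the unit sphere of each norm.
sum_norm1 = norm1(s)  # equals 2: the midpoint (x + y)/2 stays on the unit sphere
sum_norm2 = norm2(s)  # sqrt(2) < 2: the midpoint lies strictly inside the unit ball
print(sum_norm1, sum_norm2)
```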
Example 4.1.4. The space C[a, b] of all real-valued continuous functions defined on [a, b]
with the norm $\|f\| = \sup_{a \le t \le b} |f(t)|$ is not strictly convex. Indeed, choose two functions f and
g defined as follows:
\[ f(t) = 1, \ \text{for all } t \in [a, b] \quad \text{and} \quad g(t) = \frac{b - t}{b - a}, \ \text{for all } t \in [a, b]. \]
Proof. Let X be a strictly convex normed space and f ∈ X ∗ . Suppose there exist two
distinct points x, y ∈ X with kxk = kyk = 1 such that f (x) = f (y) = kf k∗. If λ ∈ (0, 1),
then
kf k∗ = λf (x) + (1 − λ)f (y) (since f (x) = f (y) = kf k∗ )
= f (λx + (1 − λ)y) (because f is linear)
≤ kf k∗kλx + (1 − λ)yk
< kf k∗, (since kλx + (1 − λ)yk < 1)
which is a contradiction. Therefore, there exists at most one point x in X with kxk = 1 such
that f (x) = kf k∗.
Proposition 4.1.3. A normed space X is strictly convex if and only if the functional
h(x) := kxk2 is strictly convex, that is,
kλx + (1 − λ)yk2 < λkxk2 + (1 − λ)kyk2, for all x, y ∈ X, x 6= y and λ ∈ (0, 1).
Proof. Suppose that X is strictly convex. Let x, y ∈ X, λ ∈ (0, 1). Then we have
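\[
\begin{aligned}
\|\lambda x + (1 - \lambda) y\|^2 &\le (\lambda \|x\| + (1 - \lambda) \|y\|)^2 && (4.2) \\
&= \lambda^2 \|x\|^2 + 2\lambda(1 - \lambda) \|x\|\, \|y\| + (1 - \lambda)^2 \|y\|^2 \\
&\le \lambda^2 \|x\|^2 + \lambda(1 - \lambda) \left( \|x\|^2 + \|y\|^2 \right) + (1 - \lambda)^2 \|y\|^2 && (4.3) \\
&= \lambda \|x\|^2 + (1 - \lambda) \|y\|^2. && (4.4)
\end{aligned}
\]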
Hence h is convex.
Now we show that the equality can not hold. Assume that there are x, y ∈ X, x 6= y with
Conversely, assume that the functional h(x) := kxk2 is strictly convex. Let x, y ∈ X be such
that x 6= y, kxk = kyk = 1 with kλx + (1 − λ)yk = 1 for some λ ∈ (0, 1). Then
Exercise 4.1.2. Let X be a normed space. Prove that X is strictly convex if and only if
for every 1 < p < ∞,
kλx + (1 − λ)ykp < λkxkp + (1 − λ)kykp, for all x, y ∈ X, x 6= y and λ ∈ (0, 1).
Proof. Suppose that X is strictly convex, and let x, y ∈ X with x 6= y. Then by strict
convexity of X and hence by strict convexity of k · k, we have
kλx + (1 − λ)ykp < (λkxk + (1 − λ)kyk)p , for all λ ∈ (0, 1). (4.5)
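Assume that $\|x\| \neq \|y\|$, and consider the function $\lambda \mapsto \lambda^p$ for $1 < p < \infty$. Then it is a
strictly convex function on $[0, \infty)$ and
\[ \left( \frac{a + b}{2} \right)^p < \frac{a^p + b^p}{2}, \quad \text{for all } a, b \ge 0 \text{ and } a \neq b. \]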
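\[
\begin{aligned}
\|\lambda x + (1 - \lambda) y\|^p &= \left\| 2\lambda\, \frac{x + y}{2} + (1 - 2\lambda) y \right\|^p \quad \text{(after adding and subtracting } \lambda y) \\
&< \left( 2\lambda \left\| \frac{x + y}{2} \right\| + (1 - 2\lambda) \|y\| \right)^p \\
&< 2\lambda \left\| \frac{x + y}{2} \right\|^p + (1 - 2\lambda) \|y\|^p \\
&< 2\lambda\, \frac{1}{2} \left( \|x\|^p + \|y\|^p \right) + (1 - 2\lambda) \|y\|^p \quad \text{(by (4.6))} \\
&= \lambda \|x\|^p + (1 - \lambda) \|y\|^p.
\end{aligned}
\]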
Proposition 4.1.4. Let X be a normed space. Then X is strictly convex if and only
if for any two linearly independent elements x, y ∈ X, $\|x + y\| < \|x\| + \|y\|$. In other
words, X is strictly convex if and only if, whenever $\|x + y\| = \|x\| + \|y\|$ for $0 \neq x \in X$ and y ∈ X,
there exists λ ≥ 0 such that y = λx.
Proof. Suppose that X is not strictly convex. Then there exist x and y in X such that
kxk = kyk = 1, x 6= y and kx + yk = 2. By hypothesis, for any two linearly independent
elements x, y ∈ X, kx + yk < kxk + kyk. Since kx + yk = kxk + kyk, x and y are linearly
dependent. Then, x = αy for some α ∈ R, and therefore, kxk = |α| kyk for some α ∈ R
which implies that |α| = 1 because kxk = kyk = 1. If α = 1, then x = y, contradicting that
x 6= y. So we have α = −1, and therefore,
2 = kx + yk = k − y + yk = 0.
This is a contradiction.
Conversely, suppose that X is a strictly convex space and there exist linearly independent
elements x and y in X such that kx + yk = kxk + kyk. Without loss of generality, we may
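\[
\begin{aligned}
2 &> \left\| \frac{x}{\|x\|} + \frac{y}{\|y\|} \right\| \quad \left( \text{because } \left\| \frac{x}{\|x\|} \right\| = \left\| \frac{y}{\|y\|} \right\| = 1 \text{ and } X \text{ is strictly convex} \right) \\
&= \frac{1}{\|x\|\, \|y\|} \bigl\| x \|y\| + y \|x\| \bigr\| \\
&= \frac{1}{\|x\|\, \|y\|} \bigl\| x \|y\| + y \|y\| - y \|y\| + y \|x\| \bigr\| \\
&= \frac{1}{\|x\|\, \|y\|} \bigl\| \|y\| (x + y) - (\|y\| - \|x\|) y \bigr\| \\
&\ge \frac{1}{\|x\|\, \|y\|} \bigl[ \|y\|\, \|x + y\| - (\|y\| - \|x\|) \|y\| \bigr] \\
&= \frac{1}{\|x\|\, \|y\|} \bigl[ \|y\| (\|x\| + \|y\|) - (\|y\| - \|x\|) \|y\| \bigr] = 2 \\
&\qquad (\text{because } \|x + y\| = \|x\| + \|y\|).
\end{aligned}
\]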
This is a contradiction.
We now present the existence and uniqueness of elements of minimal norm in convex subsets
of strictly convex normed spaces.
Proposition 4.1.5. Let X be a strictly convex normed space and C be a nonempty con-
vex subset of X. Then there is at most one point x ∈ C such that kxk = inf {kzk : z ∈ C}.
Proof. Let $d := \inf \{\|z\| : z \in C\}$. Then there exists a sequence $\{x_n\}$ in C such that
$\lim_{n \to \infty} \|x_n\| = d$. Since X is reflexive, by Theorem 6.0.8, there exists a subsequence $\{x_{n_i}\}$
of $\{x_n\}$ that converges weakly to an element x in C. The weak lower semicontinuity of the
norm gives
\[ \|x\| \le \lim_{n \to \infty} \|x_n\| = d. \]
Proof. Let x ∈ C. Since C is a nonempty closed convex subset of the Banach space X,
D = C − x := {y − x : y ∈ C} is a nonempty closed convex subset of X. By Proposition
4.1.6, there exists a unique point ux ∈ D such that kux k = inf{ky −xk : y ∈ C}. For ux ∈ D,
there exists a point zx ∈ C such that ux = zx − x. Hence, there exists a unique point zx ∈ C
such that kzx − xk = d(x, C).
In order to measure the degree of strict convexity of X, we define its modulus of convexity.
Definition 4.1.3. Let X be a normed space. A function $\delta_X : [0, 2] \to [0, 1]$ defined by
\[ \delta_X(\varepsilon) = \inf \left\{ 1 - \frac{\|x + y\|}{2} : \|x\| \le 1,\ \|y\| \le 1,\ \|x - y\| \ge \varepsilon \right\} \]
is called the modulus of convexity of X.
Roughly speaking, $\delta_X$ measures how deeply the midpoint of the line segment joining points
in the sphere $S_X$ of X must lie within the unit ball.
The notion of the modulus of convexity was introduced by Clarkson in 1936¹. It allows us
to measure the convexity and rotundity of the unit ball of a normed space.
Remark 4.1.2. (a) It is easy to see that $\delta_X(0) = 0$ and $\delta_X(\varepsilon) \ge 0$ for all ε ≥ 0.
(b) The function $\delta_X$ is nondecreasing on [0, 2], that is, if $\varepsilon_1 \le \varepsilon_2$, then $\delta_X(\varepsilon_1) \le \delta_X(\varepsilon_2)$.
(c) The function $\delta_X$ is continuous on [0, 2), but not necessarily continuous at ε = 2.
(d) The modulus of convexity of an inner product space H is
\[ \delta_H(\varepsilon) = 1 - \sqrt{1 - \frac{\varepsilon^2}{4}}. \]
¹ J.A. Clarkson: Uniformly convex spaces, Trans. Amer. Math. Soc., 40 (1936), 396–414.
(f) δX (ε) ≤ δH (ε) for any normed space X and any inner product space H. That is, an
inner product space is the most convex normed space.
Remark 4.1.3. We note that for any ε > 0, the number δX (ε) is the largest number for
which the following implication always holds: For any x, y ∈ X,
\[ \|x\| \le 1,\ \|y\| \le 1,\ \|x - y\| \ge \varepsilon \ \Rightarrow\ \left\| \frac{x + y}{2} \right\| \le 1 - \delta_X(\varepsilon). \tag{4.7} \]
Example 4.1.5. Let $X = \mathbb{R}^2$ be a normed space equipped with one of the following norms:
\[ \|(x_1, x_2)\| = |x_1| + |x_2| \quad \text{or} \quad \|(x_1, x_2)\| = \max \{|x_1|, |x_2|\}, \]
for all $(x_1, x_2) \in X$. Then, $\delta_X(\varepsilon) = 0$ for all ε ∈ [0, 2].
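The modulus of convexity can be estimated numerically in $\mathbb{R}^2$. The sketch below is a rough illustration under stated assumptions: it samples pairs of points only on the unit sphere (on a one-degree angular grid), so the value it returns is an upper bound for $\delta_X(\varepsilon)$ rather than the infimum itself. It compares the Euclidean norm with the formula $1 - \sqrt{1 - \varepsilon^2/4}$ for inner product spaces and with the norm $|x_1| + |x_2|$; all function names are hypothetical.

```python
import math

def norm2(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])

def norm1(v):
    """The norm |x1| + |x2| on R^2."""
    return abs(v[0]) + abs(v[1])

def unit_sphere_points(norm, count):
    """Points of the unit sphere of `norm` in R^2, obtained by radial scaling."""
    points = []
    for k in range(count):
        angle = 2.0 * math.pi * k / count
        v = (math.cos(angle), math.sin(angle))
        scale = norm(v)
        points.append((v[0] / scale, v[1] / scale))
    return points

def modulus_estimate(norm, eps, count=360):
    """Grid estimate of inf{1 - ||x + y||/2 : ||x|| = ||y|| = 1, ||x - y|| >= eps}."""
    pts = unit_sphere_points(norm, count)
    best = 1.0
    for x in pts:
        for y in pts:
            if norm((x[0] - y[0], x[1] - y[1])) >= eps:
                best = min(best, 1.0 - norm((x[0] + y[0], x[1] + y[1])) / 2.0)
    return best

eps = 1.0
euclid = modulus_estimate(norm2, eps)
exact = 1.0 - math.sqrt(1.0 - eps * eps / 4.0)
taxicab = modulus_estimate(norm1, eps)
print(euclid, exact, taxicab)
```

For the Euclidean norm the estimate stays close to the closed-form value, while for the norm $|x_1| + |x_2|$ it is numerically zero, in line with the example above.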
Example 4.1.6. Let X = R2 be a normed space equipped with the following norm:
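\[ \|(x_1, x_2)\| = \max \left\{ |x_2|,\ \left| x_1 + \frac{x_2}{\sqrt{3}} \right|,\ \left| x_1 - \frac{x_2}{\sqrt{3}} \right| \right\}, \quad \text{for all } (x_1, x_2) \in X. \]
Then the unit sphere is a regular hexagon and
\[ \lim_{\varepsilon \to 2} \delta_X(\varepsilon) = \delta_X(2) = \frac{1}{2}. \]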
We now give some important properties of the modulus of convexity of normed spaces.
Theorem 4.1.1. A normed space X is strictly convex if and only if δX (2) = 1.
Proof. Let X be a strictly convex normed space with modulus of convexity δX (ε). Suppose
kxk = kyk = 1 and kx − yk = 2 with x 6= −y. By strict convexity of X, we have
\[ 1 = \left\| \frac{x - y}{2} \right\| = \left\| \frac{x + (-y)}{2} \right\| < 1, \]
a contradiction. Hence x = −y. Therefore, δX (2) = 1.
Conversely, suppose δX (2) = 1. Let x, y ∈ X such that kxk = kyk = k(x + y)/2k = 1, that
is, kx + yk = 2 or kx − (−y)k = 2. Then
\[ \left\| \frac{x - y}{2} \right\| = \left\| \frac{x + (-y)}{2} \right\| \le 1 - \delta_X(2) = 0, \]
which implies that x = y. Thus, kxk = kyk and kx + yk = 2 = kxk + kyk imply that x = y.
Therefore, X is strictly convex.
The strict convexity of a normed space X says that the midpoint (x + y)/2 of the segment
joining two distinct points x, y ∈ SX with kx − yk ≥ ε > 0 does not lie on SX , that is,
\[ \left\| \frac{x + y}{2} \right\| < 1. \]
In such spaces, we have no information about 1 − k(x + y)/2k, the distance of midpoints
from the unit sphere SX . A stronger property than the strict convexity which provides
information about the distance 1 − k(x + y)/2k is uniform convexity.
Definition 4.2.1. A normed space X is said to be uniformly convex if for any ε, 0 < ε ≤ 2,
there exists a δ = δ(ε) > 0 such that the inequalities $\|x\| \le 1$, $\|y\| \le 1$ and $\|x - y\| \ge \varepsilon$
imply $\|(x + y)/2\| \le 1 - \delta$.
This says that if x and y are in the closed unit ball BX := {x ∈ X : kxk ≤ 1} with
kx − yk ≥ ε > 0, the midpoint of x and y lies inside the unit ball BX at a distance of at
least δ from the unit sphere SX .
Roughly speaking, if two points on the unit sphere of a uniformly convex space are far apart,
then their midpoint must be well within it.
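Example 4.2.1. Every Hilbert space H is a uniformly convex space. In fact, for x, y ∈ H with
$\|x\| \le 1$, $\|y\| \le 1$ and $\|x - y\| \ge \varepsilon$, the parallelogram law gives us
\[ \|x + y\|^2 = 2\|x\|^2 + 2\|y\|^2 - \|x - y\|^2 \le 4 - \varepsilon^2. \]
Therefore,
\[ \|(x + y)/2\| \le 1 - \delta(\varepsilon), \]
where $\delta(\varepsilon) = 1 - \sqrt{1 - \varepsilon^2/4}$. Thus, H is uniformly convex.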
Example 4.2.2. The spaces $\ell_1$ and $\ell_\infty$ are not uniformly convex. To see it, take x =
(1, 0, 0, 0, . . .), y = (0, −1, 0, 0, . . .) ∈ $\ell_1$ and ε = 1. Then $\|x\|_1 = \|y\|_1 = 1$ and $\|x - y\|_1 = 2 > \varepsilon$.
However, $\|(x + y)/2\|_1 = 1$ and there is no δ > 0 such that $\|(x + y)/2\|_1 \le 1 - \delta$. Thus, $\ell_1$
is not uniformly convex.
and the two norms are equivalent with k · kµ near k · k0 for small µ. However (C[0, 1], k · k0 )
is not strictly convex while for any µ > 0, (C[0, 1], k · kµ ) is. On the other hand, it is easy
to see that for any ε ∈ (0, 2), there exist functions x, y ∈ C[0, 1] with $\|x\|_\mu = \|y\|_\mu = 1$,
$\|x - y\| = \varepsilon$ and $\|(x + y)/2\|$ arbitrarily near 1. Thus, $(C[0, 1], \|\cdot\|_\mu)$ is not uniformly convex.
Exercise 4.2.2. Show that the normed spaces ℓp , ℓnp (whenever n is a nonnegative integer),
and Lp [a, b] with 1 < p < ∞ are uniformly convex.
Exercise 4.2.3. Show that the normed spaces $\ell_1$, c, $\ell_\infty$, $L_1[a, b]$, C[a, b] and $L_\infty[a, b]$ are not
strictly convex.
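\[ \|x\|_\beta = \|x\|_{c_0} + \beta \left( \sum_{i=1}^{\infty} \left( \frac{x_i}{i} \right)^2 \right)^{1/2}, \quad x = \{x_i\} \in c_0. \]
The spaces $(c_0, \|\cdot\|_\beta)$ for β > 0 are strictly convex, but not uniformly convex, while $c_0$ with
its usual norm $\|x\|_\infty = \sup_{i \in \mathbb{N}} |x_i|$ is not strictly convex.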
Remark 4.2.2. The strict convexity and uniform convexity are equivalent in finite dimen-
sional spaces.
Theorem 4.2.2. Let X be a normed space. Then X is uniformly convex if and only if
for two sequences {xn } and {yn } in X,
Proof. Let X be uniformly convex. Assume that {xn } and {yn } are two sequences in X
such that $\|x_n\| \le 1$, $\|y_n\| \le 1$ for all n ∈ N and $\lim_{n \to \infty} \|x_n + y_n\| = 2$. Suppose, to the contrary, that
$\lim_{n \to \infty} \|x_n - y_n\| \neq 0$. Then for some ε > 0, there exists a subsequence $\{n_i\}$ of $\{n\}$ such that
\[ \|x_{n_i} - y_{n_i}\| \ge \varepsilon. \]
2 ≤ 2(1 − δ(ε)),
a contradiction.
Conversely, assume that the condition (4.8) is satisfied. If X is not uniformly convex, then
for ε > 0, there is no δ(ε) such that
Clearly $\|x_n - y_n\| \ge \varepsilon$, which contradicts the hypothesis, since (ii) gives $\lim_{n \to \infty} \|x_n + y_n\| = 2$.
Thus, X must be uniformly convex.
Theorem 4.2.3. A normed space X is uniformly convex if and only if δX (ε) > 0 for all
ε ∈ (0, 2].
Proof. Let X be a uniformly convex normed space. Then for ε > 0, there exists δ(ε) > 0
such that $\left\| \frac{x + y}{2} \right\| \le 1 - \delta(\varepsilon)$, that is,
\[ 0 < \delta(\varepsilon) \le 1 - \left\| \frac{x + y}{2} \right\| \]
for all x, y ∈ X with $\|x\| \le 1$, $\|y\| \le 1$ and $\|x - y\| \ge \varepsilon$. Therefore, from the definition of
modulus of convexity, we have $\delta_X(\varepsilon) > 0$.
Conversely, suppose that X is a normed space with modulus of convexity δX such that
δX (ε) > 0 for all ε ∈ (0, 2]. Let x, y ∈ X such that kxk = 1, kyk = 1 with kx − yk ≥ ε for
fixed ε ∈ (0, 2]. By the definition of modulus of convexity δX (ε), we have
\[ 0 < \delta_X(\varepsilon) \le 1 - \left\| \frac{x + y}{2} \right\|. \]
It follows that
\[ \left\| \frac{x + y}{2} \right\| \le 1 - \delta_X(\varepsilon), \]
which is independent of x and y. Therefore, X is uniformly convex.
Theorem 4.2.4. Let {xn } be a sequence in a uniformly convex Banach space X. Then,
xn ⇀ x, kxn k → kxk ⇒ xn → x.
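Proof. If x = 0, then it is obvious that $x_n \to 0$. So, let $x \neq 0$. Put $y_n = \frac{x_n}{\|x_n\|}$ for n large
enough, and $y = \frac{x}{\|x\|}$. By construction, $\|y_n\| = \|y\| = 1$, $y_n \rightharpoonup y$, and thus $y_n + y \rightharpoonup 2y$.
Suppose that $x_n \not\to x$. Then, $y_n \not\to y$. This implies that there exist ε > 0 and a subsequence
$\{y_{n_k}\}$ of $\{y_n\}$ such that $\|y_{n_k} - y\| \ge \varepsilon$. Since X is uniformly convex, there exists $\delta_X(\varepsilon) > 0$
such that
\[ \left\| \frac{y_{n_k} + y}{2} \right\| \le 1 - \delta_X(\varepsilon). \]
Since $y_{n_k} \rightharpoonup y$, without loss of generality, we have
\[ \|y\| \le \liminf_{k \to \infty} \left\| \frac{y_{n_k} + y}{2} \right\| \le 1 - \delta_X(\varepsilon), \]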
For the class of uniformly convex Banach spaces, we have the following important results.
and f (xn ) → 1, yield a contradiction. Hence {xn } is a Cauchy sequence and there exists a
point x in X such that xn → x. Clearly x ∈ SX . In fact,
\[ \|x\| = \left\| \lim_{n \to \infty} x_n \right\| = \lim_{n \to \infty} \|x_n\| = 1. \]
Using James Theorem 6.0.3 (which states that a Banach space is reflexive if and only if for
each f ∈ SX ∗ , there exists x ∈ SX such that f (x) = 1), we conclude that X is reflexive.
Remark 4.2.3. Every finite-dimensional Banach space is reflexive, but it need not be uniformly convex. For example, $X = \mathbb{R}^n$, n ≥ 2, with the norm $\|x\|_1 = \sum_{i=1}^{n} |x_i|$ is not uniformly
convex. However, it is a finite-dimensional space.
Combining Proposition 4.1.6 and Theorems 4.2.1 and 4.2.5, we obtain the following inter-
esting result.
Theorem 4.2.6. Let C be a nonempty closed convex subset of a uniformly convex Ba-
nach space X. Then C has a unique element of minimum norm, that is, there exists a
unique element x ∈ C such that kxk = inf {kzk : z ∈ C}.
Theorem 4.2.7 (Intersection Theorem). Let $\{C_n\}_{n=1}^{\infty}$ be a decreasing sequence of nonempty
bounded closed convex subsets of a uniformly convex Banach space X. Then, the intersection
$\bigcap_{n=1}^{\infty} C_n$ is a nonempty closed convex subset of X.
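Proof. Let x be a point in X which does not belong to $C_1$, $r_n = d(x, C_n)$ and $r = \lim_{n \to \infty} r_n$.
Also, let $\{q_n\}$ be a sequence of positive numbers that decreases to zero, $D_n = \{y \in C_n : \|x - y\| \le r + q_n\}$, and $d_n$ the diameter of $D_n$. If y and z belong to $D_n$ and $\|y - z\| \ge d_n - q_n$,
then
\[ \left\| x - \frac{y + z}{2} \right\| \le \left( 1 - \delta\!\left( \frac{\|y - z\|}{r + q_n} \right) \right) (r + q_n), \]
and
\[ r_n \le \left( 1 - \delta\!\left( \frac{d_n - q_n}{r + q_n} \right) \right) (r + q_n). \]
Let $\lim_{n \to \infty} d_n = d$, then we obtain a contradiction unless d = 0. This in turn implies that
$\bigcap_{n=1}^{\infty} D_n \neq \emptyset$, and so $\bigcap_{n=1}^{\infty} C_n \neq \emptyset$.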
Remark 4.2.4. Theorem 4.2.7 remains valid if the sequence $\{C_n\}_{n=1}^{\infty}$ is replaced by an
arbitrary decreasing net of nonempty bounded closed convex sets. However, Theorem 4.2.7
does not hold in arbitrary Banach spaces. For example, consider the space X = C[0, 1] and
\[ C_n = \{x \in C[0, 1] : 0 \le x(t) \le t^n \text{ for all } 0 \le t \le 1 \text{ and } x(1) = 1\}. \]
Before defining the duality mapping and giving its fundamental properties, we mention the
following notations and definitions:
equivalently,
J(x) = {j ∈ X ∗ : hx, ji = kxk kjk and kxk = kjk} .
Example 4.3.1. In a real Hilbert space H, the normalized duality mapping is the identity
mapping. Indeed, let x ∈ H with x 6= 0. Since H = H ∗ and hx, xi = kxk · kxk, we have
x ∈ J(x). Assume that y ∈ J(x). By the definition of J, we have hx, yi = kxkkyk and
kxk = kyk. Since
kx − yk2 = kxk2 + kyk2 − 2hx, yi,
it follows that x = y. Therefore, J(x) = {x}.
The following theorem presents some fundamental properties of duality mappings in Banach
spaces.
(b) For each x ∈ X, J(x) is nonempty closed convex and bounded subset of X ∗ .
(c) J(λx) = λJ(x) for all x ∈ X and real λ, that is, J is homogeneous.
(h) If X is reflexive with strictly convex dual X∗, then J is demicontinuous, that is,
$x_n \to x$ in X implies $J(x_n) \rightharpoonup J(x)$.
(b) Let x ∈ X. If x = 0, then it is done by Part (a). So, we assume that x 6= 0. Then,
by the Hahn-Banach Theorem, there exists f ∈ X ∗ such that hx, f i = kxk and kf k∗ = 1.
Set j := kxkf . Then hx, ji = kxkhx, f i = kxk2 and kjk∗ = kxk, and it follows that J(x) is
nonempty for each x 6= 0. So, we can assume that f1 , f2 ∈ J(x). Then, we have
and
hx, f2 i = kxkkf2 k∗ , kxk = kf2 k∗ ,
and therefore, for t ∈ (0, 1), we have
Since
we have
kxk2 ≤ kxkktf1 + (1 − t)f2 k∗ ≤ kxk2 ,
which gives us
kxk2 = kxkktf1 + (1 − t)f2 k∗ ,
that is,
ktf1 + (1 − t)f2 k∗ = kxk.
Therefore,
hx, tf1 + (1 − t)f2 i = kxk ktf1 + (1 − t)f2 k∗ and kxk = ktf1 + (1 − t)f2 k∗ ,
and thus, tf1 + (1 − t)f2 ∈ J(x) for all t ∈ (0, 1), that is, J(x) is a convex set.
(c) For λ = 0, it is obvious that J(0x) = 0J(x). Assume that j ∈ J(λx) for λ 6= 0. We first
show that J(λx) ⊆ λJ(x). Since j ∈ J(λx), we have
This shows that λ−1 j ∈ J(x), that is, j ∈ λJ(x). Therefore, J(λx) ⊆ λJ(x). Similarly, we
can show that λJ(x) ⊆ J(λx). Thus, J(λx) = λJ(x).
and
hx, j2 i = kj2 k2∗ = kxk2 .
Adding the above identities, we obtain
hx, j1 + j2 i = 2kxk2 .
Since X ∗ is strictly convex and kj1 + j2 k∗ = kj1 k∗ + kj2 k∗ , there exists λ ∈ R such that
j1 = λj2 . Since
hx, j2 i = hx, j1 i = hx, λj2 i = λhx, j2 i,
this implies that λ = 1, and hence, j1 = j2 . Therefore, J is single-valued.
(g) Suppose that j ∈ J(x) ∩ J(y) for x, y ∈ X. Since j ∈ J(x) and j ∈ J(y), it follows from
kjk2∗ = kxk2 = kyk2 = hx, ji = hy, ji that
(h) It is sufficient to prove the demicontinuity of J on the unit sphere SX . For this, let {xn }
be a sequence in SX such that xn → z in X. Then kJ(xn )k∗ = kxn k = 1 for all n ∈ N,
that is, {J(xn )} is bounded. Since X is reflexive, so is X ∗ . Then, there exists a subsequence
{J(xnk )} of {J(xn )} in X ∗ such that {J(xnk )} converges weakly to some j in X ∗ . Since
xnk → z and J(xnk ) ⇀ j, we have
Moreover,
(because z ∈ SX and so kzk = 1, also hz, ji = 1 and so kjk = 1). This implies that j = J(z).
Thus, every weakly convergent subsequence of $\{J(x_n)\}$ converges weakly to J(z). This gives $J(x_n) \rightharpoonup J(z)$.
Therefore, J is demicontinuous.
(b) kx + yk2 ≤ kyk2 + 2hx, jx+y i, for all x, y ∈ X, where jx+y ∈ J(x + y).
Proof. (a) ⇒ (b). For t > 0, let $f_t \in J(x + ty)$. Then $\langle x + ty, f_t \rangle = \|x + ty\|\, \|f_t\|_*$. Define
$g_t = \frac{f_t}{\|f_t\|_*}$. Then $\|g_t\|_* = 1$. Since $g_t \in \|f_t\|_*^{-1} J(x + ty)$, we have
By the Banach-Alaoglu Theorem 6.0.4 (which states that the unit ball in X ∗ is weak*-
compact), the net {gt } has a limit point g ∈ X ∗ such that
Observe that
kxk ≤ hx, gi ≤ kxkkgk∗ = kxk,
which gives that
hx, gi = kxk and kgk∗ = 1.
Set j = gkxk, then j ∈ J(x) and hy, ji ≥ 0.
(b) ⇒ (a). Assume that for x, y ∈ X with x 6= 0, there exists j ∈ J(x) such that hy, ji ≥ 0.
Then for t > 0,
Proof. We first show that J(x) ⊆ ∂ (kxk2 /2). Let x 6= 0 and j ∈ J(x). Then for y ∈ X, we
have
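\[
\begin{aligned}
\frac{\|y\|^2}{2} - \frac{\|x\|^2}{2} - \langle y - x, j \rangle
&= \frac{\|y\|^2}{2} - \frac{\|x\|^2}{2} - \langle y, j \rangle + \langle x, j \rangle \\
&\ge \frac{\|y\|^2}{2} - \frac{\|x\|^2}{2} - \|y\|\, \|j\|_* + \|x\|\, \|j\|_* \\
&\qquad (\text{because } \langle y, j \rangle \le \|y\|\, \|j\|_* \text{ and } \langle x, j \rangle = \|x\|\, \|j\|_*) \\
&= \frac{\|y\|^2}{2} - \frac{\|x\|^2}{2} - \|y\|\, \|x\| + \|x\|^2 \quad (\text{because } \|j\|_* = \|x\|) \\
&\ge \frac{\|x\|^2}{2} + \frac{\|y\|^2}{2} - \|x\|\, \|y\| \\
&= \frac{(\|x\| - \|y\|)^2}{2} \ge 0.
\end{aligned}
\]
It follows that
\[ \frac{\|x\|^2}{2} - \frac{\|y\|^2}{2} \le \langle x - y, j \rangle. \]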
Hence j ∈ ∂ (kxk2 /2). Thus, J(x) ⊆ ∂ (kxk2 /2) for all x 6= 0.
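We now prove $\partial \left( \|x\|^2/2 \right) \subseteq J(x)$ for all $x \neq 0$. Suppose $j \in \partial \left( \frac{\|x\|^2}{2} \right)$
for $0 \neq x \in X$. Then,
\[ \frac{\|x\|^2}{2} - \frac{\|y\|^2}{2} \le \langle x - y, j \rangle, \quad \text{for all } y \in X. \tag{4.11} \]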
Observe that
Thus,
To see j ∈ J(x), we show that kjk∗ = kxk. For t > 1, we take y = tx ∈ X in (4.11), then
we obtain
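\[ \frac{\|x\|^2}{2} - \frac{t^2 \|x\|^2}{2} \le \langle x - tx, j \rangle, \]
that is,
\[ \frac{(1 - t^2)}{2} \|x\|^2 \le (1 - t) \langle x, j \rangle, \]
which implies that
\[ \langle x, j \rangle \le (t + 1) \frac{\|x\|^2}{2}. \]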
Letting t → 1, we get
From (4.12), (4.13) and (4.14), we obtain kjk∗ = kxk. Thus, ∂ (kxk2 /2) ⊆ J(x). Therefore,
J(x) = ∂ (kxk2 /2) for all x 6= 0.
Let $C$ be a nonempty closed convex subset of a normed space $X$ such that the origin belongs
to the interior of $C$. A linear functional $j \in X^*$ is said to be tangent to $C$ at the point
$x_0 \in \partial C$ if $j(x_0) = \sup\{j(x) : x \in C\}$, where $\partial C$ denotes the boundary of $C$. If
$H = \{x \in X : j(x) = 0\}$ is the hyperplane, then the set $H + x_0$ is called a tangent hyperplane to $C$ at
$x_0$.
Definition 4.4.1. A Banach space $X$ is said to be smooth if for each $x \in S_X$, there exists
a unique functional $j_x \in X^*$ such that $\langle x, j_x\rangle = \|x\|$ and $\|j_x\|_* = 1$.
In other words, $X$ is smooth if for all $x \in S_X$, there exists a unique $j_x \in S_{X^*}$ such that $\langle x, j_x\rangle = 1$.
Geometrically, the smoothness condition means that at each point $x$ of the unit sphere, there
is exactly one supporting hyperplane $\{j_x = 1\} := \{y \in X : \langle y, j_x\rangle = 1\}$. This means that
the hyperplane $\{j_x = 1\}$ is tangent at $x$ to the unit ball and this unit ball is contained in
the half space $\{j_x \le 1\} := \{y \in X : \langle y, j_x\rangle \le 1\}$.
Example 4.4.1. $\ell^p$, $L^p$ ($1 < p < \infty$) are smooth Banach spaces. However, $c_0$, $\ell^1$, $L^1$, $\ell^\infty$,
$L^\infty$ are not smooth.
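The failure of smoothness for the 1-norm is already visible in two dimensions. The sketch below assumes $X = \mathbb{R}^2$ with $\|x\|_1 = |x_1| + |x_2|$, whose dual norm is the sup-norm, and lists several distinct functionals of dual norm one that all support the unit ball at the same point $x = (1, 0)$; the candidate list is an illustrative choice.

```python
def norm_1(x):
    """The 1-norm on R^n."""
    return sum(abs(t) for t in x)

def norm_inf(j):
    """The sup-norm, which is the dual norm of the 1-norm on R^n."""
    return max(abs(t) for t in j)

def pairing(x, j):
    """Duality pairing <x, j> = j(x)."""
    return sum(a * b for a, b in zip(x, j))

x = (1.0, 0.0)  # a point of the unit sphere of (R^2, ||.||_1)

# Hypothetical candidates j = (1, s) with |s| <= 1.
candidates = [(1.0, s) for s in (-1.0, -0.5, 0.0, 0.5, 1.0)]

# Keep those with ||j||_* = 1 and <x, j> = ||x|| = 1.
supporting = [j for j in candidates if norm_inf(j) == 1.0 and pairing(x, j) == 1.0]
print(supporting)
```

Every candidate passes both tests, so the functional $j_x$ of Definition 4.4.1 is not unique at this point, which is exactly what non-smoothness means.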
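Theorem 4.4.1. Let $X$ be a Banach space. Then the following assertions hold.
(a) If $X^*$ is strictly convex, then $X$ is smooth.
(b) If $X^*$ is smooth, then $X$ is strictly convex.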
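Proof. (a) Assume that $X$ is not smooth. Then there exist $x_0 \in S_X$ and $j_1, j_2 \in S_{X^*}$
with $j_1 \neq j_2$ such that $\langle x_0, j_1\rangle = \langle x_0, j_2\rangle = 1$. Since $\|j_1 + j_2\|_* \le \|j_1\|_* + \|j_2\|_* = 2$ and
$\langle x_0, j_1 + j_2\rangle = \langle x_0, j_1\rangle + \langle x_0, j_2\rangle = 2$, we have $(j_1 + j_2)/2 \in S_{X^*}$. Hence $X^*$ is not strictly
convex.
(b) Suppose that $X$ is not strictly convex. Then there exist $x, y \in S_X$ with $x \neq y$ such that
$\|x + y\| = 2$. Take $j \in S_{X^*}$ with $\left\langle \frac{x + y}{2}, j \right\rangle = 1$. Then, we have
$$1 = \left\langle \frac{x + y}{2}, j \right\rangle = \frac{1}{2}\langle x, j\rangle + \frac{1}{2}\langle y, j\rangle \le \frac{1}{2} + \frac{1}{2},$$
and hence, $\langle x, j\rangle = \langle y, j\rangle = \|j\|_* = 1$. Since $x, y \in X \subseteq X^{**}$, we have $x, y \in J(j)$. Since
$x \neq y$, it follows that $X^*$ is not smooth.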
It is well known that for a reflexive Banach space $X$, the spaces $X$ and $X^*$ can be
equivalently renormed as strictly convex spaces such that the duality is preserved. By using
this fact, we have the following result.
Theorem 4.4.2. Let X be a reflexive Banach space. Then the following assertions hold.
Theorem 4.4.3. A Banach space X is smooth if and only if the norm is Gâteaux
differentiable on X\{0}.
Proof. Since the proper convex continuous functional ϕ is Gâteaux differentiable if and only
if it has a unique subgradient, we have
(a) X is smooth.
(b) J is single-valued.
Proof. We show that $x_n \to x$ implies $J(x_n) \to J(x)$ in the weak* topology. Let $x_n \to x$ and
set $f_n := J(x_n)$. Then
Since $\{x_n\}$ is bounded, $\{f_n\}$ is bounded in $X^*$. Then there exists a subsequence $\{f_{n_k}\}$ of
$\{f_n\}$ such that $f_{n_k} \to f \in X^*$ in the weak* topology. Then we show that $f = J(x)$. Since
the norm of $X^*$ is lower semicontinuous in the weak* topology, we have
Since $\langle x, f - f_{n_k}\rangle \to 0$ and $\langle x - x_{n_k}, f_{n_k}\rangle \to 0$, it follows from the fact
that
$$\langle x, f\rangle = \|x\|^2.$$
As a result
$$\|x\|^2 = \langle x, f\rangle \le \|f\|_*\,\|x\|.$$
Thus, we have $\langle x, f\rangle = \|x\|^2$ and $\|x\| = \|f\|_*$. Therefore, $f = J(x)$.
This defines a mapping $P_C$ from $X$ into $2^C$, called the metric projection onto $C$. The
metric projection mapping is also known as the nearest point projection mapping, proximity
mapping or best approximation operator.
The set $C$ is said to be a proximinal (respectively, Chebyshev) set if each $x \in X$ has at least
(respectively, exactly) one best approximation in $C$.
Remark 4.5.1. (a) $C$ is proximinal if $P_C(x) \neq \emptyset$ for all $x \in X$.
(b) $C$ is Chebyshev if $P_C(x)$ is a singleton for each $x \in X$.
(c) The set of best approximations is convex if $C$ is convex.
Proof. Suppose, to the contrary, that $C$ is not closed. Then there exists a sequence $\{x_n\}$ in $C$ such
that $x_n \to x$ and $x \notin C$, but $x \in X$. It follows that
$$d(x, C) \le \|x_n - x\| \to 0,$$
so that $d(x, C) = 0$. Since $x \notin C$, we have
$$\|x - y\| > 0 \quad \text{for all } y \in C.$$
This implies that $P_C(x) = \emptyset$, which contradicts $P_C(x) \neq \emptyset$.
Corollary 4.5.1. Let C be a nonempty closed convex subset of a reflexive Banach space
X. Then each element x ∈ X has a best approximation in C.
Proof. Assume that $y_1, y_2 \in C$ are both best approximations to $x \in X$. Since the set of
best approximations is convex, $(y_1 + y_2)/2$ is also a best approximation to $x$. Set $r := d(x, C)$.
Then
$$0 \le r = \|x - y_1\| = \|x - y_2\| = \|x - (y_1 + y_2)/2\|,$$
and so,
$$\|(x - y_1) + (x - y_2)\| = 2r = \|x - y_1\| + \|x - y_2\|.$$
By the strict convexity of $X$, we have
Taking the norm in this relation, we obtain $r = tr$, that is, $t = 1$, which gives us $y_1 = y_2$.
The following example shows that the strict convexity cannot be dropped in Theorem 4.5.2.
Example 4.5.1. Let $X = \mathbb{R}^2$ with norm $\|x\|_1 = |x_1| + |x_2|$ for all $x = (x_1, x_2) \in \mathbb{R}^2$. As we
have seen, $X$ is not strictly convex. Let
$$C = \{(x_1, x_2) \in \mathbb{R}^2 : \|(x_1, x_2)\|_1 \le 1\} = \{(x_1, x_2) \in \mathbb{R}^2 : |x_1| + |x_2| \le 1\}.$$
Then $C$ is a closed convex set. The distance from $z = (-1, -1)$ to the set $C$ is one and this
distance is realized by more than one point of $C$.
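The claim in Example 4.5.1 can be checked by direct computation. The sketch below evaluates the 1-norm distance from $z = (-1, -1)$ to points on the edge of $C$ joining $(-1, 0)$ and $(0, -1)$, and then searches a finite grid inside $C$; the grid resolution is an arbitrary illustrative choice, and the grid search is only a numerical check, not a proof.

```python
def norm_1(v):
    """The 1-norm on R^2."""
    return abs(v[0]) + abs(v[1])

z = (-1.0, -1.0)

def dist_from_z(c):
    """Distance ||z - c||_1 from z to the point c."""
    return norm_1((z[0] - c[0], z[1] - c[1]))

# Points c = (-s, -(1 - s)), 0 <= s <= 1, on the edge of C facing z.
edge = [(-k / 4, -(1 - k / 4)) for k in range(5)]
edge_dists = [dist_from_z(c) for c in edge]

# Grid points of C = {|x1| + |x2| <= 1}; membership is tested on integers.
n = 16
grid = [(i / n, k / n)
        for i in range(-n, n + 1)
        for k in range(-n, n + 1)
        if abs(i) + abs(k) <= n]
grid_min = min(dist_from_z(c) for c in grid)
nearest = [c for c in grid if dist_from_z(c) == grid_min]

print(edge_dists)
print(grid_min, len(nearest))
```

Every sampled point of that edge lies at the same minimal distance from $z$, so $P_C(z)$ contains a whole segment and $C$ is not a Chebyshev set in this norm.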
The following example shows that the uniqueness of best approximations in Theorem 4.5.2
need not be true for nonconvex sets.
Example 4.5.2. Let $X = \mathbb{R}^2$ with the norm $\|x\|_2 = (x_1^2 + x_2^2)^{1/2}$ for all $x = (x_1, x_2) \in \mathbb{R}^2$.
Let
$$C = S_X = \{(x_1, x_2) \in \mathbb{R}^2 : x_1^2 + x_2^2 = 1\}.$$
Then $X$ is strictly convex and $C$ is not convex. However, all points of $C$ are best
approximations to $(0, 0) \in X$.
Proof. Assume, to the contrary, that $X$ is not strictly convex. Then there exist $x, y \in X$, $x \neq y$,
such that
$$\|x\| = \|y\| = \|(x + y)/2\| = 1.$$
Furthermore,
$$\|tx + (1 - t)y\| = 1 \quad \text{for all } t \in [0, 1].$$
Set $C := \mathrm{co}(\{x, y\})$, the convex hull of the set $\{x, y\}$. Then $\|0 - z\| = d(0, C)$ for all $z \in C$.
It follows that every element of $C$ is a best approximation to zero, which contradicts the
uniqueness.
Theorem 4.5.4. Let $C$ be a nonempty weakly compact convex subset of a strictly convex
Banach space $X$. Then each $x \in X$ has a unique best approximation in $C$, that is,
$P_C(\cdot)$ is a single-valued metric projection mapping from $X$ onto $C$.
Corollary 4.5.2. Let $C$ be a nonempty closed convex subset of a strictly convex reflexive
Banach space $X$ and let $x \in X$. Then there exists a unique element $x_0 \in C$ such that
$\|x - x_0\| = d(x, C)$.
5
Every real-valued odd function is subodd. It can be seen that the function $f : \mathbb{R} \to \mathbb{R}$ defined
by $f(x) = x^2$ is subodd but it is neither odd nor subadditive.
Remark 5.0.1. (a) It can be easily seen that $f$ is subodd if and only if $f(x) + f(-x) \ge 0$
for all $x \in \mathbb{R}^n \setminus \{0\}$.
(b) If $f$ is sublinear and is not constant with value $-\infty$ such that $f(0) \ge 0$, then $f$ is
subodd.
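The statements about $f(x) = x^2$ can be tested on sample points using the characterization in Remark 5.0.1(a). This is a minimal sketch for $n = 1$; the helper names and the sample grid are illustrative assumptions, and a check on finitely many points can only refute a property, not prove it.

```python
def is_subodd_on(f, points):
    """Check f(x) + f(-x) >= 0 on the nonzero sample points (Remark 5.0.1(a))."""
    return all(f(x) + f(-x) >= 0 for x in points if x != 0)

def square(x):
    return x * x

def cube(x):
    return x ** 3

samples = [k / 4 for k in range(-20, 21)]

square_is_subodd = is_subodd_on(square, samples)
cube_is_subodd = is_subodd_on(cube, samples)  # an odd function, hence subodd

# Oddness would require f(-x) = -f(x) for every x.
square_is_odd = all(square(-x) == -square(x) for x in samples)

# Subadditivity would require f(x + y) <= f(x) + f(y) for every x, y.
square_is_subadditive = all(
    square(x + y) <= square(x) + square(y) for x in samples for y in samples
)

print(square_is_subodd, cube_is_subodd, square_is_odd, square_is_subadditive)
```

For instance, $x = y = 1$ gives $f(x + y) = 4 > 2 = f(x) + f(y)$, which is the kind of pair that defeats subadditivity in the last check.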
(b) The function $f$ is called proper if $f(x) < +\infty$ for at least one $x \in \mathbb{R}^n$ and $f(x) > -\infty$
for all $x \in \mathbb{R}^n$.
The epigraph (hypograph) is thus a subset of $\mathbb{R}^{n+1}$ that consists of all the points of $\mathbb{R}^{n+1}$
lying on or above (on or below) the graph of $f$. From the above definitions, we have
and
$$(x, \alpha) \in \mathrm{hyp}(f) \text{ if and only if } x \in U(f, \alpha).$$
(a) bounded above if there exists a real number $M$ such that $f(x) \le M$ for all $x \in \mathbb{R}^n$;
(b) bounded below if there exists a real number $m$ such that $f(x) \ge m$ for all $x \in \mathbb{R}^n$;
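If $f$ is differentiable, then
$$f(x + \lambda v) = f(x) + \lambda \langle \nabla f(x), v\rangle + o(\lambda) \quad \text{for all } x + \lambda v \in \mathbb{R}^n,$$
where $\lim_{\lambda \to 0} \dfrac{o(\lambda)}{\lambda} = 0$.
The gradient of $f$ at $x = (x_1, x_2, \ldots, x_n)$ is a vector in $\mathbb{R}^n$ given by
$$\nabla f(x) = \left( \frac{\partial f(x)}{\partial x_1}, \frac{\partial f(x)}{\partial x_2}, \ldots, \frac{\partial f(x)}{\partial x_n} \right).$$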
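The first-order expansion above can be observed numerically. The sketch below uses a hypothetical test function $f(x_1, x_2) = x_1^2 + 3x_1x_2 + \sin x_2$ (not taken from the notes), compares its hand-computed gradient with central finite differences, and evaluates the ratio $o(\lambda)/\lambda$ for decreasing $\lambda$.

```python
import math

def f(x):
    """Hypothetical test function f(x1, x2) = x1^2 + 3*x1*x2 + sin(x2)."""
    return x[0] ** 2 + 3 * x[0] * x[1] + math.sin(x[1])

def grad_f(x):
    """Gradient of f computed by hand: the vector of partial derivatives."""
    return [2 * x[0] + 3 * x[1], 3 * x[0] + math.cos(x[1])]

def numerical_gradient(func, x, h=1e-6):
    """Central finite-difference approximation of each partial derivative."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((func(xp) - func(xm)) / (2 * h))
    return grad

def remainder_ratio(func, grad, x, v, lam):
    """[f(x + lam*v) - f(x) - lam*<grad f(x), v>] / lam, i.e. o(lam)/lam."""
    shifted = [xi + lam * vi for xi, vi in zip(x, v)]
    inner = sum(gi * vi for gi, vi in zip(grad(x), v))
    return (func(shifted) - func(x) - lam * inner) / lam

x0 = [1.0, 2.0]
v = [1.0, -1.0]
exact = grad_f(x0)
approx = numerical_gradient(f, x0)
ratios = [remainder_ratio(f, grad_f, x0, v, lam) for lam in (1e-1, 1e-2, 1e-3)]

print(exact, approx)
print(ratios)  # magnitudes shrink as lam decreases
```

The shrinking ratios are the numerical counterpart of $\lim_{\lambda \to 0} o(\lambda)/\lambda = 0$.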
Definition 5.0.6. An $n \times n$ symmetric matrix $M$ of real numbers is said to be positive
semidefinite if $\langle y, My\rangle \ge 0$ for all $y \in \mathbb{R}^n$. It is called positive definite if $\langle y, My\rangle > 0$ for all
$y \neq 0$.
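As a small worked illustration of Definition 5.0.6 (the matrices $M$ and $N$ below are chosen for this example and do not appear elsewhere in the notes), completing the square shows that $M$ is positive definite, while $N$ is positive semidefinite but not positive definite:

```latex
\[
M = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \qquad
\langle y, My\rangle = 2y_1^2 - 2y_1y_2 + 2y_2^2
  = y_1^2 + y_2^2 + (y_1 - y_2)^2 > 0 \quad (y \neq 0),
\]
\[
N = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \qquad
\langle y, Ny\rangle = (y_1 - y_2)^2 \ge 0, \qquad
\langle y, Ny\rangle = 0 \ \text{ for } y = (1, 1).
\]
```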
Definition 5.0.7. Let $f = (f_1, \ldots, f_\ell) : \mathbb{R}^n \to \mathbb{R}^\ell$ be a vector-valued function such that the
partial derivative $\dfrac{\partial f_i(x)}{\partial x_j}$ of $f_i$ with respect to $x_j$ exists for $i = 1, 2, \ldots, \ell$ and $j = 1, 2, \ldots, n$.
Then the Jacobian matrix $J(f)(x)$ is given by
$$
J(f)(x) =
\begin{pmatrix}
\dfrac{\partial f_1(x)}{\partial x_1} & \cdots & \dfrac{\partial f_1(x)}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f_\ell(x)}{\partial x_1} & \cdots & \dfrac{\partial f_\ell(x)}{\partial x_n}
\end{pmatrix},
$$
where $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$.
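To make the layout of the Jacobian concrete (row $i$ holds the partial derivatives of $f_i$, column $j$ corresponds to $x_j$), here is a minimal sketch for a hypothetical map $F : \mathbb{R}^2 \to \mathbb{R}^3$, $F(x) = (x_1x_2, \sin x_1, x_2^2)$, which is not taken from the notes. The hand-computed matrix is compared with a central finite-difference approximation.

```python
import math

def F(x):
    """Hypothetical map F : R^2 -> R^3, F(x) = (x1*x2, sin(x1), x2^2)."""
    return [x[0] * x[1], math.sin(x[0]), x[1] ** 2]

def jacobian_exact(x):
    """Jacobian of F computed by hand: entry (i, j) is dF_i/dx_j."""
    return [
        [x[1], x[0]],
        [math.cos(x[0]), 0.0],
        [0.0, 2 * x[1]],
    ]

def jacobian_fd(func, x, h=1e-6):
    """Central finite-difference Jacobian with one column per variable x_j."""
    rows, cols = len(func(x)), len(x)
    jac = [[0.0] * cols for _ in range(rows)]
    for col in range(cols):
        xp, xm = list(x), list(x)
        xp[col] += h
        xm[col] -= h
        fp, fm = func(xp), func(xm)
        for row in range(rows):
            jac[row][col] = (fp[row] - fm[row]) / (2 * h)
    return jac

x0 = [0.5, -1.5]
exact = jacobian_exact(x0)
approx = jacobian_fd(F, x0)
max_error = max(
    abs(exact[i][k] - approx[i][k]) for i in range(3) for k in range(2)
)
print(max_error)
```

The matrix has $\ell = 3$ rows and $n = 2$ columns, matching the $\ell \times n$ shape in Definition 5.0.7.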
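Definition 5.0.8. A function $f : \mathbb{R}^n \to \mathbb{R}$ is said to be twice differentiable at $x \in \mathbb{R}^n$ if there
exist a vector $\nabla f(x)$ and an $n \times n$ symmetric matrix $\nabla^2 f(x)$, called the Hessian matrix, and
a function $\alpha : \mathbb{R}^n \to \mathbb{R}$ such that
$$f(y) = f(x) + \langle \nabla f(x), y - x\rangle + \tfrac{1}{2}\langle y - x, \nabla^2 f(x)(y - x)\rangle + \|y - x\|^2 \alpha(y - x) \quad \text{for all } y \in \mathbb{R}^n,$$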
Some examples of convex functions defined on $\mathbb{R}$ are $f(x) = e^x$, $f(x) = x$, $f(x) = |x|$, and
$f(x) = \max\{0, x\}$. The functions $f(x) = -\log x$ and $f(x) = x^\alpha$ for $\alpha < 0$ or $\alpha > 1$ are strictly
convex on the interval $]0, \infty[$. Clearly, every strictly convex function is convex but
the converse may not be true. For example, the function $f(x) = x$ defined on $\mathbb{R}$ is not strictly
convex. The function $f(x) = |x + x^3|$ is a nondifferentiable strictly convex function on $\mathbb{R}$.
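The examples above can be probed with the defining inequality $f(\lambda x + (1 - \lambda)y) \le \lambda f(x) + (1 - \lambda) f(y)$ on sample points. The sketch below is an illustration under the assumption that a finite sample is representative; passing such a test does not prove convexity, although a single failure disproves it. The sine function is included only as a non-convex contrast and is not one of the examples in the notes.

```python
import math

def is_convex_on(f, points, lams=(0.25, 0.5, 0.75), tol=1e-12):
    """True if f(l*x + (1-l)*y) <= l*f(x) + (1-l)*f(y) holds on all samples."""
    return all(
        f(lam * x + (1 - lam) * y) <= lam * f(x) + (1 - lam) * f(y) + tol
        for x in points for y in points for lam in lams
    )

def is_strictly_convex_on(f, points, lams=(0.25, 0.5, 0.75)):
    """True if the inequality is strict on all samples with x != y."""
    return all(
        f(lam * x + (1 - lam) * y) < lam * f(x) + (1 - lam) * f(y)
        for x in points for y in points if x != y for lam in lams
    )

reals = [k / 4 for k in range(-8, 9)]      # sample points in [-2, 2]
positives = [k / 4 for k in range(1, 17)]  # sample points in ]0, 4]

convex_examples = {
    "exp": is_convex_on(math.exp, reals),
    "identity": is_convex_on(lambda x: x, reals),
    "abs": is_convex_on(abs, reals),
    "max(0, x)": is_convex_on(lambda x: max(0.0, x), reals),
    "-log": is_convex_on(lambda x: -math.log(x), positives),
}
identity_is_strict = is_strictly_convex_on(lambda x: x, reals)
kink_is_strict = is_strictly_convex_on(lambda x: abs(x + x ** 3), reals)
sine_is_convex = is_convex_on(math.sin, reals)

print(convex_examples)
print(identity_is_strict, kink_is_strict, sine_is_convex)
```

The identity function passes the convexity test but fails the strict one, because the inequality holds with equality along every segment, in line with the remark that $f(x) = x$ is convex but not strictly convex.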
Then there exists a linear extension F of f such that F (x) ≤ p(x) for all x ∈ X.
The following corollary gives the existence of nontrivial bounded linear functionals on an
arbitrary normed space.
Corollary 6.0.1. Let $x$ be a nonzero element of a normed space $X$. Then there exists
$j \in X^*$ such that $j(x) = \|x\|$ and $\|j\|_* = 1$.
Definition 6.0.1. Let $X$ be a normed space and $X^*$ be its dual space. The duality pairing
between $X$ and $X^*$ is the functional $\langle \cdot, \cdot \rangle : X \times X^* \to \mathbb{R}$ defined by
$$\langle x, j\rangle = j(x) \quad \text{for all } x \in X \text{ and } j \in X^*.$$
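In finite dimensions the functional promised by Corollary 6.0.1 can be written down explicitly. The sketch below assumes $X = \mathbb{R}^n$ with the sup-norm, whose dual norm is the 1-norm, and takes $j = \mathrm{sign}(x_k)\,e_k$ for an index $k$ with $|x_k| = \|x\|_\infty$; this construction is a standard illustration rather than the Hahn-Banach argument used in the notes.

```python
def norm_inf(x):
    """The sup-norm on R^n."""
    return max(abs(t) for t in x)

def norm_1(j):
    """The 1-norm, which is the dual norm of the sup-norm on R^n."""
    return sum(abs(t) for t in j)

def pairing(x, j):
    """Duality pairing <x, j> = j(x)."""
    return sum(a * b for a, b in zip(x, j))

def norming_functional(x):
    """For x != 0, return j = sign(x_k) * e_k, where |x_k| = ||x||_inf."""
    k = max(range(len(x)), key=lambda i: abs(x[i]))
    j = [0.0] * len(x)
    j[k] = 1.0 if x[k] > 0 else -1.0
    return j

x = [0.5, -3.0, 2.0]
j = norming_functional(x)
print(j, pairing(x, j), norm_inf(x), norm_1(j))
```

Here $j(x) = \|x\|_\infty$ and $\|j\|_* = 1$, as the corollary requires. When the maximum of $|x_i|$ is attained at several indices, more than one such $j$ exists.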
Theorem 6.0.3 (James Theorem). A Banach space $X$ is reflexive if and only if for
each $j \in S_{X^*}$, there exists $x \in S_X$ such that $j(x) = 1$.
Let $X$ be a Banach space with its dual $X^*$. We say that the sequence $\{x_n\}$ in $X$ converges
to $x$ if $\lim_{n \to \infty} \|x_n - x\| = 0$. This kind of convergence is also called norm convergence or
strong convergence. This is related to the strong topology on $X$ with neighborhood base
$B_r(0) = \{x \in X : \|x\| < r\}$, $r > 0$, at the origin. There is also a weak topology on $X$
generated by the bounded linear functionals on $X$. Indeed, a set $G \subseteq X$ is said to be open
in the weak topology if for every $x \in G$, there are bounded linear functionals $f_1, f_2, \ldots, f_n$
and positive real numbers $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ such that
Remark 6.0.1. In finite-dimensional spaces, weak convergence and strong convergence
are equivalent.
Proposition 6.0.3. Every closed convex subset of a weakly compact set is weakly compact.
Theorem 6.0.6. Let X be a Banach space. Then X is reflexive if and only if every
closed convex bounded subset of X is weakly compact.
Theorem 6.0.8. Let X be a Banach space. Then X is reflexive if and only if every
bounded sequence in X in strong topology has a weakly convergent subsequence.
Proof. For all $\alpha \in \mathbb{R}$, let $G_\alpha := \{x \in X : f(x) > \alpha\}$. Since $f$ is lower semicontinuous,
$G_\alpha$ is open and $X = \bigcup_{\alpha \in \mathbb{R}} G_\alpha$. By compactness of $X$, there exists a finite family $\{G_{\alpha_i}\}_{i=1}^n$ of
$\{G_\alpha\}_{\alpha \in \mathbb{R}}$ such that
$$X = \bigcup_{i=1}^{n} G_{\alpha_i}.$$
Suppose that $\alpha_0 = \min\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$. Then $f(x) > \alpha_0$ for all $x \in X$. It follows that
$\inf\{f(x) : x \in X\}$ exists. Let $m = \inf\{f(x) : x \in X\}$ and $\beta$ be a number such that $\beta > m$.
Set $F_\beta := \{x \in X : f(x) \le \beta\}$. Then $F_\beta$ is a nonempty closed subset of $X$, and hence, by
the intersection property (Theorem 6.0.1), we have
$$\bigcap_{\beta > m} F_\beta \neq \emptyset.$$
Theorem 6.0.10. Let C be a weakly compact convex subset of a Banach space X and
f : C → (−∞, ∞] be a proper lower semicontinuous convex functional. Then there exists
x̄ ∈ C such that f (x̄) = inf{f (x) : x ∈ C}.
Recall that every closed convex bounded subset of a reflexive Banach space is weakly compact
(Theorem 6.0.6). Using this fact, we have the following result.
Theorem 6.0.11. Let $C$ be a nonempty closed convex bounded subset of a reflexive Banach
space $X$ and $f : X \to (-\infty, \infty]$ be a proper lower semicontinuous convex functional.
Then there exists $\bar{x} \in C$ such that $f(\bar{x}) = \inf_{x \in C} f(x)$.
In Theorem 6.0.11, the boundedness of $C$ may be replaced by the following weaker assumption
(called coercivity condition):
$$\lim_{x \in C,\ \|x\| \to \infty} f(x) = \infty.$$
Theorem 6.0.12. Let $C$ be a nonempty closed convex subset of a reflexive Banach space
$X$ and $f : C \to (-\infty, \infty]$ be a proper lower semicontinuous convex functional such that
$f(x_n) \to \infty$ as $\|x_n\| \to \infty$. Then there exists $\bar{x} \in C$ such that $f(\bar{x}) = \inf_{x \in C} f(x)$.
Proof. Let $m = \inf\{f(x) : x \in C\}$. Choose a minimizing sequence $\{x_n\}$ in $C$, that is,
$f(x_n) \to m$. If $\{x_n\}$ is not bounded, then there exists a subsequence $\{x_{n_i}\}$ of $\{x_n\}$ such that
$\|x_{n_i}\| \to \infty$. From the hypothesis, we have $f(x_{n_i}) \to \infty$, which contradicts $m < \infty$ (recall that $f$ is proper). Hence
$\{x_n\}$ is bounded. Since $X$ is reflexive, by Theorem 6.0.8, there exists a subsequence $\{x_{n_j}\}$
of $\{x_n\}$ such that $x_{n_j} \rightharpoonup \bar{x} \in X$. Since $C$ is closed and convex, it is weakly closed, so $\bar{x} \in C$.
Since $f$ is lower semicontinuous in the weak topology, we have
$$m \le f(\bar{x}) \le \liminf_{j \to \infty} f(x_{n_j}) = \lim_{n \to \infty} f(x_n) = m.$$
Therefore, $f(\bar{x}) = m$.
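A one-dimensional illustration may help to see the role of coercivity. The functions $f$ and $g$ below are chosen for this example and are not taken from the notes. On the unbounded closed convex set $C = [1, \infty)$ of $X = \mathbb{R}$, the coercive function $f$ attains its infimum, whereas the convex continuous function $g$, which is not coercive, does not:

```latex
\[
X = \mathbb{R}, \quad C = [1, \infty), \quad f(x) = x^2: \qquad
\lim_{x \in C,\ |x| \to \infty} f(x) = \infty, \qquad
\inf_{x \in C} f(x) = f(1) = 1 .
\]
\[
g(x) = \frac{1}{x} \ \text{ on } C: \qquad
\lim_{x \to \infty} g(x) = 0 = \inf_{x \in C} g(x), \qquad
g(x) > 0 \ \text{ for all } x \in C .
\]
```

For $g$, every minimizing sequence is unbounded, which is precisely the situation that the coercivity hypothesis rules out in the proof above.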