
Advanced Functional Analysis

Qamrul Hasan Ansari


Department of Mathematics
Aligarh Muslim University, Aligarh
E-mail: [email protected]
SYLLABUS
M.A. / M.Sc. II SEMESTER
ADVANCED FUNCTIONAL ANALYSIS

Course Title Advanced Functional Analysis


Course Number MMM-2009
Credits 4
Course Category Compulsory
Prerequisite Courses Functional Analysis, Linear Algebra, Real Analysis
Contact Course 4 Lecture + 1 Tutorial
Type of Course Theory
Course Assessment Sessional (1 hour) 30%
End Semester Examination (2:30 hrs) 70%
Course Objectives To discuss some advanced topics from Functional Analysis, namely
orthogonality, orthonormal bases, orthogonal projections, bilinear forms,
spectral theory of continuous linear operators,
differential calculus on normed spaces, geometry of Banach spaces.
These topics play a central role in research and in the
advancement of various areas of mathematics.
Course Outcomes After undertaking this course, students will understand:
◮ spectral theory of continuous linear operators
◮ orthogonality, orthogonal complements, orthonormal bases
◮ orthogonal projection, bilinear form and Lax-Milgram lemma
◮ differential calculus on normed spaces
◮ geometry of Banach spaces


Syllabus (number of lectures in parentheses)

UNIT I: Orthogonality, Orthonormal Bases, Orthogonal Projection and Bilinear Forms (14 lectures)
Orthogonality, orthogonal complements, orthonormal bases, orthogonal projections, projection theorem, projection on convex sets, sesquilinear forms, bilinear forms and their basic properties, Lax-Milgram lemma.

UNIT II: Spectral Theory of Continuous Linear Operators (13 lectures)
Eigenvalues and eigenvectors, resolvent operators, spectrum, spectral properties of bounded linear operators, compact linear operators on normed spaces, finite dimensional domain and range, sequence of compact linear operators, weak convergence, spectral theory of compact linear operators.

UNIT III: Differential Calculus on Normed Spaces (14 lectures)
Gâteaux derivative, gradient of a function, Fréchet derivative, chain rule, mean value theorem, properties of Gâteaux and Fréchet derivatives, Taylor's formula, subdifferential and its properties.

UNIT IV: Geometry of Banach Spaces (15 lectures)
Strict convexity, modulus of convexity, uniform convexity, duality mapping and its properties, smooth Banach spaces, modulus of smoothness.

Total: 56 lectures

Recommended Books:

1. Q. H. Ansari: Topics in Nonlinear Analysis and Optimization, World Education, Delhi, 2012.

2. Q. H. Ansari, C. S. Lalitha and M. Mehta: Generalized Convexity, Nonsmooth Variational Inequalities and Nonsmooth Optimization, CRC Press, Taylor and Francis Group, Boca Raton, London, New York, 2014.

3. C. Chidume: Geometric Properties of Banach Spaces and Nonlinear Iterations, Springer, London, 2009.

4. M. C. Joshi and R. K. Bose: Some Topics in Nonlinear Functional Analysis, Wiley Eastern Limited, New Delhi, 1985.

5. E. Kreyszig: Introductory Functional Analysis with Applications, John Wiley and Sons, New York, 1989.

6. M. T. Nair: Functional Analysis: A First Course, Prentice-Hall of India Private Limited, New Delhi, 2002.

7. A. H. Siddiqi: Applied Functional Analysis, CRC Press, London, 2003.

1. Orthogonality, Orthonormal Bases, Orthogonal Projection and Bilinear Forms

Throughout these notes, 0 denotes the zero vector of the corresponding vector space, and
⟨·, ·⟩ denotes the inner product on an inner product space.

1.1 Orthogonality and Orthonormal Bases

1.1.1 Orthogonality

One of the major differences between an inner product space and a normed space is that in an
inner product space we can talk about the angle between two vectors.

Definition 1.1.1. The angle θ between two nonzero vectors x and y of an inner product space X is
defined by the following relation:
    cos θ = ⟨x, y⟩ / (‖x‖ ‖y‖).    (1.1)

Definition 1.1.2. Let X be an inner product space whose inner product is denoted by ⟨·, ·⟩.

(a) Two vectors x and y in X are said to be orthogonal if ⟨x, y⟩ = 0. When two vectors x
and y are orthogonal, we write x ⊥ y.

(b) A vector x ∈ X is said to be orthogonal to a nonempty subset A of X, denoted by
x ⊥ A, if ⟨x, y⟩ = 0 for all y ∈ A.


(c) Let A be a nonempty subset of X. The set of all vectors orthogonal to A, denoted by
A⊥, is called the orthogonal complement of A, that is,
    A⊥ = {x ∈ X : ⟨x, y⟩ = 0 for all y ∈ A}.
A⊥⊥ = (A⊥)⊥ denotes the orthogonal complement of A⊥, that is,
    A⊥⊥ = (A⊥)⊥ = {x ∈ X : ⟨x, y⟩ = 0 for all y ∈ A⊥}.

(d) Two subsets A and B of X are said to be orthogonal, denoted by A ⊥ B, if ⟨x, y⟩ = 0
for all x ∈ A and all y ∈ B.

Clearly, x and y are orthogonal if and only if the angle θ between them is 90°, that is, cos θ = 0,
which, in view of (1.1), is equivalent to ⟨x, y⟩ = 0, that is, x ⊥ y.

Remark 1.1.1. (a) Since ⟨x, y⟩ is the complex conjugate of ⟨y, x⟩, ⟨x, y⟩ = 0 implies
⟨y, x⟩ = 0, and vice versa. Hence x ⊥ y if and only if y ⊥ x, that is, orthogonality is a
symmetric relation.
(b) Since ⟨x, 0⟩ = 0 for all x, we have x ⊥ 0 for every x in an inner product space. By
the definition of the inner product, 0 is the only vector orthogonal to itself.
(c) Clearly, {0}⊥ = X and X⊥ = {0}.
(d) If A ⊥ B, then A ∩ B ⊆ {0}.
(e) Nonzero mutually orthogonal vectors x1, x2, x3, . . . , xn of an inner product space are
linearly independent (Prove it!).
Example 1.1.1. Let A = {(x, 0, 0) ∈ R3 : x ∈ R} be a line in R3 and B = {(0, y, z) ∈ R3 :
y, z ∈ R} be a plane in R3 . Then A⊥ = B and B ⊥ = A.
Example 1.1.2. Let X = R3 and A be its subspace spanned by a non-zero vector x. The
orthogonal complement of A is the plane through the origin and perpendicular to the vector
x.
Example 1.1.3. Let A be the subspace of R3 generated by the set {(1, 0, 1), (0, 2, 3)}. An
element of A can be expressed as
    x = (x1, x2, x3) = λ(1, 0, 1) + µ(0, 2, 3) = λ i + 2µ j + (λ + 3µ) k,
so that x1 = λ, x2 = 2µ, x3 = λ + 3µ.

Thus, an element of A is of the form (x1, x2, x1 + (3/2) x2). The orthogonal complement of A
can be constructed as follows: Let x = (x1, x2, x3) ∈ A⊥. Then for every y = (y1, y2, y3) ∈ A, we
have
    ⟨x, y⟩ = x1 y1 + x2 y2 + x3 y3 = x1 y1 + x2 y2 + x3 (y1 + (3/2) y2)
           = (x1 + x3) y1 + (x2 + (3/2) x3) y2 = 0.

Since y1 and y2 are arbitrary, we have
    x1 + x3 = 0  and  x2 + (3/2) x3 = 0.
Therefore,
    A⊥ = {x = (x1, x2, x3) : x1 = −x3, x2 = −(3/2) x3}
       = {x ∈ R3 : x = (−x3, −(3/2) x3, x3)}.
Exercise 1.1.1. Let A be a subspace of R3 generated by the set {(1, 1, 0), (0, 1, 1)}. Find
A⊥ .

Answer. A⊥ is the straight line spanned by the vector (1, −1, 1).
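
The computation of an orthogonal complement in Example 1.1.3 and Exercise 1.1.1 can also be checked numerically. The following is a minimal sketch (my own illustration, not part of the notes), assuming the standard inner product on R3: it returns an orthonormal basis of A⊥ as the null space of the matrix whose rows span A.

```python
import numpy as np

def orthogonal_complement(spanning_vectors):
    """Return an orthonormal basis of the orthogonal complement of
    span(spanning_vectors) in R^n, computed via the SVD."""
    A = np.atleast_2d(np.array(spanning_vectors, dtype=float))
    # Rows of A span the subspace; A x = 0 characterises x in the complement.
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-12))
    return Vt[rank:]            # remaining right singular vectors span A-perp

# Exercise 1.1.1: A = span{(1, 1, 0), (0, 1, 1)}
basis_perp = orthogonal_complement([(1, 1, 0), (0, 1, 1)])
print(basis_perp)               # one row, proportional to (1, -1, 1)
```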

Theorem 1.1.1. Let X be an inner product space and A be a subset of X. Then A⊥ is
a closed subspace of X.

Proof. Let x, y ∈ A⊥. Then ⟨x, z⟩ = 0 and ⟨y, z⟩ = 0 for all z ∈ A. Since, for arbitrary
scalars α, β,
    ⟨αx + βy, z⟩ = α⟨x, z⟩ + β⟨y, z⟩ = 0,  for all z ∈ A,
we get αx + βy ∈ A⊥. So A⊥ is a subspace of X.

To show that A⊥ is closed, let {xn} be a sequence in A⊥ such that xn → y. We need to show
that y belongs to A⊥. Since xn ∈ A⊥, we have ⟨x, xn⟩ = 0 for all x ∈ A and all n. Since the
inner product is continuous in each variable, for every x ∈ A,
    0 = lim ⟨x, xn⟩ = ⟨x, lim xn⟩ = ⟨x, y⟩  as n → ∞.
Hence, y ∈ A⊥.
Exercise 1.1.2. Let X be an inner product space and A and B be subsets of X. Prove the
following assertions:

(a) A ∩ A⊥ ⊆ {0}; moreover, A ∩ A⊥ = {0} if A is a subspace.

(b) A ⊆ A⊥⊥.

(c) If B ⊆ A, then A⊥ ⊆ B⊥.

Proof. (a) If y ∈ A ∩ A⊥, then ⟨y, y⟩ = 0, so y = 0. Hence A ∩ A⊥ ⊆ {0}. If A is a subspace,
then 0 ∈ A, and since also 0 ∈ A⊥, we get A ∩ A⊥ = {0}.

(b) Let y ∈ A and suppose y ∉ A⊥⊥. Then there exists z ∈ A⊥ such that ⟨y, z⟩ ≠ 0. But
z ∈ A⊥ and y ∈ A give ⟨y, z⟩ = 0, a contradiction. Hence, y ∈ A⊥⊥.

(c) Let y ∈ A⊥. Then ⟨y, z⟩ = 0 for all z ∈ A. Since every z ∈ B is an element of A, we
have ⟨y, z⟩ = 0 for all z ∈ B. Hence, y ∈ B⊥, and so A⊥ ⊆ B⊥.

Exercise 1.1.3. Let X be an inner product space and A and B be subsets of X. Prove the
following assertions:

(a) If A ⊆ B, then A⊥⊥ ⊆ B⊥⊥.

(b) A⊥ = A⊥⊥⊥.

(c) If A is dense in X, that is, the closure of A equals X, then A⊥ = {0}.

(d) If A is an orthogonal set and 0 ∉ A, then prove that A is linearly independent.

Exercise 1.1.4. Let A be a nonempty subset of a Hilbert space X. Show that

(a) A⊥⊥ is the closure of span A;

(b) span A is dense in X whenever A⊥ = {0}.

Hint: See [5], pp. 149.

Exercise 1.1.5. Let A be a subspace of a Hilbert space X. Show that A is closed
if and only if A = A⊥⊥.

The well-known Pythagorean theorem of plane geometry says that in a right-angled triangle the
sum of the squares of the base and the perpendicular equals the square of the hypotenuse.
Its analogue in inner product spaces is as follows.

Theorem 1.1.2. Let X be an inner product space and x, y ∈ X. If x ⊥ y, then
    ‖x + y‖² = ‖x‖² + ‖y‖².

Proof. Note that ‖x + y‖² = ⟨x + y, x + y⟩ = ⟨x, x⟩ + ⟨y, x⟩ + ⟨x, y⟩ + ⟨y, y⟩. Since x ⊥ y,
we have ⟨x, y⟩ = 0 and ⟨y, x⟩ = 0, and therefore ‖x + y‖² = ‖x‖² + ‖y‖².

Exercise 1.1.6. Let K and D be subsets of an inner product space X. Show that
    (K + D)⊥ = K⊥ ∩ D⊥,
where K + D = {x + y : x ∈ K, y ∈ D}.

Exercise 1.1.7. For each i = 1, 2, . . . , n, let Ki be a subspace of a Hilbert space X. If
⟨xi, xj⟩ = 0 whenever i ≠ j, for all xi ∈ Ki and xj ∈ Kj, then show that the subspace
K1 + K2 + · · · + Kn is closed. Is this property true for an incomplete inner product space?

Exercise 1.1.8. Let X be an inner product space and, for a nonzero vector y ∈ X, let
Ky := {x ∈ X : ⟨x, y⟩ = 0}. Determine the subspace Ky⊥.

1.1.2 Orthonormal Sets and Orthonormal Bases

Definition 1.1.3. Let X be an inner product space.

(a) A subset A of nonzero vectors in X is said to be orthogonal if any two distinct elements
in A are orthogonal.

(b) A set of vectors A in X is said to be orthonormal if it is orthogonal and ‖x‖ = 1 for
all x ∈ A, that is, for all x, y ∈ A,
    ⟨x, y⟩ = { 0, if x ≠ y;  1, if x = y }.    (1.2)

If an orthogonal / orthonormal set in X is countable, then it can be arranged as a sequence
{xn}, and in this case we call it an orthogonal sequence / orthonormal sequence, respectively.

More generally, let Λ be any index set.

(a) A family of vectors {xα}α∈Λ in an inner product space X is said to be orthogonal if
xα ⊥ xβ for all α, β ∈ Λ with α ≠ β.

(b) A family of vectors {xα}α∈Λ in an inner product space X is said to be orthonormal if
it is orthogonal and ‖xα‖ = 1 for all α ∈ Λ, that is, for all α, β ∈ Λ, we have
    ⟨xα, xβ⟩ = δαβ = { 0, if α ≠ β;  1, if α = β }.    (1.3)

Example 1.1.4. The standard / canonical basis for Rn (with the usual inner product),
    e1 = (1, 0, 0, . . . , 0),
    e2 = (0, 1, 0, . . . , 0),
    . . .
    en = (0, 0, 0, . . . , 1),
forms an orthonormal set, since
    ⟨ei, ej⟩ = δij = { 0, if i ≠ j;  1, if i = j }.    (1.4)

Recall that, for p ≥ 1,

    ℓp = { x = {xn} ⊆ K : Σ_{n=1}^∞ |xn|^p < ∞ },

    c00 = { {x1, x2, . . .} ⊆ K : there exists k such that xj = 0 for all j ≥ k },

    ℓ∞ = { {xn} ⊆ K : sup_{n∈N} |xn| < ∞ },

    C[a, b] = the space of all continuous real-valued functions defined on the interval [a, b],

    P[a, b] = the space of polynomials defined on the interval [a, b].

Clearly, c00 ⊆ ℓ∞. P[a, b] is not complete with respect to the norm ‖f‖∞ = sup_{x∈[a,b]} |f(x)|;
however, P[a, b] is dense in C[a, b] with respect to ‖·‖∞.

ℓ2 is a Hilbert space with the inner product defined by

    ⟨x, y⟩ = Σ_{n=1}^∞ xn ȳn,  for all x = {xn}, y = {yn} ∈ ℓ2.

The norm on ℓ2 is given by

    ‖x‖ = ⟨x, x⟩^{1/2} = ( Σ_{n=1}^∞ |xn|² )^{1/2}.

The space ℓp with p ≠ 2 is not an inner product space, and hence not a Hilbert space;
however, ℓp with p ≠ 2 is still a Banach space.

For 0 < p < ∞,

    L^p[a, b] = { f : [a, b] → K : f is measurable and ∫_a^b |f|^p dµ < ∞ }.

For 1 ≤ p < ∞, L^p[a, b] is a complete normed space with respect to the norm

    ‖f‖_p = ( ∫_a^b |f|^p dµ )^{1/p}.

Note that ‖f‖_p does not define a norm on L^p[a, b] for 0 < p < 1.

Example 1.1.5. Consider the space ℓ2 and its subset E = {e1, e2, . . .}, where en = {δnj}_{j≥1},
that is, en = {0, 0, . . . , 0, 1, 0, . . .} (with 1 in the nth place). Then E forms an orthonormal
set in ℓ2, and {en} is an orthonormal sequence.
Example 1.1.6. Consider the space c00 with the inner product
    ⟨x, y⟩ = Σ_{n=1}^∞ xn ȳn,  for all x = {x1, x2, . . .}, y = {y1, y2, . . .} ∈ c00
(the sum has only finitely many nonzero terms). The set E = {e1, e2, . . .} with en = {δnj}_{j≥1},
that is, en = {0, 0, . . . , 0, 1, 0, . . .} (with 1 in the nth place), forms an orthonormal set in c00,
and {en} is an orthonormal sequence.
Example 1.1.7. Consider the space C[0, 2π] with the inner product
    ⟨f, g⟩ = ∫_0^{2π} f(t) g(t) dt,  for all f, g ∈ C[0, 2π].
Consider the sets E = {u0, u1, u2, . . .} and G = {v1, v2, . . .}, or the sequences {un} and {vn},
where
    un(t) = cos nt, for n = 0, 1, 2, . . . ,  and  vn(t) = sin nt, for n = 1, 2, . . . .
Then E is an orthogonal set and {un} is an orthogonal sequence. Also, G is an orthogonal
set and {vn} is an orthogonal sequence.

Indeed, by integrating, we obtain
    ⟨um, un⟩ = ∫_0^{2π} cos mt cos nt dt = { 0, if m ≠ n;  π, if m = n = 1, 2, . . . ;  2π, if m = n = 0 },
and
    ⟨vm, vn⟩ = ∫_0^{2π} sin mt sin nt dt = { 0, if m ≠ n;  π, if m = n = 1, 2, . . . }.

Also, {e0, e1, e2, . . .} is an orthonormal set and {en} is an orthonormal sequence, where
    e0(t) = 1/√(2π),  en(t) = un(t)/‖un‖ = (cos nt)/√π,  for n = 1, 2, . . . .
Similarly, {ẽ1, ẽ2, . . .} is an orthonormal set and {ẽn} is an orthonormal sequence, where
    ẽn(t) = vn(t)/‖vn‖ = (sin nt)/√π,  for n = 1, 2, . . . .
Note that um ⊥ vn for all m and n (Prove it!).
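
The orthogonality relations of Example 1.1.7 can be verified numerically by approximating the integrals. Below is a small sketch of mine (not part of the notes), which approximates ⟨um, un⟩ and ⟨um, vn⟩ on [0, 2π] with a Riemann sum on a uniform grid.

```python
import numpy as np

N = 200000
t = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
dt = 2.0 * np.pi / N

def ip(f, g):
    """Riemann-sum approximation of <f, g> = integral of f(t) g(t) over [0, 2*pi]."""
    return np.sum(f(t) * g(t)) * dt

u = lambda n: (lambda s: np.cos(n * s))
v = lambda n: (lambda s: np.sin(n * s))

print(round(ip(u(2), u(3)), 6))   # ~ 0      (m != n)
print(round(ip(u(2), u(2)), 6))   # ~ pi     (m = n >= 1)
print(round(ip(u(0), u(0)), 6))   # ~ 2*pi   (m = n = 0)
print(round(ip(u(3), v(5)), 6))   # ~ 0      (u_m orthogonal to v_n)
```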
Exercise 1.1.9 (Pythagorean Theorem). If {x1, x2, . . . , xn} is an orthogonal subset of an
inner product space X, then prove that
    ‖ Σ_{i=1}^n xi ‖² = Σ_{i=1}^n ‖xi‖².

Proof. We have
    ‖ Σ_{i=1}^n xi ‖² = ⟨ Σ_{i=1}^n xi , Σ_{j=1}^n xj ⟩
                      = Σ_{i=1}^n ⟨xi, xi⟩ + Σ_{i≠j} ⟨xi, xj⟩
                      = Σ_{i=1}^n ⟨xi, xi⟩
                      = Σ_{i=1}^n ‖xi‖².

Lemma 1.1.1 (Linear independence). An orthonormal set of vectors is linearly independent.

Proof. Let {ui} be an orthonormal set. Consider a finite linear combination
    α1 u1 + α2 u2 + · · · + αn un = 0.
Taking the inner product with any fixed uj, we get
    0 = ⟨0, uj⟩ = ⟨α1 u1 + α2 u2 + · · · + αn un, uj⟩
      = α1⟨u1, uj⟩ + α2⟨u2, uj⟩ + · · · + αj⟨uj, uj⟩ + · · · + αn⟨un, uj⟩.
Since ⟨ui, uj⟩ = δij, this gives αj⟨uj, uj⟩ = 0, and hence αj = 0 because ⟨uj, uj⟩ = 1.
This shows that {ui} is a set of linearly independent vectors.
Exercise 1.1.10. Determine an orthogonal set in L2[0, 2π].

Hint: (See [6], pp. 179) Consider u1(t) = 1/√(2π), and for n ∈ N,
    u2n(t) = (sin nt)/√π,  u2n+1(t) = (cos nt)/√π,
and then check that E = {u1, u2, . . .} is an orthogonal set in L2[0, 2π].
Exercise 1.1.11. Construct a set of 3 vectors in R3 and determine whether it is a basis
and, if it is, then whether it is orthogonal, orthonormal or neither.
Exercise 1.1.12. Show that every orthonormal set in a separable inner product space X is
countable.

Hint: (See [6], pp. 179)



Advantages of an Orthonormal Sequence. A great advantage of orthonormal sequences
over arbitrary linearly independent sequences is the following: if we know that a given x can
be represented as a linear combination of some elements of an orthonormal sequence, then the
orthonormality makes the actual determination of the coefficients very easy.

Let {u1, u2, . . .} be an orthonormal sequence in an inner product space X and let
x ∈ span{u1, u2, . . . , un}, where n is fixed. Then x can be written as a linear combination of
u1, u2, . . . , un, that is,
    x = Σ_{k=1}^n αk uk,  for scalars αk.    (1.5)
Taking the inner product with a fixed uj, we obtain
    ⟨x, uj⟩ = ⟨ Σ_{k=1}^n αk uk , uj ⟩ = Σ_{k=1}^n αk ⟨uk, uj⟩ = αj ⟨uj, uj⟩ = αj ‖uj‖² = αj,
since ‖uj‖ = 1 because {u1, u2, . . .} is an orthonormal sequence. Therefore, the unknown
coefficients αk in (1.5) can be easily calculated.
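
As a concrete illustration of this coefficient formula (a sketch of mine, not from the notes), the following lines recover the coefficients of a vector with respect to an orthonormal pair in R3 simply by taking inner products.

```python
import numpy as np

# An orthonormal pair in R^3.
u1 = np.array([1.0, 2.0, 2.0]) / 3.0
u2 = np.array([-2.0, -1.0, 2.0]) / 3.0

x = 4.0 * u1 - 1.5 * u2           # x lies in span{u1, u2}

alpha1 = np.dot(x, u1)            # <x, u1> = 4.0
alpha2 = np.dot(x, u2)            # <x, u2> = -1.5
print(alpha1, alpha2)
print(np.allclose(x, alpha1 * u1 + alpha2 * u2))   # True
```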

The following Gram-Schmidt process shows how to obtain an orthonormal sequence
from an arbitrary linearly independent sequence.

Gram-Schmidt Orthogonalization Process. Let {xn} be a linearly independent sequence
in an inner product space X. Then we obtain an orthogonal sequence {vn} and an
orthonormal sequence {un} with the following property for every n:
    span{u1, u2, . . . , un} = span{x1, x2, . . . , xn}.

1st step: Take v1 = x1 and u1 = v1/‖v1‖.

2nd step: Take
    v2 = x2 − ⟨x2, u1⟩u1 = x2 − (⟨x2, v1⟩/⟨v1, v1⟩) v1,  and  u2 = v2/‖v2‖.

3rd step: Take
    v3 = x3 − ⟨x3, u1⟩u1 − ⟨x3, u2⟩u2 = x3 − Σ_{j=1}^{2} (⟨x3, vj⟩/⟨vj, vj⟩) vj,  and  u3 = v3/‖v3‖.

nth step: Take
    vn = xn − Σ_{j=1}^{n−1} ⟨xn, uj⟩uj = xn − Σ_{j=1}^{n−1} (⟨xn, vj⟩/⟨vj, vj⟩) vj,  and  un = vn/‖vn‖.

Then {vn} is an orthogonal sequence of vectors in X and {un} is an orthonormal sequence
in X. Also, for every n:
    span{u1, u2, . . . , un} = span{x1, x2, . . . , xn}.
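
A minimal numerical sketch of the Gram-Schmidt process described above (my own illustration, assuming the standard inner product on R^n); it reproduces the computation of Exercise 1.1.13 further below.

```python
import numpy as np

def gram_schmidt(xs):
    """Classical Gram-Schmidt: return (orthogonal vs, orthonormal us)
    for a linearly independent list of vectors xs."""
    vs, us = [], []
    for x in map(np.asarray, xs):
        v = x.astype(float)
        for w in vs:
            v = v - (np.dot(x, w) / np.dot(w, w)) * w   # subtract projection onto each v_j
        vs.append(v)
        us.append(v / np.linalg.norm(v))
    return vs, us

# Exercise 1.1.13: x1 = (1, 2, 2), x2 = (-1, 0, 2), x3 = (0, 0, 1)
vs, us = gram_schmidt([(1, 2, 2), (-1, 0, 2), (0, 0, 1)])
print(vs)   # (1, 2, 2), (-4/3, -2/3, 4/3), (2/9, -2/9, 1/9)
print(us)   # (1, 2, 2)/3, (-2, -1, 2)/3, (2, -2, 1)/3
```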

Theorem 1.1.3. Let {xn} be a linearly independent sequence in an inner product space
X. Let v1 = x1, and
    vn = xn − Σ_{j=1}^{n−1} (⟨xn, vj⟩/⟨vj, vj⟩) vj,  for n = 2, 3, . . . .
Then {v1, v2, . . .} is an orthogonal set, {un} is an orthonormal sequence, where un = vn/‖vn‖,
and
    span{x1, x2, . . . , xk} = span{u1, u2, . . . , uk},  for all k = 1, 2, . . . .

Proof. Since {xn} is a sequence of linearly independent vectors, xn ≠ 0 for all n. Define
v1 = x1 and
    v2 = x2 − (⟨x2, v1⟩/⟨v1, v1⟩) v1.
Clearly, v2 ∈ span{x1, x2} and
    ⟨v2, v1⟩ = ⟨x2, v1⟩ − (⟨x2, v1⟩/⟨v1, v1⟩)⟨v1, v1⟩ = 0,
that is, v2 and v1 are orthogonal. Since {x1, x2} is linearly independent, v2 ≠ 0. Then, by
Exercise 1.1.3 (d), {v1, v2} is linearly independent, and it follows that
    span{v1, v2} = span{x1, x2}.
Continuing in this way, suppose we have defined an orthogonal set {v1, v2, . . . , vn−1} of
nonzero vectors such that
    span{x1, x2, . . . , xn−1} = span{v1, v2, . . . , vn−1}.
Let
    vn = xn − Σ_{j=1}^{n−1} (⟨xn, vj⟩/⟨vj, vj⟩) vj.
Then vn ∈ span{x1, x2, . . . , xn} and ⟨vn, vi⟩ = 0 for i < n. Again, since {x1, x2, . . . , xn}
is linearly independent, vn ≠ 0. Thus, {v1, v2, . . . , vn} is the required orthogonal set and
{u1, u2, . . . , un} is the required orthonormal set.

Exercise 1.1.13. Let Y be the plane in R3 spanned by the vectors x1 = (1, 2, 2) and
x2 = (−1, 0, 2), that is, Y = span{x1, x2}. Find an orthonormal basis for Y and for R3.

Solution. {x1, x2} is a basis for the plane Y. We can extend it to a basis for R3 by adding one
vector from the standard basis. For instance, the vectors x1, x2 and x3 = (0, 0, 1) form a basis
for R3 because
    |  1  2  2 |
    | −1  0  2 |  =  |  1  2 |  =  2 ≠ 0.
    |  0  0  1 |     | −1  0 |
By using the Gram-Schmidt process, we orthogonalize the basis x1 = (1, 2, 2), x2 = (−1, 0, 2)
and x3 = (0, 0, 1):
    v1 = x1 = (1, 2, 2),
    v2 = x2 − (⟨x2, v1⟩/⟨v1, v1⟩) v1 = (−1, 0, 2) − (3/9)(1, 2, 2) = (−4/3, −2/3, 4/3),
    v3 = x3 − (⟨x3, v1⟩/⟨v1, v1⟩) v1 − (⟨x3, v2⟩/⟨v2, v2⟩) v2
       = (0, 0, 1) − (2/9)(1, 2, 2) − ((4/3)/4)(−4/3, −2/3, 4/3) = (2/9, −2/9, 1/9).
Now, v1 = (1, 2, 2), v2 = (−4/3, −2/3, 4/3), v3 = (2/9, −2/9, 1/9) is an orthogonal basis for
R3, while v1, v2 is an orthogonal basis for Y. The orthonormal basis for Y is
u1 = v1/‖v1‖ = (1/3)(1, 2, 2), u2 = v2/‖v2‖ = (1/3)(−2, −1, 2).

The orthonormal basis for R3 is u1 = v1/‖v1‖ = (1/3)(1, 2, 2), u2 = v2/‖v2‖ = (1/3)(−2, −1, 2),
u3 = v3/‖v3‖ = (1/3)(2, −2, 1).

Exercise 1.1.14. Let {un} be an orthonormal sequence in an inner product space X. Prove
the following statements (use the Pythagorean theorem).

(a) If w = Σ_{n=1}^∞ αn un, then ‖w‖² = Σ_{n=1}^∞ |αn|², where the αn are scalars.

(b) If x ∈ X and sN = Σ_{n=1}^N ⟨x, un⟩un, then ‖x‖² = ‖x − sN‖² + ‖sN‖².

(c) If x ∈ X, sN = Σ_{n=1}^N ⟨x, un⟩un, and XN = span{u1, u2, . . . , uN}, then
‖x − sN‖ = min_{y∈XN} ‖x − y‖ (this is called the best approximation property).

Theorem 1.1.4 (Bessel's inequality). Let {uk} be an orthonormal set in an inner product
space X. Then for any x ∈ X, we have
    Σ_{k=1}^∞ |⟨x, uk⟩|² ≤ ‖x‖².

Proof. Let xn = Σ_{k=1}^n ⟨x, uk⟩uk be the nth partial sum. Then, by using the properties of
the inner product and the fact that
    ⟨ui, uj⟩ = δij = { 0, if i ≠ j;  1, if i = j },
we have
    ‖xn‖² = Σ_{k=1}^n |⟨x, uk⟩|²  and  ⟨x, xn⟩ = ⟨xn, x⟩ = Σ_{k=1}^n |⟨x, uk⟩|² = ‖xn‖²,
so that
    0 ≤ ‖x − xn‖² = ⟨x − xn, x − xn⟩ = ‖x‖² − ⟨xn, x⟩ − ⟨x, xn⟩ + ‖xn‖² = ‖x‖² − ‖xn‖².
Therefore, ‖xn‖² ≤ ‖x‖², and hence Σ_{k=1}^n |⟨x, uk⟩|² ≤ ‖x‖². Letting n → ∞, we get
the conclusion.
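
A quick numerical illustration of Bessel's inequality (a sketch of mine, not part of the notes): with a finite orthonormal set in R^5, the sum of squared Fourier coefficients never exceeds ‖x‖².

```python
import numpy as np

rng = np.random.default_rng(0)

# Orthonormal set {u1, u2, u3} in R^5: first three columns of a random orthogonal matrix.
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
U = Q[:, :3]

x = rng.standard_normal(5)
coeffs = U.T @ x                        # Fourier coefficients <x, u_k>
print(np.sum(coeffs**2), np.dot(x, x))  # sum |<x, u_k>|^2  <=  ||x||^2
assert np.sum(coeffs**2) <= np.dot(x, x) + 1e-12
```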

Exercise 1.1.15. Let {ui} be a countably infinite orthonormal set in a Hilbert space X.
Then prove the following statements:

(a) The infinite series Σ_{n=1}^∞ αn un, where the αn are scalars, converges if and only if the
series Σ_{n=1}^∞ |αn|² converges, that is, Σ_{n=1}^∞ |αn|² < ∞.

(b) If Σ_{n=1}^∞ αn un converges and
    x = Σ_{n=1}^∞ αn un = Σ_{n=1}^∞ βn un,
then αn = βn for all n, and ‖x‖² = Σ_{n=1}^∞ |αn|².

Proof. (a) Let Σ_{n=1}^∞ αn un be convergent and write
    x = Σ_{n=1}^∞ αn un,  or equivalently,  lim_{N→∞} ‖x − Σ_{n=1}^N αn un‖ = 0.
Now, by the continuity of the inner product and the orthonormality of {ui},
    ⟨x, um⟩ = ⟨ Σ_{n=1}^∞ αn un, um ⟩ = Σ_{n=1}^∞ αn ⟨un, um⟩ = αm,  for m = 1, 2, . . . .
By Bessel's inequality, we get
    Σ_{m=1}^∞ |⟨x, um⟩|² = Σ_{m=1}^∞ |αm|² ≤ ‖x‖²,
which shows that Σ_{n=1}^∞ |αn|² converges.

To prove the converse, assume that Σ_{n=1}^∞ |αn|² is convergent. Consider the partial sums
sn = Σ_{i=1}^n αi ui. Then, for n > m,
    ‖sn − sm‖² = ⟨ Σ_{i=m+1}^n αi ui , Σ_{i=m+1}^n αi ui ⟩ = Σ_{i=m+1}^n |αi|² → 0  as n, m → ∞.
This means that {sn} is a Cauchy sequence. Since X is complete, the sequence of partial
sums {sn} converges in X, and therefore the series Σ_{n=1}^∞ αn un converges.

(b) We first prove that ‖x‖² = Σ_{n=1}^∞ |αn|². Writing sN = Σ_{n=1}^N αn un, we have
    ‖x‖² − Σ_{n=1}^N |αn|² = ⟨x, x⟩ − ⟨sN, sN⟩ = ⟨x, x − sN⟩ + ⟨x − sN, sN⟩,
so that
    | ‖x‖² − Σ_{n=1}^N |αn|² | ≤ ‖x − sN‖ (‖x‖ + ‖sN‖) =: MN.
Since sN converges to x, the quantity MN converges to zero, proving that ‖x‖² = Σ_{n=1}^∞ |αn|².

If x = Σ_{n=1}^∞ αn un = Σ_{n=1}^∞ βn un, then
    0 = lim_{N→∞} Σ_{n=1}^N (αn − βn) un,  and hence, by (a),  0 = Σ_{n=1}^∞ |αn − βn|²,
implying that αn = βn for all n.

Exercise 1.1.16. Let {un} be an orthonormal sequence in a Hilbert space X, and let
    Σ_{n=1}^∞ |αn|² < ∞  and  Σ_{n=1}^∞ |βn|² < ∞.
Prove that
    u = Σ_{n=1}^∞ αn un  and  v = Σ_{n=1}^∞ βn un
are convergent series with respect to the norm of X, and that ⟨u, v⟩ = Σ_{n=1}^∞ αn β̄n.

Proof. Let
    uN = Σ_{n=1}^N αn un  and  vN = Σ_{n=1}^N βn un.
Then, for M < N, we have
    ‖uN − uM‖² = Σ_{n=M+1}^N |αn|² → 0  as M, N → ∞,
and so {uN} is a Cauchy sequence in the complete space X and thus converges to some
u ∈ X. Similarly, {vN} is a Cauchy sequence in X that converges to some v ∈ X. Finally,
    ⟨uN, vN⟩ = Σ_{j,k=1}^N αj β̄k ⟨uj, uk⟩ = Σ_{j=1}^N αj β̄j,
since ⟨uj, uk⟩ = 0 for j ≠ k and ⟨uj, uj⟩ = 1. Taking the limit as N → ∞ and using the
continuity of the inner product, ⟨uN, vN⟩ → ⟨u, v⟩ gives ⟨u, v⟩ = Σ_{n=1}^∞ αn β̄n.

Recall that if {u1, u2, . . . , un} is a basis of a linear space X, then for every x ∈ X there
exist scalars α1, α2, . . . , αn such that x = α1 u1 + α2 u2 + · · · + αn un.

Definition 1.1.4. (a) An orthogonal set of vectors {ui} in an inner product space X is
called an orthogonal basis if for any x ∈ X there exist scalars αi such that
    x = Σ_{i=1}^∞ αi ui.
If the set {ui} is orthonormal, then it is called an orthonormal basis.

(b) An orthonormal basis {ui} in a Hilbert space X is called maximal or complete if there
is no unit vector u0 in X such that {u0, u1, u2, . . .} is an orthonormal set. In other
words, the orthonormal basis {ui} in X is complete if and only if the only vector
orthogonal to each of the ui is the null vector.
In general, an orthonormal set E in an inner product space X is complete or maximal
if it is a maximal orthonormal set in X, that is, E is an orthonormal set, and for every
orthonormal set Ẽ satisfying E ⊆ Ẽ, we have Ẽ = E.

(c) Let {ui} be an orthonormal basis in a Hilbert space X. Then the numbers αi = ⟨x, ui⟩
are called the Fourier coefficients of the element x with respect to the system {ui}, and
Σ_{i=1}^∞ αi ui is called the Fourier series of the element x.

Example 1.1.8. The set {ei : i ∈ N}, where ei = (0, 0, . . . , 0, 1, 0, . . .) with 1 in the ith
place, forms an orthonormal basis for ℓ2(C).

Example 1.1.9. Let X = L2(−π, π) be a complex Hilbert space and let un be the element of
X defined by
    un(t) = (1/√(2π)) exp(i n t),  for n = 0, ±1, ±2, . . . .
Then
    { 1/√(2π), (cos nt)/√π, (sin nt)/√π : n = 1, 2, . . . }
forms an orthonormal basis for X, as exp(i n t) = cos nt + i sin nt.

Theorem 1.1.5. Let {ui : i ∈ N} be an orthonormal set in a Hilbert space X. Then the
following assertions are equivalent:

(a) {ui : i ∈ N} is an orthonormal basis for X.

(b) For all x ∈ X, x = Σ_{i=1}^∞ ⟨x, ui⟩ui.

(c) For all x ∈ X, ‖x‖² = Σ_{i=1}^∞ |⟨x, ui⟩|².

(d) ⟨x, ui⟩ = 0 for all i implies x = 0.

Proof. (a) ⇔ (b): Let {ui : i ∈ N} be an orthonormal basis for X. Then we can write
    x = Σ_{i=1}^∞ αi ui,  that is,  x = lim_{n→∞} Σ_{i=1}^n αi ui.
For k ≤ n in N, we have
    ⟨ Σ_{i=1}^n αi ui, uk ⟩ = Σ_{i=1}^n αi ⟨ui, uk⟩ = αk.
By letting n → ∞ and using the continuity of the inner product, we obtain
    ⟨x, uk⟩ = lim_{n→∞} ⟨ Σ_{i=1}^n αi ui, uk ⟩ = αk,
and hence (b) holds.

The same argument shows that if (b) holds, then the expansion of x in terms of the ui is
unique, and so {ui : i ∈ N} is an orthonormal basis for X.

(b) ⇒ (c): By the Pythagorean theorem and the continuity of the norm, we have
    ‖x‖² = ‖ Σ_{i=1}^∞ ⟨x, ui⟩ui ‖² = Σ_{i=1}^∞ |⟨x, ui⟩|².

(c) ⇒ (d): Let ⟨x, ui⟩ = 0 for all i. Then ‖x‖² = Σ_{i=1}^∞ |⟨x, ui⟩|² = 0, which implies that
x = 0.

(d) ⇒ (b): Take any x ∈ X and let y = x − Σ_{i=1}^∞ ⟨x, ui⟩ui (the series converges by
Bessel's inequality and Exercise 1.1.15 (a)). Then, for each k ∈ N, we have
    ⟨y, uk⟩ = ⟨x, uk⟩ − lim_{n→∞} ⟨ Σ_{i=1}^n ⟨x, ui⟩ui, uk ⟩ = 0,
since eventually n ≥ k. It follows from (d) that y = 0, and hence x = Σ_{i=1}^∞ ⟨x, ui⟩ui.

Theorem 1.1.6 (Fourier Series Representation). Let Y be the closed subspace spanned
by a countable orthonormal set {ui} in a Hilbert space X. Then every element x ∈ Y
can be written uniquely as
    x = Σ_{i=1}^∞ ⟨x, ui⟩ui.    (1.6)

Proof. Uniqueness of (1.6) is a consequence of Exercise 1.1.15 (b). Let x ∈ Y. Since Y is
the closure of the span of {ui}, there are finite linear combinations Σ_{i=1}^M αi ui (with M and
the scalars αi depending on the approximation) converging to x. From Theorem 1.1.4 and
the best approximation property (Exercise 1.1.14 (c)), it follows that
    ‖ x − Σ_{i=1}^M ⟨x, ui⟩ui ‖ ≤ ‖ x − Σ_{i=1}^M αi ui ‖,
and letting the right-hand side tend to zero, we get the desired result.

Theorem 1.1.7 (Fourier Series Theorem). For any orthonormal set {un} in a separable
Hilbert space X, the following statements are equivalent:

(a) Every x ∈ X can be represented by its Fourier series in X; that is,
    x = Σ_{i=1}^∞ ⟨x, ui⟩ui.    (1.7)

(b) For any pair of vectors x, y ∈ X, we have
    ⟨x, y⟩ = Σ_{i=1}^∞ ⟨x, ui⟩⟨ui, y⟩ = Σ_{i=1}^∞ αi β̄i,    (1.8)
where αi = ⟨x, ui⟩ are the Fourier coefficients of x, and βi = ⟨y, ui⟩ are the Fourier
coefficients of y.

(c) For any x ∈ X, one has
    ‖x‖² = Σ_{i=1}^∞ |⟨x, ui⟩|².    (1.9)

(d) Any subspace Y of X that contains {ui} is dense in X.

Proof. (a) ⇒ (b): It follows from (1.7), the continuity of the inner product, and the fact
that {ui} is orthonormal.

(b) ⇒ (c): Put x = y in (1.8) to get (1.9).

(a) ⇔ (d): Statement (d) is equivalent to saying that the closed subspace spanned by {ui}
is all of X, that is, the orthogonal projection onto this closed span is the identity. In view
of Theorem 1.1.6, this is equivalent to statement (a).
Exercise 1.1.17. Let X be a Hilbert space and E be an orthonormal basis of X. Prove
that E is countable if and only if X is separable.

Hint: See Theorem 4.10 on page 187 in [6].

Exercise 1.1.18. Let X be a Hilbert space and E be an orthonormal basis of X. Prove
that E is a (Hamel) basis of X if and only if X is finite dimensional.

Hint: See Theorem 4.13 on page 189 in [6].

Exercise 1.1.19. Let X be a Hilbert space and E be an orthonormal set in X. Prove that
E is an orthonormal basis of X if and only if span E is dense in X.

Exercise 1.1.20. Let X be a Hilbert space and E be an orthonormal set in X. Show that
E is an orthonormal basis if and only if
    ⟨x, y⟩ = Σ_{u∈E} ⟨x, u⟩⟨u, y⟩,  for all x, y ∈ X.

1.2 Orthogonal Projections and Projection Theorem

1.2.1 Orthogonal Projection

Let K be a nonempty subset of a normed space X. Recall that the distance from an element
x ∈ X to the set K is defined by
    ρ := inf_{y∈K} ‖x − y‖.    (1.10)
It is important to know whether there is a z ∈ K such that
    ‖x − z‖ = inf_{y∈K} ‖x − y‖.    (1.11)
If such a point exists, is it unique?



[Figure 1.1: The distance from a point x to K.]

[Figure 1.2: The distance from a point x to K in three cases: K an open segment with no z satisfying (1.12); K an open segment with a unique z satisfying (1.12); K a circular arc with infinitely many z's satisfying (1.12).]

One can see from these figures that even in the simple space R2 there may be no z
satisfying (1.12), precisely one such z, or more than one z.

To get the existence and uniqueness of such z, we recall the concept of a convex set.
Definition 1.2.1. A subset K of a vector space X is said to be a convex set if for all
x, y ∈ K and α, β ≥ 0 such that α + β = 1, we have αx + βy ∈ K, that is, for all x, y ∈ K
and α ∈ [0, 1], we have αx + (1 − α)y ∈ K.

Theorem 1.2.1. Let K be a nonempty closed convex subset of a Hilbert space X. Then
for any given x ∈ X, there exists a unique z ∈ K such that
    ‖x − z‖ = inf_{y∈K} ‖x − y‖.    (1.12)

[Figure 1.3: A convex set and a nonconvex set.]
[Figure 1.4: Existence and uniqueness of the point z ∈ K that minimizes the distance from x to K.]

Proof. Existence. Let ρ := inf_{y∈K} ‖x − y‖. By the definition of the infimum, there exists a
sequence {yn} in K such that ‖x − yn‖ → ρ as n → ∞. We will prove that {yn} is a Cauchy
sequence.

By the parallelogram law, we have
    ‖yn − ym‖² + 4 ‖x − (yn + ym)/2‖² = 2 ( ‖x − yn‖² + ‖x − ym‖² ),  for all n, m ≥ 1.
Since K is a convex subset of X and yn, ym ∈ K, we have (yn + ym)/2 ∈ K. Therefore,
‖x − (yn + ym)/2‖ ≥ ρ. Hence
    ‖yn − ym‖² = 2 ( ‖x − yn‖² + ‖x − ym‖² ) − 4 ‖x − (yn + ym)/2‖²
               ≤ 2 ( ‖x − yn‖² + ‖x − ym‖² ) − 4ρ².
Letting n, m → ∞, we have ‖x − yn‖ → ρ and ‖x − ym‖ → ρ, and so
    0 ≤ lim_{n,m→∞} ‖yn − ym‖² ≤ 4ρ² − 4ρ² = 0.
Therefore, lim_{n,m→∞} ‖yn − ym‖² = 0, and thus {yn} is a Cauchy sequence. Since X is
complete, there exists z ∈ X such that lim_{n→∞} yn = z. Since yn ∈ K and K is closed,
z ∈ K. In conclusion, we have
    ‖x − z‖ = inf_{y∈K} ‖x − y‖.

Uniqueness. Suppose that there is also ẑ ∈ K such that
    ‖x − ẑ‖ = inf_{y∈K} ‖x − y‖.
By the parallelogram law and ‖x − (z + ẑ)/2‖ ≥ ρ (since (z + ẑ)/2 ∈ K), we have
    ‖z − ẑ‖² + 4 ‖x − (z + ẑ)/2‖² = 2 ( ‖x − z‖² + ‖x − ẑ‖² ) = 4ρ²,
that is,
    0 ≤ ‖z − ẑ‖² = 4ρ² − 4 ‖x − (z + ẑ)/2‖² ≤ 0.
Thus, ‖z − ẑ‖ = 0, and hence, z = ẑ.

Remark 1.2.1. Theorem 1.2.1 does not hold in the setting of Banach spaces. For example, c0
is a closed subspace of ℓ∞, but the closest element of c0 to the sequence {1, 1, 1, . . .} is not
unique. In fact, the distance from the sequence {1, 1, 1, . . .} to c0 is 1, and it is attained by
every sequence {xn} ∈ c0 with xn ∈ [0, 2] for all n.

Theorem 1.2.2. Let K be a closed subspace of a Hilbert space X and x ∈ X be given.
Then there exists a unique z ∈ K which satisfies (1.12), and x − z is orthogonal to K, that is,
x − z ∈ K⊥.

Proof. Existence. The existence of z ∈ K follows from the previous theorem, as every closed
subspace is closed and convex.

Orthogonality. Clearly, ⟨x − z, 0⟩ = 0. Take y ∈ K, y ≠ 0. We shall prove that
⟨x − z, y⟩ = 0. Since z ∈ K satisfies (1.12) and z + λy ∈ K for every scalar λ (as K is a
subspace), we have
    ‖x − z‖² ≤ ‖x − (z + λy)‖² = ‖x − z‖² + |λ|²‖y‖² − λ⟨y, x − z⟩ − λ̄⟨x − z, y⟩,
that is,
    0 ≤ |λ|²‖y‖² − λ⟨y, x − z⟩ − λ̄⟨x − z, y⟩.
Putting λ = ⟨x − z, y⟩/‖y‖² in the above inequality, we obtain
    |⟨x − z, y⟩|² / ‖y‖² ≤ 0,
which happens only when ⟨x − z, y⟩ = 0. Since y was arbitrary, x − z is orthogonal to K.

Uniqueness. Suppose that there is also ẑ ∈ K such that x − ẑ ∈ K⊥. Then z − ẑ =
(x − ẑ) − (x − z) ∈ K⊥. On the other hand, z − ẑ ∈ K since z, ẑ ∈ K and K is a subspace.
So, z − ẑ ∈ K ∩ K⊥ ⊆ {0}. Therefore, z − ẑ = 0, and hence, z = ẑ.

Lemma 1.2.1. If K is a proper closed subspace of a Hilbert space X, then there exists
a nonzero vector x ∈ X such that x ⊥ K.

Proof. Let u ∉ K and ρ = inf_{y∈K} ‖u − y‖, the distance from u to K. By Theorem 1.2.1,
there exists a unique element z ∈ K such that ‖u − z‖ = ρ. Let x = u − z. Then x ≠ 0,
since ρ > 0 (because K is closed and u ∉ K; if x = 0, then u − z = 0 and ‖u − z‖ = 0 would
give ρ = 0).

Now, we show that x ⊥ K. For this, we show that ⟨x, y⟩ = 0 for arbitrary y ∈ K. For
any scalar α, we have ‖x − αy‖ = ‖u − z − αy‖ = ‖u − (z + αy)‖. Since K is a subspace,
z + αy ∈ K whenever z, y ∈ K. Thus ‖x − αy‖ ≥ ρ = ‖x‖, so ‖x − αy‖² − ‖x‖² ≥ 0, that is,
⟨x − αy, x − αy⟩ − ‖x‖² ≥ 0. Since
    ⟨x − αy, x − αy⟩ = ⟨x, x⟩ − α⟨y, x⟩ − ᾱ⟨x, y⟩ + αᾱ⟨y, y⟩ = ‖x‖² − ᾱ⟨x, y⟩ − α⟨y, x⟩ + |α|²‖y‖²,
we have
    −ᾱ⟨x, y⟩ − α⟨y, x⟩ + |α|²‖y‖² ≥ 0.
Putting α = β⟨x, y⟩ in the above inequality, β being an arbitrary real number, we get
    −2β|⟨x, y⟩|² + β²|⟨x, y⟩|²‖y‖² ≥ 0.
If we put a = |⟨x, y⟩|² and b = ‖y‖² in the above inequality, we obtain
    −2βa + β²ab ≥ 0,  or  βa(βb − 2) ≥ 0,  for all real β.
If a > 0, the above inequality is false for all sufficiently small positive β. Hence, a must be
zero, that is, a = |⟨x, y⟩|² = 0, so ⟨x, y⟩ = 0 for all y ∈ K.

Lemma 1.2.2. If M and N are closed subspaces of a Hilbert space X such that M ⊥ N,
then the subspace M + N = {x + y ∈ X : x ∈ M and y ∈ N} is also closed.

Proof. It is a well-known result of vector spaces that M + N is a subspace of X. We show


that it is closed, that is, every limit point of M + N belongs to it. Let z be an arbitrary limit
point of M + N. Then there exists a sequence {zn } of points of M + N such that zn → z.
M ⊥ N implies that M ∩ N = {0}. So, every zn ∈ M + N can be written uniquely in the
form zn = xn + yn , where xn ∈ M and yn ∈ N.

By the Pythagorean theorem applied to the elements (xm − xn) and (ym − yn), we have
    ‖zm − zn‖² = ‖(xm − xn) + (ym − yn)‖² = ‖xm − xn‖² + ‖ym − yn‖²    (1.13)
(it is clear that (xm − xn) ⊥ (ym − yn) for all m, n). Since {zn} is convergent, it is a Cauchy
sequence and so ‖zm − zn‖² → 0. Hence, from (1.13), we see that ‖xm − xn‖ → 0 and
‖ym − yn‖ → 0 as m, n → ∞. Hence, {xn} and {yn} are Cauchy sequences in M and N,
respectively. Being closed subspaces of a complete space, M and N are themselves complete.
Thus, {xn} and {yn} converge in M and N, respectively, say xn → x ∈ M and yn → y ∈ N,
and x + y ∈ M + N. Then
    z = lim_{n→∞} zn = lim_{n→∞} (xn + yn) = lim_{n→∞} xn + lim_{n→∞} yn = x + y ∈ M + N.
This proves that an arbitrary limit point of M + N belongs to it, and so M + N is closed.

Definition 1.2.2. A vector space X is said to be the direct sum of two subspaces Y and Z
of X, denoted by X = Y ⊕ Z, if each x ∈ X has a unique representation x = y + z for y ∈ Y
and z ∈ Z.

Theorem 1.2.3 (Orthogonal Decomposition). If K is a closed subspace of a Hilbert


space X, then every x ∈ X can be uniquely represented as x = z + y for z ∈ K and
y ∈ K ⊥ , that is, X = K ⊕ K ⊥ .

Proof. Since every closed subspace is a closed convex set, by the previous two results, for every
x ∈ X there is a z ∈ K such that x − z ∈ K⊥; that is, there is a y ∈ K⊥ such that y = x − z,
which is equivalent to x = z + y with z ∈ K and y ∈ K⊥.

To prove the uniqueness, assume that there is also ŷ ∈ K ⊥ such that x = ŷ + ẑ for ẑ ∈ K.
Then x = y + z = ŷ + ẑ, and therefore, y − ŷ = ẑ − z. Since y − ŷ ∈ K ⊥ whereas ẑ − z ∈ K,
we have y − ŷ ∈ K ∩ K ⊥ = {0}. This implies that y = ŷ, and hence also z = ẑ.

[Figure 1.5: Orthogonal decomposition of x as x = z + y with z = PK(x) ∈ K and y = PK⊥(x) ∈ K⊥.]



Example 1.2.1. (a) Let X = L2 (−1, 1). Then X = K ⊕ K ⊥ , where K is the space of
even functions, that is,
K = {f ∈ L2 (−1, 1) : f (−t) = f (t) for all t ∈ (−1, 1)},
and K ⊥ is the space of odd functions, that is,
K ⊥ = {f ∈ L2 (−1, 1) : f (−t) = −f (t) for all t ∈ (−1, 1)}.

(b) Let X = L2 [a, b]. For c ∈ [a, b], let


K = {f ∈ L2 [a, b] : f (t) = 0 almost everywhere in (a, c)}
and
K ⊥ = {f ∈ L2 [a, b] : f (t) = 0 almost everywhere in (c, b)}.
Then X = K ⊕ K ⊥ .
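
Part (a) of Example 1.2.1 can be made concrete numerically. The sketch below is my own illustration (assuming functions represented on a grid symmetric about 0): every f splits into its even part, which plays the role of PK f, and its odd part, which lies in K⊥.

```python
import numpy as np

N = 2000
t = np.linspace(-1.0, 1.0, N)               # symmetric grid on (-1, 1)
dt = t[1] - t[0]
f = np.exp(t) + t**3                         # a sample function

f_even = 0.5 * (f + f[::-1])                 # P_K f       : even part
f_odd  = 0.5 * (f - f[::-1])                 # (I - P_K) f : odd part, lies in K-perp

print(np.allclose(f, f_even + f_odd))        # True: f = P_K f + (I - P_K) f
print(abs(np.sum(f_even * f_odd) * dt) < 1e-10)   # ~0: <even, odd> = 0 in L^2(-1, 1)
```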
Exercise 1.2.1. Give examples of representations of R3 as a direct sum of a subspace and
its orthogonal complement.
Exercise 1.2.2. Let K be a subspace of an inner product space X. Show that x ∈ K ⊥ if
and only if kx − yk ≥ kxk for all y ∈ K.
Definition 1.2.3. Let K be a closed subspace of a Hilbert space X. A mapping PK : X → K
defined by
PK (x) = z, where x = z + y and (z, y) ∈ K × K ⊥ ,
is called the orthogonal projection of X onto K.
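
In R^n (or, more generally, for a finite-dimensional closed subspace K = span{b1, . . . , bk}), the orthogonal projection PK of Definition 1.2.3 can be computed by solving the normal equations. A brief sketch of mine, assuming the standard inner product:

```python
import numpy as np

def orthogonal_projection(B, x):
    """Project x onto K = span of the columns of B (assumed independent):
    P_K(x) = B (B^T B)^{-1} B^T x, obtained here via least squares."""
    coeff, *_ = np.linalg.lstsq(B, x, rcond=None)
    return B @ coeff

B = np.array([[1.0, 0.0],
              [2.0, 2.0],
              [2.0, 3.0]])                 # K = span of two vectors in R^3
x = np.array([1.0, 1.0, 1.0])

z = orthogonal_projection(B, x)            # z = P_K(x) in K
y = x - z                                  # y = x - z in K-perp
print(np.allclose(B.T @ y, 0.0))           # True: x - P_K(x) is orthogonal to K
```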

Let X and Y be normed spaces and T : X → Y be an operator.

(a) The range of T is R(T ) := {T (x) ∈ Y : x ∈ X}.

(b) The null space or kernel of T is N (T ) := {x ∈ X : T (x) = 0}.

(c) The operator T is called an idempotent if T 2 = T .

Let X be a vector space. A linear operator P : X → X is called a projection operator if P ◦ P = P² = P.

Theorem 1.2.4. If P : X → X is a projection operator from a vector space X to itself, then
X = R(P) ⊕ N(P), where R(P) is the range of P and N(P) = {x ∈ X : P(x) = 0} is the null space of P.

Theorem 1.2.5. If a vector space X is expressed as the direct sum of its subspaces Y and Z, then
there is a uniquely determined projection P : X → X such that Y = R(P) and Z = N(P) = R(I − P),
where I is the identity mapping on X.

Theorem 1.2.6 (Existence of Projection Mapping). Let K be a closed subspace of a
Hilbert space X. Then there exists a unique projection PK from X onto K such that
R(PK) = K and N(PK) = K⊥.

Proof. By Theorem 1.2.3, X = K ⊕ K ⊥ . Theorem 1.2.5 ensures the existence of a unique


projection PK such that R(PK ) = K and N (PK ) = K ⊥ . This projection is an orthogonal
projection as its null space and range are orthogonal.

Similarly, it can be verified that the orthogonal projection I − PK corresponds to the case
R(I − PK ) = K ⊥ and N (I − PK ) = K.

Exercise 1.2.3. Let K be a closed subspace of a Hilbert space X and I be the identity
mapping on X. Then prove that there exists a unique mapping PK from X onto K such
that I − PK maps X onto K ⊥ .

Such map PK is the projection mapping of X onto K.

Exercise 1.2.4 (Properties of Projection Mapping). Let K be a closed subspace of a Hilbert


space X, I be the identity mapping on X and PK is the projection mapping from X onto
K. Then prove that the following properties hold for all x, y ∈ X.

(a) Each element x ∈ X has a unique representation as a sum of an element of K and an


element of K ⊥ , that is,
x = PK (x) + (I − PK )(x). (1.14)
(Hint: Compare with Theorem 1.2.3)

(b) kxk2 = kPK (x)k2 + k(I − PK )(x)k2 .

(c) x ∈ K if and only if PK (x) = x.

(d) x ∈ K ⊥ if and only if PK (x) = 0.

(e) If K1 and K2 are closed subspaces of X such that K1 ⊆ K2 , then PK1 (PK2 (x)) =
PK1 (x).

(f) PK is a linear mapping, that is, for all α, β ∈ R and all x, y ∈ X, PK (αx + βy) =
αPK (x) + βPK (y).

(g) PK is a continuous mapping, that is, xn → x (i.e., ‖xn − x‖ → 0) implies
PK(xn) → PK(x) (i.e., ‖PK(xn) − PK(x)‖ → 0) as n → ∞.

Exercise 1.2.5 (Properties of Projection Mapping). Let K be a closed subspace of a Hilbert


space X, I be the identity mapping on X and PK is the projection mapping from X onto
K. Then prove that the following properties hold for all x, y ∈ X.

(a) Each element z ∈ X can be written uniquely as

z = x + y, where x ∈ R(PK ) and y ∈ N (PK ).

(b) The null space N (PK ) and the range set R(PK ) are closed subspaces of X.

(c) N (PK ) = (R(PK ))⊥ and R(PK ) = N (PK )⊥ .

(d) PK is idempotent.

Exercise 1.2.6. Let K1 and K2 be closed subspaces of a Hilbert space X and PK1 and PK2
be orthogonal projections onto K1 and K2 , respectively. If hx, yi = 0 for all x ∈ K1 and
y ∈ K2 , then prove that

(a) K1 + K2 is a closed subspace of X;

(b) PK1 + PK2 is the orthogonal projection onto K1 + K2 ;

(c) PK1 PK2 ≡ 0 ≡ PK2 PK1 .

1.2.2 Projection on Convex Sets

We discuss here the concepts of projection and projection operator on convex sets which are
of vital importance in such diverse fields as optimization, optimal control and variational
inequalities.

Definition 1.2.4. Let K be a nonempty closed convex subset of a Hilbert space X. For
x ∈ X, by the projection of x on K we mean the element z ∈ K, denoted by PK(x), such that
    ‖x − PK(x)‖ ≤ ‖x − y‖,  for all y ∈ K,    (1.15)
equivalently,
    ‖x − z‖ = inf_{y∈K} ‖x − y‖.    (1.16)
The operator PK from X into K defined by PK(x) = z, where z is the projection of x on K,
is called the projection operator.

In view of Theorem 1.2.1, there always exists a unique z ∈ K which satisfies (1.16).

Theorem 1.2.7 (Variational Characterization of Projection). Let K be a nonempty
closed convex subset of a Hilbert space X. For any x ∈ X, z ∈ K is the projection of x
if and only if
    ⟨x − z, y − z⟩ ≤ 0,  for all y ∈ K.    (1.17)

Proof. Let z be the projection of x ∈ X. Since K is convex, αy + (1 − α)z ∈ K for all y ∈ K
and all α with 0 ≤ α ≤ 1. Define a real-valued function g : [0, 1] → R by
    g(α) := ‖x − (αy + (1 − α)z)‖²,  for all α ∈ [0, 1].    (1.18)
Then g is a twice continuously differentiable function of α. Moreover,
    g′(α) = 2⟨x − αy − (1 − α)z, z − y⟩,  g″(α) = 2⟨z − y, z − y⟩.    (1.19)

[Figure 1.6: The projection of a point x onto K.]

Now, if z is the projection of x, then g attains its minimum over [0, 1] at α = 0, so
g′(0) ≥ 0, which is (1.17).

In order to prove the converse, let (1.17) be satisfied for some element z ∈ K. This implies
that g′(0) is non-negative, and by (1.19), g″(α) is non-negative, so g′(α) ≥ 0 on [0, 1]. Hence
g(0) ≤ g(1) = ‖x − y‖² for all y ∈ K, so that (1.16) is satisfied.

Remark 1.2.2. The inequality (1.17) shows that x − z and y − z subtend a non-acute angle
between them. The projection PK(x) of x on K can be interpreted as the result of applying
to x the operator PK : X → K, which is called the projection operator. Note that PK(x) = x
for all x ∈ K.
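
For a simple closed convex set such as a box K = [l1, u1] × · · · × [ln, un] in R^n, the projection has a closed form (componentwise clipping). The sketch below is my own illustration (not from the notes); it checks the variational characterization (1.17) and the nonexpansiveness of Theorem 1.2.8 numerically.

```python
import numpy as np

lo, hi = np.array([0.0, 0.0]), np.array([1.0, 1.0])    # K = [0,1] x [0,1]

def P_K(x):
    """Projection onto the box K: clip each coordinate."""
    return np.clip(x, lo, hi)

rng = np.random.default_rng(1)
x1, x2 = rng.normal(size=2) * 3, rng.normal(size=2) * 3
z1, z2 = P_K(x1), P_K(x2)

# Variational inequality (1.17): <x - z, y - z> <= 0 for sample points y in K.
ys = rng.uniform(lo, hi, size=(1000, 2))
print(np.max((ys - z1) @ (x1 - z1)) <= 1e-12)          # True

# Nonexpansiveness (Theorem 1.2.8 (a)): ||P_K(x1) - P_K(x2)|| <= ||x1 - x2||.
print(np.linalg.norm(z1 - z2) <= np.linalg.norm(x1 - x2) + 1e-12)   # True
```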

Theorem 1.2.8. The projection operator PK from a Hilbert space X onto its nonempty
closed convex subset K has the following properties:

(a) PK is nonexpansive, that is, ‖PK(x) − PK(y)‖ ≤ ‖x − y‖ for all x, y ∈ X; in particular,
PK is continuous.

(b) ⟨PK(x) − PK(y), x − y⟩ ≥ 0 for all x, y ∈ X.

Proof. (a) From (1.17), we obtain
    ⟨PK(x) − x, PK(x) − y⟩ ≤ 0,  for all y ∈ K.    (1.20)
Putting x = x1 in (1.20), we get
    ⟨PK(x1) − x1, PK(x1) − y⟩ ≤ 0,  for all y ∈ K.    (1.21)
Putting x = x2 in (1.20), we get
    ⟨PK(x2) − x2, PK(x2) − y⟩ ≤ 0,  for all y ∈ K.    (1.22)
Since PK(x2), PK(x1) ∈ K, we may choose y = PK(x2) in (1.21) and y = PK(x1) in (1.22),
obtaining
    ⟨PK(x1) − x1, PK(x1) − PK(x2)⟩ ≤ 0,
    ⟨PK(x2) − x2, PK(x2) − PK(x1)⟩ ≤ 0.
Adding the above two inequalities, we obtain
    ⟨PK(x1) − x1 − PK(x2) + x2, PK(x1) − PK(x2)⟩ ≤ 0,
or
    ⟨PK(x1) − PK(x2), PK(x1) − PK(x2)⟩ ≤ ⟨x1 − x2, PK(x1) − PK(x2)⟩,
equivalently,
    ‖PK(x1) − PK(x2)‖² ≤ ⟨x1 − x2, PK(x1) − PK(x2)⟩.    (1.23)
Therefore, by the Cauchy-Schwarz-Bunyakovsky inequality, we get
    ‖PK(x1) − PK(x2)‖² ≤ ‖x1 − x2‖ ‖PK(x1) − PK(x2)‖,    (1.24)
and hence,
    ‖PK(x1) − PK(x2)‖ ≤ ‖x1 − x2‖.    (1.25)

(b) follows from (1.23).

The geometric interpretation of the nonexpansiveness of PK is given in Figure 1.7. We
observe that if strict inequality holds in (a), then the projection operator PK reduces the
distance; if equality holds in (a), then the distance is preserved.
[Figure 1.7: The nonexpansiveness of the projection operator: points x, x̃, y, ỹ and their projections PK(x), PK(x̃), PK(y), PK(ỹ) onto K.]

1.3 Bilinear Forms and Lax-Milgram Lemma

Let X and Y be inner product spaces over the same field K (= R or C). A functional
a(·, ·) : X × Y → K will be called a form.

Definition 1.3.1. Let X and Y be inner product spaces over the same field K (= R or C).
A form a(·, ·) : X × Y → K is called a sesquilinear functional or sesquilinear form if the
following conditions are satisfied for all x, x1 , x2 ∈ X, y, y1, y2 ∈ Y and all α, β ∈ K:

(i) a(x1 + x2 , y) = a(x1 , y) + a(x2 , y).

(ii) a(αx, y) = αa(x, y).

(iii) a(x, y1 + y2 ) = a(x, y1 ) + a(x, y2 ).

(iv) a(x, βy) = β̄ a(x, y), where β̄ denotes the complex conjugate of β.

Remark 1.3.1. (a) A sesquilinear functional is linear in the first variable and conjugate-linear
in the second variable. A sesquilinear functional which is also linear in the second
variable is called a bilinear form or a bilinear functional. Thus, a bilinear form a(·, ·) is
a mapping from X × Y into K which satisfies conditions (i)-(iii) of the above
definition together with a(x, βy) = βa(x, y).

(b) If X and Y are real inner product spaces, then the concepts of sesquilinear functional
and bilinear form coincide.

(c) An inner product is an example of a sesquilinear functional. The real inner product is
an example of a bilinear form.

(d) If a(·, ·) is a sesquilinear functional, then the functional g defined by taking g(x, y) to be the complex conjugate of a(y, x) is also a sesquilinear functional.


Definition 1.3.2. Let X and Y be inner product spaces. A form a(·, ·) : X × Y → K is
called:

(a) symmetric if a(x, y) = a(y, x) for all (x, y) ∈ X × Y ;


(b) bounded or continuous if there exists a constant M > 0 such that
    |a(x, y)| ≤ M ‖x‖ ‖y‖,  for all x ∈ X, y ∈ Y,
and the norm of a is defined as
    ‖a‖ = sup_{x≠0, y≠0} |a(x, y)| / (‖x‖ ‖y‖) = sup_{x≠0, y≠0} | a( x/‖x‖ , y/‖y‖ ) | = sup_{‖x‖=‖y‖=1} |a(x, y)|.

It is clear that |a(x, y)| ≤ ‖a‖ ‖x‖ ‖y‖.


Remark 1.3.2. Let a(·, ·) : X × Y → K be a continuous form and {xn} and {yn} be sequences
in X and Y, respectively, such that xn → x and yn → y. Then a(xn, yn) → a(x, y).

Indeed,
    |a(xn, yn) − a(x, y)| ≤ |a(xn − x, yn)| + |a(x, yn − y)| ≤ ‖a‖ ( ‖xn − x‖ ‖yn‖ + ‖x‖ ‖yn − y‖ ).
Definition 1.3.3. Let X be an inner product space. A form a(·, ·) : X × X → K is called:

(a) positive if a(x, x) ≥ 0 for all x ∈ X;

(b) positive definite if a(x, x) ≥ 0 for all x ∈ X and a(x, x) = 0 implies that x = 0;

(c) coercive or X-elliptic if there exists a constant α > 0 such that a(x, x) ≥ α‖x‖² for all
x ∈ X.
Example 1.3.1. Let X = Rn with the usual Euclidean inner product. Then any n × n
matrix with real entries defines a continuous bilinear form.

If A = (aij), 1 ≤ i, j ≤ n, and if x = (x1, . . . , xn) and y = (y1, . . . , yn), then the
bilinear form is defined as
    a(x, y) := Σ_{i,j=1}^n aij xj yi = yᵀAx,
where x and y are regarded as column vectors and yᵀ denotes the transpose of y. By the
Cauchy-Schwarz inequality, we have
    |a(x, y)| = |yᵀAx| = |⟨y, Ax⟩| ≤ ‖y‖ ‖Ax‖ ≤ ‖A‖ ‖x‖ ‖y‖.

If A is a symmetric and positive definite matrix, then the bilinear form is symmetric and
coercive, since
    a(y, y) = Σ_{i,j=1}^n aij yj yi ≥ α‖y‖²,
where α > 0 is the smallest eigenvalue of the matrix A.
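
A small numerical check of Example 1.3.1 (a sketch of mine, not part of the notes): for a symmetric positive definite matrix A, the form a(x, y) = yᵀAx is bounded, and it is coercive with constant α equal to the smallest eigenvalue of A.

```python
import numpy as np

rng = np.random.default_rng(2)
G = rng.standard_normal((4, 4))
A = G @ G.T + 4 * np.eye(4)           # symmetric positive definite

def a(x, y):
    return y @ A @ x                   # bilinear form a(x, y) = y^T A x

alpha = np.linalg.eigvalsh(A).min()    # smallest eigenvalue of A

xs = rng.standard_normal((2000, 4))
ratios = np.array([a(x, x) / (x @ x) for x in xs])
print(ratios.min() >= alpha - 1e-9)    # coercivity: a(x, x) >= alpha ||x||^2
print(abs(a(xs[0], xs[1])) <= np.linalg.norm(A, 2)
      * np.linalg.norm(xs[0]) * np.linalg.norm(xs[1]) + 1e-9)   # boundedness
```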

Theorem 1.3.1 (Extended Form of the Riesz Representation Theorem). Let X and Y be
Hilbert spaces and a(·, ·) : X × Y → K be a bounded sesquilinear form. Then there exists
a unique bounded linear operator T : X → Y such that

    a(x, y) = ⟨T(x), y⟩,  for all (x, y) ∈ X × Y,    (1.26)

and ‖a‖ = ‖T‖.

Proof. For each fixed x ∈ X, define a functional fx : Y → K by

    fx(y) = a(x, y),  for all y ∈ Y.    (1.27)

By conditions (iii) and (iv) of Definition 1.3.1, fx is additive and conjugate-homogeneous:
for all y1, y2, y ∈ Y and α ∈ K,
    fx(y1 + y2) = a(x, y1 + y2) = a(x, y1) + a(x, y2) = fx(y1) + fx(y2),
    fx(αy) = a(x, αy) = ᾱ a(x, y) = ᾱ fx(y).
Since a(·, ·) is bounded, we have
    |fx(y)| = |a(x, y)| ≤ ‖a‖ ‖x‖ ‖y‖,
that is, ‖fx‖ ≤ ‖a‖ ‖x‖. Thus fx is a bounded conjugate-linear functional on Y. Applying the
Riesz representation theorem³ to the complex conjugate of fx, there exists a unique vector
y* ∈ Y such that

    fx(y) = ⟨y*, y⟩,  for all y ∈ Y.    (1.28)

The vector y* depends on the choice of the vector x. Therefore, we can write y* = T(x),
where T : X → Y. We observe that
    a(x, y) = ⟨T(x), y⟩,  for all x ∈ X and y ∈ Y.
Since y* is unique for each x, the operator T is uniquely determined by (1.28).

³ Riesz Representation Theorem. If f is a bounded linear functional on a Hilbert space X, then
there exists a unique vector y ∈ X such that f(x) = ⟨x, y⟩ for all x ∈ X, and ‖f‖ = ‖y‖.

The operator T is linear in view of the following relations: for all y ∈ Y, x, x1, x2 ∈ X and
α ∈ K, we have
    ⟨T(x1 + x2), y⟩ = a(x1 + x2, y) = a(x1, y) + a(x2, y) = ⟨T(x1), y⟩ + ⟨T(x2), y⟩,
    ⟨T(αx), y⟩ = a(αx, y) = α a(x, y) = α⟨T(x), y⟩ = ⟨αT(x), y⟩.

Moreover, T is bounded (hence continuous), since
    ‖T(x)‖ = ‖y*‖ = ‖fx‖ ≤ ‖a‖ ‖x‖
implies that ‖T‖ ≤ ‖a‖.

To prove that ‖T‖ = ‖a‖, it is enough to show that ‖T‖ ≥ ‖a‖, which follows from the
following relation:
    ‖a‖ = sup_{x≠0, y≠0} |a(x, y)| / (‖x‖ ‖y‖) = sup_{x≠0, y≠0} |⟨T(x), y⟩| / (‖x‖ ‖y‖)
        ≤ sup_{x≠0, y≠0} ‖T(x)‖ ‖y‖ / (‖x‖ ‖y‖) = ‖T‖.

To prove the uniqueness of T, let us assume that there is another linear operator S : X → Y
such that
    a(x, y) = ⟨S(x), y⟩,  for all (x, y) ∈ X × Y.
Then, for every x ∈ X and y ∈ Y, we have
    ⟨T(x), y⟩ = ⟨S(x), y⟩,  equivalently,  ⟨(T − S)(x), y⟩ = 0.
This implies that (T − S)(x) = 0 for all x ∈ X, that is, T ≡ S. This proves that there exists
a unique bounded linear operator T such that
    a(x, y) = ⟨T(x), y⟩.
Remark 1.3.3 (Converse of the above theorem). Let X and Y be Hilbert spaces and T : X → Y
be a bounded linear operator. Then the form a(·, ·) : X × Y → K defined by

    a(x, y) = ⟨T(x), y⟩,  for all (x, y) ∈ X × Y,    (1.29)

is a bounded sesquilinear form on X × Y.

Proof. Since T is a bounded linear operator from X to Y and the inner product is a sesquilinear
mapping, a(x, y) = ⟨T(x), y⟩ is sesquilinear.

Since |a(x, y)| = |⟨T(x), y⟩| ≤ ‖T‖ ‖x‖ ‖y‖ by the Cauchy-Schwarz-Bunyakovsky inequality,
we have sup_{‖x‖=‖y‖=1} |a(x, y)| ≤ ‖T‖, and hence a(·, ·) is bounded.

Corollary 1.3.1. Let X be a Hilbert space and T : X → X be a bounded linear operator.
Then the complex-valued function b(·, ·) : X × X → C defined by b(x, y) = ⟨x, T(y)⟩ is a
bounded bilinear form on X and ‖b‖ = ‖T‖.

Conversely, if b(·, ·) : X × X → C is a bounded bilinear form, then there is a unique
bounded linear operator T : X → X such that b(x, y) = ⟨x, T(y)⟩ for all (x, y) ∈ X × X.

Proof. Define a function a(·, ·) : X × X → C by
    a(x, y) = b(y, x) = ⟨T(x), y⟩.
By Theorem 1.3.1, a(x, y) is a bounded bilinear form on X and ‖a‖ = ‖T‖. Since
b(x, y) = a(y, x), b is also a bounded bilinear form on X and
    ‖b‖ = sup_{‖x‖=‖y‖=1} |b(x, y)| = sup_{‖x‖=‖y‖=1} |a(y, x)| = ‖a‖ = ‖T‖.

Conversely, if b is given, we define a bounded bilinear form a(·, ·) : X × X → C by
    a(x, y) = b(y, x),  for all x, y ∈ X.
Again, by Theorem 1.3.1, there is a bounded linear operator T on X such that
    a(x, y) = ⟨T(x), y⟩,  for all (x, y) ∈ X × X.
Therefore, we have b(x, y) = a(y, x) = ⟨T(y), x⟩ = ⟨x, T(y)⟩ for all (x, y) ∈ X × X.

Corollary 1.3.2. Let X be a Hilbert space. If T is a bounded linear operator on X,
then
    ‖T‖ = sup_{‖x‖=‖y‖=1} |⟨x, T(y)⟩| = sup_{‖x‖=‖y‖=1} |⟨T(x), y⟩|.

Proof. By Theorem 1.3.1, for every bounded linear operator T on X, there is a bounded
bilinear form a such that a(x, y) = ⟨T(x), y⟩ and ‖a‖ = ‖T‖. Then,
    ‖a‖ = sup_{‖x‖=‖y‖=1} |a(x, y)| = sup_{‖x‖=‖y‖=1} |⟨T(x), y⟩|.
From this, we conclude that ‖T‖ = sup_{‖x‖=‖y‖=1} |⟨T(x), y⟩|; the first equality follows
similarly from Corollary 1.3.1.

Definition 1.3.4. Let X be a Hilbert space and a(·, ·) : X × X → K be a form. Then the
functional F : X → K defined by F(x) = a(x, x) for all x ∈ X is called the quadratic form
associated with a(·, ·).

A quadratic form F is called real if F(x) is real for all x ∈ X.


Remark 1.3.4. (a) We immediately observe that F(αx) = |α|²F(x) and |F(x)| ≤ ‖a‖ ‖x‖².
(b) The norm of F is defined as
    ‖F‖ = sup_{x≠0} |F(x)| / ‖x‖² = sup_{‖x‖=1} |F(x)|.

Remark 1.3.5. Let a(·, ·) be any fixed sesquilinear form on a Hilbert space X and let F be
the associated quadratic form. Then

(a) (1/2)[a(x, y) + a(y, x)] = F((x + y)/2) − F((x − y)/2);

(b) a(x, y) = (1/4)[F(x + y) − F(x − y) + iF(x + iy) − iF(x − iy)]  (the polarization identity).

Verification. By using the sesquilinearity of a, we have
    F(x + y) = a(x + y, x + y) = a(x, x) + a(y, x) + a(x, y) + a(y, y)
and
    F(x − y) = a(x − y, x − y) = a(x, x) − a(y, x) − a(x, y) + a(y, y).
Subtracting the second of the above equations from the first, we get
    F(x + y) − F(x − y) = 2a(x, y) + 2a(y, x).    (1.30)
Replacing y by iy in (1.30), we obtain
    F(x + iy) − F(x − iy) = 2a(x, iy) + 2a(iy, x) = −2i a(x, y) + 2i a(y, x).    (1.31)
Multiplying (1.31) by i and adding it to (1.30), we get (b); (a) follows from (1.30) and the
relation F((x ± y)/2) = (1/4)F(x ± y).

Lemma 1.3.1. A bilinear form a(·, ·) : X × X → K is symmetric if and only if the
associated quadratic functional F(x) is real.

Proof. If a(x, y) is symmetric, then F(x) = a(x, x) coincides with its own complex conjugate,
and hence F(x) is real.

Conversely, let F(x) be real. Since F(−x) = F(x) and F(ix) = F(x) by Remark 1.3.4 (a),
the polarization identity of Remark 1.3.5 (b) gives
    a(y, x) = (1/4)[F(y + x) − F(y − x) + iF(y + ix) − iF(y − ix)]
            = (1/4)[F(x + y) − F(x − y) + iF(x − iy) − iF(x + iy)],
and, since F is real-valued, this is the complex conjugate of
    (1/4)[F(x + y) − F(x − y) + iF(x + iy) − iF(x − iy)] = a(x, y).
Hence, a(·, ·) is symmetric.

Lemma 1.3.2. A bilinear form a(·, ·) : X × X → K is bounded if and only if the
associated quadratic form F is bounded. If a(·, ·) is bounded, then ‖F‖ ≤ ‖a‖ ≤ 2‖F‖.

Proof. Suppose that a(·, ·) is bounded. Then we have
    sup_{‖x‖=1} |F(x)| = sup_{‖x‖=1} |a(x, x)| ≤ sup_{‖x‖=‖y‖=1} |a(x, y)| = ‖a‖,
and, therefore, F is bounded and ‖F‖ ≤ ‖a‖.

On the other hand, suppose F is bounded. From Remark 1.3.5 (b) and the parallelogram
law, we get
    |a(x, y)| ≤ (1/4)‖F‖ ( ‖x + y‖² + ‖x − y‖² + ‖x + iy‖² + ‖x − iy‖² )
             = (1/4)‖F‖ · 2 ( ‖x‖² + ‖y‖² + ‖x‖² + ‖y‖² )
             = ‖F‖ ( ‖x‖² + ‖y‖² ),
so that
    sup_{‖x‖=‖y‖=1} |a(x, y)| ≤ 2‖F‖.
Thus, a(·, ·) is bounded and ‖a‖ ≤ 2‖F‖.

Theorem 1.3.2. Let X be a Hilbert space and T : X → X be a bounded linear operator.
Then the following statements are equivalent:

(a) T is self-adjoint.

(b) The bilinear form a(·, ·) on X defined by a(x, y) = ⟨T(x), y⟩ is symmetric.

Proof. (a) ⇒ (b): Since T is self-adjoint, F(x) = ⟨T(x), x⟩ = ⟨x, T(x)⟩, which is the complex
conjugate of ⟨T(x), x⟩ = F(x); hence F(x) is real. In view of Lemma 1.3.1, we obtain the
result.

(b) ⇒ (a): ⟨T(x), y⟩ = a(x, y), which by symmetry equals the complex conjugate of
a(y, x) = ⟨T(y), x⟩, that is, ⟨T(x), y⟩ = ⟨x, T(y)⟩ for all x, y ∈ X. This shows that T* ≡ T,
that is, T is self-adjoint.

Theorem 1.3.3. Let X be a Hilbert space. If a bilinear form a(·, ·) : X × X → K is
bounded and symmetric, then ‖a‖ = ‖F‖, where F is the associated quadratic functional.

The following theorem, known as the Lax-Milgram lemma and proved by P. D. Lax and
A. N. Milgram in 1954, has important applications in different fields.
Theorem 1.3.4 (Lax-Milgram Lemma). Let X be a Hilbert space, a(·, ·) : X × X → R
be a coercive bounded bilinear form, and f : X → R be a bounded linear functional. Then
there exists a unique element x ∈ X such that

a(x, y) = f (y), for all y ∈ X. (1.32)

Proof. Since a(·, ·) is bounded, there exists a constant M > 0 such that
|a(x, y)| ≤ Mkxk kyk. (1.33)
By Theorem 1.3.1, there exists a bounded linear operator T : X → X such that
a(x, y) = hT (x), yi, for all (x, y) ∈ X × X.
By the Riesz representation theorem⁴, there exists a unique element of X, again denoted by f, such that f(y) = hf, yi for all y ∈ X. Hence, for any λ > 0, equation (1.32) can be rewritten as
hλT(x), yi = hλf, yi, for all y ∈ X, (1.34)
or
hλT(x) − λf, yi = 0, for all y ∈ X.
This implies that
λT(x) = λf. (1.35)

Since λ > 0, equation (1.35) is equivalent to T(x) = f. We will show that this equation has a unique solution by showing that, for appropriate values of the parameter ρ > 0, the affine mapping y ↦ y − ρ(T(y) − f), y ∈ X, is a contraction mapping. For this, we observe that
ky − ρT(y)k² = hy − ρT(y), y − ρT(y)i
= kyk² − 2ρhT(y), yi + ρ²kT(y)k² (by the inner product axioms)
≤ kyk² − 2ραkyk² + ρ²M²kyk²,
4
Riesz Representation Theorem. If f is a bounded linear functional on a Hilbert space X, then there
exists a unique vector y ∈ X such that f (x) = hx, yi for all x ∈ X and kf k = kyk

since
a(y, y) = hT(y), yi ≥ αkyk² (by the coercivity of a), (1.36)
and
kT(y)k ≤ Mkyk (by the boundedness (1.33)).
Therefore,
ky − ρT(y)k² ≤ (1 − 2ρα + ρ²M²)kyk², (1.37)
or
ky − ρT(y)k ≤ (1 − 2ρα + ρ²M²)^{1/2} kyk. (1.38)
Let S(y) = y − ρ(T(y) − f). Then

kS(y) − S(z)k = k(y − ρ(T(y) − f)) − (z − ρ(T(z) − f))k
= k(y − z) − ρT(y − z)k
≤ (1 − 2ρα + ρ²M²)^{1/2} ky − zk (by (1.38)). (1.39)

This implies that S is a contraction mapping if 0 < 1 − 2ρα + ρ²M² < 1, which is equivalent to the condition that ρ ∈ (0, 2α/M²). Hence, by the Banach contraction fixed point theorem, S has a unique fixed point x, which is the unique solution of T(x) = f, and hence of (1.32).

Remark 1.3.6 (Abstract Variational Problem). Find an element x such that

a(x, y) = f (y), for all y ∈ X,

where a(x, y) and f are as in Theorem 1.3.4.

This problem is known as abstract variational problem. In view of the Lax-Milgram lemma,
it has a unique solution.
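To see the contraction argument of the proof at work, here is a small Python sketch (an added illustration, not from the original notes) for the finite-dimensional case X = Rⁿ with a(x, y) = hAx, yi and f(y) = hb, yi, where A is a coercive, not necessarily symmetric, matrix; the matrix, the vector b, the step size ρ and the iteration count are arbitrary illustrative choices.

    # Lax-Milgram via the contraction S(x) = x - rho (T(x) - f) on R^n.
    import numpy as np

    n = 5
    rng = np.random.default_rng(1)
    G = rng.standard_normal((n, n))
    A = 2.0 * np.eye(n) + 0.5 * (G - G.T)   # symmetric part is 2I, so a is coercive
    b = rng.standard_normal(n)

    alpha = np.linalg.eigvalsh(0.5 * (A + A.T)).min()   # a(x, x) >= alpha ||x||^2
    M = np.linalg.norm(A, 2)                            # |a(x, y)| <= M ||x|| ||y||
    rho = alpha / M**2                                  # any rho in (0, 2*alpha/M**2) works

    x = np.zeros(n)
    for _ in range(500):
        x = x - rho * (A @ x - b)                       # one contraction step

    print(np.linalg.norm(A @ x - b))   # close to 0: a(x, y) = f(y) for all y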
2  Spectral Theory of Continuous and Compact Linear Operators

Let X and Y be linear spaces and T : X → Y be a linear operator. Recall that the range
R(T ) and null space N (T ) of T are defined, respectively, as

R(T ) = {T (x) : x ∈ X} and N (T ) = {x ∈ X : T (x) = 0}.

The dimension of R(T ) is called the rank of T and the dimension of N (T ) is called the
nullity of T .

It can be easily seen that a linear operator T : X → Y is one-one if and only if N (T ) = {0}.

Recall that

c = space of convergent sequences of real or complex numbers


= {{xn } ⊆ K : {xn } is convergent}
c0 = space of convergent sequences of real or complex numbers that converge to zero
= {{xn } ⊆ K : xn → 0 as n → ∞}
c00 = space of sequences with only finitely many nonzero terms
= ⋃_{k=1}^{∞} {{x1, x2, . . .} ⊆ K : xj = 0 for all j ≥ k}
ℓ∞ = space of bounded sequences = {{xn} ⊆ K : sup_{n∈N} |xn| < ∞}

Clearly, c00 ⊆ c0 ⊆ c ⊆ ℓ∞ .


2.1 Compact Linear Operators on Normed Spaces

Recall that the set {T(x) : kxk ≤ 1} is bounded whenever T : X → Y is a bounded linear operator from a normed space X to another normed space Y. Moreover, if T : X → Y is a bounded linear operator of finite rank, then the closure of {T(x) : kxk ≤ 1} is compact, since every closed and bounded subset of a finite dimensional normed space is compact. This is no longer true if the rank of T (that is, the dimension of R(T)) is infinite. For example, for the identity operator I : X → X on an infinite dimensional normed space X, the above set is the closed unit ball {x ∈ X : kxk ≤ 1}, which is not compact.

Let T : X → Y be a bounded linear operator from a normed space X to another normed space Y. Then for any r > 0, we have:

the closure of {T(x) : kxk ≤ r} is compact ⇔ the closure of {T(x) : kxk ≤ 1} is compact;

the closure of {T(x) : kxk < r} is compact ⇔ the closure of {T(x) : kxk < 1} is compact.

Definition 2.1.1 (Compact linear operator). Let X and Y be normed spaces. A linear operator T : X → Y is said to be compact or completely continuous if the image T(M) of every bounded subset M of X is relatively compact, that is, the closure of T(M) is compact for every bounded subset M of X.

Lemma 2.1.1. Let X and Y be normed spaces.

(a) Every compact linear operator T : X → Y is bounded, and hence continuous.

(b) If dimX = ∞, then the identity operator I : X → X (which is always continuous)


is not compact.

Proof. (a) Since the unit sphere S = {x ∈ X : kxk = 1} is bounded and T is a compact linear operator, the closure of T(S) is compact, and hence T(S) is bounded¹. Therefore,

sup kT (x)k < ∞.


kxk=1

Hence T is bounded and so it is continuous.

(b) Note that the closed unit ball B = {x ∈ X : kxk ≤ 1} is bounded. If dim X = ∞, then B cannot be compact². Therefore, the closure of I(B), which equals B, is not compact, that is, I(B) = B is not relatively compact.
1
Every compact subset of a normed space is closed and bounded
2
The normed space X is finite dimensional if and only if the closed unit ball is compact

Exercise 2.1.1. Let X and Y be normed spaces and T : X → Y be a linear operator. Then prove that the following statements are equivalent.

(a) T is a compact operator.

(b) The closure of {T(x) : kxk < 1} is compact in Y.

(c) The closure of {T(x) : kxk ≤ 1} is compact in Y.

Proof. Clearly, (a) implies (b) and (c). Assume that (c) holds. Let M be a bounded subset of X. Then there exists r > 0 such that M ⊆ {x ∈ X : kxk ≤ r}, and hence
T(M) ⊆ {T(x) : kxk ≤ r} = r{T(x) : kxk ≤ 1}.
Since the closure of the set on the right is compact by (c), and a closed subset of a compact set is compact, the closure of T(M) is compact. Thus (c) implies (a). Finally, for kxk = 1 we have T(x) = lim_{n→∞} T((1 − 1/n)x), so that {T(x) : kxk ≤ 1} is contained in the closure of {T(x) : kxk < 1}; hence the closures in (b) and (c) coincide, and (b) and (c) are equivalent.

Theorem 2.1.1 (Compactness criterion). Let X and Y be normed spaces and T : X →


Y be a linear operator. Then T is compact if and only if it maps every bounded sequence
{xn } in X onto a sequence {T (xn )} in Y which has a convergent subsequence.

Proof. Suppose that T is compact and {xn} is bounded, say kxnk ≤ c for every n ∈ N and some constant c > 0. Let M = {x ∈ X : kxk ≤ c}. Then {T(xn)} is a sequence in the closure of T(M) in Y, which is compact since T is compact, and hence it contains a convergent subsequence.

Conversely, assume that every bounded sequence {xn } contains a subsequence {xnk } such
that {T (xnk )} converges in Y . Let B be a bounded subset of X. To show that T (B) is
compact, it is enough to prove that every sequence in it has a convergent subsequence.
Suppose that {yn } be any sequence in T (B). Then yn = T (xn ) for some xn ∈ B and {xn } is
bounded since B is bounded. By assumption, {T (xn )} contains a convergent subsequence.
Hence T (B) is compact3 because {yn } in T (B) was arbitrary. It shows that T is compact.

Remark 2.1.1. The sum T1 + T2 of two compact linear operators T1, T2 : X → Y is compact. Also, for every scalar α, αT1 is compact. Therefore the set of all compact linear operators from a normed space X to another normed space Y, denoted by K(X, Y), forms a vector space.

Exercise 2.1.2. Let X and Y be normed spaces. Prove that K(X, Y ) is a subspace of
B(X, Y ) the space of all bounded linear operators from X to Y .

Exercise 2.1.3. Let T : X → X be a compact linear operator and S : X → X be a bounded


linear operator on a normed space X. Then prove that T S and ST are compact.
3
A set is compact if every sequence has a convergent subsequence

Proof. Let B be any bounded subset of X. Since S is bounded, S(B) is a bounded set, and T(S(B)) = (TS)(B) is relatively compact because T is compact. Hence TS is a compact linear operator.

To prove that ST is also compact, let {xn} be any bounded sequence in X. Then {T(xn)} has a convergent subsequence {T(xnk)} by Theorem 2.1.1, and, since S is continuous, {S(T(xnk))} is convergent. Hence ST is compact, again by Theorem 2.1.1.
Example 2.1.1. Let 1 ≤ p ≤ ∞ and X = ℓp. Let T : X → X be the right shift operator on X defined by
(T(x))(i) := 0 if i = 1, and (T(x))(i) := x(i − 1) if i > 1.
Since T(en) = en+1 and
ken − emk = 2^{1/p} if 1 ≤ p < ∞, and ken − emk = 1 if p = ∞,
for all n, m ∈ N with n ≠ m, it follows that, corresponding to the bounded sequence {en}, the sequence {T(en)} does not have a convergent subsequence. Hence, by Theorem 2.1.1, the operator T is not compact.
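A quick numerical illustration of this argument for p = 2 (added here as a sketch, not part of the notes): the images T(en) = en+1 remain at mutual distance √2, so {T(en)} can have no convergent subsequence. The truncation length N below is an arbitrary choice.

    # In l^2, the right shift maps e_n to e_{n+1}; these images stay sqrt(2) apart.
    import numpy as np

    N = 50                                  # truncation length (illustrative)
    def e(n):                               # n-th unit sequence, n = 1, 2, ...
        v = np.zeros(N); v[n - 1] = 1.0; return v

    def right_shift(x):
        y = np.zeros_like(x); y[1:] = x[:-1]; return y

    dists = [np.linalg.norm(right_shift(e(n)) - right_shift(e(m)))
             for n in range(1, 10) for m in range(1, 10) if n != m]
    print(min(dists), max(dists))           # both equal sqrt(2) ≈ 1.4142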
Exercise 2.1.4. Prove that the left shift operator on ℓp space is not compact for any p with
1 ≤ p ≤ ∞.
Definition 2.1.2. An operator T ∈ B(X, Y ) with dimT (X) < ∞ is called an operator of
finite rank.

Theorem 2.1.2 (Finite dimensional domain or range). Let X and Y be normed spaces
and T : X → Y be a linear operator.

(a) If T is bounded and dimT (X) < ∞, then the operator T is compact. That is, every
bounded linear operator of finite rank is compact.

(b) If dim(X) < ∞, then the operator T is compact.

Proof. (a) Let {xn } be any bounded sequence in X. Then the inequality kT (xn )k ≤ kT k kxn k
shows that the sequence {T (xn )} is bounded. Hence {T (xn )} is relatively compact4 since
dimT (X) < ∞. It follows that {T (xn )} has a convergent subsequence. Since {xn } was an
arbitrary bounded sequence in X, the operator T is compact by Theorem 2.1.1.

(b) It follows from (a) by noting that dim(X) < ∞ implies the boundedness of T 5 .
Exercise 2.1.5. Prove that the identity operator on a normed space is compact if and only
if the space is of finite dimension.

4
In a finite dimensional space, a set is compact if and only if it is closed and bounded
5
Every linear operator is bounded on a finite dimensional normed space X

Theorem 2.1.3 (Sequence of compact linear operators). Let {Tn } be a sequence of


compact linear operators from a normed space X to a Banach space Y . If {Tn } is
uniformly operator convergent to an operator T (that is, kTn − T k → 0), then the limit
operator T is compact.

Proof. Let {Tn } be a sequence in K(X, Y ) such that kTn − T k → 0 as n → ∞. In order to


prove that T ∈ K(X, Y ), it is enough to show that for any bounded sequence {xn } in X, the
image sequence {T (xn )} has a convergent subsequence, and then apply Theorem 2.1.1.

Let {xn } be a bounded sequence in X, and ε > 0 be given. Since {Tn } is a sequence in
K(X, Y ), there exists N ∈ N such that

kTn − T k < ε, for all n ≥ N.

Since TN ∈ K(X, Y ), there exists a subsequence {x̃n } of {xn } such that {TN (x̃n )} is conver-
gent. In particular, there exists n0 ∈ N such that

kTN (x̃n ) − TN (x̃m )k < ε, for all m, n ≥ n0 .

Hence we obtain, for n, m ≥ n0,

kT(x̃n) − T(x̃m)k ≤ kT(x̃n) − TN(x̃n)k + kTN(x̃n) − TN(x̃m)k + kTN(x̃m) − T(x̃m)k
≤ kT − TNk kx̃nk + kTN(x̃n) − TN(x̃m)k + kTN − Tk kx̃mk
≤ (2c + 1)ε,

where c > 0 is such that kxn k ≤ c for all n ∈ N. This shows that {T (x̃n )} is a Cauchy
sequence and hence converges since Y is complete. Remembering that {x̃n } is a subsequence
of the arbitrary bounded sequence {xn }, we see that Theorem 2.1.1 implies compactness of
the operator T .

Remark 2.1.2. The above theorem does not hold if we replace uniform operator convergence by strong operator convergence kTn(x) − T(x)k → 0. For example, consider the sequence Tn : ℓ2 → ℓ2 defined by Tn(x) = (ξ1, . . . , ξn, 0, 0, . . .), where x = {ξj} ∈ ℓ2. Each Tn is linear, bounded and of finite rank, hence compact by Theorem 2.1.2 (a). Clearly, Tn(x) → x = I(x) for every x, but I is not compact since dim ℓ2 = ∞ (see Lemma 2.1.1 (b)).

Remark 2.1.3. As a particular case of the above theorem, we can say that if X is a Banach
space, and if {Tn } is a sequence of finite rank operators in B(X) such that kTn − T k → 0 as
n → ∞ for some T ∈ B(X), then T is a compact operator.

Using the above theorem, we now give an example of a compact operator.

Example 2.1.2. The operator T : ℓ2 → ℓ2 defined by T(x) = y, where x = {ξj} ∈ ℓ2 and y = {ηj} with ηj = ξj/j for all j = 1, 2, . . ., is a compact linear operator.

Clearly, if x = {ξj} ∈ ℓ2, then y = {ηj} ∈ ℓ2. Let Tn : ℓ2 → ℓ2 be defined by
Tn(x) = (ξ1/1, ξ2/2, ξ3/3, . . . , ξn/n, 0, 0, . . .).
Then Tn is linear and bounded, and is compact by Theorem 2.1.2 (a). Furthermore,
k(T − Tn)(x)k² = ∑_{j=n+1}^{∞} |ηj|² = ∑_{j=n+1}^{∞} (1/j²)|ξj|² ≤ (1/(n + 1)²) ∑_{j=n+1}^{∞} |ξj|² ≤ kxk²/(n + 1)².
Taking the supremum over all x of norm 1, we see that
kT − Tnk ≤ 1/(n + 1).
Hence Tn → T , and T is compact by Theorem 2.1.3.
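The estimate kT − Tnk ≤ 1/(n + 1) can be observed numerically. The following Python sketch (an added illustration; sequences are truncated to N coordinates, an arbitrary choice) computes the operator norm of T − Tn for the diagonal operator above; in fact the norm equals 1/(n + 1) exactly.

    # Finite-rank approximation of the compact diagonal operator (T x)_j = x_j / j.
    import numpy as np

    N = 200
    T = np.diag(1.0 / np.arange(1, N + 1))

    for n in [1, 5, 10, 50]:
        Tn = T.copy()
        Tn[n:, n:] = 0.0                      # keep only the first n diagonal entries
        err = np.linalg.norm(T - Tn, 2)       # operator (spectral) norm of T - Tn
        print(n, err, 1.0 / (n + 1))          # err equals 1/(n+1)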
Example 2.1.3. Let {λn } be a sequence of scalars such that λn → 0 as n → ∞. Let
T : ℓp → ℓp (1 ≤ p ≤ ∞) be defined by

(T (x))(i) = λi x(i), for all x ∈ ℓp , i ∈ N.

Then we see that T is a compact operator.

For each n ∈ N, let
(Tn(x))(i) = λi x(i) if 1 ≤ i ≤ n, and (Tn(x))(i) = 0 if i > n.
Then, for each n, clearly Tn : ℓp → ℓp is a bounded operator of finite rank. In particular, each Tn is a compact operator. It also follows that
k(T − Tn)(x)kp ≤ (sup_{i>n} |λi|) kxkp, for all x ∈ ℓp, n ∈ N.
Since λn → 0 as n → ∞, we obtain
kT − Tnk ≤ sup_{i>n} |λi| → 0 as n → ∞.
Then, by Theorem 2.1.3, T is a compact operator.

Since T (en ) = λn en for all n ∈ N, T is of infinite rank whenever λn 6= 0 for infinitely many
n.
Exercise 2.1.6. Prove that the operator T defined in the Example 2.1.3 is not compact if
λn → λ 6= 0 as n → ∞.
Exercise 2.1.7. Show that the zero operator on any normed space is compact.

Exercise 2.1.8. If T1 , T2 : X → Y are compact linear operators from a normed space X to


another normed space Y and α is a scalar, then show that T1 + T2 and αT1 are also compact
linear operators.
Exercise 2.1.9. Show that the projection of a Hilbert space H onto a finite dimensional
subspace of H is compact.
Exercise 2.1.10. Show that the operator T : ℓ2 → ℓ2 defined by T (x) = y, where x =
{ξ1 , ξ2 , . . .} and y = {η1 , η2 , . . .} with ηi = ξi /2i , is compact.
Exercise 2.1.11. Show that the operator T : ℓp → ℓp , 1 ≤ p < ∞, defined by T (x) = y,
where x = {ξ1 , ξ2 , . . .} and y = {η1 , η2 , . . .} with ηi = ξi /i, is compact.
Exercise 2.1.12. Show that the operator T : ℓ∞ → ℓ∞ defined by T (x) = y, where x =
{ξ1 , ξ2 , . . .} and y = {η1 , η2 , . . .} with ηi = ξi /i, is compact.

Theorem 2.1.4. Let X and Y be normed spaces and T : X → Y be a linear compact


operator. Suppose that the sequence {xn } in X is weakly convergent, say, xn ⇀ x. Then
{T (xn )} converges strongly to T (x) in Y .

Proof. We write yn = T (xn ) and y = T (x). We first show that yn ⇀ y and then yn → y.

Let g be any bounded linear functional on Y . We define a functional f on X by setting


f (z) = g(T (z)), for all z ∈ X.
Then f is linear. Also, f is bounded because T is compact, hence bounded, and
|f (z)| = |g(T (z))| ≤ kgk kT (z)k ≤ kgk kT k kzk.
By definition, xn ⇀ x implies f (xn ) → f (x), hence by the definition, g(T (xn )) → g(T (x)),
that is, g(yn) → g(y). Since g was arbitrary, this proves yn ⇀ y.

Now we prove yn → y. Assume that it does not hold. Then {yn } has a subsequence {ynk }
such that
kynk − yk ≥ δ, (2.1)
for some δ > 0. Since {xn } is weakly convergent, {xn } is bounded, and so is {xnk }. Com-
pactness of T implies that (by Theorem 2.1.1) {T (xnk )} has a convergent subsequence, say
{ỹj }. Let ỹj → ỹ. Then of course ỹj ⇀ ỹ. Hence ỹ = y because yn ⇀ y. Consequently,
kỹj − yk → 0, which contradicts kỹj − yk ≥ δ > 0 from (2.1).
This contradiction shows that yn → y.
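As a concrete illustration of Theorem 2.1.4 (added here, not part of the notes), consider the compact diagonal operator (T x)j = xj/j on ℓ²: the unit vectors en converge weakly to 0 but not in norm, while their images T(en) = en/n converge to 0 in norm. The truncation length N in the sketch below is an arbitrary choice.

    # Weak convergence is mapped to norm convergence by a compact operator.
    import numpy as np

    N = 1000                                    # truncation length (illustrative)
    def e(n):
        v = np.zeros(N); v[n - 1] = 1.0; return v

    T = lambda x: x / np.arange(1, N + 1)       # diagonal operator (T x)_j = x_j / j

    for n in [1, 10, 100, 1000]:
        print(n, np.linalg.norm(e(n)), np.linalg.norm(T(e(n))))   # 1.0 and 1/n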
Remark 2.1.4. In general, the converse of the above theorem does not hold. For example, consider the space X = ℓ1. By Schur's lemma⁶, every weakly convergent sequence in ℓ1 is convergent. Thus, every bounded operator on ℓ1 maps weakly convergent sequences onto convergent sequences; however, not every bounded operator on ℓ1 is compact (the identity operator, for instance). Nevertheless, if the space X is reflexive, then the converse of Theorem 2.1.4 does hold.
6
Schur’s Lemma. Every weakly convergent sequence in ℓ1 is convergent

Theorem 2.1.5. Let X and Y be normed spaces such that X is reflexive, and T : X → Y
be a linear operator such that for any sequence {xn } in X,

xn ⇀ x implies T (xn ) → T (x).

Then T is a compact operator.

Proof. It is enough to show that for every bounded sequence {xn} in X, {T(xn)} has a convergent subsequence.

Let {xn } be a bounded sequence in X. By Eberlein-Shmulyan theorem7 , {xn } has a weakly


convergent subsequence, say {x̃n }. Then by hypothesis, {T (x̃n )} converges.

Exercise 2.1.13. Let X be an infinite dimensional normed space and T : X → X be a compact linear operator. If λ is a nonzero scalar, then prove that λI − T is not a compact operator. Further deduce that the operator
S : (α1, α2, . . .) ↦ (α1 + α2, α2 + α3/2, α3 + α4/3, . . .)
is not a compact operator on ℓp, 1 ≤ p ≤ ∞.

Exercise 2.1.14. Let 1 ≤ p ≤ ∞ and q be the conjugate exponent of p, that is, 1/p + 1/q = 1. Let (aij) be an infinite matrix with aij ∈ K, i, j ∈ N. Show that the operator (T(x))(i) = ∑_{j=1}^{∞} aij x(j), x ∈ ℓp, i ∈ N, is well defined and T : ℓp → ℓr is a compact operator in each of the following cases:

(a) 1 ≤ p ≤ ∞, 1 ≤ r ≤ ∞ and ∑_{j=1}^{∞} |aij| → 0 as i → ∞.

(b) 1 ≤ p ≤ ∞, 1 ≤ r < ∞ and ∑_{i=1}^{∞} (∑_{j=1}^{∞} |aij|)^r < ∞.

(c) 1 < p ≤ ∞, 1 ≤ r ≤ ∞ and ∑_{j=1}^{∞} |aij|^q → 0 as i → ∞.

(d) 1 < p ≤ ∞, 1 ≤ r < ∞ and ∑_{i=1}^{∞} (∑_{j=1}^{∞} |aij|^q)^{r/q} < ∞.

Exercise 2.1.15. Let X be a Hilbert space and T : X → X be a bounded linear operator.


Show that T is compact if and only if for every sequence {xn } in X

hxn , ui → hx, ui, for all u ∈ X implies T (xn ) → T (x).

Exercise 2.1.16. Let X and Y be infinite dimensional Banach spaces. If T : X → Y is a surjective bounded linear operator, then prove that T is not compact.

7
Eberlein-Shmulyan Theorem. Every bounded sequence in reflexive space has a weakly convergent
subsequence

2.2 Eigenvalues and Eigenvectors

Let X and Y be linear spaces and T : X → Y be a linear operator. Recall that the range
R(T ) and null space N (T ) of T are defined, respectively, as

R(T ) = {T (x) : x ∈ X} and N (T ) = {x ∈ X : T (x) = 0}.

The dimension of R(T ) is called the rank of T and the dimension of N (T ) is called the
nullity of T .

It can be easily seen that a linear operator T : X → Y is one-one if and only if N (T ) = {0}.

Definition 2.2.1. Let X be a linear space and T : X → X be a linear operator. A scalar


λ ∈ K is called an eigenvalue of T if there exists a nonzero vector x ∈ X such that

T (x) = λx.

In this case, x is called an eigenvector of T corresponding to the eigenvalue λ.

The set of all eigenvalues of T is known as the eigenspectrum of T or the point spectrum of T, and it is denoted by σeig(T). Thus,

σeig (T ) := {λ ∈ K : ∃x 6= 0 such that T (x) = λx}.

Remark 2.2.1. Note that

λ ∈ σeig (T ) ⇔ N (T − λI) 6= {0},

and nonzero element of N (T − λI) are eigenvectors of T corresponding to the eigenvalue λ.


The subspace N (T − λI) is called the eigenspace of T corresponding to the eigenvalue λ.

Remark 2.2.2. A linear operator may not have any eigenvalue at all. For example, the
linear operator T : R2 → R2 defined by

T ((α1 , α2 )) = (α2 , −α1 ), for all (α1 , α2 ) ∈ R2

has no eigenvalue.

Remark 2.2.3. It can be easily seen that

• λ ∈ K is an eigenvalue of T if and only if the operator T − λI is not injective;

• if X is finite dimensional, then λ ∈ K is an eigenvalue of T if and only if the operator T − λI is not surjective.



Example 2.2.1. Let X be any of the sequence spaces c00 , c0 , c, ℓp .

(a) Let {λn } be a bounded sequence of scalars. Let T : X → X be the diagonal operator
defined by
T (x)(j) = λj x(j), for all x ∈ X and j ∈ N.
Then it is easy to see that, for λ ∈ K, the equation T (x) = λx is satisfied for a nonzero
x ∈ X if and only if λ = λj for some j ∈ N. Hence,
σeig (T ) = {λ1 , λ2 , . . .}.
In fact, for n ∈ N, en ∈ X defined by en(j) = δnj is an eigenvector of T corresponding to the
eigenvalue λn .

(b) Let T : X → X be the right shift operator, that is,


T : (α1 , α2 , . . .) 7→ (0, α1 , α2 , . . .).
Let λ ∈ K. Then the equation T (x) = λx is satisfied for some x = (α1 , α2 , . . .) ∈ X if and
only if
0 = λα1 , αj = λαj+1, for all j ∈ N.
This is possible only if αj = 0 for all j ∈ N. Thus, σeig (T ) = ∅.

(c) Let T : X → X be the left shift operator, that is,


T : (α1 , α2 , . . .) 7→ (α2 , α3 , . . .).
Then for x = (α1, α2, . . .) ∈ X and λ ∈ K,
T(x) = λx ⇔ αn+1 = λαn for all n ∈ N ⇔ αn+1 = λⁿα1 for all n ∈ N.
From this, we can infer the following:

Clearly, λ = 0 is an eigenvalue of T with a corresponding eigenvector e1 .

Now suppose that λ 6= 0. If λ is an eigenvalue, then a corresponding eigenvector is of the


form x = α1 (1, λ, λ2, λ3 . . .) for some nonzero α1 . Note that if α1 6= 0 and λ 6= 0, then
x = α1 (1, λ, λ2, λ3 . . .) does not belong to c00 . Thus, if X = c00 , then σeig (T ) = {0}.

Next consider the cases of X = c0 or X = ℓp for 1 ≤ p < ∞. In these cases, we see that
(1, λ, λ2, λ3 . . .) ∈ X if and only if |λ| < 1, so that σeig (T ) = {λ : |λ| < 1}.

For the case of X = c, we see that (1, λ, λ2 , λ3 . . .) ∈ X if and only if either |λ| < 1 or λ = 1.
Thus, in this case
σeig (T ) = {λ : |λ| < 1} ∪ {1}.

If X = ℓ∞ , then (1, λ, λ2, λ3 . . .) ∈ X if and only if |λ| ≤ 1. Thus, in this case


σeig (T ) = {λ : |λ| ≤ 1}.

Theorem 2.2.1. Let X be a normed space and T : X → X be a compact linear operator.


Then zero is the only possible limit point of σeig (T ). In particular, σeig (T ) is a countable
subset of K.

Proof. Since
σeig(T) \ {0} = ⋃_{n=1}^{∞} {λ ∈ σeig(T) : |λ| ≥ 1/n},

it is enough to show that the set Er := {λ ∈ σeig (T ) : |λ| ≥ r} is finite for each r > 0.

Assume that there is an r > 0 such that Er is an infinite set. Let {λn} be a sequence of distinct elements in Er, that is, let {λn} be a sequence of distinct eigenvalues of T such that |λn| ≥ r. For n ∈ N, let xn be an eigenvector of T corresponding to the eigenvalue λn, and let Xn := span{x1, x2, . . . , xn}, n ∈ N. Then each Xn is a proper closed subspace of Xn+1. By the Riesz Lemma⁸, there exists a sequence {un} in X such that un ∈ Xn, kunk = 1 for all n ∈ N and
dist(un, Xm) ≥ 1/2, for all m < n.
Therefore, for every m, n ∈ N with m < n, we have

kT (un ) − T (um )k = k(T − λn I)(un ) − (T − λm I)(um ) + λn xn − λm xm k


= kλn un − [λm um + (T − λm I)(um ) − (T − λn I)(un )k

Note that um ∈ Xm ⊆ Xn−1 and

(T − λn I)(un ) ∈ Xn−1 , (T − λm I)(um ) ∈ Xm−1 ⊆ Xn−1 .

Therefore, we have
|λn | r
kT (un ) − T (um )k ≥ |λn |dist(un , Xn−1) ≥ ≥ .
2 2
Thus, {T (un )} has no convergent subsequence, contradicting the fact that T is a compact
operator.

Let X be a normed space and T : X → X be a linear operator. Assume that λ is not an


eigenvalue of T. Then, for y ∈ X, the operator equation

T(x) − λx = y

has at most one solution. One would also like this solution, when it exists, to depend continuously on y. In other words, one would like to know that the inverse operator

(T − λI)−1 : R(T − λI) → X

8
Riesz Lemma. Let X0 be a proper closed subspace of a normed space X. Then for every r ∈ (0, 1),
there exists xr ∈ X such that kxr k = 1 and dist(xr , X0 ) ≥ r.

is continuous, which is equivalent to saying that the operator T − λI is bounded below, that is,
there exists c > 0 such that

kT (x) − λxk ≥ ckxk, for all x ∈ X.

Motivated by the above requirement, we generalize the concept of eigenspectrum.


Definition 2.2.2. Let X be a normed space and T : X → X be a linear operator. A scalar
λ is said to be an approximate eigenvalue of T if T − λI is not bounded below.

The set of all approximate eigenvalues of T is called the approximate eigenspectrum of T ,


and it is denoted by σapp (T ), that is,
σapp (T ) = {λ ∈ K : T − λI not bounded below}.
Remark 2.2.4. By the result⁹, λ ∉ σapp(T) if and only if T − λI is injective and (T − λI)−1 : R(T − λI) → X is continuous.

The following result provides the characterization of σapp (T ).


Theorem 2.2.2. Let X be a normed space, T : X → X be a linear operator and λ ∈ K.
Then λ ∈ σapp (T ) if and only if there exists a sequence {xn } in X such that kxn k = 1
for all n ∈ N, and
kT (xn ) − λxn k → 0 as n → ∞.

Proof. If λ ∈/ σapp (T ), that is, if there exists c > 0 such that kT (x) − λxk ≥ ckxk for all
x ∈ X, then there would not exist any sequence {xn } in X such that kxn k = 1 for all n ∈ N
and kT (xn ) − λxn k → 0 as n → ∞.

Conversely, assume that λ ∈ σapp(T), that is, there does not exist any c > 0 such that kT(x) − λxk ≥ ckxk for all x ∈ X. Then, for each n ∈ N, there exists un ∈ X such that
kT(un) − λunk < (1/n) kunk.
Clearly, un ≠ 0 for all n ∈ N. Taking xn = un/kunk, we have
kxnk = 1 for all n ∈ N and kT(xn) − λxnk < 1/n → 0 as n → ∞.
This completes the proof.

9
Let X and Y be normed spaces and T : X → Y be a linear operator. Then there exists γ > 0 such that
kT (x)k ≥ γkxk for all x ∈ X if and only if T is injective and T −1 : R(T ) → X is continuous, and in that
case, kT −1(y)k ≤ γ1 kyk for all y ∈ R(T ).

Theorem 2.2.3. Let X be a normed space and T : X → X be a linear operator. Then,

σeig (T ) ⊆ σapp (T ).

If X is a finite dimensional space, then

σeig (T ) = σapp (T ).

Proof. Clearly, λ ∉ σapp(T) implies that T − λI is bounded below and hence injective (one-one), so that λ ∉ σeig(T). Thus, σeig(T) ⊆ σapp(T).

Now, assume that X is a finite dimensional space. If λ ∉ σeig(T), then T − λI is injective (one-one), so that, using the finite dimensionality of X, it follows that T − λI is surjective as well, and hence the operator (T − λI)−1 is continuous. Consequently, T − λI is bounded below, that is, λ ∉ σapp(T). Thus, if X is finite dimensional, then σeig(T) = σapp(T).

The following example illustrates that strict inclusion in σeig(T) ⊆ σapp(T) can occur if the space X is infinite dimensional.
Example 2.2.2. Let X be any of the sequence spaces c00 , c0 , c, ℓp with any norm satisfying
ken k = 1 for all n ∈ N. Let T : X → X be defined by

(T (x))(j) = λj x(j), for all x ∈ X and all j ∈ N.

where {λn } is a bounded sequence of scalars. As in Example 2.2.1, we have

σeig (T ) = {λ1 , λ2 , . . .}.

Now assume that λn → λ as n → ∞. Then we have

kT (en ) − λen k = |λn − λ| ken k = |λn − λ| → 0 as n → ∞.

Thus, we can conclude that λ ∈ σapp (T ). Note that if λ 6= λn for every n ∈ N, then
λ∈/ σeig (T ).

Theorem 2.2.4. Let X be a normed space and T : X → X be a linear compact operator.


Then the following assertions hold:

(a) σapp (T )\{0} = σeig (T )\{0}.

(b) If T is a finite rank operator, then σapp (T ) = σeig (T ).

(c) If X is infinite dimensional, then 0 ∈ σapp (T ).

(d) 0 is the only possible limit point of σapp (T ).



Proof. (a) We have already observed that σeig (T ) ⊆ σapp (T ). Now, suppose that 0 6= λ ∈
σapp (T ). We show that λ ∈ σeig (T ).

Let {xn } be a sequence in X such that kxn k = 1 for every n ∈ N and kT (xn ) − λxn k → 0 as
n → ∞. Since T is compact operator, there exists a subsequence {x̃n } of {xn } and y ∈ X
such that T (x̃n ) → y. Hence,

λx̃n = T (x̃n ) − (T (x̃n ) − λx̃n ) → y.

Then it follows that kyk = |λ| > 0, so y ≠ 0, and x̃n → y/λ. Since T is continuous,
y = lim_{n→∞} T(x̃n) = T(y/λ),
so that T(y) = λy, showing that λ ∈ σeig(T).

(b) Suppose that T is a finite rank operator. In view of (a), it is enough to show that 0 ∈ σapp(T) implies 0 ∈ σeig(T). Suppose that 0 ∉ σeig(T). Then T is injective, so that, by the hypothesis that T is of finite rank, X is finite dimensional. Therefore, σapp(T) = σeig(T), and consequently, 0 ∉ σapp(T).

(c) Let X be infinite dimensional. Suppose that 0 ∈/ σapp (T ), that is, T is bounded below.
We show that every bounded sequence in X has a Cauchy subsequence so that X would be
of finite dimension, contradicting the assumption.

Let {xn} be a bounded sequence in X. Since T is compact, there is a subsequence {x̃n} of {xn} such that {T(x̃n)} converges. Since T is bounded below, it follows that {x̃n} is a Cauchy subsequence of {xn}.

(d) It follows from the proof of (a) and Theorem 2.2.1.

From part (c) of the above theorem, we can observe that an operator on an infinite dimensional space which is bounded below cannot be compact. The following example illustrates this point of view.

Example 2.2.3. Let X = ℓp with 1 ≤ p ≤ ∞. Let T be the right shift operator on X


defined as
T : (α1 , α2 , . . .) 7→ (0, α1 , α2 , . . .),
or the diagonal operator on X defined as

T : (α1 , α2 , . . .) 7→ (λ1 α1 , λ2 α2 , . . .)

associated with a sequence {λn} of nonzero scalars which converges to a nonzero scalar. Each of these operators is bounded below, and hence 0 ∉ σapp(T). Thus, the fact that T is not compact follows from Theorem 2.2.4 (c).

We know that the range of an infinite rank compact operator on a Banach space is not closed.
Does Theorem 2.2.4 (c) hold for every bounded operator with nonclosed range as well? The
answer is in the affirmative if X is a Banach space, as the following theorem shows.
Theorem 2.2.5. Let X be a Banach space and T : X → X be a bounded linear operator.
If the range R(T ) of T is not closed in X, then 0 ∈ σapp (T ).

Proof. Suppose, on the contrary, that 0 ∉ σapp(T), so that T = T − 0·I is bounded below. Then, by the result "Let T : X → Y be a bounded linear operator from a Banach space X to a normed space Y. If T is bounded below, then the range R(T) of T is a closed subspace of Y", the range R(T) is closed in X, which contradicts the hypothesis. Hence 0 ∈ σapp(T).

Now we prove a topological property of σapp (T ).

Theorem 2.2.6. Let X be a normed space and T : X → X be a bounded linear operator.


Then σapp (T ) is a closed subset of K.

Proof. Let {λn} be a sequence in σapp(T) such that λn → λ for some λ ∈ K. Suppose that λ ∉ σapp(T). Let c > 0 be such that
kT(x) − λxk ≥ ckxk, for all x ∈ X.
Observe that, for every x ∈ X and n ∈ N,
kT(x) − λnxk = k(T(x) − λx) − (λn − λ)xk ≥ kT(x) − λxk − |λn − λ| kxk ≥ (c − |λn − λ|)kxk.
Thus, for all large enough n, T − λnI is bounded below. More precisely, let N ∈ N be such that |λn − λ| ≤ c/2 for all n ≥ N. Then we have
kT(x) − λnxk ≥ (c/2)kxk, for all x ∈ X and all n ≥ N,
which shows that λn ∉ σapp(T) for all n ≥ N. But λn ∈ σapp(T) for every n, so we arrive at a contradiction.

The above result, in particular, shows that if {λn} is a sequence of eigenvalues of T ∈ B(X) (T : X → X a bounded linear operator) such that λn → λ, then λ is an approximate eigenvalue. One may ask whether every approximate eigenvalue arises in this manner. The answer is, in general, negative, as the following example shows.
Example 2.2.4. Let X = ℓ1 and T be the right shift operator on ℓ1 . Then we know that
σeig (T ) = ∅. We show that σapp (T ) 6= ∅.

Let {xn} in ℓ1 be defined by
xn(j) = 1/n if j ≤ n, and xn(j) = 0 if j > n.
Qamrul Hasan Ansari Advanced Functional Analysis Page 57

Then we see that kxn k = 1 for all n ∈ N, and kT (xn ) − xn k1 = 2/n → 0 as n → ∞ so that
1 ∈ σapp (T ).
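The computation in Example 2.2.4 can be checked numerically. The Python sketch below (an added illustration; the truncation length N is an arbitrary choice) verifies that kxnk1 = 1 and kT(xn) − xnk1 = 2/n for the right shift T.

    # 1 is an approximate eigenvalue of the right shift on l^1.
    import numpy as np

    def x(n, N=1000):                       # truncated representative of x_n
        v = np.zeros(N); v[:n] = 1.0 / n; return v

    def right_shift(v):
        w = np.zeros_like(v); w[1:] = v[:-1]; return w

    for n in [10, 100, 500]:
        v = x(n)
        print(n, np.abs(v).sum(), np.abs(right_shift(v) - v).sum())   # 1.0 and 2/n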

A few other examples of operators whose eigenspectrum and approximate eigenspectrum are described completely are given in the book by M. T. Nair: Functional Analysis: A First Course, Prentice-Hall of India Private Limited, New Delhi, 2002.

2.3 Resolvent Operators

Let X be a normed space and T : X → X be a linear operator. We have seen in Remark 2.2.4 that λ ∉ σapp(T) if and only if T − λI is injective and (T − λI)−1 : R(T − λI) → X is continuous. That is, a scalar λ is not an approximate eigenvalue of T if and only if for every
y ∈ R(T − λI), there exists a unique x ∈ X such that

T (x) − λx = y,

and the map y 7→ x is continuous. Thus, if x and y are as above, and if {yn } is a sequence
in R(T − λI) such that yn → y, and {xn } in X satisfies T (xn ) − λxn = yn , then xn → x.

One would like to have the above situation not only for every y ∈ R(T − λI), but also for
every y ∈ X. Motivated by this requirement, we have the concept of spectrum of T .

Definition 2.3.1. The resolvent set of T , denoted by ρ(T ), is defined as

ρ(T ) = {λ ∈ K : T − λI is bijective and (T − λI)−1 ∈ B(X)},

where B(X) denotes the set of all bounded linear operators from X into itself.

The complement of ρ(T ) in K is called the spectrum of T and is denoted by σ(T ).

Thus, λ ∈ σ(T) if and only if either T − λI is not bijective, or else (T − λI)−1 ∉ B(X).

The elements of the spectrum are called the spectral values of T .

We observe that, for T ∈ B(X),

0 ∈ ρ(T ) ⇔ ∃S ∈ B(X) such that T S = I = ST,

and, in that case, S = T −1 . If 0 ∈ ρ(T ), then we say that T is invertible in B(X). We note
that if T, S ∈ B(X) are invertible, then T S is invertible, and

(T S)−1 = S −1 T −1 .

In view of Proposition A10 , if λ ∈ ρ(T ), then T − λI is bounded below. Hence, every


approximate eigenvalue is a spectral value, that is,

σapp (T ) ⊆ σ(T ).
10
Proposition A: Let X and Y be normed spaces and T : X → Y be a linear operator. Then there exists
γ > 0 such that kT (x)k ≥ γkxk for all x ∈ X if and only if T is injective and T −1 : R(T ) → X is continuous,
and in that case, kT −1 (y)k ≤ γ1 kyk for all y ∈ R(T ).

Clearly, if X is a finite dimensional space, then

σeig (T ) = σapp (T ) = σ(T ).

We have seen examples of infinite rank operators T for which σeig (T ) 6= σapp (T ). The
following example shows that strict inclusion is possible in σapp (T ) ⊆ σ(T ) as well.

Example 2.3.1. Let X = ℓp , 1 ≤ p ≤ ∞, and T be the right shift operator on X. We


have seen in Example 2.2.3 that 0 ∉ σapp(T). But 0 ∈ σ(T), since T is not onto. In fact, e1 ∉ R(T).

Now we give some characterizations of the spectrum.

Theorem 2.3.1. Let X be a Banach space, T : X → X be a bounded linear operator


and λ ∈ K. Then λ ∈ σ(T ) if and only if either λ ∈ σapp (T ) or R(T − λI) is not dense
in X.

Proof. Clearly, if λ ∈ σapp (T ) or R(T − λI) is not dense in X, then λ ∈ σ(T ).

Conversely, suppose that λ ∈ σ(T). If λ ∉ σapp(T), then, by Proposition A¹¹ and Proposition B¹², the operator T − λI is injective, its inverse (T − λI)−1 : R(T − λI) → X is continuous, and R(T − λI) is closed. Hence, R(T − λI) is not dense in X; otherwise, T − λI would become bijective with (T − λI)−1 ∈ B(X), which contradicts the assumption that λ ∈ σ(T).

11
Proposition A: Let X and Y be normed spaces and T : X → Y be a linear operator. Then there exists
γ > 0 such that kT (x)k ≥ γkxk for all x ∈ X if and only if T is injective and T −1 : R(T ) → X is continuous,
and in that case, kT −1 (y)k ≤ γ1 kyk for all y ∈ R(T ).
12
Proposition B: Let T : X → Y be a bounded linear operator from a Banach space X to a normed space
Y . If T is bounded below, then the range R(T ) of T is a closed subspace of Y

2.4 Spectral Theory of Compact Linear Operators


Theorem 2.4.1 (Null Space). Let X be a normed space and T : X → X be a linear
compact operator. Then for every λ 6= 0, the null space N (Tλ ) = {x ∈ D(Tλ ) : Tλ (x) =
0} of Tλ = T − λI is finite dimensional.

Proof. We prove it by showing that the closed unit ball B = {x ∈ N (Tλ ) : kxk ≤ 1} is
compact as a normed space is finite dimensional if the closed unit ball in it is compact.

Let {xn} be a sequence in B. Then {xn} is bounded, as kxnk ≤ 1. Since T is compact, by Theorem 2.1.1, {T(xn)} has a convergent subsequence {T(xnk)}. Now xn ∈ B ⊂ N(Tλ) implies Tλ(xn) = T(xn) − λxn = 0, so that xn = λ−1T(xn) because λ ≠ 0. Consequently, {xnk} = {λ−1T(xnk)} also converges, and its limit lies in B as B is closed. Since {xn} was arbitrary, every sequence in B has a convergent subsequence, and therefore B is compact. This implies that dim N(Tλ) < ∞.

Theorem 2.4.2. Let X be a normed space and T : X → X be a linear compact operator.


Then for every λ 6= 0, the range of Tλ = T − λI is closed.

Proof. The proof is divided into three steps.

Step 1. Suppose that Tλ(X) is not closed. Then there exist y in the closure of Tλ(X) with y ∉ Tλ(X), and a sequence {xn} in X such that
yn = Tλ(xn) → y. (2.2)
Since Tλ(X) is a vector space, 0 ∈ Tλ(X). But y ∉ Tλ(X), so that y ≠ 0. This implies that yn ≠ 0 and xn ∉ N(Tλ) for all sufficiently large n. Without loss of generality, we may assume that this holds for all n. Since N(Tλ) is closed, the distance δn from xn to N(Tλ) is positive, that is,
δn = inf_{z∈N(Tλ)} kxn − zk > 0.

By the definition of an infimum, there is a sequence {zn } in N (Tλ) such that


an = kxn − zn k < 2δn . (2.3)

Step 2. We show that


an = kxn − zn k → ∞, as n → ∞. (2.4)

Assume that it does not hold. Then {xn − zn} has a bounded subsequence. Since T is compact, it follows from Theorem 2.1.1 that {T(xn − zn)} has a convergent subsequence. Now from Tλ = T − λI and λ ≠ 0, we have I = λ−1(T − Tλ). Since zn ∈ N(Tλ), we have Tλ(zn) = 0 and thus we obtain
xn − zn = λ−1(T − Tλ)(xn − zn) = λ−1[T(xn − zn) − Tλ(xn)].

The sequence {T(xn − zn)} has a convergent subsequence and {Tλ(xn)} converges by (2.2); hence {xn − zn} has a convergent subsequence, say xnk − znk → v. Since T is compact, T is continuous and so is Tλ. Hence
Tλ(xnk − znk) → Tλ(v).
Here Tλ(znk) = 0 because zn ∈ N(Tλ), so by (2.2) we also have
Tλ(xnk − znk) = Tλ(xnk) → y.
Hence Tλ(v) = y, and thus y ∈ Tλ(X), which contradicts the assumption y ∉ Tλ(X) made in Step 1. Hence an = kxn − znk → ∞ as n → ∞.

Step 3. Using an as in (2.4) and setting
wn = (1/an)(xn − zn), (2.5)
we have kwnk = 1. Since an → ∞ while Tλ(zn) = 0 and {Tλ(xn)} converges (by (2.2)), it follows that
Tλ(wn) = (1/an)Tλ(xn) → 0. (2.6)
Using again I = λ−1(T − Tλ), we obtain
wn = (1/λ)(T(wn) − Tλ(wn)). (2.7)
Since T is compact and {wn } is bounded, {T (wn )} has convergent subsequence. Furthermore,
{Tλ (wn )} converges by (2.6). Hence (2.7) shows that {wn } has a convergent subsequence,
say
wnj → w. (2.8)
A comparison with (2.6) implies that Tλ (w) = 0. Hence w ∈ N (Tλ ). Since zn ∈ N (Tλ ), also
un = zn + an w ∈ N (Tλ ). Hence for the distance from xn to un , we must have

kxn − un k ≥ δn .

Writing un out and using (2.5) and (2.3), we thus obtain

δn ≤ kxn − zn − anwk = kanwn − anwk = an kwn − wk < 2δn kwn − wk.
Dividing by 2δn > 0, we obtain kwn − wk > 1/2 for all n. This contradicts (2.8) and proves the result.

Exercise 2.4.1. Let X be a normed space and T : X → X be a linear operator. Let λ ∈ K be such that T − λI is injective. Show that (T − λI)−1 : R(T − λI) → X is continuous if and only if λ is not an approximate eigenvalue of T.

Exercise 2.4.2. Let X be a normed space and T : X → X be a bounded linear operator.


Let λ ∈ K be such that |λ| > kT k. Show that

(a) T − λI is bounded below,

(b) R(T − λI) is dense in X,

(c) (T − λI)−1 : R(T − λI) → X is continuous.

Exercise 2.4.3. Let X be a Banach space and T : X → X be a bounded linear operator.


Show that λ ∈ σeig (T ) if and only if there exists a nonzero operator S ∈ B(X) such that
(T − λI)S = 0.

Exercise 2.4.4. Give an example of a bijective operator T on a normed space X such that 0 ∈ σ(T).

3  Differential Calculus on Normed Spaces

3.1 Directional Derivatives and Their Properties

Throughout this section, unless otherwise specified, we assume that X is a real vector space
and f : X → R ∪ {±∞} is an extended real-valued function. In this section, we discuss
directional derivatives of f and present some of their basic properties.

Definition 3.1.1. Let f : X → R ∪ {±∞} be a function and x ∈ X be a point where f is finite.

(a) The right-sided directional derivative of f at x in the direction d ∈ X is defined by
f+′(x; d) = lim_{t→0+} (f(x + td) − f(x))/t,
if the limit exists in [−∞, +∞], that is, finite or not.

(b) The left-sided directional derivative of f at x in the direction d ∈ X is defined by
f−′(x; d) = lim_{t→0−} (f(x + td) − f(x))/t,
if the limit exists in [−∞, +∞], that is, finite or not.

For d = 0 the zero vector in X, f+′ (x; 0) = f−′ (x; 0) = 0.

Since
f+′(x; −d) = lim_{t→0+} (f(x − td) − f(x))/t = lim_{τ→0−} (f(x + τd) − f(x))/(−τ) = −f−′(x; d),
we have
−f+′(x; −d) = f−′(x; d).

If f+′(x; d) exists and f+′(x; d) = f−′(x; d), then the common value is called the directional derivative of f at x in the direction d. Thus, the directional derivative of f at x in the direction d ∈ X is defined by
f′(x; d) = lim_{t→0} (f(x + td) − f(x))/t,
provided the limit exists in [−∞, +∞], that is, finite or not.
Remark 3.1.1. (a) If f ′ (x; d) exists, then f ′ (x; −d) = −f ′ (x; d).

(b) If f : Rn → R is differentiable, then the directional derivative of f at x ∈ Rn in the direction d = (d1, d2, . . . , dn) is given by
f′(x; d) = ∑_{i=1}^{n} (∂f(x)/∂xi) di = h∇f(x), di.
In particular, if d = ei = (0, . . . , 0, 1, 0, . . . , 0), where 1 is at the ith place, then
f′(x; ei) = ∂f(x)/∂xi,
the partial derivative of f with respect to xi.

For an extended real-valued convex function¹ f : X → R ∪ {±∞}, the following proposition shows that the function t ↦ (f(x + td) − f(x))/t is monotonically nondecreasing on (0, ∞).

Proposition 3.1.1. Let f : X → R ∪ {±∞} be an extended real-valued convex function and x be a point in X where f is finite. Then, for each direction d ∈ X, the function
t ↦ (f(x + td) − f(x))/t
is monotonically nondecreasing on (0, ∞).

Proof. Let x ∈ X be any point such that f(x) is finite, and let s, t ∈ (0, ∞) with s ≤ t. Then, by the convexity of f, we have
f(x + sd) = f((s/t)(x + td) + (1 − s/t)x) ≤ (s/t) f(x + td) + (1 − s/t) f(x).
It follows that
(f(x + sd) − f(x))/s ≤ (f(x + td) − f(x))/t.
Thus, the function t ↦ (f(x + td) − f(x))/t is monotonically nondecreasing on (0, ∞).
1
A function f : X → R ∪ {±∞} is said to be convex if for all x, y ∈ X with f (x), f (y) 6= ±∞, and all
α ∈ [0, 1], f (αx + (1 − α)y) ≤ αf (x) + (1 − α)f (y).
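The monotonicity asserted in Proposition 3.1.1 is easy to observe numerically. The Python sketch below (an added illustration) evaluates the difference quotients of the convex function f(x) = |x| on R at a few values of t > 0; the chosen points and step sizes are arbitrary.

    # Difference quotients of a convex function are nondecreasing in t > 0.
    f = abs

    x, d = 0.0, 1.0     # at x = 0 the quotients are constant (equal to 1)
    print([(f(x + t * d) - f(x)) / t for t in [0.01, 0.1, 1.0, 10.0]])

    x, d = -2.0, 1.0    # values -1.0, -1.0, -0.6, 0.2: nondecreasing
    print([(f(x + t * d) - f(x)) / t for t in [0.5, 1.5, 2.5, 5.0]])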

The following result ensures the existence of f+′(x; d) and f−′(x; d) when f is a convex function.

Proposition 3.1.2. Let f : X → R ∪ {±∞} be an extended real-valued convex function and x be a point in X where f is finite. Then, f+′(x; d) and f−′(x; d) exist for every direction d ∈ X. Also,
f+′(x; d) = inf_{t>0} (f(x + td) − f(x))/t, (3.1)
and
f−′(x; d) = sup_{t<0} (f(x + td) − f(x))/t. (3.2)

Proof. Let x ∈ X be any point such that f(x) is finite. For given t > 0, by the convexity of f, we have
f(x) = f((t/(1 + t))(x − d) + (1/(1 + t))(x + td))
≤ (t/(1 + t)) f(x − d) + (1/(1 + t)) f(x + td)
= (1/(1 + t)) (t f(x − d) + f(x + td)).
It follows that (1 + t)f(x) ≤ t f(x − d) + f(x + td), and so,
(f(x + td) − f(x))/t ≥ f(x) − f(x − d).
Hence, as t → 0+, the values (f(x + td) − f(x))/t decrease (by Proposition 3.1.1) and are bounded below by the constant f(x) − f(x − d). Thus, the limit in the definition of f+′(x; d) exists and is given by
f+′(x; d) = lim_{t→0+} (f(x + td) − f(x))/t = inf_{t>0} (f(x + td) − f(x))/t.
Since f+′ (x; d) exists in every direction d, the equality −f+′ (x; −d) = f−′ (x; d) implies that
f−′ (x; d) exists in every direction d.

The relation (3.2) can be established on the lines of the proof given to derive (3.1).

Proposition 3.1.3. Let f : X → R ∪ {±∞} be an extended real-valued convex function


and x be a point in X where f is finite. Then, f+′ (x; d) is a convex and positively
homogeneous functiona of d and

f−′ (x; d) ≤ f+′ (x; d). (3.3)


a
A function f : X → R is said to be (a) convex if for all x, y ∈ X and all α ∈ [0, 1], f (αx+(1−α)y) ≤
αf (x) + (1 − α)f (y); (b) positive homogeneous if for all x ∈ X and all r ≥ 0, f (rx) = rf (x).

Proof. Let λ > 0 be a real number. Then,
f+′(x; λd) = lim_{t→0+} (f(x + t(λd)) − f(x))/t = λ lim_{t→0+} (f(x + (λt)d) − f(x))/(λt) = λ f+′(x; d).
Hence, f+′ (x; ·) is positively homogeneous.

Similarly, we can show that f−′ (x; ·) is also positively homogeneous.

Next, we show that f+′ (x; ·) is convex. Let d1 , d2 ∈ X and λ1 , λ2 ≥ 0 be such that λ1 +λ2 = 1.
From the convexity of f , we have
f (x + t(λ1 d1 + λ2 d2 )) − f (x)
= f ((λ1 + λ2 )x + t(λ1 d1 + λ2 d2 )) − (λ1 + λ2 )f (x)
= f (λ1 (x + td1 ) + λ2 (x + td2 )) − λ1 f (x) − λ2 f (x)
≤ λ1 f (x + td1 ) + λ2 f (x + td2 ) − λ1 f (x) − λ2 f (x)
= λ1 (f (x + td1 ) − f (x)) + λ2 (f (x + td2 ) − f (x))
for all sufficiently small t. Dividing by t > 0 and letting t → 0+ , we obtain
f+′ (x; λ1 d1 + λ2 d2 ) ≤ λ1 f+′ (x; d1 ) + λ2 f+′ (x; d2 ).
Hence f+′ (x; d) is convex in d.

By subadditivity of f+′ (x; d) in d with f+′ (x; d) < +∞ and f+′ (x; −d) < +∞, we obtain
f+′ (x; d) + f+′ (x; −d) ≥ f+′ (x; 0) = 0,
and thus,
f+′ (x; d) ≥ −f+′ (x; −d) = f−′ (x; d).

If f+′ (x; d) = +∞ or f+′ (x; −d) = +∞, then the inequality (3.3) holds trivially.

Corollary 3.1.1. Let f : X → R ∪ {±∞} be an extended real-valued convex function


and x be a point in X where f is finite. Then, for each direction d ∈ X,

f+′(x; d) = inf_{t∈(0,∞)} (f(x + td) − f(x))/t.

Proposition 3.1.4. Let f : X → R ∪ {±∞} be an extended real-valued convex function


and x be a point in X where f is finite. Then the following assertions hold:

(a) f ′ (x; ·) is sublinear.a

(b) For every y ∈ X,


f ′ (x; y − x) + f (x) ≤ f (y). (3.4)

a
A function f : X → R is said to be sublinear if f (λx) = λf (x) and f (x + y) ≤ f (x) + f (y) for all
x, y ∈ X and all λ ≥ 0.

Proof. (a) It follows from Proposition 3.1.3.

(b) If y is not in Dom(f), then the inequality (3.4) trivially holds. So, let y ∈ Dom(f). For t ∈ (0, 1), we have
f((1 − t)x + ty) − f(x) ≤ t(f(y) − f(x)),
which implies that
(f((1 − t)x + ty) − f(x))/t ≤ f(y) − f(x).
Letting t → 0+, we obtain

f′(x; y − x) + f(x) ≤ f(y).

Corollary 3.1.2. Let f : Rn → R ∪ {+∞} be an extended real-valued convex function


and x ∈ Rn be such that f (x) is finite and f is differentiable at x. Then,

f (y) ≥ f (x) + h∇f (x), y − xi, for all y ∈ X,

where ∇f (x) denotes the gradient of f at x.

Corollary 3.1.3. Let f : X → R ∪ {+∞} be an extended real-valued convex function


and x, y ∈ X be such that f (x) and f (y) are finite. Then,

f+′ (y; y − x) ≥ f+′ (x; y − x), (3.5)

and
f−′ (y; y − x) ≥ f−′ (x; y − x). (3.6)
In particular, if f : Rn → R is differentiable at x and y, then

h∇f (y) − ∇f (x), y − xi ≥ 0. (3.7)

Proof. Arguing as in the proof of Proposition 3.1.4 (b), we have

f(y) ≥ f(x) + f+′(x; y − x), (3.8)

and
f (x) ≥ f (y) + f+′ (y; x − y). (3.9)

By adding inequalities (3.8) and (3.9), we obtain

−f+′ (y; x − y) ≥ f+′ (x; y − x).

Since −f+′ (x; −d) = f−′ (x; d), by using inequality (3.3), we get

f+′ (y; y − x) ≥ f−′ (y; y − x) = −f+′ (y; x − y) ≥ f+′ (x; y − x).

Hence, the inequality (3.5) holds. Similarly, we can establish the inequality (3.6). The
inequality (3.7) holds using Remark 3.1.1 (b).

3.2 Gâteaux Derivative and Its Properties

Definition 3.2.1. Let X be a normed space. A function f : X → (−∞, ∞] is said to be Gâteaux² differentiable at x ∈ int(Dom(f)) if there exists a continuous linear functional, denoted by fG′(x), on X such that
f′(x; d) = fG′(x)(d), for all d ∈ X, (3.10)
that is, lim_{t→0} (f(x + td) − f(x))/t exists for all d ∈ X and is equal to the value of the functional fG′(x) at d.

The continuous linear functional fG′(x) : X → R is called the Gâteaux derivative of f at x.

fG′(x; d) := fG′(x)(d) is called the value of the Gâteaux derivative of f at x in the direction d.

Similarly, the Gâteaux derivative of an operator T : X → Y from a normed space X to


another normed space Y can be defined as follows:
Definition 3.2.2. Let X and Y be normed spaces. An operator T : X → Y is said to be Gâteaux differentiable at x ∈ int(Dom(T)) if there exists a continuous linear operator TG′(x) : X → Y such that
lim_{t→0} (T(x + td) − T(x))/t = TG′(x)(d), for all d ∈ X. (3.11)

The continuous linear operator TG′(x) : X → Y is called the Gâteaux derivative of T at x.

TG′(x; d) := TG′(x)(d) is called the value of the Gâteaux derivative of T at x in the direction d.

The relation (3.11) is equivalent to the following relation:
lim_{t→0} k(T(x + td) − T(x))/t − TG′(x; d)k = 0. (3.12)

Remark 3.2.1. If fG′ (x; d) exists, then fG′ (x; −d) = −fG′ (x; d).
Remark 3.2.2. Let X = Rn be the Euclidean space with the standard inner product. If f : Rn → R has continuous partial derivatives of order 1, then f is Gâteaux differentiable at every x = (x1, x2, . . . , xn) ∈ Rn, and its value in the direction d = (d1, d2, . . . , dn) ∈ Rn is given by
fG′(x; d) = ∑_{k=1}^{n} (∂f(x)/∂xk) dk,
2
René Gâteaux (1889-1914) died in the First World War; his work was published by Lévy in 1919 with some improvements.

where ∂f(x)/∂xk denotes the partial derivative of f at the point x with respect to xk. Thus,
∇Gf(x) = (∂f(x)/∂x1, ∂f(x)/∂x2, . . . , ∂f(x)/∂xn)
is the gradient of f at the point x.
Remark 3.2.3. Let X = Rn and Y = Rm be Euclidean spaces with the standard inner products. Let T : Rn → Rm be given by T = (f1, f2, . . . , fm), where each fi : Rn → R, and suppose that T is Gâteaux differentiable at x with TG′(x) represented by an m × n matrix A = (aij). Let d = ej = (0, . . . , 0, 1, 0, . . . , 0), where 1 is at the jth place. Then
lim_{t→0} k(T(x + td) − T(x))/t − Adk = 0
implies that
lim_{t→0} |(fi(x + tej) − fi(x))/t − aij| = 0,
for all i = 1, 2, . . . , m and all j = 1, 2, . . . , n. This shows that fi has partial derivatives at x and
∂fi(x)/∂xj = aij, for i = 1, 2, . . . , m and j = 1, 2, . . . , n.
Hence TG′(x) is represented by the m × n matrix whose (i, j) entry is ∂fi(x)/∂xj, that is, by the Jacobian matrix of T at x.

We now establish that the Gâteaux derivative is unique.

Proposition 3.2.1. Let X and Y be normed spaces, T : X → Y be an operator and x ∈ int(Dom(T)). The Gâteaux derivative TG′(x) of T at x is unique, provided it exists.

Proof. Assume that there exist two continuous linear operators TG′(x) and TG∗′(x) which satisfy (3.12). Then, for all d ∈ X and for sufficiently small t ≠ 0, we have
kTG′(x; d) − TG∗′(x; d)k
= k((T(x + td) − T(x))/t − TG′(x; d)) − ((T(x + td) − T(x))/t − TG∗′(x; d))k
≤ k(T(x + td) − T(x))/t − TG′(x; d)k + k(T(x + td) − T(x))/t − TG∗′(x; d)k
→ 0 as t → 0.
Therefore, kTG′(x; d) − TG∗′(x; d)k = 0 for all d ∈ X. Hence, TG′(x; d) = TG∗′(x; d) for all d ∈ X, and thus TG′(x) ≡ TG∗′(x).

Theorem 3.2.1. Let K be a nonempty open convex subset of a normed space X and
f : K → R be a convex function. If f is Gâteaux differentiable at x ∈ K, then fG′ (x; d)
is linear in d. Conversely, if f+′ (x; d) is linear in d, then f is Gâteaux differentiable at
x.

Proof. Let f be Gâteaux differentiable at x ∈ K, then for all d ∈ X

−f+′ (x; −d) = f−′ (x; d) = f+′ (x; d).

Therefore, for all d, u ∈ X, we have

f+′ (x; d) + f+′ (x; u) ≥ f+′ (x; d + u)


= −f+′ (x; −(d + u))
≥ −f+′ (x; −d) − f+′ (x; −u)
= f+′ (x; d) + f+′ (x; u),

and thus,
f+′ (x; d + u) = f+′ (x; d) + f+′ (x; u).
Since fG′(x; d) = f+′(x; d) = f−′(x; d), we have
fG′(x; d + u) = fG′(x; d) + fG′(x; u).
For α ∈ R with α ≠ 0, we have
fG′(x; αd) = lim_{t→0} (f(x + t(αd)) − f(x))/t = α lim_{t→0} (f(x + (αt)d) − f(x))/(αt) = α fG′(x; d).
Hence fG′(x; d) is linear in d.

Conversely, assume that f+′ (x; d) is linear in d. Then,

0 = f+′ (x; d − d) = f+′ (x; d) + f+′ (x; −d).

Therefore, for all d ∈ X, we have

f−′ (x; d) = −f+′ (x; −d) = f+′ (x; d).

Thus, f is Gâteaux differentiable at x.

Remark 3.2.4. (a) A nonconvex function f : X → R may be Gâteaux differentiable at


a point but the Gâteaux derivative may not be linear at that point. For example,
consider the function f : R2 → R defined by
f(x) = x1²x2/(x1² + x2²) if x ≠ (0, 0), and f(x) = 0 if x = (0, 0),
where x = (x1, x2). For d = (d1, d2) ≠ (0, 0) and t ≠ 0, we have
(f((0, 0) + t(d1, d2)) − f(0, 0))/t = d1²d2/(d1² + d2²).
Then,
fG′((0, 0); d) = lim_{t→0} (f((0, 0) + t(d1, d2)) − f(0, 0))/t = d1²d2/(d1² + d2²).
Therefore, f is Gâteaux differentiable at (0, 0), but fG′((0, 0); d) is not linear in d.
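The failure of additivity can be seen numerically. The Python sketch below (an added illustration, not from the notes) approximates fG′((0, 0); d) by a difference quotient with a small t and shows that its value at d = (1, 1) is not the sum of its values at (1, 0) and (0, 1).

    # The directional derivative d1^2 d2 / (d1^2 + d2^2) is not additive in d.
    def f(x1, x2):
        return x1 * x1 * x2 / (x1 * x1 + x2 * x2) if (x1, x2) != (0.0, 0.0) else 0.0

    def g(d1, d2, t=1e-8):                     # difference quotient at (0, 0)
        return (f(t * d1, t * d2) - f(0.0, 0.0)) / t

    u, v = (1.0, 0.0), (0.0, 1.0)
    print(g(*u), g(*v), g(u[0] + v[0], u[1] + v[1]))   # 0.0, 0.0, 0.5: not additive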

(b) For a real-valued function f defined on Rn, the partial derivatives may exist at a point but f may not be Gâteaux differentiable at that point. For example, consider the function f : R2 → R defined by
f(x) = x1x2/(x1² + x2²) if x ≠ (0, 0), and f(x) = 0 if x = (0, 0),
where x = (x1, x2). For d = (d1, d2) ≠ (0, 0) and t ≠ 0, we have
(f((0, 0) + t(d1, d2)) − f(0, 0))/t = d1d2/(t(d1² + d2²)).
Then,
lim_{t→0} (f((0, 0) + t(d1, d2)) − f(0, 0))/t = lim_{t→0} d1d2/(t(d1² + d2²))
exists only if d = (d1, 0) or d = (0, d2). That is, fG′((0, 0); d) does not exist for a general direction d, but
∂f(0, 0)/∂x1 = 0 = ∂f(0, 0)/∂x2,
where (0, 0) is the zero vector in R2.
(c) The existence, linearity and continuity of fG′(x; d) in d do not imply the continuity of the function f. For example, consider the function f : R2 → R defined by
f(x) = x1³/x2 if x1 ≠ 0 and x2 ≠ 0, and f(x) = 0 if x1 = 0 or x2 = 0,
where x = (x1, x2). Then,
fG′((0, 0); d) = lim_{t→0} (f(td1, td2) − f(0, 0))/t = lim_{t→0} t³d1³/(t²d2) = 0
for all d = (d1, d2) ∈ R2 with d1 ≠ 0 and d2 ≠ 0; when d1 = 0 or d2 = 0 the difference quotient is identically 0, so the limit is again 0. Thus, fG′((0, 0); d) exists and it is continuous and linear in d, but f is discontinuous at (0, 0), since f(x1, x1³) = 1 for every x1 ≠ 0. The function f is Gâteaux differentiable but not continuous. Hence a Gâteaux differentiable function is not necessarily continuous.

(d) The Gâteaux derivative fG′ (x; d) of a function f is positively homogeneous in the second
argument, that is, fG′ (x; rd) = rfG′ (x; d) for all r > 0. But, as we have seen in part (a),
in general, fG′ (x; d) is not linear in d.
Remark 3.2.5. The Gâteaux derivative of a linear operator T : X → Y is again a linear operator; in fact, it is T itself. Indeed, if T : X → Y is a linear operator, then we have
TG′(x; d) = lim_{t→0} (T(x + td) − T(x))/t = lim_{t→0} (T(x) + tT(d) − T(x))/t = T(d).
Hence TG′(x; d) = T(d) for all x ∈ X and d ∈ X.

The following theorem shows that, for a convex function on a subset of Rn, the existence of the partial derivatives at a point already yields Gâteaux differentiability at that point.

Theorem 3.2.2. Let K be a nonempty convex subset of Rn and f : K → R be a convex function. If the partial derivatives of f at x ∈ K exist, then f is Gâteaux differentiable at x.

Proof. Suppose that the partial derivatives of f at x ∈ K exist. We show that the Gâteaux derivative of f at x is the linear functional
fG′(x; d) = ∑_{k=1}^{n} (∂f(x)/∂xk) dk, for d = (d1, d2, . . . , dn) ∈ Rn.
For the fixed x ∈ K, define a function g by
g(d) = f(x + d) − f(x) − fG′(x; d).
Then g is convex, g(0) = 0, and ∂g(0)/∂xk = 0 for all k = 1, 2, . . . , n, since the partial derivatives of f exist at x. Now, if {e1, e2, . . . , en} is the standard basis for Rn, then by the convexity of g, we have, for λ ≠ 0,
g(λd) = g((1/n) ∑_{k=1}^{n} nλdk ek) ≤ (1/n) ∑_{k=1}^{n} g(nλdk ek) = λ ∑_{k=1}^{n} g(nλdk ek)/(nλ).
So,
g(λd)/λ ≤ ∑_{k=1}^{n} g(nλdk ek)/(nλ), for λ > 0,
and
g(λd)/λ ≥ ∑_{k=1}^{n} g(nλdk ek)/(nλ), for λ < 0.
Since
lim_{λ→0} g(nλdk ek)/(nλ) = dk ∂g(0)/∂xk = 0, for all k = 1, 2, . . . , n,
and since, by convexity, λ ↦ g(λd)/λ is nondecreasing on (−∞, 0) ∪ (0, ∞), the two estimates above give
lim_{λ→0} g(λd)/λ = 0,
and so f is Gâteaux differentiable at x.

The mean value theorem in terms of Gâteaux derivative is the following.


Theorem 3.2.3. Let X be a normed space, K be a nonempty open convex subset of X and T : K → R be Gâteaux differentiable on K, with Gâteaux derivative TG′(x; d) at x in the direction d. Then for any points x ∈ K and x + d ∈ K, there exists s ∈ ]0, 1[ such that
T(x + d) − T(x) = TG′(x + sd; d). (3.13)

Proof. Since K is open and convex and contains both x and x + d, we can select an open interval I of real numbers, which contains the numbers 0 and 1, such that x + λd belongs to K for all λ ∈ I. For all λ ∈ I, define
ϕ(λ) = T(x + λd).
Then,
ϕ′(λ) = lim_{τ→0} (ϕ(λ + τ) − ϕ(λ))/τ = lim_{τ→0} (T(x + λd + τd) − T(x + λd))/τ = TG′(x + λd; d). (3.14)
By applying the mean value theorem for real-valued functions of one variable to the restric-
tion of the function ϕ : I → R to the closed interval [0, 1], we obtain
ϕ(1) − ϕ(0) = ϕ′ (s), for some s ∈ ]0, 1[.
By using (3.14) and the definition of ϕ : [0, 1] → R, we obtain the desired result.

For the differentiable function, we have the following result which follows from the above
theorem.
Corollary 3.2.1. If, in the above theorem, T is a differentiable function from Rn to R, then there exists s ∈ ]0, 1[ such that T(x + d) − T(x) = TG′(x + sd; d) = h∇T(x + sd), di.

Now we give the characterization of a convex functional in terms of Gâteaux derivative.


Theorem 3.2.4. Let X be a normed space and f : X → (−∞, ∞] be a proper function.
Let K be a convex subset of int(Dom(f )) such that f is Gâteaux differentiable at each
point of K. Then the following are equivalent:

(a) f is convex on K.

(b) f (y) − f (x) ≥ fG′ (x)(y − x) for all x, y ∈ K.

(c) fG′ (y)(y − x) − fG′ (x)(y − x) ≥ 0 for all x, y ∈ K.

Proof. (a) ⇒ (b). Suppose that f is convex on K. Let x, y ∈ K. Then


f ((1 − t)x + ty) ≤ (1 − t)f (x) + tf (y), for all t ∈ (0, 1).
It follows that
f (x + t(y − x)) − f (x)
≤ f (y) − f (x), for all t ∈ (0, 1).
t
Letting limit as t → 0, we obtain
fG′ (x)(y − x) ≤ f (y) − f (x).
Thus, (b) holds.

(b)⇒(c). Suppose that (b) holds. Let x, y ∈ K. Note that


fG′ (y)(x − y) ≤ f (x) − f (y)
and
fG′ (x)(y − x) ≤ f (y) − f (x).
Adding the above inequalities, we obtain
fG′ (y)(y − x) − fG′ (x)(y − x) ≥ 0.

(c) ⇒ (a). Suppose that (c) holds. Then we have


fG′ (u)(u − v) − fG′ (v)(u − v) ≥ 0, for all u, v ∈ K. (3.15)
Let x, y ∈ K. Define a function g : [0, 1] → R by
g(t) = f (x + t(y − x)), for all t ∈ [0, 1].
Then
g ′ (t) = fG′ (x + t(y − x))(y − x).
Consider u = (1 − t)x + ty and v = (1 − s)x + sy in (3.15), for 0 ≤ s < t ≤ 1. Then we have
(fG′ ((1 − t)x + ty) − fG′ ((1 − s)x + sy)) ((1 − t)x + ty − ((1 − s)x + sy)) ≥ 0,
which implies that
(g ′ (t) − g ′(s))(t − s) = (fG′ ((1 − t)x + ty) − fG′ ((1 − s)x + sy)) (y − x) ≥ 0.
Hence g ′ is monotonic increasing on [0, 1] and hence g is convex on [0, 1]. Thus,
g(λ) ≤ (1 − λ)g(0) + λg(1), for all λ ∈ (0, 1),
it follows that f is convex on K.
Qamrul Hasan Ansari Advanced Functional Analysis Page 76

Exercise 3.2.1. Let f : R2 → R be defined by



 2x2 e−x−2
1
, if x1 6= 0,
−2x−2
f (x1 , x2 ) = 2
x2 +e 1
 0, if x1 = 0.

Prove that f is Gâteaux differentiable at 0 but not continuous there.


Qamrul Hasan Ansari Advanced Functional Analysis Page 77

3.3 Fréchet Derivative and Its Properties

Definition 3.3.1. Let X and Y be normed spaces. An operator (possibly nonlinear) T :


X → Y is said to be Fréchet differentiable at a point x ∈ int(Dom(T )) if there exists a
continuous linear operator T ′ (x) : X → Y such that
kT (x + d) − T (x) − T ′ (x)(d)k
lim = 0. (3.16)
kdk→0 kdk

In this case, T ′ (x), also denoted by DT (x), is called Fréchet derivative of T at the point x.
The operator T ′ : X → B(X, Y ) which assigns a continuous linear operator T ′ (x) to a vector
x is known as the Fréchet derivative3 of T .

The domain of the operator T ′ contains naturally all vectors in X at which the Fréchet
derivative can be defined.

The meaning of the relation (3.16) is that for each ε > 0, there exists a δ > 0 (depending on
ε) such that
kT (x + d) − T (x) − T ′ (x)(d)k
< ε,
kdk
for all d ∈ X satisfying the condition kdk < δ.

Example 3.3.1. Let X = Rn and Y = Rm be Euclidean spaces with the standard inner
product. If T : Rn → Rm is Fréchet differentiable at a point x ∈ Rn , then T is represented
by T (x) = (f1 (x1 , . . . , xn ), . . . , fm (x1 , . . . , xn )), where fj : Rn → R be a function for each
j = 1, 2, . . . , m. Let {ei : i = 1, 2, .P . . n} denote the standardPbasis in Rn . Then the vector
d ∈ R can be represented as d = in=1 di ei and f ′ (x)(d) = ni=1 di f ′ (x)(ei ). Therefore we
n

find that
(f1 (·, xi + t, ·), . . . , fm (·, xi + t, ·)) − (f1 (·, xi , ·), . . . , fm (·, xi , ·))
lim
t→0 t
 
∂f1 (x) ∂fm (x)
= ,..., = T ′ (x)(ei ).
∂xi ∂xi
Thus the Fréchet derivative T ′ is expressed in the following form
X n  
′ ∂f1 (x) ∂fm (x)
T (x)(d) = di ,...,
i=1
∂xi ∂xi
X n  
∂f1 (x) ∂fm (x)
= di , . . . , di
i=1
∂xi ∂xi
 ∂f1 (x)   
∂x1
. . . ∂f∂x1 (x) n
d 1
 .. .. ..   .. 
=  . . .   . .
∂fm (x) ∂fm (x)
∂x
. . . ∂x dn
1 n

3
The Fréchet derivative is introduced by the French mathematician Gil Fréchet in 1925.
Qamrul Hasan Ansari Advanced Functional Analysis Page 78

This shows that the Fréchet derivative T ′ (x) at a point x is a linear operator represented by
the Jacobian matrix.
Remark 3.3.1. If the operators λT (λ is a scalar) and T + S are Fréchet differentiable, then
for all d ∈ X,
(λT )′ (d) = αT ′ (d) and (T + S)′ (d) = T ′ (d) + S ′ (d).

We establish the relation between Gâteaux and Fréchet differentiability.


Proposition 3.3.1. Let X and Y be normed spaces. If the operator T : X → Y is
Fréchet differentiable at x ∈ X, it is Gâteaux differentiable at x and these two derivatives
are equal.

Proof. Since T is Fréchet differentiable at x, we have


kT (x + d) − T (x) − T ′ (x)(d)k
lim = 0.
kdk→0 kdk
Set d = td0 for t > 0 and for any fixed d0 6= 0. Then
kT (x + td0 ) − T (x) − tT ′ (x)(d0 )k
0 = lim
t→0 tkd0 k
kT (x + td0 ) − T (x) 1
= lim − T ′ (x)(d0 )
t→0 t kd0k
which implies that
T (x + td0 ) − T (x)
T ′ (x)(d0 ) = lim = TG′ (x)(d0 ), for all d0 ∈ X.
t→0 t
Hence TG′ (x) ≡ T ′ (x).

The following example shows that the converse of Proposition 3.3.1 is not true, that is, if an
operator T : X → Y is Gâteaux differentiable, then it may not be Fréchet differentiable.
Example 3.3.2. Let X = R2 with the Euclidean norm k · k and f : X → R be a function
defined by  x3 y
, if (x, y) 6= (0, 0),
f (x, y) = x4 +x2
0, if (x, y) = (0, 0).
It can be easily seen that the f is Gâteaux differentiable at (0, 0) with Gâteaux derivative
fG′ (0, 0) = 0.

Since for (x, x2 ) ∈ X with (x, x2 ) 6= (0, 0), we have


|f (x, x2 )| |x3 x3 | 1 1
2
= √ = √ , for k = h2 .
k(x, x )k 4 4 2
(x + x )( x + x )4 2 1 + x2
Therefore, f is not Fréchet differentiable at (0, 0).
Qamrul Hasan Ansari Advanced Functional Analysis Page 79

Theorem 3.3.1. Let X and Y be normed spaces. If the operator T : X → Y is Fréchet


differentiable at x ∈ X, then it is continuous at x.

Proof. Since T has a Fréchet derivative at x ∈ X, for each ε1 > 0, there exists a δ1 > 0
(depending on ε1 ) such that

kT (y) − T (x) − T ′ (x)(y − x)k < ε1 ky − xk,

for all y ∈ X satisfying ky − xk < δ1 . By the triangle inequality

kT (y) − T (x) − T ′ (x)(y − x)k ≥ kT (y) − T (x)k − kT ′ (x)(y − x)k,

we find for ky − xk < δ1 that

kT (y) − T (x)k < ε1 ky − xk + kT ′ (x)(y − x)k


≤ (ε1 + kT ′ (x)k)ky − xk.

Choose δ = min{δ1 , ε/(ε1 + kT ′ (x)k)} for each ε > 0. Then for all y ∈ X, we have

kT (y) − T (x)k < ε whenever ky − xk < δ,

that is, T is continuous at x.

Theorem 3.3.2 (Chain Rule). Let X, Y and Z be normed spaces. If T : X → Y and


S : Y → Z are Fréchet differentiable, then the operator R := S ◦ T : X → Z is also
Fréchet differentiable and its Fréchet derivative is given by

R′ (x) = S ′ (T (x)) ◦ T ′ (x).

Proof. For exercise.

Theorem 3.3.3 (Mean Value Theorem). Let K be an open convex subset of a normed
space X, a, b ∈ K and T : K → X be a Fréchet differentiable such that at each x ∈ (a, b)
(open line segment joining a and b) and T (x) is continuous on closed line segment [a, b].
Then
kT (b) − T (a)k ≤ sup kT ′ (y)k kb − ak. (3.17)
y∈(a,b)

Proof. Let F be a continuous linear functional on X and ϕ : [0, 1] → R be a function defined


by
ϕ(λ) = F ((T ((1 − λ)a + λb))), for all λ ∈ [0, 1].
Qamrul Hasan Ansari Advanced Functional Analysis Page 80

By Classical Mean Value Theorem of Calculus for ϕ, we have that for some λ̂ ∈ [0, 1] and
x = (1 − λ̂)a + λ̂b,

F (T (b) − T (a)) = F (T (b)) − F (T (a))


= ϕ(1) − ϕ(0)
= ϕ′ (λ̂) = F (T ′ (x)(b − a)),

where we have used the Chain Rule and the fact that a bounded linear functional is its own
derivative. Therefore, for each continuous linear functional F on X, we have

kF (T (b) − T (a))k ≤ kF k kT ′(x)k kb − ak. (3.18)

Now, if we define a function G on the subspace [T (b) −T (a)] of X as G(α(F (b)) −F (a)) = α,
then kGk = kT (b) − T (a)k−1 . If F is a Hahn-Banach extension of G to entire X, we find by
substitution in (3.18) that

1 = kF (T (b) − T (a))k ≤ kT (b) − T (a)k−1 kT ′ (x)k kb − ak,

which gives (3.17).

Definition 3.3.2. If T : X → Y is Fréchet differentiable on an open set Ω ⊂ X and the


first Fréchet derivative T ′ at x ∈ Ω is Fréchet differentiable at x, then the Fréchet derivative
of T ′ at x is called the second derivative of T at x and is denoted by T ′′ (x).

Definition 3.3.3. Let X be a normed space. A function f : X → R is said to be twice


Fréchet differentiable at x ∈ int(Dom(T )) if there exists A ∈ B(X, X ∗ ) such that

kf ′ (x + d) − f ′ (x) − A(d)k
lim = 0.
t→0 t
The second derivative of f at x and is f ′′ (x) = A.

It may be observed that if T : X → Y is Fréchet differentiable on an open set Ω ⊂ X, then


T ′ is a mapping on X into B[X, Y ]. Consequently, if T ′′ (x) exists, it is a bounded linear
mapping from X into B[X, Y ]. If T ′′ exists at every point of Ω, then T ′′ : X → B[X, B[X, Y ]].

Theorem 3.3.4 (Taylor’s Formula for Differentiable Functions). Let T : Ω ⊂ X → Y


and let [a, a + h] be any closed segment in Ω. If T is Fréchet differentiable at a, then

T (a + h) = T (a) + T ′ (a)h + khkε(h), lim ε(h) = 0.


h→0

Theorem 3.3.5 (Taylor’s Formula for Twice Fréchet Differentiable Functions). Let T :
Ω ⊂ X → Y and [a, a + h] be any closed segment lying in Ω. If T is differentiable in Ω
Qamrul Hasan Ansari Advanced Functional Analysis Page 81

and twice differentiable at a, then


1
T (a + h) = T (a) + T ′ (a)h + (T ′′ (a)h)h + khk2 ε(h),
2
lim ε(h) = 0.
h→0

For proofs of these two theorems and other related results, we refer to the book by H. Cartan,
Differential Calculus, Herman, 1971.
Qamrul Hasan Ansari Advanced Functional Analysis Page 82

3.4 Some Related Results

Let X be a Hilbert space and f : X → (−∞, ∞] be a proper functional such that f is Gâteaux
differentiable at a point x ∈ int(Dom(f )). Then, by Riesz representation theorem4 ,there
exists exactly one vector, denoted by ∇G f (x) in X such that

fG′ (x)(d) = h∇G f (x), di, for all d ∈ X and kfG′ (x)k∗ = k∇G f (x)k. (3.19)

We say that ∇G f (x) is the Gâteaux gradient vector of f at x. Alternatively, we have

f (x + td) − f (x)
fG′ (x)(d) = h∇G f (x), di = lim , for all d ∈ X.
t∈R, t→0 t

Example 3.4.1. Let X be p a real inner product space and f : X → R be a functional


defined by f (x) = kxk = hx, xi for all x ∈ X. Then f is differentiable on X \ {0} with
1
∇G f (x) = kxk x for 0 6= x ∈ X.

In fact, for x, d ∈ X with x 6= 0, we have


p p
f (x + td) − f (x) = kxk2 + 2thx, di + t2 kdk2 − kxk2
kxk2 + 2thx, di + t2 kdk2 − kxk2
= p p
kxk2 + 2thx, di + t2 kdk2 + kxk2
2thx, di + t2 kdk2
= p p , for all t ∈ R,
kxk2 + 2thx, di + t2 kdk2 + kxk2

which implies that

f (x + td) − f (x) 1
fG′ (x)(d) = lim = hx, di = h∇G f (x), di,
t→0 t kxk
1
where ∇G f (x) = kxk
x.

Lemma 3.4.1 (Descent lemma). Let X be a Hilbert space and f : X → R be a differ-


entiable convex function such that ∇f : X → X is ∇f is β-Lipschitz continuous. Then
the following assertions hold:

(a) For all x, y ∈ X,

β
f (y) − f (x) ≤ ky − xk2 + hy − x, ∇f (x)i.
2
Qamrul Hasan Ansari Advanced Functional Analysis Page 83

(b) For all x ∈ X,  


1 1
f x − ∇f (x) ≤ f (x) − k∇f (x)k2 .
β 2β

Proof. (a) Let x, y ∈ X. Define φ : [0, 1] → R by

φ(t) = f (x + t(y − x)), for all t ∈ [0, 1].

Noticing that

φ(1) = f (y), φ(0) = f (x) and φ′ (t) = hy − x, ∇f (x + t(y − x))i.

Hence
Z 1
f (y) = f (x) + φ′ (t)dt
Z0 1
= f (x) + hy − x, ∇f (x + t(y − x))idt
0
Z 1
= f (x) + hy − x, ∇f (x + t(y − x)) − ∇f (x)idt + hy − x, ∇f (x)i
0
Z 1
≤ f (x) + ky − xk k∇f (x + t(y − x)) − ∇f (x)kdt + hy − x, ∇f (x)i
0
β
≤ f (x) + ky − xk2 + hy − x, ∇f (x)i.
2

1
(b) Replacing y by x − ∇f (x) in (a), we get (b).

Definition 3.4.1. Let X be an inner product space. An operator T : X → X is said to be
γ-inverse strongly monotone or γ-cocercive if there exists γ > 0 such that

hT (x) − T (y), x − yi ≥ γkT (x) − T (y)k2, for all x, y ∈ X.

Proposition 3.4.1. Let X be a Hilbert space and f : X → R be a Fréchet differentiable


convex function such that ∇f : X → X is ∇f is β-Lipschitz continuous for some β > 0.
1
Then ∇f is -inverse strongly monotone, that is,
β
1
h∇f (x) − ∇f (y), x − yi ≥ k∇f (x) − ∇f (y)k2.
β

Proof. Let x ∈ X. Define g : X → R by

g(z) = f (z) − f (x) − h∇f (x), z − xi, for all z ∈ X.


Qamrul Hasan Ansari Advanced Functional Analysis Page 84

Note
g(x) = 0 ≤ f (z) − f (x) − h∇f (x), z − xi = g(z), for all z ∈ X
and
∇g(z) = ∇f (z) − ∇f (x), for allz ∈ X.
Clearly, inf g(z) = 0. One can see that
z∈X

k∇g(u) − ∇g(v)k = k∇f (u) − ∇f (v)k ≤ βku − vk, for all u, v ∈ X.

Let y ∈ X. From Lemma 3.4.1(b), we have


1
inf g(z) ≤ g(y) − k∇g(y)k2,
z∈X 2β
which implies that
1
0 ≤ f (y) − f (x) − h∇f (x), y − xi − k∇f (y) − ∇f (x)k2 .

Similarly, we have
1
0 ≤ f (x) − f (y) − h∇f (y), x − yi − k∇f (x) − ∇f (y)k2.

Thus, we have
1
0 ≤ h∇f (x) − ∇f (y), x − yi − k∇f (x) − ∇f (y)k2.
β

Definition 3.4.2. Let X be a inner product space. An operator T : X → X is said to be

(a) nonexpansive if
kT (x) − T (y)k ≤ kx − yk, for all x, y ∈ X;

(b) firmly nonexpansive if

kT (x) − T (y)k2 + k(I − T )(x) − (I − T )(y)k2 ≤ kx − yk2, for all x, y ∈ X,

where I is the identity operator.

It can be easily seen that every firmly nonexpansive mapping is nonexpansive but converse
may not hold. For example, consider the negative of identity operator, that is, (−I).

Corollary 3.4.1. Let X be a Hilbert space and f : X → R be a Fréchet differentiable


convex function. Then

∇f is nonexpansive ⇔ ∇f is firmly nonexpansive.


Qamrul Hasan Ansari Advanced Functional Analysis Page 85

Exercise 3.4.1. Let X be a Hilbert space and Y be an inner product space, A ∈ B(X, Y )
and b ∈ Y . Define a functional f : X → R by
1
f (x) = kA(x) − bk2 , for all x ∈ X.
2
Then prove that f is Fŕechet differentiable on X with ∇f (x) = A∗ (Ax−b) and ∇2 f (x) = A∗ A
for each x ∈ X.

Proof. Let x ∈ X. Then, for y ∈ X, we have

f (x + y) − f (x)
1
= hA(x) − b + A(y), A(x) − b + A(y)i − f (x)
2
1
= [hA(x) − b, A(x) − bi + hA(x) − b, A(y)i + hA(y), A(x) − bi + hA(y), A(y)i] − f (x)
2
1
= hA(x) − b, A(y)i + kA(y)k2
2
∗ 1
= hA (Ax − b), yi + kA(y)k2.
2
Thus,
1 kAk2
|f (x + y) − f (x) − hA∗ (Ax − b), yi| = kA(y)k2 ≤ kyk2, for all y ∈ X.
2 2
Therefore, f is Fŕechet differentiable on X with

f ′ (x)y = h∇f (x), yi, for all x ∈ X,

where ∇f (x) = A∗ (A(x) − b). It is easy to see that ∇2 f (x) = A∗ A.


Exercise 3.4.2. Let X an inner product space and a ∈ X. Define a functional f : X → R
by
1
f (x) = kx − ak2 , for all x ∈ X.
2
Then prove that f is Fŕechet differentiable on X with ∇f (x) = x − a and ∇2 f (x) = I for
each x ∈ X.
Exercise 3.4.3. Let X be a Hilbert space X and A : X → X be a bounded linear operator.
Let b ∈ X, c ∈ R and define
1
f (x) = hAx, xi − hb, xi + c, x ∈ C.
2
1
Then prove that f is Fréchet differentiable on X with ∇f (x) = (A + A∗ )(x) − b and
2
2 1 ∗
∇ f (x) = (A + A ) for each x ∈ X.
2
Qamrul Hasan Ansari Advanced Functional Analysis Page 86

Proof. Let x ∈ X. Then, for y ∈ X, we have


1
f (x + y) = hA(x + y), x + yi − hb, (x + y)i + c
2
1
= [hA(x) + A(y), xi + hA(x) + A(y), yi] − hb, (x + y)i + c
2
1
= [hA(x), xi + hA(y), xi + hA(x), yi + hA(y), yi] − hb, (x + y)i + c
2
1 1
= hA(x), xi − hb, xi + c + [hy, A∗(x)i + hA(x), yi + hA(y), yi] − hb, yi
2 2
1 1
= f (x) + h (A + A∗ )(x) − b, yi + hA(y), yi.
2 2
Thus,
1
kf (x + y) − f (x) − h (A + A∗ )(x) − b, yik ≤ kAkkyk2, for all y ∈ X.
2
Therefore,
kf (x + y) − f (x) − h 21 (A + A∗ )(x) − b, yik
lim = 0,
kyk→0 kyk
1
i.e., f is differentiable with ∇f (x) = (A + A∗ )(x) − b. One can see that
2
1
∇2 f (x) = (A + A∗ ).
2

Exercise 3.4.4. Let X be a Hilbert space, b ∈ X and A : X → X be a self-adjoint, bounded,


linear operator and strongly positive, i.e., there exists α > 0 such that

hA(x), xi ≥ αkxk2 , for all x ∈ X.

Let b ∈ X and define a quadratic function f : X → R by


1
f (x) = hA(x), xi + hx, bi, for all x ∈ X.
2
Then prove that ∇f (·) = A(·) + b is α-strongly monotone and kAk-Lipschitz continuous.

When X = RN is finite dimensional, then the above operator A coincides with a positive
definite matrix. Then ∇2 f (x) = A and

λmin kxk2 ≤ hAx, xi ≤ λmax kxk2 , for all x ∈ RN ,

where λmin and λmax are the minimum and maximum eigenvalues of A, respectively. Hence
α = λmin ≤ λmax = kAk.
Qamrul Hasan Ansari Advanced Functional Analysis Page 87

3.5 Subdifferential and Its Properties

The concept of a subdifferential plays an important role in problems of optimization and


convex analysis. In this section, we study subgradients and subdifferentials of R∞ -valued
convex functions and their properties in normed spaces.

We have already seen in Theorem 3.2.4 that if X is a normed space, f : X → (−∞, ∞] is a


proper convex function and x ∈ int(Dom(f )), then the following inequality holds:

fG′ (x)(y − x) + f (x) ≤ f (y), for all y ∈ X. (3.20)

The inequality (3.20) motivates us to introduce the notion of another kind of differentiability
when f is not Gâteaux differentiable at x, but the inequality (3.20) holds.

Definition 3.5.1. Let X be a normed space, f : X → (−∞, ∞] be a proper function and


x ∈ Dom(f ). Then an element j ∈ X ∗ is said to be a subgradient of f at x if
f (x) ≤ f (y) + hx − y, ji for all y ∈ X. (3.21)

The set (possibly nonempty)


∂f (x) := {j ∈ X ∗ : f (x) ≤ f (y) + hx − y, ji, for all y ∈ X},
of subgradients of f at x is called the subdifferential or Fenchel subdifferential of f at x.

Clearly, ∂f (x) may be empty set even if f (x) ∈ R. But for the case x ∈ / Dom(f ), we
consider ∂f (x) = ∅. Thus, subdifferential of a proper convex function f is a set-valued
mapping ∂f : X ⇒ X ∗ defined by

∂f (x) = {j ∈ X ∗ : f (x) ≤ f (y) + hx − y, ji for all y ∈ X}.


The domain of the subdifferential ∂f is defined by
Dom(∂f ) = {x ∈ X : ∂f (x) 6= ∅}.

Obviously, Dom(∂f ) ⊆ Dom(f ).


Remark 3.5.1. (a) If f (x) 6= ∞, then Dom(∂f ) is a subset of Dom(f ).
(b) If f (x) = ∞ for some x, then ∂f (x) = ∅.
Definition 3.5.2. Let X be a Hilbert space, f : X → (−∞, ∞] be a proper function. The
subdifferential of f is the set-valued map ∂f : X ⇒ X defined by
∂f (x) = {u ∈ X : f (x) ≤ f (y) + hx − y, ui for all y ∈ X}, for x ∈ X. (3.22)

Then f is said to be subdifferentiable at x ∈ X if ∂f (x0 ) 6= ∅. The elements of ∂f (x) are


called the subgradients of f at x.
Qamrul Hasan Ansari Advanced Functional Analysis Page 88

Example 3.5.1. Let f : R → R be a function defined by f (x) = |x| for x ∈ R. Then



 {−1}, if x < 0,
∂f (x) = [−1, 1], if x = 0,

{1}, if x > 0.

Note that f is convex and continuous, but not differentiable at 0. Clearly, f is subdifferen-
tiable at 0 with ∂f (0) = [−1, 1]. Also Dom(∂f ) = Dom(f ) = R.
Example 3.5.2. Define f : R → (−∞, ∞] by

0, if x = 0,
f (x) =
∞, otherwise.

Then 
∅, if x 6= 0,
∂f (x) =
R, if x = 0.
Note that f is not continuous at 0, but f is subdifferentiable at 0 with ∂f (0) = R.
Example 3.5.3. Define f : R → (−∞, ∞] by

∞,
√ if x < 0,
f (x) =
− x, if x ≥ 0.

Then 
∅, if x ≤ 0,
∂f (x) =
− 2√1 x , if x > 0.
Note that Dom(f ) = [0, ∞) and f is not continuous at 0. Moreover, ∂f (0) = ∅ and
Dom(∂f ) = (0, ∞). Thus, f is not subdifferentianble at 0 even 0 ∈ Dom(f ).

We now consider some more general functions.


Example 3.5.4. Let X be a inner product space, a ∈ X and define f : X → R by
f (x) = kx − ak for x ∈ X. Then

S1 (0), if x = a,
∂f (x) =
x − a, if x 6= a,

where S1 (0) is open unit ball at 0 ∈ X


Example 3.5.5. Let K be a nonempty closed convex subset of a normed space X and iK
the indicator function of K, i.e.,

0, if x ∈ K,
iK (x) =
∞, otherwise.

Then
∂iK (x) = {j ∈ X ∗ : hx − y, ji ≥ 0 for all y ∈ K} , for x ∈ K.
Qamrul Hasan Ansari Advanced Functional Analysis Page 89

Proof. Since the indicator function is a proper lower semicontinuous convex function on X,
from (3.21), we have

∂iK (x) = {j ∈ X ∗ : iK (x) − iK (y) ≤ hx − y, ji for all y ∈ K} .

Remark 3.5.2. Dom(iK ) = Dom(∂iK ) = K and ∂iK (x) = {0} for each x ∈ int(K).

3.5.1 Properties of Subdifferentials

Definition 3.5.3. Let X be an inner product space. A set-valued mapping T : X ⇒ X is


said to be

(a) monotone if for all x, y ∈ X,

hu − v, x − yi ≥ 0, for all u ∈ T (x) and v ∈ T (y);

(b) maximal monotone if it is monotone and its graph Graph(T ) := {(x, u) ∈ X × X :


u ∈ T (x)} is not contained properly in the graph of any other monotone set-valued
mapping.

Theorem 3.5.1. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex


function. Then ∂f is monotone.

Proof. Let x, y ∈ and u ∈ ∂f (x), v ∈ ∂f (y) be arbitrary. Then

f (x) ≤ f (z) + hx − z, ui, for all z ∈ X (3.23)

and
f (y) ≤ f (w) + hy − w, vi, for all w ∈ X. (3.24)
Taking z = y in (3.23) and w = x in (3.24) and adding the resultants, we get

f (x) + f (y) ≤ f (y) + f (x) + hx − y, ui + hy − x, vi,

which implies that


hu − v, x − yi ≥ 0.
Thus, ∂f is monotone.
Qamrul Hasan Ansari Advanced Functional Analysis Page 90

Theorem 3.5.2. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper lower


semicontinuous convex function. Then R(I + ∂f ) = X.

Proof. Noticing that R(I + ∂f ) ⊆ X. It is suffices to show that X ⊆ R(I + ∂f ). For this,
let x0 ∈ X and define
1
ψ(x) = kxk2 + f (x) − hx, x0 i, for all x ∈ X.
2
Note that ψ has an affine lower bound and lim ψ(x) = ∞. Hence, from Theorem A5 , there
kxk→∞
exists z ∈ Dom(f ) such that
ψ(z) = inf ψ(x).
x∈X

Thus, for all x ∈ X, from Proposition P6 , we have

kxk2 ≤ kzk2 + 2hx − z, xi

and
1 1
kzk2 + f (z) − hz, x0 i ≤ kxk2 + f (x) − hx, x0 i,
2 2
which imply that
1
f (z) ≤ f (x) + (kxk2 − kzk2 ) + hz − x, x0 i
2
≤ f (x) + hx − z, xi + hz − x, x0 i
= f (x) + hx − z, x − x0 i.

Let u ∈ X. Define zt = (1 − t)z + tu for t ∈ (0, 1). Hence, for t ∈ (0, 1), we obtain

f (z) ≤ (1 − t)f (z) + tf (u) + thu − z, zt − x0 i,

which gives us that


f (z) ≤ f (u) + hu − z, zt − x0 i.
Letting limit as t → 0+ , we get

f (z) ≤ f (u) + hu − z, z − x0 i.

Hence x0 − z ∈ ∂f (z), i.e., x0 ∈ (I + ∂f )(z) ⊆ R(I + ∂f ). Thus, X ⊆ R(I + ∂f ).

From Theorem 3.5.2, we have

5
Theorem A. Let K be a nonempty closed convex subset of a Hilbert space X and f : K → (−∞, +∞]
be a proper lower semicontinuous function such that f (xn ) → ∞ as kxn k → ∞. Then there exists x̄ ∈ K
such that f (x̄) = inf f (x).
x∈K
6
Let X be an inner product space. Then for any x, y ∈ X, kxk2 ≤ kyk2 − 2hy − x, xi
Qamrul Hasan Ansari Advanced Functional Analysis Page 91

Corollary 3.5.1. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper lower


semicontinuous convex function. Then R(I + λ∂f ) = X for all λ ∈ (0, ∞).

Theorem 3.5.3. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper lower


semicontinuous convex function. Then ∂f is maximal monotone.

Theorem 3.5.4. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex


function. Then, for each x ∈ Dom(f ), ∂f (x) is closed and convex.

Proof. Exercise.

Theorem 3.5.5. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex


function. Then ∂f −1 (0) is closed and convex.

Proof. Exercise.

We now study some calculus of subgradients.

Proposition 3.5.1. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper


function. Then
∂(λf ) = λ∂f, for all λ ∈ (0, ∞).

Proof. Let λ ∈ (0, ∞). Then, for x ∈ X, we have

z ∈ ∂(λf )(x) ⇔ λf (x) ≤ λf (y) + hx − y, zi, for all y ∈ X


1
⇔ f (x) ≤ f (y) + hx − y, zi, for all y ∈ X
λ
1
⇔ z ∈ ∂f (x)
λ
⇔ z ∈ λ∂f x).

Therefore,
∂(λf ) = λ∂f, for all λ ∈ (0, ∞).

Theorem 3.5.6. Let X be a Hilbert space. Let f, g : X → (−∞, ∞] be proper convex


functions and there exists x0 ∈ Dom(f ) ∩ Dom(g) where f is continuous. Then

∂(f + g) = ∂f + ∂g.
Qamrul Hasan Ansari Advanced Functional Analysis Page 92

Theorem 3.5.7. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex


function. Let x ∈ Dom(f ) and u ∈ X. Then

u ∈ ∂f (x) ⇔ hy, ui ≤ f ′ (x; y), for all y ∈ X.

Proof. Suppose that u ∈ ∂f (x) and y ∈ X. From (3.22), we have

f (x) ≤ f (x + ty) + hx − (x + ty), ui, for all t ∈ (0, ∞).

Hence
f (x + ty) − f (x)
hy, ui ≤ , for all t ∈ (0, ∞).
t
Letting limit as t → 0, we get
hy, ui ≤ f ′ (x; y).

Conversely, suppose that


hy, ui ≤ f ′ (x; y), for all y ∈ X. (3.25)
From (3.4) and (3.25), we have

hy − x, ui ≤ f ′ (x; y − x) ≤ f (y) − f (x), for all y ∈ X.

This shows that u ∈ ∂f (x).

We now give a relation between Gâteaux differentiability and subdifferentiability.

Theorem 3.5.8. Let X be a Banach space and f : X → (−∞, ∞] a proper convex


function. Let f be Gâteaux differentiable at a point x0 ∈ Dom(f ). Then x0 ∈ Dom(∂f )
and ∂f (x0 ) = {fG′ (x0 )}. In this case,

d
f (x0 + ty) = hy, ∂f (x0 )i = hy, fG′ (x0 )i, for all y ∈ X.
dt t=0

Proof. Since f is Gâteaux differentiable at x0 ∈ Dom(f ). Then

f (x0 + ty) − f (x0 )


hy, fG ′ (x0 )i = lim , for all y ∈ X.
t→0 t
By the convexity of f , we have

f (x0 + λ(y − x0 )) = f ((1 − λ)x0 + λy) ≤ (1 − λ)f (x0 ) + λf (y), for all y ∈ X and λ ∈ (0, 1),

i.e,
f (x0 + λ(y − x0 )) − f (x0 )
≤ f (y) − f (x0 ), for all y ∈ X and λ ∈ (0, 1),
λ
Qamrul Hasan Ansari Advanced Functional Analysis Page 93

It follows that
hy − x0 , fG′ (x0 )i ≤ f (y) − f (x0 ), for all y ∈ X,
i.e., fG′ (x0 ) ∈ ∂f (x0 ). This shows that x0 ∈ Dom(∂f ).

Now, let jx0 ∈ ∂f (x0 ). Then, we have


f (x0 ) − f (u) ≤ hx0 − u, jx0 i, for all u ∈ X.
Let h ∈ X and let ut = x0 + λh for λ ∈ (0, ∞). Then
f (x0 + λh) − f (x0 )
≥ hh, jx0 i, for all λ ∈ (0, ∞).
λ
Letting limit as λ → 0, we get
hh, f ′G (x0 ) − jx0 i ≥ 0, for all h ∈ X,
i.e., jx0 = fG′ (x0 ). Therefore, f is Gâteaux differentiable at x0 and fG′ (x0 ) = ∂f (x0 ).

Corollary 3.5.2. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex


function such that f is Gâteaux differentiable at a point x0 ∈ Dom(f ). Then x0 ∈
Dom(∂f ) and ∂f (x0 ) = {∇G f (x0 )}. In this case,

d
f (x0 + ty) = hy, ∂f (x0 )i = hy, ∇G f (x0 )i, for all y ∈ X.
dt t=0

Exercise 3.5.1. Let X be a Banach space. Then prove that


∂kxk = {j ∈ X ∗ : hx, ji = kxk kjk∗ , kjk∗ = 1} , for all x ∈ X \ {0}.

Proof. Let j ∈ ∂kxk. Then


hy − x, ji ≤ kxk − kyk ≤ ky − xk, for all y ∈ X. (3.26)
It follows that j ∈ X ∗ and kjk ≤ 1. It is clear from (3.26) that kxk ≤ hx, ji, which gives
hx, ji = kxk and kjk∗ = 1.
Thus,
∂kxk ⊆ {j ∈ X ∗ : hx, ji = kxk and kjk∗ = 1} .
Suppose that j ∈ X ∗ such that j ∈ {f ∈ X ∗ : hx, f i = kxk and kf k∗ = 1}. Then hx, ji = kxk
and kjk∗ = 1. Thus,
hy − x, ji = hy, ji − kxk ≤ kyk − kxk, for all y ∈ X,
that is, j ∈ ∂kxk. It follows that
{j ∈ X ∗ : hx, ji = kxk and kjk∗ = 1} ⊆ ∂kxk.
Therefore, ∂kxk = {j ∈ X ∗ : hx, ji = kxk and kjk∗ = 1}
Qamrul Hasan Ansari Advanced Functional Analysis Page 94

Exercise 3.5.2. Let X be a Hilbert space and a ∈ X. Define f : X → R by


1
f (x) = kx − ak2 , for all x ∈ X.
2
Then prove that ∂f (x) = {x − a} for all x ∈ X.

Hint 3.5.1. It is easy to see that f is differentiable with ∇f (x) = x − a for all x ∈ X by
Proposition 3.4.1.

Exercise 3.5.3. Let X be a Hilbert space. Then prove that ∂ 12 k · k2 = I.


4

Geometry of Banach Spaces

Among all infinite dimensional Banach spaces, Hilbert spaces have the most important and
useful geometric properties. Namely, the inner product on an inner product space satisfies
the parallelogram law. It is well known that a normed space is an inner product space if
and only if its norm satisfies the parallelogram law. The geometric properties of an inner
product space make numerous problems posed in inner product space more manageable than
those in normed spaces. Consequently, to extend some of inner product techniques and inner
product properties, we study the geometric properties of normed spaces. In this chapter, we
study strict convexity, modulus of convexity, uniform convexity and smoothness of normed
spaces. Most of the results presented in this chapter are given in the standard books on
functional analysis, convex analysis and geometry of Banach spaces, namely, recommended
books 1 and 3.

4.1 Strict Convexity and Modulus of Convexity

It is well known that the norm of a normed space X is convex, that is,
kλx + (1 − λ)yk ≤ λkxk + (1 − λ)kyk, for all x, y ∈ X and λ ∈ [0, 1].
There are several norms of normed spaces which are strictly convex, that is,
kλx + (1 − λ)yk < λkxk + (1 − λ)kyk, for all x, y ∈ X with x 6= y and λ ∈ (0, 1). (4.1)

We denote by SX the unit sphere SX = {x ∈ X : kxk = 1} in a normed space X. If x, y ∈ SX


with x 6= y, then (4.1) reduces to
kλx + (1 − λ)yk < 1, for all λ ∈ (0, 1),
which says that the unit sphere SX contains no line segments. This suggests strict convexity
of normed space.

95
Qamrul Hasan Ansari Advanced Functional Analysis Page 96

Definition 4.1.1. A normed space X is said to be strictly convex if

x, y ∈ SX with x 6= y ⇒ kλx + (1 − λ)yk < 1, for all λ ∈ (0, 1).

Geometrically speaking, the normed space X is strictly convex if the boundary of the unit
sphere in X contains no line segments.

Clearly, if k · k is strictly convex, then X is strictly convex. Also, kλx + (1 − λ)yk < 1 =
λkxk + (1 − λ)kyk (because kxk = kyk = 1) implies that k · k is strictly convex.

Before giving the examples of strictly convex normed spaces, we present the following char-
acterizations.
Proposition 4.1.1. The following assertions are equivalent:

(a) X is strictly convex.

(b) If x 6= y and kxk = kyk = 1 (that is, x, y ∈ SX ), then kx + yk < 2.

(c) If for any x, y, z ∈ X, kx − yk = kx − zk + kz − yk, then there exists λ ∈ [0, 1]


such that z = λx + (1 − λ)y.

Proof. (a) ⇒ (b): Assume that X is strictly convex. Then for any x, y ∈ SX , we have
kxk = kyk = 1 and therefore, by strict convexity of X, we have kλx + (1 − λ)yk < 1 for all
λ ∈ [0, 1]. Take λ = 21 , then we obtain kx + yk < 2, that is, (b) holds.

(b) ⇒ (a): Suppose contrary that for each x, y ∈ X, x 6= y, kxk = kyk = 1 and λ0 ∈ (0, 1),
we have kλ0 x + (1 − λ0 )yk = 1, that is, λ0 x + (1 − λ0 )y ∈ SX . Take λ0 < λ < 1, then
 
λ0 λ0
λ0 x + (1 − λ0 )y = [λx + (1 − λ)y] + 1 − y,
λ λ
 
λ0 (1 − λ) λ0
as 1 − λ0 = + 1− and hence
λ λ
 
λ0 λ0
1 = kλ0 x + (1 − λ0 yk ≤ kλx + (1 − λ)yk + 1 − kyk.
λ λ
This implies that  
λ0 λ0 λ0
kλx + (1 − λ)yk ≥ 1 − 1 − = ,
λ λ λ
that is, kλx + (1 − λ)yk ≥ 1.

Similarly, for 0 < λ < λ0 , we can have kλx + (1 − λ)yk ≥ 1. So for particular λ = 12 , we have
1
2
kx + yk ≥ 1, that is, kx + yk ≥ 2, a contradiction of the condition of strict convexity.
Qamrul Hasan Ansari Advanced Functional Analysis Page 97

(a) ⇒ (c): Let x, y, z ∈ X such that kx − yk = kx − zk + kz − yk. Suppose that kx − zk =


6 0,
kz − yk =
6 0 and kx − zk ≤ kz − yk. Then
1 x−z 1 z−y
· + ·
2 kx − zk 2 kz − yk
1 x−z 1 z−y 1 z−y 1 z−y
≥ · + · − · − ·
2 kx − zk 2 kx − zk 2 kx − zk 2 kz − yk
1 x−z 1 z−y 1 (z − y)kz − yk − (z − y)kx − zk
= · + · − ·
2 kx − zk 2 kx − zk 2 kx − zk kz − yk
1 kx − yk 1 kz − yk − kx − zk
= · − ·
2 kx − zk 2 kx − zk
1 kx − yk − kz − yk + kx − zk
= ·
2 kx − zk
1 kx − zk + kz − yk − kz − yk + kx − zk
= · = 1,
2 kx − zk
x−z z−y
since kx − yk = kx − zk + kz − yk. Now since kx−zk
= 1 and kz−yk
= 1, that is,
x−z z−y
kx−zk
∈ SX and kz−yk
∈ SX , we have

1 x−z 1 z−y
· + · < 1.
2 kx − zk 2 kz − yk
Hence,
x−z z−y
+ = 2.
kx − zk kz − yk
Therefore,
x−z z−y
= , by (b)
kx − zk kz − yk
and this yields
kz − yk kx − zk
z= ·x+ · y.
kx − zk + kz − yk kx − zk + kz − yk

(c) ⇒ (b): Let x 6= y such that kxk = kyk = x+y 2


= 1. Then kx + yk = kxk + kyk.
Consequently, there exists λ ∈ (0, 1) such that z = 0 = λx − (1 − λ)y, that is, x = 1−λ
λ
y.
1−λ
So that kxk = λ kyk. Since kxk = kyk = 1, we have λ = 1/2. Therefore, x = y, a
contradiction.
Remark 4.1.1. (a) The assertion (b) in Proposition 4.1.1 says that the midpoint (x+y)/2
of two distinct points x and y on the unit sphere SX of X does not lie on SX . In other
words, if x, y ∈ SX with kxk = kyk = k(x + y)/2k, then x = y.

(b) The assertion (c) in Proposition 4.1.1 says that any three point x, y, z ∈ X satisfying
kx − yk = kx − zk + kz − yk must lie ona line; specially,
 if kx − zk = r1 , ky − zk = r2
r1 r2
and kx − yk = r = r1 + r2 , then z = r x + r y.
Qamrul Hasan Ansari Advanced Functional Analysis Page 98

We give some examples of strict convex spaces.

Example 4.1.1. Consider X = Rn , n ≥ 2 with norm kxk2 defined by

n
!1/2
X
kxk2 = x2i , x = (x1 , x2 , . . . , xn ) ∈ Rn .
i=1

. . . , 0) ∈ Rn and y = (0, 1, 0, . . . , 0) ∈ Rn . Then x 6= y, kxk2 = 1 = kyk2,


Let x = (1, 0, 0,√
but kx + yk2 = 2 < 2. Hence X is strictly convex.

SX

The unit sphere in R2 with


p respect to the norm
kxk2 = k(x1 , x2 )k = x21 + x22

Example 4.1.2. Consider X = Rn , n ≥ 2 with norm k · k1 defined by

kxk1 = |x1 | + |x2 | + · · · + |xn |, x = (x1 , x2 , . . . , xn ) ∈ Rn .

Then X is not strictly convex. To see this, let x = (1, 0, 0, . . . , 0) ∈ Rn and y = (0, 1, 0, . . . , 0) ∈
Rn . Then x 6= y, kxk1 = 1 = kyk1, but kx + yk1 = 2.
Qamrul Hasan Ansari Advanced Functional Analysis Page 99

SX

The unit sphere in R2 with respect to the norm


kxk1 = k(x1 , x2 )k1 = |x1 | + |x2 |

Example 4.1.3. Consider X = Rn , n ≥ 2 with norm k · k∞ defined by


kxk∞ = max |xi |, x = (x1 , x2 , . . . , xn ) ∈ Rn .
1≤i≤n

Then X is not strictly convex. Indeed, for x = (1, 0, 0, . . . , 0) ∈ Rn and y = (1, 1, 0, . . . , 0) ∈


Rn , we have, x 6= y, kxk∞ = 1 = kyk∞, but kx + yk∞ = 2.

SX

The unit sphere in R2 with respect to the norm


kxk∞ = k(x1 , x2 )k∞ = max{|x1 |, |x2 |}

Example 4.1.4. The space C[a, b] of all real-valued continuous functions defined on [a, b]
with the norm kf k = sup |f (t)|, is not strictly convex. Indeed, choose two functions f and
a≤t≤b
g defined as follows:
b−t
f (t) = 1, for all t ∈ [a, b] and g(t) = , for all t ∈ [a, b].
b−a
Qamrul Hasan Ansari Advanced Functional Analysis Page 100

Then, clearly, f, g ∈ C[a, b], kf k = kgk = k(f + g)/2k = 1, however, f 6= g. Therefore,


C[a, b] is not strictly convex.
Exercise 4.1.1. Show that the spaces L1 , L∞ and c0 are not strictly convex.

The following proposition provides some equivalent conditions of strict convexity.


Proposition 4.1.2. Let X be a normed space. Then X is strictly convex if and only if
for each nonzero f ∈ X ∗ , there exists at most one point x ∈ X with kxk = 1 such that
hx, f i = f (x) = kf k∗ .

Proof. Let X be a strictly convex normed space and f ∈ X ∗ . Suppose there exist two
distinct points x, y ∈ X with kxk = kyk = 1 such that f (x) = f (y) = kf k∗. If λ ∈ (0, 1),
then
kf k∗ = λf (x) + (1 − λ)f (y) (since f (x) = f (y) = kf k∗ )
= f (λx + (1 − λ)y) (because f is linear)
≤ kf k∗kλx + (1 − λ)yk
< kf k∗, (since kλx + (1 − λ)yk < 1)
which is a contradiction. Therefore, there exists at most one point x in X with kxk = 1 such
that f (x) = kf k∗.

Conversely, assume that x, y ∈ SX with x 6= y such that k(x + y)/2k = 1. By Hahn-Banach


Theorem (Corollary 6.0.1), there exists a functional j ∈ SX ∗ such that
kjk∗ = 1 and h(x + y)/2, ji = k(x + y)/2k.
Since hx, ji ≤ kxk kjk = 1 and hy, ji ≤ kyk kjk = 1, we have hx, ji = hy, ji because
 
x+y x+y
,j = = 1 ⇔ hx + y, ji = 2 ⇔ hx, ji + hy, ji = 2.
2 2
This implies, by hypothesis, that x = y. Therefore, X is strictly convex.

Proposition 4.1.3. A normed space X is strictly convex if and only if the functional
h(x) := kxk2 is strictly convex, that is,

kλx + (1 − λ)yk2 < λkxk2 + (1 − λ)kyk2, for all x, y ∈ X, x 6= y and λ ∈ (0, 1).

Proof. Suppose that X is strictly convex. Let x, y ∈ X, λ ∈ (0, 1). Then we have
kλx + (1 − λ)yk2 ≤ (λkxk + (1 − λ)kyk)2 (4.2)
= λ2 kxk2 + 2λ(1 − λ)kxk kyk + (1 − λ)2 kyk2

≤ λ2 kxk2 + 2λ(1 − λ) kxk2 + kyk2 + (1 − λ)2 kyk2 (4.3)
= λkxk2 + (1 − λ)kyk2. (4.4)
Qamrul Hasan Ansari Advanced Functional Analysis Page 101

Hence h is convex.

Now we show that the equality can not hold. Assume that there are x, y ∈ X, x 6= y with

kλ0 x + (1 − λ0 )yk2 = λ0 kxk2 + (1 − λ0 )kyk2 , for some λ0 ∈ (0, 1).

Then from (4.3), we obtain


2kxk kyk = kxk2 + kyk2.
Hence kxk = kyk = kλ0 x + (1 − λ0 )yk which is impossible.

Conversely, assume that the functional h(x) := kxk2 is strictly convex. Let x, y ∈ X be such
that x 6= y, kxk = kyk = 1 with kλx + (1 − λ)yk = 1 for some λ ∈ (0, 1). Then

kλx + (1 − λ)yk2 = 1 = λkxk2 + (1 − λ)kyk2

a contradiction that h is strictly convex.

Exercise 4.1.2. Let X be a normed space. Prove that X is strictly convex if and only if
for every 1 < p < ∞,

kλx + (1 − λ)ykp < λkxkp + (1 − λ)kykp, for all x, y ∈ X, x 6= y and λ ∈ (0, 1).

Proof. Suppose that X is strictly convex, and let x, y ∈ X with x 6= y. Then by strict
convexity of X and hence by strict convexity of k · k, we have

kλx + (1 − λ)yk < λkxk + (1 − λ)kyk, for all λ ∈ (0, 1).

Therefore, for every 1 < p < ∞, we have

kλx + (1 − λ)ykp < (λkxk + (1 − λ)kyk)p , for all λ ∈ (0, 1). (4.5)

If kxk = kyk, then

kλx + (1 − λ)ykp < kxkp = λkxkp + (1 − λ)kykp.

Assume that kxk = 6 kyk, and consider the function λ 7→ λp for 1 < p < ∞. Then it is a
convex function and
 p
a+b ap + bp
< , for all a, b ≥ 0 and a 6= b.
2 2

Hence from (4.5) with λ = 1/2, we have


p  p
x+y kxk + kyk 1
≤ < (kxkp + kykp) . (4.6)
2 2 2
Qamrul Hasan Ansari Advanced Functional Analysis Page 102

If λ ∈ (0, 1/2], then from (4.5), we have

p
x+y
kλx + (1 − λ)ykp = 2λ + (1 − 2λ)y (after adding and substracting λy)
2
 p
x+y
< 2λ + (1 − 2λ)kyk
2
p
x+y
< 2λ + (1 − 2λ)kykp
2
 
1
< 2λ kxk + kyk + (1 − 2λ)kykp
p p
2
< λkxkp + (1 − λ)kykp. (by (4.6))

The proof is similar if λ ∈ (1/2, 1).

The converse part is obvious.

Proposition 4.1.4. Let X be a normed space. Then X is strictly convex if and only
if for any two linearly independent elements x, y ∈ X, kx + yk < kxk + kyk. In other
words, X is strictly convex if and only if kx + yk = kxk + kyk for 0 6= x ∈ X and y ∈ X,
then there exists λ ≥ 0 such that y = λx.

Proof. Suppose that X is not strictly convex. Then there exist x and y in X such that
kxk = kyk = 1, x 6= y and kx + yk = 2. By hypothesis, for any two linearly independent
elements x, y ∈ X, kx + yk < kxk + kyk. Since kx + yk = kxk + kyk, x and y are linearly
dependent. Then, x = αy for some α ∈ R, and therefore, kxk = |α| kyk for some α ∈ R
which implies that |α| = 1 because kxk = kyk = 1. If α = 1, then x = y, contradicting that
x 6= y. So we have α = −1, and therefore,

2 = kx + yk = k − y + yk = 0.

This is a contradiction.

Conversely, suppose that X is a strictly convex space and there exist linearly independent
elements x and y in X such that kx + yk = kxk + kyk. Without loss of generality, we may
Qamrul Hasan Ansari Advanced Functional Analysis Page 103

assume that 0 < kxk ≤ kyk. Then, we have

x y
2 > +
kxk kyk
 
x y
because = = 1 and X is strictly convex
kxk kyk
1
= k(xkyk + ykxk)k
kxkkyk
1
= k(xkyk + ykyk − ykyk + ykxk)k
kxkkyk
1
= k[kyk(x + y) − (kyk − kxk)y]k
kxkkyk
1
≥ k[kykkx + yk − (kyk − kxk)kyk]k
kxkkyk
1
= k[kyk(kxk + kyk) − (kyk − kxk)kyk]k = 2
kxkkyk
(because kx + yk = kxk + kyk).

This is a contradiction.

We now present the existence and uniqueness of elements of minimal norm in convex subsets
of strictly convex normed spaces.

Proposition 4.1.5. Let X be a strictly convex normed space and C be a nonempty con-
vex subset of X. Then there is at most one point x ∈ C such that kxk = inf {kzk : z ∈ C}.

Proof. Assume that there exist two points x, y ∈ C, x 6= y such that

kxk = kyk = inf{kzk : z ∈ C} = d (say).

If λ ∈ (0, 1), then by the strict convexity of X, we have

kλx + (1 − λ)yk < λkxk + (1 − λ)kyk = λd + (1 − λ)d = d,

which is a contradiction, since λx + (1 − λ)y ∈ C by convexity of C.

Proposition 4.1.6. Let C be a nonempty closed convex subset of a reflexive strictly


convex Banach space X. Then there exists a unique point x ∈ C such that kxk =
inf {kzk : z ∈ C}.

Proof. Let d := inf {kzk : z ∈ C}. Then there exists a sequence {xn } in C such that
lim kxn k = d. Since X is reflexive, by Theorem 6.0.8, there exists a subsequence {xni }
n→∞
Qamrul Hasan Ansari Advanced Functional Analysis Page 104

of {xn } that converges weakly to an element x in C. The weak lower semicontinuity of the
norm gives
kxk ≤ lim kxn k = d.
n→∞

Therefore, d = kxk. The uniqueness of x follows from Proposition 4.1.5.


Definition 4.1.2. Let C be a nonempty subset of a normed space X and x ∈ X. The
distance from the point x to the set C is defined as

d(x, C) = inf{kx − yk : y ∈ C}.

Proposition 4.1.7. Let C be a nonempty closed convex subset of a reflexive strictly


convex Banach space X. Then for all x ∈ X, there exists a unique point zx ∈ C such
that kx − zx k = d(x, C).

Proof. Let x ∈ C. Since C is a nonempty closed convex subset the Banach space X,
D = C − x := {y − x : y ∈ C} is a nonempty closed convex subset of X. By Proposition
4.1.6, there exists a unique point ux ∈ D such that kux k = inf{ky −xk : y ∈ C}. For ux ∈ D,
there exists a point zx ∈ C such that ux = zx − x. Hence, there exists a unique point zx ∈ C
such that kzx − xk = d(x, C).

In order to measure the degree of strict convexity of X, we define its modulus of convexity.
Definition 4.1.3. Let X be a normed space. A function δX : [0, 2] → [0, 1] defined by
 
kx + yk
δX (ε) = inf 1 − : kxk ≤ 1, kyk ≤ 1, kx − yk ≥ ε
2
is called the modulus of convexity of X .

Roughly speaking, δX measures how deeply the midpoint of the linear segment joining points
in the sphere SX of X must lie within SX .

The notion of the modulus of convexity was introduced by Clarkson in 19361 . It allows us
to measure the convexity and rotundity of the unit ball of a normed space.
Remark 4.1.2. (a) It is easy to see that δX (0) = 0 and δX (ε) ≥ 0 for all ε ≥ 0.
(b) The function δ is increasing on [0, 2], that is, if ε1 ≤ ε2 , then δX (ε1 ) ≤ δX (ε2 ).

(c) The function δX is continuous on [0, 2), but not necessarily continuous at ε = 2.
(d) The modulus of convexity of an inner product space H is
r
ε2
δH (ε) = 1 − 1 − .
4
1
J.A. Clarkson: Uniform convex spaces, Trans. Amer. Math. Soc., 40 (1936), 396–414.
Qamrul Hasan Ansari Advanced Functional Analysis Page 105

(e) The modulus of convexity of ℓp (1 ≤ p < ∞) is


  p 1/p
ε2
δℓp (ε) = 1 − 1 − .
4

(f) δX (ε) ≤ δH (ε) for any normed space X and any inner product space H. That is, an
inner product space is the most convex normed space.
Remark 4.1.3. We note that for any ε > 0, the number δX (ε) is the largest number for
which the following implication always holds: For any x, y ∈ X,
x+y
kxk ≤ 1, kyk ≤ 1, kx − yk ≥ ε ⇒ ≤ 1 − δX (ε). (4.7)
2
Example 4.1.5. Let X = R2 be a normed space equipped with one of the following norms:
k(x1 , x2 )k = kx1 k + kx2 k or k(x1 , x2 )k = max {kx1 k, kx2 k} ,
for all (x1 , x2 ) ∈ X. Then, δX (ε) = 0 for all ε ∈ [0, 2].
Example 4.1.6. Let X = R2 be a normed space equipped with the following norm:
 
x2 x2
k(x1 , x2 )k = max kx2 k, x1 + √ , x1 − √ , for all (x1 , x2 ) ∈ X.
3 3
Then the unit sphere is a regular hexagon and
1
lim δX (ε) = δX (2) = .
ε→2 2

We now give some important properties of the modulus of convexity of normed spaces.
Theorem 4.1.1. A normed space X is strictly convex if and only if δX (2) = 1.

Proof. Let X be a strictly convex normed space with modulus of convexity δX (ε). Suppose
kxk = kyk = 1 and kx − yk = 2 with x 6= −y. By strict convexity of X, we have
x−y x + (−y)
1= = < 1,
2 2
a contradiction. Hence x = −y. Therefore, δX (2) = 1.

Conversely, suppose δX (2) = 1. Let x, y ∈ X such that kxk = kyk = k(x + y)/2k = 1, that
is, kx + yk = 2 or kx − (−y)k = 2. Then
x−y x + (−y)
= ≤ 1 − δX (2) = 0,
2 2
which implies that x = y. Thus, kxk = kyk and kx + yk = 2 = kxk + kyk imply that x = y.
Therefore, X is strictly convex.
Qamrul Hasan Ansari Advanced Functional Analysis Page 106

4.2 Uniform Convexity

The strict convexity of a normed space X says that the midpoint (x + y)/2 of the segment
joining two distinct points x, y ∈ SX with kx − yk ≥ ε > 0 does not lie on SX , that is,

x+y
< 1.
2

In such spaces, we have no information about 1 − k(x + y)/2k, the distance of midpoints
from the unit sphere SX . A stronger property than the strict convexity which provides
information about the distance 1 − k(x + y)/2k is uniform convexity.

Definition 4.2.1. A normed space X is said to be uniformly convex if for any ε, 0 < ε ≤ 2,
the inequalities kxk ≤ 1, kyk ≤ 1 and kx − yk ≥ ε imply that there exists a δ = δ(ε) > 0
such that k(x + y)/2k ≤ 1 − δ.

This says that if x and y are in the closed unit ball BX := {x ∈ X : kxk ≤ 1} with
kx − yk ≥ ε > 0, the midpoint of x and y lies inside the unit ball BX at a distance of at
least δ from the unit sphere SX .

Roughly speaking, if two points on the unit sphere of a uniformly convex space are far apart,
then their midpoint must be well within it.

The concept of uniform convexity was introduced by Clarkson2 .

Example 4.2.1. Every Hilbert space H is a uniformly convex space. In fact, the parallelo-
gram law gives us

kx + yk2 = 2(kxk2 + kyk2) − kx − yk2 , for all x, y ∈ H.

Suppose x, y ∈ BH with x 6= y and kx − yk ≥ ε. Then

kx + yk2 ≤ 4 − ε2 .

Therefore,
k(x + y)/2k ≤ 1 − δ(ε),
p
where δ(ε) = 1 − 1 − ε2 /4. Thus, H is uniformly convex.

Example 4.2.2. The spaces ℓ1 and ℓ∞ are not uniformly convex. To see it, take x =
(1, 0, 0, 0, . . .), y = (0, −1, 0, 0, . . .) ∈ ℓ1 and ε = 1. Then

kxk1 = 1, kyk1 = 1, kx − yk1 = 2 > 1 = ε.


2
J.A. Clarkson: Uniform convex spaces, Trans. Amer. Math. Soc., 40 (1936), 396–414.
Qamrul Hasan Ansari Advanced Functional Analysis Page 107

However, k(x + y)/2k1 = 1 and there is no δ > 0 such that k(x + y)/2k1 ≤ 1 − δ. Thus, ℓ1
is not uniformly convex.

Similarly, if we take x = (1, 1, 1, 0, 0, . . .), y = (1, 1, −1, 0, 0, . . .) ∈ ℓ∞ and ε = 1, then

kxk∞ = 1, kyk∞ = 1, kx − yk∞ = 2 > 1 = ε.

Since k(x + y)/2k∞ = 1, ℓ∞ is not uniformly convex.


Exercise 4.2.1. Fix µ > 0 and let C[0, 1] be the space with the norm k · kµ defined by
Z 1 1/2
2
kxkµ = kxk0 + µ x (t)dt ,
0

where k · k0 is the usual supremum norm. Then

kxk0 ≤ kxkµ ≤ (1 + µ)kxk0 , for all x ∈ C[0, 1],

and the two norms are equivalent with k · kµ near k · k0 for small µ. However (C[0, 1], k · k0 )
is not strictly convex while for any µ > 0, (C[0, 1], k · kµ ) is. On the other hand, it is easy
to see that for any ε ∈ (0, 2), there exist functions x, y, ∈ C[0, 1] with kxkµ = kykµ = 1,
kx − yk = ε and k(x + y)/2k arbitrary near 1. Thus, (C[0, 1], k · kµ ) is not uniformly convex.
Exercise 4.2.2. Show that the normed spaces ℓp , ℓnp (whenever n is a nonnegative integer),
and Lp [a, b] with 1 < p < ∞ are uniformly convex.
Exercise 4.2.3. Show that the normed spaces ℓa , c, ℓ∞ , L1 [a, b], C[a, b] and L∞ [a, b] are not
strictly convex.

Theorem 4.2.1. Every uniformly convex normed space is strictly convex.

Proof. It follows directly from Definition 4.2.1.


Remark 4.2.1. The converse of Theorem 4.2.1 is not true in general. Let β > 0 and
X = c0 the space of all sequences of scalars which converge to zero, that is, c0 = {x =
(x1 , x2 , . . . , xn , . . .) : {xi }∞
i=1 is convergent to zero} with the norm k · kβ defined by

∞ 
!1/2
X xi 2
kxkβ = kxkc0 + β , x = {xi } ∈ c0 .
i=1
i

The spaces (c0 , k · kβ ) for β > 0 are strictly convex, but not uniformly convex, while c0 with
its usual norm kxk∞ = sup |xi |, is not strictly convex.
i∈N

Remark 4.2.2. The strict convexity and uniform convexity are equivalent in finite dimen-
sional spaces.
Qamrul Hasan Ansari Advanced Functional Analysis Page 108

Theorem 4.2.2. Let X be a normed space. Then X is uniformly convex if and only if
for two sequences {xn } and {yn } in X,

kxn k ≤ 1, kyn k ≤ 1 and lim kxn + yn k = 2 ⇒ lim kxn − yn k = 0. (4.8)


n→∞ n→∞

Proof. Let X be uniformly convex. Assume that {xn } and {yn } are two sequences in X
such that kxn k ≤ 1, kyn k ≤ 1 for all n ∈ N and lim kxn + yn k = 2. Suppose contrary that
n→∞
lim kxn − yn k =
6 0. Then for some ε > 0, there exists a subsequence {ni } of {n} such that
n→∞

kxni − yni k ≥ ε.

Since X is uniformly convex, there exists δ(ε) > 0 such that

kxni + yni k ≤ 2(1 − δ(ε)). (4.9)

Since lim kxn + yn k = 2, it follows from (4.9) that


n→∞

2 ≤ 2(1 − δ(ε)),

a contradiction.

Conversely, assume that the condition (4.8) is satisfied. If X is not uniformly convex, then
for ε > 0, there is no δ(ε) such that

kxk ≤ 1, kyk ≤ 1, kx − yk ≥ ε ⇒ kx + yk ≤ 2(1 − δ(ε)),

and we can find sequences {xn } and {yn } in X such that

(i) kxn k ≤ 1, kyn k ≤ 1,


(ii) kxn + yn k ≥ 2(1 − 1/n),
(iii) kxn − yn k ≥ ε.

Clearly kxn − yn k ≥ ε which contradicts the hypothesis, since (ii) gives lim kxn + yn k = 2.
n→∞
Thus, X must be uniformly convex.

Theorem 4.2.3. A normed space X is uniformly convex if and only if δX (ε) > 0 for all
ε ∈ (0, 2].

Proof. Let X be a uniformly convex normed space. Then for ε > 0, there exists δ(ε) > 0
such that x+y
2
≤ 1 − δ(ε), that is,

x+y
0 < δ(ε) ≤ 1 −
2
Qamrul Hasan Ansari Advanced Functional Analysis Page 109

for all x, y ∈ X with kxk ≤ 1, kyk ≤ 1 and kx − yk ≥ ε. Therefore, from the definition of
modulus of convexity, we have δX (ε) > 0.

Conversely, suppose that X is a normed space with modulus of convexity δX such that
δX (ε) > 0 for all ε ∈ (0, 2]. Let x, y ∈ X such that kxk = 1, kyk = 1 with kx − yk ≥ ε for
fixed ε ∈ (0, 2]. By the definition of modulus of convexity δX (ε), we have

x+y
0 < δX (ε) ≤ 1 − .
2

It follows that
x+y
≤ 1 − δX (ε),
2
which is independent of x and y. Therefore, X is uniformly convex.

Theorem 4.2.4. Let {xn } be a sequence in an uniformly convex Banach space X. Then,

xn ⇀ x, kxn k → kxk ⇒ xn → x.

Proof. If x = 0, then it is obvious that xn → 0. So, let x 6= 0. Put yn = kxxnn k for n large
x
enough, and y = kxk . By construction, kyn k = kyk = 1, yn ⇀ y, and thus yn + y ⇀ 2y.
Suppose that xn 6→ x. Then, yn 6→ y. This implies that there exist ε > 0 and a subsequence
{ynk } of {yn } such that kynk − yk ≥ ε. Since X is uniformly convex, there exists δX (ε) > 0
such that
y nk + y
≤ 1 − δX (ε).
2
Since ynk ⇀ y without loss of generality, we have

y nk + y
kyk ≤ lim inf ≤ 1 − δX (ε),
k→∞ 2

which contradicts kyk = 1. Therefore, xn → x.

For the class of uniform convex Banach spaces, we have the following important results.

Theorem 4.2.5. Every uniformly convex Banach space is reflexive.

Proof. Let X be a uniformly convex Banach space. Let SX ∗ := {j ∈ X ∗ : kjk∗ = 1} be the


unit sphere in X ∗ and f ∈ SX ∗ . Suppose that {xn } is a sequence in SX such that f (xn ) → 1.
We show that {xn } is a Cauchy sequence. Assume contrary that there exist ε > 0 and two
subsequences {xni } and {xnj } of {xn } such that kxni − xnj k ≥ ε. The uniform convexity of
X guarantees that there exists δX (ε) > 0 such that k(xni + xnj )/2k < 1 − δX (ε). Observe
that
|f ((xni + xnj )/2)| ≤ kf k∗ k(xni + xnj )/2k < kf k∗ (1 − δX (ε)) = 1 − δX (ε)
Qamrul Hasan Ansari Advanced Functional Analysis Page 110

and f (xn ) → 1, yield a contradiction. Hence {xn } is a Cauchy sequence and there exists a
point x in X such that xn → x. Clearly x ∈ SX . In fact,
kxk = k lim xn k = lim kxn k = 1.
n→∞ n→∞

Using James Theorem 6.0.3 (which states that a Banach space is reflexive if and only if for
each f ∈ SX ∗ , there exists x ∈ SX such that f (x) = 1), we conclude that X is reflexive.
Remark 4.2.3. Every finite-dimensional Banach space is reflexive, but it need not be uni-
Xn
n
formly convex. For example, X = R , n ≥ 2 with the norm kxk1 = |xi | is not uniformly
i=1
convex. However, it is finite dimensional space.

Combining Proposition 4.1.6 and Theorems 4.2.1 and 4.2.5, we obtain the following inter-
esting result.
Theorem 4.2.6. Let C be a nonempty closed convex subset of a uniformly convex Ba-
nach space X. Then C has a unique element of minimum norm, that is, there exists a
unique element x ∈ C such that kxk = inf {kzk : z ∈ C}.

Theorem 4.2.7 (Intersection Theorem). Let {Cn }∞ n=1 be a decreasing sequence of nonempty
bounded closed convex subsets of a uniformly convex Banach space X. Then, the inter-

\
section Cn is a nonempty closed convex subset of X.
n=1

Proof. Let x be a point in X which does not belong to C1 , rn = d(x, Cn ) and r = lim rn .
n→∞
Also, let {qn } be a sequence of positive numbers that decreases to zero, Dn = {y ∈ Cn :
kx − yk ≤ r + qn }, and dn the diameter of Dn . If y and z belong to Dn and ky − zk ≥ dn − qn ,
then   
y+z ky − zk
x− ≤ 1−δ (r + qn ),
2 r + qn
and   
dn − qn
rn ≤ 1 − δ (r + qn ).
r + qn
Let lim dn = d, then we obtain a contradiction unless d = 0. This in turn implies that
n→∞
\∞ \∞
Dn 6= ∅, and so is Cn 6= ∅.
n=1 n=1

Remark 4.2.4. Theorem 4.2.7 remains valid if the sequence {Cn }∞ n=1 is replaced by an
arbitrary decreasing net of nonempty bounded closed convex sets. However, Theorem 4.2.7
does not hold in arbitrary Banach spaces. For example, consider the space X = C[0, 1] and
Cn = {x ∈ C[0, 1] : 0 ≤ x(t) ≤ tn for all 0 ≤ t ≤ 1 and x(1) = 1}.
Qamrul Hasan Ansari Advanced Functional Analysis Page 111

4.3 Duality Mapping and Its Properties

Before defining the duality mapping and giving its fundamental properties, we mention the
following notations and definitions:

Let T : X ⇒ X ∗ be a set-valued mapping. The domain Dom(T ), range R(T ), inverse T −1 ,


and graph G(T ) are defined as

Dom(T ) = {x ∈ X : T (x) 6= ∅},


[
R(T ) = T (x),
x∈Dom(T )
−1
T (y) = {x ∈ X : y ∈ T (x)},
G(T ) = {(x, y) ∈ X × X ∗ : y ∈ T (x), x ∈ Dom(T )}.

The graph G(T ) of T is a subset of X × X ∗ .

The mapping T is said to be injective if T (x) ∩ T (y) = ∅ for all x 6= y.


Definition 4.3.1. Let X ∗ be the dual of a normed space X. A set-valued mapping J : X ⇒
X ∗ is said to be normalized duality if

J(x) = j ∈ X ∗ : hx, ji = kxk2 = kjk2∗ ,

equivalently,
J(x) = {j ∈ X ∗ : hx, ji = kxk kjk and kxk = kjk} .
Example 4.3.1. In a real Hilbert space H, the normalized duality mapping is the identity
mapping. Indeed, let x ∈ H with x 6= 0. Since H = H ∗ and hx, xi = kxk · kxk, we have
x ∈ J(x). Assume that y ∈ J(x). By the definition of J, we have hx, yi = kxkkyk and
kxk = kyk. Since
kx − yk2 = kxk2 + kyk2 − 2hx, yi,
it follows that x = y. Therefore, J(x) = {x}.

The following theorem presents some fundamental properties of duality mappings in Banach
spaces.

Proposition 4.3.1. Let X be a Banach space and J : X ⇒ X ∗ be a normalized duality


mapping. Then the following assertions hold:

(a) J(0) = {0}.

(b) For each x ∈ X, J(x) is nonempty closed convex and bounded subset of X ∗ .

(c) J(λx) = λJ(x) for all x ∈ X and real λ, that is, J is homogeneous.
Qamrul Hasan Ansari Advanced Functional Analysis Page 112

(d) J is a monotone set-valued map, that is, hx − y, jx − jy i ≥ 0, for all x, y ∈ X,


jx ∈ J(x) and jy ∈ J(y).

(e) kxk2 − kyk2 ≥ 2hx − y, ji, for all x, y ∈ X and j ∈ J(y).

(f) If X ∗ is strictly convex, then J is single-valued.

(g) If X is strictly convex, then J is injective, that is, x 6= y ⇒ J(x) ∩ J(y) = ∅.

(h) If X is reflexive with strictly convex dual X ∗ , then J is demicontinuous, that is, if
xn → x in X implies J(xn ) ⇀ J(x).

Proof. (a) It is obvious.

(b) Let x ∈ X. If x = 0, then it is done by Part (a). So, we assume that x 6= 0. Then,
by the Hahn-Banach Theorem, there exists f ∈ X ∗ such that hx, f i = kxk and kf k∗ = 1.
Set j := kxkf . Then hx, ji = kxkhx, f i = kxk2 and kjk∗ = kxk, and it follows that J(x) is
nonempty for each x 6= 0. So, we can assume that f1 , f2 ∈ J(x). Then, we have

hx, f1 i = kxkkf1 k∗ , kxk = kf1 k∗

and
hx, f2 i = kxkkf2 k∗ , kxk = kf2 k∗ ,
and therefore, for t ∈ (0, 1), we have

hx, tf1 + (1 − t)f2 i = kxk (tkf1 k∗ + (1 − t)kf2 k∗ ) = kxk2 .

Since

kxk2 = hx, tf1 + (1 − t)f2 i ≤ ktf1 + (1 − t)f2 k∗ kxk


≤ (tkf1 k∗ + (1 − t)kf2 k∗ ) kxk
= kxk2 ,

we have
kxk2 ≤ kxkktf1 + (1 − t)f2 k∗ ≤ kxk2 ,
which gives us
kxk2 = kxkktf1 + (1 − t)f2 k∗ ,
that is,
ktf1 + (1 − t)f2 k∗ = kxk.
Therefore,

hx, tf1 + (1 − t)f2 i = kxk ktf1 + (1 − t)f2 k∗ and kxk = ktf1 + (1 − t)f2 k∗ ,

and thus, tf1 + (1 − t)f2 ∈ J(x) for all t ∈ (0, 1), that is, J(x) is a convex set.

Similarly, we can show that J(x) is a closed and bounded set in X ∗ .


Qamrul Hasan Ansari Advanced Functional Analysis Page 113

(c) For λ = 0, it is obvious that J(0x) = 0J(x). Assume that j ∈ J(λx) for λ 6= 0. We first
show that J(λx) ⊆ λJ(x). Since j ∈ J(λx), we have

hλx, ji = kλxkkjk∗ and kλxk = kjk∗ ,

and thus, hλx, ji = kjk2∗ . Hence

hx, λ−1 ji = λ−1 hλx, λ−1 ji = λ−2 hλx, ji


= λ−2 kλxkkjk∗ = λ−1 kjk∗ kjk∗
= kλ−1 jk2∗ = kxk2 .

This shows that λ−1 j ∈ J(x), that is, j ∈ λJ(x). Therefore, J(λx) ⊆ λJ(x). Similarly, we
can show that λJ(x) ⊆ J(λx). Thus, J(λx) = λJ(x).

(d) Let jx ∈ J(x) and jy ∈ J(y) for x, y ∈ X. Then, we have

hx − y, jx − jy i = hx, jx i − hx, jy i − hy, jx i + hy, jy i


≥ kxk2 + kyk2 − kxkkjy k∗ − kykkjxk∗
≥ kxk2 + kyk2 − 2kxkkyk
= (kxk − kyk)2 ≥ 0. (4.10)

(e) Let j ∈ J(x), x, y ∈ X. Then, we have

kxk2 kyk2 − 2hx − y, ji = kxk2 − kyk2 − 2hx, ji + 2hy, ji


= kxk2 − kyk2 − 2hx, ji + 2kyk2
= kxk2 + kyk2 − 2hx, ji
≥ kxk2 + kyk2 − 2kxk kyk = (kxk − kyk)2 ≥ 0.

(f) Let j1 , j2 ∈ J(x) for x ∈ X. Then, we have

hx, j1 i = kj1 k2∗ = kxk2

and
hx, j2 i = kj2 k2∗ = kxk2 .
Adding the above identities, we obtain

hx, j1 + j2 i = 2kxk2 .

Since 2kxk2 = hx, j1 + j2 i ≤ kxkkj1 + j2 k∗ , we have

kj1 k∗ + kj2 k∗ = 2kxk ≤ kj1 + j2 k∗ .

It follows from the fact kj1 + j2 k∗ ≤ kj1 k∗ + kj2 k∗ that

kj1 + j2 k∗ = kj1 k∗ + kj2 k∗ .


Qamrul Hasan Ansari Advanced Functional Analysis Page 114

Since X ∗ is strictly convex and kj1 + j2 k∗ = kj1 k∗ + kj2 k∗ , there exists λ ∈ R such that
j1 = λj2 . Since
hx, j2 i = hx, j1 i = hx, λj2 i = λhx, j2 i,
this implies that λ = 1, and hence, j1 = j2 . Therefore, J is single-valued.

(g) Suppose that j ∈ J(x) ∩ J(y) for x, y ∈ X. Since j ∈ J(x) and j ∈ J(y), it follows from
kjk2∗ = kxk2 = kyk2 = hx, ji = hy, ji that

kxk2 = h(x + y)/2, ji ≤ k(x + y)/2kkxk,

which gives that


kxk = kyk ≤ k(x + y)/2k ≤ kxk.
Hence kxk = kyk = k(x + y)/2k. Since X is strictly convex and kxk = kyk = k(x + y)/2k,
we have x = y. Therefore, J is one-one.

(h) It is sufficient to prove the demicontinuity of J on the unit sphere SX . For this, let {xn }
be a sequence in SX such that xn → z in X. Then kJ(xn )k∗ = kxn k = 1 for all n ∈ N,
that is, {J(xn )} is bounded. Since X is reflexive, so is X ∗ . Then, there exists a subsequence
{J(xnk )} of {J(xn )} in X ∗ such that {J(xnk )} converges weakly to some j in X ∗ . Since
xnk → z and J(xnk ) ⇀ j, we have

hz, ji = lim_{k→∞} hxnk , J(xnk )i = lim_{k→∞} kxnk k2 = 1.

Moreover, since the norm of X ∗ is lower semicontinuous with respect to the weak topology
and kJ(xnk )k∗ = kxnk k = 1 for all k, we have

kjk∗ ≤ lim inf_{k→∞} kJ(xnk )k∗ = 1 = hz, ji ≤ kzkkjk∗ = kjk∗ ,

that is, hz, ji = kjk∗ kzk and kjk∗ = kzk = 1. This implies that j = J(z) (recall from part (f)
that J is single-valued, since X ∗ is strictly convex). Thus, every weakly convergent subsequence
of {J(xn )} has the same weak limit J(z), and since {J(xn )} is bounded and X ∗ is reflexive,
it follows that J(xn ) ⇀ J(z). Therefore, J is demicontinuous.
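To make the preceding properties concrete, the following Python sketch (an illustration added here, not part of the formal development) checks them numerically in the finite-dimensional space (R4, k · kp ) with p = 3. It uses the known representation J(x) = kxkp^{2−p} (|xi|p−1 sign(xi))i of the normalized duality mapping on ℓp (1 < p < ∞), which is stated here without proof; all names in the code are illustrative.

import numpy as np

def J_p(x, p):
    # Assumed explicit representation of the normalized duality mapping on l^p (1 < p < infinity):
    # J(x) = ||x||_p^(2-p) * (|x_i|^(p-1) * sign(x_i))_i
    nrm = np.linalg.norm(x, p)
    if nrm == 0:
        return np.zeros_like(x)
    return nrm**(2 - p) * np.abs(x)**(p - 1) * np.sign(x)

p, q = 3.0, 1.5                      # conjugate exponents: 1/p + 1/q = 1
rng = np.random.default_rng(0)
x, y = rng.normal(size=4), rng.normal(size=4)
jx, jy = J_p(x, p), J_p(y, p)

# defining identities: <x, J(x)> = ||x||_p^2 and ||J(x)||_q = ||x||_p
print(np.isclose(x @ jx, np.linalg.norm(x, p)**2))
print(np.isclose(np.linalg.norm(jx, q), np.linalg.norm(x, p)))
# property (c): J(lambda x) = lambda J(x), and property (d): monotonicity
print(np.allclose(J_p(-2.5 * x, p), -2.5 * jx))
print((x - y) @ (jx - jy) >= 0)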

The following inequalities are very useful in many applications.

Corollary 4.3.1. Let X be a Banach space and J : X ⇒ X ∗ be the duality mapping.


Then the following statements hold:

(a) kx + yk2 ≥ kxk2 + 2hy, jx i, for all x, y ∈ X, where jx ∈ J(x).

(b) kx + yk2 ≤ kyk2 + 2hx, jx+y i, for all x, y ∈ X, where jx+y ∈ J(x + y).

Proof. (a) Replacing x by x + y and y by x in the inequality of part (e) above, we get the
inequality.

(b) Replacing x by y and y by x + y in the same inequality, we get the result.

Proposition 4.3.2. Let X be a Banach space and J : X ⇒ X ∗ be a normalized duality


mapping. For each x, y ∈ X, the following statements are equivalent:

(a) kxk ≤ kx + tyk, for all t > 0.

(b) There exists j ∈ J(x) such that hy, ji ≥ 0.

Proof. (a) ⇒ (b). For t > 0, let ft ∈ J(x + ty). Then hx + ty, ft i = kx + tyk kft k∗ and
kft k∗ = kx + tyk. Define gt := ft /kft k∗ . Then kgt k∗ = 1, and we have

kxk ≤ kx + tyk = hx + ty, ft i/kft k∗ = hx + ty, gt i = hx, gt i + thy, gt i
≤ kxk + thy, gt i. (since kgt k∗ = 1)

By the Banach-Alaoglu Theorem 6.0.4 (which states that the closed unit ball in X ∗ is weak*-
compact), the net {gt }t>0 has a weak* cluster point g ∈ X ∗ as t → 0+ such that

kgk∗ ≤ 1, hx, gi ≥ kxk and hy, gi ≥ 0.

Observe that
kxk ≤ hx, gi ≤ kxkkgk∗ = kxk,
which gives that
hx, gi = kxk and kgk∗ = 1.
Set j = gkxk, then j ∈ J(x) and hy, ji ≥ 0.

(b) ⇒ (a). Assume that for x, y ∈ X with x 6= 0, there exists j ∈ J(x) such that hy, ji ≥ 0.
Then for t > 0,

kxk2 = hx, ji ≤ hx, ji + hty, ji


= hx + ty, ji ≤ kx + tykkxk,

which implies that


kxk ≤ kx + tyk.
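In a real Hilbert space the normalized duality mapping reduces to the identity under the Riesz identification, so Proposition 4.3.2 says that kxk ≤ kx + tyk for all t > 0 if and only if hy, xi ≥ 0. The following minimal numerical sketch of this equivalence in the Euclidean plane is purely illustrative; the particular vectors are chosen for the example.

import numpy as np

x = np.array([1.0, 0.0])
y_ok  = np.array([0.3, 2.0])    # <y_ok, x> = 0.3 >= 0
y_bad = np.array([-0.3, 0.1])   # <y_bad, x> = -0.3 < 0

ts = np.linspace(1e-4, 5.0, 2000)
norms_ok  = np.array([np.linalg.norm(x + t * y_ok)  for t in ts])
norms_bad = np.array([np.linalg.norm(x + t * y_bad) for t in ts])

# <y, J(x)> >= 0  corresponds to  ||x|| <= ||x + t y|| for all t > 0
print(np.all(norms_ok >= np.linalg.norm(x) - 1e-12))   # True
print(np.all(norms_bad >= np.linalg.norm(x)))           # False: the norm first decreases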

Proposition 4.3.3. Let X be a Banach space and ϕ : X → R be a function defined by


ϕ(x) = kxk2 /2. Then the subdifferential ∂ϕ of ϕ coincides with the normalized duality
mapping J : X ⇒ X ∗ defined by

J(x) = {j ∈ X ∗ : hx, ji = kxkkjk∗ , kjk∗ = kxk} , for x ∈ X.

Proof. We first show that J(x) ⊆ ∂ (kxk2 /2). Let x 6= 0 and j ∈ J(x). Then for y ∈ X, we
have

kyk2 /2 − kxk2 /2 − hy − x, ji = kyk2 /2 − kxk2 /2 − hy, ji + hx, ji
≥ kyk2 /2 − kxk2 /2 − kyk kjk∗ + kxk kjk∗
(because hy, ji ≤ kyk kjk∗ and hx, ji = kxk kjk∗ )
= kyk2 /2 − kxk2 /2 − kyk kxk + kxk2 (because kjk∗ = kxk)
= kxk2 /2 + kyk2 /2 − kxkkyk
= (kxk − kyk)2 /2 ≥ 0.

It follows that

kxk2 /2 − kyk2 /2 ≤ hx − y, ji, for all y ∈ X.

Hence j ∈ ∂ (kxk2 /2). Thus, J(x) ⊆ ∂ (kxk2 /2) for all x 6= 0.
 
We now prove ∂ (kxk2 /2) ⊆ J(x) for all x 6= 0. Suppose j ∈ ∂ (kxk2 /2) for 0 6= x ∈ X. Then,

kxk2 /2 − kyk2 /2 ≤ hx − y, ji, for all y ∈ X. (4.11)
2 2
Observe that

kxkkjk∗ = sup {hy, jikxk : kyk = 1} (since j is a continuous linear functional)
= sup {hy, ji : kyk = kxk}
≤ sup {hx, ji + kyk2 /2 − kxk2 /2 : kyk = kxk} (by using (4.11))
= hx, ji ≤ kxkkjk∗ .

Thus,

hx, ji = kxkkjk∗ . (4.12)



To see j ∈ J(x), we show that kjk∗ = kxk. For t > 1, we take y = tx ∈ X in (4.11); then
we obtain

kxk2 /2 − t2 kxk2 /2 ≤ hx − tx, ji,

that is,

(1 − t2 )kxk2 /2 ≤ (1 − t)hx, ji,

which implies (dividing by 1 − t < 0) that

hx, ji ≤ (t + 1)kxk2 /2.
Letting t → 1, we get

hx, ji ≤ kxk2 . (4.13)

Further, for t > 0, we take y = (1 − t)x ∈ X in (4.11); then we obtain

kxk2 /2 − k(1 − t)xk2 /2 ≤ hx − (1 − t)x, ji,

that is,

(1 − (1 − t)2 ) kxk2 /2 ≤ thx, ji.

It follows that

(2 − t) kxk2 /2 ≤ hx, ji.
Letting t → 0, we get

kxk2 ≤ hx, ji. (4.14)

From (4.12), (4.13) and (4.14), we obtain kjk∗ = kxk. Thus, ∂ (kxk2 /2) ⊆ J(x). Therefore,
J(x) = ∂ (kxk2 /2) for all x 6= 0. For x = 0, both sets are equal to {0}, and the proof is
complete.
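Proposition 4.3.3 can be checked numerically in (Rn , k · kp ): the gradient of ϕ(x) = kxk2p /2, computed by finite differences, should agree with the explicit ℓp representation of J mentioned earlier (again assumed without proof). The sketch below is an illustration only; the helper names are not from the notes.

import numpy as np

def J_p(x, p):
    # assumed l^p representation of the normalized duality mapping (x != 0)
    nrm = np.linalg.norm(x, p)
    return nrm**(2 - p) * np.abs(x)**(p - 1) * np.sign(x)

def num_grad(f, x, h=1e-6):
    # central finite-difference approximation of the gradient
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

p = 3.0
phi = lambda x: 0.5 * np.linalg.norm(x, p)**2
x = np.array([0.7, -1.2, 0.4])
print(np.allclose(num_grad(phi, x), J_p(x, p), atol=1e-5))   # True: grad(phi) = J(x)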

4.4 Smooth Banach Spaces and Modulus of Smoothness

Let C be a nonempty closed convex subset of a normed space X such that the origin belongs
to the interior of C. A linear functional j ∈ X ∗ is said to be a tangent to C at the point
x0 ∈ ∂C if j(x0 ) = sup{j(x) : x ∈ C}, where ∂C denotes the boundary of C. If H = {x ∈
X : j(x) = 0} is the hyperplane, then the set H + x0 is called a tangent hyperplane to C at
x0 .
Definition 4.4.1. A Banach space X is said to be smooth if for each x ∈ SX , there exists
a unique functional jx ∈ X ∗ such that hx, jx i = kxk and kjx k = 1.

In other words, X is smooth if for all x ∈ SX , there exists jx ∈ SX ∗ such that hx, jx i = 1.

Geometrically, the smoothness condition means that at each point x of the unit sphere, there
is exactly one supporting hyperplane {jx = 1} := {y ∈ X : hy, jx i = 1}. This means that
the hyperplane {jx = 1} is tangent at x to the unit ball and this unit ball is contained in
the half space {jx ≤ 1} := {y ∈ X : hy, jx i ≤ 1}.
Example 4.4.1. ℓp , Lp (1 < p < ∞) are smooth Banach spaces. However, c0 , ℓ1 , L1 , ℓ∞ ,
L∞ are not smooth.
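The failure of smoothness for the ℓ1 -norm can already be seen in R2 : at e1 = (1, 0) every functional j = (1, t) with |t| ≤ 1 supports the unit ball, so J(e1 ) is not a singleton. The following small numerical confirmation is an illustrative sketch only.

import numpy as np

# In X = (R^2, ||.||_1) the dual norm is ||.||_inf.  At e1 = (1, 0) every functional
# j = (1, t) with |t| <= 1 satisfies <e1, j> = 1 = ||e1||_1 * ||j||_inf and
# ||j||_inf = ||e1||_1, so J(e1) is not a singleton and the l^1-norm is not smooth at e1.
e1 = np.array([1.0, 0.0])
for t in [-1.0, -0.5, 0.0, 0.5, 1.0]:
    j = np.array([1.0, t])
    print(t, np.isclose(e1 @ j, 1.0), np.isclose(np.linalg.norm(j, np.inf), 1.0))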

Theorem 4.4.1. Let X be a Banach space. Then the following assertions hold.

(a) If X ∗ is strictly convex, then X is smooth.

(b) If X ∗ is smooth, then X is strictly convex.

Proof. (a) Assume that X is not smooth. Then there exist x0 ∈ SX and j1 , j2 ∈ SX ∗
with j1 6= j2 such that hx0 , j1 i = hx0 , j2 i = 1. Since kj1 + j2 k ≤ kj1 k + kj2 k = 2, and
hx0 , j1 + j2 i = hx0 , j1 i + hx0 , j2 i = 2, we have (j1 + j2 )/2 ∈ SX ∗ . Hence X ∗ is not strictly
convex.

(b) Suppose that X is not strictly convex. Then there exist x, y ∈ SX with x 6= y such that
kx + yk = 2. Take j ∈ SX ∗ with h(x + y)/2, ji = 1. Then, we have

1 = h(x + y)/2, ji = hx, ji/2 + hy, ji/2 ≤ 1/2 + 1/2 = 1,

and hence, hx, ji = hy, ji = kjk∗ = 1. Since x, y ∈ X ⊆ X ∗∗ , we have x, y ∈ J(j), where J
here denotes the duality mapping of X ∗ . Since x 6= y, the space X ∗ is not smooth.

It is well known that for a reflexive Banach space X, the spaces X and X ∗ can be
equivalently renormed as strictly convex spaces such that the duality is preserved. By using
this fact, we have the following result.

Theorem 4.4.2. Let X be a reflexive Banach space. Then the following assertions hold.

(a) X is smooth if and only if X ∗ is strictly convex.

(b) X is strictly convex if and only if X ∗ is smooth.

We now establish a relation between smoothness and Gâteaux differentiability of a norm.

Theorem 4.4.3. A Banach space X is smooth if and only if the norm is Gâteaux
differentiable on X\{0}.

Proof. Since a proper convex continuous functional is Gâteaux differentiable at a point if
and only if it has a unique subgradient there, we have, for x 6= 0,

the norm is Gâteaux differentiable at x
⇔ ∂kxk = {j ∈ X ∗ : hx, ji = kxk, kjk∗ = 1} is a singleton
⇔ there exists a unique j ∈ X ∗ such that hx, ji = kxk and kjk∗ = 1
⇔ X is smooth.

Corollary 4.4.1. Let X be a Banach space and J : X ⇒ X ∗ be a duality mapping.


Then the following statements are equivalent:

(a) X is smooth.

(b) J is single-valued.

(c) The norm of X is Gâteaux differentiable on X\{0} with ∇kxk = kxk−1 J(x) for x 6= 0.

We now study the continuity property of duality mappings.

Theorem 4.4.4. Let X be a smooth Banach space and J : X → X ∗ be a single-valued


duality mapping. Then J is norm to weak*-continuous.

Proof. We show that xn → x implies J(xn ) → J(x) in the weak* topology. Let xn → x and
set fn := J(xn ). Then

hxn , fn i = kxn kkfn k∗ and kxn k = kfn k∗ .



Since {xn } is bounded, {fn } is bounded in X ∗ . Then there exists a subsequence {fnk } of
{fn } such that fnk → f ∈ X ∗ in the weak* topology. Then we show that f = J(x). Since
the norm of X ∗ is lower semicontinuous in weak* topology, we have

kf k∗ ≤ lim inf_{k→∞} kfnk k∗ = lim inf_{k→∞} kxnk k = kxk.

Since hx, f − fnk i → 0 and hx − xnk , fnk i → 0, it follows from the fact

|hx, f i − kxnk k2 | = |hx, f i − hxnk , fnk i|


≤ |hx, f − fnk i| + |hx − xnk , fnk i| → 0

that
hx, f i = kxk2 .
As a result,
kxk2 = hx, f i ≤ kf k∗ kxk ≤ kxk2 ,
so that hx, f i = kxk2 and kxk = kf k∗ , and therefore, f = J(x). Thus, every weak*-convergent
subsequence of {fn } has the same limit J(x), and since {fn } is bounded, it follows that
fn → J(x) in the weak* topology.

4.5 Metric Projection on Normed Spaces

Let C be a nonempty subset of a normed space X and x ∈ X. An element y0 ∈ C is said to


be a best approximation to x if
kx − y0 k = d(x, C),
where d(x, C) = inf{kx − yk : y ∈ C}. The number d(x, C) is called the distance from x to C.

The (possibly empty) set of all best approximations from x to C is denoted by


PC (x) = {y ∈ C : kx − yk = d(x, C)}.

This defines a mapping PC from X into 2C and it is called the metric projection onto C. The
metric projection mapping is also known as the nearest point projection mapping, proximity
mapping or best approximation operator.

The set C is said to be a proximinal (respectively, Chebyshev) set if each x ∈ X has at least
(respectively, exactly) one best approximation in C.
Remark 4.5.1. (a) C is proximinal if PC (x) 6= ∅ for all x ∈ X.
(b) C is Chebyshev if PC (x) is singleton for each x ∈ X.
(c) The set of best approximations is convex if C is convex.
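For a concrete illustration of the metric projection, consider the closed Euclidean ball B(c, r) in R2 , for which PC has the standard closed form PC (x) = c + r(x − c)/kx − ck whenever kx − ck > r. The sketch below is illustrative only; the helper project_ball is a hypothetical name introduced here, and the sampling check at the end merely confirms that no sampled point of the ball is closer than the projection.

import numpy as np

def project_ball(x, c, r):
    # metric projection onto the closed Euclidean ball B(c, r) (standard closed form)
    d = x - c
    nd = np.linalg.norm(d)
    return x if nd <= r else c + r * d / nd

c, r = np.zeros(2), 1.0
x = np.array([3.0, 4.0])
px = project_ball(x, c, r)
print(px, np.linalg.norm(x - px))        # (0.6, 0.8), distance 4.0

# sanity check: no sampled point of the ball is closer than the projection
rng = np.random.default_rng(1)
samples = rng.uniform(-1, 1, size=(20000, 2))
samples = samples[np.linalg.norm(samples, axis=1) <= r]
print(np.min(np.linalg.norm(samples - x, axis=1)) >= np.linalg.norm(x - px) - 1e-12)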

Proposition 4.5.1. If C is a proximinal subset of a Banach space X, then C is closed.

Proof. Suppose, on the contrary, that C is not closed. Then there exist a sequence {xn } in
C and a point x ∈ X \ C such that xn → x. It follows that
d(x, C) ≤ kxn − xk → 0,
so that d(x, C) = 0. Since x ∈
/ C, we have
kx − yk > 0, for all y ∈ C,
and hence no y ∈ C attains the infimum d(x, C) = 0. This implies that PC (x) = ∅, which
contradicts the proximinality of C.

Theorem 4.5.1 (The Existence of Best Approximations). Let C be a nonempty weakly


compact convex subset of a Banach space X and x ∈ X. Then x has a best approximation
in C, that is, PC (x) 6= ∅.

Proof. Define the function f : C → R+ by
f (y) = kx − yk, for all y ∈ C.
Then f is continuous and convex, and hence it is a proper lower semicontinuous convex
functional on C. Since C is weakly compact, by Theorem 6.0.10, there exists y0 ∈ C such
that kx − y0 k = inf{kx − yk : y ∈ C}.

Corollary 4.5.1. Let C be a nonempty closed convex subset of a reflexive Banach space
X. Then each element x ∈ X has a best approximation in C.

Theorem 4.5.2 (The Uniqueness of Best Approximations). Let C be a nonempty convex


subset of a strictly convex Banach space X. Then for each x ∈ X, C has at most one
best approximation.

Proof. Assume, on the contrary, that y1 , y2 ∈ C with y1 6= y2 are best approximations to
x ∈ X. Since the set of best approximations is convex, (y1 + y2 )/2 is also a best approximation
to x. Set r := d(x, C). If r = 0, then x = y1 = y2 , a contradiction. So let r > 0. Then
r = kx − y1 k = kx − y2 k = kx − (y1 + y2 )/2k,
and so,
k(x − y1 ) + (x − y2 )k = 2r = kx − y1 k + kx − y2 k.
By the strict convexity of X, we have

x − y1 = t(x − y2 ), for some t > 0.

Taking the norm in this relation, we obtain r = tr, that is, t = 1, which gives us y1 = y2 , a
contradiction.

The following example shows that the strict convexity cannot be dropped in Theorem 4.5.2.

Example 4.5.1. Let X = R2 with norm kxk1 = |x1 | + |x2 | for all x = (x1 , x2 ) ∈ R2 . As we
have seen, X is not strictly convex. Let

C = {(x1 , x2 ) ∈ R2 : k(x1 , x2 )k1 ≤ 1} = {(x1 , x2 ) ∈ R2 : |x1 | + |x2 | ≤ 1}.

Then C is a closed convex set. The distance from z = (−1, −1) to the set C is one, and this
distance is realized by more than one point of C: every point (−t, −(1 − t)) with t ∈ [0, 1]
is a best approximation to z.
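A quick numerical check of Example 4.5.1 (an illustrative sketch, not part of the notes): each point (−t, −(1 − t)), t ∈ [0, 1], lies in C and is at ℓ1 -distance one from z = (−1, −1), and a brute-force search over a grid of C finds no closer point.

import numpy as np

z = np.array([-1.0, -1.0])
# candidate best approximations on the face {(-t, -(1 - t)) : t in [0, 1]} of C
for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
    y = np.array([-t, -(1 - t)])
    print(t, np.sum(np.abs(z - y)))        # each l^1-distance equals 1

# brute-force check that d(z, C) = 1 over a grid of the l^1 unit ball
g = np.linspace(-1, 1, 401)
X, Y = np.meshgrid(g, g)
mask = np.abs(X) + np.abs(Y) <= 1.0
dists = np.abs(z[0] - X[mask]) + np.abs(z[1] - Y[mask])
print(dists.min())                          # approximately 1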

The following example shows that the uniqueness of best approximations in Theorem 4.5.2
need not be true for nonconvex sets.
Example 4.5.2. Let X = R2 with the norm kxk2 = (x21 + x22 )1/2 for all x = (x1 , x2 ) ∈ R2 .
Let
C = SX = {(x1 , x2 ) ∈ R2 : x21 + x22 = 1}.
Then X is strictly convex and C is not convex. However, all points of C are best approxi-
mations to (0, 0) ∈ X.

Theorem 4.5.3. Let X be a Banach space. If every element in X possesses at most
one best approximation with respect to every convex subset of X, then X is strictly
convex.

Proof. Assume, on the contrary, that X is not strictly convex. Then there exist x, y ∈ X,
x 6= y, such that
kxk = kyk = k(x + y)/2k = 1.
Furthermore, by the convexity of the norm,
ktx + (1 − t)yk = 1, for all t ∈ [0, 1].
Set C := co({x, y}), the convex hull of the set {x, y}. Then k0 − zk = 1 = d(0, C) for all
z ∈ C. It follows that every element of C is a best approximation to zero, which contradicts
the uniqueness assumption.

From Theorems 4.5.1 and 4.5.2, we obtain the following result.

Theorem 4.5.4. Let C be a nonempty weakly compact convex subset of a strictly convex
Banach space X. Then for each x ∈ X, C has the unique best approximation, that is,
PC (·) is a single-valued metric projection mapping from X onto C.

Corollary 4.5.2. Let C be a nonempty closed convex subset of a strictly convex reflexive
Banach space X and let x ∈ X. Then there exists a unique element x0 ∈ C such that
kx − x0 k = d(x, C).
5 Appendix: Basic Results from Analysis - I

Definition 5.0.1. A function f : Rn → R ∪ {±∞} is said to be

(a) positively homogeneous if for all x ∈ Rn and all r ≥ 0, f (rx) = rf (x);


(b) subadditive if
f (x + y) ≤ f (x) + f (y), for all x, y ∈ Rn ;

(c) sublinear if it is positively homogeneous and subadditive;


(d) subodd if for all x ∈ Rn \ {0}, f (x) ≥ −f (−x).

Every real-valued odd function is subodd. It can be seen that the function f : R → R defined
by f (x) = x2 is subodd but it is neither odd nor subadditive.

Remark 5.0.1. (a) It can be easily seen that f is subodd if and only if f (x) + f (−x) ≥ 0,
for all x ∈ Rn \ {0}.

(b) If f is sublinear and is not constant with value −∞ such that f (0) ≥ 0, then f is
subodd.

Definition 5.0.2. Let f : Rn → R ∪ {±∞} be an extended real-valued function.

(a) The effective domain of f is defined as


dom(f ) := {x ∈ Rn : f (x) < +∞}.

(b) The function f is called proper if f (x) < +∞ for at least one x ∈ Rn and f (x) > −∞
for all x ∈ Rn .


(c) The graph of f is defined as

graph(f ) := {(x, y) ∈ Rn × R : y = f (x)}.

(d) The epigraph of f is defined as

epi(f ) := {(x, α) ∈ Rn × R : f (x) ≤ α}.

(e) The hypograph of f is defined as

hyp(f ) := {(x, α) ∈ Rn × R : f (x) ≥ α}.

(f) The lower level set of f at level α ∈ R is defined as

L(f, α) := {x ∈ Rn : f (x) ≤ α}.

(g) The upper level set of f at level α ∈ R is defined as

U(f, α) := {x ∈ Rn : f (x) ≥ α}.

The epigraph (hypograph) is thus a subset of Rn+1 that consists of all the points of Rn+1
lying on or above (on or below) the graph of f . From the above definitions, we have

(x, α) ∈ epi(f ) if and only if x ∈ L(f, α),

and
(x, α) ∈ hyp(f ) if and only if x ∈ U(f, α).

Definition 5.0.3. A function f : Rn → R is said to be

(a) bounded above if there exists a real number M such that f (x) ≤ M, for all x ∈ Rn ;

(b) bounded below if there exists a real number m such that f (x) ≥ m, for all x ∈ Rn ;

(c) bounded if it is bounded above as well as bounded below.

For f : Rn → R ∪ {±∞}, we write

inf f := inf{f (x) : x ∈ Rn },

argminf := argmin{f (x) : x ∈ Rn } := {x ∈ Rn : f (x) = inf f }.



Definition 5.0.4. A function f : Rn → R ∪ {±∞} is said to be lower semicontinuous at a
point x ∈ Rn if f (x) ≤ lim inf_{m→∞} f (xm ) whenever xm → x as m → ∞. f is said to be lower
semicontinuous on Rn if it is lower semicontinuous at each point of Rn .

A function f : Rn → R ∪ {±∞} is said to be upper semicontinuous at a point x ∈ Rn if
f (x) ≥ lim sup_{m→∞} f (xm ) whenever xm → x as m → ∞. f is said to be upper semicontinuous
on Rn if it is upper semicontinuous at each point of Rn .

Remark 5.0.2. A function f : Rn → R is lower (respectively, upper) semicontinuous on


Rn if and only if the lower level set L(f, α) (respectively, the upper level set U(f, α)) is
closed in Rn for all α ∈ R. Also, f is lower (respectively, upper) semicontinuous on Rn if
and only if the epi(f ) (respectively, hyp(f )) is closed. Equivalently, f is lower (respectively,
upper) semicontinuous on Rn if and only if the set {x ∈ Rn : f (x) > α} (respectively, the
set {x ∈ Rn : f (x) < α}) is open in Rn for all α ∈ R.

Definition 5.0.5. A function f : Rn → R is said to be differentiable at x ∈ Rn if there


exists a vector ∇f (x), called the gradient, and a function α : Rn → R such that
f (y) = f (x) + h∇f (x), y − xi + ky − xkα(y − x), for all y ∈ Rn ,
where limy→x α(y − x) = 0.

If f is differentiable, then
f (x + λv) = f (x) + λh∇f (x), vi + o(λ), for all x + λv ∈ Rn ,
where lim_{λ→0} o(λ)/λ = 0.

The gradient of f at x = (x1 , x2 , . . . , xn ) is the vector in Rn given by
∇f (x) = (∂f (x)/∂x1 , ∂f (x)/∂x2 , . . . , ∂f (x)/∂xn ) .
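The gradient of Definition 5.0.5 can be approximated by central finite differences, which is a convenient sanity check for hand-computed gradients. The following minimal sketch uses a sample quadratic function; all names are illustrative and not part of the notes.

import numpy as np

def num_grad(f, x, h=1e-6):
    # central-difference approximation of the gradient of Definition 5.0.5
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

f = lambda x: x[0]**2 + x[0]*x[1] + 2*x[1]**2              # sample smooth function
grad_f = lambda x: np.array([2*x[0] + x[1], x[0] + 4*x[1]])  # its analytic gradient

x = np.array([1.5, -0.5])
print(num_grad(f, x), grad_f(x))
print(np.allclose(num_grad(f, x), grad_f(x), atol=1e-5))    # True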
Definition 5.0.6. An n × n symmetric matrix M of real numbers is said to be positive
semidefinite if hy, Myi ≥ 0 for all y ∈ Rn . It is called positive definite if hy, Myi > 0 for all
y 6= 0.
Definition 5.0.7. Let f = (f1 , . . . , fℓ ) : Rn → Rℓ be a vector-valued function such that the
partial derivative ∂fi (x)/∂xj of fi with respect to xj exists for i = 1, 2, . . . , ℓ and
j = 1, 2, . . . , n. Then the Jacobian matrix J(f )(x) is the ℓ × n matrix whose (i, j)-th entry
is ∂fi (x)/∂xj , that is,
J(f )(x) = [∂fi (x)/∂xj ] , i = 1, 2, . . . , ℓ, j = 1, 2, . . . , n,

where x = (x1 , x2 , . . . , xn ) ∈ Rn .
Definition 5.0.8. A function f : Rn → R is said to be twice differentiable at x ∈ Rn if there
exist a vector ∇f (x) and an n × n symmetric matrix ∇2 f (x), called the Hessian matrix, and
a function α : Rn → R such that

f (y) = f (x) + h∇f (x), y − xi + (1/2)hy − x, ∇2 f (x)(y − x)i + ky − xk2 α(y − x), for all y ∈ Rn ,

where lim_{y→x} α(y − x) = 0.

If f is twice differentiable, then

f (x + λv) = f (x) + λh∇f (x), vi + (λ2 /2)hv, ∇2 f (x)vi + o(λ2 ), for all x + λv ∈ Rn ,

where lim_{λ→0} o(λ2 )/λ2 = 0.

The Hessian matrix of f at x = (x1 , x2 , . . . , xn ) is the n × n matrix whose (i, j)-th entry is
∂ 2 f (x)/∂xi ∂xj , that is,
∇2 f (x) ≡ H(x) = [∂ 2 f (x)/∂xi ∂xj ] , i, j = 1, 2, . . . , n.
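The Hessian matrix can likewise be approximated by finite differences, and positive (semi)definiteness in the sense of Definition 5.0.6 can be tested through the eigenvalues of the symmetrized matrix. A brief illustrative sketch for the same sample function as above (names assumed, not from the notes):

import numpy as np

def num_hess(f, x, h=1e-4):
    # finite-difference approximation of the Hessian matrix of Definition 5.0.8
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)
    return H

f = lambda x: x[0]**2 + x[0]*x[1] + 2*x[1]**2
H = num_hess(f, np.array([0.3, -0.7]))
print(np.round(H, 4))                                     # approximately [[2, 1], [1, 4]]
# positive definiteness (Definition 5.0.6): all eigenvalues of the symmetric part are > 0
print(np.all(np.linalg.eigvalsh((H + H.T) / 2) > 0))      # True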
Definition 5.0.9. Let K be a nonempty convex subset of Rn . A function f : K → R is said
to be

(a) convex if for all x, y ∈ K and all λ ∈ [0, 1],

f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y);

(b) strictly convex if for all x, y ∈ K, x 6= y and all λ ∈ ]0, 1[,

f (λx + (1 − λ)y) < λf (x) + (1 − λ)f (y).

A function f : K → R is said to be (strictly) concave if −f is (strictly) convex.

Geometrically speaking, a function f : K → R defined on a convex subset K of Rn is convex


if the line segment joining any two points on the graph of the function lies on or above the
portion of the graph between these points. Similarly, f is concave if the line segment joining
any two points on the graph of the function lies on or below the portion of the graph between
these points. Also, a function for which the open line segment joining any two distinct points
on the graph of the function lies strictly above the portion of the graph strictly between these
points is referred to as a strictly convex function.

Some of the examples of convex functions defined on R are f (x) = ex , f (x) = x, f (x) = |x|,
f (x) = max{0, x}. The functions f (x) = − log x and f (x) = xα for α < 0 or α > 1 are
strictly convex on the interval ]0, ∞[. Clearly, every strictly convex function is convex but
the converse may not be true. For example, the function f (x) = x defined on R is not strictly
convex. The function f (x) = |x + x3 | is a nondifferentiable strictly convex function on R.

Proposition 5.0.1. A function f : K → R defined on a nonempty convex subset K of


Rn is convex if and only if its epigraph is a convex set.
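The defining inequality of convexity can be tested empirically by random sampling; this does not prove convexity, but it is a useful sanity check. The sketch below (illustrative only) tests the function f (x) = |x + x3 | mentioned above.

import numpy as np

# sampling check of the defining inequality of convexity (Definition 5.0.9)
f = lambda x: np.abs(x + x**3)

rng = np.random.default_rng(2)
ok = True
for _ in range(10000):
    x, y = rng.uniform(-3, 3, size=2)
    lam = rng.uniform(0, 1)
    ok &= f(lam*x + (1 - lam)*y) <= lam*f(x) + (1 - lam)*f(y) + 1e-12
print(ok)     # True: no violation of the convexity inequality was found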
6 Appendix: Basic Results from Analysis - II

Theorem 6.0.1 (Finite Intersection Property). A topological space X is compact if and
only if every collection {Cα }α∈Λ of closed sets in X with the finite intersection property
(that is, ∩_{i=1}^{n} Cαi 6= ∅ for every finite subcollection {Cα1 , . . . , Cαn }) satisfies
∩_{α∈Λ} Cα 6= ∅.

Theorem 6.0.2 (Hahn-Banach Theorem). Let C be a subspace of a linear space X, p


be a sublinear functional on X and f be a linear functional defined on C such that

f (x) ≤ p(x), for all x ∈ C.

Then there exists a linear extension F of f such that F (x) ≤ p(x) for all x ∈ X.

The following corollary gives the existence of nontrivial bounded linear functionals on an
arbitrary normed space.
Corollary 6.0.1. Let x be a nonzero element of a normed space X. Then there exists
j ∈ X ∗ such that j(x) = kxk and kjk∗ = 1.

Definition 6.0.1. Let X be a normed space and X ∗ be its dual space. The duality pairing
between X and X ∗ is the functional h., .i : X × X ∗ → R defined by
hx, ji = j(x), for all x ∈ X and j ∈ X ∗ .

Theorem 6.0.3 (James Theorem). A Banach space X is reflexive if and only if for
each j ∈ SX ∗ , there exists x ∈ SX such that j(x) = 1.

Let X be a Banach space with its dual X ∗ . We say that the sequence {xn } in X converges
to x if lim_{n→∞} kxn − xk = 0. This kind of convergence is also called norm convergence or


strong convergence. This is related to the strong topology on X with neighborhood base
Br (0) = {x ∈ X : kxk < r}, r > 0 at the origin. There is also a weak topology on X
generated by the bounded linear functionals on X. Indeed, a set G ⊆ X is said to be open
in the weak topology if for every x ∈ G, there are bounded linear functionals f1 , f2 , . . . , fn
and positive real numbers ε1 , ε2 , . . . , εn such that

{y ∈ X : |fi (x) − fi (y)| < εi , i = 1, 2, . . . , n} ⊆ G.

Hence a base of neighborhoods of x̄ ∈ X for the weak topology σ(X, X ∗ ) is given by the
following sets:

V (f1 , f2 , . . . , fn ; ε) = {x ∈ X : |hx − x̄, fi i| < ε, for all i = 1, 2, . . . , n} ,

where f1 , f2 , . . . , fn ∈ X ∗ and ε > 0.

In particular, a sequence {xn } in X converges to x with respect to a weak topology σ(X, X ∗ )


if and only if hxn , f i → hx, f i for all f ∈ X ∗ .

Definition 6.0.2. A sequence {xn } in a normed space X is said to converge weakly to x ∈ X
if f (xn ) → f (x) for all f ∈ X ∗ . In this case, we write xn ⇀ x or weak- lim_{n→∞} xn = x.

Definition 6.0.3. A subset C of a normed space X is said to be weakly closed if it is closed


in the weak topology.

Definition 6.0.4. A subset C of a normed space X is said to be weakly compact if it is


compact in the weak topology.

Remark 6.0.1. In the finite dimensional spaces, the weak convergence and the strong con-
vergence are equivalent.

Theorem 6.0.4 (Banach-Alaoglu Theorem). Let X be a normed space and X ∗ be its


dual. Then the unit ball in X ∗ is weak*-compact.

Proposition 6.0.1. Let C be a nonempty convex subset of a normed space X. Then C


is weakly closed if and only if it is closed.

Proposition 6.0.2. Every weakly compact subset of a Banach space is bounded.

Proposition 6.0.3. Every closed convex subset of a weakly compact set is weakly com-
pact.

Theorem 6.0.5 (Kakutani’s Theorem). Let X be a Banach space. Then X is reflexive
if and only if the closed unit ball BX := {x ∈ X : kxk ≤ 1} is weakly compact.

Theorem 6.0.6. Let X be a Banach space. Then X is reflexive if and only if every
closed convex bounded subset of X is weakly compact.

Theorem 6.0.7. Let C be a weakly closed subset of a reflexive Banach space X. Then C
is weakly compact if and only if C is bounded.

Theorem 6.0.8. Let X be a Banach space. Then X is reflexive if and only if every
bounded sequence in X has a weakly convergent subsequence.

Theorem 6.0.9. Let X be a compact topological space and f : X → (−∞, ∞] be a lower


semicontinuous functional. Then there exists an element x̄ ∈ X such that

f (x̄) = inf{f (x) : x ∈ X}.

Proof. For all α ∈ R, let Gα := {x ∈ X : f (x) > α}. Since f is lower semicontinuous, each
Gα is open and X = ∪_{α∈R} Gα . By compactness of X, there exists a finite subfamily
{Gα1 , . . . , Gαn } of {Gα }α∈R such that X = ∪_{i=1}^{n} Gαi .

Set α0 := min{α1 , α2 , . . . , αn }. Then f (x) > α0 for all x ∈ X, so m := inf{f (x) : x ∈ X}
is finite. Let β be a number such that β > m.
Set Fβ := {x ∈ X : f (x) ≤ β}. Then Fβ is a nonempty closed subset of X, and hence, by
the finite intersection property (Theorem 6.0.1; the family {Fβ }β>m is nested, so every
finite subfamily has nonempty intersection), we have
∩_{β>m} Fβ 6= ∅.

Therefore, for any point x̄ of this intersection, we have m = f (x̄).

Theorem 6.0.10. Let C be a weakly compact convex subset of a Banach space X and
f : C → (−∞, ∞] be a proper lower semicontinuous convex functional. Then there exists
x̄ ∈ C such that f (x̄) = inf{f (x) : x ∈ C}.

Remark 6.0.2. If f is a strictly convex function in Theorem 6.0.10, then x̄ ∈ C is the


unique point such that f (x̄) = inf{f (x) : x ∈ C}.

Recall that every closed convex bounded subset of a reflexive Banach space is weakly compact
(Theorem 6.0.6). Using this fact, we have the following result.

Theorem 6.0.11. Let C be a nonempty closed convex bounded subset of a reflexive Ba-
nach space X and f : X → (−∞, ∞] be a proper lower semicontinuous convex functional.
Then there exists x̄ ∈ C such that f (x̄) = inf{f (x) : x ∈ C}.

In Theorem 6.0.11, the boundedness of C may be replaced by the following weaker assump-
tion (called coercivity condition):

f (x) → ∞ as kxk → ∞ with x ∈ C.

Theorem 6.0.12. Let C be a nonempty closed convex subset of a reflexive Banach space
X and f : C → (−∞, ∞] be a proper lower semicontinuous convex functional such that
f (xn ) → ∞ as kxn k → ∞. Then there exists x̄ ∈ C such that f (x̄) = inf{f (x) : x ∈ C}.

Proof. Let m = inf{f (x) : x ∈ C}. Note that m < +∞ since f is proper. Choose a
minimizing sequence {xn } in C, that is, f (xn ) → m. If {xn } is not bounded, then there
exists a subsequence {xni } of {xn } such that kxni k → ∞. From the hypothesis, we have
f (xni ) → ∞, which contradicts f (xn ) → m < +∞. Hence {xn } is bounded. Since X is
reflexive, by Theorem 6.0.8, there exists a subsequence {xnj } of {xn } such that xnj ⇀ x̄.
Since C is closed and convex, it is weakly closed (Proposition 6.0.1), and so x̄ ∈ C. Since f
is convex and lower semicontinuous, it is lower semicontinuous in the weak topology, and
hence
m ≤ f (x̄) ≤ lim inf_{j→∞} f (xnj ) = lim_{n→∞} f (xn ) = m.
Therefore, f (x̄) = m.
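Theorem 6.0.12 is the abstract justification for minimizing coercive convex functions over closed convex sets. The following finite-dimensional projected-gradient sketch is an illustration under assumed step size and iteration count, not an algorithm from these notes.

import numpy as np

# f(x) = (x1 - 2)^2 + (x2 + 3)^2 is a proper lsc convex coercive function and
# C = [0, inf) x [0, inf) is closed and convex, so a minimizer exists; here it is (2, 0).
grad = lambda x: np.array([2*(x[0] - 2), 2*(x[1] + 3)])
proj_C = lambda x: np.maximum(x, 0.0)          # metric projection onto C

x = np.array([5.0, 5.0])
for _ in range(500):
    x = proj_C(x - 0.1 * grad(x))              # projected-gradient step
print(np.round(x, 6))                           # approximately [2, 0]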
