
Optimization:
Principles and Algorithms
Michel Bierlaire

Companion website: optimizationprinciplesalgorithms.com

EPFL Press
Optimization: Principles and Algorithms
Michel Bierlaire

Every engineer and decision scientist must have a good mastery of optimization, an essential
element in their toolkit. Thus, this articulate introductory textbook will certainly be welcomed
by students and practicing professionals alike. Drawing from his vast teaching experience,
the author skillfully leads the reader through a rich choice of topics in a coherent, fluid and
tasteful blend of models and methods anchored on the underlying mathematical notions
(only prerequisites: first year calculus and linear algebra). Topics range from the classics to some of the most recent developments in smooth unconstrained and constrained optimization, like descent methods, conjugate gradients, Newton and quasi-Newton methods, linear programming and the simplex method, trust region and interior point methods. Furthermore, elements of discrete and combinatorial optimization, like network optimization, integer programming, and heuristic local search methods, are also presented.
This book presents optimization as a modeling tool that, beyond supporting problem formulation and the design and implementation of efficient algorithms, is also a language suited for interdisciplinary human interaction. Readers further become aware that, while the roots of mathematical optimization go back to the work of giants like Newton, Lagrange, Cauchy, Euler, or Gauss, it did not become a discipline of its own until World War Two, and that its present momentum really resulted from its symbiosis with modern computers, which made it possible to routinely solve problems with millions of variables and constraints.
With his witty, entertaining, yet precise style, Michel Bierlaire captivates his readers and awakens their desire to try out the presented material in a creative mode. One of the outstanding assets of this book is the unified, clear and concise rendering of the various algorithms, which makes them easily readable and translatable into any high level programming language. This is an addictive book that I am very pleased to recommend.

Prof. Thomas M. Liebling

MICHEL BIERLAIRE holds a PhD in mathematics from the University of Namur, Belgium. He is a full professor at the School of Architecture, Civil and Environmental Engineering at the Ecole Polytechnique Fédérale de Lausanne, Switzerland. He has been teaching optimization and operations research at the EPFL since 1998.

EPFL Press

Optimization:
Principles and Algorithms

Cover Illustrations: Abstract cubes background, © ProMotion and Architectural detail of modern roof structure, © Lucian Milasan – Fotolia.com

Michel Bierlaire
Optimization:
Principles and Algorithms

EPFL Press
This book is published under the editorial direction of
Professor Robert Dalang (EPFL).

The publisher and author express their thanks to the Ecole Polytechnique Fédérale de Lausanne (EPFL) for its generous support towards the publication of this book.

The EPFL Press is the English-language imprint of the Foundation of the Presses polytechniques et
universitaires romandes (PPUR). The PPUR publishes mainly works of teaching and research of the Ecole
polytechnique fédérale de Lausanne (EPFL), of universities and other institutions of higher education.

Presses polytechniques et universitaires romandes, EPFL – Rolex Learning Center,
Post office box 119, CH-1015 Lausanne, Switzerland
E-mail : ppur@epfl.ch
Phone : 021 / 693 21 30
Fax : 021 / 693 40 27

www.epflpress.org

© 2018, Second edition, EPFL Press


ISBN 978-2-88915-279-7

© 2015, First edition, EPFL Press


ISBN 978-2-940222-78-0 (EPFL Press)
ISBN 978-1-4822-0345-5 (CRC Press)
Printed in Italy
All rights reserved (including those of translation into other languages). No part of this book may be
reproduced in any form – by photoprint, microfilm, or any other means – nor transmitted or translated into
a machine language without written permission from the publisher.

To Patricia

Preface

Optimization algorithms are important tools for engineers, but difficult to use. In
fact, none of them is universal, and a good understanding of the different methods is
necessary in order to identify the most appropriate one in the context of specific appli-
cations. Designed to teach undergraduate engineering students about optimization,
this book also provides professionals employing optimization methods with significant
elements to identify the methods that are appropriate for their applications, and to
understand the possible failures of certain methods on their problem. The content
is meant to be formal, in the sense that the results presented are proven in detail,
and all described algorithms have been implemented and tested by the author. In
addition, the many numeric and graphic illustrations constitute a significant base for
understanding the methods.
The book features eight parts. The first part focuses on the formulation and the
analysis of the optimization problem. It describes the modeling process that leads to
an optimization problem, as well as the transformations of the problem into an equiv-
alent formulation. The properties of the problem and corresponding hypotheses are
discussed independently of the algorithms. Subsequently, the optimality conditions,
the theoretical foundations that are essential for properly mastering the algorithms,
are analyzed in detail in Part II. Before explaining the methods for unconstrained
continuous optimization in Part IV, algorithms for solving systems of non linear
equations, based on Newton’s method, are described in Part III. The algorithms for
constrained continuous optimization constitute the fifth part. Part VI addresses op-
timization problems based on network structures, elaborating more specifically on
the shortest path problem and the maximum flow problem. Discrete optimization
problems, where the variables are constrained to take integer values, are introduced
in Part VII, where both exact methods and heuristics are presented. The last part is
an appendix containing the definitions and theoretical results used in the book.
Several chapters include exercises. Chapters related to algorithms also propose
projects involving an implementation. It is advisable to use a mathematical program-
ming language, such as Octave (Eaton, 1997) or Matlab (Moler, 2004). If a language
such as C, C++, or Fortran is preferred, a library managing the linear algebra, such
as LAPACK (Anderson et al., 1999), can be useful. When time limits do not allow
a full implementation of the algorithms by the students, the teaching assistant may
prepare the general structure of the program, including the implementation of opti-
mization problems (objective function, constraints, and derivatives) in order for the
students to focus on the key points of the algorithms. The examples described in
detail in this book enable the implementations to be verified.
Optimization is an active research field that is permanently stimulated by the needs of modern applications. Many aspects are absent from this book. “Every choice entails the rejection of what might have been better,” said André Gide. Among the important topics not covered in this book, we can mention


• the numerical aspects related to the implementation, particularly important and
tricky in this area (see, e.g., Dennis and Schnabel, 1996);
• the convergence analysis of the algorithms (see, e.g., Ortega and Rheinboldt, 1970,
Dennis and Schnabel, 1996, Conn et al., 2000, and many others);
• automatic differentiation, allowing the automatic generation of analytical deriva-
tives of a function (Griewank, 1989, Griewank, 2000);
• techniques to deal with problems of large size, such as updates with limited mem-
ory (Byrd et al., 1994) or partially separable functions (Griewank and Toint, 1982);
• homotopy methods (Forster, 1995);
• semidefinite optimization (Gärtner and Matousek, 2012);
• the vast field of convex optimization (Ben-Tal and Nemirovski, 2001, Boyd and
Vandenberghe, 2004, Calafiore and El Ghaoui, 2014);
• stochastic programming (Birge and Louveaux, 1997, Shapiro et al., 2014);
• robust optimization (Ben-Tal et al., 2009), where uncertainty of the data is ex-
plicitly accounted for.
This book is the fruit of fifteen years of teaching optimization to undergraduate
students in engineering at the Ecole Polytechnique Fédérale of Lausanne. Except for
the parts of the book related to networks and discrete optimization that are new, the
material in the book has been translated from Bierlaire (2006), a textbook in French.
The main sources of inspiration are the following books:
• Bertsimas and Weismantel (2005)
• de Werra et al. (2003)
• Bonnans et al. (2003)
• Conn et al. (2000)
• Nocedal and Wright (1999)
• Bertsekas (1999)
• Wolsey (1998)
• Bertsekas (1998)
• Bertsimas and Tsitsiklis (1997)
• Wright (1997)
• Dennis and Schnabel (1996)
• Ahuja et al. (1993).
There are many books on optimization. Within the vast literature, we may cite
the following books in English: Beck (2014), Calafiore and El Ghaoui (2014), Ben-
Tal et al. (2009), Boyd and Vandenberghe (2004), Ben-Tal and Nemirovski (2001),
Conn et al. (2000), Kelley (1999), Kelley (1995), Birge and Louveaux (1997), Den-
nis and Schnabel (1996), Axelsson (1994), Polyak (1987), Scales (1985), Coleman (1984), McCormick (1983), Gill et al. (1981), Fletcher (1980), Fletcher (1981), Ortega and Rheinboldt (1970), and Fiacco and McCormick (1968). Several books are
also available in French. Among them, we can cite Korte et al. (2010), Dodge (2006),
Cherruault (1999), Breton and Haurie (1999), Hiriart-Urruty (1998), Bonnans et al.
(1997), and Gauvin (1992).
The bibliographic source for the biographies of Jacobi, Hesse, Lagrange, Fermat,
Newton, Al Khwarizmi, Cauchy, and Lipschitz is Gillispie (1990). The information
about Tucker (Gass, 2004), Dantzig (Gass, 2003), Little (Larson, 2004), Fulkerson
(Bland and Orlin, 2005), and Gomory (Johnson, 2005) come from the series IFORS’
Operational Research Hall of Fame. The source of information for Euler is his
biography by Finkel (1897). Finally, the information on Davidon was taken from his
web page

www.haverford.edu/math/wdavidon.html

and from Nocedal and Wright (1999). The selection of persons described in this work
is purely arbitrary. Many other mathematicians have contributed significantly to the
field of optimization and would deserve a place herein. I encourage the reader to read,
in particular, the articles of the series IFORS’ Operational Research Hall of Fame
published in International Transactions in Operational Research, expressly
dealing with Morse, Bellman, Kantorovich, Erlang, and Kuhn.

Online material
The book has a companion website:

www.optimizationprinciplesalgorithms.com

The algorithms presented in the book are coded in GNU Octave, a high-level in-
terpreted language (www.gnu.org/software/octave), primarily intended for numerical
computations. The code for the algorithms, as well as examples of optimization problems, is provided. All the examples have been run on GNU Octave, version 3.8.1, on a MacBook Pro running OS X Yosemite 10.10.2. If you use these codes, Michel
Bierlaire, the author, grants you a nonexclusive license to run, display, reproduce, dis-
tribute and prepare derivative works of this code. The code has not been thoroughly
tested under all conditions. The author, therefore, does not guarantee or imply its
reliability, serviceability, or function. The author provides no program services for
the code.
Acknowledgments
I would like to thank EPFL Press for their support in the translation of the French
version and the publication of this book. The financial support of the School of
Architecture, Civil and Environmental Engineering at EPFL is highly appreciated.
I am grateful to my PhD advisor, Philippe Toint, who passed on his passion for
optimization to me. Among the many things he taught me, the use of geometric
interpretations of certain concepts, especially algorithmic ones, proved particularly
useful for my research and my teaching. I hope that they will now benefit the readers
of this book.
I also thank Thomas Liebling who put his trust in me by welcoming me into his
group at EPFL back in 1998. Among other things, he asked me to take care of the
optimization and operations research courses, which year after year have enabled me
to build the material that is gathered in this book. But I am especially grateful for
his friendship, and all the things that he has imparted to me.
I would like to thank my doctoral students, teaching assistants, and postdocs, not
only for the pleasure of working with them, but also for their valuable help in the
optimization course over the years, and their comments on various versions of this
book.
Earlier versions of the manuscript were carefully read by the members of the Trans-
port and Mobility Laboratory at the EPFL, and in particular Stefan Binder, Anna
Fernandez Antolin, Flurin Hänseler, Yousef Maknoon, Iliya Markov, Marija Nikolic,
Tomas Robenek, Riccardo Scarinci, and Shadi Sharif. Thomas Liebling provided a
great deal of comments, with excellent suggestions to improve the style and the con-
tent of the book. He caught several errors and imprecisions. It greatly improved the
quality of the text. If there are still mistakes (and there are always some that escape
scrutiny), I clearly take full responsibility. Errata will be published on the website
as mistakes are caught.
I am so proud of my children, Aria and François, who have recently started the
difficult challenge of obtaining a university degree. Finally, I dedicate this book to
Patricia. I am really lucky to know such a wonderful person. Her love is a tremendous
source of joy and energy for me.
Contents

I Formulation and analysis of the problem 1


1 Formulation 5
1.1 Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.1 Projectile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1.2 Swisscom . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.1.3 Château Laupt-Himum . . . . . . . . . . . . . . . . . . . . . . 9
1.1.4 Euclid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.1.5 Agent 007 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.1.6 Indiana Jones . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.1.7 Geppetto . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.2 Problem transformations . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.2.1 Simple transformations . . . . . . . . . . . . . . . . . . . . . . 16
1.2.2 Slack variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.3 Hypotheses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.4 Problem definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2 Objective function 29
2.1 Convexity and concavity . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.2 Differentiability: the first order . . . . . . . . . . . . . . . . . . . . . . 31
2.3 Differentiability: the second order . . . . . . . . . . . . . . . . . . . . 39
2.4 Linearity and non linearity . . . . . . . . . . . . . . . . . . . . . . . . 42
2.5 Conditioning and preconditioning . . . . . . . . . . . . . . . . . . . . . 45
2.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

3 Constraints 51
3.1 Active constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.2 Linear independence of the constraints . . . . . . . . . . . . . . . . . . 56
3.3 Feasible directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.3.1 Convex constraints . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.3.2 Constraints defined by equations-inequations . . . . . . . . . . 62
3.4 Elimination of constraints . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.5 Linear constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3.5.1 Polyhedron . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3.5.2 Basic solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
3.5.3 Basic directions . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4 Introduction to duality 93
4.1 Constraint relaxation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.2 Duality in linear optimization . . . . . . . . . . . . . . . . . . . . . . . 102
4.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

II Optimality conditions 111

5 Unconstrained optimization 115


5.1 Necessary optimality conditions . . . . . . . . . . . . . . . . . . . . . . 115
5.2 Sufficient optimality conditions . . . . . . . . . . . . . . . . . . . . . . 120
5.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

6 Constrained optimization 127


6.1 Convex constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
6.2 Lagrange multipliers: necessary conditions . . . . . . . . . . . . . . . . 133
6.2.1 Linear constraints . . . . . . . . . . . . . . . . . . . . . . . . . 133
6.2.2 Equality constraints . . . . . . . . . . . . . . . . . . . . . . . . 137
6.2.3 Equality and inequality constraints . . . . . . . . . . . . . . . . 142
6.3 Lagrange multipliers: sufficient conditions . . . . . . . . . . . . . . . . 152
6.3.1 Equality constraints . . . . . . . . . . . . . . . . . . . . . . . . 153
6.3.2 Inequality constraints . . . . . . . . . . . . . . . . . . . . . . . 154
6.4 Sensitivity analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
6.5 Linear optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
6.6 Quadratic optimization . . . . . . . . . . . . . . . . . . . . . . . . . . 171
6.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

III Solving equations 177

7 Newton’s method 181


7.1 Equation with one unknown . . . . . . . . . . . . . . . . . . . . . . . . 181
7.2 Systems of equations with multiple unknowns . . . . . . . . . . . . . . 192
7.3 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198

8 Quasi-Newton methods 201


8.1 Equation with one unknown . . . . . . . . . . . . . . . . . . . . . . . . 201
8.2 Systems of equations with multiple unknowns . . . . . . . . . . . . . . 208
8.3 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
IV Unconstrained optimization 217


9 Quadratic problems 221
9.1 Direct solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
9.2 Conjugate gradient method . . . . . . . . . . . . . . . . . . . . . . . . 222
9.3 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232

10 Newton’s local method 235


10.1 Solving the necessary optimality conditions . . . . . . . . . . . . . . . 235
10.2 Geometric interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . 236
10.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244

11 Descent methods and line search 245


11.1 Preconditioned steepest descent . . . . . . . . . . . . . . . . . . . . . . 246
11.2 Exact line search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
11.2.1 Quadratic interpolation . . . . . . . . . . . . . . . . . . . . . . 252
11.2.2 Golden section . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
11.3 Inexact line search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
11.4 Steepest descent method . . . . . . . . . . . . . . . . . . . . . . . . . . 277
11.5 Newton method with line search . . . . . . . . . . . . . . . . . . . . . 277
11.6 The Rosenbrock problem . . . . . . . . . . . . . . . . . . . . . . . . . 281
11.7 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
11.8 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288

12 Trust region 291


12.1 Solving the trust region subproblem . . . . . . . . . . . . . . . . . . . 294
12.1.1 The dogleg method . . . . . . . . . . . . . . . . . . . . . . . . . 294
12.1.2 Steihaug-Toint method . . . . . . . . . . . . . . . . . . . . . . 298
12.2 Calculation of the radius of the trust region . . . . . . . . . . . . . . . 300
12.3 The Rosenbrock problem . . . . . . . . . . . . . . . . . . . . . . . . . 308
12.4 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309

13 Quasi-Newton methods 311


13.1 BFGS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
13.2 Symmetric update of rank 1 (SR1) . . . . . . . . . . . . . . . . . . . . 317
13.3 The Rosenbrock problem . . . . . . . . . . . . . . . . . . . . . . . . . 320
13.4 Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
13.5 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326

14 Least squares problem 329


14.1 The Gauss-Newton method . . . . . . . . . . . . . . . . . . . . . . . . 334
14.2 Kalman filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
14.3 Orthogonal regression . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
14.4 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
15 Direct search methods 347


15.1 Nelder-Mead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
15.2 Torczon’s multi-directional search . . . . . . . . . . . . . . . . . . . . . 354
15.3 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
V Constrained optimization 359


16 The simplex method 363
16.1 The simplex algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
16.2 The simplex tableau . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
16.3 The initial tableau . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
16.4 The revised simplex algorithm . . . . . . . . . . . . . . . . . . . . . . 394
16.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
16.6 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396

17 Newton’s method for constrained optimization 399


17.1 Projected gradient method . . . . . . . . . . . . . . . . . . . . . . . . 399
17.2 Preconditioned projected gradient . . . . . . . . . . . . . . . . . . . . 405
17.3 Dikin’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
17.4 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412

18 Interior point methods 415


18.1 Barrier methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
18.2 Linear optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
18.3 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443

19 Augmented Lagrangian method 445


19.1 Lagrangian penalty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
19.2 Quadratic penalty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
19.3 Double penalty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
19.4 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460

20 Sequential quadratic programming 463


20.1 Local sequential quadratic programming . . . . . . . . . . . . . . . . . 464
20.2 Globally convergent algorithm . . . . . . . . . . . . . . . . . . . . . . . 471
20.3 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484

VI Networks 487
21 Introduction and definitions 491
21.1 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
21.2 Cuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494
21.3 Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
21.4 Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 498
21.5 Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501


21.5.1 Flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501
21.5.2 Capacities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503
21.5.3 Supply and demand . . . . . . . . . . . . . . . . . . . . . . . . 504
21.5.4 Costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
21.5.5 Network representation . . . . . . . . . . . . . . . . . . . . . . 508


21.6 Flow decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 510
21.7 Minimum spanning trees . . . . . . . . . . . . . . . . . . . . . . . . . . 520
21.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 524

22 The transhipment problem 529


22.1 Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
22.2 Optimality conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
22.3 Total unimodularity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 536
22.4 Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
22.4.1 The shortest path problem . . . . . . . . . . . . . . . . . . . . 539
22.4.2 The maximum flow problem . . . . . . . . . . . . . . . . . . . . 541
22.4.3 The transportation problem . . . . . . . . . . . . . . . . . . . . 544
22.4.4 The assignment problem . . . . . . . . . . . . . . . . . . . . . . 546
22.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 549

23 Shortest path 551


23.1 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552
23.2 The shortest path algorithm . . . . . . . . . . . . . . . . . . . . . . . . 558
23.3 Dijkstra’s algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 566
23.4 The longest path problem . . . . . . . . . . . . . . . . . . . . . . . . . 571
23.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574

24 Maximum flow 577


24.1 The Ford-Fulkerson algorithm . . . . . . . . . . . . . . . . . . . . . . . 577
24.2 The minimum cut problem . . . . . . . . . . . . . . . . . . . . . . . . 583
24.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588

VII Discrete optimization 591


25 Introduction to discrete optimization 595
25.1 Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 595
25.2 Classical problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607
25.2.1 The knapsack problem . . . . . . . . . . . . . . . . . . . . . . . 607
25.2.2 Set covering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 609
25.2.3 The traveling salesman problem . . . . . . . . . . . . . . . . . 610
25.3 The curse of dimensionality . . . . . . . . . . . . . . . . . . . . . . . . 614
25.4 Relaxation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 616
25.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 619
26 Exact methods for discrete optimization 625


26.1 Branch and bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
26.2 Cutting planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
26.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 644
26.4 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 645
27 Heuristics 647
27.1 Greedy heuristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 648
27.1.1 The knapsack problem . . . . . . . . . . . . . . . . . . . . . . . 648
27.1.2 The traveling salesman problem . . . . . . . . . . . . . . . . . 649
27.2 Neighborhood and local search . . . . . . . . . . . . . . . . . . . . . . 656
27.2.1 The knapsack problem . . . . . . . . . . . . . . . . . . . . . . . 662
27.2.2 The traveling salesman problem . . . . . . . . . . . . . . . . . 665
27.3 Variable neighborhood search . . . . . . . . . . . . . . . . . . . . . . . 669
27.3.1 The knapsack problem . . . . . . . . . . . . . . . . . . . . . . . 670
27.3.2 The traveling salesman problem . . . . . . . . . . . . . . . . . 672
27.4 Simulated annealing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674
27.4.1 The knapsack problem . . . . . . . . . . . . . . . . . . . . . . . 677
27.4.2 The traveling salesman problem . . . . . . . . . . . . . . . . . 679
27.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 682
27.6 Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 682

VIII Appendices 685


A Notations 687

B Definitions 689

C Theorems 695

D Projects 699
D.1 General instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 699
D.2 Performance analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 700

References 703
Part I

Formulation and analysis of the problem

No one trusts a model except the man who wrote it; everyone trusts an observation, except the man who made it.

Harlow Shapley

Modeling is a necessity before any optimization process. How do we translate a specific problem statement into a mathematical formulation that allows its analysis and its resolution? In this first part, we present modeling in the context of optimization. Then, we identify the properties of the optimization problem that are useful in the development of the theory and algorithms.

Chapter 1

Formulation

Contents
1.1 Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.1 Projectile . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1.2 Swisscom . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.1.3 Château Laupt-Himum . . . . . . . . . . . . . . . . . . . 9
1.1.4 Euclid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.1.5 Agent 007 . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.1.6 Indiana Jones . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.1.7 Geppetto . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.2 Problem transformations . . . . . . . . . . . . . . . . . . . 16
1.2.1 Simple transformations . . . . . . . . . . . . . . . . . . . 16
1.2.2 Slack variables . . . . . . . . . . . . . . . . . . . . . . . . 19
1.3 Hypotheses . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.4 Problem definition . . . . . . . . . . . . . . . . . . . . . . . 21
1.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

1.1 Modeling
The need to optimize is a direct result of the need to organize. Optimizing consists in
identifying an optimal configuration or an optimum in a system in the broadest sense
of the term. We use here Definition 1.1, given by Oxford University Press (2013).

Definition 1.1 (Optimum). (In Latin optimum, the best). The most favorable
or advantageous condition, value, or amount, especially under a particular set of
circumstances.

As part of a scientific approach, this definition requires some details. How can we
judge that the condition is favorable, and how can we formally describe the set of
circumstances?
The answer to these questions is an essential step in any optimization: mathematical modeling (Definition 1.2 by Oxford University Press, 2013). The modeling process consists of three steps:
1. The identification of the decision variables. They are the components of the system that describe its state, and that the analyst wants to determine. Or, they represent configurations of the system that are possible to modify in order to improve its performance. In general, if these variables are n in number, they are represented by a (column)¹ vector of $\mathbb{R}^n$, often denoted by $x = (x_1, \ldots, x_n)^T$, i.e.,
$$x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
¹ See Appendix A about the mathematical notations used throughout the book.

In practice, this step is probably the most complicated and most important. The
most complicated because only experience in modeling and a good knowledge of
the specific problem can guide the selection. The most important because the rest
of the process depends on it. An inappropriate selection of decision variables can
generate an optimization problem that is too complicated and impossible to solve.
2. The description of the method to assess the state of the system in question, given
a set of decision variables. In this book, we assume that the person performing
the modeling is able to identify a formula, a function, providing a measure of the
state of the system, a value that she wants to make the smallest or largest possible.
This function, called objective function, is denoted by f and the aforementioned
measure obtained for the decision variables x is a real number denoted by f(x).
3. The mathematical description of the circumstances or constraints, specifying the
values that the decision variables can take.

Definition 1.2 (Mathematical model). Mathematical representation of a physical, economic, human phenomenon, etc., conducted in order to better study it.

The modeling process is both exciting and challenging. Indeed, there is no uni-
versal recipe, and the number of possible models for a given problem is only limited
by the imagination of the modeler. However, it is essential to master optimization
tools and to understand the underlying assumptions in order to develop the adequate
model for the analysis in question. In this chapter, we provide some simple examples
of modeling exercises. In each case, we present a possible modeling.

1.1.1 Projectile
We start with a simple problem. A projectile is launched vertically at a speed of 50 meters per second, in the absence of wind. After how long and at which altitude does it start to fall?
The modeling process consists of three steps.
Decision variables A single decision variable is used. Denoted by x, it represents the number of seconds from the launch of the projectile. Note that, in this case, the decision variables represent a state of the system that the analyst wants to calculate.
Objective function We seek to identify the maximum altitude reached by the ob-
ject. We must thus express the altitude as a function of the decision variable. Since
we are dealing with the uniformly accelerating movement of an object subjected
to gravity, we have
$$f(x) = -\frac{g}{2}\,x^2 + v_0\,x + x_0 = -\frac{9.81}{2}\,x^2 + 50\,x,$$
where $g = 9.81$ is the acceleration experienced by the projectile, $v_0 = 50$ is its initial velocity, and $x_0 = 0$ is its initial altitude.
Constraints Time only goes forward. Therefore, we impose x ≥ 0.
We obtain the optimization problem
$$\max_{x \in \mathbb{R}} \; -\frac{9.81}{2}\,x^2 + 50\,x, \tag{1.1}$$
subject to (s.t.)
$$x \geq 0. \tag{1.2}$$
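Since the objective of (1.1) is a concave quadratic, its maximum is attained where the derivative vanishes: $x^* = 50/9.81 \approx 5.1$ seconds, at an altitude of $f(x^*) = 50^2/(2 \cdot 9.81) \approx 127.4$ meters. A minimal GNU Octave sketch of this check:

    f = @(x) -9.81/2 * x.^2 + 50*x;   % objective of problem (1.1)
    xstar = 50 / 9.81;                % where f'(x) = -9.81 x + 50 = 0
    printf("x* = %.2f s, altitude = %.2f m\n", xstar, f(xstar));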

1.1.2 Swisscom
The company Swisscom would like to install an antenna to connect four important
new customers to its network. This antenna must be as close as possible to each
client, giving priority to the best customers. However, to avoid the proliferation of
telecommunication antennas, the company is not allowed to install the new antenna
at a distance closer than 10 km from the other two antennas, respectively located at
coordinates (−5, 10) and (5, 0) and represented by the symbol ✪ in Figure 1.1. The
coordinates are expressed in kilometers from Swisscom’s headquarters. Swisscom
knows the geographic situation of each customer as well as the number of hours of
communication that the customer is supposed to consume per month. This data is
listed in Table 1.1. At which location should Swisscom install the new antenna?
The modeling process consists of three steps.
Decision variables Swisscom must identify the ideal location for the antenna, i.e.,
the coordinates of that location. We define two decision variables x1 and x2
representing these coordinates in a given reference system.
Objective function The distance $d_i(x_1, x_2)$ between a customer $i$ located at the coordinates $(a_i, b_i)$ and the antenna is given by
$$d_i(x_1, x_2) = \sqrt{(x_1 - a_i)^2 + (x_2 - b_i)^2}. \tag{1.3}$$
Figure 1.1: Swisscom problem (existing antennas ✪; new antenna at (x1, x2))

Table 1.1: Data for Swisscom customers


Customer Coord. Hours
1 (5, 10) 200
2 (10, 5) 150
3 (0, 12) 200
4 (12, 0) 300

To take into account the communication time, we measure the sum of the distances weighted by the number of consumed hours:
$$\begin{aligned} f(x_1, x_2) &= 200\, d_1(x_1, x_2) + 150\, d_2(x_1, x_2) + 200\, d_3(x_1, x_2) + 300\, d_4(x_1, x_2) \\ &= 200\sqrt{(x_1 - 5)^2 + (x_2 - 10)^2} + 150\sqrt{(x_1 - 10)^2 + (x_2 - 5)^2} \\ &\quad + 200\sqrt{x_1^2 + (x_2 - 12)^2} + 300\sqrt{(x_1 - 12)^2 + x_2^2}. \end{aligned} \tag{1.4}$$

Constraints The constraints on the distances between the antennas can be expressed as
$$\sqrt{(x_1 + 5)^2 + (x_2 - 10)^2} \geq 10 \tag{1.5}$$
and
$$\sqrt{(x_1 - 5)^2 + x_2^2} \geq 10. \tag{1.6}$$
We can combine the various stages of modeling to obtain the following optimization problem:
$$\begin{aligned} \min_{x \in \mathbb{R}^2} f(x_1, x_2) = {}& 200\sqrt{(x_1 - 5)^2 + (x_2 - 10)^2} \\ &+ 150\sqrt{(x_1 - 10)^2 + (x_2 - 5)^2} \\ &+ 200\sqrt{x_1^2 + (x_2 - 12)^2} \\ &+ 300\sqrt{(x_1 - 12)^2 + x_2^2} \end{aligned} \tag{1.7}$$
subject to (s.t.)
$$\sqrt{(x_1 + 5)^2 + (x_2 - 10)^2} \geq 10, \qquad \sqrt{(x_1 - 5)^2 + x_2^2} \geq 10. \tag{1.8}$$
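Before attempting to solve (1.7)–(1.8), it can be useful to evaluate the objective and check feasibility at trial locations. A minimal GNU Octave sketch, using only the data of Table 1.1 (the trial point is hypothetical):

    a = [5 10; 10 5; 0 12; 12 0];   % customer coordinates (Table 1.1)
    w = [200 150 200 300];          % hours of communication per customer
    f = @(x) w * sqrt((x(1) - a(:,1)).^2 + (x(2) - a(:,2)).^2);   % objective (1.7)
    feasible = @(x) sqrt((x(1)+5)^2 + (x(2)-10)^2) >= 10 && ...
                    sqrt((x(1)-5)^2 + x(2)^2)      >= 10;         % constraints (1.8)
    x0 = [12; 9];                   % hypothetical trial location
    printf("f = %.1f, feasible = %d\n", f(x0), feasible(x0));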

1.1.3 Château Laupt-Himum
The Château Laupt-Himum produces rosé wine and red wine by buying grapes from local producers. This year they can buy up to one ton of Pinot (a red grape) from a winegrower, paying € 3 per kilo. They can then vinify the grapes in two ways: either as a white wine to obtain a rosé wine or as a red wine to get Pinot Noir, a full-bodied red wine. The vinification of the rosé wine costs € 2 per kilo of grapes, while that of the Pinot Noir costs € 3.50 per kilo of grapes.
In order to take into account economies of scale, the Château wants to adjust the price of its wine to the quantity produced. The price for one liter of the rosé is € 15 minus a rebate of € 2 per hundred liters produced. Thus, if they produce 100 liters of rosé, they sell it for € 13 per liter. If they produce 200, they sell it for € 11 per liter. Similarly, they sell the Pinot Noir at a price of € 23 per liter, minus a rebate of € 1 per hundred liters produced.
How should the Château Laupt-Himum be organized in order to maximize its profit, when a kilo of grapes produces 1 liter of wine?
The modeling process consists of three steps.

Decision variables The strategy of the Château Laupt-Himum is to decide how


many liters of rosé wine and Pinot Noir to produce each year, and the number of
kilos of grapes to buy from the winegrower. Therefore, we define three decision
variables:
• x1 is the number of liters of rosé wine to produce each year,
• x2 is the number of liters of Pinot Noir to produce,
• x3 is the number of kilos of grapes to buy.
Objective function The objective of the Château Laupt-Himum is to maximize its
profit. This gain is the income from the wine sales minus the costs.
Each liter of rosé wine that is sold gives (in €)
$$15 - \frac{2}{100}\,x_1,$$
taking into account the reduction. Similarly, each liter of Pinot Noir gives (in €)
$$23 - \frac{1}{100}\,x_2.$$
The revenues corresponding to the production of $x_1$ liters of rosé wine and $x_2$ liters of Pinot Noir are equal to
$$x_1\left(15 - \frac{2}{100}x_1\right) + x_2\left(23 - \frac{1}{100}x_2\right).$$
It costs $3x_3$ to purchase the grapes. To produce a liter of wine, they need one kilo of vinified grapes, which costs € 2 for the rosé and € 3.50 for the Pinot Noir. The total costs are
$$2x_1 + 3.5\,x_2 + 3x_3.$$
The objective function that the Château Laupt-Himum should maximize is
$$x_1\left(15 - \frac{2}{100}x_1\right) + x_2\left(23 - \frac{1}{100}x_2\right) - (2x_1 + 3.5\,x_2 + 3x_3).$$
Constraints The Château cannot buy more than 1 ton of grapes from the winegrower, i.e.,
$$x_3 \leq 1\,000.$$
Moreover, they cannot produce more wine than is possible with the amount of grapes purchased. As one kilo of grapes produces one liter of wine, we have
$$x_1 + x_2 \leq x_3.$$
It is necessary to add constraints which are, although apparently trivial at the application level, essential to the validity of the mathematical model. These constraints specify the nature of the decision variables. In the case of Château Laupt-Himum, negative values of these variables would have no valid interpretation. It is necessary to impose
$$x_1 \geq 0, \quad x_2 \geq 0, \quad x_3 \geq 0. \tag{1.9}$$
We combine the modeling steps to obtain the following optimization problem:
$$\max_{x \in \mathbb{R}^3} f(x) = x_1\left(15 - \frac{2}{100}x_1\right) + x_2\left(23 - \frac{1}{100}x_2\right) - (2x_1 + 3.5\,x_2 + 3x_3) \tag{1.10}$$
subject to
$$x_1 + x_2 \leq x_3, \quad x_3 \leq 1\,000, \quad x_1 \geq 0, \quad x_2 \geq 0, \quad x_3 \geq 0. \tag{1.11}$$
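A quick GNU Octave sketch of the profit function (1.10); the production plan below is hypothetical, chosen only to satisfy the constraints (1.11):

    profit = @(x) x(1)*(15 - 2*x(1)/100) + x(2)*(23 - x(2)/100) ...
                  - (2*x(1) + 3.5*x(2) + 3*x(3));
    x = [300; 500; 800];   % hypothetical plan: 300 + 500 <= 800 <= 1000
    profit(x)              % 2700 + 9000 - 4750 = 6950 (euros)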

1.1.4 Euclid
In about 300 BC, the Greek mathematician Euclid was interested in the following
geometry problem: what is the rectangle with the greatest area among the rectangles
with given perimeter L? This is considered as one of the first known optimization
problems in history. We write it in three steps.


Decision variables The decision variables are the length x1 and the height x2 of the
rectangle, expressed in any arbitrary unit.
Objective function We are looking for the rectangle with maximum area. There-
fore, the objective function is simply equal to x1 x2 .
Constraints The total length of the edges of the rectangle must be equal to L, that is
$$2x_1 + 2x_2 = L. \tag{1.12}$$
Moreover, the dimensions $x_1$ and $x_2$ must be non negative:
$$x_1 \geq 0 \text{ and } x_2 \geq 0. \tag{1.13}$$
Combining everything, we obtain the following optimization problem:
$$\max_{x \in \mathbb{R}^2} x_1 x_2 \tag{1.14}$$
subject to
$$2x_1 + 2x_2 = L, \quad x_1 \geq 0, \quad x_2 \geq 0. \tag{1.15}$$
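As a sanity check on the model, the equality constraint can be used to eliminate $x_2$, reducing (1.14)–(1.15) to a one-dimensional problem whose solution is the square:
$$x_2 = \frac{L}{2} - x_1, \qquad \max_{0 \leq x_1 \leq L/2} \; x_1\left(\frac{L}{2} - x_1\right) \;\Longrightarrow\; x_1^* = x_2^* = \frac{L}{4}.$$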

1.1.5 Agent 007


James Bond, secret agent 007, has a mission to defuse a nuclear bomb on a yacht
moored 100 meters from shore. Currently, James Bond is 80 meters from the nearest
point to the yacht on the beach. He is capable of running on the beach at 20 km/h
and swimming at 6 km/h. Given that he needs 30 seconds to defuse the bomb, and
that the bomb is programmed to explode in 102 seconds, will James Bond have the
time to save the free world? This crucial issue, illustrated in Figure 1.2, can be solved
by an optimization problem.
The modeling process consists of three steps.
Decision variables The decision that James Bond must make in order to arrive as
fast as possible on the yacht is to choose when to stop running on the beach and
start swimming towards the yacht. We define a decision variable x representing
the distance in meters to run on the beach before jumping into the water.
Objective function Since the objective is to minimize the time to get to the yacht,
the objective function associates a decision x with the corresponding time in sec-
onds. Since James Bond runs at 20 km/h, the x meters on the beach are covered
in 3.6 x/20 seconds, i.e., 18x/100 seconds. From there, he swims a distance of
$$\sqrt{100^2 + (80 - x)^2} \tag{1.16}$$
Figure 1.2: The setting for James Bond (the yacht lies 100 m offshore; 007 runs x of the 80 m of beach before swimming)

at a speed of 6 km/h. This takes him
$$\frac{3.6}{6}\sqrt{100^2 + (80 - x)^2} = 0.6\sqrt{100^2 + (80 - x)^2} \tag{1.17}$$
seconds. The objective function is
$$f(x) = \frac{18}{100}\,x + 0.6\sqrt{100^2 + (80 - x)^2}. \tag{1.18}$$
Note that the trivial decisions such as x = 0 and x = 80 have disastrous conse-
quences for the future of the planet. An optimization approach is essential.
Constraints A secret agent like James Bond suffers no constraint! However, it makes
sense to require that he does not run backward or beyond the yacht, that is,

0 ≤ x ≤ 80 .

We can combine the different steps of the modeling to obtain the following optimization problem:
$$\min_{x \in \mathbb{R}} f(x) = \frac{18}{100}\,x + 0.6\sqrt{100^2 + (80 - x)^2} \tag{1.19}$$
subject to
$$x \geq 0, \quad x \leq 80. \tag{1.20}$$
This example is inspired by Walker (1999).
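A minimal GNU Octave sketch, solving (1.19)–(1.20) with the built-in one-dimensional minimizer fminbnd, suggests that 007 makes it with less than half a second to spare:

    f = @(x) 18/100 * x + 0.6 * sqrt(100^2 + (80 - x).^2);   % objective (1.18)
    [xstar, tmin] = fminbnd(f, 0, 80);      % minimize over [0, 80]
    printf("run %.1f m, travel %.1f s, total %.1f s\n", xstar, tmin, tmin + 30);
    % run 48.5 m, travel 71.6 s, total 101.6 s: just under the 102 s deadline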

1.1.6 Indiana Jones


During his quest to find the cross of Coronado, the famous archaeologist Indiana
Jones gets stuck facing a huge room filled with Pseudechis porphyriacus, venomous
snakes. This room is 10 meters long and 5 meters high. Given the aversion the adventurer has for these reptiles, it is impossible for him to wade through them, and he considers
passing over them. However, the roof is not strong enough, so he cannot walk on it.
Ever ingenious, he places the end of a ladder on the ground, blocked by a boulder,
leans it on the wall, and uses it to reach the other end of the room (Figure 1.3). Once
there, he uses his whip to get down to the floor on the other side of the snake room.
Where on the floor must he place the end of the ladder, so that the length used is as
small as possible, and the ladder thus less likely to break under his weight? We write
the optimization problem that would help our hero.

Figure 1.3: The setting for Indiana Jones (room of length ℓ = 10 m and wall of height h = 5 m; the ladder touches the floor at x1 and reaches height x2)

The modeling process consists of three steps.


Decision variables The decision variable is the position on the floor of the end of
the ladder. To facilitate our modeling work, let us also use a decision variable for
the other end of the ladder. Thus
• x1 is the position on the floor of the ladder’s end,
• x2 is the height of the other end of the ladder at the other end of the room.
Objective function Since the objective is to find the smallest possible ladder, we minimize its length, i.e.,
$$f(x) = \sqrt{x_1^2 + x_2^2}.$$
Constraints The positions $x_1$ and $x_2$ should be such that the ladder leans exactly on the edge of the wall of the room. By using similar triangles, this constraint can be expressed as
$$\frac{x_2}{x_1} = \frac{h}{x_1 - \ell} = \frac{x_2 - h}{\ell},$$
or
$$x_1 x_2 - h x_1 - \ell x_2 = 0.$$
Finally, the ends of the ladder must be outside the room, and

x1 ≥ ℓ and x2 ≥ h .

We can combine the modeling steps to obtain the following optimization problem:
$$\min_{x \in \mathbb{R}^2} \sqrt{x_1^2 + x_2^2} \tag{1.21}$$
subject to
$$x_1 x_2 - h x_1 - \ell x_2 = 0, \quad x_1 \geq \ell, \quad x_2 \geq h. \tag{1.22}$$
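A minimal GNU Octave sketch; it eliminates $x_2$ through the equality constraint of (1.22), writing $x_2 = h x_1 / (x_1 - \ell)$ (a technique discussed in Section 3.4), and minimizes the length over $x_1 > \ell$:

    h = 5; l = 10;
    x2  = @(x1) h * x1 ./ (x1 - l);          % from x1*x2 - h*x1 - l*x2 = 0
    len = @(x1) sqrt(x1.^2 + x2(x1).^2);     % ladder length
    [x1s, Ls] = fminbnd(len, l + 1e-3, 100); % search to the right of x1 = l
    printf("x1 = %.2f m, x2 = %.2f m, length = %.2f m\n", x1s, x2(x1s), Ls);
    % x1 = 16.30, x2 = 12.94, minimal ladder length = 20.81 m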

1.1.7 Geppetto
The company Geppetto Inc. produces wooden toys. It specializes in the manufacture of soldiers and trains. Each soldier is sold for € 27 and costs € 24 in raw material. To produce a soldier, 1 hour of carpentry labor is required, as well as 2 hours of finishing. Each train is sold for € 21 and costs € 19 in raw material. To produce a train, it takes 1 hour of carpentry labor and 1 hour of finishing. Geppetto has two carpenters and two finishing specialists, each working 40 hours per week. He himself puts in 20 hours per week on finishing work. The trains are popular, and he knows that he can always sell his entire production. However, he is not able to sell more than 40 soldiers per week. What should Geppetto do to optimize his income?
The modeling process is organized in three steps.
Decision variables Geppetto’s strategy is to determine the number of soldiers and
trains to produce per week. Thus, we define two decision variables:
• x1 is the number of soldiers to produce per week,
• x2 is the number of trains to produce per week.
Objective function The objective of Geppetto is to make as much money as possible. The quantity to maximize is the profit. Geppetto's gains consist of the income from the sales of the toys, minus the cost of the raw material. Geppetto's income is the sum of the income from the sales of the soldiers and the sales of the trains. The income (in €) from the sales of the soldiers is the number of soldiers sold multiplied by the selling price of one soldier, i.e., $27x_1$. The income from the sales of the trains is $21x_2$. The total income for Geppetto is $27x_1 + 21x_2$. Similarly, we evaluate the material costs to $24x_1 + 19x_2$. The gain is
$$(27x_1 + 21x_2) - (24x_1 + 19x_2), \tag{1.23}$$
or
$$f(x) = 3x_1 + 2x_2. \tag{1.24}$$

Constraints The production is subject to three main constraints. First, the available
labor does not allow more than 100 finishing hours (performed by two workers and
Geppetto) and 80 hours of carpentry (performed by two carpenters). Furthermore,
to avoid unsold objects, they should not produce more than 40 soldiers per week.
The number of hours per week of finishing is the number of hours of finishing
for the soldiers, multiplied by the number of soldiers produced, plus the hours of
finishing for the trains, multiplied by the number of trains produced. This number
may not exceed 100, and we can express the following constraint:

2x1 + x2 ≤ 100 . (1.25)

A similar analysis of the carpentry resources leads to the following constraint:

x1 + x2 ≤ 80 . (1.26)

Finally, the constraint to avoid unsold products can simply be written as:

x1 ≤ 40 . (1.27)

At this stage, it seems that all the constraints of the problem have been described
mathematically. However, it is necessary to add constraints that are, although
apparently trivial at the application level, essential to the validity of the mathe-
matical model. These constraints specify the nature of the decision variables. In
the case of Geppetto, it is not conceivable to produce parts of trains or soldiers.
The decision variables must absolutely take integer values, so that

x1 ∈ N , x2 ∈ N . (1.28)

We combine the different stages of the modeling to obtain the following optimization problem:
$$\max_{x \in \mathbb{N}^2} f(x) = 3x_1 + 2x_2, \tag{1.29}$$
subject to
$$2x_1 + x_2 \leq 100, \quad x_1 + x_2 \leq 80, \quad x_1 \leq 40. \tag{1.30}$$

Interestingly, the constraints (1.28), albeit seemingly trivial, significantly complicate the optimization methods. Most of the book is devoted to problems where the integrality of the variables is not imposed. An introduction to discrete optimization is provided in Part VII. This example is inspired by Winston (1994).
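Problem (1.29)–(1.30) is small enough to be solved directly. A minimal sketch using GNU Octave's glpk interface to the GLPK solver (the expected solution is stated as a comment):

    c  = [3; 2];             % profit coefficients of (1.29)
    A  = [2 1; 1 1; 1 0];    % left-hand sides of (1.30)
    b  = [100; 80; 40];      % right-hand sides of (1.30)
    lb = [0; 0];
    [x, fmax] = glpk(c, A, b, lb, [], "UUU", "II", -1);
    % "U": rows are <= constraints, "I": integer variables, -1: maximize
    % expected solution: x = [20; 60], weekly profit fmax = 180 euros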

1.2 Problem transformations

Even though the modeling step is completed, we are not yet out of the woods. In-
deed, there are many ways to write a given problem mathematically. Algorithms and
software often require a special formulation based on which they solve the problem.
In this chapter, we study techniques that help us comply with these requirements.

Obtaining the mathematical formulation of a problem does not necessarily end the
modeling process. In fact, the obtained formulation may not be adequate. In particular, the software capable of solving optimization problems often requires the problems to be formulated in a specific way, not necessarily corresponding to the result of
the approach described in Section 1.1. We review some rules for transforming an
optimization problem into another equivalent problem.

Definition 1.3 (Equivalence). Two optimization problems P1 and P2 are said to be equivalent if we can create a feasible point (i.e., satisfying the constraints) in P2 from a feasible point in P1 (and vice versa) with the same value of the objective function. In particular, the two problems have the same optimal cost, and we can obtain a solution of P2 from a solution of P1 (and vice versa).

1.2.1 Simple transformations


Here are some simple transformations that are often used in modeling.
1. Consider the optimization problem

$$\min_{x \in X \subseteq \mathbb{R}^n} f(x),$$
where $X$ is a subset of $\mathbb{R}^n$. Consider a function $g : \mathbb{R} \to \mathbb{R}$ that is strictly increasing on $\operatorname{Im}(f) = \{z \mid \exists x \in X \text{ such that } z = f(x)\}$, i.e., for any $z_1, z_2 \in \operatorname{Im}(f)$, $g(z_1) > g(z_2)$ if and only if $z_1 > z_2$. Thus,²
$$\operatorname{argmin}_{x \in X \subseteq \mathbb{R}^n} f(x) = \operatorname{argmin}_{x \in X \subseteq \mathbb{R}^n} h(x), \tag{1.31}$$
where $h(x) = g\big(f(x)\big)$, and
$$\min_{x \in X \subseteq \mathbb{R}^n} g\big(f(x)\big) = g\left(\min_{x \in X \subseteq \mathbb{R}^n} f(x)\right). \tag{1.32}$$

In particular, adding or subtracting a constant to the objective function of an optimization problem does not change its solution:
$$\forall c \in \mathbb{R}, \quad \operatorname{argmin}_{x \in X \subseteq \mathbb{R}^n} f(x) = \operatorname{argmin}_{x \in X \subseteq \mathbb{R}^n} \big(f(x) + c\big). \tag{1.33}$$
² The operator argmin identifies the values of the decision variables that reach the minimum, while the operator min identifies the value corresponding to the objective function. See Appendix A.

Similarly, if the function f generates only positive values, taking the logarithm of the objective function or taking its square does not change its solution:
$$\operatorname{argmin}_{x \in X \subseteq \mathbb{R}^n} f(x) = \operatorname{argmin}_{x \in X \subseteq \mathbb{R}^n} \log f(x), \tag{1.34}$$
and
$$\operatorname{argmin}_{x \in X \subseteq \mathbb{R}^n} f(x) = \operatorname{argmin}_{x \in X \subseteq \mathbb{R}^n} \big(f(x)\big)^2, \tag{1.35}$$
as $g(x) = x^2$ is strictly increasing for $x \geq 0$. Note that the log transformation is typically used in the context of maximum likelihood estimation of unknown parameters in statistics, where the objective function is a probability and, therefore, is positive. The square transform is notably relevant when f(x) is expressed as a square root (see the example on Indiana Jones in Section 1.1.6). In this case, the square root can be omitted.
2. A maximization problem whose objective function is f(x) is equivalent to a minimization problem whose objective function is −f(x):
$$\operatorname{argmax}_x f(x) = \operatorname{argmin}_x -f(x), \tag{1.36}$$
and
$$\max_x f(x) = -\min_x -f(x). \tag{1.37}$$
Similarly, we have
$$\operatorname{argmin}_x f(x) = \operatorname{argmax}_x -f(x), \tag{1.38}$$
and
$$\min_x f(x) = -\max_x -f(x). \tag{1.39}$$

3. A constraint defined by a lower inequality can be multiplied by −1 to get an upper inequality:

      g(x) ≤ 0 ⇐⇒ −g(x) ≥ 0.   (1.40)

4. A constraint defined by an equality can be replaced by two constraints defined by inequalities:

      g(x) = 0 ⇐⇒ g(x) ≤ 0 and g(x) ≥ 0.   (1.41)

   Note that this transformation is primarily used when constraints are linear. When g(x) is non linear, this transformation is generally not recommended.
5. Some software requires that all decision variables be non negative. However, for problems such as the Swisscom example described in Section 1.1.2, such restrictions are not relevant. If a variable x can take any real value, it is then replaced by two artificial variables denoted by x⁺ and x⁻, such that

      x = x⁺ − x⁻.   (1.42)

   In this case, we can meet the requirements of the software and impose x⁺ ≥ 0 and x⁻ ≥ 0, without loss of generality.
6. In the presence of a constraint x ≥ a, with a ∈ R, a simple change of variable

      x = x̃ + a   (1.43)

   transforms the constraint into

      x̃ ≥ 0.   (1.44)
To illustrate these transformations, let us consider the following optimization problem:

   max_{x,y} −x² + sin y   (1.45)

subject to

   6x − y² ≥ 1   (1.46)
   x² + y² = 3   (1.47)
   x ≥ 2   (1.48)
   y ∈ R,   (1.49)

and transform it in such a way as to obtain a minimization problem, in which all the decision variables are non negative, and all constraints are defined by lower inequalities. We get the following problem:

   min_{x̃, y⁺, y⁻} (x̃ + 2)² − sin(y⁺ − y⁻)   (1.50)

subject to

   −6(x̃ + 2) + (y⁺ − y⁻)² + 1 ≤ 0   (1.51)
   (x̃ + 2)² + (y⁺ − y⁻)² − 3 ≤ 0   (1.52)
   −(x̃ + 2)² − (y⁺ − y⁻)² + 3 ≤ 0   (1.53)
   x̃ ≥ 0   (1.54)
   y⁺ ≥ 0   (1.55)
   y⁻ ≥ 0,   (1.56)

where
(1.50) is obtained by applying (1.37) to (1.45),
(1.51) is obtained by applying (1.40) to (1.46),
(1.52) and (1.53) are obtained by applying (1.41) to (1.47),
(1.54) is obtained by applying (1.43) to (1.48),
(1.55) and (1.56) are obtained by applying (1.42) to (1.49).
Note that the transformed problem has more decision variables (3 instead of 2 for the original problem) and more constraints (6 instead of 3 for the original problem).
When the solution (x̃∗, y⁺∗, y⁻∗) of (1.50)–(1.56) is available, it is easy to obtain the solution to the original problem (1.45)–(1.49) by applying the inverse transformations, i.e.,

   x∗ = x̃∗ + 2
   y∗ = y⁺∗ − y⁻∗.   (1.57)
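To make the mechanics concrete, here is a minimal Python sketch (an illustration, not part of the book) that checks numerically that the transformed problem (1.50)–(1.56) matches the original problem (1.45)–(1.49) under the change of variables (1.57). All function names are hypothetical.

```python
import math
import random

# Original objective (1.45): max -x^2 + sin y.
def f_orig(x, y):
    return -x**2 + math.sin(y)

# Transformed objective (1.50): min (xt + 2)^2 - sin(yp - ym).
def f_trans(xt, yp, ym):
    return (xt + 2)**2 - math.sin(yp - ym)

random.seed(0)
for _ in range(1000):
    # Non negative variables, as required by (1.54)-(1.56).
    xt, yp, ym = (random.uniform(0, 5) for _ in range(3))
    x, y = xt + 2, yp - ym            # inverse transformations (1.57)
    # Maximizing f_orig amounts to minimizing f_trans: at corresponding
    # points the two objectives are opposite numbers, as in (1.37).
    assert abs(f_trans(xt, yp, ym) + f_orig(x, y)) < 1e-12
    # Constraint (1.46) and its transformed version (1.51) have opposite
    # left-hand sides, so they are satisfied at exactly the same points.
    g1 = 6 * x - y**2 - 1                    # (1.46) written as g1 >= 0
    t1 = -6 * (xt + 2) + (yp - ym)**2 + 1    # (1.51) written as t1 <= 0
    assert abs(g1 + t1) < 1e-9
print("transformed problem matches the original on 1000 random points")
```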
1.2.2 Slack variables


A slack variable is introduced to replace an inequality constraint by an equality con-
straint. Such a variable should be non negative. There are several ways to define a
slack variable.
• The slack variable y is introduced directly in the specification, and its value is explicitly restricted to be non negative:

   g(x) ≤ 0 ⇐⇒ g(x) + y = 0 and y ≥ 0.   (1.58)
The above specification does not completely eliminate inequality constraints. However, it considerably simplifies the nature of these constraints.
• The slack variable is introduced indirectly, using a specification enforcing its non negativity. For example, the slack variable can be defined as y = z², and

   g(x) ≤ 0 ⇐⇒ g(x) + z² = 0.   (1.59)

• The slack variable can also be introduced indirectly using an exponential, that is, y = exp(z), and

   g(x) ≤ 0 ⇐⇒ g(x) + e^z = 0.   (1.60)
The limitation of this approach is that there is no value of z such that g(x) + e^z = 0 when the constraint is active, that is, when g(x) = 0. Strictly speaking, the two specifications are not equivalent. However, the slack variable exp(z) can be made as close to zero as desired by decreasing the value of z, as

   lim_{z→−∞} e^z = 0.   (1.61)

So, loosely speaking, we can say that the two specifications are “asymptotically equivalent.”
The slack variable (z², e^z or y) thus introduced measures the distance between the constraint and the point x. The special status of such variables can be effectively exploited for solving optimization problems.

Definition 1.4 (Slack variable). A slack variable is a decision variable introduced in an optimization problem to transform an inequality constraint into an equality constraint, possibly with a non negativity constraint.

For example, we consider the following optimization problem:

   min_{x₁, x₂} x₁² − x₂²   (1.62)

subject to

   sin x₁ ≤ π/2   (1.63)
   ln(e^{x₁} + e^{x₂}) ≥ √e   (1.64)
   x₁ − x₂ ≤ 100.   (1.65)

We introduce the slack variable z₁ for the constraint (1.63), the slack variable z₂ for the constraint (1.64), and the slack variable y₃ for the constraint (1.65). The obtained optimization problem is

   min_{x₁, x₂, z₁, z₂, y₃} x₁² − x₂²,   (1.66)

subject to

   sin x₁ + z₁² = π/2   (1.67)
   ln(e^{x₁} + e^{x₂}) − e^{z₂} = √e   (1.68)
   x₁ − x₂ + y₃ = 100   (1.69)
   y₃ ≥ 0.   (1.70)

Note that the objective function is not affected by the introduction of slack variables.
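The three kinds of slack variables can be recovered numerically from any point satisfying the inequalities; the following Python sketch (an illustration, not part of the book, with hypothetical function names) does so for the example above.

```python
import math

def slacks(x1, x2):
    # Residuals of the inequalities (1.63)-(1.65), all >= 0 when feasible.
    s1 = math.pi / 2 - math.sin(x1)
    s2 = math.log(math.exp(x1) + math.exp(x2)) - math.sqrt(math.e)
    y3 = 100 - (x1 - x2)                  # slack of type y, as in (1.69)
    z1 = math.sqrt(s1)                    # slack of type z^2, as in (1.67)
    # The exponential slack z2 = ln(s2) only exists when s2 > 0: if the
    # constraint is active (s2 = 0), the equality (1.68) has no solution.
    # This is the "asymptotic equivalence" discussed in Section 1.2.2.
    z2 = math.log(s2) if s2 > 0 else -math.inf
    return z1, z2, y3

z1, z2, y3 = slacks(1.0, 2.0)
# The equality constraints (1.67)-(1.69) hold at the recovered slacks.
assert abs(math.sin(1.0) + z1**2 - math.pi / 2) < 1e-12
assert abs(math.log(math.exp(1.0) + math.exp(2.0))
           - math.exp(z2) - math.sqrt(math.e)) < 1e-12
assert abs(1.0 - 2.0 + y3 - 100) < 1e-12
```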

1.3 Hypotheses
The methods and algorithms presented in this book are not universal. Each approach
is subject to assumptions about the structure of the underlying problem. Specific
assumptions are discussed for each method. However, there are three important
assumptions that concern (almost) the whole book: the continuity hypothesis, the
differentiability hypothesis, and the determinism hypothesis.
The continuity hypothesis consists in only considering problems for which the
objective to optimize and the constraints are modeled by continuous functions of
decision variables. This hypothesis excludes problems with integer variables (see
discussion in Section 1.1.7). Such problems are treated by discrete optimization
or integer programming.³ An introduction to discrete optimization is provided in
Part VII of this book. We also refer the reader to Wolsey (1998) for an introduction to
combinatorial optimization, as well as to Schrijver (2003), Bertsimas and Weismantel
(2005), and Korte and Vygen (2007).
The differentiability hypothesis (which obviously implies the continuity hypoth-
esis) also requires that the functions involved in the model are differentiable. Non
differentiable optimization is the subject of books such as Boyd and Vandenberghe
(2004), Bonnans et al. (2006), and Dem’Yanov et al. (2012).
The determinism hypothesis consists in ignoring possible errors in the data for the
problem. Measurement errors, as well as modeling errors, can have a non negligible
impact on the outcome. Stochastic optimization (Birge and Louveaux, 1997) enables
the use of models in which some pieces of data are represented by random variables.
Robust optimization (see Ben-Tal and Nemirovski, 2001 and Ben-Tal et al., 2009)
produces solutions that are barely modified by slight disturbances in the data of the
problem.
3 The term “programming”, used in the sense of optimization, was introduced during the Second
World War.

It is crucial to be aware of these hypotheses in the modeling stage. The use of inappropriate techniques can lead to erroneous results. For instance, it is shown in Section 25.4 that solving a discrete optimization problem by solving a continuous version (called the relaxation) and then rounding the solutions to the closest integer is inappropriate.
1.4 Problem definition


We now outline the main concepts that define the optimization problems, and analyze
the desired properties.
We consider the following optimization problem:

   min_{x ∈ Rn} f(x)   (1.71)

subject to

   h(x) = 0   (1.72)
   g(x) ≤ 0   (1.73)

and

   x ∈ X,   (1.74)

where f is a function from Rn to R, h is a function from Rn to Rm, g is a function from Rn to Rp, and X ⊆ Rn is a convex set (Definition B.2). We say that this is an optimization problem with n decision variables, m equality constraints, and p inequality constraints. We assume that n > 0, i.e., that the problem involves at least one decision variable. However, we also consider problems where m and p are zero, as well as problems where X = Rn. By employing the transformations described in Section 1.2, it is possible to express any optimization problem satisfying the hypotheses described in Section 1.3 in the form (1.71)–(1.74).
We consider two types of solutions to this problem: local minima, which are at least as good as every feasible point in a neighborhood around them, and global minima, which are at least as good as every feasible point.


Definition 1.5 (Local minimum). Let Y = {x ∈ Rn | h(x) = 0, g(x) ≤ 0 and x ∈ X} be the feasible set, that is, the set of vectors satisfying all constraints. The vector x∗ ∈ Y is called a local minimum of the problem (1.71)–(1.74) if there exists ε > 0 such that

   f(x∗) ≤ f(x), ∀x ∈ Y such that ‖x − x∗‖ < ε.   (1.75)

The notation ‖ · ‖ denotes a vector norm on Rn (Definition B.1).



Definition 1.6 (Strict local minimum). Let Y = {x ∈ Rn | h(x) = 0, g(x) ≤ 0 and x ∈ X} be the feasible set, that is, the set of vectors satisfying all the constraints. The vector x∗ ∈ Y is called a strict local minimum of the problem (1.71)–(1.74) if there exists ε > 0 such that

   f(x∗) < f(x), ∀x ∈ Y, x ≠ x∗ such that ‖x − x∗‖ < ε.   (1.76)


Definition 1.7 (Global minimum). Let Y = {x ∈ Rn | h(x) = 0, g(x) ≤ 0 and x ∈ X} be the feasible set, that is, the set of vectors satisfying all the constraints. The vector x∗ ∈ Y is called a global minimum of the problem (1.71)–(1.74) if

   f(x∗) ≤ f(x), ∀x ∈ Y.   (1.77)


Definition 1.8 (Strict global minimum). Let Y = {x ∈ Rn | h(x) = 0, g(x) ≤ 0 and x ∈ X} be the feasible set, that is, the set of vectors satisfying all the constraints. The vector x∗ ∈ Y is called a strict global minimum of the problem (1.71)–(1.74) if

   f(x∗) < f(x), ∀x ∈ Y, x ≠ x∗.   (1.78)

The notions of local maximum, strict local maximum, global maximum and strict
global maximum are defined in a similar manner.
Example 1.9 (Local minimum and maximum). Figure 1.4 illustrates these definitions for the function

   f(x) = −(5/6)x³ + (7/2)x² − (11/3)x + 3.   (1.79)

The point x∗ ≃ 0.6972 is a local minimum of f. Indeed, there is an interval (x∗ − 1/2, x∗ + 1/2), represented by the dotted lines, such that f(x∗) ≤ f(x) for any x in the interval. Similarly, the point x̃∗ ≃ 2.1024 is a local maximum. Indeed, there exists an interval (x̃∗ − 1/2, x̃∗ + 1/2), represented by the dotted lines, such that f(x̃∗) ≥ f(x) for any x in the interval.

Example 1.10 (Binary optimization). Let f : Rn → R be differentiable, and let us take the optimization problem

   min f(x)

with constraints x ∈ {0, 1}ⁿ, i.e., such that each variable can take only the value 0 or 1. However, we can express the constraints in the following manner:

   hᵢ(x) = xᵢ(1 − xᵢ) = 0, i = 1, . . . , n.
Figure 1.4: Local minimum and maximum for Example 1.9 (plot of f(x) for 0 ≤ x ≤ 3, with the local minimum x∗ and the local maximum x̃∗ marked)

We thus get n differentiable constraints and the hypotheses are satisfied. However,
since each feasible point is isolated, each of them is a local minimum of the problem.
Therefore, algorithms for continuous optimization designed to identify local minima
are useless for this type of problem.

In general, it is desirable to exploit the particular structure of the problem, because an excess of generality penalizes optimization algorithms. We analyze special cases of the problem (1.71)–(1.74).
It is important to note that the existence of a solution is not always guaranteed. For example, the problem min_{x∈R} f(x) = x has neither a minimum nor a maximum, whether local or global. In fact, this function is not bounded, in the sense that it can take on values that are arbitrarily large and arbitrarily small.

Definition 1.11 (Function bounded from below). The function f : Rn → R is bounded from below on Y ⊆ Rn if there exists a real M such that

   f(x) ≥ M, ∀x ∈ Y.   (1.80)

The function f(x) = x is unbounded on R. However, it is bounded on compact subsets of R. Among all the bounds M of Definition 1.11, the largest is called the infimum of f.

Definition 1.12 (Infimum). Let f : Rn → R be a function bounded from below on the set Y ⊆ Rn. The infimum of f on Y is denoted by

   inf_{y∈Y} f(y)   (1.81)

and is such that

   inf_{y∈Y} f(y) ≤ f(x), ∀x ∈ Y   (1.82)

and

   ∀M > inf_{y∈Y} f(y), ∃x ∈ Y such that f(x) < M.   (1.83)

Example 1.13 (Infimum). Consider f(x) = e^x and Y = R. We have

   inf_{y∈Y} f(y) = 0.

We verify Definition 1.12. We have

   0 = inf_{y∈Y} f(y) ≤ f(x) = e^x, ∀x ∈ R,

and (1.82) is satisfied. Consider an arbitrary M > 0, and consider x = ln(M/2). In this case,

   f(x) = M/2 < M,

and the definition is satisfied.
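As a quick numerical illustration (a sketch, not part of the book), the construction x = ln(M/2) used above can be checked directly:

```python
import math

# For any M > 0, the point x = ln(M/2) gives f(x) = e^x = M/2 < M, so no
# M > 0 is a lower bound of e^x; the infimum is 0, as in Definition 1.12.
for M in [10.0, 1.0, 1e-3, 1e-9]:
    x = math.log(M / 2)
    assert math.exp(x) < M
    print(f"M = {M:8.1e}: e^x at x = ln(M/2) equals {math.exp(x):.3e} < M")
```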

This example shows that an optimization problem is not always well defined, and
that no solution may exist. Theorem 1.14 identifies cases where the problem is well-
defined, and where the infimum of a function on a set is reached by at least one
point of the set. In this case, the point is a global minimum of the corresponding
optimization problem.

Theorem 1.14 (Weierstrass theorem). Let Y ⊂ Rn be a closed and non empty subset of Rn, and let f : Y → R be a lower semi-continuous function (Definition B.21) on Y. If Y is compact (Definition B.23) or if f is coercive (that is, if it goes to +∞ when ‖x‖ goes to +∞, see Definition B.22), there exists x∗ ∈ Y such that

   f(x∗) = inf_{y∈Y} f(y).

Proof. Consider a sequence (xₖ)ₖ of elements of Y such that

   lim_{k→∞} f(xₖ) = inf_{y∈Y} f(y).

If Y is compact, it is bounded and the sequence has at least one limit point x∗. If f is coercive (Definition B.22), the sequence (xₖ)ₖ is bounded and has at least one limit point x∗. In both cases, due to the lower semi-continuity of f in x∗, we have

   f(x∗) ≤ lim_{k→∞} f(xₖ) = inf_{y∈Y} f(y),
and

   f(x∗) = inf_{y∈Y} f(y).

Note that some functions may include local minima and have no global minimum, as for Example 1.9, shown in Figure 1.4, where the function (1.79) is unbounded from below, as

   lim_{x→+∞} −(5/6)x³ + (7/2)x² − (11/3)x + 3 = −∞.

In some cases, an optimization problem with constraints can be simplified and the constraints ignored. Before detailing such simplifications in Section 3.1, we now state a general theoretical result when the optimum is an interior point of the set of constraints.

Definition 1.15 (Interior point). Consider Y ⊂ Rn and y ∈ Y. We say that y is in the interior of Y if there exists a neighborhood of y contained in Y. Formally, y is in the interior of Y if there exists ε > 0 such that all points in a neighborhood of size ε of y belong to Y, that is, such that

   ‖z − y‖ ≤ ε =⇒ z ∈ Y.   (1.84)

Theorem 1.16 (Solution in the interior of constraints). Let x∗ be a local minimum of the optimization problem (1.71)–(1.74). Let Y = {x ∈ Rn | h(x) = 0, g(x) ≤ 0 and x ∈ X} be the feasible set, and let Y = Y₁ ∩ Y₂ be such that x∗ is an interior point of Y₁. Then x∗ is a local minimum of the problem

   min_{x∈Y₂} f(x).

In particular, if x∗ is an interior point of Y, then the theorem applies with Y₂ = Rn and x∗ is a solution of the unconstrained optimization problem.

Proof. According to Definition 1.15, there exists ε₁ > 0 such that

   y ∈ Y₁, ∀y such that ‖y − x∗‖ ≤ ε₁.   (1.85)

Since x∗ is a local minimum (Definition 1.5), there exists ε₂ > 0 such that

   f(x∗) ≤ f(y), ∀y ∈ Y such that ‖y − x∗‖ ≤ ε₂.   (1.86)

Then, if ε = min(ε₁, ε₂), any point y ∈ Y₂ such that ‖y − x∗‖ ≤ ε belongs also to Y₁ and, therefore, is feasible. Consequently, x∗ is at least as good, in the sense of the objective function, as any of these y. Formally, we get

   f(x∗) ≤ f(y), ∀y ∈ Y₂ such that ‖y − x∗‖ ≤ ε,   (1.87)

which is exactly (1.75) where Y has been replaced by Y₂, and x∗ is a local minimum of the problem with only the Y₂ constraints.
This result is used particularly in the development of optimality conditions for problems with constraints (Chapter 6), as well as in the development of algorithms called interior point methods. Note that a problem containing equality constraints has no interior points.
Before detailing algorithms that enable an optimization problem to be solved, it is essential to properly understand the nature of this problem. In Chapter 2, we analyze the objective function in detail. Constraints are discussed in Chapter 3. Finally, in Chapter 4, we analyze the ways in which it is possible to combine the objective function and constraints.

1.5 Exercises
Exercise 1.1 (Geometry). We want to determine a parallelepiped with unit volume and minimal surface area.
1. Formulate this problem as an optimization problem by determining
(a) the decision variables,
(b) the objective function,
(c) the constraint(s).
2. Formulate this problem as a minimization problem with only lower inequality
constraints.
Exercise 1.2 (Finance). A bank wants to determine how to invest its assets for the
year to come. Currently, the bank has a million euros that it can invest in bonds, real
estate loans, leases, or personal loans. The annual interest rates of the different investment types are 6% for bonds, 10% for real estate loans, 8% for leases, and 13% for personal loans.
To minimize risk, the portfolio selected by the bank must satisfy the following
restrictions:
• The amount allocated to personal loans must not exceed half of that invested in
bonds.
• The amount allocated to real estate loans must not exceed that allocated to leases.
• At most 20% of the total invested amount can be allocated to personal loans.
1. Formulate this problem as an optimization problem by determining
(a) the decision variables,
(b) the objective function,
(c) the constraint(s).
2. Formulate this problem as a minimization problem with only lower inequality
constraints.
3. Formulate this problem as a maximization problem with equality constraints and
with non negativity constraints on the decision variables.

Exercise 1.3 (Stock management). The company Daubeliou sells oil and wants to
optimize the management of its stock. The annual demand is estimated at 100,000
liters and is assumed to be homogeneous throughout the year. The cost of storage
is €40 per thousand liters per year. When the company orders oil to replenish its stock, this costs €80. Assuming that the order arrives instantly, how many orders
must the company Daubeliou place each year to satisfy demand and minimize costs?
Formulate this problem as an optimization problem by determining
1. the decision variables,
2. the objective function,
3. the constraint(s).
Advice:
• Set the amount of oil to order each time as a decision variable.
• Represent in a graph the evolution of the stock as a function of time.
Exercise 1.4 (Measurement errors). Mr. O. Beese is obsessed with his weight. He
owns 10 scales and weighs himself each morning on each of them. This morning he
got the following measurement results:

100.8  99.4  101.3  97.6  102.5  102.4  104.6  102.6  95.1  96.6

He wishes to determine an estimate of his weight while minimizing the sum of the
squares of measurement errors from the 10 scales. Formulate the optimization prob-
lem that he needs to solve.
Exercise 1.5 (Congestion). Every day, 10,000 people commute from Divonne to
Geneva. By train, the journey takes 40 minutes. By road, the travel time depends on
the level of congestion. It takes 20 minutes when the highway is completely deserted.
When there is traffic, travel time increases by 5 minutes per thousand people using
the highway (assuming that there is one person per car). If 4,000 people take the car
and 6,000 take the train, the travel time by road is equal to 20 + 5 × 4 = 40 minutes,
which is identical to the travel time by train. In this situation, it would be of no
interest to change one’s mode of transport. We say that the system is at equilibrium.
However, from the point of view of the average travel time per person, is this situation
optimal? Formulate an optimization problem to answer the question.

Chapter 2

Objective function

Before attempting to understand how to solve a problem, we first try to understand the problem itself. In this chapter, we identify the properties of the objective function which are useful in the development of theory and algorithms. A crucial part of optimization is to understand the geometry of the problem. Derivatives play a central role in this analysis. We also identify what constitutes a good or a bad geometry for a problem.

Contents
2.1 Convexity and concavity
2.2 Differentiability: the first order
2.3 Differentiability: the second order
2.4 Linearity and non linearity
2.5 Conditioning and preconditioning
2.6 Exercises

Several concepts can be used to characterize the objective function. It is important to identify the characteristics of the objective function because each optimization algorithm is based on specific hypotheses. When the hypotheses of an algorithm have not been verified for a given problem, there is no guarantee that the algorithm can be used to solve this problem.

2.1 Convexity and concavity


A function f : Rn → R is said to be convex if, for any vectors x and y of Rn, the graph of f between x and y is not above the line segment connecting (x, f(x)) and (y, f(y)) in Rn+1. Definition 2.1 establishes this property formally.

Definition 2.1 (Convex function). A function f : Rn → R is said to be convex if, for any x, y ∈ Rn and for any λ ∈ [0, 1], we have

   f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y).   (2.1)
Definition 2.1 is shown in Figure 2.1. The line segment connecting the points (x, f(x)) and (y, f(y)) is never below the graph of f. The point z = λx + (1 − λ)y is somewhere between x and y when 0 ≤ λ ≤ 1. The point with coordinates (z, λf(x) + (1 − λ)f(y)) is on the line segment between the points (x, f(x)) and (y, f(y)). In order for the function to be convex, this point must never (i.e., for all x, y and 0 ≤ λ ≤ 1) be below the graph of the function.

Figure 2.1: Illustration of Definition 2.1

Figure 2.2 shows a nonconvex function, in which there are x and y such that the line segment connecting (x, f(x)) and (y, f(y)) is partially located below the graph of the function.
The notion of convexity of a function can be strengthened, giving strict convexity.

Definition 2.2 (Strictly convex function). A function f : Rn → R is said to be strictly convex if for all x, y ∈ Rn, x ≠ y, and for all λ ∈ ]0, 1[, we have

   f(λx + (1 − λ)y) < λf(x) + (1 − λ)f(y).   (2.2)

The convexity of the function is an important feature in optimization. In general, when the function is not convex, it is particularly difficult to identify a global minimum of the problem (1.71)–(1.74).
Note that the importance of convexity is linked to minimization problems. When we study maximization problems, the concept of concavity should be used, as shown

Figure 2.2: Illustration of a counterexample to Definition 2.1

by Definition 2.3. As discussed in Section 1.2.1, a maximization problem can always be easily transformed into a minimization problem using (1.37).

Definition 2.3 (Concave function). A function f : Rn → R is said to be concave if −f is a convex function, i.e., if for all x, y ∈ Rn and for all λ ∈ [0, 1], we have

   f(λx + (1 − λ)y) ≥ λf(x) + (1 − λ)f(y).   (2.3)

Note that convexity and concavity are not complementary properties. A function
may be neither convex nor concave. This is the case of the function represented in
Figure 2.2.

2.2 Differentiability: the first order


An important assumption in this book is that the objective function f and the func-
tions g and h describing the constraints are continuous (Definition B.5) and differen-
tiable. We summarize here the main concepts related to differentiability.

Definition 2.4 (Partial derivative). Let f : Rn → R be a continuous function. The function ∇ᵢf(x) : Rn → R, also written as ∂f(x)/∂xᵢ, is called the ith partial derivative of f and is defined as

   lim_{α→0} (f(x₁, . . . , xᵢ + α, . . . , xₙ) − f(x₁, . . . , xᵢ, . . . , xₙ)) / α.   (2.4)

This limit may not exist.

If the partial derivatives ∂f(x)/∂xi exist for all i, the gradient of f is defined as
follows.

Definition 2.5 (Gradient). Let f : Rn → R be a differentiable function. The function denoted by ∇f(x) : Rn → Rn is called the gradient of f and is defined as

   ∇f(x) = (∂f(x)/∂x₁, . . . , ∂f(x)/∂xₙ)ᵀ.   (2.5)

The gradient plays a key role in the development and analysis of optimization algorithms.

Example 2.6 (Gradient). Consider f(x₁, x₂, x₃) = e^{x₁} + x₁²x₃ − x₁x₂x₃. The gradient of f is given by

   ∇f(x₁, x₂, x₃) = (e^{x₁} + 2x₁x₃ − x₂x₃, −x₁x₃, x₁² − x₁x₂)ᵀ.   (2.6)
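As a quick sanity check (a sketch, not part of the book), the analytical gradient (2.6) can be compared with finite difference approximations of the partial derivatives of Definition 2.4:

```python
import math

def f(x):
    # f(x1, x2, x3) = exp(x1) + x1^2 x3 - x1 x2 x3
    return math.exp(x[0]) + x[0]**2 * x[2] - x[0] * x[1] * x[2]

def grad_f(x):
    # Analytical gradient (2.6).
    return [math.exp(x[0]) + 2 * x[0] * x[2] - x[1] * x[2],
            -x[0] * x[2],
            x[0]**2 - x[0] * x[1]]

def grad_fd(x, h=1e-6):
    # Forward finite differences (f(x + h e_i) - f(x)) / h, as in (2.4).
    g = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        g.append((f(xp) - f(x)) / h)
    return g

x = [1.0, -2.0, 0.5]
for exact, approx in zip(grad_f(x), grad_fd(x)):
    assert abs(exact - approx) < 1e-4
print(grad_f(x))
```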

The analysis of the behavior of the function in certain directions is also important
for optimization methods. We introduce the concept of a directional derivative.

Definition 2.7 (Directional derivative). Let f : Rn → R be a continuous function. Consider x ∈ Rn and d ∈ Rn. The directional derivative of f in x in the direction d is given by

   lim_{α→0⁺} (f(x + αd) − f(x)) / α,   (2.7)

if the limit exists. In addition, when the gradient exists, the directional derivative is the scalar product between the gradient of f and the direction d, i.e.,

   ∇f(x)ᵀd.   (2.8)

Definition 2.8 (Differentiable function). Let f : Rn → R be a continuous function. If, for any d ∈ Rn, the directional derivative of f in the direction d exists, the function f is said to be differentiable.

This concept is sometimes called “Gâteaux differentiability,” in the sense that other types of differentiability can be defined (like the Fréchet differentiability). In this book, we deal with continuously differentiable functions, for which a distinction is unnecessary.
Example 2.9 (Directional derivative). Consider f(x₁, x₂, x₃) = e^{x₁} + x₁²x₃ − x₁x₂x₃, and

   d = (d₁, d₂, d₃)ᵀ.   (2.9)

The directional derivative of f in the direction d is

   dᵀ∇f(x₁, x₂, x₃) = d₁(e^{x₁} + 2x₁x₃ − x₂x₃) − d₂x₁x₃ + d₃(x₁² − x₁x₂),   (2.10)

where ∇f(x₁, x₂, x₃) is given by (2.6).

A numerical illustration of the directional derivative is given in Example 2.14. Note that the partial derivatives are in fact directional derivatives in the direction of the coordinate axes, and

   ∂f(x)/∂xᵢ = ∇f(x)ᵀeᵢ,   (2.11)

where eᵢ is the column i of the identity matrix, i.e., a vector with all entries equal to 0, except the one at line i, which is 1.
The directional derivative provides information about the slope of the function in the direction d, just as the derivative gives information on the slope of functions of one variable. In particular, the function increases in the direction d if the directional derivative is positive and decreases if it is negative. In the latter case, we say that it is a descent direction.

Definition 2.10 (Descent direction). Let f : Rn → R be a differentiable function. Consider x, d ∈ Rn. The direction d is a descent direction in x if

   dᵀ∇f(x) < 0.   (2.12)

The terminology “descent direction” is justified by Theorem 2.11. It shows the decrease of the function along the descent direction. The theorem also states that the decrease is proportional to the slope, that is, the directional derivative, in the neighborhood of x.

Theorem 2.11 (Descent direction). Let f : Rn → R be a differentiable function. Consider x ∈ Rn such that ∇f(x) ≠ 0 and d ∈ Rn. If d is a descent direction, in the sense of Definition 2.10, there exists η > 0 such that

   f(x + αd) < f(x), ∀ 0 < α ≤ η.   (2.13)

Moreover, for any β < 1, there exists η̂ > 0 such that

   f(x + αd) < f(x) + αβ∇f(x)ᵀd,   (2.14)

for all 0 < α ≤ η̂.
Proof. We use (C.1) from Taylor’s theorem (Theorem C.1) to evaluate the function in x + αd by employing the information in x. We have

   f(x + αd) = f(x) + αdᵀ∇f(x) + o(α‖d‖)   (Taylor)
             = f(x) + αdᵀ∇f(x) + o(α)      (‖d‖ does not depend on α).

The result follows from the fact that dᵀ∇f(x) < 0 by assumption, and that o(α) can be made as small as needed. More formally, according to the definition of the Landau notation o(·) (Definition B.17), for any ε > 0, there exists η such that

   |o(α)| / α < ε, ∀ 0 < α ≤ η.

We choose ε = −dᵀ∇f(x), which is positive according to Definition 2.10. Then,

   o(α)/α ≤ |o(α)|/α < −dᵀ∇f(x), ∀ 0 < α ≤ η.   (2.15)

We now need only multiply (2.15) by α to obtain

   αdᵀ∇f(x) + o(α) < 0, ∀ 0 < α ≤ η,

and f(x + αd) < f(x), ∀ 0 < α ≤ η. The result (2.14) is obtained in a similar manner, by choosing ε = (β − 1)∇f(x)ᵀd in the definition of the Landau notation. We have ε > 0 because β < 1 and ∇f(x)ᵀd < 0.

Among all directions d from a point x, the one in which the slope is the steepest
is the direction of the gradient ∇f(x). To show this, we consider all directions d that
have the same norm (the same length) as the gradient and compare the directional
derivative for each of them.

Theorem 2.12 (Steepest ascent). Let f : Rn → R be a differentiable function. Consider x ∈ Rn and d∗ = ∇f(x). Then, for all d ∈ Rn such that ‖d‖ = ‖∇f(x)‖, we have

   dᵀ∇f(x) ≤ d∗ᵀ∇f(x) = ∇f(x)ᵀ∇f(x).   (2.16)

Proof. Let d be any direction. We have

   dᵀ∇f(x) ≤ ‖d‖ ‖∇f(x)‖   (Cauchy-Schwarz, Theorem C.13)
           = ‖∇f(x)‖²       (assumption ‖d‖ = ‖∇f(x)‖)
           = ∇f(x)ᵀ∇f(x)    (definition of a scalar product)
           = d∗ᵀ∇f(x)       (definition of d∗).

Since d∗ᵀ∇f(x) = ‖∇f(x)‖² ≥ 0, the function increases in the direction d∗, which corresponds to the steepest ascent.
If the direction of the gradient corresponds to the steepest ascent of the function at x, we need only consider the direction opposite the gradient, −∇f(x), to find the steepest descent.

Corollary 2.13 (Steepest descent). Let f : Rn → R be a differentiable function. Consider x ∈ Rn and d∗ = −∇f(x). Then, for any d ∈ Rn such that ‖d‖ = ‖∇f(x)‖, we have

   −∇f(x)ᵀ∇f(x) = d∗ᵀ∇f(x) ≤ dᵀ∇f(x)   (2.17)

and the direction opposite the gradient is that in which the function has its steepest descent.

Proof. Let d be any direction. We have

   −dᵀ∇f(x) ≤ ∇f(x)ᵀ∇f(x)   (by applying Theorem 2.12 at −d)
            = −d∗ᵀ∇f(x)      (according to the definition of d∗)

and −dᵀ∇f(x) ≤ −d∗ᵀ∇f(x). We get (2.17) by multiplying this last inequality by −1. Since d∗ᵀ∇f(x) ≤ 0, the function is decreasing in the direction d∗, which corresponds to the steepest descent.
Example 2.14 (Steepest ascent). Consider f(x) = ½x₁² + 2x₂², and x = (1, 1)ᵀ. We consider three directions

   d₁ = ∇f(x) = (1, 4)ᵀ, d₂ = (1, 1)ᵀ, and d₃ = (−1, 3)ᵀ.

The directional derivative of f in each direction equals:

   d₁ᵀ∇f(x) = 17
   d₂ᵀ∇f(x) = 5
   d₃ᵀ∇f(x) = 11.

The shape of the function in each of these directions is shown in Figure 2.3. For each direction dᵢ, the function f(x + αdᵢ) is represented for values of α between 0 and 1.

Figure 2.3: Shape of the function ½x₁² + 2x₂² at point (1, 1)ᵀ in several directions
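The values of Example 2.14 (and the negated values obtained along the opposite directions, used in Example 2.15 below) are easy to reproduce; here is a minimal Python sketch (an illustration, not part of the book):

```python
# f(x) = 0.5 x1^2 + 2 x2^2, with gradient (x1, 4 x2), as in Example 2.14.
def grad(x):
    return [x[0], 4 * x[1]]

def directional_derivative(x, d):
    # Scalar product d^T grad f(x), as in (2.8).
    g = grad(x)
    return d[0] * g[0] + d[1] * g[1]

x = [1.0, 1.0]
for d in ([1, 4], [1, 1], [-1, 3]):
    print(d, directional_derivative(x, d))   # 17, 5, 11: ascent directions
for d in ([-1, -4], [-1, -1], [1, -3]):
    print(d, directional_derivative(x, d))   # -17, -5, -11: descent directions
```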

Example 2.15 (Steepest descent). Consider f(x) = ½x₁² + 2x₂², and x = (1, 1)ᵀ. We consider three directions

   d₁ = −∇f(x) = (−1, −4)ᵀ, d₂ = (−1, −1)ᵀ, and d₃ = (1, −3)ᵀ.

The directional derivative of f in each direction equals:

   d₁ᵀ∇f(x) = −17
   d₂ᵀ∇f(x) = −5
   d₃ᵀ∇f(x) = −11.

The shape of the function in each of these directions is shown in Figure 2.4. For each direction dᵢ, the function f(x + αdᵢ) is represented for values of α between 0 and 1. It is important to note in this figure that the function does not constantly decrease along one descent direction. The descent feature is local, i.e., valid in a neighborhood of x. Even if the function increases later, the steepest descent is locally observed in the direction −∇f(x).

In addition to providing information on the slope of the function, the gradient also enables us to verify whether the function is convex or concave.

Theorem 2.16 (Convexity according to the gradient). Let f : X ⊆ Rn → R be a differentiable function on an open convex set X. f is convex on X if and only if

   f(y) − f(x) ≥ (y − x)ᵀ∇f(x), ∀x, y ∈ X.   (2.18)

f is strictly convex on X if and only if

   f(y) − f(x) > (y − x)ᵀ∇f(x), ∀x, y ∈ X, x ≠ y.   (2.19)

Figure 2.4: Shape of the function ½x₁² + 2x₂² at point (1, 1)ᵀ in several directions

Proof. Necessary condition. We first show that (2.18) is a necessary condition. Let x, y ∈ X be arbitrary and let us consider d = y − x. We evaluate the directional derivative of f in the direction d and obtain

   (y − x)ᵀ∇f(x) = dᵀ∇f(x)                             (definition of d)
      = lim_{α→0⁺} (f(x + αd) − f(x)) / α               (Definition 2.7)
      = lim_{α→0⁺} (f(x + α(y − x)) − f(x)) / α         (definition of d).

Since the limit is reached for α → 0, we can assume without loss of generality that α < 1. We obtain, by convexity of f, and by applying Definition 2.1 with λ = 1 − α,

   (y − x)ᵀ∇f(x) ≤ lim_{α→0⁺} ((1 − α)f(x) + αf(y) − f(x)) / α.   (2.20)

We now need only simplify (2.20) to obtain (2.18).
Sufficient condition. We now assume that (2.18) is satisfied, and let us demonstrate the convexity of f. Let x, y ∈ X be arbitrary, and let us take z = λx + (1 − λ)y. z ∈ X because X is convex. We apply (2.18) first for z and x, and then for z and y:

   f(x) − f(z) ≥ (x − z)ᵀ∇f(z)
   f(y) − f(z) ≥ (y − z)ᵀ∇f(z).   (2.21)

We multiply the first inequality by λ and the second by (1 − λ) and sum them up to obtain

   λf(x) + (1 − λ)f(y) − f(z) ≥ (λx + (1 − λ)y − z)ᵀ∇f(z).   (2.22)

According to the definition of z, the right hand side is zero, and we obtain the characterization (2.1) of Definition 2.1, so that f is convex.
The proof for the strictly convex case is identical.

If we write (2.18) in a slightly different way:

   f(y) ≥ f(x) + (y − x)ᵀ∇f(x), ∀x, y ∈ Rn,   (2.23)

the right term is nothing else than the equation of the hyperplane that is tangent to the function f at point x. In this case, we see that a function is convex if and only if its graph is never below the tangent hyperplane. Figure 2.5 illustrates this idea in the case of a function of one variable.

Figure 2.5: Hyperplane tangent to a convex function (the tangent is the graph of f(x) + (y − x)ᵀ∇f(x))
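A small numerical check of inequality (2.23) for the convex function f(x) = ½x₁² + 2x₂² of Example 2.14 (a sketch, not part of the book): the graph stays above every tangent hyperplane.

```python
import random

# Convex quadratic f(x) = 0.5 x1^2 + 2 x2^2 with gradient (x1, 4 x2).
def f(x):
    return 0.5 * x[0]**2 + 2 * x[1]**2

def grad(x):
    return [x[0], 4 * x[1]]

random.seed(1)
for _ in range(1000):
    x = [random.uniform(-5, 5), random.uniform(-5, 5)]
    y = [random.uniform(-5, 5), random.uniform(-5, 5)]
    g = grad(x)
    tangent = f(x) + (y[0] - x[0]) * g[0] + (y[1] - x[1]) * g[1]
    assert f(y) >= tangent - 1e-9    # inequality (2.23)
print("f(y) >= f(x) + (y - x)^T grad f(x) on all sampled pairs")
```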

We conclude this section by generalizing the notion of the gradient to functions from Rn to Rm. In this case, the various partial derivatives are arranged in a matrix called the gradient matrix. Each column of this matrix is the gradient of the corresponding component of f.

Definition 2.17 (Gradient matrix). Let f : Rn → Rm be such that fᵢ : Rn → R is differentiable, for i = 1, . . . , m. In this case, f is differentiable, and the function ∇f(x) : Rn → Rn×m is called a gradient matrix and is defined by

   ∇f(x) = (∇f₁(x) · · · ∇fₘ(x)),   (2.24)

i.e., the n × m matrix whose entry at line i and column j is ∂fⱼ(x)/∂xᵢ.

The gradient matrix is often used in its transposed form and is then called the Jacobian matrix of f.

Definition 2.18 (Jacobian matrix). Consider f : Rn → Rm. The function J(x) : Rn → Rm×n is called a Jacobian matrix and is defined as

   J(x) = ∇f(x)ᵀ,   (2.25)

i.e., the m × n matrix whose line i is ∇fᵢ(x)ᵀ.

Born in Potsdam (Germany) on December 10, 1804, and died in Berlin on February 18, 1851, Jacobi taught at Königsberg with Neumann and Bessel. He contributed significantly to the theory of elliptic functions, in competition with Abel. His work on first-order partial differential equations and determinants is of prime importance. Although introduced by Cauchy in 1815, the determinant function is called the Jacobian thanks to a long thesis published by Jacobi in 1841. The determinant of the Jacobian matrix (Definition 2.18) is called the Jacobian.

Figure 2.6: Carl Gustav Jacob Jacobi

2.3 Differentiability: the second order

We can perform the same differentiability analysis as that on the function f in Section 2.2 for each of the functions ∇ᵢf(x) of Definition 2.4. The jth partial derivative of ∇ᵢf(x) is the second derivative of f with respect to the variables i and j, because

   ∂∇ᵢf(x)/∂xⱼ = ∂(∂f(x)/∂xᵢ)/∂xⱼ = ∂²f(x)/∂xᵢ∂xⱼ.   (2.26)

It is common to organize the second derivatives in an n × n matrix in which the element of line i and column j is ∂²f(x)/∂xᵢ∂xⱼ. This matrix is called the Hessian and gets its name from the German mathematician Otto Hesse (Figure 2.7).

Definition 2.19 (Hessian matrix). Let f : Rn → R be a twice differentiable function. The function ∇²f(x) : Rn → Rn×n is called the Hessian matrix or Hessian of f and is defined by

   ∇²f(x) = ( ∂²f(x)/∂x₁²     ∂²f(x)/∂x₁∂x₂   · · ·   ∂²f(x)/∂x₁∂xₙ )
            ( ∂²f(x)/∂x₂∂x₁   ∂²f(x)/∂x₂²     · · ·   ∂²f(x)/∂x₂∂xₙ )
            (       ⋮                ⋮                       ⋮       )
            ( ∂²f(x)/∂xₙ∂x₁   ∂²f(x)/∂xₙ∂x₂   · · ·   ∂²f(x)/∂xₙ²   )   (2.27)

The Hessian matrix is always symmetric.

Note that the Hessian of f is both the gradient matrix and the Jacobian matrix of ∇f.

Example 2.20 (Hessian). Consider f(x₁, x₂, x₃) = e^{x₁} + x₁²x₃ − x₁x₂x₃. The Hessian of f is given by

   ∇²f(x₁, x₂, x₃) = ( e^{x₁} + 2x₃    −x₃    2x₁ − x₂ )
                     (     −x₃           0       −x₁    )
                     (  2x₁ − x₂        −x₁       0     )   (2.28)
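As with the gradient in Example 2.6, the Hessian (2.28) can be checked numerically; the following sketch (not part of the book) approximates the second derivatives by finite differences of f.

```python
import math

def f(x):
    return math.exp(x[0]) + x[0]**2 * x[2] - x[0] * x[1] * x[2]

def hessian_fd(x, h=1e-4):
    # H[i][j] ~ (f(x + h ei + h ej) - f(x + h ei) - f(x + h ej) + f(x)) / h^2
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            xij, xi, xj = list(x), list(x), list(x)
            xij[i] += h
            xij[j] += h
            xi[i] += h
            xj[j] += h
            H[i][j] = (f(xij) - f(xi) - f(xj) + f(x)) / h**2
    return H

x = [1.0, -2.0, 0.5]
exact = [[math.exp(1.0) + 1.0, -0.5, 4.0],   # (2.28) evaluated at x
         [-0.5, 0.0, -1.0],
         [4.0, -1.0, 0.0]]
approx = hessian_fd(x)
for i in range(3):
    for j in range(3):
        assert abs(approx[i][j] - exact[i][j]) < 1e-2
print("finite differences agree with the analytical Hessian (2.28)")
```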

Just like the gradient, the Hessian gives us information about the convexity of the function.

Theorem 2.21 (Convexity by the Hessian). Let f : X ⊆ Rn → R be a twice differentiable function on an open convex set X. If ∇²f(x) is positive semidefinite (resp. positive definite) for all x in X, then f is convex (resp. strictly convex) on X.

Proof. Consider x and y in X. We utilize (C.4) of Taylor’s theorem (Theorem C.2) to evaluate the function in y by using the information in x. By writing d = y − x, we have

   f(y) = f(x) + (y − x)ᵀ∇f(x) + ½dᵀ∇²f(x + αd)d.   (2.29)

If 0 < α ≤ 1, x + αd ∈ X by convexity (Definition B.2) of X, and the matrix ∇²f(x + αd) is positive semidefinite (Definition B.8). Therefore,

   dᵀ∇²f(x + αd)d ≥ 0.   (2.30)

Then, we have

   f(y) ≥ f(x) + (y − x)ᵀ∇f(x).   (2.31)

We now need only invoke Theorem 2.16 to prove the convexity of f. The strict convexity is proven in a similar manner.

It is interesting to analyze the second order information along a given direction d. If we have a twice differentiable function f, a point x and a direction d, we can calculate the derivatives in this direction by considering the function of one variable

   g : R₊ → R : α ↦ f(x + αd).   (2.32)

According to the chain differentiation rule, we have

   g′(α) = dᵀ∇f(x + αd).   (2.33)

Note that g′(0) is the directional derivative (Definition 2.7) of f at x along d. We also have

   g″(α) = dᵀ∇²f(x + αd)d.   (2.34)

Since the second derivative of a function of one variable gives us curvature information, (2.34) provides us with information about the curvature of the function f in the direction d. In particular, when α = 0, this expression gives information on the curvature of f in x. To avoid the length of the direction influencing the notion of curvature, it is important to normalize. We obtain the following definition.

Definition 2.22 (Curvature). Let f : Rn → R be a twice differentiable function. Consider x, d ∈ Rn, d ≠ 0. The quantity

   dᵀ∇²f(x)d / dᵀd   (2.35)

represents the curvature of the function f in x in the direction d.

In linear algebra, the quantity (2.35) is often called the Rayleigh quotient of ∇²f(x) in the direction d. One should immediately note that the curvature of the function in x in the direction −d is exactly the same as in the direction d.

Example 2.23 (Curvature). Consider f(x) = ½x₁² + 2x₂², and x = (1, 1)ᵀ, as in Examples 2.14 and 2.15. The curvature of the function in different directions is given below:

   d           −d            dᵀ∇²f(x)d / dᵀd
   (1, 4)ᵀ     (−1, −4)ᵀ     3.8235
   (1, 1)ᵀ     (−1, −1)ᵀ     2.5
   (−1, 3)ᵀ    (1, −3)ᵀ      3.7
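The values in this table follow directly from Definition 2.22; a minimal Python sketch (not part of the book):

```python
# Rayleigh quotient d^T H d / d^T d for the (constant) Hessian of
# f(x) = 0.5 x1^2 + 2 x2^2, namely H = [[1, 0], [0, 4]].
H = [[1.0, 0.0], [0.0, 4.0]]

def curvature(d):
    hd = [H[0][0] * d[0] + H[0][1] * d[1],
          H[1][0] * d[0] + H[1][1] * d[1]]
    return (d[0] * hd[0] + d[1] * hd[1]) / (d[0]**2 + d[1]**2)

for d in ([1, 4], [1, 1], [-1, 3]):
    # The quotient is invariant under d -> -d, as noted above.
    assert curvature(d) == curvature([-d[0], -d[1]])
    print(d, round(curvature(d), 4))    # 3.8235, 2.5, 3.7
```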

Otto Hesse was born in Königsberg (currently Kaliningrad, Russia) on April 22, 1811, and died in Munich (Germany) on August 4, 1874. He was a student of Jacobi and in 1845 was appointed Extraordinary Professor at Königsberg, where he taught for 16 years. Kirchhoff and Lipschitz attended his courses. Hesse was also affiliated with the University of Halle, with Heidelberg and with the Munich Polytechnicum. He worked mainly on the theory of algebraic functions and the theory of invariants.

Figure 2.7: Ludwig Otto Hesse

2.4 Linearity and non linearity

A function f : Rn → R is said to be linear if its value is a linear combination of the variables.

Definition 2.24 (Linear function). A function f : Rn → R is said to be linear if it is defined as

   f(x) = cᵀx = Σ_{i=1}^{n} cᵢxᵢ,   (2.36)

where c ∈ Rn is a constant vector, i.e., independent of x. A function f : Rn → Rm is linear if each of its components fᵢ : Rn → R, i = 1, . . . , m, is linear. In this case, it can be written as

   f(x) = Ax,   (2.37)

where A ∈ Rm×n is a matrix of constants.

When a constant term is added to a linear function, the result is said to be affine.

Definition 2.25 (Affine function). A function f : Rn → R is said to be affine if it is written as

   f(x) = cᵀx + d = Σ_{i=1}^{n} cᵢxᵢ + d,   (2.38)

where c ∈ Rn is a vector of constants and d ∈ R. A function f : Rn → Rm is affine if each of its components fᵢ : Rn → R, i = 1, . . . , m, is affine. In this case, it can be written as

   f(x) = Ax + b,   (2.39)

where A ∈ Rm×n is a matrix and b ∈ Rm is a vector.

Note that minimizing (2.38) is equivalent to minimizing (2.36). Note that all
linear functions are affine. By abuse of language, a non linear function is actually a
function that is not affine.

Definition 2.26 (Non linear function). Any function that is not affine is said to be
non linear.

The set of non linear functions is vast, and one needs to be a little more precise in their characterization. Intuitively, the function shown in Figure 2.8 seems more non linear than the one in Figure 2.9. The slope of the former changes quickly with x, which is not the case for the latter. This is formally captured by the Lipschitz continuity (Definition B.16) of the gradient.

Figure 2.8: Example of a non linear function: f(x) = e^{cos(e^{−x})}, plotted for −5 ≤ x ≤ 0

Figure 2.9: Example of another non linear function: f(x) = x²/100 + 3x + 1, plotted for −10 ≤ x ≤ 10

Definition 2.27 (Lipschitz continuity of the gradient). Consider f : X ⊆ Rn → Rm. The gradient matrix of the function is Lipschitz continuous on X if there exists a constant M > 0 such that, for all x, y ∈ X, we have

   ‖∇f(x) − ∇f(y)‖_{n×m} ≤ M ‖x − y‖_n,   (2.40)

where ‖ · ‖_{n×m} is a norm on Rn×m and ‖ · ‖_n is a norm on Rn. The constant M is called the Lipschitz constant.

Intuitively, the definition says that the slopes of the function at two close points are close as well. And the more so when M is small. Actually, when f is linear, the slope is the same at any point, and (2.40) is verified with M = 0. The value of M for the function represented in Figure 2.9 is low, while it is large for the function illustrated in Figure 2.8, where the slope varies dramatically with small modifications of x.
The constant M can be interpreted as an upper bound on the curvature of the function. The greater M is, the larger the curvature can be. If M = 0, the curvature is zero, and the function is linear. Note that this constant is essentially theoretical and that it is generally difficult to obtain a value for it.
Among the non linear functions, quadratic functions play an important role in
optimization algorithms.

Definition 2.28 (Quadratic function). A function f : Rn → R is said to be quadratic if it can be written as

   f(x) = ½xᵀQx + gᵀx + c = ½ Σ_{i=1}^{n} Σ_{j=1}^{n} Qᵢⱼxᵢxⱼ + Σ_{i=1}^{n} gᵢxᵢ + c,   (2.41)

where Q is a symmetric n × n matrix, g ∈ Rn and c ∈ R. We have

   ∇f(x) = Qx + g and ∇²f(x) = Q.   (2.42)

The presence of the factor ½ enables a simplification of the expression of ∇f(x). Note also that the fact that Q is symmetric is not restrictive. Indeed, if Q were not symmetric, we would have

   xᵀQx = Σ_{i=1}^{n} Σ_{j=1}^{n} Qᵢⱼxᵢxⱼ = Σ_{i=1}^{n} Σ_{j=1}^{n} ½(Qᵢⱼ + Qⱼᵢ)xᵢxⱼ.

We now define the symmetric matrix Q′ such that Q′ᵢⱼ = Q′ⱼᵢ = ½(Qᵢⱼ + Qⱼᵢ). We obtain xᵀQx = xᵀQ′x, and the same function can be written using a symmetric matrix.
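A short numpy sketch (not part of the book) illustrating (2.41), (2.42) and the symmetrization Q′ = ½(Q + Qᵀ):

```python
import numpy as np

# Quadratic f(x) = 0.5 x^T Q x + g^T x + c, with a non symmetric Q.
Q = np.array([[2.0, 3.0], [1.0, 4.0]])
g = np.array([1.0, -1.0])
c = 5.0

def f(x, Q):
    return 0.5 * x @ Q @ x + g @ x + c

Qsym = 0.5 * (Q + Q.T)                  # Q' = (Q + Q^T) / 2
x = np.array([0.7, -1.3])
assert np.isclose(f(x, Q), f(x, Qsym))  # x^T Q x = x^T Q' x

grad = Qsym @ x + g                     # gradient (2.42), with symmetric Q'
hess = Qsym                             # Hessian (2.42)
print(grad, hess)
```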

2.5 Conditioning and preconditioning

In linear algebra, the notion of conditioning is related to the analysis of numerical errors that can occur when solving a linear system (see Golub and Van Loan, 1996).

Definition 2.29 (Condition number). Let A ∈ Rn×n be a non singular symmetric matrix. The condition number of A is

   κ(A) = ‖A‖ ‖A⁻¹‖.   (2.43)

If the matrix norm used is the norm 2, we have

   κ₂(A) = ‖A‖₂ ‖A⁻¹‖₂ = σ₁/σₙ,   (2.44)

where σ₁ is the largest singular value of A (Definition B.28) and σₙ is the smallest. By extension, the condition number of a singular matrix (i.e., such that σₙ = 0) is +∞. If A is symmetric positive semidefinite, the singular values of A are its eigenvalues (Definition B.7).

We propose a geometric interpretation of the condition number. For this, we consider a non linear function f : Rn → R and a vector x ∈ Rn. We assume that the matrix ∇²f(x) is positive definite¹, and let λ₁ be its largest eigenvalue and λₙ its smallest. Let d₁ be an eigenvector corresponding to λ₁. We have

   ∇²f(x)d₁ = λ₁d₁.   (2.45)

By premultiplying by d₁ᵀ and normalizing, we obtain

   λ₁ = d₁ᵀ∇²f(x)d₁ / d₁ᵀd₁.   (2.46)

According to Definition 2.22, the eigenvalue λ₁ corresponds to the curvature of the function in the direction of the eigenvector d₁. In addition, according to the Rayleigh-Ritz theorem (Theorem C.4), this is the greatest curvature among all possible directions. A similar reasoning for the smallest eigenvalue allows us to determine λₙ as the smallest curvature of the function over all possible directions. The condition number of the Hessian matrix at a point x is thus the ratio between the largest and the smallest curvature among the directions when starting from x.

Definition 2.30 (Conditioning). Let f : Rn → R be a twice differentiable function, and let us take a vector x ∈ Rn. The conditioning of f at x is the condition number of ∇²f(x).

¹ ∇²f(x) is always symmetric.



By using this interpretation related to curvature, an ill-conditioned function is characterized by a large difference in curvature between two directions. In the case of a quadratic function with two dimensions (Figure 2.10(a)), this translates into level curves forming elongated ellipses (Figure 2.10(b)). A well conditioned function is characterized by a homogeneous curvature in the various directions. In the case of a quadratic function with two dimensions (Figure 2.11(a)), this translates into nearly circular level curves (Figure 2.11(b)).
Example 2.31 (Conditioning). The quadratic function

   f(x₁, x₂) = 2x₁² + 9x₂²   (2.47)

is such that its condition number is 9/2 for all x, because

   ∇²f(x₁, x₂) = [4  0; 0  18].   (2.48)

We now apply the change of variables

   (x₁′, x₂′)ᵀ = [2  0; 0  3√2] (x₁, x₂)ᵀ,   (2.49)

i.e.,

   (x₁, x₂)ᵀ = [1/2  0; 0  √2/6] (x₁′, x₂′)ᵀ.   (2.50)

We obtain

   f(x₁′, x₂′) = ½x₁′² + ½x₂′²,   (2.51)

for which the Hessian is the identity matrix, and the condition number is 1, for all (x₁′, x₂′).
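The example can be verified numerically (a sketch, not part of the book), using numpy to compute condition numbers:

```python
import numpy as np

H = np.array([[4.0, 0.0], [0.0, 18.0]])    # Hessian (2.48)
print(np.linalg.cond(H, 2))                 # 4.5, i.e., 9/2

# Change of variables x' = M x with M = diag(2, 3 sqrt(2)), as in (2.49).
M = np.diag([2.0, 3.0 * np.sqrt(2.0)])
Minv = np.linalg.inv(M)
H_new = Minv.T @ H @ Minv                   # Hessian in the new variables
print(H_new)                                # the identity matrix
print(np.linalg.cond(H_new, 2))             # 1.0
```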

We see that it is possible to reduce the condition number by a change of variables. In general, a change of variables is defined by an invertible matrix M.

Definition 2.32 (Change of variables). Consider x ∈ Rn. Let M ∈ Rn×n be an invertible matrix. The change of variables is the linear application defined by M that transforms x into x′ = Mx.

Consider a function f(x) and apply to it the change of variables x′ = Mx to obtain the function f̃(x′). We have, according to the chain differentiation rule (Theorem C.3),

   f̃(x′) = f(M⁻¹x′)
   ∇f̃(x′) = M⁻ᵀ∇f(M⁻¹x′)   (2.52)
   ∇²f̃(x′) = M⁻ᵀ∇²f(M⁻¹x′)M⁻¹ = M⁻ᵀ∇²f(x)M⁻¹.

Figure 2.10: Function (2.47) of Example 2.31: (a) the function, (b) level curves

The conditioning of f̃ in x′ is the condition number of the matrix M⁻ᵀ∇²f(x)M⁻¹ (where M⁻ᵀ is the inverse of the transpose of M). Choosing a change of variables such that the condition number is as close as possible to 1 is called preconditioning.

Definition 2.33 (Preconditioning). Let f : Rn → R be a twice differentiable function, and let us take a vector x ∈ Rn. The preconditioning of f in x involves defining a change of variables x′ = Mx and a function f̃(x′) = f(M⁻¹x′), such that the conditioning of f̃ in Mx is better than the conditioning of f in x.

In the context of optimization, the matrix for the change of variables should be positive definite in order to preserve the nature of the problem.

Figure 2.11: Function (2.51) of Example 2.31: (a) the function, (b) level curves

If ∇²f(x) is positive definite, we can calculate the Cholesky decomposition (Definition B.18)

   ∇²f(x) = LLᵀ,   (2.53)

where L is a lower triangular matrix. We now choose the change of variables

   x′ = Lᵀx ⇐⇒ x = L⁻ᵀx′.   (2.54)

In this case

   ∇²f̃(x′) = L⁻¹∇²f(x)L⁻ᵀ   (according to (2.52))
            = L⁻¹LLᵀL⁻ᵀ       (according to (2.53))
            = I.

The conditioning of the function f̃ in x′ is 1. According to Definition 2.29, κ₂(∇²f̃(x′)) ≥ 1, and the obtained conditioning is the best possible. The best preconditioning of f in x thus consists in defining the change of variables based on the Cholesky factorization of ∇²f(x).
Example 2.34 (Preconditioning). Consider

   f(x₁, x₂) = ½x₁² + (25/2)x₂² + 3x₁x₂ − 12x₁ − √π x₂ − 6.   (2.55)

We have

   ∇f(x₁, x₂) = (x₁ + 3x₂ − 12, 3x₁ + 25x₂ − √π)ᵀ   (2.56)

and

   ∇²f(x₁, x₂) = [1  3; 3  25] = [1  0; 3  4] [1  0; 3  4]ᵀ.   (2.57)

We define

   (x₁′, x₂′)ᵀ = [1  3; 0  4] (x₁, x₂)ᵀ,   (2.58)

i.e.,

   (x₁, x₂)ᵀ = [1  −3/4; 0  1/4] (x₁′, x₂′)ᵀ.   (2.59)

We obtain

   f̃(x₁′, x₂′) = ½(x₁′ − ¾x₂′)² + (25/2)(¼x₂′)² + 3(x₁′ − ¾x₂′)(¼x₂′) − 12(x₁′ − ¾x₂′) − √π(¼x₂′) − 6
               = ½x₁′² + ½x₂′² − 12x₁′ + (9 − √π/4)x₂′ − 6.   (2.60)

It is easy to verify that ∇²f̃(x₁′, x₂′) = I. Note that there are no longer any crossed terms in x₁′x₂′.
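The construction can be checked with numpy (a sketch, not part of the book); numpy.linalg.cholesky returns the lower triangular factor L used in (2.53)–(2.54).

```python
import numpy as np

H = np.array([[1.0, 3.0], [3.0, 25.0]])    # Hessian (2.57)
L = np.linalg.cholesky(H)                  # lower triangular, H = L L^T
print(L)                                    # [[1, 0], [3, 4]]

# Change of variables x' = L^T x, as in (2.54): the Hessian in the new
# variables is L^{-1} H L^{-T} = I, so the conditioning becomes 1.
Linv = np.linalg.inv(L)
H_new = Linv @ H @ Linv.T
assert np.allclose(H_new, np.eye(2))
print(np.linalg.cond(H, 2), "->", np.linalg.cond(H_new, 2))
```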

2.6 Exercises
Exercise 2.1. Among the following functions, which are convex and which are concave? Justify your answer.
1. f(x) = 1 − x².
2. f(x) = x² − 1.
3. f(x₁, x₂) = √(x₁² + x₂²).
4. f(x) = x³.
5. f(x₁, x₂, x₃) = sin(a)x₁ + cos(b)x₂ + e⁻ᶜx₃, a, b, c ∈ R.
Exercise 2.2. For each of the following functions:
• Calculate the gradient.
• Calculate the Hessian.
• Specify (and justify) whether the function is convex, concave, or neither.
• Calculate the curvature of the function in a direction d at the specified point x̄.
• Make a change of variables to precondition the function, using the Hessian at the specified point x̄. Please note that the matrix for a change of variables must be positive definite.
1. f(x₁, x₂) = ½x₁² + (9/2)x₂², x̄ = (0, 0)ᵀ.
2. f(x₁, x₂) = (1/3)x₁³ + x₂³ − x₁ − x₂, x̄ = (9, 1)ᵀ.
3. f(x₁, x₂) = (x₁ − 2)⁴ + (x₁ − 2)²x₂² + (x₂ + 1)², x̄ = (2, −1)ᵀ.
4. f(x₁, x₂) = x₁² + 2x₁x₂ + 2x₂², x̄ = (1, 1)ᵀ.
5. f(x₁, x₂) = x₁² − x₁x₂ + 2x₂² − 2x₁ + e^{x₁+x₂}, x̄ = (0, 0)ᵀ.
Exercise 2.3. Consider f : Rn → R, a point x ∈ Rn and a direction d ∈ Rn, d ≠ 0, such that f(x + αd) = f(x) for all α ∈ R. What is the curvature of f in x in the direction d? What is the conditioning of f in x?

Chapter 3

Constraints

Life would be easier without constraints. Or would it? In this chapter, we investigate ways to remove some of them, or even all of them. And when some remain, they need to be properly understood in order to be verified. As algorithms need to move along directions that are compatible with the constraints, such directions are characterized in various contexts. We put a special emphasis on linear constraints, which simplify the analysis significantly.

Contents
3.1 Active constraints
3.2 Linear independence of the constraints
3.3 Feasible directions
    3.3.1 Convex constraints
    3.3.2 Constraints defined by equations-inequations
3.4 Elimination of constraints
3.5 Linear constraints
    3.5.1 Polyhedron
    3.5.2 Basic solutions
    3.5.3 Basic directions
3.6 Exercises

In this chapter, we analyze the constraints of the optimization problem: min_{x∈Rn} f(x) subject to h(x) = 0, g(x) ≤ 0, x ∈ X. A vector satisfying all constraints is called a feasible point.

Definition 3.1 (Feasible point). Consider the optimization problem (1.71)–(1.74). A point x ∈ Rn is said to be feasible if it satisfies all constraints (1.72)–(1.74). Note that, in the literature, this concept is sometimes called a feasible solution or feasible vector.

3.1 Active constraints


The concept of active constraints is relevant mainly for inequality constraints. We
introduce the concept with an example.
Example 3.2 (Inequality constraints). Consider the optimization problem

min_{x∈R} x² (3.1)

subject to
x ≤ 4
x ≥ −10. (3.2)
It is illustrated in Figure 3.1, where the inequality constraints are represented by
vertical lines, associated with an arrow pointed towards the feasible domain. The
solution to this problem is x∗ = 0. In fact, one could also choose to ignore the
constraints and still obtain the same solution. We say that the constraints are inactive
at the solution. When using the notation (1.73), the problem can be written as

min_{x∈R} x² (3.3)

subject to
g1(x) = x − 4 ≤ 0
g2(x) = −x − 10 ≤ 0. (3.4)
We have g1 (x∗ ) = −4 and g2 (x∗ ) = −10, and g1 (x∗ ) < 0 and g2 (x∗ ) < 0. The fact
that the inequality constraints are strictly verified characterizes inactive constraints.
Figure 3.1: Illustration of Example 3.2 (the function f(x) = x², whose minimum x∗ = 0 lies strictly inside the feasible interval [−10, 4])



Example 3.3 (Inequality constraints II). Consider the optimization problem


min_{x∈R} x² (3.5)

subject to
x ≤ 4
x ≥ 1. (3.6)
It is illustrated in Figure 3.2, where the inequality constraints are represented by
vertical lines, associated with an arrow pointed towards the feasible domain. The
solution to this problem is x∗ = 1. In this case, we can ignore the constraint x ≤ 4.
However, the constraint x ≥ 1 cannot be ignored without modifying the solution. It
is said to be active. Using the notation (1.73), the problem can be written as
min_{x∈R} x² (3.7)

subject to
g1(x) = x − 4 ≤ 0
g2(x) = 1 − x ≤ 0. (3.8)
We have g1 (x∗ ) = −3 and g2 (x∗ ) = 0, and g1 (x∗ ) < 0 and g2 (x∗ ) = 0. The first
constraint is verified with strict inequality and is inactive. The second constraint is
verified with equality and is active.

Figure 3.2: Illustration of Example 3.3 (the function f(x) = x², whose constrained minimum x∗ = 1 lies on the boundary of the feasible interval [1, 4])

Definition 3.4 (Active constraints). Consider g : Rn → Rp and h : Rn → Rm . For


1 ≤ i ≤ p, an inequality constraint

gi (x) ≤ 0 (3.9)

is said to be active in x∗ if
gi (x∗ ) = 0, (3.10)
and inactive in x∗ if
gi (x∗ ) < 0. (3.11)

By extension, for 1 ≤ i ≤ m, an equality constraint

hi (x) = 0 (3.12)

is said to be active at x∗ if it is satisfied in x∗ , i.e.,

hi (x∗ ) = 0. (3.13)

The set of indices of the active constraints in x∗ is denoted by A(x∗ ).

This concept of active constraints is attractive because in x∗ the active constraints


can be considered equality constraints, while the inactive constraints can be ignored.
In Example 3.2, the unconstrained optimization problem min_{x∈R} x² has exactly the
same solution as (3.1)–(3.2). In Example 3.3, the constrained optimization problem
min_{x∈R} x² subject to x = 1 has exactly the same solution as (3.5)–(3.6).
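In practice, identifying A(x∗) at a given point reduces to evaluating the constraints. The following minimal sketch (illustrative and not part of the book; it assumes Python, with the constraints given as plain functions and a tolerance to decide whether gi(x∗) = 0 holds in floating-point arithmetic) classifies the constraints of Example 3.3:

    def active_set(g_funcs, x, tol=1e-8):
        """Indices of the inequality constraints g_i(x) <= 0 that are
        active at x, i.e., such that g_i(x) = 0 up to the tolerance."""
        return [i for i, g in enumerate(g_funcs) if abs(g(x)) <= tol]

    # The constraints (3.8) of Example 3.3, evaluated at the solution x* = 1.
    g = [lambda x: x - 4, lambda x: 1 - x]
    print(active_set(g, 1.0))  # [1]: only the second constraint is active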

Theorem 3.5 (Active constraints). Take a vector x∗ ∈ Rn . Consider the following


optimization problem P1
min_{x∈Rn} f(x) (3.14)

subject to
g(x) ≤ 0, (3.15)

x ∈ Y ⊆ Rn , (3.16)
where g : Rn −→ Rp is continuous and Y is a subset of Rn . If x∗ is feasible, i.e.,
g(x∗ ) ≤ 0, and if A(x∗ ) ⊆ {1, . . . , p} is the set of indices of the active constraints
in x∗ , i.e.,
A(x∗ ) = {i|gi (x∗ ) = 0}, (3.17)
we consider the following optimization problem P2

min_{x∈Rn} f(x) (3.18)

subject to
gi (x) = 0, i ∈ A(x∗ ), (3.19)

x ∈ Y ⊆ Rn . (3.20)
Here, x∗ is a local minimum of P1 if and only if x∗ is a local minimum of P2 .

Proof. Sufficient condition. By continuity of g, for each inactive constraint j, there


is a neighborhood of size εj around x∗ such that the constraint j is strictly verified in
the neighborhood. More formally, if gj (x∗ ) < 0, then there exists εj > 0 such that

gj(x) < 0 ∀x such that ‖x − x∗‖ < εj, (3.21)



as illustrated in Figure 3.3. Consider two feasible neighborhoods around x∗ . The first
one is defined as

Y1(ε) = {x | g(x) ≤ 0, x ∈ Y and ‖x − x∗‖ < ε}, (3.22)

and contains neighbors of x∗ that are feasible for the problem P1 . The second is
defined as

Y2(ε) = {x | gi(x) = 0 ∀i ∈ A(x∗), x ∈ Y and ‖x − x∗‖ < ε}, (3.23)

and contains neighbors of x∗ that are feasible for the problem P2 .


Since x∗ is a local minimum of P1, according to Definition 1.5, there exists ε̂ > 0
such that f(x∗) ≤ f(x), ∀x ∈ Y1(ε̂).
Consider the smallest neighborhood, and define

ε̃ = min(ε̂, min_{j∉A(x∗)} εj). (3.24)

We show that

Y2(ε̃) ⊆ Y1(ε̂). (3.25)

Indeed, take any x in Y2(ε̃). In order to show that it belongs to Y1(ε̂), we need to
show that g(x) ≤ 0, x ∈ Y, and ‖x − x∗‖ ≤ ε̂. Since ε̃ ≤ ε̂, we have ‖x − x∗‖ < ε̃ ≤ ε̂,
and the third condition is immediately verified. The second condition (x ∈ Y) is
inherited from the definition of Y2(ε̃). We need only to demonstrate that g(x) ≤ 0.
To do this, consider the constraints of A(x∗) separately from the others. By definition
of Y2, we have gi(x) = 0 for i ∈ A(x∗), which implies gi(x) ≤ 0. Take j ∉ A(x∗).
Since ε̃ ≤ εj, we have gj(x) < 0 according to (3.21), which implies gj(x) ≤ 0. This
completes the proof that g(x) ≤ 0, so that x ∈ Y1(ε̂).
As a result of (3.25), since x∗ is the best element (in the sense of the objective
function) of Y1(ε̂) (according to Definition 1.5 of the local minimum), and since x∗
belongs to Y2(ε̃), it is also the best element of this set, and a local minimum of P2.

Necessary condition. Let X1 be the set of feasible points of P1 and X2 the set of
feasible points of P2. We have X2 ⊆ X1. Let x∗ be a local minimum of P2. We assume
by contradiction that it is not a local minimum of P1. Then, for any ε > 0, there
exists x ∈ X1 such that ‖x − x∗‖ ≤ ε and f(x) < f(x∗). Since x∗ is a local minimum
of P2, x cannot be feasible for P2, and x ∉ X2 ⊆ X1, leading to the contradiction.

We have managed to eliminate some constraints. Unfortunately, this required know-


ing the solution x∗ . This simplification is thus relevant mainly in theory.

Figure 3.3: Strictly feasible neighborhood of x∗ when the inequality constraint j is inactive (a ball of radius εj around x∗ on which gj(x) < 0)

3.2 Linear independence of the constraints


The analysis of the structure of the constraints and their management in algorithms
is complex. It is therefore necessary to introduce assumptions that are general enough
not to be restrictive from an operational point of view, and that render it possible
to avoid pathological cases. In particular, the linear independence of the constraints
plays an important role. Even though this concept is relatively intuitive when the
constraints are linear, it is necessary to define it strictly for non linear constraints.
Start with the case where all constraints are defined by affine functions (see Def-
inition 2.25). In this case, when using the techniques of Section 1.2, it is always
possible to express the optimization problem as

min f(x) (3.26)

subject to

Ax = b (3.27)
x ≥ 0 (3.28)

by writing h(x) = b − Ax in (1.72), with A ∈ Rm×n , x ∈ Rn and b ∈ Rm , and


g(x) = −x in (1.73).

Since the inequality constraints are simple, we analyze in more details the system
of equations Ax = b. Like any linear system, three possibilities may arise:
• the system is incompatible, and there is no x such that Ax = b;
• the system is underdetermined, and there is an infinite number of x such that

Ax = b;
• the system is non singular, and there is a unique x that satisfies Ax = b.
In an optimization context, incompatible and non singular systems have little rele-
vance because they leave no degree of freedom to optimize any objective. We thus
only consider underdetermined systems.
If the system is compatible, the rank of A (Definition B.29) gives us information
on the relevance of the various constraints. If the rank is deficient, this means that
certain rows of A form a linear combination of the others, and the corresponding
constraints are redundant. This is formalized by Theorem 3.6 and illustrated by
Example 3.7.

Theorem 3.6 (Redundant constraints). Consider a compatible system of linear


equality constraints Ax = b, with A ∈ Rm×n , m ≤ n. If the rank of A is
deficient, i.e., rank(A)=r < m, then there exists a matrix à ∈ Rr×n of full rank
(i.e., rank(Ã) = r), composed exclusively of rows ℓ1 , . . . , ℓr of A such that

Ãx = b̃ ⇐⇒ Ax = b, (3.29)

where b̃ is composed of elements ℓ1 , . . . , ℓr of b.

Proof. Since the rank of A is r, this signifies that m − r rows of A can be written as
linear combinations of r other rows. Without loss of generality, we can assume that
the last m − r rows are linear combinations of the first r rows. By denoting aTk the
kth row of A, we have
ak = ∑_{j=1}^{r} λjk aj,  k = r + 1, . . . , m, and ∃j such that λjk ≠ 0. (3.30)

Moreover, since by hypothesis the system Ax = b is compatible, for each k = r +


1, . . . , m we have
bk = aTk x = ∑_{j=1}^{r} λjk aTj x = ∑_{j=1}^{r} λjk bj. (3.31)

Denote à the matrix composed of the first r rows of A, and b̃ the vector composed
of the first r components of b. Then, ℓi = i, i = 1, . . . , r.
=⇒ Consider x such that Ãx = b̃. According to the definition of Ã, we have that x
satisfies the first r equations of the system, i.e., aTi x = bi for i = 1, . . . , r. Select
k as an arbitrary index between r + 1 and m, and demonstrate that x satisfies the

corresponding equation. We have

aTk x = ∑_{j=1}^{r} λjk aTj x     according to (3.30)
      = ∑_{j=1}^{r} λjk bj       because Ãx = b̃
      = bk                       according to (3.31).

⇐= Let x be such that Ax = b. x satisfies all equations of the system, particularly


the first r ones. Then, it satisfies Ãx = b̃.

Example 3.7 (Redundant system). Take the constraints

x1 + x2 + x3 = 1
x1 − x2 + x4 = 1 (3.32)
x1 − 5x2 − 2x3 + 3x4 = 1

i.e.,

A = (1 1 1 0; 1 −1 0 1; 1 −5 −2 3),   b = (1, 1, 1)T. (3.33)
The rank of A is equal to 2, but there are 3 rows (the determinant of any square
submatrix of dimension 3 is zero). This means that one of the rows is a linear
combination of the others. Since the system is compatible (for instance, x1 = 2/3,
x2 = 0, x3 = 1/3, x4 = 1/3 is feasible), one of the constraints must be redundant. In
this case, if aTi represents the ith row of A, we have

a3 = −2a1 + 3a2 . (3.34)

We can remove the 3rd constraint, and the system

x1 + x2 + x3 = 1
(3.35)
x1 − x2 + x4 = 1

is equivalent to the constraint system (3.32).
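The rank argument of Theorem 3.6 is easy to carry out numerically. The sketch below (illustrative, assuming NumPy, which the book does not use) verifies on the system (3.32) that the rank is deficient and that the system is nevertheless compatible:

    import numpy as np

    A = np.array([[1, 1, 1, 0],
                  [1, -1, 0, 1],
                  [1, -5, -2, 3]], dtype=float)
    b = np.array([1, 1, 1], dtype=float)

    # rank(A) = 2 < 3 rows: at least one constraint is redundant.
    print(np.linalg.matrix_rank(A))            # 2
    # The augmented matrix has the same rank, so the system is
    # compatible and a redundant row can be dropped, cf. (3.34).
    print(np.linalg.matrix_rank(np.c_[A, b]))  # 2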

To generalize this result to non linear constraints h(x) = 0, we must linearize


them by invoking Taylor’s theorem (Theorem C.1) around a point x+ , i.e.,

h(x) = h(x+) + ∇h(x+)T (x − x+) + o(‖x − x+‖).

In this case, by ignoring the term o of (C.1), h(x) = 0 can be approximated by

h(x+ ) + ∇h(x+ )T x − ∇h(x+ )T x+ = 0



or by
∇h(x+ )T x = ∇h(x+ )T x+ − h(x+ ).

Therefore, the gradients of equality constraints play a similar role as the rows of
the matrix A in (3.27). As for the inequality constraints, we saw in Section 3.1 that

those that are inactive at x+ can be ignored, and that those that are active can be
considered equality constraints. Consequently, we can define the condition of linear
independence as follows.

Definition 3.8 (Linear independence of the constraints). Consider the optimization


problem (1.71)–(1.73) minx∈Rn f(x) subject to h(x) = 0 and g(x) ≤ 0, and x+ a feasi-
ble point. The linear independence of the constraints is satisfied in x+ if the gradients
of the equality constraints and the gradients of the active inequality constraints in
x+ are linearly independent. By abuse of language, it is sometimes simply said that
the constraints are linearly independent.

Example 3.9 (Linear independence of constraints). Take an optimization problem


in R2 with the inequality constraint

g(x) = x1² + (x2 − 1)² − 1 ≤ 0 (3.36)

and the equality constraint


h(x) = x2 − x1² = 0. (3.37)

We have

∇g(x) = (2x1, 2x2 − 2)T   and   ∇h(x) = (−2x1, 1)T.

Figure 3.4: Linear independence of the constraints (the curves h(x) = x2 − x1² = 0 and g(x) = x1² + (x2 − 1)² − 1 = 0, with the points xa and xb)

Consider the point xa = (1, 1)T, that is feasible, and for which the constraint
(3.36) is active. We have

∇g(xa) = (2, 0)T   and   ∇h(xa) = (−2, 1)T.

These two vectors are linearly independent, and the linear independence of the con-
straints is satisfied in xa. Figure 3.4 represents the normalized vectors

∇g(xa)/‖∇g(xa)‖ = (1, 0)T   and   ∇h(xa)/‖∇h(xa)‖ = (−2√5/5, √5/5)T.

Consider the point xb = (0, 0)T, that is also feasible, and for which the constraint
(3.36) is active. We have

∇g(xb) = (0, −2)T   and   ∇h(xb) = (0, 1)T.

These vectors are represented as normalized in Figure 3.4. They are linearly depen-
dent because ∇g(xb ) = −2∇h(xb ), and the linear independence of the constraints is
not satisfied in xb .
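Checking Definition 3.8 at a point amounts to a rank computation on the matrix whose rows are the gradients of the equality constraints and of the active inequality constraints. A minimal sketch for Example 3.9 (illustrative, assuming NumPy):

    import numpy as np

    def gradients(x):
        """Gradients of the constraints of Example 3.9, stacked as rows.
        The inequality (3.36) is active at both points considered."""
        grad_g = np.array([2 * x[0], 2 * x[1] - 2])  # gradient of (3.36)
        grad_h = np.array([-2 * x[0], 1.0])          # gradient of (3.37)
        return np.vstack([grad_g, grad_h])

    for point in (np.array([1.0, 1.0]), np.array([0.0, 0.0])):
        J = gradients(point)
        ok = np.linalg.matrix_rank(J) == J.shape[0]
        print(point, "independent" if ok else "dependent")
    # [1. 1.] independent (rank 2)
    # [0. 0.] dependent   (rank 1, since grad g = -2 grad h)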

3.3 Feasible directions


A major difficulty when we develop optimization algorithms is to move within the
feasible set. The analysis of the feasible directions helps us in this task.

Definition 3.10 (Feasible direction). Consider the general optimization problem


(1.71)–(1.74), and the feasible point x ∈ Rn . A direction d is said to be feasible in x
if there exists η > 0 such that x + αd is feasible for any 0 < α ≤ η.

In short, it is a direction that can be followed, at least a little bit, while staying
within the feasible set. Some examples are provided in Figure 3.5, where the feasible
set is the polygon represented by thin lines, feasible directions are represented with
thick plain lines, and infeasible directions with thick dashed lines.

3.3.1 Convex constraints


When the set X of the constraints is convex, the identification of a feasible direction
in x ∈ X depends on the identification of a feasible point y ∈ X, other than x.

Figure 3.5: Feasible (plain) and infeasible (dashed) directions

Theorem 3.11 (Feasible direction in a convex set). Let X be a convex set, and
consider x, y ∈ X, y 6= x. The direction d = y − x is a feasible direction in x and
x + αd = x + α(y − x) is feasible for any 0 ≤ α ≤ 1.

Proof. According to Definition B.2 of a convex set.

Figure 3.6: Feasible direction d = y − x in a convex set

Corollary 3.12 (Feasible directions in an interior point). Let X ⊆ Rn be a subset
of Rn and x ∈ Rn an interior point of X. Then, any direction d ∈ Rn is feasible
in x.

Proof. According to the definition of an interior point (Definition 1.15), there exists
a (convex) neighborhood V = {z such that ‖z − x‖ ≤ ε} such that V ⊆ X and ε > 0.
Consider an arbitrary direction d, and let y = x + εd/‖d‖ be the point where the
direction intersects the border of the neighborhood. Since ‖y − x‖ = ε, then y ∈ V.
Since V is convex, Theorem 3.11 is invoked to demonstrate that d is feasible.
This result is particularly important. The fact that all directions are feasible at
an interior point gives freedom to algorithms in the selection of the direction. This
is what motivates the method of interior points, as described in Chapter 18.

3.3.2 Constraints defined by equations-inequations


Here we consider the problem (1.71)–(1.73). Since we have at our disposal the analytical
expressions of the constraints, we characterize the set of feasible directions directly from
these expressions. When the constraints are linear, the characterization is simple.

Theorem 3.13 (Feasible directions: linear case). Consider the optimization prob-
lem (3.26)–(3.28) min f(x) subject to Ax = b and x ≥ 0, and let x+ be a feasible
point. A direction d is feasible in x+ if and only if
1. Ad = 0, and
2. di ≥ 0 when x+i = 0.

Proof. =⇒ Direct implication. Let d be a feasible direction in x+. According to


Definition 3.10, there exists η > 0 such that x+ + αd is feasible for any 0 < α ≤ η.
It satisfies the constraint (3.27), i.e.,

b = A(x+ + αd) = Ax+ + αAd = b + αAd.

We have that Ad = 0 because α > 0. Consider i such that x+i = 0. Since x+ + αd
is feasible, we have

x+i + αdi = αdi ≥ 0,

and the second condition is satisfied.


⇐= Inverse implication. Let d be a direction satisfying the two conditions. Consider
the point x+ + αd. It satisfies the constraints (3.27) because Ad = 0. For the
constraints (3.28), we consider three types of indices:
– Consider i such that x+i = 0. In this case, x+i + αdi is non negative because
di ≥ 0 by hypothesis.
– Consider i such that x+i > 0 and di ≥ 0. The same conclusion holds.
– Consider i such that x+i > 0 and di < 0. These are the only indices for which
x+i + αdi may become negative. In this case, we have to determine the step
that can be taken in direction d while staying feasible. Define

η = min_{i | x+i > 0 and di < 0} (−x+i/di) > 0.

If we choose α ≤ η, we have

α ≤ η ≤ −x+i/di   ∀i such that x+i > 0 and di < 0,

and x+i + αdi ≥ 0 because di < 0.

Then, x+ + αd is feasible if α ≤ η. Since η is positive, d is a feasible direction


according to Definition 3.10.
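The quantity η appearing in this proof is the largest step that keeps x+ + αd non negative, a ratio test that reappears in the algorithms of later chapters. A minimal sketch (illustrative, assuming NumPy; the direction is supposed to already satisfy Ad = 0):

    import numpy as np

    def max_step(x, d):
        """Largest eta such that x + alpha * d >= 0 for all 0 < alpha <= eta,
        following the ratio defined in the proof of Theorem 3.13.
        Returns infinity when no component ever becomes negative."""
        mask = (x > 0) & (d < 0)
        if not mask.any():
            return np.inf
        return np.min(-x[mask] / d[mask])

    x = np.array([2.0, 1.0, 0.0])
    d = np.array([-1.0, -0.5, 1.0])
    print(max_step(x, d))  # 2.0 = min(2/1, 1/0.5)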

Corollary 3.14 (Combination of feasible directions: linear case). Consider the
optimization problem (3.26)–(3.28) min f(x) subject to Ax = b and x ≥ 0. Let
x+ be a feasible point and d1, . . . , dk feasible directions in x+. Then, the convex
cone generated by these directions contains feasible directions. That is, any
linear combination with non negative coefficients of these directions, i.e.,

d = ∑_{j=1}^{k} αj dj,  αj ≥ 0 ∀j, (3.38)

is a feasible direction.

Proof. The two conditions of Theorem 3.13 are trivially satisfied for d.
To switch to the non linear case, we must use the gradients of the constraints.
Before this, we propose to interpret Theorem 3.13 in terms of gradients.
The first condition of the theorem concerns equality constraints. We have seen
that the rows of the matrix A are the gradients of the constraints, i.e. ∇hi (x) = ai ,
with
hi (x) = aTi x − bi = 0.

In the linear case, the first condition can be written as

∇hi (x)T d = 0 i = 1, . . . , m.

The inequality constraints can be written as

gi (x) = −xi ≤ 0 i = 1, . . . , p,

and ∇gi(x) = (0, . . . , 0, −1, 0, . . . , 0)T, with the −1 in the ith position, so that
∇gi(x)T d = −di.
The second condition of the theorem can be expressed as follows: “If the constraint
gi (x) is active at x+ , then ∇gi (x+ )T d ≤ 0.” We should also note that if an inequality
constraint is not active at x+ , it does not involve any condition on the direction for
the latter to be feasible.

Unfortunately, the generalization of these results to the non linear case is not
trivial. We develop it in two steps. We first see how to characterize feasible directions
for an inequality constraint. We treat the equality constraint later.
We observe that the gradient of an inequality constraint at a point where it is
active points toward the outside of the constraints, as shown by Example 3.15.

Example 3.15 (Constraint gradients). Consider the subset of R2 defined by the


constraint

(1/2)(x1 − 1)² + (1/2)(x2 − 1)² ≤ 1/2,

and represented by Figure 3.7. Considering the formulation (1.73), we have

g(x) = (1/2)(x1 − 1)² + (1/2)(x2 − 1)² − 1/2,

and
∇g(x) = (x1 − 1, x2 − 1)T.
If we evaluate the gradient at different points along the border of the constraint, we
obtain directions pointing towards the outside of the feasible domain, as shown in
Figure 3.7.

Figure 3.7: Constraint gradient (the gradients evaluated along the border of the disc point towards the outside of the feasible domain)



Intuitively, the gradient direction and the directions that form an acute angle with
it cannot be considered as feasible directions.

Theorem 3.16 (Feasible directions: an inequality constraint). Let g : Rn −→ R be
a differentiable function, and consider x+ ∈ Rn such that g(x+) ≤ 0.
1. If the constraint g(x) ≤ 0 is inactive at x+, all directions are feasible in x+.
2. If the constraint is active at x+, and ∇g(x+) ≠ 0, a direction d is feasible in
x+ if
∇g(x+)T d < 0. (3.39)

Proof. 1. If the constraint is inactive at x+ , then x+ is an interior point, and Corol-


lary 3.12 applies.
2. Consider the case where the constraint is active at x+ , that is g(x+ ) = 0. Let d be
a direction satisfying (3.39). According to Definition 2.10, d is a descent direction
for g in x+ . We can apply Theorem 2.11 to determine that there exists η > 0 such
that
g(x+ + αd) < g(x+ ) = 0, ∀0 < α ≤ η,
and conclude that d is feasible.

It is important to note that (3.39) is a sufficient condition in order to obtain a


feasible direction when the constraint is active, but it is not a necessary condition.
Indeed, in the linear case, we have a similar condition, with a non strict inequality. We
still have to discuss the case where ∇g(x+ )T d = 0, which occurs when the gradient is
zero, or when the direction d is perpendicular to the gradient. In this case, we invoke
Theorem C.1 (Taylor’s theorem) to obtain

g(x+ + αd) = g(x+) + αdT ∇g(x+) + o(α‖d‖) = o(α).

In this case, nothing can guarantee that there exists α such that x+ + αd is feasible.
However, we can make the infeasibility as small as desired by choosing a sufficiently
small α.
We now analyze the feasible directions for an equality constraint

h(x) = 0.

We can express this constraint in an equivalent manner as

h(x) ≤ 0
−h(x) ≤ 0.

If x+ is feasible, these two inequality constraints are active. However, no direction d


can simultaneously satisfy (3.39) for the two inequalities. This condition is unusable
for equality constraints.

Since the equality constraints pose a problem, we address the problem in the other
direction. Instead of positioning ourselves on a feasible point x+ and wondering how
to reach another one, we attempt to identify the ways to reach x+ while remaining
feasible. To do this, we introduce the concept of feasible sequences.

Definition 3.17 (Feasible sequences). Consider the optimization problem (1.71)–


(1.74), and a feasible point x+ ∈ Rn . A sequence (xk )k∈N , with xk ∈ Rn for any k is
said to be a feasible sequence in x+ if the following conditions are satisfied:
1. lim_{k→∞} xk = x+,
2. there exists k0 such that xk is feasible for any k ≥ k0,
3. xk ≠ x+ for all k.
The set of feasible sequences in x+ is denoted by S(x+ ).

Example 3.18 (Feasible sequence). Consider the constraint in R2

h(x) = x1² − x2 = 0,

and the feasible point x+ = (0, 0)T . The sequence defined by


xk = (1/k, 1/k²)T

satisfies the three conditions of Definition 3.17 and belongs to S(x+ ). It is illustrated
in Figure 3.8.

Figure 3.8: Example of a feasible sequence (the points xk on the parabola h(x) = x1² − x2 = 0, converging to x+ = 0)

Given that the sequence (xk )k is feasible in x+ , we consider the directions con-
necting xk to x+ by normalizing them:
dk = (xk − x+)/‖xk − x+‖. (3.40)

We should keep in mind that these directions are generally not feasible directions.
We are looking at what happens at the limit.
Example 3.19 (Feasible direction at the limit). We consider once again Exam-
ple 3.18. We have xk − x+ = xk,

‖xk − x+‖ = √(k² + 1)/k²,

and
dk = (k/√(k² + 1), 1/√(k² + 1))T.

At the limit, we obtain
d = lim_{k→∞} dk = (1, 0)T.

These directions are illustrated in Figure 3.9. Note that

∇h(x+) = (0, −1)T   and   ∇h(x+)T d = 0.
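This limit is easy to observe numerically: the normalized directions dk stabilize as k grows. A short illustrative check (assuming NumPy, not part of the book):

    import numpy as np

    # Feasible sequence of Example 3.18 on h(x) = x1^2 - x2 = 0, with x+ = 0.
    for k in (1, 10, 100, 1000):
        xk = np.array([1.0 / k, 1.0 / k**2])
        dk = xk / np.linalg.norm(xk)
        print(k, dk)
    # dk approaches d = (1, 0), the feasible direction at the limit.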

Figure 3.9: Example of a feasible direction at the limit (the directions dk converging to d = (1, 0)T)

Unfortunately, it is not always possible to position oneself at the limit, as shown


in Example 3.20 where the sequence is not convergent, and contains two adherent
points. In this case, we should consider the limit of subsequences in order to identify
the feasible directions at the limit.

Example 3.20 (Feasible direction at the limit). As for Example 3.18, we consider
the constraint in R2 :
h(x) = x1² − x2 = 0,

and the feasible point x+ = (0, 0)T . The sequence defined by



xk = ((−1)^k/k, 1/k²)T

satisfies the three conditions of Definition 3.17 and belongs to S(x+). The calculation
of the directions gives

dk = ((−1)^k k/√(k² + 1), 1/√(k² + 1))T,

and lim_{k→∞} dk does not exist. However, if we consider the subsequences defined by
the even and odd indices, respectively, we obtain

d′ = lim_{k→∞} d2k = (1, 0)T

and

d′′ = lim_{k→∞} d2k+1 = (−1, 0)T,

as illustrated in Figure 3.10. Note once again that ∇h(x+)T d′ = ∇h(x+)T d′′ = 0.

Figure 3.10: Example of a feasible sequence (with the two directions d′ = (1, 0)T and d′′ = (−1, 0)T at the limit)

We can now formally define the notion of a feasible direction at the limit.

Definition 3.21 (Feasible direction at the limit). Consider the optimization problem
(1.71)–(1.74), and a feasible point x+ ∈ Rn. Let (xk)k∈N be a feasible sequence in
x+. The direction d ≠ 0 is a feasible direction at the limit in x+ for the sequence
(xk)k∈N if there exists a subsequence (xki)i∈N such that

d/‖d‖ = lim_{i→∞} (xki − x+)/‖xki − x+‖. (3.41)

Note that any feasible direction d is also a feasible direction at the limit. Just
take the feasible sequence xk = x+ + (1/k)d in Definition 3.21.

Definition 3.22 (Tangent cone). A feasible direction at the limit is also called a
tangent direction. The set of all tangent directions in x+ is called the tangent cone
and denoted by T (x+ ).

We can now make the connection between this concept and the constraint gradient.
According to Theorem 3.16 and the associated comments, we consider all directions
that form an obtuse angle with the active inequality constraint gradients and those
that are orthogonal to the equality constraint gradients.

Definition 3.23 (Linearized cone). Consider the optimization problem (1.71)–(1.74),


and a feasible point x+ ∈ Rn . We call the linearized cone in x+ , denoted by D(x+ ),
the set of directions d such that

dT ∇gi (x+ ) ≤ 0, ∀i = 1, . . . , p such that gi (x+ ) = 0, (3.42)

and
dT ∇hi (x+ ) = 0, i = 1, . . . , m, (3.43)
and of all their multiples, i.e.,

{αd|α > 0 and d satisfies (3.42) and (3.43)} . (3.44)
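Membership in the linearized cone is a finite collection of sign tests on inner products, which makes it directly computable. A minimal sketch (illustrative, assuming NumPy and a tolerance for the floating-point comparisons):

    import numpy as np

    def in_linearized_cone(d, grads_g_active, grads_h, tol=1e-10):
        """Check conditions (3.42)-(3.43): d^T grad g_i <= 0 for each active
        inequality constraint, d^T grad h_i = 0 for each equality constraint."""
        ok_g = all(d @ g <= tol for g in grads_g_active)
        ok_h = all(abs(d @ h) <= tol for h in grads_h)
        return ok_g and ok_h

    # Example 3.20 at x+ = 0: grad h(x+) = (0, -1), no inequality constraint.
    grad_h = [np.array([0.0, -1.0])]
    print(in_linearized_cone(np.array([1.0, 0.0]), [], grad_h))  # True
    print(in_linearized_cone(np.array([0.0, 1.0]), [], grad_h))  # False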

Theorem 3.24 (Feasible directions at the limit). Consider the optimization prob-
lem (1.71)–(1.74), and a feasible point x+ ∈ Rn. Any feasible direction at the
limit in x+ belongs to the linearized cone in x+ , that is

T (x+ ) ⊆ D(x+ ). (3.45)



Proof. Let d be a normalized feasible direction at the limit, and xk a feasible sequence
such that

d = lim_{k→∞} (xk − x+)/‖xk − x+‖. (3.46)

1. Consider an active inequality constraint, i.e.,

gi(x+) = 0.

For a sufficiently large k such that xk is feasible, we can write

gi(x+) + (xk − x+)T ∇gi(x+) + o(‖xk − x+‖) = gi(xk) ≤ 0

invoking Theorem C.1 (Taylor’s theorem), and

(xk − x+)T ∇gi(x+)/‖xk − x+‖ + o(‖xk − x+‖)/‖xk − x+‖ ≤ 0.

We now only have to position ourselves at the limit, and utilize (3.46) and Defi-
nition B.17 to obtain (3.42).
2. Consider an equality constraint,

hi (x+ ) = 0.

This constraint is equivalent to two inequality constraints

hi(x) ≤ 0
−hi(x) ≤ 0

which are both active at x+ . According to the first point already demonstrated,
we have 0 ≤ dT ∇hi (x+ ) and 0 ≤ −dT ∇hi (x+ ), and obtain (3.43).

Example 3.25 (Illustration of Theorem 3.24). Returning to Example 3.20, we can
observe in Figure 3.11 that the two feasible directions at the limit are orthogonal to
the constraint gradient in x+, i.e.,

∇h(x+) = (0, −1)T.

Theorem 3.24 does not yet provide a characterization of feasible directions at the
limit. Nevertheless, such a characterization is important, especially since the notion
of linearized cone is easier to handle than the concept of feasible direction at the limit.
Unfortunately, such a characterization does not exist in the general case. Therefore,
it is useful to assume that any element of the linearized cone is a feasible direction at
the limit. This hypothesis is called a constraint qualification.1

1 Several constraint qualifications have been proposed in the literature. This one is sometimes
called the Abadie Constraint Qualification, from the work by Abadie (1967).

Figure 3.11: Gradient and feasible directions at the limit (d′ and d′′ are orthogonal to ∇h(x+) on the parabola h(x) = x1² − x2 = 0)

Definition 3.26 (Constraint qualification). Consider the optimization problem


(1.71)–(1.74), and let x+ be a feasible point. The constraint qualification condi-
tion is satisfied if any element of the linearized cone in x+ is a feasible direction at
the limit in x+ , that is if
T (x+ ) = D(x+ ). (3.47)

This hypothesis, seemingly restrictive, is satisfied in a number of cases. In par-


ticular, when the constraints are defined solely by equations and inequations, each of
the following is sufficient for a constraint qualification.
• If the constraints (1.71)–(1.73) are linear, the constraint qualification is satisfied
at all feasible points.
• If the constraints are linearly independent in x+ (Definition 3.8), the constraint
qualification is satisfied at x+ .
• If there exists a vector d ∈ Rn such that
1. ∇hi (x+ )T d = 0, for any i = 1, . . . , m,
2. ∇gi (x+ )T d < 0 for any i = 1, . . . , p such that gi (x+ ) = 0
and such that the equality constraints are linearly independent in x+ , then the
constraint qualification is satisfied at x+ (Mangasarian and Fromovitz, 1967).
• If there is no equality constraint, the functions gi are convex, and there exists a
vector x− such that

gi (x− ) < 0 for any i = 1, . . . , p such that gi (x+ ) = 0,

the constraint qualification is satisfied at x+ (Slater, 1950).


We develop the proof for the first two conditions.

Theorem 3.27 (Characterization of feasible directions at the limit – I). Consider
the optimization problem (1.71)–(1.73), and a feasible point x+ ∈ Rn such that
all active constraints in x+ are linear. Every direction d such that ‖d‖ = 1 is
feasible at the limit in x+ if and only if it belongs to the linearized cone D(x+),
that is
T(x+) = D(x+). (3.48)

Proof. Theorem 3.24 shows that T (x+ ) ⊆ D(x+ ). To demonstrate that D(x+ ) ⊆
T (x+ ), consider a normalized direction2 d that belongs to the linearized cone D(x+ ).
We need to create a feasible sequence (xk )k , such that (3.41) is satisfied.
For each inequality constraint i active at x+ , we have

gi (x) = aTi x − bi , (3.49)

and for each equality constraint, we have

hi (x) = āTi x − b̄i . (3.50)

Following the arguments developed in Theorem 3.5, there exists ε such that all
constraints that are inactive at x+ are also inactive in any point of the sphere of
radius ε centered in x+ . Consider the sequence (xk )k with
xk = x+ + (ε/k) d,   k = 1, 2, . . . (3.51)
Each xk is situated in the sphere mentioned above, and satisfies the inequality con-
straints that are inactive at x+ . For the inequality constraints that are active at x+ ,
we have

gi(xk) = gi(xk) − gi(x+)            because gi(x+) = 0
       = aTi xk − bi − aTi x+ + bi   according to (3.49)
       = aTi (xk − x+)
       = (ε/k) aTi d                 according to (3.51).

Since d is in the linearized cone at x+ , ∇gi (x+ )T d = aTi d ≤ 0, and xk is feasible for
any inequality constraint. For the equality constraint, we obtain in a similar manner
hi(xk) = (ε/k) āTi d.

Since d is in the linearized cone at x+ , ∇hi (x+ )T d = āTi d = 0, and xk is feasible


for any equality constraint. The sequence (xk )k is indeed a feasible sequence, in the
sense of Definition 3.17.
2 i.e., such that ‖d‖ = 1.

Finally, we now need only deduce from ‖d‖ = 1 that, for any k,

(xk − x+)/‖xk − x+‖ = (ε/k)d/(ε/k) = d,

to conclude that d is indeed a feasible direction at the limit.



Theorem 3.28 (Characterization of feasible directions at the limit – II). Consider
the optimization problem (1.71)–(1.73), and a feasible point x+ ∈ Rn for which
the constraints are linearly independent. Any d such that ‖d‖ = 1 is feasible at
the limit at x+ if and only if it belongs to the linearized cone D(x+), that is

T(x+) = D(x+). (3.52)

Proof. Theorem 3.24 shows that T (x+ ) ⊆ D(x+ ). To demonstrate that D(x+ ) ⊆
T (x+ ), consider a normalized direction d that belongs to the linearized cone D(x+ ).
We create a feasible sequence (xk )k , such that (3.41) is satisfied. We create it implic-
itly and not explicitly.
To simplify the notations, we first assume that all constraints are equality con-
straints. We consider the Jacobian matrix of constraints in x+ , ∇h(x+ )T ∈ Rm×n ,
for which the rows are constraint gradients in x+ (see Definition 2.18). Since the
constraints are linearly independent, the Jacobian matrix is of full rank. Consider a
matrix Z ∈ Rn×(n−m) for which the columns form a basis of the kernel of ∇h(x+ )T ,
i.e., such that ∇h(x+ )T Z = 0. We apply the theorem of implicit functions (Theo-
rem C.6) to the parameterized function F : R × Rn → Rn defined by

F(µ, x) = ( h(x) − µ∇h(x+)T d ; ZT(x − x+ − µd) ). (3.53)

The assumptions of Theorem C.6 are satisfied for µ = 0 and x = x+ . Indeed,

∇x F(µ, x) = (∇h(x) Z)

is non singular since the columns of Z are orthogonal to those of ∇h(x), and since
the two submatrices are of full rank. Then, we have a function φ such that

x+ = φ(0) (3.54)

and, for µ sufficiently close to zero,

F(µ, φ(µ)) = ( h(φ(µ)) − µ∇h(x+)T d ; ZT(φ(µ) − x+ − µd) ) = 0. (3.55)

Since d is in the linearized cone, we deduce from the first part of (3.55)

h(φ(µ)) = µ∇h(x+)T d = 0 (3.56)



and φ(µ) is feasible. We use φ to build a feasible sequence. To do so, we show that
φ(µ) ≠ x+ when µ ≠ 0. Assume by contradiction that φ(µ) = x+. In this case,

F(µ, x+) = ( h(x+) − µ∇h(x+)T d ; ZT(x+ − x+ − µd) ) = ( −µ∇h(x+)T d ; −µZT d ) = 0. (3.57)

If µ ≠ 0, and since the matrices ∇h(x+)T and Z are of full rank, we deduce that d = 0,
which is impossible since ‖d‖ = 1. Then, if µ ≠ 0, we necessarily have φ(µ) ≠ x+.
We are now able to generate a feasible sequence. To do so, we consider a sequence
(µk)k such that lim_{k→∞} µk = 0. Then, the sequence

xk = φ(µk) (3.58)

satisfies all conditions to be a feasible sequence (see Definition 3.17). We now need
to demonstrate that d is a feasible direction at the limit.
For a sufficiently large k, such that µk is sufficiently close to zero, we use a Taylor
series of h around x+

h(xk) = h(x+) + ∇h(x+)T (xk − x+) + o(‖xk − x+‖)
      = ∇h(x+)T (xk − x+) + o(‖xk − x+‖)

in (3.55) to obtain

0 = F(µk, xk)
  = ( ∇h(x+)T (xk − x+) + o(‖xk − x+‖) − µk ∇h(x+)T d ; ZT(xk − x+ − µk d) )
  = ( ∇h(x+)T (xk − x+ − µk d) + o(‖xk − x+‖) ; ZT(xk − x+ − µk d) )
  = ( ∇h(x+)T ; ZT ) (xk − x+ − µk d) + o(‖xk − x+‖)
  = ( ∇h(x+)T ; ZT ) ( (xk − x+)/‖xk − x+‖ − (µk/‖xk − x+‖) d ) ‖xk − x+‖ + o(‖xk − x+‖).

Then, since the matrices ∇h(x+)T and ZT are of full rank, we have

lim_{k→∞} ( (xk − x+)/‖xk − x+‖ − (µk/‖xk − x+‖) d ) = 0. (3.59)

Define

d̃ = lim_{k→∞} (xk − x+)/‖xk − x+‖   and   c = lim_{k→∞} µk/‖xk − x+‖,

and (3.59) is written as

d̃ = cd.

Since ‖d̃‖ = ‖d‖ = 1, we have c = 1 and

lim_{k→∞} (xk − x+)/‖xk − x+‖ = d.

Then, d is indeed a feasible direction at the limit.


To be completely accurate, we must consider a case with inequality constraints.
For those that are active at x+ , the reasoning is identical, with the exception of (3.56)
which becomes
g(φ(µ)) = µ∇g(x+ )T d ≤ 0

from which we deduce the feasibility of φ(µ). Inactive constraints do not pose a prob-
lem, since there is a sphere around x+ such that all elements satisfy these constraints.
Since Definition 3.17 is asymptotic, we can always choose k sufficiently large such
that xk belongs to this sphere.

Feasible directions at the limit are an extension of the concept of a feasible direction.
They enable us to identify in which direction an infinitesimal displacement continues
to be feasible. Unfortunately, the definition is too complex to be operational. The
linearized cone, based on the constraint gradients, is directly accessible to the calcu-
lation. We usually assume that the constraint qualification is satisfied.

3.4 Elimination of constraints


Optimization problems without constraint are simpler than those with constraints.
We now analyze techniques to eliminate constraints.
We start with the optimization problem (3.26)–(3.28) min f(x) subject to Ax = b
and x ≥ 0, where the constraints are linear. We assume that we have a system
of constraints of full rank, obtained after eliminating any redundant constraint (see
Theorem 3.6). It is then possible to simplify the problem by eliminating certain
variables, as shown in Example 3.29.
Example 3.29 (Elimination of variables). Consider the following optimization prob-
lem:
min f(x1, x2, x3, x4) = x1² + sin(x3 − x2) + x4 + 1 (3.60)
subject to
x1 + x2 + x3 = 1
x1 − x2 + x4 = 1. (3.61)
We can rewrite the constraints in the following manner:

x3 = 1 − x1 − x2
x4 = 1 − x1 + x2. (3.62)

Thus, the optimization problem can be rewritten so as to depend only on two variables
x1 and x2 :

min f(x1, x2) = x1² + sin(−x1 − 2x2 + 1) − x1 + x2 + 2 (3.63)

without constraint.
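The equivalence between (3.60)–(3.61) and the unconstrained problem (3.63) can be verified numerically: reconstruct the eliminated variables from (3.62) and compare the objective values. A small illustrative sketch (assuming NumPy, not part of the book):

    import numpy as np

    def f(x1, x2, x3, x4):      # objective (3.60)
        return x1**2 + np.sin(x3 - x2) + x4 + 1

    def f_reduced(x1, x2):      # objective (3.63), after elimination
        return x1**2 + np.sin(-x1 - 2 * x2 + 1) - x1 + x2 + 2

    # For any (x1, x2), reconstruct x3 and x4 from (3.62) and compare.
    x1, x2 = 0.3, -0.7
    x3, x4 = 1 - x1 - x2, 1 - x1 + x2
    print(np.isclose(f(x1, x2, x3, x4), f_reduced(x1, x2)))  # True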

To generalize Example 3.29, we consider the constraints

Ax = b (3.64)

with A ∈ Rm×n , m ≤ n, x ∈ Rn and b ∈ Rm and rank(A) = m. We choose m



columns of A that are linearly independent corresponding to the variables that we


wish to eliminate. Apply a permutation P ∈ Rn×n of the columns of A in such a way
that the m selected columns are the leftmost columns:

AP = (B N) (3.65)

where B ∈ Rm×m contains the m first columns of AP, and N ∈ Rm×(n−m) contains
the n − m last ones. Recalling that PPT = I, we write (3.64) in the following manner

Ax = AP(PT x) = BxB + NxN = b (3.66)

where xB ∈ Rm contains the m first components of PT x, and xN ∈ Rn−m contains


the n − m last ones. Since the m first columns of AP are linearly independent, the
matrix B is invertible. We can write

xB = B−1 (b − NxN ). (3.67)

By adopting this convention, we consider the optimization problem with linear
equality constraints

min_{xB, xN} f( P (xB ; xN) ) (3.68)

subject to
BxB + NxN = b. (3.69)

It is equivalent to the unconstrained problem

min_{xN} f( P (B−1(b − NxN) ; xN) ). (3.70)

The variables xB are called basic variables, and the variables xN non basic variables.
Example 3.30 (Elimination of variables – II). In Example 3.29, we have

A = (1 1 1 0; 1 −1 0 1),   b = (1, 1)T.

The variables to eliminate are x3 and x4. They correspond to the last two columns
of the constraint matrix, and we choose the permutation matrix to make them the
first two, that is

P = (0 0 1 0; 0 0 0 1; 1 0 0 0; 0 1 0 0)

to obtain

AP = (B|N),   B = (1 0; 0 1),   N = (1 1; 1 −1),

and

xB = (x3, x4)T = B−1(b − NxN) = (1 − x1 − x2, 1 − x1 + x2)T,

which is exactly (3.62).
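The same computation can be reproduced in matrix form. An illustrative sketch (assuming NumPy; the column indices are 0-based in the code, so columns 2 and 3 correspond to x3 and x4):

    import numpy as np

    A = np.array([[1., 1., 1., 0.],
                  [1., -1., 0., 1.]])
    b = np.array([1., 1.])

    basic, nonbasic = [2, 3], [0, 1]      # eliminate x3 and x4
    B, N = A[:, basic], A[:, nonbasic]

    xN = np.array([0.3, -0.7])            # any values of the non basic variables
    xB = np.linalg.solve(B, b - N @ xN)   # (3.67)
    print(xB)                             # [1 - x1 - x2, 1 - x1 + x2] = [1.4, 0.]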

It is relatively easy to remove linear equality constraints. However, note that the
calculation of the matrix B−1 can be tedious, especially when m is large, and it is
sometimes preferable to explicitly maintain the constraints for the problem.
The elimination of non linear constraints can be problematic. An interesting
example, proposed as an exercise by Fletcher (1983) and again by Nocedal and Wright
(1999), illustrates this difficulty.
Example 3.31 (Elimination of non linear constraints). Consider the problem

min_x f(x1, x2) = x1² + x2²

subject to

(x1 − 1)³ = x2²
shown in Figure 3.12.

Figure 3.12: The problem in Example 3.31 (the feasible curve x2² = (x1 − 1)³ and the solution (1, 0))

The solution to this problem is (1, 0). If we eliminate x2, we obtain an optimization
problem without constraint

min_{x1} f̃(x1) = x1² + (x1 − 1)³.

However, this new problem has no solution since f̃ is unbounded, i.e.,

lim_{x1→−∞} f̃(x1) = −∞,

as shown in Figure 3.13. The problem is that the substitution can only be performed
if x1 ≥ 1, since x2² must necessarily be non negative. This implicit constraint in the
original problem should be explicitly incorporated in the problem after elimination.
It plays a crucial role since it is active at the solution.

Figure 3.13: The problem without constraint in Example 3.31 (the unbounded function f̃(x1) = x1² + (x1 − 1)³)

One must thus be cautious when eliminating non linear constraints.

3.5 Linear constraints


When the constraints are linear, a more detailed analysis can be performed. We first
give a geometric description of the constraints. Then, geometric concepts have their
algebraic counterparts.

3.5.1 Polyhedron
We analyze the linear constraints (3.27)–(3.28) from a geometrical point of view. The
central concept in this context is the polyhedron.

Definition 3.32 (Polyhedron). A polyhedron is a set of points of Rn delimited by


hyperplanes, i.e.,
{x ∈ Rn |Ax ≤ b} , (3.71)
with A ∈ Rm×n and b ∈ Rm .

By employing the techniques discussed in Section 1.2, it is always possible to


transform an optimization problem with general linear constraints into a problem
with the constraints Ax ≤ b. Thus, the set of feasible points in an optimization
problem with linear constraints is a polyhedron. To make the most of the technique
of elimination of variables mentioned above, it is helpful to use the representation of
a polyhedron called representation in standard form.

Definition 3.33 (Polyhedron represented in standard form). A polyhedron repre-


sented in standard form is a polyhedron defined in the following manner

{x ∈ Rn |Ax = b, x ≥ 0} , (3.72)

where A ∈ Rm×n and b ∈ Rm .

Note that according to Theorem 3.6, the matrix A is assumed to be of full rank
without loss of generality.
The identification of vertices or extreme points of a polyhedron is possible thanks
to the technique of elimination of variables described above. We begin by formally
defining a vertex.

Definition 3.34 (Vertex). Let P be a polyhedron. A vector x ∈ P is a vertex of P


if it is impossible to find two vectors y and z in P, different from x such that x is
a convex combination (Definition B.3) of y and z, i.e., such that there exists a real
number 0 < λ < 1 such that
x = λy + (1 − λ)z. (3.73)

Definition 3.34 is illustrated by Figure 3.14, where x is a vertex. If we choose
y ∈ P, it is impossible to find a z in P such that x is a convex combination of y and
z. On the other hand, x̃ is not a vertex, and represents a convex combination of ỹ
and z̃.
We can identify the vertices of a polyhedron represented in standard form by using
the following procedure:
1. Choose m variables to eliminate.
2. Identify the matrix B that contains the corresponding columns of A.
3. Take xN = 0.
4. In this case, (3.67) is expressed as xB = B−1 b. If xB ≥ 0, then x = (xTB, xTN)T is a
vertex of the polyhedron.

Figure 3.14: Illustration of Definition 3.34

We formalize this result with the following theorem.

Theorem 3.35 (Identification of vertices). Let P = {x ∈ Rn |Ax = b, x ≥ 0} be


a polyhedron represented in standard form, with A ∈ Rm×n of full rank and
b ∈ Rm and n ≥ m. Consider m linearly independent columns of A, and call
B the matrix containing these m columns, and N the matrix containing the
remaining n − m columns, such that

AP = (B|N) (3.74)

where P is the appropriate permutation matrix. Consider the vector


x = P ( B−1 b ; 0_{Rn−m} ). (3.75)

If B−1 b ≥ 0, then x is a vertex of P.

Proof. Without loss of generality, and to simplify the notations in the proof, we
assume that the m columns chosen are the m first ones, such that the permutation
matrix P = I. We assume by contradiction that there exists y, z ∈ P, y ≠ x, z ≠ x
and 0 < λ < 1 such that
x = λy + (1 − λ)z. (3.76)
After decomposition, we obtain
xB = λyB + (1 − λ)zB , (3.77)
and
xN = λyN + (1 − λ)zN . (3.78)
Since y and z are in P, we have yN ≥ 0 and zN ≥ 0. Since 0 < λ < 1, the only way
for xN = 0 is that yN = zN = 0. Then,
yB = B−1 (b − NyN ) = B−1 b = xB , (3.79)

and
zB = B−1 (b − NzN ) = B−1 b = xB . (3.80)
We obtain x = y = z, which contradicts the fact that y and z are different from x,
and proves the result.

It then appears that the vertices can be characterized by the set of active con-
straints.

Theorem 3.36 (Vertices and active constraints). Let P = {x ∈ Rn |Ax = b, x ≥ 0}


be a polyhedron represented in standard form, with A ∈ Rm×n and b ∈ Rm and
n ≥ m. Let x∗ ∈ P, and
A(x∗ ) = {i|x∗i = 0}
be the set of indices of the active constraints. x∗ is a vertex of P if and only if
the linear manifold

L(x∗ ) = {x ∈ Rn |Ax = b and xi = 0 ∀i ∈ A(x∗ )} (3.81)

is zero-dimensional, i.e., L(x∗ ) = {x∗ }.

Proof. =⇒ Direct implication. Let x∗ be a vertex. Assume by contradiction that


L(x∗ ) is not zero-dimensional. There is then a straight line in L(x∗ ) characterized
by the equation
x∗ + λd, λ ∈ R
with d ∈ Rn, d ≠ 0 and Ad = 0 such that di = 0 for any i ∈ A(x∗) (see
Theorem 3.13). For every i such that di ≠ 0, we define

αi = −x∗i/di.

According to Definition (3.81) of the linear manifold, αi ≠ 0, for any i. Indeed, if
di ≠ 0, then i ∉ A(x∗), and x∗i > 0. We are now able to find two points of the
polyhedron such that x∗ is a convex combination of these points, contradicting
the fact that it is a vertex.
the fact that it is a vertex.
Consider α1 = min_i {αi | αi > 0}. If no αi is positive, we take α1 = 1. Similarly,
α2 = max_i {αi | αi < 0}. If no αi is negative, we take α2 = −1. Then, the points
y = x∗ + α1 d and z = x∗ + α2 d belong by construction to the polyhedron P.
Moreover x∗ = λy + (1 − λ)z, with
λ = −α2/(α1 − α2).
Since α1 > 0 and α2 < 0, we have 0 < λ < 1, and x∗ is a convex combination of
y and z.
⇐= Inverse implication. Consider x∗ ∈ P such that L(x∗ ) = {x∗ }. We assume by
contradiction that x∗ is not a vertex of the polyhedron P. There then exists y

and z in P such that x∗ = (1/2)(y + z), by arbitrarily taking λ = 1/2 in Definition 3.34.
For all indices i such that x∗i = 0, the corresponding components of y and z are also
necessarily zero, because y ≥ 0 and z ≥ 0. Then, y and z belong to L(x∗), which
contradicts the fact that x∗ is the only element.

The characterization of vertices by active constraints is particularly useful when


developing algorithms. A more explicit representation than linear manifold is desir-
able. This is the concept of a feasible basic solution. Before introducing this notion,
we demonstrate that a non empty polyhedron represented in standard form always
contains at least one vertex.

Theorem 3.37 (Existence of a vertex). Let P = {x ∈ Rn |Ax = b, x ≥ 0} be a


polyhedron represented in standard form, with A ∈ Rm×n and b ∈ Rm and
n ≥ m. If P is non empty, it has at least one vertex.

Proof. We construct a finite number of points belonging to linear varieties (defined


by (3.81)) of decreasing dimension. The last one is the vertex of P, then proving its
existence.
Since P is non empty, there exists x0 ∈ P. If dim L(x0 ) = 0, x0 is a vertex
according to Theorem 3.36. Otherwise, there exists a straight line contained in L(x0 )
characterized by
x0 + λd, λ ∈ R
with d ∈ Rn, d ≠ 0 and Ad = 0 such that di = 0 for any i ∈ A(x0). For each i such
that di ≠ 0, we define

αi = −(x0)i/di.
According to Definition (3.81) of the linear manifold, αi ≠ 0, for all i. Indeed, if di ≠ 0,
then i ∉ A(x0), and (x0)i > 0. Without loss of generality, we can assume that there
exists at least one αi > 0 (if this is not the case, they are all non positive, and we
can utilize the same approach using the straight line defined by −d). We define

α∗ = min_{i | αi > 0} αi

and j an index for which the minimum is reached, i.e., α∗ = αj . The point x1 = x0 +
α∗ d belongs to the polyhedron by construction. Moreover, (x1 )j = 0 and (x0 )j > 0.
Then, the dimension of L(x1 ) is strictly inferior to that of L(x0 ). We now need
only repeat the procedure to obtain, after a certain number of iterations k at most
equal to the dimension of L(x0 ), a point xk such that dim L(xk ) = 0. According to
Theorem 3.36, xk is a vertex of P. This proof is illustrated in Figure 3.15, where
the linear manifold {x|Ax = b} is shown. In this example, L(x0 ) is the represented
plane, L(x1 ) is the straight line corresponding to the second coordinate axis, and
L(x2 ) = {x2 }.


Figure 3.15: Illustration of the proof of Theorem 3.37 (the points x0, x1 and x2 = xk)

3.5.2 Basic solutions


The notion of a vertex is a purely geometric concept. By invoking Theorem 3.35, it
is possible to characterize it algebraically. In this case, we speak of a feasible basic
solution.

Definition 3.38 (Basic solution). Let P = {x ∈ Rn |Ax = b, x ≥ 0} be a polyhedron


represented in standard form, with A ∈ Rm×n and b ∈ Rm and n ≥ m. A vector
x ∈ Rn such that Ax = b, together with a set of indices j1, . . . , jm, is said to be a basic
solution of P if
1. the matrix B = (Aj1 · · · Ajm ) composed of columns j1 , . . . , jm of the matrix A is
non singular and
2. xi = 0 if i ≠ j1, . . . , jm.
If, moreover, xB = B−1 b ≥ 0, the vector x is called a feasible basic solution.

It is common to say that the variables j1 , . . . , jm in Definition 3.38 are basic


variables, and that the others are non basic variables. Example 3.39 identifies the
basic solutions of a polygon, written in the form of a polyhedron represented in
standard form.
Example 3.39 (Basic solutions). Consider a polyhedron represented in standard
form

P = { x = (x1, x2, x3, x4)T | Ax = b, x ≥ 0 } (3.82)

with

A = (1 1 1 0; 1 −1 0 1),   b = (1, 1)T. (3.83)

Figure 3.16: Feasible domain of Example 3.39 (the polygon delimited by x1 + x2 = 1, x1 − x2 = 1, x1 ≥ 0 and x2 ≥ 0, with its vertices numbered 1 to 4)

In order to view it in R2, we represent the polygon

P̃ = { (x1, x2)T | x1 + x2 ≤ 1, x1 − x2 ≤ 1, x1 ≥ 0, x2 ≥ 0 } (3.84)

in Figure 3.16. Note that if (x1, x2, x3, x4)T ∈ P, then (x1, x2)T ∈ P̃. Furthermore,
if (x1, x2)T ∈ P̃, then (x1, x2, 1 − x1 − x2, 1 − x1 + x2)T ∈ P. The variables x3 and x4
are slack variables (Definition 1.4).
Each basic solution is obtained by selecting 2 variables out of 4 to be in the basis.
There is a total of 6 possible selections of basic variables.
1. Basic solution with x1 and x2 in the basis (j1 = 1, j2 = 2).

B = (1 1; 1 −1);  B−1 = (1/2 1/2; 1/2 −1/2);  xB = B−1 b = (1, 0)T;  x = (1, 0, 0, 0)T.

This basic solution is feasible and corresponds to point 2 in Figure 3.16.



2. Basic solution with x1 and x3 in the basis (j1 = 1, j2 = 3).

B = (1 1; 1 0);  B−1 = (0 1; 1 −1);  xB = B−1 b = (1, 0)T;  x = (1, 0, 0, 0)T.

This basic solution is feasible and also corresponds to point 2 in Figure 3.16.
3. Basic solution with x1 and x4 in the basis (j1 = 1, j2 = 4).

B = (1 0; 1 1);  B−1 = (1 0; −1 1);  xB = B−1 b = (1, 0)T;  x = (1, 0, 0, 0)T.

This basic solution is feasible and also corresponds to point 2 in Figure 3.16.
4. Basic solution with x2 and x3 in the basis (j1 = 2, j2 = 3).

B = (1 1; −1 0);  B−1 = (0 −1; 1 1);  xB = B−1 b = (−1, 2)T;  x = (0, −1, 2, 0)T.

This basic solution is not feasible because B−1 b ≱ 0. It corresponds to point 4 in
Figure 3.16.
5. Basic solution with x2 and x4 in the basis (j1 = 2, j2 = 4).

B = (1 0; −1 1);  B−1 = (1 0; 1 1);  xB = B−1 b = (1, 2)T;  x = (0, 1, 0, 2)T.

This basic solution is feasible and corresponds to point 3 in Figure 3.16.


6. Basic solution with x3 and x4 in the basis (j1 = 3, j2 = 4).

B = B−1 = (1 0; 0 1);  xB = B−1 b = (1, 1)T;  x = (0, 0, 1, 1)T.

This basic solution is feasible and corresponds to point 1 in Figure 3.16.
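The enumeration carried out by hand above can be automated: try every choice of m columns of A, solve BxB = b, and keep the solutions with xB ≥ 0. An illustrative sketch (assuming NumPy; the enumeration is exponential in n, so this is a check for tiny examples, not an algorithm):

    import itertools
    import numpy as np

    def feasible_basic_solutions(A, b, tol=1e-10):
        """Enumerate the feasible basic solutions of {x | Ax = b, x >= 0}."""
        m, n = A.shape
        solutions = []
        for cols in itertools.combinations(range(n), m):
            B = A[:, cols]
            if abs(np.linalg.det(B)) < tol:
                continue                   # columns not linearly independent
            xB = np.linalg.solve(B, b)
            if (xB >= -tol).all():         # feasibility: B^{-1} b >= 0
                x = np.zeros(n)
                x[list(cols)] = xB
                solutions.append(x)
        return solutions

    A = np.array([[1., 1., 1., 0.], [1., -1., 0., 1.]])
    b = np.array([1., 1.])
    for x in feasible_basic_solutions(A, b):
        print(x)   # the five feasible basic solutions found above,
                   # three of which coincide at vertex 2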

The notion of a basic solution (Definition 3.38) enables us to analyze the poly-
hedron in terms of active constraints of the optimization problem (Definition 3.4).
Let x be a feasible basic solution such that xB = B−1 b > 0. We say that it is non
degenerate. In this case, there are exactly n active constraints in x: the m equal-
ity constraints and the n − m non basic variables which are 0, and which make the

constraints of type xi ≥ 0 active. The constraints xi ≥ 0 corresponding to the basic


variables are all inactive because xB > 0. According to Theorem 3.5, the feasible
basic solution is defined by n equations. If A is of full rank and xB > 0, there is then
a bijective relationship between the vertices of the polyhedron and the feasible basic
solutions. This equivalence between an algebraic and a geometric concept is useful

when developing algorithms.

Theorem 3.40 (Equivalence between vertices and feasible basic solutions). Let P =
{x ∈ Rn |Ax = b, x ≥ 0} be a polyhedron. The point x∗ ∈ P is a vertex of P if and
only if it is a feasible basic solution.

Proof. =⇒ Let x∗ be a vertex of P. We assume that x∗ is not a feasible basic solution.


We can assume without loss of generality that the matrix A is of full rank (by
removing redundant constraints). As x∗ is not a feasible basic solution, there are
strictly more than m non zero components in x∗ . Consider m linearly indepen-
dent columns of A, corresponding to non zero components of x∗ , which form an
invertible matrix B, and where the remaining n − m columns form a matrix N.
Here, x∗ can be decomposed (see Section 3.4) into one basic component xB and
one non basic component xN such that

xB = B−1 (b − NxN ).

Since x∗ is not a feasible basic solution, there exists at least one component k of
xN that is not zero. We construct the direction d for which the basic component
is
dB = −B−1 Ak ,
where Ak is the kth column of A, and the non basic component for all zero
components, except the kth one which equals 1. Therefore,
Ad = BdB + NdN = −BB−1 Ak + ∑_{j non basic} Aj dj = −Ak + Ak = 0.

Then, for all α,
$$ A(x^* + \alpha d) = Ax^* + \alpha Ad = Ax^* = b. $$
Since x∗k > 0, it is possible to choose α1 > 0 and α2 > 0 sufficiently small so that x1 = x∗ + α1d and x2 = x∗ − α2d are in P. We take
$$ \lambda = \frac{\alpha_2}{\alpha_1 + \alpha_2}. $$
We have 0 < λ < 1 and x∗ = λx1 + (1 − λ)x2, which contradicts the fact that x∗ is a vertex of the polyhedron.
⇐= Theorem 3.35.

It is important to note that Theorem 3.40 does not guarantee a bijective relationship between the vertices of the polyhedron and the feasible basic solutions in all cases. Indeed, when some of the components of xB = B−1b are zero, there are more than n active constraints, and the feasible basic solution is defined by more than n equations in a space of n dimensions. We say in this case that we are dealing with a degenerate feasible basic solution.

Definition 3.41 (Degenerate feasible basic solution). Let P = {x ∈ Rn |Ax = b, x ≥


0} be a polyhedron represented in standard form, with A ∈ Rm×n and b ∈ Rm and
n ≥ m. A basic solution x ∈ Rn is said to be degenerate if more than n constraints
are active at x, i.e., if more than n − m components of x are zero.

In the presence of degeneracy, a vertex may correspond to multiple feasible basic


solutions. In Example 3.39, three constraints are active at vertex 2 (Figure 3.16), even
though we only need two constraints to characterize it. Then, the first three basic
solutions identified in the example all correspond to this vertex and are degenerate.

3.5.3 Basic directions


If x is a feasible basic solution, the feasible directions in x (there are infinitely many)
can be characterized by a finite number of directions, called basic directions, and
which correspond to the edges of the polyhedron of the constraints adjacent to the
vertex corresponding to the feasible basic solution x.
To define these basic directions, we consider a feasible basic solution
$$ x = \begin{pmatrix} x_B \\ x_N \end{pmatrix} = \begin{pmatrix} B^{-1}b \\ 0 \end{pmatrix} \tag{3.85} $$
where we assume, without loss of generality, that the indices of the basic variables are
the m first ones, and that B consists of the m first columns of the matrix A. Consider
a non basic variable, for instance the variable with index p, and define a direction
that gives positive values to this non basic variable, all the while maintaining the
other non basic variables at zero. Then
$$ d = \begin{pmatrix} d_B \\ d_N \end{pmatrix} = \begin{pmatrix} d_1 \\ \vdots \\ d_m \\ d_{m+1} \\ \vdots \\ d_{p-1} \\ d_p \\ d_{p+1} \\ \vdots \\ d_n \end{pmatrix} = \begin{pmatrix} d_1 \\ \vdots \\ d_m \\ 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}. \tag{3.86} $$
Since part dN of the direction is defined, we now need only define dB. For this, we invoke Theorem 3.13. To ensure that such a direction is feasible, the first condition is that Ad = 0. Then, denoting by Aj the jth column of A, we obtain
$$ Ad = Bd_B + Nd_N = Bd_B + \sum_{j=m+1}^{n} A_j d_j = Bd_B + A_p = 0, \tag{3.87} $$
that is,
$$ d_B = -B^{-1}A_p. \tag{3.88} $$

Definition 3.42 (Basic direction). Let P = {x ∈ Rn | Ax = b, x ≥ 0} be a polyhedron represented in standard form, with A ∈ Rm×n, b ∈ Rm and n ≥ m, and let x ∈ Rn be a feasible basic solution of P. A direction d is called the pth basic direction in x if p is the index of a non basic variable, and
$$ d_p = P \begin{pmatrix} d_{B_p} \\ d_{N_p} \end{pmatrix} \tag{3.89} $$
where P is the permutation matrix corresponding to the basic solution x, d_{B_p} = −B−1Ap, and d_{N_p} is such that
$$ P^T e_p = \begin{pmatrix} 0 \\ d_{N_p} \end{pmatrix}, \tag{3.90} $$
i.e., all the elements of d_{N_p} are zero, except the one corresponding to the variable p, which is 1.

Note that these directions are not always feasible, as discussed later. But first, we
illustrate the concept with Example 3.39.
Example 3.43 (Basic directions). Consider the polygon in Example 3.39, and the feasible basic solution where x2 and x4 are in the basis. Then
$$ x = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 2 \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}. $$
The basic direction corresponding to the non basic variable x1 is
$$ d_1 = P \begin{pmatrix} -B^{-1}A_1 \\ 1 \\ 0 \end{pmatrix} = P \begin{pmatrix} -1 \\ -2 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \\ 0 \\ -2 \end{pmatrix}, $$
and the basic direction corresponding to the non basic variable x3 is
$$ d_3 = P \begin{pmatrix} -B^{-1}A_3 \\ 0 \\ 1 \end{pmatrix} = P \begin{pmatrix} -1 \\ -1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \\ 1 \\ -1 \end{pmatrix}. $$

These two directions are shown in Figure 3.16, from the feasible basic solution 3. We now consider the feasible basic solution where x1 and x4 are in the basis (point 2 in Figure 3.16). Then,
$$ x = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}. $$
The basic direction corresponding to the non basic variable x2 is
$$ \tilde d_2 = P \begin{pmatrix} -B^{-1}A_2 \\ 1 \\ 0 \end{pmatrix} = P \begin{pmatrix} -1 \\ 2 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \\ 0 \\ 2 \end{pmatrix}, $$
and the basic direction corresponding to the non basic variable x3 is
$$ \tilde d_3 = P \begin{pmatrix} -B^{-1}A_3 \\ 0 \\ 1 \end{pmatrix} = P \begin{pmatrix} -1 \\ 1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 0 \\ 1 \\ 1 \end{pmatrix}. $$

These directions are not represented in Figure 3.16. Finally, consider the feasible basic solution where x1 and x2 are in the basis. Then,
$$ x = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = I. $$
The basic direction corresponding to the non basic variable x3 is
$$ \hat d_3 = \begin{pmatrix} -B^{-1}A_3 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} -1/2 \\ -1/2 \\ 1 \\ 0 \end{pmatrix}, \quad \text{with } B^{-1} = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & -1/2 \end{pmatrix}, $$
and the basic direction corresponding to the non basic variable x4 is
$$ \hat d_4 = \begin{pmatrix} -B^{-1}A_4 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -1/2 \\ 1/2 \\ 0 \\ 1 \end{pmatrix}. $$
These two directions are shown in Figure 3.16. Note that d̂4 is a feasible direction, whereas d̂3 is not.
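Definition 3.42 translates directly into a few lines of code. The sketch below (an illustration, not the book's) computes the pth basic direction by indexing instead of building the permutation matrix P explicitly, since P only reorders components; it reproduces d1 and d3 of the example.

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, -1.0, 0.0, 1.0]])

def basic_direction(A, basis, p):
    """p-th basic direction (Definition 3.42); indices are 0-based,
    and the permutation P is applied implicitly by indexing."""
    n = A.shape[1]
    d = np.zeros(n)
    d[basis] = -np.linalg.solve(A[:, basis], A[:, p])  # d_B = -B^{-1} A_p
    d[p] = 1.0                                         # non basic part of e_p
    return d

basis = [1, 3]                        # x2 and x4 in the basis (0-based)
print(basic_direction(A, basis, 0))   # d1: [ 1. -1.  0. -2.]
print(basic_direction(A, basis, 2))   # d3: [ 0. -1.  1. -1.]
```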

Theorem 3.44 (Feasible basic directions). Let P = {x ∈ Rn | Ax = b, x ≥ 0} be a polyhedron represented in standard form, with A ∈ Rm×n, b ∈ Rm and n ≥ m, and let x ∈ Rn be a feasible basic solution of P. If x is non degenerate (in the sense of Definition 3.41), then any basic direction is a feasible direction in x.

Proof. Let k be an arbitrary index of a non basic variable. According to Definition 3.42, we have Adk = 0. Moreover, since x is non degenerate, only its non basic components are zero. The corresponding components of dk are non negative by definition. Theorem 3.13 can be applied to prove the feasibility of dk.
The following theorem enables us to consider only the feasible basic directions in
order to characterize any feasible direction.

Theorem 3.45 (Combination of basic directions). Let P = {x ∈ Rn | Ax = b, x ≥ 0} be a polyhedron represented in standard form, with A ∈ Rm×n, b ∈ Rm and n ≥ m, and let x ∈ Rn be a feasible basic solution of P. Any feasible direction d in x can be written as a linear combination of the basic directions, i.e.,
$$ d = \sum_{j \in N} (d)_j \, d_j, \tag{3.91} $$
where N is the set of indices of the non basic variables, d_j ∈ Rn is the jth basic direction, and (d)_j ∈ R is the jth component of d.

Proof. Consider a feasible direction d, and assume without loss of generality that the basic variables are the m first ones. According to Theorem 3.13, we have
$$ Ad = Bd_B + Nd_N = 0 $$
and
$$ d_B = -B^{-1}Nd_N = -\sum_{j=m+1}^{n} (d)_j B^{-1}A_j, \tag{3.92} $$
where (d)_j ∈ R is the jth component of the vector d. By decomposing dN in the canonical basis, we can also write
$$ d_N = \sum_{j=m+1}^{n} (d)_j e_{j-m}, \tag{3.93} $$
where e_k ∈ Rn−m is a vector for which all the components are zero, except the kth one, which is 1. According to Definition 3.42, (3.92) and (3.93) are written as
$$ d = \begin{pmatrix} d_B \\ d_N \end{pmatrix} = \sum_{j=m+1}^{n} (d)_j \begin{pmatrix} -B^{-1}A_j \\ e_{j-m} \end{pmatrix} = \sum_{j=m+1}^{n} (d)_j d_j, \tag{3.94} $$
where d_j is the jth basic direction. We obtain (3.91).
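As a small numeric illustration of Theorem 3.45 (a sketch with arbitrary coefficients, not from the book), any combination of the basic directions of Example 3.43 satisfies Ad = 0, and the coefficients can be read off the non basic components of d:

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, -1.0, 0.0, 1.0]])

# Basic directions at the basic solution with x2 and x4 in the basis,
# taken from Example 3.43.
d1 = np.array([1.0, -1.0, 0.0, -2.0])
d3 = np.array([0.0, -1.0, 1.0, -1.0])

# Build a direction from arbitrary non basic components (d)_1 and (d)_3.
c1, c3 = 2.0, 0.5
d = c1 * d1 + c3 * d3

print(A @ d)        # [0. 0.]: the combination satisfies Ad = 0
print(d[0], d[2])   # 2.0 0.5: the coefficients are the non basic components
```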



The proof of Theorems 3.27 and 3.28 is inspired by Nocedal and Wright (1999).
That of Theorems 3.36 and 3.37 is inspired by de Werra et al. (2003).

3.6 Exercises

Exercise 3.1. Take the feasible set defined by the constraints

x1 − x2 + 2x3 − 2x4 + 3x5 = 3


x1 + x3 + 2x5 = 1
x2 − x3 + 2x4 − x5 = −2.

1. Identify a feasible point.


2. Verify whether the constraints are linearly independent.
Exercise 3.2. Take the feasible set defined by the constraints
$$ h(x_1, x_2) = \begin{pmatrix} (x_1 - 1)^2 + x_2^2 - 1 \\ (x_1 - 2)^2 + x_2^2 - 4 \end{pmatrix} = 0, $$
$$ g(x_1, x_2) = -e^{\sin(x_1) + \cos(x_2)} - x_1^2 - x_2^2 \leq 0. $$

1. Determine the set of feasible points.


2. For each one, determine the active constraints.
3. For each one, verify whether the condition of independence of the constraints
(Definition 3.8) is satisfied.
Exercise 3.3. Take the feasible set defined by the constraints

x1 + x2 ≤ 3
x1 + x3 ≤ 7
x1 ≥ 0
x2 ≥ 0
x3 ≥ 0.

1. Take the point x = (3 0 4)T . Characterize the linearized cone in x.


2. Express the constraints in standard form.
3. Identify the basic solutions, and among them, those that are feasible.
4. For each point corresponding to a feasible basic solution,
(a) characterize the linearized cone,
(b) identify the basic directions,
(c) verify whether the basic directions are in the linearized cone.
Exercise 3.4. Take the feasible set defined by the constraints

−x1 + x2 ≤1
x1 + 2x2 ≤ 4.

1. Provide a graphic representation of this feasible set.



2. Express the constraints in standard form.


3. List the basic solutions, and represent them on the graph.
4. For each feasible basic solution, list the basic directions and represent them on
the graph.

Exercise 3.5. Consider the feasible set defined by the following constraints:

x1 − x2 ≥ −2
2x1 + x2 ≤ 8
x1 + x2 ≤ 5
x1 + 2x2 ≤ 10
x1 ≥ 0
x2 ≥ 0

1. Provide a graphic representation of this feasible set.


2. Enumerate the vertices of D.
3. List the basic solutions and represent them on the graph.
4. For each feasible basic solution, list the basic directions and represent them on
the graph.
5. Reformulate the same set of constraints using a minimum number of constraints
(use the graphical representation to identify them).
Exercise 3.6. Take the feasible set defined by the constraints

(x1 + 1)2 + x22 ≤ 1


(x1 − 1)2 + x22 ≤ 1.

For each x and d below,


1. verify that x is feasible,
2. specify whether the direction d is feasible in x (justify!),
3. specify whether the direction d is feasible at the limit in x (justify!).
         x          d                x          d
      (0, 0)    (−1, −1)          (1, 0)    (−1, 0)
      (0, 0)    (1, 1)            (0, 1)    (−1, 1)
      (1, 0)    (0, 1)            (0, 1)    (−1, −1)
      (1, 0)    (0, −1)           (0, 1)    (1, 1)
      (1, 0)    (1, 0)            (0, 1)    (1, −1)
Exercise 3.7. Take the optimization problem min_{x∈R2} x1 + x2 subject to x1² + x2² = 2, and the point x̄ = (−√2, 0)T. Identify the feasible directions at the limit in x̄ by employing the following sequences:
$$ \begin{pmatrix} -\sqrt{2 - \frac{1}{k^2}} \\[4pt] \frac{1}{k} \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -\sqrt{2 - \frac{1}{k^2}} \\[4pt] -\frac{1}{k} \end{pmatrix}. $$
First verify that they are indeed feasible sequences.



Chapter 4

Introduction to duality

There are those who are subject to constraints and others who impose them. We
now take the point of view of the second category in order to analyze an optimization
problem from a different angle.

Contents
4.1 Constraint relaxation . . . . . . . . . . . . . . . . . . . . . 93
4.2 Duality in linear optimization . . . . . . . . . . . . . . . . 102
4.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

In Section 3.4, we attempted to explicitly eliminate the constraints of the optimiza-


tion problem by expressing one set of variables as a function of the others. In this
chapter, we use another technique to remove constraints. This technique, called con-
straint relaxation, plays an important role both theoretically and algorithmically. In
addition, it enables us to introduce the concept of duality.

4.1 Constraint relaxation


Here we introduce the concept of constraint relaxation with a simple example.
Example 4.1 (The mountaineer and the billionaire). A billionaire decides to offer a mountaineer a prize linked to the altitude he manages to climb, at a rate of €1 per meter. However, for reasons only he knows, the billionaire requires the mountaineer to stay in the Alps. The mountaineer immediately takes on the optimal strategy. He climbs Mont Blanc and pockets €4,807 (Figure 4.1(a)).
However, the mountaineer loves freedom and does not easily accept constraints. After some negotiation, the billionaire allows the climber to go elsewhere than in the Alps if he so desires, but he must then pay a fine. A problem arises for the billionaire. If the fine is too low, the climber will want to go to the Himalayas and climb Mount Everest, culminating at 8,848 meters (Figure 4.1(b)).
[Figure 4.1: Solutions for Example 4.1. (a) Mont Blanc (4,807 m); (b) Mount Everest (8,848 m)]

The billionaire thus decides to set the fine at €4,041. In this case, climbing Everest would give the mountaineer 8,848 − 4,041 = €4,807, which is exactly the same amount as if he decided to climb Mont Blanc. Therefore, the mountaineer has no interest in violating the constraint, and the final solution to the problem is the same as with the constraint.
We can model the problem of the climber by calling his position (longitude/latitude)
x and the corresponding altitude f(x). The first problem is a constrained optimization
problem:

max_x f(x)

subject to

x ∈ Alps .

The fine imposed by the billionaire is denoted by a(x), and depends on the position
x. In particular, a(x) = 0 if x ∈ Alps. The optimization problem is now without
constraints and can be expressed as

max_x f(x) − a(x).

Although somewhat imaginative, Example 4.1 shows us that an optimization problem can be seen from two points of view: that of the one solving the problem (the mountaineer), and that of the one who defines the rules of the game (the billionaire). If we would like to relax the constraints in order to remove them, we must put ourselves in the place of the billionaire, so that the new rules are consistent with the old ones. We now apply the same approach to another simple optimization problem.

János von Neumann was born on December 28, 1903, in Bu-


dapest. Originally, his name did not have the noble prefix “von”.
In 1913, his father bought a title of nobility. János took the
name John von Neumann when he became an American in 1937.
Although holder of a chemistry degree from ETH (Swiss Federal

Institute of Technology) in Zürich, he quickly turned to math-


ematics. The results by Gödel on incompleteness gave him the
incentive to abandon his work on axiomatization of set theory.
Within the framework of quantum mechanics, he unified the theories of Schrödinger
and Heisenberg. He is considered the father of game theory. It is in this context that
he developed the principle of duality. One of his most famous quotes is “If people
do not believe that mathematics is simple, it is only because they do not realize
how complicated life is!” Von Neumann died on February 8, 1957, in Washington, DC.
Figure 4.2: John von Neumann

Example 4.2 (Constraint relaxation). Consider the optimization problem

min_{x∈R2} 2x1 + x2 (4.1)

subject to
1 − x1 − x2 = 0
x1 ≥ 0 (4.2)
x2 ≥ 0

for which the solution is x∗ = (0, 1)T with an optimal value 1. We now relax the
constraint 1 − x1 − x2 = 0 and introduce a fine that is proportional to the violation
of the constraint, with a proportionality factor λ. This way, the fine is zero when the
constraint is satisfied. We obtain the following problem:

min_{x∈R2} 2x1 + x2 + λ(1 − x1 − x2) (4.3)

subject to
x1 ≥ 0
(4.4)
x2 ≥ 0 .

We examine different values of λ.

• If λ = 0, (4.3) becomes 2x1 + x2 and the solution to the problem is x∗ = (0, 0)T
with an optimal value of 0 (Figure 4.3). This solution violates the constraint of
the original problem, and the optimal value is lower. It is a typical case where
the penalty value is ill-suited, and where it becomes interesting to violate the
constraint.

[Figure 4.3: Objective function of Example 4.2: λ = 0]

[Figure 4.4: Objective function of Example 4.2: λ = 2]

• If λ = 2, (4.3) becomes 2 − x2 and the problem is unbounded, because the more


the value of x2 increases, the more the objective function decreases (Figure 4.4).
It is imperative to avoid such values of the penalty parameter, which generate
unbounded problems.
• Finally, if λ = 1, (4.3) becomes x1 + 1, and each x such that x1 = 0 is a solution
to the problem, with an optimal value of 1 (Figure 4.5). In this case, regardless of
the value of x2 , there is no way to get a better value than the optimal value of the
initial problem. The penalty parameter acts as a deterrent, and there is nothing
to be gained by violating the constraint.

[Figure 4.5: Objective function of Example 4.2: λ = 1]
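The three cases above can be reproduced numerically. The following sketch (ours; it assumes a recent SciPy) evaluates the inner minimization of (4.3)–(4.4) for a few values of λ, anticipating the dual function introduced below:

```python
import numpy as np
from scipy.optimize import linprog

def q(lam):
    """Minimum over x >= 0 of the relaxed objective of Example 4.2:
    (2 - lambda) x1 + (1 - lambda) x2 + lambda."""
    c = np.array([2.0 - lam, 1.0 - lam])
    res = linprog(c, bounds=[(0, None), (0, None)], method="highs")
    if res.status == 3:      # the relaxed problem is unbounded below
        return -np.inf
    return res.fun + lam     # add back the constant term

for lam in [0.0, 0.5, 1.0, 2.0]:
    print(f"lambda = {lam}: q(lambda) = {q(lam)}")
# q(0) = 0, q(0.5) = 0.5, q(1) = 1, q(2) = -inf, as in the discussion above
```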

Consider the optimization problem (1.71)–(1.73). We generalize this idea to in-


corporate constraints in the objective function. The function thus obtained is called
Lagrangian or the Lagrangian function.

Definition 4.3 (Lagrangian function). Consider the optimization problem (1.71)–(1.73) min f(x) subject to h(x) = 0 and g(x) ≤ 0, and consider the vectors λ ∈ Rm and µ ∈ Rp. The function L : Rn+m+p → R defined by
$$ L(x, \lambda, \mu) = f(x) + \lambda^T h(x) + \mu^T g(x) = f(x) + \sum_{i=1}^{m} \lambda_i h_i(x) + \sum_{j=1}^{p} \mu_j g_j(x) \tag{4.5} $$
is called Lagrangian or the Lagrangian function of the problem (1.71)–(1.73).

As we did in Example 4.2, we can minimize the Lagrangian function for each fixed
value of the parameters λ and µ. Indeed, the Lagrangian function now depends only
on x. The function that associates a set of parameters to the optimal value of the
associated problem is called a dual function.

Definition 4.4 (Dual function). Consider the optimization problem (1.71)–(1.73) and its Lagrangian function L(x, λ, µ) defined by (4.5). The function q : Rm+p → R defined by
$$ q(\lambda, \mu) = \min_{x \in \mathbb{R}^n} L(x, \lambda, \mu) \tag{4.6} $$
is the dual function of the problem (1.71)–(1.73). The parameters λ and µ are called dual variables. In this context, the variables x are called primal variables.

If we take Example 4.1, −q(λ, µ) represents the mountaineer's prize¹ if the billionaire imposes a fine λT h(x) + µT g(x) for violation of the constraints.
For inequality constraints, since only non negative values of g(x) should be avoided
and result in a fine, it is essential that µ ≥ 0. Indeed, the term µT g(x) is non negative,
and thus penalizing, only when g(x) > 0.

Theorem 4.5 (Bound from dual function). Let x∗ be the solution to the opti-
mization problem (1.71)–(1.73), and let q(λ, µ) be the dual function to the same
problem. Consider λ ∈ Rm and µ ∈ Rp , µ ≥ 0. Then,

q(λ, µ) ≤ f(x∗ ) , (4.7)

and the dual function provides lower bounds on the optimal value of the problem.

Proof.
$$ \begin{aligned} q(\lambda, \mu) &= \min_{x \in \mathbb{R}^n} L(x, \lambda, \mu) && \text{according to (4.6)} \\ &\leq L(x^*, \lambda, \mu) \\ &= f(x^*) + \lambda^T h(x^*) + \mu^T g(x^*) && \text{according to (4.5)} \\ &= f(x^*) + \mu^T g(x^*) && h(x^*) = 0 \\ &\leq f(x^*) && g(x^*) \leq 0 \text{ and } \mu \geq 0. \end{aligned} $$

Corollary 4.6 (Objective functions of the primal and dual). Let x be a feasible
solution of the optimization problem (1.71)–(1.73), and let q(λ, µ) be the dual
function to the same problem. Consider λ ∈ Rm and µ ∈ Rp , µ ≥ 0. Then,

q(λ, µ) ≤ f(x). (4.8)

Proof. Denote x∗ the optimal solution of the primal problem. As x is primal feasible, we have f(x∗) ≤ f(x). The result follows from Theorem 4.5.

If we take the point of view of the billionaire, the problem is to define these fines
in such a manner that the mountaineer wins as little as possible with the new system.
He tries to optimize the dual function, ensuring that the considered parameters λ and
µ ≥ 0 do not generate an unbounded problem. This optimization problem is called
the dual problem.

¹ The sign of q is changed because the problem with the mountaineer is one of maximization and not minimization.

Definition 4.7 (Dual problem). Consider the optimization problem (1.71)–(1.73) and its dual function q(λ, µ) defined by (4.6). Let Xq ⊆ Rm+p be the domain of q, i.e.,
$$ X_q = \left\{ (\lambda, \mu) \mid q(\lambda, \mu) > -\infty \right\}. \tag{4.9} $$
The optimization problem
$$ \max_{\lambda, \mu} q(\lambda, \mu) \tag{4.10} $$
subject to
$$ \mu \geq 0 \tag{4.11} $$
and
$$ (\lambda, \mu) \in X_q \tag{4.12} $$
is the dual problem of the problem (1.71)–(1.73). In this context, the original problem (1.71)–(1.73) is called the primal problem.

Example 4.8 (Dual problem). Take again Example 4.2:
$$ \min_{x \in \mathbb{R}^2} 2x_1 + x_2 \tag{4.13} $$
subject to
$$ \begin{aligned} h_1(x) &= 1 - x_1 - x_2 = 0 \quad &(\lambda) \\ g_1(x) &= -x_1 \leq 0 \quad &(\mu_1) \\ g_2(x) &= -x_2 \leq 0 \quad &(\mu_2). \end{aligned} \tag{4.14} $$
The Lagrangian function of this problem is
$$ L(x_1, x_2, \lambda, \mu_1, \mu_2) = 2x_1 + x_2 + \lambda(1 - x_1 - x_2) - \mu_1 x_1 - \mu_2 x_2 = (2 - \lambda - \mu_1)x_1 + (1 - \lambda - \mu_2)x_2 + \lambda. $$
In order for the dual function to be bounded, the coefficients of x1 and x2 have to be zero, and
$$ 2 - \lambda - \mu_1 = 0, \qquad 1 - \lambda - \mu_2 = 0, $$
or
$$ \mu_1 = 2 - \lambda, \qquad \mu_2 = 1 - \lambda. \tag{4.15} $$
Therefore, we can eliminate µ1 and µ2, so that
$$ X_q = \{ \lambda \mid \lambda \leq 1 \}, $$
and the dual function becomes
$$ q(\lambda) = \lambda. $$

The dual problem is written as
$$ \max \lambda $$
subject to
$$ \lambda \leq 1, $$
for which the solution is λ∗ = 1. According to the equalities (4.15), we have µ∗1 = 1 and µ∗2 = 0.

As a direct consequence of Theorem 4.5, the optimal value of this problem can
never exceed the optimal value of the original problem. This result is called the weak
duality theorem.

Theorem 4.9 (Weak duality). Let x∗ be the optimal solution to the primal prob-
lem (1.71)–(1.73) and let (λ∗ , µ∗ ) be the optimal solution to the associated dual
problem (4.10)–(4.12). Then

q(λ∗ , µ∗ ) ≤ f(x∗ ) . (4.16)

Proof. This theorem is a special case of Theorem 4.5 for λ = λ∗ and µ = µ∗ .

Corollary 4.10 (Duality and feasibility). Consider the primal problem (1.71)–
(1.73) and the associated dual problem (4.10)–(4.12).
• If the primal problem is unbounded, then the dual problem is not feasible.
• If the dual problem is unbounded, then the primal problem is not feasible.

Proof. If the optimal value of the primal problem is −∞, there is no dual variable
(λ, µ) that satisfies (4.16) and the dual problem is not feasible. Similarly, if the
optimal value of the dual problem is +∞, there is no primal variable x that satisfies
(4.16) and the primal problem is not feasible.

Corollary 4.11 (Optimality of the primal and the dual). Let x∗ be a feasible so-
lution of the primal problem (1.71)–(1.73) and let (λ∗ , µ∗ ) be a feasible solution
of the associated dual problem (4.10)–(4.12). If q(λ∗ , µ∗ ) = f(x∗ ), then x∗ is
optimal for the primal, and (λ∗ , µ∗ ) is optimal for the dual.

Proof. Consider any x feasible for the primal. From Theorem 4.5, we have
$$ f(x) \geq q(\lambda^*, \mu^*) = f(x^*), $$
proving the optimality of x∗. Similarly, consider any (λ, µ) feasible for the dual. From the same theorem, we have
$$ q(\lambda, \mu) \leq f(x^*) = q(\lambda^*, \mu^*), $$
proving the optimality of (λ∗, µ∗).

Corollary 4.12 (Duality and feasibility (II)). Consider the primal problem (1.71)–
(1.73) and the associated dual problem (4.10)–(4.12).
• If the primal problem is infeasible, then the dual problem is either unbounded
or infeasible.
• If the dual problem is infeasible, then the primal problem is either unbounded
or infeasible.

Proof. We show the contrapositive. If the dual problem is bounded and feasible, it has an optimal solution. From Corollary 4.11, the primal problem also has an optimal solution, and is therefore feasible. The second statement is shown in a similar way.

The dual problem has interesting geometric properties. Indeed, the objective
function to maximize is concave, and the domain Xq is convex.

Theorem 4.13 (Concavity-convexity of a dual problem). Let (4.10)–(4.12) be the


dual problem of an optimization problem. The objective function (4.10) is con-
cave, and the domain of the dual function (4.9) is convex.

Proof. Consider x ∈ Rn, γ = (λ, µ) and γ̄ = (λ̄, µ̄) ∈ Rm+p, such that µ, µ̄ ≥ 0, γ, γ̄ ∈ Xq and γ ≠ γ̄. Consider also α ∈ R such that 0 ≤ α ≤ 1. According to Definition 4.3, the Lagrangian is linear in the dual variables, so that
$$ L\big(x, \alpha\gamma + (1 - \alpha)\bar\gamma\big) = \alpha L(x, \gamma) + (1 - \alpha) L(x, \bar\gamma). $$
Taking the minimum, we obtain
$$ \min_x L\big(x, \alpha\gamma + (1 - \alpha)\bar\gamma\big) \geq \alpha \min_x L(x, \gamma) + (1 - \alpha) \min_x L(x, \bar\gamma) \tag{4.17} $$
or
$$ q\big(\alpha\gamma + (1 - \alpha)\bar\gamma\big) \geq \alpha q(\gamma) + (1 - \alpha) q(\bar\gamma), \tag{4.18} $$
which demonstrates the concavity of q (Definition 2.3). Since γ and γ̄ are in Xq, then q(γ) > −∞ and q(γ̄) > −∞. According to (4.18), we also have q(αγ + (1 − α)γ̄) > −∞, and this way αγ + (1 − α)γ̄ is in Xq, proving the convexity of Xq (Definition B.2).

Giuseppe Lodovico Lagrangia, born in Turin on January 25,


1736, is often considered a French mathematician, despite his
Italian origin, and is known under the name Joseph-Louis La-
grange. It was he himself who in his youth took the French ver-
sion of the name. In 1766, he succeeded Euler as director of the

mathematics section of the Academy of Sciences in Berlin. He


was the first professor of analysis at the Ecole Polytechnique in Paris (founded in 1794 under the name “Ecole Centrale des Travaux Publics”). He was a member of the Bureau des Longitudes, created on June 25, 1795.
Napoleon presented him with the Legion of Honor in 1808 and the Grand Cross of
the Imperial Order of Reunion on April 3, 1813, a few days before his death. He
contributed significantly to diverse areas, such as calculus, astronomy, analytical me-
chanics, probability, fluid mechanics and number theory. He oversaw the introduction
of the metric system, working with Lavoisier. He died on April 10, 1813, and is buried
in the Pantheon, in Paris. The funeral oration was given by Laplace.
Figure 4.6: Joseph-Louis Lagrange

4.2 Duality in linear optimization


We now analyze the dual problem in the context of linear optimization. We consider the following (primal) problem:
$$ \min_x c^T x \tag{4.19} $$

subject to
Ax = b
(4.20)
x≥0

and we have
h(x) = b − Ax and g(x) = −x .

Therefore, the Lagrangian function (4.5) can be written as
$$ L(x, \lambda, \mu) = c^T x + \lambda^T (b - Ax) - \mu^T x = \big(c - A^T\lambda - \mu\big)^T x + \lambda^T b. \tag{4.21} $$
The Lagrangian function is linear in x. The only possibility for it to be bounded is if it is constant, i.e.,
$$ c - A^T\lambda - \mu = 0. $$

In this case, the dual function is q(λ, µ) = λT b and the dual problem is written as
$$ \max_{\lambda, \mu} \lambda^T b \tag{4.22} $$
subject to
$$ \mu \geq 0, \qquad \mu = c - A^T\lambda. \tag{4.23} $$
By eliminating µ, renaming λ as x, and changing the maximization into a minimization, we obtain
$$ \min_x -b^T x \tag{4.24} $$
subject to
$$ A^T x \leq c. \tag{4.25} $$
This is another linear optimization problem. We calculate its dual problem. Since there are no equality constraints, we have
$$ L(x, \mu) = -b^T x + \mu^T (A^T x - c) = (-b + A\mu)^T x - \mu^T c. $$
Again, for this linear function to be bounded, it has to be constant, i.e., −b + Aµ = 0. The dual function is q(µ) = −µT c and the dual problem is written as
$$ \max_\mu -\mu^T c $$
subject to
$$ \mu \geq 0, \qquad A\mu = b. $$
By replacing µ by x, and transforming the maximization into a minimization, we obtain the original problem (4.19)–(4.20). The dual of the dual is the primal problem. We can now generalize these results.
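The primal–dual pair can be checked numerically. The sketch below (ours; it assumes a recent SciPy) solves the primal of Example 4.2 in standard form and the dual (4.22)–(4.23) with µ eliminated; the two optimal values coincide here, anticipating the strong duality theorem proved at the end of this chapter.

```python
import numpy as np
from scipy.optimize import linprog

# Primal of Example 4.2 in standard form: min c^T x s.t. Ax = b, x >= 0.
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([2.0, 1.0])
primal = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 2, method="highs")

# Dual (4.22)-(4.23) with mu eliminated: max b^T lambda s.t. A^T lambda <= c,
# solved as a minimization of -b^T lambda; lambda is a free variable.
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(None, None)], method="highs")

print("primal optimal value:", primal.fun)   # 1.0, attained at x = (0, 1)
print("dual optimal value:  ", -dual.fun)    # 1.0 as well
```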

Theorem 4.14 (Dual of a linear problem). Consider the following linear problem:
$$ \min_x c_1^T x_1 + c_2^T x_2 + c_3^T x_3 \tag{4.26} $$
subject to
$$ \begin{aligned} A_1 x_1 + B_1 x_2 + C_1 x_3 &= b_1 \\ A_2 x_1 + B_2 x_2 + C_2 x_3 &\leq b_2 \\ A_3 x_1 + B_3 x_2 + C_3 x_3 &\geq b_3 \\ x_1 &\geq 0 \\ x_2 &\leq 0 \\ x_3 &\in \mathbb{R}^{n_3}, \end{aligned} \tag{4.27} $$
where x1 ∈ Rn1, x2 ∈ Rn2, x3 ∈ Rn3, b1 ∈ Rm, b2 ∈ Rpi and b3 ∈ Rps. The matrices Ai, Bi, Ci, i = 1, 2, 3, have appropriate dimensions. The dual of this problem is
$$ \max_\gamma \gamma^T b = \gamma_1^T b_1 + \gamma_2^T b_2 + \gamma_3^T b_3 \tag{4.28} $$
subject to
$$ \begin{aligned} \gamma_1 &\in \mathbb{R}^m \\ \gamma_2 &\leq 0 \\ \gamma_3 &\geq 0 \\ A_1^T \gamma_1 + A_2^T \gamma_2 + A_3^T \gamma_3 = A^T \gamma &\leq c_1 \\ B_1^T \gamma_1 + B_2^T \gamma_2 + B_3^T \gamma_3 = B^T \gamma &\geq c_2 \\ C_1^T \gamma_1 + C_2^T \gamma_2 + C_3^T \gamma_3 = C^T \gamma &= c_3, \end{aligned} \tag{4.29} $$
with γ = (γ1T γ2T γ3T)T ∈ Rm+pi+ps and A = (A1T A2T A3T)T ∈ R(m+pi+ps)×n1. The matrices B and C are defined in a similar manner.

Proof. The Lagrangian function is written as
$$ \begin{aligned} L(x, \lambda, \mu_2, \mu_3, \mu_{x_1}, \mu_{x_2}) = {} & c_1^T x_1 + c_2^T x_2 + c_3^T x_3 \\ & + \lambda^T (b_1 - A_1 x_1 - B_1 x_2 - C_1 x_3) \\ & + \mu_2^T (A_2 x_1 + B_2 x_2 + C_2 x_3 - b_2) \\ & + \mu_3^T (b_3 - A_3 x_1 - B_3 x_2 - C_3 x_3) \\ & - \mu_{x_1}^T x_1 + \mu_{x_2}^T x_2, \end{aligned} $$
with µ2, µ3, µx1, µx2 ≥ 0. By combining the terms, we obtain
$$ \begin{aligned} L(x, \lambda, \mu_2, \mu_3, \mu_{x_1}, \mu_{x_2}) = {} & \lambda^T b_1 - \mu_2^T b_2 + \mu_3^T b_3 \\ & + \big(c_1 - A_1^T\lambda + A_2^T\mu_2 - A_3^T\mu_3 - \mu_{x_1}\big)^T x_1 \\ & + \big(c_2 - B_1^T\lambda + B_2^T\mu_2 - B_3^T\mu_3 + \mu_{x_2}\big)^T x_2 \\ & + \big(c_3 - C_1^T\lambda + C_2^T\mu_2 - C_3^T\mu_3\big)^T x_3. \end{aligned} $$
Define γ1 = λ, γ2 = −µ2 and γ3 = µ3. We immediately deduce that γ1 ∈ Rm, γ2 ≤ 0 and γ3 ≥ 0. We obtain the Lagrangian function
$$ \begin{aligned} L(x, \gamma, \mu_{x_1}, \mu_{x_2}) = {} & \gamma^T b + \big(c_1 - A^T\gamma - \mu_{x_1}\big)^T x_1 \\ & + \big(c_2 - B^T\gamma + \mu_{x_2}\big)^T x_2 + \big(c_3 - C^T\gamma\big)^T x_3. \end{aligned} $$
This is a linear function. For it to be bounded, it needs to be constant, and
$$ \mu_{x_1} = c_1 - A^T\gamma, \qquad \mu_{x_2} = B^T\gamma - c_2, \qquad C^T\gamma = c_3. $$
We now need only use µx1 ≥ 0 and µx2 ≥ 0 to obtain the result.

Note that the problem (4.26)–(4.27) combines all the possibilities of writing the constraints of a linear problem: equality, lower inequality, upper inequality, non positivity, and non negativity. The result can be summarized as follows:
• For each constraint of the primal there is a dual variable:

  Constraint of the primal    Dual variable
  =                           free
  ≤                           ≤ 0
  ≥                           ≥ 0

• For each primal variable there is a dual constraint:

  Primal variable    Dual constraint
  ≥ 0                ≤
  ≤ 0                ≥
  free               =
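These two tables translate mechanically into code. The following sketch (the naming is ours, not the book's) encodes them as dictionaries and applies them to the constraint types and variable signs of Example 4.16 below:

```python
# The two correspondence tables, encoded as dictionaries.
dual_variable_sign = {"=": "free", "<=": "<= 0", ">=": ">= 0"}
dual_constraint_type = {">= 0": "<=", "<= 0": ">=", "free": "="}

# Constraint types and variable signs of the primal of Example 4.16 below.
constraints = ["=", ">=", "<="]
variables = [">= 0", "<= 0", "free"]   # x1, x2, x3

for i, ctype in enumerate(constraints, start=1):
    print(f"constraint {i} ({ctype}) -> dual variable gamma_{i}: "
          f"{dual_variable_sign[ctype]}")
for j, vsign in enumerate(variables, start=1):
    print(f"variable x_{j} ({vsign}) -> dual constraint of type "
          f"{dual_constraint_type[vsign]}")
```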

Theorem 4.15 (The dual of the dual is the primal). Consider a (primal) linear
optimization problem. If the dual is converted into a minimization problem, and
we calculate its dual, we obtain a problem equivalent to the primal problem.

Proof. In the problem (4.28)–(4.29) of Theorem 4.14, we replace γ by −x and the maximization by a minimization:
$$ \min_{x_1, x_2, x_3} x_1^T b_1 + x_2^T b_2 + x_3^T b_3 $$
subject to
$$ \begin{aligned} x_1 &\in \mathbb{R}^m \\ x_2 &\geq 0 \\ x_3 &\leq 0 \\ A_1^T x_1 + A_2^T x_2 + A_3^T x_3 &\geq -c_1 \\ B_1^T x_1 + B_2^T x_2 + B_3^T x_3 &\leq -c_2 \\ C_1^T x_1 + C_2^T x_2 + C_3^T x_3 &= -c_3. \end{aligned} $$
According to Theorem 4.14, the dual of this problem is
$$ \max_{\gamma_1, \gamma_2, \gamma_3} -c_1^T \gamma_1 - c_2^T \gamma_2 - c_3^T \gamma_3 $$
subject to
$$ \begin{aligned} A_1 \gamma_1 + B_1 \gamma_2 + C_1 \gamma_3 &= b_1 \\ A_2 \gamma_1 + B_2 \gamma_2 + C_2 \gamma_3 &\leq b_2 \\ A_3 \gamma_1 + B_3 \gamma_2 + C_3 \gamma_3 &\geq b_3 \\ \gamma_1 &\geq 0 \\ \gamma_2 &\leq 0 \\ \gamma_3 &\in \mathbb{R}^{n_3}. \end{aligned} $$

We now need only replace γ by x, and convert the maximization into a minimization
to obtain (4.26)–(4.27) and prove the result.

Example 4.16 (Dual of a linear problem). Consider the linear optimization problem

min x1 + 2x2 + 3x3

subject to
−x1 + 3x2 = 5
2x1 − x2 + 3x3 ≥ 6
x3 ≤ 4
x1 ≥ 0
x2 ≤ 0
x3 ∈ R .

The dual problem is


max 5γ1 + 6γ2 + 4γ3

subject to
γ1 ∈ R
γ2 ≥ 0
γ3 ≤ 0
−γ1 + 2γ2 ≤1
3γ1 − γ2 ≥2
3γ2 + γ3 = 3 .

This is also a linear problem. We write it as a minimization problem and rename the
variables x.
min −5x1 − 6x2 − 4x3

subject to
x1 ∈ R
x2 ≥ 0
x3 ≤ 0
x1 − 2x2 ≥ −1
−3x1 + x2 ≤ −2
−3x2 − x3 = −3 .

We can calculate its dual:
$$ \max -\gamma_1 - 2\gamma_2 - 3\gamma_3 $$
subject to
$$ \begin{aligned} \gamma_1 - 3\gamma_2 &= -5 \\ -2\gamma_1 + \gamma_2 - 3\gamma_3 &\leq -6 \\ -\gamma_3 &\geq -4 \\ \gamma_1 &\geq 0 \\ \gamma_2 &\leq 0 \\ \gamma_3 &\in \mathbb{R}. \end{aligned} $$
It is easy to verify that this problem is equivalent to the original problem.
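As an aside, this pair also illustrates Corollary 4.10: since x1 ≥ 0 and x2 ≤ 0 force −x1 + 3x2 ≤ 0, the first primal constraint cannot be met, so the primal is infeasible, and the dual turns out to be unbounded. A small numeric check (ours; it assumes a recent SciPy):

```python
from scipy.optimize import linprog

# Primal of Example 4.16: min x1 + 2 x2 + 3 x3.
res_p = linprog([1.0, 2.0, 3.0],
                A_eq=[[-1.0, 3.0, 0.0]], b_eq=[5.0],
                A_ub=[[-2.0, 1.0, -3.0],   # 2x1 - x2 + 3x3 >= 6
                      [0.0, 0.0, 1.0]],    # x3 <= 4
                b_ub=[-6.0, 4.0],
                bounds=[(0, None), (None, 0), (None, None)],
                method="highs")
print(res_p.status)   # 2: infeasible

# Dual: max 5 g1 + 6 g2 + 4 g3, written as a minimization of the negative.
res_d = linprog([-5.0, -6.0, -4.0],
                A_ub=[[-1.0, 2.0, 0.0],    # -g1 + 2 g2 <= 1
                      [-3.0, 1.0, 0.0]],   # 3 g1 - g2 >= 2
                b_ub=[1.0, -2.0],
                A_eq=[[0.0, 3.0, 1.0]], b_eq=[3.0],
                bounds=[(None, None), (0, None), (None, 0)],
                method="highs")
print(res_d.status)   # 3: unbounded, in line with Corollary 4.10
```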

We conclude this chapter with an important result in linear optimization, called


the strong duality theorem.

Theorem 4.17 (Strong duality). Consider a linear optimization problem and its
dual. If one problem has an optimal solution, so does the other one, and the
optimal value of their objective functions are the same.

Proof. Consider A ∈ Rm×n , b ∈ Rm , c ∈ Rn , x ∈ Rn and λ ∈ Rm . Consider the


primal problem
min cT x
subject to
Ax = b, x ≥ 0,
and the dual problem
max bT λ
subject to
AT λ ≤ c.
Assume that the dual problem has an optimal solution λ∗. Therefore, for each λ that is dual feasible, we have bT λ ≤ bT λ∗. Equivalently, for any ε > 0, we have bT λ < bT λ∗ + ε. Therefore, there is no λ that verifies both the dual constraints AT λ ≤ c and bT λ ≥ bT λ∗ + ε or, equivalently, −bT λ ≤ −bT λ∗ − ε. In other words, the system of n + 1 linear inequalities, with m variables,
$$ \begin{pmatrix} A^T \\ -b^T \end{pmatrix} \lambda \leq \begin{pmatrix} c \\ -b^T\lambda^* - \varepsilon \end{pmatrix} $$

is incompatible. According to Farkas’ lemma (Lemma C.10), there exists a vector
$$ \begin{pmatrix} y \\ r \end{pmatrix}, $$
where y ∈ Rn, y ≥ 0, and r ∈ R, r ≥ 0, such that
$$ (y^T \; r) \begin{pmatrix} A^T \\ -b^T \end{pmatrix} = 0, $$

that is,
$$ Ay - rb = 0, \tag{4.30} $$
and
$$ (y^T \; r) \begin{pmatrix} c \\ -b^T\lambda^* - \varepsilon \end{pmatrix} < 0, $$
that is,
$$ c^T y - r\,b^T\lambda^* - r\varepsilon < 0. \tag{4.31} $$
We distinguish two cases.
r = 0: In this case, (4.30) is Ay = 0 and (4.31) is cT y < 0. Applying Farkas’ lemma to the compatible system AT λ ≤ c (it is verified at least by λ∗), we obtain that cT y ≥ 0 for each y ≥ 0 such that Ay = 0, contradicting (4.30)–(4.31). Therefore r ≠ 0.
r > 0: Divide (4.30) by r, and define x∗ = y/r to obtain
$$ Ax^* - b = 0. \tag{4.32} $$
As y ≥ 0 and r > 0, x∗ is feasible for the primal problem. Dividing (4.31) by r as well, we obtain
$$ c^T x^* - b^T\lambda^* - \varepsilon < 0. \tag{4.33} $$
Denote δ = cT x∗ − bT λ∗. By Corollary 4.6, as x∗ is primal feasible and λ∗ dual feasible, we know that δ = cT x∗ − bT λ∗ ≥ 0. Therefore, (4.33) is written as
$$ 0 \leq \delta < \varepsilon. $$
As this must be true for any arbitrarily small ε, we obtain δ = 0, and cT x∗ = bT λ∗. From Corollary 4.11, x∗ is the optimal solution of the primal problem.
As the dual of the dual is the primal (Theorem 4.15), the result holds in the other direction as well.
Another proof, based on the optimality conditions, is presented in Theorem 6.33.
Note that the strong duality result does not hold in general for all optimization
problems. Yet, it holds if the objective function is convex and the constraints linear
(see Bertsekas, 1999, Proposition 5.2.1).

4.3 Exercises
Exercise 4.1. Consider the optimization problem
$$ \min_{x \in \mathbb{R}^2} x_1^2 + x_2^2 \quad \text{subject to} \quad x_1 = 1. $$

1. Write the Lagrangian of this problem.


2. Write the dual function.
3. Write and solve the dual problem.

Exercise 4.2. Same questions as for Exercise 4.1 for the problem
$$ \min_{x \in \mathbb{R}^2} \frac{1}{2}(x_1^2 + x_2^2) \quad \text{subject to} \quad x_1 \geq 1. $$

Exercise 4.3. Consider a matrix A ∈ Rn×n such that AT = −A, and the vector c ∈ Rn. Consider the optimization problem
$$ \min_{x \in \mathbb{R}^n} c^T x $$

subject to
Ax ≥ −c
x ≥ 0.
Demonstrate that this problem is self-dual, i.e., that the dual problem is equivalent
to the primal problem.
Exercise 4.4. Consider the optimization problem

$$ \min_{x \in \mathbb{R}^2} -3x_1 + 2x_2 $$

subject to
x1 − x2 ≤ 2
−x1 + x2 ≤ −3
x1 , x2 ≥ 0.
1. Write the Lagrangian.
2. Write the dual function.
3. Write the dual problem.
4. Represent graphically the feasible set of the primal problem.
5. Represent graphically the feasible set of the dual problem.
Exercise 4.5. Same questions for the following problem.

min −x1 − x2
x∈R2

subject to
−x1 + x2 ≤ 1
1
x1 − x2 ≤ 0
2
x1 , x2 ≥ 0 .

Part II

Optimality conditions

As far as the laws of mathematics


refer to reality, they are not certain;
and as far as they are certain, they
do not refer to reality.

Albert Einstein

Before developing algorithms that enable us to identify solutions to an optimization


problem, we must be able to decide whether a given point is optimal or not. These
optimality conditions have three key roles in the development of algorithms:
1. they provide a theoretical analysis of the problem,
2. they directly inspire ideas for algorithms,
3. they render it possible to determine a stopping criterion for iterative algorithms.
We analyze them in detail and in a gradual manner, starting with the simplest
ones.

Chapter 5

Unconstrained optimization

Contents
5.1 Necessary optimality conditions . . . . . . . . . . . . . . . 115
5.2 Sufficient optimality conditions . . . . . . . . . . . . . . . 120
5.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

5.1 Necessary optimality conditions


Consider the unconstrained optimization problem (1.71) min_{x∈Rn} f(x) and a local minimum x∗, as defined in Definition 1.5. We attempt to characterize the minimum by using the results developed in Chapter 2. The first optimality condition is a generalization of a well-known result, attributed to Fermat.

Theorem 5.1 (Necessary optimality conditions). Let x∗ be a local minimum of a


function f : Rn → R. If f is differentiable in an open neighborhood V of x∗ , then,

∇f(x∗ ) = 0 . (5.1)

If, in addition, f is twice differentiable on V, then

∇2 f(x∗ ) is positive semidefinite . (5.2)

Condition (5.1) is said to be a first-order necessary condition, and condition


(5.2) is said to be a second-order necessary condition.

Proof. We recall that −∇f(x∗) is the direction of the steepest descent in x∗ (Theorem 2.13) and assume by contradiction that ∇f(x∗) ≠ 0. We can then use Theorem 2.11 with the descent direction d = −∇f(x∗) to obtain η such that
$$ f\big(x^* - \alpha \nabla f(x^*)\big) < f(x^*), \qquad \forall \alpha \in \,]0, \eta], $$
which contradicts the optimality of x∗ and demonstrates the first-order condition. To demonstrate the second-order condition, we invoke Taylor's theorem (Theorem C.2) in x∗, with an arbitrary direction d and an arbitrary step α > 0 such that x∗ + αd ∈ V. As
$$ f(x^* + \alpha d) - f(x^*) = \alpha d^T \nabla f(x^*) + \frac{1}{2}\alpha^2 d^T \nabla^2 f(x^*) d + o\big(\|\alpha d\|^2\big), $$
we have
$$ \begin{aligned} f(x^* + \alpha d) - f(x^*) &= \frac{1}{2}\alpha^2 d^T \nabla^2 f(x^*) d + o\big(\|\alpha d\|^2\big) && \text{from (5.1)} \\ &= \frac{1}{2}\alpha^2 d^T \nabla^2 f(x^*) d + o(\alpha^2) && \|d\| \text{ does not depend on } \alpha \\ &\geq 0 && x^* \text{ is optimal.} \end{aligned} $$
When we divide by α², we get
$$ \frac{1}{2} d^T \nabla^2 f(x^*) d + \frac{o(\alpha^2)}{\alpha^2} \geq 0. \tag{5.3} $$
Intuitively, as the second term can be made as small as desired, the result must hold. More formally, let us assume by contradiction that dT ∇2f(x∗)d is negative and that its value is −2ε, with ε > 0. According to the Landau notation o(·) (Definition B.17), for all ε > 0, there exists η such that
$$ \left| \frac{o(\alpha^2)}{\alpha^2} \right| < \varepsilon, \qquad \forall\, 0 < \alpha \leq \eta, $$
and
$$ \frac{1}{2} d^T \nabla^2 f(x^*) d + \frac{o(\alpha^2)}{\alpha^2} \leq \frac{1}{2} d^T \nabla^2 f(x^*) d + \left| \frac{o(\alpha^2)}{\alpha^2} \right| < \frac{1}{2}(-2\varepsilon) + \varepsilon = 0, $$
which contradicts (5.3) and proves that dT ∇2f(x∗)d ≥ 0. Since d is an arbitrary direction, ∇2f(x∗) is positive semidefinite (Definition B.8).
From a geometrical point of view, the second-order condition means that f is locally convex in x∗ (Theorem 2.21).
Example 5.2 (Affine function). Consider an affine function (see Definition 2.25):

f(x) = cT x + d, (5.4)

where c ∈ Rn is a vector of constants and d ∈ R. Then, ∇f(x) = c and ∇2 f(x) = 0.


Therefore, the necessary optimality conditions are verified for every x if c = 0, and
for no x if c 6= 0. The geometric interpretation is that an affine function is bounded
only if it is constant. We have used this property in Section 4.1 to derive the dual
problem.

Example 5.3 (Necessary optimality condition – I). Consider the function
$$ f(x_1, x_2) = 100\big(x_2 - x_1^2\big)^2 + (1 - x_1)^2 $$
illustrated in Figure 5.1 (see Section 11.6 for a discussion of this function). The point (1, 1)T is a local minimum of the function. We have
$$ \nabla f(x_1, x_2) = \begin{pmatrix} 400 x_1^3 - 400 x_1 x_2 + 2x_1 - 2 \\ 200 x_2 - 200 x_1^2 \end{pmatrix}, $$
which is indeed zero in (1, 1)T. Moreover,
$$ \nabla^2 f(x_1, x_2) = \begin{pmatrix} 1200 x_1^2 - 400 x_2 + 2 & -400 x_1 \\ -400 x_1 & 200 \end{pmatrix}, $$
which, in (1, 1)T, is
$$ \nabla^2 f(1, 1) = \begin{pmatrix} 802 & -400 \\ -400 & 200 \end{pmatrix}, $$
for which the eigenvalues are positive (0.39936 and 1,001.6) and the Hessian matrix is positive semidefinite. Note that the conditioning of f in (1, 1)T is high (2,508) and that the function is ill-conditioned at the solution (Section 2.5).

[Figure 5.1: Function of Example 5.3]
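The computations of Example 5.3 are easy to verify numerically; the following sketch (ours, not from the book) checks the gradient, the eigenvalues, and the conditioning at (1, 1)T:

```python
import numpy as np

def gradient(x1, x2):
    # Gradient of f(x1, x2) = 100 (x2 - x1^2)^2 + (1 - x1)^2
    return np.array([400 * x1**3 - 400 * x1 * x2 + 2 * x1 - 2,
                     200 * x2 - 200 * x1**2])

def hessian(x1, x2):
    return np.array([[1200 * x1**2 - 400 * x2 + 2, -400 * x1],
                     [-400 * x1, 200.0]])

print(gradient(1.0, 1.0))                  # [0. 0.]
eig = np.linalg.eigvalsh(hessian(1.0, 1.0))
print(eig)                                 # approx [0.39936, 1001.6]
print(eig.max() / eig.min())               # approx 2508: ill-conditioned
```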

It is important to emphasize that the necessary optimality conditions are not


sufficient, as shown by Examples 5.4 and 5.5.
Example 5.4 (Necessary optimality condition – II). Consider the function
$$ f(x_1, x_2) = -x_1^4 - x_2^4 $$
illustrated in Figure 5.2. The point (0, 0)T satisfies the necessary optimality conditions. Indeed,
$$ \nabla f(x_1, x_2) = \begin{pmatrix} -4x_1^3 \\ -4x_2^3 \end{pmatrix} $$
is zero in (0, 0)T. Moreover,
$$ \nabla^2 f(x_1, x_2) = \begin{pmatrix} -12 x_1^2 & 0 \\ 0 & -12 x_2^2 \end{pmatrix} $$
is positive semidefinite in (0, 0)T. However, this is not a local minimum. To demonstrate this, consider a non zero arbitrary direction d = (d1, d2)T and take a step α > 0 of any length from the point (0, 0)T. We have
$$ 0 = f(0, 0) > f(\alpha d_1, \alpha d_2) = -(\alpha d_1)^4 - (\alpha d_2)^4, $$
and (0, 0)T turns out to be a local maximum. From a geometrical point of view, the function is in fact concave and not convex in (0, 0)T.

[Figure 5.2: Function of Example 5.4]

Example 5.5 (Necessary optimality condition – III). Consider the function
$$ f(x_1, x_2) = 50 x_1^2 - x_2^3 $$
illustrated in Figure 5.3. The point (0, 0)T satisfies the necessary optimality conditions. Indeed,
$$ \nabla f(x_1, x_2) = \begin{pmatrix} 100 x_1 \\ -3x_2^2 \end{pmatrix} $$
is zero in (0, 0)T. Moreover,
$$ \nabla^2 f(x_1, x_2) = \begin{pmatrix} 100 & 0 \\ 0 & -6x_2 \end{pmatrix} $$
is positive semidefinite in (0, 0)T. However, it is not a local minimum. To show this, consider the direction d = (0, 1)T and take any step α > 0 from the point (0, 0)T. We have
$$ 0 = f(0, 0) > f(0, \alpha) = -\alpha^3, $$
and (0, 0)T is not a local minimum. However, if we consider the direction d = (0, −1)T, we obtain
$$ 0 = f(0, 0) < f(0, -\alpha) = \alpha^3. $$
Then, (0, 0)T is not a local maximum either. From a geometrical viewpoint, the function is neither concave nor convex in (0, 0)T.

[Figure 5.3: Function of Example 5.5]

In practice, the second-order necessary condition is difficult to check, as this re-


quires calculations of the second derivatives and analyses of the eigenvalues of the
Hessian matrix. The first-order necessary optimality condition plays a central role in
optimization. The vectors x that satisfy this condition are called critical points or
stationary points. Among them, there are local minima, local maxima, and points
that are neither (Example 5.5). The latter are called saddle points.

Definition 5.6 (Critical point). Let f : Rn → R be a differentiable function. Any


vector x ∈ Rn such that ∇f(x) = 0 is said to be a critical point or stationary point
of f.

5.2 Sufficient optimality conditions

Theorem 5.7 (Sufficient optimality conditions). Consider a function f : Rn → R twice differentiable in an open subset V of Rn and let x∗ ∈ V satisfy the conditions
$$ \nabla f(x^*) = 0 \tag{5.5} $$
and
$$ \nabla^2 f(x^*) \text{ is positive definite.} \tag{5.6} $$
In this case, x∗ is a local minimum of f.

Proof. We assume by contradiction that there exists a direction d and η > 0 such that, for any 0 < α ≤ η, f(x∗ + αd) < f(x∗). With an approach identical to the proof of Theorem 5.1, we have
$$ \frac{f(x^* + \alpha d) - f(x^*)}{\alpha^2} = \frac{1}{2} d^T \nabla^2 f(x^*) d + \frac{o(\alpha^2)}{\alpha^2} $$
and
$$ \frac{1}{2} d^T \nabla^2 f(x^*) d + \frac{o(\alpha^2)}{\alpha^2} < 0, $$
or
$$ \frac{1}{2} d^T \nabla^2 f(x^*) d + \frac{o(\alpha^2)}{\alpha^2} + \varepsilon = 0 $$
with ε > 0. According to the definition of the Landau notation o(·) (Definition B.17), there exists η̄ such that
$$ \left| \frac{o(\alpha^2)}{\alpha^2} \right| < \varepsilon, \qquad \forall \alpha,\ 0 < \alpha \leq \bar\eta, $$
and then, for any α ≤ min(η, η̄), we have
$$ -\frac{o(\alpha^2)}{\alpha^2} \leq \left| \frac{o(\alpha^2)}{\alpha^2} \right| < \varepsilon, $$
such that
$$ \frac{1}{2} d^T \nabla^2 f(x^*) d = -\frac{o(\alpha^2)}{\alpha^2} - \varepsilon < 0, $$
which contradicts the fact that ∇2f(x∗) is positive definite.

Example 5.8 (Optimality conditions). Consider the function
$$ f(x_1, x_2) = \frac{1}{2} x_1^2 + x_1 \cos x_2 $$
illustrated in Figure 5.4. We use the optimality conditions to identify the minima of this function. We have
$$ \nabla f(x_1, x_2) = \begin{pmatrix} x_1 + \cos x_2 \\ -x_1 \sin x_2 \end{pmatrix}. $$

[Figure 5.4: Function of Example 5.8: the surface]

This gradient is zero for x∗k = ((−1)^{k+1}, kπ)T, k ∈ Z, and for x̄k = (0, π/2 + kπ)T, k ∈ Z, as illustrated in Figure 5.5. We also have
$$ \nabla^2 f(x_1, x_2) = \begin{pmatrix} 1 & -\sin x_2 \\ -\sin x_2 & -x_1 \cos x_2 \end{pmatrix}. $$
By evaluating this matrix in x∗k, we get for any k
$$ \nabla^2 f(x^*_k) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. $$
Since this matrix is positive definite, each point x∗k satisfies the sufficient optimality conditions and is a local minimum of the function.
By evaluating the Hessian matrix in x̄k, we get for any k
$$ \nabla^2 f(\bar x_k) = \begin{pmatrix} 1 & (-1)^{k+1} \\ (-1)^{k+1} & 0 \end{pmatrix}. $$
[Figure 5.5: Function of Example 5.8: stationary points]

Regardless of k, this matrix is not positive semidefinite. Therefore, there is no x̄k


that satisfies the necessary optimality conditions. None of them can then be a local
minimum.
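A short numeric sweep over k confirms the analysis (a sketch, not from the book): the gradient vanishes at both families of points, the Hessian eigenvalues are (1, 1) at the x∗k, and one eigenvalue is negative at the x̄k:

```python
import numpy as np

def gradient(x):
    x1, x2 = x
    return np.array([x1 + np.cos(x2), -x1 * np.sin(x2)])

def hessian(x):
    x1, x2 = x
    return np.array([[1.0, -np.sin(x2)],
                     [-np.sin(x2), -x1 * np.cos(x2)]])

for k in range(-2, 3):
    x_star = np.array([(-1.0) ** (k + 1), k * np.pi])   # candidate minima
    x_bar = np.array([0.0, np.pi / 2 + k * np.pi])      # other critical points
    for x in (x_star, x_bar):
        norm_g = np.linalg.norm(gradient(x))
        eig = np.linalg.eigvalsh(hessian(x))
        print(f"x = {np.round(x, 3)}, |grad| = {norm_g:.1e}, "
              f"eigenvalues = {np.round(eig, 3)}")
```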

We now present a sufficient condition for a local minimum to also be a global


minimum.

Theorem 5.9 (Sufficient global optimality conditions). Consider a continuous


function f : Rn → R and let x∗ ∈ Rn be a local minimum of f. If f is a convex
function, then x∗ is a global minimum of f. If, moreover, f is strictly convex, x∗
is the only global minimum of f.

Proof. We assume by contradiction that there exists another local minimum x+ ≠ x∗ such that f(x+) < f(x∗). By the convexity of f (Definition 2.1), we have
$$ f\big(\alpha x^* + (1 - \alpha)x^+\big) \leq \alpha f(x^*) + (1 - \alpha) f(x^+), $$
where 0 ≤ α ≤ 1. Since f(x+) < f(x∗), we have for each α ∈ [0, 1[
$$ f\big(\alpha x^* + (1 - \alpha)x^+\big) < \alpha f(x^*) + (1 - \alpha) f(x^*) = f(x^*). \tag{5.7} $$
It means that any point strictly between x∗ and x+ is also strictly better than x∗. Consider an arbitrary ε > 0, and demonstrate that Definition 1.5 of the local minimum is contradicted. If ε ≥ ‖x∗ − x+‖, (1.75) is not satisfied for x = x+, when taking α = 0 in (5.7). If ε < ‖x∗ − x+‖, consider 0 < η < 1 such that ‖ηx∗ + (1 − η)x+ − x∗‖ = ε. In this case, (1.75) is not satisfied for x = αx∗ + (1 − α)x+ with η ≤ α < 1, according to (5.7). Since η < 1, such α always exist.

We now consider a strictly convex function, and assume that x∗ and y∗ are two distinct global minima; then x∗ ≠ y∗ and f(x∗) = f(y∗). According to Definition 2.2, we have
$$ f\big(\alpha x^* + (1 - \alpha)y^*\big) < \alpha f(x^*) + (1 - \alpha) f(y^*) = f(x^*) = f(y^*), \qquad \forall \alpha \in \,]0, 1[, $$
which contradicts that x∗ and y∗ are global minima.

Pierre de Fermat was born in Beaumont-de-Lomagne close to


Montauban on August 20, 1601, and died in Castres on January
12, 1665. With the exception of a few isolated articles, Fermat
never published and never gave any publicity to his methods.
Some of his most important results were written in the mar-
gin of books, the most often without proof. For instance, his
“observations on Diophante,” an important part of his work on
number theory, was published by his son on the basis of margin
notes in a copy of Arithmetica. Fermat’s conjecture is probably
the most famous of his intuitions. It affirms that when n ≥ 3,
there exist no non zero integers x, y and z such that x^n + y^n = z^n. He wrote the following note in the margin of
Arithmetica by Diophante: “I have a marvelous demonstration, but this margin is
too narrow to contain it.” This conjecture, called Fermat’s last theorem, was proven
by Wiles (1995). Fermat’s body was transferred from Castres to the Augustinian
Convent in Toulouse in 1675.
Figure 5.6: Pierre de Fermat

We conclude this chapter with a discussion of the optimality conditions for quadratic
problems (Definition 2.28).

Theorem 5.10 (Optimality conditions for quadratic problems). We consider the problem
$$ \min_{x \in \mathbb{R}^n} f(x) = \frac{1}{2} x^T Q x + g^T x + c, \tag{5.8} $$
where Q ∈ Rn×n is a symmetric matrix, g ∈ Rn and c ∈ R.
1. If Q is not positive semidefinite, then the problem (5.8) has no solution, i.e., there is no x ∈ Rn that is a local minimum of (5.8).
2. If Q is positive definite, then
$$ x^* = -Q^{-1}g \tag{5.9} $$
is the only global minimum of (5.8).
3. If Q is positive semidefinite, but not positive definite, either the problem is not bounded or there is an infinite number of global minima. More precisely, we consider the Schur decomposition of Q (see Theorem C.5):
$$ Q = U \Lambda U^T, \tag{5.10} $$
where U is an orthogonal matrix (composed of the eigenvectors of Q organized in columns) and Λ is a diagonal matrix with the eigenvalues of Q as entries. As Q is positive semidefinite but not positive definite, it is rank deficient, so that r eigenvalues are positive and n − r are zero. We can sort the indices in such a way that the r first eigenvalues on the diagonal are positive, and the n − r last are zero:
$$ \Lambda = \begin{pmatrix} \Lambda_r & 0 \\ 0 & \Lambda_{n-r} \end{pmatrix} = \begin{pmatrix} \Lambda_r & 0 \\ 0 & 0 \end{pmatrix} = \mathrm{diag}(\lambda_1, \ldots, \lambda_r, 0, \ldots, 0). \tag{5.11} $$
We decompose the vectors x and g as follows:
$$ U^T x = \begin{pmatrix} y_r \\ y_{n-r} \end{pmatrix} \quad \text{and} \quad U^T g = \begin{pmatrix} g_r \\ g_{n-r} \end{pmatrix}, \tag{5.12} $$
where y_r, g_r ∈ Rr and y_{n−r}, g_{n−r} ∈ Rn−r. Therefore, if g_{n−r} ≠ 0, the problem is unbounded. If g_{n−r} = 0, then for any y_{n−r} ∈ Rn−r,
$$ x^* = U \begin{pmatrix} -\Lambda_r^{-1} g_r \\ y_{n-r} \end{pmatrix} \tag{5.13} $$
is a global minimum of (5.8).

Proof. We have ∇f(x) = Qx + g and ∇2f(x) = Q.
1. We assume by contradiction that there exists a local minimum x∗ of (5.8). According to (5.2) of Theorem 5.1, ∇2f(x) = Q is positive semidefinite, which contradicts the hypothesis.
2. Since Q is positive definite, the point x∗ in (5.9) is well defined and
$$ \nabla f(x^*) = -QQ^{-1}g + g = 0. $$
The sufficient optimality conditions (5.5) and (5.6) are satisfied and x∗ is a local minimum of f. Moreover, according to Theorem 2.21, f is strictly convex. According to Theorem 5.9, x∗ is the only global minimum.
3. Using the Schur decomposition, and the fact that U is orthogonal (so that UUT = I), we write the objective function of (5.8) as
$$ f(x) = \frac{1}{2} x^T Q x + g^T x + c = \frac{1}{2} x^T U \Lambda U^T x + g^T U U^T x + c. \tag{5.14} $$
Using (5.12), we obtain
$$ f(y_r, y_{n-r}) = \frac{1}{2} y_r^T \Lambda_r y_r + g_r^T y_r + g_{n-r}^T y_{n-r} + c. \tag{5.15} $$
The gradient is
$$ \nabla f(y) = \begin{pmatrix} \Lambda_r y_r + g_r \\ g_{n-r} \end{pmatrix}. $$
If g_{n−r} ≠ 0, the gradient is different from zero for any value of y, and the necessary optimality condition is never verified. Now, if g_{n−r} = 0, the variables y_{n−r} do not affect the objective function. We fix y_{n−r} to any arbitrary value and solve the problem for y_r:
$$ \min f(y_r) = \frac{1}{2} y_r^T \Lambda_r y_r + g_r^T y_r. \tag{5.16} $$
As Λ_r is positive definite, the second result of this theorem applies, and
$$ y_r^* = -\Lambda_r^{-1} g_r. \tag{5.17} $$
We obtain (5.13) using (5.12).

The last result of Theorem 5.10 has a geometric interpretation. The Schur de-
composition actually identifies one subspace where the quadratic function is strictly
convex (the subspace corresponding to the positive eigenvalues), and one subspace
where the function is linear (the subspace corresponding to zero eigenvalues). In this
latter subspace, in order to guarantee that the function is bounded, the linear part
must be constant, which corresponds to the condition gn−r = 0 (see Example 5.2).
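Theorem 5.10 can be turned into a small solver for quadratic problems. The sketch below (ours; the tolerance is arbitrary) follows the three cases, using the spectral decomposition computed by numpy.linalg.eigh:

```python
import numpy as np

def quadratic_minimizer(Q, g, tol=1e-10):
    """Follows Theorem 5.10 for min (1/2) x^T Qx + g^T x (a sketch;
    the tolerance is arbitrary)."""
    lam, U = np.linalg.eigh(Q)            # Q = U diag(lam) U^T
    if lam.min() < -tol:
        return "no local minimum: Q is not positive semidefinite"
    pos = lam > tol
    g_rot = U.T @ g                       # (g_r, g_{n-r}) of (5.12)
    if np.any(np.abs(g_rot[~pos]) > tol):
        return "unbounded: g_{n-r} is not zero"
    y = np.zeros_like(g_rot)
    y[pos] = -g_rot[pos] / lam[pos]       # y_r = -Lambda_r^{-1} g_r
    return U @ y                          # a global minimum, as in (5.13)

Q = np.array([[2.0, 0.0], [0.0, 0.0]])   # semidefinite, rank deficient
print(quadratic_minimizer(Q, np.array([-4.0, 0.0])))   # [2. 0.]
print(quadratic_minimizer(Q, np.array([-4.0, 1.0])))   # unbounded
```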

5.3 Exercises
For the following optimization problems:
1. Calculate the gradient and the Hessian of the objective function.
2. Identify the critical points.
3. Eliminate those that do not satisfy the necessary optimality conditions.
4. Identify those that satisfy the sufficient optimality conditions.
Exercise 5.1. min_{x∈R2} x1² + x2².

Exercise 5.2. min_{x∈R2} (1/3)x1³ + x2³ − x1 − x2.

Exercise 5.3. min_{x∈R} x² + 1/(x − 2).

Exercise 5.4. min_{x∈R2} x1⁶ − 3x1⁴x2² + 3x1²x2⁴ − x2⁶.

Exercise 5.5. min_{x∈R2} f(x), where f is defined by one of the functions of Exercise 2.2.

Chapter 6

Constrained optimization

Contents
6.1 Convex constraints . . . . . . . . . . . . . . . . . . . . . . 128
6.2 Lagrange multipliers: necessary conditions . . . . . . . . 133
6.2.1 Linear constraints . . . . . . . . . . . . . . . . . . . . . . 133
6.2.2 Equality constraints . . . . . . . . . . . . . . . . . . . . . 137
6.2.3 Equality and inequality constraints . . . . . . . . . . . . . 142
6.3 Lagrange multipliers: sufficient conditions . . . . . . . . 152
6.3.1 Equality constraints . . . . . . . . . . . . . . . . . . . . . 153
6.3.2 Inequality constraints . . . . . . . . . . . . . . . . . . . . 154
6.4 Sensitivity analysis . . . . . . . . . . . . . . . . . . . . . . 159
6.5 Linear optimization . . . . . . . . . . . . . . . . . . . . . . 165
6.6 Quadratic optimization . . . . . . . . . . . . . . . . . . . . 171
6.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

The development of optimality conditions in the presence of constraints is based on


the same intuition as in the unconstrained case: it is impossible to descend from a
minimum. However, we can no longer apply the optimality conditions described in
Chapter 5, as illustrated by the following example.
Example 6.1 (Necessary optimality condition without constraint). Consider the problem
$$ \min f(x) = x^2 $$
subject to
$$ x \geq 1. $$
The solution to the problem is x = 1. And yet, f′(1) = 2 ≠ 0.

Here, instead of verifying that no direction is a descent direction, we must only


take into account the feasible directions and, if there is none, the feasible directions
at the limit (see Definition 3.21 and the discussions of Section 3.3). Theorem 6.2
expresses that if x∗ is a local minimum, no feasible direction at the limit is a descent
direction.

Theorem 6.2 (Necessary optimality conditions for general constraints). Let x∗ be a local minimum for the optimization problem min_{x∈Rn} f(x) subject to h(x) = 0, g(x) ≤ 0 and x ∈ X, defined in Section 1.4. Then,
$$ \nabla f(x^*)^T d \geq 0 \tag{6.1} $$
for any feasible direction d at the limit in x∗.

Proof. We assume by contradiction that there exists a feasible direction at the limit d such that ∇f(x∗)T d < 0, and let us consider a feasible (sub-)sequence (x_k)_{k∈N} of Definition 3.21. According to Taylor's theorem (Theorem C.1) with d = x_k − x∗, we have
$$ f(x_k) = f(x^*) + (x_k - x^*)^T \nabla f(x^*) + o\big(\|x_k - x^*\|\big). \tag{6.2} $$
Since d = lim_k (x_k − x∗), and ∇f(x∗)T d < 0, there exists an index K such that (x_k − x∗)T ∇f(x∗) < 0 for all k ≥ K. In addition, the term o(‖x_k − x∗‖) can be made as small as desired by making k sufficiently large (see Theorem 2.11 or Theorem 5.1 for a more formal analysis of this result). Therefore, there exists an index k large enough that f(x_k) < f(x∗), which contradicts the local optimality of x∗.

This general result does not take into account a possible structure in the con-
straints. We now propose optimality conditions for specific problems.

6.1 Convex constraints

We now consider the optimization problem min f(x) subject to x ∈ X, where X is a closed non empty convex set. We obtain a specific version of Theorem 6.2.

Theorem 6.3 (Necessary optimality conditions for convex constraints). Let x∗ be a local minimum to the optimization problem

min_{x∈X} f(x),

where f : R^n → R is differentiable on X and X is a non empty convex set. Then,

∇f(x∗)^T (x − x∗) ≥ 0, ∀x ∈ X.    (6.3)

Proof. We assume by contradiction that (6.3) is not satisfied. In this case, according to Definition 2.10, the direction d = x − x∗ is a descent direction. According to Theorem 2.11, there exists η > 0 such that

f(x∗ + αd) < f(x∗), ∀α ∈ (0, η].    (6.4)

Moreover, according to Theorem 3.11, d is a feasible direction and x∗ + αd is feasible for any 0 < α ≤ 1. Then, for each 0 < α ≤ min(η, 1), we have x∗ + αd ∈ X and f(x∗ + αd) < f(x∗). This contradicts the local optimality of x∗.
The condition (6.3) signifies geometrically that any feasible direction should form an acute angle with the gradient, as illustrated in Figure 6.1. When the convex set has a particular structure, the necessary optimality conditions can be simplified, as shown in Example 6.4.

Figure 6.1: Illustration of the necessary optimality condition (the convex set X, the point x∗, the gradient ∇f(x∗), and a feasible direction x − x∗)

Example 6.4 (Bound constraints). Consider the optimization problem

min_{x∈X⊂R^n} f(x)

with

X = { x | ℓ_i ≤ x_i ≤ u_i, i = 1, …, n },    (6.5)

where ℓ_i ≠ u_i, for any i = 1, …, n. Let x∗ be a local minimum of this problem. Since the condition (6.3) should be satisfied for all x ∈ X, we select some specific values to derive necessary conditions. Each time, we select an arbitrary index i and choose x ∈ R^n such that x_j = x∗_j for all j ≠ i. For such x, the condition (6.3) simplifies to

(∂f(x∗)/∂x_i)(x_i − x∗_i) ≥ 0.    (6.6)

We now need to specify x_i and verify that ℓ_i ≤ x_i ≤ u_i in order to obtain a feasible point and apply the necessary optimality condition. We consider three cases.

1. x∗_i = ℓ_i. If we choose

   x_i = ℓ_i + (u_i − ℓ_i)/2 = (u_i + ℓ_i)/2,

   x_i is located exactly halfway between ℓ_i and u_i, and x is feasible. Moreover, x_i − x∗_i = (u_i − ℓ_i)/2 > 0 because u_i > ℓ_i. The condition (6.6) implies that

   ∂f(x∗)/∂x_i ≥ 0.

2. x∗_i = u_i. If we choose

   x_i = u_i − (u_i − ℓ_i)/2 = (u_i + ℓ_i)/2,

   x is feasible. Moreover, x_i − x∗_i = −(u_i − ℓ_i)/2 < 0. The condition (6.6) implies that

   ∂f(x∗)/∂x_i ≤ 0.

3. ℓ_i < x∗_i < u_i. If we choose

   x_i = (u_i + x∗_i)/2,

   x is feasible. Moreover, x_i − x∗_i = (u_i − x∗_i)/2 > 0 because x∗_i < u_i. The condition (6.6) implies that

   ∂f(x∗)/∂x_i ≥ 0.

   If we choose

   x_i = (ℓ_i + x∗_i)/2,

   x is feasible. Moreover, x_i − x∗_i = (ℓ_i − x∗_i)/2 < 0 because ℓ_i < x∗_i. The condition (6.6) implies that

   ∂f(x∗)/∂x_i ≤ 0.

   By combining these two results, we get

   ∂f(x∗)/∂x_i = 0.

Then, in the case of bound constraints defined by (6.5), the necessary optimality conditions can be written as

∂f(x∗)/∂x_i ≥ 0, if x∗_i = ℓ_i,
∂f(x∗)/∂x_i ≤ 0, if x∗_i = u_i,
∂f(x∗)/∂x_i = 0, if ℓ_i < x∗_i < u_i,

for any i such that ℓ_i < u_i. Finally, let us note that, in the case where ℓ_i = u_i, each feasible x is such that x_i = ℓ_i = u_i = x∗_i and the condition (6.6) is trivially satisfied, regardless of the value of ∂f(x∗)/∂x_i.

Figure 6.2(b) illustrates the problem

min f(x) = x_1^2 + x_2^2

subject to

0.7 ≤ x_1 ≤ 2
−1 ≤ x_2 ≤ 1.

The solution is x∗ = (0.7, 0)^T and ∇f(x∗) = (1.4, 0)^T. Since x∗_1 = ℓ_1 = 0.7, we have ∂f(x∗)/∂x_1 ≥ 0. Since ℓ_2 < x∗_2 < u_2, we have ∂f(x∗)/∂x_2 = 0.

Figure 6.2(a) illustrates the problem

min f(x) = x_1^2 + x_2^2

subject to

−2 ≤ x_1 ≤ −0.7
−1 ≤ x_2 ≤ 1.

The solution is x∗ = (−0.7, 0)^T and ∇f(x∗) = (−1.4, 0)^T. Since x∗_1 = u_1 = −0.7, we have ∂f(x∗)/∂x_1 ≤ 0. Since ℓ_2 < x∗_2 < u_2, we have ∂f(x∗)/∂x_2 = 0.

Figure 6.2: Illustration of the necessary optimality condition for bound constraints: (a) upper bound active, (b) lower bound active
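These componentwise conditions are easy to check numerically. The following sketch is not part of the original text; it assumes NumPy is available, and the helper name is ours. It verifies the conditions for the problem of Figure 6.2(b):

import numpy as np

def check_bound_conditions(grad, x, lower, upper, tol=1e-8):
    # Necessary conditions of Example 6.4: at an active lower bound the partial
    # derivative must be >= 0, at an active upper bound <= 0, and it must vanish
    # at interior components.
    for g_i, x_i, l_i, u_i in zip(grad, x, lower, upper):
        if abs(x_i - l_i) <= tol:        # lower bound active
            if g_i < -tol:
                return False
        elif abs(x_i - u_i) <= tol:      # upper bound active
            if g_i > tol:
                return False
        elif abs(g_i) > tol:             # interior component: derivative must vanish
            return False
    return True

# f(x) = x1^2 + x2^2 with 0.7 <= x1 <= 2 and -1 <= x2 <= 1.
x_star = np.array([0.7, 0.0])
grad = 2 * x_star  # gradient of f at x*, here (1.4, 0)
print(check_bound_conditions(grad, x_star, [0.7, -1.0], [2.0, 1.0]))  # True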

Theorem 6.5 (Sufficient optimality conditions for convex constraints – I). Consider the optimization problem

min_{x∈X} f(x),

where X is a closed non empty convex set, and f : R^n → R is differentiable and convex on X. Then, (6.3) is a sufficient condition for x∗ to be a global minimum of f in X.

Proof. According to Theorem 2.16, we have

f(x) − f(x∗) ≥ (x − x∗)^T ∇f(x∗), ∀x ∈ X.

If (6.3) is satisfied, then

f(x) − f(x∗) ≥ 0, ∀x ∈ X,

and x∗ is a global minimum (Definition 1.7).


Example 6.6 (Projection on a convex set). Let X be a closed non empty convex set
for Rn and let us take z ∈ Rn . The projection of z over X, denoted by [z]P , is defined
as the unique solution of the following optimization problem:

1
min f(x) = (x − z)T (x − z) subject to x ∈ X.
x 2
Since f is convex and ∇f(x) = x − z, a necessary and sufficient condition for x∗ to be
the projection on z over X is
T
x∗ − z (x − x∗ ) ≥ 0 , ∀x ∈ X . (6.7)

Note that if z ∈ X, then (6.7) implies that x∗ = z.

It is interesting to characterize the optimality condition (6.3) by using the projection operator.

Theorem 6.7 (Optimality conditions for convex constraints – II). Consider the optimization problem

min_{x∈X} f(x),

where X is a closed non empty convex set and f : R^n → R is differentiable. If x∗ is a local minimum, then

x∗ = [x∗ − α∇f(x∗)]^P, ∀α > 0.    (6.8)

If, moreover, f is convex, (6.8) is sufficient for x∗ to optimize f over X.

Proof. Consider z(α) = x∗ − α∇f(x∗). According to (6.7), we have [z(α)]^P = x∗ for all α > 0 if and only if

(x∗ − z(α))^T (x − x∗) ≥ 0, ∀x ∈ X, ∀α > 0,

or

(x∗ − x∗ + α∇f(x∗))^T (x − x∗) ≥ 0, ∀x ∈ X, ∀α > 0.

The latter equation is equivalent to the optimality condition (6.3).
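For a box X = {x | ℓ ≤ x ≤ u}, the projection reduces to componentwise clipping, and condition (6.8) can be tested directly. A minimal sketch (not part of the original text, assuming NumPy), reusing the problem of Figure 6.2(b):

import numpy as np

def project_box(z, lower, upper):
    # Projection [z]^P on the box {x | lower <= x <= upper}.
    return np.clip(z, lower, upper)

lower, upper = np.array([0.7, -1.0]), np.array([2.0, 1.0])
grad = lambda x: 2 * x  # gradient of f(x) = x1^2 + x2^2
x_star = np.array([0.7, 0.0])

# Condition (6.8): x* is a fixed point of the projected gradient step for any alpha > 0.
for alpha in (0.1, 1.0, 10.0):
    step = project_box(x_star - alpha * grad(x_star), lower, upper)
    print(np.allclose(step, x_star))  # True for each alpha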



6.2 Lagrange multipliers: necessary conditions


Theorem 6.2 is based on the notion of feasible directions at the limit, which form
the tangent cone (Definition 3.22). As we discussed in the last part of Section 3.3,
this notion is too complex and the linearized cone (Definition 3.23) is much easier

to handle. The most common cases, where the linearized cone is equivalent to the
tangent cone, are optimization problems with linear constraints (Theorem 3.27) and
linearly independent constraints (Theorem 3.28). Therefore, we present the results for
only these cases. It is also possible to develop optimality conditions by considering the
linearized cone from a general point of view. The details are described in Mangasarian
(1979) and Nocedal and Wright (1999). The necessary optimality conditions are
generally called Karush-Kuhn-Tucker conditions or KKT conditions. In fact, for
many years, they were called Kuhn-Tucker conditions, following the article by Kuhn
and Tucker (1951). It later turned out that Karush (1939) had already formulated
them independently. John (1948) proposed a generalization a decade later (Theorem
6.12).
Note that the theory of Lagrange multipliers extends beyond the optimality con-
ditions presented in this book and that they can also be adapted to non differentiable
optimization. We refer the interested reader to Bertsekas (1982) and Rockafellar
(1993).
In this text, we adopt the approach of Bertsekas (1999), who first presents these
conditions in the case of linear constraints, and then for problems including linearly
independent equality constraints. The proof provides intuitions that are reused in
the development of algorithms. Subsequently, we generalize the result for problems
that also include inequality constraints.

6.2.1 Linear constraints

Consider the problem

min_{x∈R^n} f(x)    (6.9)

subject to

Ax = b    (6.10)

with A ∈ R^{m×n} and b ∈ R^m. According to Theorem 3.6, the matrix A can be considered of full rank without loss of generality. In this case, the Karush-Kuhn-Tucker conditions are formulated in the following manner.

Theorem 6.8 (Karush-Kuhn-Tucker: linear case). Let x∗ be a local minimum of the problem min_{x∈R^n} f(x) subject to Ax = b, where f : R^n → R is differentiable and A ∈ R^{m×n} is of full rank. There thus exists a unique vector λ∗ ∈ R^m such that

∇_x L(x∗, λ∗) = ∇f(x∗) + A^T λ∗ = 0,    (6.11)

Albert William Tucker was born on November 25, 1905 in Ontario, Canada, and died in Hightstown, New Jersey on January 25, 1995. He received his PhD in 1932 from Princeton University, where he spent most of his career. He is particularly known for his work on linear programming and game theory. After Dantzig visited von Neumann in 1948, Tucker drove Dantzig back to the train station in Princeton. It was during this short ride that Dantzig exposed linear programming to Tucker. In the context of game theory, von Neumann is generally cited as the inventor of the duality theorem, and Gale, Kuhn and Tucker as the first to propose a rigorous proof. In 1950, Tucker proposed the example of the prisoner's dilemma to illustrate the difficulty of non zero-sum games. John Nash, Nobel Laureate in Economics 1994, was one of his students at Princeton. The story goes that von Neumann did not agree with the approach of Nash's game theory, while Tucker encouraged him to develop his ideas and supervised his doctoral thesis.

Figure 6.3: Albert William Tucker

where L is the Lagrangian function (Definition 4.3). If f is twice differentiable, then

y^T ∇²_xx L(x∗, λ∗) y ≥ 0, ∀y ∈ R^n such that Ay = 0.    (6.12)

Proof. We employ the technique for the elimination of constraints described in Section 3.4 to convert the optimization problem with constraints into an optimization problem without constraint (3.70). To simplify the proof, we can assume that the variables are arranged in a way that the m variables to eliminate are the m first ones. Then, P = I in (3.70) and the minimization problem is

min_{x_N ∈ R^{n−m}} g(x_N) = f( B^{−1}(b − N x_N), x_N ).    (6.13)

If x∗ = (x∗_B, x∗_N) is a local minimum of the problem with constraints, then x∗_N is a local minimum of the problem without constraints and the necessary condition (5.1) applies to (6.13). By using chain rule differentiation (see (C.6) of Theorem C.3) we obtain:

∇g(x∗_N) = −N^T B^{−T} ∇_B f(x∗) + ∇_N f(x∗) = 0,    (6.14)

where ∇_B f(x∗) and ∇_N f(x∗) represent the gradient of f with regard to the variables x_B and x_N, respectively. If we define

λ∗ = −B^{−T} ∇_B f(x∗),    (6.15)

which can also be written as

∇_B f(x∗) + B^T λ∗ = 0,    (6.16)

the condition (6.14) is expressed as

∇_N f(x∗) + N^T λ∗ = 0.    (6.17)

Equations (6.16) and (6.17) form (6.11), which proves the first-order result.

To demonstrate the second-order result, we first differentiate (6.11) to obtain

∇²_xx L(x, λ) = ∇² f(x).    (6.18)

We consider a vector y ∈ R^n such that Ay = 0, and then B y_B + N y_N = 0, or y_B = −B^{−1} N y_N. Then, if d ∈ R^{n−m} is an arbitrary vector, the vector

y = ( y_B ; y_N ) = ( −B^{−1} N d ; d )    (6.19)

is such that Ay = 0. According to the necessary optimality conditions (5.2) in the unconstrained case and Definition B.8 of a positive semidefinite matrix, we have that

d^T ∇² g(x∗_N) d ≥ 0, ∀d ∈ R^{n−m}.    (6.20)

However, when differentiating (6.14), we obtain

∇² g(x∗_N) = N^T B^{−T} ∇²_BB f(x∗) B^{−1} N
           − N^T B^{−T} ∇²_BN f(x∗)
           − ∇²_NB f(x∗) B^{−1} N
           + ∇²_NN f(x∗),    (6.21)

where ∇² f(x∗) is decomposed into

∇² f(x∗) = ( ∇²_BB f(x∗)  ∇²_BN f(x∗) ; ∇²_NB f(x∗)  ∇²_NN f(x∗) ).    (6.22)

Then,

d^T ∇² g(x∗_N) d = d^T N^T B^{−T} ∇²_BB f(x∗) B^{−1} N d
                 − d^T N^T B^{−T} ∇²_BN f(x∗) d
                 − d^T ∇²_NB f(x∗) B^{−1} N d
                 + d^T ∇²_NN f(x∗) d                          from (6.21)
               = y_B^T ∇²_BB f(x∗) y_B + y_B^T ∇²_BN f(x∗) y_N
                 + y_N^T ∇²_NB f(x∗) y_B + y_N^T ∇²_NN f(x∗) y_N   from (6.19)
               = y^T ∇² f(x∗) y                                from (6.22)
               = y^T ∇²_xx L(x∗, λ∗) y                         from (6.18)

and (6.12) is equivalent to (6.20).



Example 6.9 (Karush-Kuhn-Tucker: linear case). We consider the following optimization problem:

min f(x_1, x_2, x_3, x_4) = x_1^2 + x_2^2 + x_3^2 + x_4^2

subject to

x_1 + x_2 + x_3 = 1
x_1 − x_2 + x_4 = 1.

The solution to this problem is

x∗ = (2/3, 0, 1/3, 1/3)^T and ∇f(x∗) = (4/3, 0, 2/3, 2/3)^T.

By decomposing

A = ( B  N ) = ( 1  1  1  0 ; 1  −1  0  1 ),

we obtain

λ∗ = −B^{−T} ∇_B f(x∗) = −( 1/2  1/2 ; 1/2  −1/2 ) (4/3, 0)^T = (−2/3, −2/3)^T,

and (6.11) is expressed as

∇f(x∗) + A^T λ∗ = (4/3, 0, 2/3, 2/3)^T + (−4/3, 0, −2/3, −2/3)^T = (0, 0, 0, 0)^T.

Any vector of the form

y = ( −(1/2) y_3 − (1/2) y_4, −(1/2) y_3 + (1/2) y_4, y_3, y_4 )^T

is such that Ay = 0 and, since ∇² f(x∗) = 2I, (6.12) is written as

y^T ∇² f(x∗) y = 2 (y_1^2 + y_2^2 + y_3^2 + y_4^2) = 3 y_3^2 + 3 y_4^2 ≥ 0.
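The computations of this example are easy to reproduce numerically. The following sketch is not part of the original text and assumes NumPy is available; it evaluates (6.15) and checks the stationarity condition (6.11):

import numpy as np

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, -1.0, 0.0, 1.0]])
x_star = np.array([2/3, 0.0, 1/3, 1/3])
grad_f = 2 * x_star  # gradient of f(x) = x1^2 + x2^2 + x3^2 + x4^2

# Split A = (B N) and compute the multipliers from (6.15): lambda* = -B^{-T} grad_B f(x*).
B = A[:, :2]
lam = -np.linalg.solve(B.T, grad_f[:2])
print(lam)  # [-2/3, -2/3]

# First-order condition (6.11): grad f(x*) + A^T lambda* = 0.
print(np.allclose(grad_f + A.T @ lam, 0))  # True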

6.2.2 Equality constraints

We consider here the problem with equality constraints (1.71)–(1.72). In this case, the Karush-Kuhn-Tucker conditions are formulated in the following way. Thanks to the Lagrangian function (Definition 4.3), their expression presents similarities with the conditions without constraint.

Theorem 6.10 (Karush-Kuhn-Tucker: equality constraints). Let x∗ be a local minimum of the problem min_{x∈R^n} f(x) subject to h(x) = 0, with f : R^n → R and h : R^n → R^m continuously differentiable. If the constraints are linearly independent in x∗ (in the sense of Definition 3.8), there exists a unique vector λ∗ ∈ R^m such that

∇L(x∗, λ∗) = 0,    (6.23)

where L is the Lagrangian function (Definition 4.3). If f and h are twice differentiable, then

y^T ∇²_xx L(x∗, λ∗) y ≥ 0, ∀y ∈ D(x∗),    (6.24)

where D(x∗) is the linearized cone¹ in x∗ (Definition 3.23). Moreover,

λ∗ = −( ∇h(x∗)^T ∇h(x∗) )^{−1} ∇h(x∗)^T ∇f(x∗).    (6.25)

Proof. We generate a sequence of optimization problems without constraint approaching the original problem. The idea is to penalize the violation of constraints by introducing a penalty term. The objective function of the problem without constraints is defined by

F_k(x) = f(x) + (k/2)‖h(x)‖² + (α/2)‖x − x∗‖²,    (6.26)

where k ∈ N, x∗ is a local minimum to the problem min_{x∈R^n} f(x) subject to h(x) = 0 and α > 0 is arbitrary. Since x∗ is a local minimum (Definition 1.5), there exists ε such that

f(x∗) ≤ f(x), ∀x such that h(x) = 0 and x ∈ S_ε,    (6.27)

where S_ε is the sphere defined by S_ε = { x | ‖x − x∗‖ ≤ ε }. According to the Weierstrass theorem (Theorem 1.14), the problem

min F_k(x)    (6.28)

subject to

x ∈ S_ε    (6.29)

has a solution in S_ε, denoted by x_k. One should keep in mind that the problem (6.28)–(6.29) is subject to constraints. Nevertheless, we demonstrate that, for a sufficiently large k, the solution lies strictly inside S_ε and is therefore a solution to the problem (6.28) without constraint (according to Theorem 3.5). The role of S_ε is to ensure that no local minima other than x∗ are found.

¹ Since the constraints are linearly independent, the constraint qualification is satisfied and the linearized cone corresponds to the tangent cone.
We have

F_k(x_k) ≤ f(x∗).    (6.30)

Indeed, F_k(x_k) ≤ F_k(x∗) because x_k is the solution to (6.28)–(6.29) and x∗ ∈ S_ε. Also, according to (6.26),

F_k(x∗) = f(x∗) + (k/2)‖h(x∗)‖² + (α/2)‖x∗ − x∗‖² = f(x∗),

because h(x∗) = 0. Then, when k → ∞, the value of F_k(x_k) remains bounded. We show by contradiction that this implies that

lim_{k→∞} h(x_k) = 0.    (6.31)

Indeed, if (6.31) is not satisfied, then the term (k/2)‖h(x_k)‖² tends towards +∞. Since F_k(x_k) remains bounded, this signifies that either f(x_k) or ‖x_k − x∗‖² tends towards −∞. However f(x_k) is bounded from below by f(x∗) over S_ε (according to (6.27)) and ‖x_k − x∗‖² is positive, which leads to a contradiction and proves (6.31).

Let x̂ be a limit point of the sequence (x_k)_k (Definition B.20). According to (6.31), we have h(x̂) = 0 and x̂ is feasible for the original problem, and thus f(x∗) ≤ f(x̂). Moreover, according to (6.30),

lim_{k→∞} F_k(x_k) = f(x̂) + (α/2)‖x̂ − x∗‖² ≤ f(x∗).    (6.32)

Then,

f(x̂) + (α/2)‖x̂ − x∗‖² ≤ f(x̂).    (6.33)

As a result, (α/2)‖x̂ − x∗‖² = 0 and x̂ = x∗. The sequence (x_k)_k converges to x∗. According to Definition B.19, there exists k̂ such that

‖x_k − x∗‖ ≤ 0.9 ε < ε, ∀k ≥ k̂,    (6.34)

where ε is the radius of the sphere S_ε involved in the definition of the local minimum (6.27). The point x_k is inside S_ε when k is sufficiently large.

According to Theorem 1.16, x_k is a local minimum of the unconstrained problem (6.28). We can apply the necessary optimality conditions of an unconstrained problem, given by Theorem 5.1:

∇F_k(x_k) = ∇f(x_k) + k ∇h(x_k) h(x_k) + α(x_k − x∗) = 0    (6.35)

and ∇²F_k(x_k) is positive semidefinite, with

∇²F_k(x_k) = ∇²f(x_k) + k Σ_{i=1}^m h_i(x_k) ∇²h_i(x_k) + k ∇h(x_k)∇h(x_k)^T + αI.    (6.36)

Figure 6.4: Illustration of the proof of Theorem 6.10: the iterates x_k approach x∗ and eventually lie within distance 0.9 ε of it

By multiplying (6.35) with ∇h(x_k)^T, we get

∇h(x_k)^T ∇f(x_k) + k ∇h(x_k)^T ∇h(x_k) h(x_k) + α ∇h(x_k)^T (x_k − x∗) = 0.    (6.37)

Since the constraints are linearly independent by hypothesis, the matrix ∇h(x∗)^T ∇h(x∗) is of full rank and invertible. By continuity of ∇h (indeed, h is continuously differentiable), there exists a k sufficiently large such that ∇h(x_k)^T ∇h(x_k) is also invertible. By multiplying (6.37) by ( ∇h(x_k)^T ∇h(x_k) )^{−1}, we obtain

k h(x_k) = −( ∇h(x_k)^T ∇h(x_k) )^{−1} ∇h(x_k)^T ( ∇f(x_k) + α(x_k − x∗) ).    (6.38)

When k → ∞, we define

λ∗ = lim_{k→∞} k h(x_k)    (6.39)

to obtain (6.25). By letting k → ∞ in (6.35), we get

∇f(x∗) + ∇h(x∗) λ∗ = 0.    (6.40)

According to Definition 4.3, (6.40) is equivalent to ∇_x L(x∗, λ∗) = 0. Since 0 = h(x∗) = ∇_λ L(x∗, λ∗), we get (6.23).

To demonstrate the second-order condition (6.24), let us consider y in the linearized cone (and thus ∇h(x∗)^T y = 0) and, for a sufficiently large k, let us consider its projection y_k on the kernel of ∇h(x_k)^T. According to the theorem of projection on the kernel of a matrix (Theorem C.7), we have

y_k = y − ∇h(x_k) ( ∇h(x_k)^T ∇h(x_k) )^{−1} ∇h(x_k)^T y.    (6.41)

We also have

lim_{k→∞} y_k = y − ∇h(x∗) ( ∇h(x∗)^T ∇h(x∗) )^{−1} ∇h(x∗)^T y = y,    (6.42)

because ∇h(x∗)^T y = 0. Then, from (6.36),

y_k^T ∇²F_k(x_k) y_k = y_k^T ( ∇²f(x_k) + k Σ_{i=1}^m h_i(x_k) ∇²h_i(x_k) ) y_k
                     + k y_k^T ∇h(x_k)∇h(x_k)^T y_k + α y_k^T y_k.



As ∇h(x_k)^T y_k = 0,

y_k^T ∇²F_k(x_k) y_k = y_k^T ( ∇²f(x_k) + k Σ_{i=1}^m h_i(x_k) ∇²h_i(x_k) ) y_k + α y_k^T y_k.

As ∇²F_k(x_k) is positive semidefinite, this last quantity is non negative. By letting k → ∞ and using λ∗ = lim_{k→∞} k h(x_k), we get

y^T ( ∇²f(x∗) + Σ_{i=1}^m λ∗_i ∇²h_i(x∗) ) y + α y^T y ≥ 0.    (6.43)

According to Definition 4.3 of the Lagrangian function, (6.43) is equivalent to

y^T ∇²_xx L(x∗, λ∗) y + α y^T y ≥ 0.    (6.44)

If (6.24) is not satisfied, (6.44) is not valid for all α > 0. Indeed, if y^T ∇²_xx L(x∗, λ∗) y < 0, (6.44) is not satisfied for the values of α such that

α < − ( y^T ∇²_xx L(x∗, λ∗) y ) / ( y^T y ).

Since α can be arbitrarily chosen, this concludes the proof.
Note that for linear constraints, h(x) = Ax − b, ∇h(x) = A^T and (6.23) is written as

( ∇f(x∗) + A^T λ∗ ; Ax∗ − b ) = 0,

which is equivalent to (6.11) of Theorem 6.8.
Example 6.11 (Karush-Kuhn-Tucker: equality constraints). Consider the optimization problem

min_{x∈R^2} x_1 + x_2    (6.45)

subject to

h(x) = ( x_1^2 + (x_2 − 1)^2 − 1 ; −x_1^2 + x_2 ) = 0.    (6.46)

The set of constraints is illustrated in Figure 6.5. We have

∇f(x) = ( 1 ; 1 ) and ∇h(x) = ( 2x_1  −2x_1 ; 2x_2 − 2  1 ).    (6.47)

The Lagrangian function of the problem is

L(x, λ) = x_1 + x_2 + λ_1 (x_1^2 + (x_2 − 1)^2 − 1) + λ_2 (−x_1^2 + x_2)    (6.48)

and

∇L(x, λ) = ( 1 + 2λ_1 x_1 − 2λ_2 x_1 ; 1 + 2λ_1 (x_2 − 1) + λ_2 ; x_1^2 + (x_2 − 1)^2 − 1 ; −x_1^2 + x_2 ).    (6.49)

Figure 6.5: Illustration of the KKT conditions: the curves h_1(x) = x_1^2 + (x_2 − 1)^2 − 1 = 0 and h_2(x) = x_2 − x_1^2 = 0, with the points x^a and x^b

The point x^a = (1, 1)^T is a local minimum of the problem. The constraints are linearly independent in x^a because the matrix

∇h(x^a) = ( 2  −2 ; 0  1 )    (6.50)

is of full rank. Thanks to the relations (6.25) and (6.47), we get

λ∗ = ( −3/2 ; −1 ).    (6.51)

The condition ∇L(x^a, λ∗) = 0 is satisfied. Note that the linearized cone reduces to {0} in x^a, so that the second-order condition is trivially satisfied.

The point x^b = (0, 0)^T is also a local minimum (in fact, we are dealing with a global minimum of the problem). We have

∇L(x^b, λ) = ( 1 ; 1 − 2λ_1 + λ_2 ; 0 ; 0 ),    (6.52)

which cannot be zero for any λ. The necessary condition is not satisfied in this case. Indeed, the constraints are not linearly independent in x^b because the matrix

∇h(x^b) = ( 0  0 ; −2  1 )    (6.53)

is not of full rank.
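As a numerical check (not part of the original text, assuming NumPy), the following sketch evaluates formula (6.25) at x^a and confirms that ∇f(x^a) + ∇h(x^a)λ∗ = 0, while the constraint gradients are rank deficient at x^b:

import numpy as np

grad_f = np.array([1.0, 1.0])

def grad_h(x):
    # Matrix (6.47): columns are the gradients of h1 and h2.
    return np.array([[2 * x[0], -2 * x[0]],
                     [2 * x[1] - 2.0, 1.0]])

x_a = np.array([1.0, 1.0])
J = grad_h(x_a)
lam = -np.linalg.solve(J.T @ J, J.T @ grad_f)  # formula (6.25)
print(lam)                                     # [-1.5, -1.0]
print(np.allclose(grad_f + J @ lam, 0))        # True

x_b = np.array([0.0, 0.0])
print(np.linalg.matrix_rank(grad_h(x_b)))      # 1: not of full rank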



We now present the result of John (1948) for equality constraints, which constitutes a generalization of Theorem 6.10.

Theorem 6.12 (Fritz John: equality constraints). Let x∗ be a local minimum of the problem min_{x∈R^n} f(x) subject to h(x) = 0, where f : R^n → R and h : R^n → R^m are continuously differentiable. Then, there exist µ∗_0 ∈ R and a vector λ∗ ∈ R^m such that

µ∗_0 ∇f(x∗) + ∇h(x∗) λ∗ = 0    (6.54)

and µ∗_0, λ∗_1, …, λ∗_m are not all zero.

Proof. In the case where the constraints are linearly independent, Theorem 6.10 applies and (6.54) is trivially obtained with µ∗_0 = 1. In the case where the constraints are linearly dependent, there exist λ∗_1, …, λ∗_m, not all zero, such that

Σ_{i=1}^m λ∗_i ∇h_i(x∗) = 0,

and (6.54) is obtained with µ∗_0 = 0.

6.2.3 Equality and inequality constraints

We now present the necessary Karush-Kuhn-Tucker optimality conditions for a general case including equality and inequality constraints. As is often the case, the approach consists in returning to an already studied case, in this case the problem with only equality constraints.

Theorem 6.13 (Karush-Kuhn-Tucker). Let x∗ be a local minimum of the problem min_{x∈R^n} f(x) subject to h(x) = 0, g(x) ≤ 0, where f : R^n → R, h : R^n → R^m and g : R^n → R^p are continuously differentiable. If the constraints are linearly independent in x∗ (in the sense of Definition 3.8), there exists a unique vector λ∗ ∈ R^m and a unique vector µ∗ ∈ R^p such that

∇_x L(x∗, λ∗, µ∗) = 0,    (6.55)

µ∗_j ≥ 0, j = 1, …, p,    (6.56)

and

µ∗_j g_j(x∗) = 0, j = 1, …, p,    (6.57)

where L is the Lagrangian function (Definition 4.3). If f, h and g are twice differentiable, then

y^T ∇²_xx L(x∗, λ∗, µ∗) y ≥ 0, ∀y ≠ 0 such that
y^T ∇h_i(x∗) = 0, i = 1, …, m,
y^T ∇g_i(x∗) = 0, i = 1, …, p such that g_i(x∗) = 0.    (6.58)

Proof. We consider the active inequality constraints at the solution as equality constraints and we ignore the other constraints, in order to obtain the problem min_{x∈R^n} f(x) subject to h(x) = 0, g_i(x) = 0, for any i ∈ A(x∗), where A(x∗) is the set of active constraints in x∗ (Definition 3.4). According to Theorem 3.5 where Y = { x | h(x) = 0 }, x∗ is a local minimum of the optimization problem with equality constraints. According to Theorem 6.10, there exist Lagrange multipliers λ∗ ∈ R^m and µ∗_i, with i ∈ A(x∗), such that

∇f(x∗) + ∇h(x∗) λ∗ + Σ_{i∈A(x∗)} µ∗_i ∇g_i(x∗) = 0.    (6.59)

We associate a zero multiplier to each non active inequality constraint in x∗ to obtain

∇f(x∗) + ∇h(x∗) λ∗ + Σ_{i=1}^p µ∗_i ∇g_i(x∗) = 0    (6.60)

with µ∗_i = 0 if i ∉ A(x∗). We thus get (6.55). Similarly, the second-order condition of Theorem 6.10 implies that

y^T ∇²_xx L(x∗, λ∗, µ∗) y ≥ 0, ∀y such that
y^T ∇h_i(x∗) = 0, i = 1, …, m,
y^T ∇g_i(x∗) = 0, i = 1, …, p such that g_i(x∗) = 0,    (6.61)

and (6.58) is satisfied. We note that (6.57) is trivially satisfied. Indeed, if the constraint g_j(x∗) ≤ 0 is active, we have g_j(x∗) = 0. If on the other hand it is not, we have µ∗_j = 0. We now need only demonstrate (6.56).

We take the same proof as Theorem 6.10, by defining the penalty function for inequality constraints with

g_i^+(x) = max(0, g_i(x)), i = 1, …, p.    (6.62)

In this case, the function (6.26) becomes

F_k(x) = f(x) + (k/2)‖h(x)‖² + (k/2) Σ_{j=1}^p (g_j^+(x))² + (α/2)‖x − x∗‖².    (6.63)

Since (g_j^+(x))² is differentiable and

∇(g_j^+)²(x) = 2 g_j^+(x) ∇g_j(x),    (6.64)

we can invoke the same development as in the proof of Theorem 6.10. As we have obtained (6.39), we have

µ∗_i = lim_{k→∞} k g_i^+(x_k), i = 1, …, p.    (6.65)

And since g_i^+(x) ≥ 0, we get (6.56).

Example 6.14 (Karush-Kuhn-Tucker: inequality constraints – I). Consider the problem

min_{x∈R^2} x_1 + x_2    (6.66)

subject to

(x_1 − 3)^2 + x_2^2 ≤ 9
x_1^2 + (x_2 − 3)^2 ≤ 9    (6.67)
x_1^2 ≤ 1 + x_2,

illustrated in Figure 6.6. Re-arranging the equations of the constraints, we obtain

g_1(x) = x_1^2 − 6x_1 + x_2^2
g_2(x) = x_1^2 − 6x_2 + x_2^2    (6.68)
g_3(x) = x_1^2 − x_2 − 1,

and the Lagrangian function is written as

L(x, µ) = x_1 + x_2 + µ_1 (x_1^2 − 6x_1 + x_2^2) + µ_2 (x_1^2 − 6x_2 + x_2^2) + µ_3 (x_1^2 − x_2 − 1).    (6.69)

Figure 6.6: Karush-Kuhn-Tucker optimality conditions: the feasible set defined by (x_1 − 3)^2 + x_2^2 ≤ 9, x_1^2 + (x_2 − 3)^2 ≤ 9 and x_1^2 ≤ 1 + x_2, with the solution at the origin
The point x∗ = (0, 0)^T is a local minimum of this problem. The constraints g_1(x) ≤ 0 and g_2(x) ≤ 0 are active in x∗, whereas the constraint g_3(x) ≤ 0 is not. The point x∗ is also a local minimum of the problem where the two active inequality constraints are replaced by equality constraints, and the inactive constraint is ignored, that is

min_{x∈R^2} x_1 + x_2    (6.70)

subject to

(x_1 − 3)^2 + x_2^2 = 9
x_1^2 + (x_2 − 3)^2 = 9,    (6.71)

or, equivalently,

h_1(x) = x_1^2 − 6x_1 + x_2^2 = 0
h_2(x) = x_1^2 − 6x_2 + x_2^2 = 0.    (6.72)

The gradient of the constraints is written as

∇h(x) = ( 2x_1 − 6  2x_1 ; 2x_2  2x_2 − 6 ).    (6.73)

According to the KKT conditions (Theorem 6.10), λ∗ is given by (6.25) and we have

λ∗ = −( ( −6  0 ; 0  −6 )^T ( −6  0 ; 0  −6 ) )^{−1} ( −6  0 ; 0  −6 )^T ( 1 ; 1 ) = ( 1/6 ; 1/6 ).

Then, the necessary first-order KKT condition for the initial problem is written as

∇_x L(x∗, µ∗) = ( 1 − 6µ∗_1 + 2(µ∗_1 + µ∗_2 + µ∗_3) x∗_1 ; 1 − 6µ∗_2 + 2(µ∗_1 + µ∗_2) x∗_2 − µ∗_3 ) = 0,

where L is defined by (6.69), x∗ = (0, 0)^T and

µ∗ = ( λ∗_1 ; λ∗_2 ; 0 ) = ( 1/6 ; 1/6 ; 0 ).

Since the only direction satisfying the conditions of (6.58) in x∗ is y = 0, the necessary second-order KKT condition is trivially satisfied.
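A quick sanity check of these multipliers (not part of the original text, assuming NumPy):

import numpy as np

x = np.array([0.0, 0.0])
mu = np.array([1/6, 1/6, 0.0])
grad_f = np.array([1.0, 1.0])

# Columns: gradients of g1, g2 and g3 from (6.68), evaluated at x* = (0, 0).
grad_g = np.array([[2 * x[0] - 6.0, 2 * x[0], 2 * x[0]],
                   [2 * x[1], 2 * x[1] - 6.0, -1.0]])

# Stationarity (6.55): grad f(x*) + sum_i mu_i grad g_i(x*) = 0.
print(np.allclose(grad_f + grad_g @ mu, 0))  # True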

Example 6.15 (Karush-Kuhn-Tucker: inequality constraints – II). Consider the optimization problem

min_{x∈R^2} (1/2)(x_1^2 − x_2^2)    (6.74)

subject to

x_2 ≤ 1,    (6.75)

for which the objective function is illustrated in Figure 6.7. We have

L(x, µ) = (1/2) x_1^2 − (1/2) x_2^2 + µ (x_2 − 1).

The point x∗ = (0, 1)^T is a local minimum of the problem. The constraint is active in this point. The first-order condition (6.55) is expressed as

∇_x L(x∗, µ∗) = ( x∗_1 ; −x∗_2 + µ∗ ) = ( 0 ; −1 + µ∗ ) = 0

Figure 6.7: Objective function of Examples 6.15 and 6.21: (a) surface, (b) level curves

and is satisfied with µ∗ = 1, which is positive. The second-order condition (6.58) is written as

( y_1  y_2 ) ( 1  0 ; 0  −1 ) ( y_1 ; y_2 ) = y_1^2 − y_2^2 ≥ 0    (6.76)

for any y such that

∇g(x∗)^T y = ( 0  1 ) ( y_1 ; y_2 ) = y_2 = 0,

and is satisfied. Note that if we choose a feasible direction y, for instance y = (0, −1)^T, the condition (6.76) is not satisfied. It is only valid for y orthogonal to the gradients of the active constraints.
Example 6.16 (Physical interpretation of KKT conditions). We consider the optimization problem

min_{x∈R^2} x_1

subject to

h_1(x) = x_1 − sin x_2 = 0.

The equality constraint is represented in Figure 6.8(a). The point x^a = (−1, 3π/2)^T is a local minimum of the problem. We have

∇f(x) = ( 1 ; 0 ), ∇h(x) = ( 1 ; −cos x_2 )

and the necessary optimality condition is written as

−∇f(x^a) − λ∗_1 ∇h(x^a) = 0.


Figure 6.8: Interpretation of the KKT conditions: (a) equality constraint h(x) = x_1 − sin x_2 = 0, with the "forces" −∇f and −λ∗_1 ∇h balanced at x^a (λ∗_1 = −1) and unbalanced at x^b; (b) inequality constraint g(x) = x_1 − sin x_2 ≤ 0

Using the fact that −∇f(x^a) is the direction with the steepest descent (Theorem 2.13), we can interpret this condition as an equilibrium between the "force" −∇f(x^a), which drives the solution to lower values of the objective function, and the "force" −λ∗_1 ∇h(x^a), which maintains the solution on the constraint. If x^a is optimal, this signifies that the forces are balanced and that their sum is zero. In our example, since the "force" −∇f(x^a) acts in the same direction as −∇h(x^a), the multiplier λ∗_1 should be negative so that the two "forces" can compensate each other.

If we now take the point x^b = (sin 3, 3)^T, the "forces" −∇f(x^b) and −λ∇h(x^b) are not balanced. This is not only the case when λ = λ∗_1, as shown in Figure 6.8(a), but for all λ.

We now consider the problem

min_{x∈R^2} x_1

subject to

g_1(x) = x_1 − sin x_2 ≤ 0.

The inequality constraint is represented in Figure 6.8(b). The point x^a = (−1, 3π/2) is not a local minimum of the problem. In fact, the direction −∇f(x^a) is feasible and the associated "force" drives the solution towards feasible values.

Note that the constraint is active in x^a and that the equation

−∇f(x^a) − µ_1 ∇g_1(x^a) = 0

is satisfied for µ_1 = −1. The interpretation of the forces signifies that the "force" −µ_1 ∇g_1(x^a) drives the solution to the right and prevents it from going inside the feasible domain, which is incompatible with the definition of the inequality constraint. The condition (6.56) in Theorem 6.13, µ∗ ≥ 0, signifies that, for an inequality constraint, the "force" can only act in a single direction, so as to prevent the points from leaving the feasible domain, but not from going inside. For the equality constraints, the "force" can act in two directions and there is no condition on the sign of the multipliers.

Example 6.17 (Slack variables). Consider problem (P1)

min_{x∈R^n} f(x)    (6.77)

subject to

g_i(x) ≤ 0, i = 1, …, m.    (6.78)

The Lagrangian of (P1) is

L(x, µ) = f(x) + Σ_{i=1}^m µ_i g_i(x).    (6.79)

The first derivative is

∂L/∂x_j (x, µ) = ∂f/∂x_j (x) + Σ_{i=1}^m µ_i ∂g_i/∂x_j (x),    (6.80)

for j = 1, …, n, and the second derivative is

∂²L/∂x_j∂x_k (x, µ) = ∂²f/∂x_j∂x_k (x) + Σ_{i=1}^m µ_i ∂²g_i/∂x_j∂x_k (x),    (6.81)

for j, k = 1, …, n.

Let x∗ be a local optimum of problem P1. Therefore, the first order necessary optimality (KKT) conditions say that, under appropriate assumptions, there is a unique µ∗ ∈ R^m, µ∗ ≥ 0, such that

∂L/∂x_j (x∗, µ∗) = ∂f/∂x_j (x∗) + Σ_{i=1}^m µ∗_i ∂g_i/∂x_j (x∗) = 0,
µ∗_i g_i(x∗) = 0, i = 1, …, m.    (6.82)

Assume that the first p constraints are active at x∗, and the others not, that is

g_i(x∗) = 0, i = 1, …, p,    (6.83)

and

g_i(x∗) < 0, i = p + 1, …, m.    (6.84)

Therefore, we obtain

∂L/∂x_j (x∗, µ∗) = ∂f/∂x_j (x∗) + Σ_{i=1}^p µ∗_i ∂g_i/∂x_j (x∗) = 0,    (6.85)

and

µ∗_i ≥ 0, i = 1, …, p,
µ∗_i = 0, i = p + 1, …, m.    (6.86)

Moreover, for each d ∈ R^n such that, for each i = 1, …, p,

Σ_{k=1}^n d_k ∂g_i/∂x_k (x∗) = 0,    (6.87)

we have

Σ_{j=1}^n Σ_{k=1}^n ( ∂²f/∂x_j∂x_k (x∗) + Σ_{i=1}^p µ∗_i ∂²g_i/∂x_j∂x_k (x∗) ) d_j d_k ≥ 0.    (6.88)

Consider now problem (P2), obtained from problem (P1) by transforming the inequality constraints into equality constraints using slack variables, as suggested in Section 1.2.2:

min_{x∈R^n, y∈R^m} f(x)    (6.89)

subject to

h_i(x, y) = g_i(x) + y_i^2 = 0, i = 1, …, m.    (6.90)

For each i = 1, …, m, the first derivatives of the constraint are

∂h_i/∂x_k = ∂g_i/∂x_k, k = 1, …, n,    (6.91)

∂h_i/∂y_i = 2 y_i,    (6.92)

and

∂h_i/∂y_ℓ = 0, ℓ = 1, …, m, ℓ ≠ i.    (6.93)

The Lagrangian of (P2) is

L(x, y, λ) = f(x) + Σ_{i=1}^m λ_i (g_i(x) + y_i^2).    (6.94)

The first derivatives are

∂L/∂x_j (x, y, λ) = ∂f/∂x_j (x) + Σ_{i=1}^m λ_i ∂g_i/∂x_j (x),    (6.95)

for j = 1, …, n, and

∂L/∂y_i (x, y, λ) = 2 λ_i y_i,    (6.96)

for i = 1, …, m. The second derivatives are

∂²L/∂x_j∂x_k (x, y, λ) = ∂²f/∂x_j∂x_k (x) + Σ_{i=1}^m λ_i ∂²g_i/∂x_j∂x_k (x),    (6.97)

for j, k = 1, …, n,

∂²L/∂y_i² (x, y, λ) = 2 λ_i,    (6.98)

for i = 1, …, m,

∂²L/∂y_i∂y_ℓ (x, y, λ) = 0,    (6.99)

for i, ℓ = 1, …, m, i ≠ ℓ, and

∂²L/∂x_j∂y_i (x, y, λ) = 0,    (6.100)

for j = 1, …, n, i = 1, …, m.

Let x∗ and y∗ be local optima of problem P2. Therefore, the first order necessary optimality (KKT) conditions say that there is a unique λ∗ ∈ R^m such that

∂L/∂x_j (x∗, y∗, λ∗) = ∂f/∂x_j (x∗) + Σ_{i=1}^m λ∗_i ∂g_i/∂x_j (x∗) = 0,    (6.101)

and

2 λ∗_i y∗_i = 0, i = 1, …, m.    (6.102)

Moreover, for each

d = ( d_x ; d_y ) ∈ R^{n+m}

such that, for each i = 1, …, m,

Σ_{k=1}^n (d_x)_k ∂g_i/∂x_k (x∗) + 2 (d_y)_i y∗_i = 0,    (6.103)

we have

Σ_{j=1}^n Σ_{k=1}^n ( ∂²f/∂x_j∂x_k (x∗) + Σ_{i=1}^m λ∗_i ∂²g_i/∂x_j∂x_k (x∗) ) (d_x)_j (d_x)_k + 2 Σ_{i=1}^m λ∗_i (d_y)_i² ≥ 0.    (6.104)

Now, assume that the constraints are numbered so that y∗_i = 0 for i = 1, …, p, and y∗_i ≠ 0 for i = p + 1, …, m. As x∗ verifies the constraints (6.90), we have also g_i(x∗) = 0 for i = 1, …, p, and g_i(x∗) < 0 for i = p + 1, …, m. Then, from (6.102), we have λ∗_i = 0, i = p + 1, …, m. The first order condition (6.101) becomes

∂L/∂x_j (x∗, y∗, λ∗) = ∂f/∂x_j (x∗) + Σ_{i=1}^p λ∗_i ∂g_i/∂x_j (x∗) = 0.    (6.105)

For the second order conditions, for each

d = ( d_x ; d_y ) ∈ R^{n+m}

such that, for each i = 1, …, p,

Σ_{k=1}^n (d_x)_k ∂g_i/∂x_k (x∗) = 0,    (6.106)

and for each i = p + 1, …, m,

Σ_{k=1}^n (d_x)_k ∂g_i/∂x_k (x∗) + 2 (d_y)_i y∗_i = 0,    (6.107)

we have

Σ_{j=1}^n Σ_{k=1}^n ( ∂²f/∂x_j∂x_k (x∗) + Σ_{i=1}^p λ∗_i ∂²g_i/∂x_j∂x_k (x∗) ) (d_x)_j (d_x)_k + 2 Σ_{i=1}^p λ∗_i (d_y)_i² ≥ 0.    (6.108)

In particular, consider d such that d_x = 0 and (d_y)_i = 0, i = p + 1, …, m. It clearly verifies conditions (6.106) and (6.107). We have, for any value of (d_y)_i, i = 1, …, p,

2 Σ_{i=1}^p λ∗_i (d_y)_i² ≥ 0.    (6.109)

In particular, select k between 1 and p, and set (d_y)_i = 0 for each i ≠ k, and (d_y)_k = 1. Therefore, (6.109) implies λ∗_k ≥ 0, for any k = 1, …, p.

Based on these results, we can prove that (x∗, µ∗) verifies the KKT conditions of problem P1 if and only if (x∗, y∗, λ∗) verifies the KKT conditions of problem P2, where λ∗ = µ∗ and y∗_i = √(−g_i(x∗)), i = 1, …, m.

P1 =⇒ P2 Consider (x∗, µ∗) that verifies the KKT conditions of problem P1, such that g_i(x∗) = 0 for i = 1, …, p and g_i(x∗) < 0 for i = p + 1, …, m. Define y∗ such that

y∗_i = 0, i = 1, …, p,
y∗_i = √(−g_i(x∗)), i = p + 1, …, m,    (6.110)

and define λ∗ = µ∗. Then (x∗, y∗, λ∗) verifies the KKT conditions of problem P2.

• Constraints (6.90) are trivially verified from the definition of y∗.
• The first order conditions (6.105) are exactly the same as (6.85).
• The second order KKT conditions are also trivially verified. Consider a direction d that verifies (6.106) and (6.107). As (6.106) is equivalent to (6.87), we deduce from (6.88) that

  Σ_{j=1}^n Σ_{k=1}^n ( ∂²f/∂x_j∂x_k (x∗) + Σ_{i=1}^p λ∗_i ∂²g_i/∂x_j∂x_k (x∗) ) (d_x)_j (d_x)_k ≥ 0.    (6.111)

  Now, (6.108) results from (6.111) and (6.86), which says that λ∗_i ≥ 0 for each i.

P2 =⇒ P1 Consider (x∗, y∗, λ∗) that verifies the KKT conditions of problem P2, such that y∗_i = 0 for i = 1, …, p and y∗_i ≠ 0 for i = p + 1, …, m. Then (x∗, µ∗), where µ∗ = λ∗, verifies the KKT conditions of problem P1.

• The constraints (6.78) are direct consequences of (6.90).
• The first order conditions (6.105) are exactly the same as (6.85).
• The conditions (6.86) on the Lagrange multipliers are verified, as for P2, λ∗_i ≥ 0, for i = 1, …, p and λ∗_i = 0, i = p + 1, …, m (see the discussion above).
• Consider d that verifies (6.87). Define d_x = d, and define d_y such that (d_y)_i = 0, i = 1, …, p, and

  (d_y)_i = − (1/(2 y∗_i)) Σ_{k=1}^n (d_x)_k ∂g_i/∂x_k (x∗)    (6.112)
  for i = p + 1, …, m. By definition, d_x and d_y verify (6.106) and (6.107). Therefore, (6.108) holds. As (d_y)_i = 0, i = 1, …, p, we obtain (6.88).
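The equivalence can be illustrated on a small instance. The sketch below is not part of the original text; the instance is ours and NumPy is assumed. It solves (P1) in closed form and checks that the corresponding point of (P2) satisfies (6.101) and (6.102):

import numpy as np

# (P1): min ||x - c||^2 subject to g(x) = x1^2 + x2^2 - 1 <= 0, with c outside the disk.
c = np.array([2.0, 1.0])
x_star = c / np.linalg.norm(c)     # projection of c on the unit disk
mu_star = np.linalg.norm(c) - 1.0  # multiplier from stationarity: x*(1 + mu) = c

grad_f = 2 * (x_star - c)
grad_g = 2 * x_star
print(np.allclose(grad_f + mu_star * grad_g, 0))   # (6.85) holds

# (P2): slack formulation h(x, y) = g(x) + y^2 = 0, with lambda* = mu*.
g_val = x_star @ x_star - 1.0
y_star = np.sqrt(max(0.0, -g_val))  # y* = sqrt(-g(x*)); max guards against round-off
lam_star = mu_star
print(np.allclose(grad_f + lam_star * grad_g, 0))  # (6.101) holds
print(np.isclose(2 * lam_star * y_star, 0))        # (6.102) holds, as y* = 0 here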

6.3 Lagrange multipliers: sufficient conditions

Similarly to the approach presented in Section 6.2, we start with problems with equality constraints. We then generalize the result for general problems.

The demonstration of the sufficient optimality condition utilizes what is called an augmented Lagrangian, which is also used for algorithms.

Definition 6.18 (Augmented Lagrangian). Consider the optimization problem with equality constraints (1.71)–(1.72), min_{x∈R^n} f(x) subject to h(x) = 0, and let us take a parameter c ∈ R, c > 0, called penalty parameter. The Lagrangian function of the problem

min_{x∈R^n} f(x) + (c/2)‖h(x)‖² subject to h(x) = 0    (6.113)

is called the augmented Lagrangian function of the problem (1.71)–(1.72) and is expressed as

L_c(x, λ) = L(x, λ) + (c/2)‖h(x)‖² = f(x) + λ^T h(x) + (c/2)‖h(x)‖².    (6.114)

The idea of the objective function (6.113) is to penalize points x that violate the constraints, hence the name penalty parameter for c.
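To see the augmented Lagrangian at work, consider the problem min x_1 + x_2 subject to x_1^2 + x_2^2 − 2 = 0, whose solution is x∗ = (−1, −1)^T with λ∗ = 1/2. The sketch below is not part of the original text; it assumes NumPy and SciPy are available, and the instance is ours. It minimizes L_c(·, λ∗) without constraints and recovers x∗, which is the mechanism exploited in the proof of Theorem 6.19 below:

import numpy as np
from scipy.optimize import minimize

lam = 0.5  # lambda* for this instance

def L_c(x, c):
    # Augmented Lagrangian (6.114) for f(x) = x1 + x2 and h(x) = x1^2 + x2^2 - 2.
    h = x[0] ** 2 + x[1] ** 2 - 2.0
    return x[0] + x[1] + lam * h + 0.5 * c * h ** 2

for c in (0.1, 1.0, 10.0):
    res = minimize(L_c, x0=np.array([0.5, -0.5]), args=(c,))
    print(c, res.x)  # approximately (-1, -1) in each case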

6.3.1 Equality constraints

Thanks to the Lagrangian function, the sufficient conditions are similar to those for the unconstrained case. However, we should note the role of the linearized cone.

Theorem 6.19 (Sufficient optimality conditions: equality constraints). Let f : R^n → R and h : R^n → R^m be twice differentiable functions. Consider x∗ ∈ R^n and λ∗ ∈ R^m such that

∇L(x∗, λ∗) = 0    (6.115)

and

y^T ∇²_xx L(x∗, λ∗) y > 0, ∀y ∈ D(x∗), y ≠ 0,    (6.116)

where L is the Lagrangian function of the optimization problem min_{x∈R^n} f(x) subject to h(x) = 0 and D(x∗) is the linearized cone in x∗. Then, x∗ is a strict local minimum of the optimization problem.

Proof. We first note that any solution to the augmented problem (6.113) is also a solution to the original problem. We go back to a problem of unconstrained optimization thanks to the augmented Lagrangian function, by showing that x∗ is a strict local minimum of the problem

min_{x∈R^n} L_c(x, λ∗)    (6.117)

for sufficiently large c. Indeed,

∇_x L_c(x∗, λ∗) = ∇f(x∗) + ∇h(x∗)(λ∗ + c h(x∗))   by differentiating (6.114)
               = ∇f(x∗) + ∇h(x∗)λ∗                because x∗ is feasible
               = ∇_x L(x∗, λ∗)                    according to Definition 4.3
               = 0                                from (6.115).

Similarly, we obtain

∇²_xx L_c(x∗, λ∗) = ∇²_xx L(x∗, λ∗) + c ∇h(x∗)∇h(x∗)^T.    (6.118)

By applying the theorem for the formation of a positive definite matrix (Theorem C.18), there exists ĉ such that (6.118) is positive definite for all c > ĉ. According to Theorem 5.7, x∗ is a strict local minimum of the unconstrained problem (6.117) for sufficiently large c.

According to Definition 1.6, there exists ε > 0 such that

L_c(x∗, λ∗) < L_c(x, λ∗), ∀x ∈ R^n, x ≠ x∗ such that ‖x − x∗‖ < ε.    (6.119)

According to Definition 6.18 of L_c, we get

f(x∗) < f(x), ∀x ∈ R^n, x ≠ x∗ such that ‖x − x∗‖ < ε and h(x) = 0.    (6.120)

According to Definition 1.6, x∗ is a strict local minimum of the problem.

Notes
• Theorem 6.10 can be demonstrated by using the same constraint elimination tech-
nique as for Theorem 6.8. The logic behind the demonstration is the same, but
the proof is more technical (see Bertsekas, 1999).
• No constraint qualifications appear in the sufficient optimality conditions, neither
linear independence nor any other.

6.3.2 Inequality constraints

Theorem 6.20 (Sufficient optimality conditions). Let f : R^n → R, h : R^n → R^m and g : R^n → R^p be twice differentiable functions. Consider x∗ ∈ R^n, λ∗ ∈ R^m and µ∗ ∈ R^p such that

∇_x L(x∗, λ∗, µ∗) = 0,    (6.121)
h(x∗) = 0,    (6.122)
g(x∗) ≤ 0,    (6.123)
µ∗ ≥ 0,    (6.124)
µ∗_j g_j(x∗) = 0, j = 1, …, p,    (6.125)
µ∗_j > 0, ∀j ∈ A(x∗),    (6.126)

y^T ∇²_xx L(x∗, λ∗, µ∗) y > 0, ∀y ≠ 0 such that
y^T ∇h_i(x∗) = 0, i = 1, …, m,
y^T ∇g_i(x∗) = 0, i = 1, …, p such that g_i(x∗) = 0,    (6.127)

where L is the Lagrangian function of the optimization problem min_{x∈R^n} f(x) subject to h(x) = 0 and g(x) ≤ 0. Then, x∗ is a strict local minimum of the optimization problem.

Proof. We use slack variables (Definition 1.4) to obtain the following optimization problem with equality constraints

min_{x∈R^n, z∈R^p} f(x)    (6.128)

subject to

h_i(x) = 0, i = 1, …, m,
g_i(x) + z_i^2 = 0, i = 1, …, p,    (6.129)

and let us define

z∗_i = √(−g_i(x∗))    (6.130)

such that g_i(x∗) + (z∗_i)^2 = 0 is trivially satisfied.

The Lagrangian function of this problem is

L̂(x, z, λ, µ) = f(x) + Σ_{i=1}^m λ_i h_i(x) + Σ_{i=1}^p µ_i (g_i(x) + z_i^2)    (6.131)

and

∇_x L̂(x, z, λ, µ) = ∇f(x) + Σ_{i=1}^m λ_i ∇h_i(x) + Σ_{i=1}^p µ_i ∇g_i(x) = ∇_x L(x, λ, µ)    (6.132)

and

∂L̂(x, z, λ, µ)/∂z_i = 2 µ_i z_i, i = 1, …, p.    (6.133)

Moreover, by expressing x̂ = (x ; z), we have

∇²_x̂x̂ L̂(x, z, λ, µ) = ( ∇²_xx L(x, λ, µ)  0 ; 0  diag(2µ_1, 2µ_2, …, 2µ_p) ).    (6.134)

From the hypothesis (6.125) and as per (6.130), we have µ∗_i z∗_i = 0. Moreover, from the hypothesis (6.121), we have

∇_x̂ L̂(x∗, z∗, λ∗, µ∗) = 0.    (6.135)

Consider a non zero vector (y ; w) ∈ R^{n+p} in the linearized cone at (x∗ ; z∗)

for the problem (6.128)–(6.129), i.e., such that

y^T ∇h_i(x∗) = 0, i = 1, …, m,    (6.136)

and

y^T ∇g_i(x∗) + 2 z∗_i w_i = 0, i = 1, …, p.    (6.137)

Note that if i ∈ A(x∗), then z∗_i = 0 and (6.137) is written as y^T ∇g_i(x∗) = 0. The vector y always corresponds to the conditions of the hypothesis (6.127). We have

( y^T  w^T ) ∇²_x̂x̂ L̂(x∗, z∗, λ∗, µ∗) ( y ; w )
  = y^T ∇²_xx L(x∗, λ∗, µ∗) y + 2 Σ_{i=1}^p µ∗_i w_i²       from (6.134)
  = y^T ∇²_xx L(x∗, λ∗, µ∗) y + 2 Σ_{i∈A(x∗)} µ∗_i w_i²     from (6.125).

From the hypothesis (6.127), the first term is positive if y ≠ 0. From the hypothesis (6.124), each term of the second sum is non negative. If y = 0, there is necessarily an i such that w_i ≠ 0. In order for (6.137) to be satisfied, we have that z∗_i = 0 and then i ∈ A(x∗). From (6.126), the corresponding term µ∗_i w_i² is positive.

The sufficient optimality conditions of Theorem 6.19 are satisfied for x∗, z∗, λ∗ and µ∗, and (x∗ ; z∗) is a strict local minimum of (6.128)–(6.129). Consequently, x∗ is a strict local minimum of the initial problem.
The condition (6.126) is called the strict complementarity condition. The fol-
lowing example illustrates its importance.
Example 6.21 (Importance of the strict complementarity condition). Consider the problem

min_{x∈R^2} (1/2)(x_1^2 − x_2^2)    (6.138)

subject to

x_2 ≤ 0,    (6.139)

for which the objective function is illustrated in Figure 6.7. We demonstrate that all sufficient optimality conditions except (6.126) are satisfied for x∗ = (0, 0)^T and µ∗ = 0. We have

L(x, µ) = (1/2)(x_1^2 − x_2^2) + µ x_2.    (6.140)

Then,

∇_x L(x, µ) = ( x_1 ; −x_2 + µ )

and (6.121) is satisfied when x∗ = (0, 0)^T and µ∗ = 0. The other conditions are trivially satisfied, and (6.126) is not satisfied because the constraint is active in x∗ and µ∗ = 0.

We now consider the point (0, −α), with α > 0. It is feasible and the objective function is −α²/2, which is strictly smaller than the value in x∗, which is therefore not a local minimum.

To understand why it is not a local optimum, let us take the proof of Theorem 6.20 and transform the problem into a problem with equality constraints:

min_{x∈R^2} (1/2)(x_1^2 − x_2^2)    (6.141)

subject to

x_2 + z² = 0.    (6.142)

We have

z∗ = √(−g(x∗)) = 0.    (6.143)

The Lagrangian function of this problem is

L̂(x, z, µ) = (1/2)(x_1^2 − x_2^2) + µ (x_2 + z²).    (6.144)

The Hessian ∇²_x̂x̂ L̂(x∗, z∗, µ∗) used in the proof is

∇²_x̂x̂ L̂(x, z, µ) = ( 1  0  0 ; 0  −1  0 ; 0  0  2µ )

and

∇²_x̂x̂ L̂(x∗, z∗, µ∗) = ( 1  0  0 ; 0  −1  0 ; 0  0  0 ),

and it is singular. As in the proof, let us take a vector belonging to the linearized cone at (x∗ ; z∗) of the problem with equality constraints, i.e.,

( y ; w ) = ( 0, 0, γ )^T.

For all γ, we have

( 0  0  γ ) ( 1  0  0 ; 0  −1  0 ; 0  0  0 ) ( 0, 0, γ )^T = 0,

and the sufficient condition for the problem with equality constraints is not satisfied, which prevents us from proving Theorem 6.20.

We conclude this section with two examples of the use of optimality conditions
to identify critical points. They both lead to the resolution of a system of equations,
the topic of the next section of this book.
Example 6.22 (Identification of critical points – I). Consider the problem

min_{x∈R^2} 3x_1^2 + x_2^2

subject to

2x_1 + x_2 = 1,

illustrated in Figure 6.9. The necessary optimality condition (6.23) is written as

6x_1 + 2λ = 0
2x_2 + λ = 0
2x_1 + x_2 − 1 = 0.

This is a system with three linear equations, with three unknowns, for which the solution is

x∗_1 = 2/7, x∗_2 = 3/7, λ∗ = −6/7.

We now need only ensure that this point satisfies the sufficient second-order conditions in order to determine that it is indeed a solution.

• x∗ = ( 27 , 73 )
x2

x1

Figure 6.9: Problem for Example 6.22
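Since the optimality conditions of this example form a linear system, they can be solved directly; a minimal sketch (not part of the original text, assuming NumPy):

import numpy as np

# Unknowns (x1, x2, lambda):
#   6 x1          + 2 lambda = 0
#          2 x2   +   lambda = 0
#   2 x1  +   x2              = 1
K = np.array([[6.0, 0.0, 2.0],
              [0.0, 2.0, 1.0],
              [2.0, 1.0, 0.0]])
rhs = np.array([0.0, 0.0, 1.0])
print(np.linalg.solve(K, rhs))  # [2/7, 3/7, -6/7]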



Example 6.23 (Identification of critical points – II). Consider the problem

min_{x∈R^2} 3x_1^2 + x_2^2

subject to

x_1^2 + 4x_1 − x_2 + 3 = 0,

illustrated in Figure 6.10. The necessary optimality condition (6.23) is expressed as

6x_1 + 2λx_1 + 4λ = 0
2x_2 − λ = 0
x_1^2 + 4x_1 + 3 = 0.

This is a system of three non linear equations, with three unknowns. One solution is x_1 = −1, x_2 = 1.5, and λ = 3. It is not necessarily straightforward to calculate it.

Figure 6.10: Problem of Example 6.23, with the critical point x∗
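Such non linear systems are typically solved numerically, which is the topic of the next part of the book. As a preview (not part of the original text, assuming SciPy is available), a root finder applied to the system above, started near the solution:

import numpy as np
from scipy.optimize import fsolve

def kkt_system(v):
    x1, x2, lam = v
    return [6 * x1 + 2 * lam * x1 + 4 * lam,
            2 * x2 - lam,
            x1 ** 2 + 4 * x1 + 3]

print(fsolve(kkt_system, np.array([-1.2, 1.0, 2.0])))  # close to [-1, 1.5, 3]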

6.4 Sensitivity analysis

When the data of an optimization problem is slightly disturbed, the solution to the perturbed problem generally does not differ fundamentally from that of the unperturbed problem. We first analyze this relation for problems with equality constraints.

Theorem 6.24 (Sensitivity analysis: equality constraints). Let f : R^n → R and h : R^n → R^m be twice continuously differentiable functions. Consider the optimization problem (1.71)–(1.72)

min_{x∈R^n} f(x)

subject to

h(x) = 0.

Moreover, let x∗ be a local minimum and let λ∗ satisfy the sufficient optimality conditions (6.115) and (6.116), such that the constraints are linearly independent in x∗, according to Definition 3.8. Consider a perturbation of the data characterized by δ ∈ R^m, and the perturbed optimization problem

min_{x∈R^n} f(x)

subject to

h(x) = δ.

There thus exists a sphere S ⊂ R^m centered in 0 such that if δ ∈ S, there exist x(δ) and λ(δ) satisfying the sufficient optimality conditions of the perturbed problem. The functions x : R^m → R^n : δ ↦ x(δ) and λ : R^m → R^m : δ ↦ λ(δ) are continuously differentiable in S, with x(0) = x∗ and λ(0) = λ∗. Moreover, for all δ ∈ S, we have

∇p(δ) = −λ(δ),    (6.145)

where

p(δ) = f(x(δ)).    (6.146)

Proof. We note

γ = ( x ; λ ) ∈ R^{n+m}

and consider the function F : R^{m+n+m} → R^{n+m} defined by

F(δ, γ) = ( ∇f(x) + ∇h(x)λ ; h(x) − δ ) = ( ∇_x L(x, λ) ; h(x) − δ ).    (6.147)

We first demonstrate that the gradient matrix

∇_γ F(δ, γ∗) = ( ∇²_xx L(x∗, λ∗)  ∇h(x∗) ; ∇h(x∗)^T  0 )

is non singular. We assume by contradiction that this is not the case. There then exist y ∈ R^n and z ∈ R^m, not both zero, such that

∇_γ F(δ, γ∗) ( y ; z ) = 0,

i.e.,

∇²_xx L(x∗, λ∗) y + ∇h(x∗) z = 0,    (6.148)
∇h(x∗)^T y = 0.    (6.149)

We have

y^T ∇²_xx L(x∗, λ∗) y = −y^T ∇h(x∗) z    from (6.148)
                      = 0                from (6.149).

Since the sufficient optimality condition (6.116) is satisfied, ∇²_xx L(x∗, λ∗) is positive definite and y = 0. Then, according to (6.148), ∇h(x∗) z = 0. By assumption, the constraints are linearly independent at x∗ and the matrix ∇h(x∗) is of full rank. Then, z = 0, which contradicts the fact that (y ; z) is non zero. The matrix ∇_γ F(δ, γ∗) is indeed non singular and we can apply the theorem of implicit functions (Theorem C.6): there exist neighborhoods V_0 ⊆ R^m around δ⁺ = 0 and V_{γ∗} ⊆ R^{n+m} around γ⁺ = γ∗, as well as a continuous function

φ : V_0 → V_{γ∗} : δ ↦ γ(δ) = ( x(δ) ; λ(δ) )

such that

F(δ, γ(δ)) = 0, ∀δ ∈ V_0,

i.e.,

( ∇f(x(δ)) + ∇h(x(δ)) λ(δ) ; h(x(δ)) − δ ) = 0.    (6.150)

We now demonstrate that, for δ sufficiently close to 0, the sufficient optimality conditions of the perturbed problem are satisfied. Assuming (by contradiction) that this is not the case, there exists a sequence (δ_k)_k such that lim_{k→∞} δ_k = 0, and a sequence (y_k)_k, with y_k ∈ R^n, ‖y_k‖ = 1 and ∇h(x(δ_k))^T y_k = 0 for all k, such that

y_k^T ∇²_xx L(x(δ_k), λ(δ_k)) y_k ≤ 0, ∀k.

Consider a subsequence of (y_k)_k converging toward ȳ ≠ 0. When we take the limit, we obtain by continuity of ∇²_xx L (as a result of the continuity of ∇²_xx f and ∇²_xx h_i, i = 1, …, m) that

ȳ^T ∇²_xx L(x∗, λ∗) ȳ ≤ 0,

which contradicts the sufficient optimality condition (6.116) in x∗ and λ∗.

By differentiating the second row of (6.150), we obtain

∇_δ h(x(δ)) = ∇x(δ) ∇h(x(δ)) = I.    (6.151)

When multiplying the first row of (6.150) by ∇x(δ), we get

0 = ∇x(δ)∇f(x(δ)) + ∇x(δ)∇h(x(δ)) λ(δ) = ∇x(δ)∇f(x(δ)) + λ(δ),

where the second equality comes from (6.151). Therefore,

∇p(δ) = ∇_δ f(x(δ)) = ∇x(δ)∇f(x(δ)) = −λ(δ),

which demonstrates (6.145).



Example 6.25 (Sensitivity). Consider the problem min_{x∈R} x² subject to x = 1, for which the solution is x∗ = 1, λ∗ = −2. We consider the perturbed problem min_{x∈R} x² subject to x = 1 + δ, for which the solution is x(δ) = 1 + δ and λ(δ) = −2δ − 2. We have f(x(δ)) = δ² + 2δ + 1 and

(d/dδ) f(x(δ)) = 2δ + 2 = −λ(δ).

The quantity ∇p(δ) represents the marginal modification of the objective function for a perturbation δ of the constraints. When δ is small, we use Taylor's theorem (Theorem C.1) to obtain

p(δ) = p(0) + δ^T ∇p(0) + o(‖δ‖).

Neglecting the last term, we obtain

f(x(δ)) ≈ f(x∗) − δ^T λ∗.

Note that if p(δ) is linear, we have exactly

f(x(δ)) = f(x∗) − δ^T λ∗.    (6.152)

This result has significant practical consequences. Indeed, it becomes possible to measure the impact of a perturbation of the constraint on the objective function, without re-optimizing.

Example 6.26 (Sensitivity analysis). We consider a company manufacturing two products. Each unit of the first product brings in €6,000, while each unit of the second product brings in €5,000. A total of 10 machine-hours and 15 tons of raw material are available daily. Each unit of the first product requires 2 machine-hours and 1 ton of raw material. Each unit of the second product requires 1 machine-hour and 3 tons of raw material. In thousands of euros, the optimal production that the company should consider is obtained by solving the optimization problem

max_{x_1,x_2} 6x_1 + 5x_2

subject to

2x_1 + x_2 ≤ 10
x_1 + 3x_2 ≤ 15
x_1, x_2 ≥ 0.

We omit the non negativity constraints to maintain a simple formulation (these constraints are inactive at the solution). We first express the problem in the form (1.71)–(1.72) by changing the maximization problem into a minimization problem, and by including slack variables (see Section 1.2):

min_{x_1,x_2,x_3,x_4} f(x) = −6x_1 − 5x_2

subject to

h_1(x) = 2x_1 + x_2 + x_3^2 − 10 = 0
h_2(x) = x_1 + 3x_2 + x_4^2 − 15 = 0,

where x_3 and x_4 are the slack variables. The solution to the problem is x∗ = (3, 4, 0, 0)^T and λ∗ = (13/5, 4/5)^T, enabling the company to bring in €38,000 per day.

Figure 6.11: Graphical analysis of Example 6.26: (a) δ = 5, with x∗ = (3, 4) and x(δ) = (2, 6); (b) δ = 16, with x∗ = (3, 4) and x(δ) = (0, 10)

In order to increase its production, the company wishes to invest by purchasing an additional quantity δ of raw material per day. In this case, the constraints would become

h_1(x) = 2x_1 + x_2 + x_3^2 − 10 = 0
h_2(x) = x_1 + 3x_2 + x_4^2 − 15 = δ.

To determine what this investment would bring in, we use (6.152) with

δ = ( 0, δ )^T

to obtain

f(x(δ)) = f(x∗) − δ^T λ∗ = −38 − (4/5)δ.    (6.153)

Therefore, the purchase of 5 tons of raw material per day would enable the company to bring in an additional €4,000. If this purchase costs less than €4,000, it is worth going through with the investment. Otherwise, the investment is of no use.

Note that this result is only valid for small values of δ. If δ = 16, such that 31 tons of raw material are available each day, the company no longer has enough machine-hours to use up all of the raw material. Therefore, the second constraint is no longer active (here, x_4 is positive). The company should thus produce only the second product, which consumes half as many machine-hours. In this case, the purchase of additional raw material would not enable the company to earn more, and the company should thus rather invest in new machines.

The inspiration for this example came from de Werra et al. (2003).
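The prediction of (6.153) can be confirmed by re-optimizing, which is precisely what sensitivity analysis lets us avoid in practice. A small sketch (not part of the original text, assuming SciPy is available):

import numpy as np
from scipy.optimize import linprog

def optimal_value(delta):
    # max 6 x1 + 5 x2 written as min -6 x1 - 5 x2, with delta extra tons of raw material.
    res = linprog(c=[-6.0, -5.0],
                  A_ub=[[2.0, 1.0], [1.0, 3.0]],
                  b_ub=[10.0, 15.0 + delta],
                  bounds=[(0.0, None), (0.0, None)],
                  method="highs")
    return res.fun

delta = 5.0
print(optimal_value(delta))                    # -42.0, by re-optimization
print(optimal_value(0.0) - 4.0 / 5.0 * delta)  # -42.0, predicted by (6.153)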

Corollary 6.27 (Sensitivity analysis). Let f : R^n → R, g : R^n → R^p and h : R^n → R^m be twice differentiable functions. Consider the optimization problem (1.71)–(1.73)

min_{x∈R^n} f(x)

subject to

h(x) = 0
g(x) ≤ 0.

Moreover, let x∗ be a local minimum and let λ∗, µ∗ satisfy the sufficient optimality conditions (6.121)–(6.126), such that the constraints are linearly independent in x∗, according to Definition 3.8. Consider perturbations δ_h ∈ R^m and δ_g ∈ R^p, and the perturbed optimization problem

min_{x∈R^n} f(x)

subject to

h(x) = δ_h
g(x) ≤ δ_g.

Then, there exists a sphere S ⊂ R^{m+p} centered in 0 such that if δ = (δ_h^T, δ_g^T)^T ∈ S, there exist x(δ), λ(δ) and µ(δ) satisfying the sufficient optimality conditions of the perturbed problem. The functions x : R^{m+p} → R^n, λ : R^{m+p} → R^m and µ : R^{m+p} → R^p are continuously differentiable in S, with x(0, 0) = x∗, λ(0, 0) = λ∗ and µ(0, 0) = µ∗. Moreover, for all δ ∈ S, we have

∇_{δ_h} p(δ) = −λ(δ),
∇_{δ_g} p(δ) = −µ(δ),    (6.154)

with

p(δ) = f(x(δ)).    (6.155)

Proof. From Theorem 3.5, x∗, λ∗ and µ∗ satisfy the optimality conditions of the
problem

min  f(x)
x∈Rn

subject to

h(x) = 0
gi(x) = 0,  ∀i ∈ A(x∗).

From Theorem 6.24, there exist x(δ), λ(δ) and µ(δ) satisfying the sufficient
optimality conditions of the problem

min  f(x)                                                        (6.156)
x∈Rn

subject to

h(x) = δh                                                        (6.157)
gi(x) = (δg)i,  ∀i ∈ A(x∗).                                      (6.158)

The functions x : Rm+p → Rn, λ : Rm+p → Rm and µi : Rm+p → R are continuously
differentiable in S, with x(0, 0) = x∗, λ(0, 0) = λ∗ and µi(0, 0) = µ∗i, i ∈ A(x∗).
Moreover, for all δ ∈ S, we have

∇δh p(δ) = −λ(δ)
(∇δg p(δ))i = −µi(δ),  i ∈ A(x∗).

We now need only verify the result for inequality constraints that are inactive in
x∗, i.e.,

gi(x∗) < 0.

In this case, if δi is sufficiently close to 0, gi(x(δ)) < δi, and the constraint i is
also inactive at the solution to the perturbed problem (by continuity of gi). Then,
according to Theorem 3.5, the perturbed problem is equivalent to the problem
(6.156)–(6.158). If we take µi(δ) = 0 for all i ∉ A(x∗), then x(δ), λ(δ) and µ(δ)
satisfy the sufficient optimality conditions. Moreover, regardless of the value
of δi (small enough for the constraint of the problem to remain inactive), the
value of x(δ) remains constant, since it is determined by the problem
(6.156)–(6.158), which does not depend on δi if i ∉ A(x∗). Therefore,

∂p/∂δi = Σ_{j=1}^{n} (∂f/∂xj) (∂xj/∂δi) = 0 = −µi(δ),

which concludes the proof.


We emphasize the importance of the condition requiring that the constraints that
are inactive in the initial problem remain so in the perturbed problem. This is
illustrated for Example 6.26 in Figure 6.11. When δ = 5, the solution x(δ) is such
that the constraints x1 ≥ 0 and x2 ≥ 0, inactive in x∗, remain inactive in x(δ).
However, when δ = 16, the constraint x1 ≥ 0 becomes active in x(δ), and we leave
the domain of application of the sensitivity theorem.

6.5 Linear optimization


We now analyze in greater detail the optimality conditions for the linear
optimization problem

min  cT x                                                        (6.159)
x∈Rn

subject to

Ax = b
x ≥ 0,                                                           (6.160)

where A ∈ Rm×n, b ∈ Rm, c ∈ Rn, and for which the Lagrangian function is

L(x, λ, µ) = cT x + λT (b − Ax) − µT x,                           (6.161)

with λ ∈ Rm and µ ∈ Rn. By directly applying Theorem 6.13, the necessary
first-order optimality condition is expressed as

∇x L(x, λ, µ) = c − AT λ − µ = 0
µ ≥ 0.                                                           (6.162)

These conditions represent exactly the constraints (4.23) of the dual problem
described in Section 4.2. The second-order conditions are trivial, because
∇²xx L(x, λ, µ) = 0 for all (x, λ, µ). The complementarity condition (6.57) is
simply written as

µi xi = 0,  i = 1, . . . , n,                                     (6.163)

or, equivalently,

(ci − Σ_{j=1}^{m} aji λj) xi = 0,  i = 1, . . . , n.              (6.164)

We show below that this condition happens to also be sufficient for optimality.
We can also utilize the necessary conditions of Theorem 5.1. In particular, if we
consider the jth basic direction dj (Definition 3.42), the directional derivative of
the objective function in the direction dj is given by

∇f(x∗)T dj = cT dj = cBT (dj)B + cNT (dj)N = −cBT B−1 Aj + cj.    (6.165)

In the context of linear optimization, this quantity is often called the reduced cost.

Definition 6.28 (Reduced costs). Consider the linear optimization problem
(6.159)–(6.160) and let x be a feasible basic solution of the constraint polyhedron.
The reduced cost of xj is

c̄j = cj − cBT B−1 Aj,  j = 1, . . . , n.                          (6.166)

In matrix form, we have

c̄ = c − AT B−T cB.                                               (6.167)

The reduced costs can be decomposed into their basic and non basic components,
as follows:

c̄B = cB − BT B−T cB = 0,                                         (6.168)

and

c̄N = cN − NT B−T cB.                                             (6.169)

Therefore, for any basis B, the basic components of the reduced costs are always
0. This, together with the geometric interpretation of the non basic components,
is formalized in the next theorem.

Theorem 6.29 (Reduced costs). Consider the linear problem (6.159)–(6.160)
and let x be a basic solution of the constraint polyhedron. When j is the index
of a non basic variable, the jth reduced cost is the directional derivative of the
objective function in the jth basic direction. When j is the index of a basic
variable, the jth reduced cost is zero.

Proof. In the case where j is non basic, the proof can be found above (see (6.165)).
For basic j, we note that B−1 B = I, so that B−1 Aj is a column of the identity
matrix. Then, cBT B−1 Aj = cj and the reduced cost is zero.
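As an illustration (a sketch added here, not taken from the original text), the
reduced costs (6.166)–(6.167) can be computed with a few lines of linear algebra.
We reuse the data of Example 6.26 in standard form, with linear slack variables x3
and x4 in place of the squared slacks used there; the variable names are assumptions.

    # Sketch: reduced costs for min -6 x1 - 5 x2 in standard form.
    import numpy as np

    A = np.array([[2.0, 1.0, 1.0, 0.0],
                  [1.0, 3.0, 0.0, 1.0]])
    c = np.array([-6.0, -5.0, 0.0, 0.0])

    basic = [0, 1]                        # x1 and x2 are basic at the optimum
    B = A[:, basic]
    lam = np.linalg.solve(B.T, c[basic])  # lambda = B^{-T} c_B
    cbar = c - A.T @ lam                  # reduced costs (6.167)
    print(cbar)                           # [0. 0. 2.6 0.8]

The basic components vanish, as the theorem states, and the non basic components
are non negative, which announces the optimality of this basis (Theorem 6.31 below).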
The concept of reduced costs now enables us to state the optimality conditions
for linear optimization.

Theorem 6.30 (Necessary optimality conditions, linear optimization). Consider
the linear problem (6.159)–(6.160) and let x∗ be a non degenerate basic solution
of the constraint polyhedron. If x∗ is the solution to (6.159)–(6.160), then
c̄ ≥ 0.

Proof. Consider the basic direction dk, for a non basic index k. According to
Theorem 3.44, the non degeneracy of x∗ ensures that dk is feasible. Therefore,
given the convexity of the set of constraints, the necessary condition of
Theorem 6.3 applies and

∇f(x∗)T dk = c̄k ≥ 0,

by using (6.166) and Theorem 6.29. Since the basic components of c̄ are zero, we
conclude that c̄ ≥ 0.

Theorem 6.31 (Sufficient conditions, linear optimization). Consider the linear
problem (6.159)–(6.160) and let x∗ be a feasible basic solution of the constraint
polyhedron. If c̄ ≥ 0, then x∗ is optimal.

Proof. Let y be an arbitrary feasible point and w = y − x∗. Since the feasible set
is convex, w is a feasible direction (Theorem 3.11). If dj is the jth basic direction
(Definition 3.42), we have

cT w = Σ_{j∈N} (w)j cT dj                     from Theorem 3.45
     = Σ_{j∈N} (w)j (−cBT B−1 Aj + cj)        from Definition 3.42
     = Σ_{j∈N} (w)j c̄j                        from Definition 6.28.

(In this derivation, dj is a vector of Rn, while (w)j is a scalar, representing
the jth entry of the vector w.)

Since x∗ is a basic solution, j ∈ N implies that x∗j = 0 according to
Definition 3.38. Therefore, (w)j = yj − x∗j = yj ≥ 0 by feasibility of y. Then,
as the reduced costs are non negative, we obtain

cT y − cT x∗ = cT w = Σ_{j∈N} (w)j c̄j ≥ 0,

which proves the optimality of x∗.

Note that the sufficient conditions do not assume that x∗ is non degenerate. To
understand why the necessary conditions may not hold when x∗ is degenerate, we go
back to the example illustrated in Figure 3.16. In this example, the vertex number
2 corresponds to a degenerate solution, and the basic direction d̂3 is not feasible.
Therefore, if this vertex happens to be the optimal solution of an optimization
problem, it is not necessary for the basic direction d̂3 to correspond to an ascent
direction. It does not matter if it is a descent direction, as the direction is not
feasible anyway. If it happens to be a descent direction, the reduced cost is
negative, and the necessary condition is not verified although we are at the
optimal solution.

We now characterize the optimal solution of the dual problem given the optimal
solution of the primal. We obtain an important result for linear optimization,
called strong duality, stating that the optimal value of the primal coincides with
the optimal value of the dual. Moreover, this result provides a direct link between
the reduced costs and the dual variables.

Corollary 6.32 (Optimality of the dual). Consider the primal linear problem
(6.159)–(6.160) and let B be a basis such that B−1 b ≥ 0 and c̄ ≥ 0. Consider
also the dual problem

max  λT b                                                        (6.170)
λ∈Rm

subject to

AT λ ≤ c.                                                        (6.171)

Then, the primal vector x∗ with basic variables

x∗B = B−1 b                                                      (6.172)

and non basic variables x∗N = 0 is optimal for the primal problem, the dual vector

λ∗ = B−T cB                                                      (6.173)

is optimal for the dual problem, and the objective functions are equal, that is,

(λ∗)T b = cT x∗.                                                 (6.174)

Proof. The optimality of x∗ is guaranteed by Theorem 6.31. We also have

(λ∗)T b = cBT B−1 b    from (6.173)
        = cBT x∗B      from (6.172)
        = cT x∗        as x∗N = 0,

proving (6.174). The vector λ∗ is feasible for the dual. Indeed, from (6.167),
we have

AT λ∗ = AT B−T cB = c − c̄.

As c̄ ≥ 0, we obtain (6.171). Consider now any dual feasible λ. By the weak
duality theorem (Theorem 4.9),

λT b ≤ cT x∗ = (λ∗)T b,

which proves the optimality of λ∗ for the dual problem.


The above result leads to an important theorem called strong duality. Consider an
optimal solution x∗ of the primal problem. If x∗ is non degenerate, the condition
c̄ ≥ 0 is a necessary and sufficient condition for its optimality. If it is
degenerate, it can be shown that there exists a basis B such that x∗B = B−1 b ≥ 0
and c̄ = c − AT B−T cB ≥ 0. The idea is that the simplex algorithm described in
Chapter 16, combined with appropriate rules attributed to Bland (1977), terminates
in a finite number of iterations with an optimal basis and non negative reduced
costs.

Theorem 6.33 (Strong duality). Consider the primal linear problem

min  cT x
x∈Rn

subject to

Ax = b
x ≥ 0,

where A ∈ Rm×n, b ∈ Rm, c ∈ Rn, and the dual problem

max  λT b
λ∈Rm

subject to

AT λ ≤ c.

If either the primal or the dual problem has an optimal solution, so does the
other, and the optimal objective values are equal.

Proof. From the discussion above, if x∗ is a solution to the primal problem, there
exists a basis B such that x∗B = B−1 b ≥ 0 and c̄ = c − AT B−T cB ≥ 0. Therefore,
Corollary 6.32 applies, the optimal solution of the dual is B−T cB, and the
objective functions are equal.

If λ∗ is a solution to the dual problem, the fact that the primal problem is the
dual of the dual (Theorem 4.15) is used to prove the result.

Note that another proof of this result, exploiting Farkas' lemma, is presented in
Theorem 4.17. Finally, we show that the complementarity condition (6.164) is a
necessary and sufficient optimality condition.

Theorem 6.34 (Complementarity slackness). Consider the primal linear problem

min  cT x
x∈Rn

subject to

Ax = b
x ≥ 0,

where A ∈ Rm×n, b ∈ Rm, c ∈ Rn, and the dual problem

max  λT b
λ∈Rm

subject to

AT λ ≤ c.

Consider x∗ primal feasible and λ∗ dual feasible. Then x∗ is optimal for the
primal and λ∗ is optimal for the dual if and only if

(ci − Σ_{j=1}^{m} aji λ∗j) x∗i = 0,  i = 1, . . . , n.            (6.175)

Proof. Conditions (6.175) are the KKT necessary optimality conditions (see
Theorem 6.13 and the discussion at the beginning of the section). To show that
they are sufficient, consider the equation

(c − AT λ∗)T x∗ = Σ_{i=1}^{n} (ci − Σ_{j=1}^{m} aji λ∗j) x∗i = 0.

Therefore,

cT x∗ = (λ∗)T Ax∗.

As x∗ is primal feasible, we have Ax∗ = b and

cT x∗ = (λ∗)T b.

Consequently, the objective function of the primal at x∗ equals the objective
function of the dual at λ∗. We apply Theorem 4.11 to prove the optimality of x∗
and λ∗.
Conditions (6.175) are called complementarity slackness conditions because the
activity of the constraints must be complementary. At the optimal solution, if a
primal variable is positive, that is, if x∗i > 0, the corresponding dual constraint
must be active, that is, ci − Σ_{j=1}^{m} aji λ∗j = 0. Symmetrically, if a dual
constraint is inactive, that is, if ci > Σ_{j=1}^{m} aji λ∗j, the corresponding
primal variable must be equal to 0.
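The following sketch (added for illustration; it reuses Example 6.26 in standard
form with linear slacks, and all names are assumptions) checks complementarity
(6.175) and the equality of the two objective functions numerically. With the sign
convention of the Lagrangian (6.161), the dual solution B−T cB is
λ∗ = (−13/5, −4/5).

    # Sketch: complementarity slackness and strong duality on assumed data.
    import numpy as np

    A = np.array([[2.0, 1.0, 1.0, 0.0],
                  [1.0, 3.0, 0.0, 1.0]])
    b = np.array([10.0, 15.0])
    c = np.array([-6.0, -5.0, 0.0, 0.0])

    x_star = np.array([3.0, 4.0, 0.0, 0.0])          # primal optimum
    lam_star = np.array([-13.0 / 5.0, -4.0 / 5.0])   # dual optimum B^{-T} c_B

    print(np.allclose((c - A.T @ lam_star) * x_star, 0.0))  # True: (6.175)
    print(np.isclose(lam_star @ b, c @ x_star))             # True: both -38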

6.6 Quadratic optimization


We now consider a case of quadratic optimization with equality constraints:

min  (1/2) xT Q x + cT x                                         (6.176)
x∈Rn

subject to

Ax = b,                                                          (6.177)

where Q ∈ Rn×n, c ∈ Rn, A ∈ Rm×n and b ∈ Rm. The Lagrangian function is

L(x, λ) = (1/2) xT Q x + cT x + λT (b − Ax),                      (6.178)

with λ ∈ Rm. By directly applying Theorem 6.13, the necessary first-order
optimality condition is written as

∇x L(x, λ) = Qx + c − AT λ = 0.                                   (6.179)

By combining (6.177) and (6.179), we obtain the linear system

[ Q  −AT ] [ x ]   [ −c ]
[ A   0  ] [ λ ] = [  b ].                                        (6.180)

We characterize below a case in which this system has a unique solution.

Lemma 6.35. Consider the quadratic problem (6.176)–(6.177), with A of full
rank. Let Z ∈ Rn×(n−m) be a matrix of full rank for which the columns form a
basis of the null space of A, i.e., AZ = 0. If the reduced Hessian matrix ZT QZ
is positive definite, then the system (6.180) is non singular and has a unique
solution (x∗, λ∗).

Proof. Consider x and λ such that

[ Q  −AT ] [ x ]   [ 0 ]
[ A   0  ] [ λ ] = [ 0 ],

i.e., Qx = AT λ and Ax = 0. We demonstrate that x and λ are zero in order to
prove that the matrix is non singular. Premultiplying the system by (xT  λT) and
using Ax = 0, we obtain

0 = xT Qx.

Since Ax = 0, x belongs to the null space of A, so there exists y such that
x = Zy. Therefore,

yT ZT QZ y = 0.

Since ZT QZ is positive definite, y = 0. As a result, x = Zy = 0 and the first
equation is written as

Qx − AT λ = −AT λ = 0.

Since A is of full rank, λ = 0.
We calculate the analytical solution to this problem.

Lemma 6.36. Consider the quadratic problem (6.176)–(6.177) with Q = I and
b = 0, i.e.,

min  (1/2) xT x + cT x
 x

subject to

Ax = 0,

where A is of full rank. The solution to this problem is

x∗ = AT (AAT)−1 Ac − c                                           (6.181)
λ∗ = (AAT)−1 Ac.                                                 (6.182)

Proof. The system (6.180) is written as

[ I  −AT ] [ x∗ ]   [ −c ]
[ A   0  ] [ λ∗ ] = [  0 ].

By multiplying the first equation,

x∗ − AT λ∗ = −c,                                                 (6.183)

by A, we obtain

Ax∗ − AAT λ∗ = −Ac.

Since Ax∗ = 0 and A is of full rank, AAT is invertible and we obtain (6.182). We
now need only introduce (6.182) in (6.183) to obtain (6.181).

Lemma 6.37. Consider the quadratic problem (6.176)–(6.177) with Q = I, i.e.,

min  (1/2) xT x + cT x
 x

subject to

Ax = b,

where A is of full rank. The solution to this problem is

x∗ = AT (AAT)−1 (Ac + b) − c                                     (6.184)
λ∗ = (AAT)−1 (Ac + b).                                           (6.185)

Proof. Consider x0 such that Ax0 = b and let y = x − x0, i.e., x = y + x0. The
problem becomes

min  (1/2) (yT y + x0T x0 + 2 yT x0) + cT y + cT x0
 y

subject to

Ay + Ax0 = b.

By removing the constant terms of the objective function and using Ax0 = b, we
obtain

min  (1/2) yT y + (c + x0)T y
 y

subject to

Ay = 0.

According to Lemma 6.36, the solution to this problem is

y∗ = AT (AAT)−1 A(c + x0) − (c + x0)
λ∗ = (AAT)−1 A(c + x0).

We now need only use Ax0 = b and y∗ = x∗ − x0 to obtain the result.

Theorem 6.38 (Analytical solution of a quadratic problem). Consider the
quadratic problem (6.176)–(6.177), min_{x∈Rn} (1/2) xT Qx + cT x subject to
Ax = b, where A is of full rank. If the matrix Q is positive definite, then the
system (6.180) is non singular and has a unique solution (x∗, λ∗) given by

x∗ = Q−1 (AT λ∗ − c)                                             (6.186)

and

λ∗ = (AQ−1 AT)−1 (AQ−1 c + b).                                   (6.187)

Proof. Let Z ∈ Rn×(n−m) be a matrix whose columns form a basis of the null space
of A, i.e., such that AZ = 0. Since Q is positive definite, then so is ZT QZ, and
Lemma 6.35 applies to demonstrate the non singularity of the system and the
unicity of the solution. Let L be a lower triangular matrix such that Q = LLT
and let us take y = LT x. The problem (6.176)–(6.177) is thus written as

min  (1/2) yT L−1 L LT L−T y + cT L−T y = (1/2) yT y + cT L−T y
 y

subject to

AL−T y = b.

The solution to this problem is given by Lemma 6.37, by replacing c with L−1 c
and A with AL−T. Then,

λ∗ = (AL−T L−1 AT)−1 (AL−T L−1 c + b) = (AQ−1 AT)−1 (AQ−1 c + b)

and

y∗ = L−1 AT λ∗ − L−1 c.

We now need only use y∗ = LT x∗ to obtain the result.
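The analytical formulas can be verified numerically. The following sketch (with
arbitrary, assumed data) solves the KKT system (6.180) directly and compares the
result with (6.186)–(6.187).

    # Sketch: equality-constrained QP solved through the KKT system (6.180).
    import numpy as np

    Q = np.array([[4.0, 1.0], [1.0, 3.0]])   # positive definite
    c = np.array([1.0, -2.0])
    A = np.array([[1.0, 1.0]])               # full rank
    b = np.array([1.0])
    n, m = Q.shape[0], A.shape[0]

    K = np.block([[Q, -A.T], [A, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([-c, b]))
    x, lam = sol[:n], sol[n:]

    lam_check = np.linalg.solve(A @ np.linalg.solve(Q, A.T),
                                A @ np.linalg.solve(Q, c) + b)   # (6.187)
    x_check = np.linalg.solve(Q, A.T @ lam_check - c)            # (6.186)
    print(np.allclose(x, x_check), np.allclose(lam, lam_check))  # True True

Note that the code solves linear systems instead of forming the inverses appearing
in (6.186)–(6.187), which is numerically preferable.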
The presentation of the proof of Theorems 6.10, 6.13, 6.19, 6.20, and 6.24 was
inspired by Bertsekas (1999). That of the proof of Theorem 6.31 was inspired by
Bertsimas and Tsitsiklis (1997).

6.7 Exercises
Exercise 6.1. Identify the local optima of the following optimization problems,
and verify the optimality conditions.

1. min_{x∈Rn} ‖x‖₂², subject to Σ_{i=1}^{n} xi = 1.

2. min_{x∈Rn} Σ_{i=1}^{n} xi, subject to ‖x‖₂² = 1.

3. min_{x∈R2} −x1² − x2², subject to (x1/2)² + (x2/2)² ≤ 1 (Hint: plot the level
   curves and the constraints).

4. min_{x∈R2} −x1² − x2², subject to −x1² + x2² ≤ 1, and −5 ≤ x1 ≤ 5 (Hint: plot
   the level curves and the constraints).

5. The Indiana Jones problem (Section 1.1.6): min_{x∈R2} x1² + x2², subject to
   x1 x2 − h x1 − ℓ x2 = 0, x1 ≥ ℓ, x2 ≥ h.
Exercise 6.2. An electricity company must supply a town that consumes 100 MWh
daily. Three plants are used to generate the energy: a gas plant, producing at the
cost of € 800/MWh, a coal plant, producing at the cost of € 1,500/MWh, and a
hydroelectric plant producing at the cost of € 300/MWh. The amount of available
water limits the production of the latter plant to 40 MWh per day. Moreover, due
to ecological concerns, the two other plants are each limited to produce no more
than 80 MWh per day.

1. Formulate a linear optimization problem that would optimize the costs of the
   company.

2. Formulate the dual problem.

3. Prove, using the optimality conditions, that the optimal solution is to produce
   60 MWh per day with the gas plant, 40 MWh per day with the hydroelectric
   plant, and not to use the coal plant.

4. Deduce the optimal values of the dual variables.

5. Use sensitivity analysis to propose profitable investments to the company.

Exercise 6.3. Consider the optimization problem min_{x∈Rn} f(x) subject to

Σ_{i=1}^{n} xi = 1  and  x ≥ 0.

Let x∗ be a local minimum of f.

1. Prove that, if x∗i > 0, then

   ∂f(x∗)/∂xi ≤ ∂f(x∗)/∂xj  ∀j.                                  (6.188)

   (Hint: refer to Example 6.4).

2. Show that, if f is convex, condition (6.188) is sufficient. (Hint: define
   ∆ = mini ∂f(x∗)/∂xi).
Exercise 6.4 (Slack variables). Consider problem (P1)

min  f(x)                                                        (6.189)
x∈Rn

subject to

gi(x) ≤ 0,  i = 1, . . . , m,

and problem (P2)

min  f(x)
x∈Rn, y∈Rm

subject to

gi(x) + yi = 0,  i = 1, . . . , m,
yi ≥ 0,  i = 1, . . . , m.

1. Write the necessary optimality conditions (KKT) for problem (P1), both first
   and second order.

2. Write the necessary optimality conditions (KKT) for problem (P2), both first
   and second order.

3. Prove that (x∗, µ∗) verifies the KKT conditions of problem (P1) if and only if
   (x∗, y∗, λ∗) verifies the KKT conditions of problem (P2), where λ∗ = µ∗ and
   y∗i = −gi(x∗), i = 1, . . . , m (Hint: refer to Example 6.16).

Part III

Solving equations

Equations are more important to me, because politics is for the present, but an
equation is something for eternity.

Albert Einstein

We have seen that the necessary optimality conditions enable us to identify the
critical points (Definition 5.6) that are candidates for the solution to an
optimization problem. In the case of unconstrained optimization, we use condition
(5.1). For constrained optimization, we use conditions (6.11), (6.23), and
(6.55)–(6.57). One way to address the problem is to solve the system of equations
defined by these conditions. This is how the problem in Example 5.8 was solved, as
well as Example 6.22. In these two cases, the system of equations is easily solved.
This is not always the case, as illustrated in Example 6.23.

We now consider numerical methods enabling us to solve such systems of non linear
equations. Even though these methods are not directly utilized for optimization,
they form the basis for the main algorithms.

Chapter 7

Newton’s method

Contents

7.1 Equation with one unknown ......................... 181
7.2 Systems of equations with multiple unknowns ....... 192
7.3 Project ........................................... 198

Newton's method plays a crucial role in the context of solving non linear equations
and, by extension, in that of non linear optimization. Isaac Newton was inspired
by a method from Vieta, and the method was later on improved by Raphson (it is
sometimes called "Newton-Raphson"). We refer the reader to Deuflhard (2012) for
a historical perspective of the method. We introduce it for the simple problem of
solving one equation with one unknown, i.e., by deriving numerical methods to find
x ∈ R such that F(x) = 0.

7.1 Equation with one unknown

Let F : R → R be a real differentiable function of one variable. In order to solve


the equation F(x) = 0, the main idea of Newton’s method consists in simplifying the
problem. Since a non linear equation is complicated to solve, it is replaced by a linear
equation. The concept of replacing a difficult problem with a simpler one is used
throughout this book. We use the term model when referring to a function that is a
simplification of another.
To obtain this simplified equation, we invoke Taylor’s Theorem C.1 which ensures
that a differentiable function can be approximated at a point by a straight line and
that the magnitude of the error decreases with the distance to this point.

Isaac Newton was born prematurely and fatherless on December 25, 1642, in
Woolsthorpe, England. (As 11 days were dropped in September 1752 to adjust the
calendar, the date of his birth in the "new style" calendar, that is, January 4,
1643, is sometimes reported.) He is considered as the father of modern analysis,
especially thanks to his study on differentiable functions, and infinitesimal
calculus (that he called "fluxions"). His most famous work, published in
Philosophiae naturalis principia mathematica, concerns the theory of gravitation
and associated principles (inertia, action-reaction, tides, etc.) He is considered
as the founder of celestial mechanics. Newton claimed that the fall of an apple
inspired in him the concept of gravitation. Some dispute him being the father of
these findings, rather attributing the fundamental ideas to Robert Hooke. Newton
accused Leibniz (apparently wrongfully) of having plagiarized his work. He was the
first British scientist to be knighted, on April 16, 1705, by Queen Anne. He died
on March 20, 1727, in London. One of his most famous quotes is "If I have seen
further than others, it is by standing upon the shoulders of giants." He is buried
in Westminster Abbey, with the following inscription on his grave: "Hic depositum
est, quod mortale fuit Isaaci Newtoni" (Here lies that which was mortal of Isaac
Newton).

Figure 7.1: Sir Isaac Newton

Example 7.1 (Linear model). Take the function

F(x) = x² − 2

and the point x̂ = 2. According to Taylor's theorem, for any d ∈ R, we have

F(x̂ + d) = F(x̂) + d F′(x̂) + o(|d|)
         = x̂² − 2 + 2 x̂ d + o(|d|)
         = 2 + 4d + o(|d|).

The linear model is obtained by ignoring the error o(|d|):

m(x̂ + d) = 2 + 4d.

Defining x = x̂ + d, we get

m(x) = 2 + 4(x − 2) = 4x − 6.

The function and the model are presented in Figure 7.2(a). The zoom in
Figure 7.2(b) illustrates the good agreement between the model and the function
around x̂ = 2.

[Figure 7.2: Linear model of x² − 2. Panel (a) shows f(x) and m(x) for x̂ = 2;
panel (b) is a zoom around x = 2.]

We can now provide a general definition of the linear model of a non linear
function.

Definition 7.2 (Linear model of a function with one variable). Let F : R → R be
a differentiable function. The linear model of F in x̂ is a function m_x̂ : R → R
defined by

m_x̂(x) = F(x̂) + (x − x̂) F′(x̂).                                  (7.1)

From a first approximation x̂, the main idea of Newton's method in order to find
the root of the function F consists in

1. calculating the linear model in x̂,
2. calculating the root x⁺ of this linear model,
3. if x⁺ is not the root of F, considering x⁺ as a new approximation and starting
   over.

According to Definition 7.2, the root of the linear model is the solution to

F(x̂) + (x − x̂) F′(x̂) = 0,                                       (7.2)

i.e., if F′(x̂) ≠ 0,

x⁺ = x̂ − F(x̂)/F′(x̂),                                            (7.3)

which summarizes the first two steps presented above.

We also need to specify the third step. How do we conclude that x⁺ is a root of
the function, i.e., F(x⁺) = 0, and that we can stop the iterations? Seemingly
innocuous, this question is far from simple to answer. Indeed, computers operating
in finite arithmetic are not capable of representing all real numbers (which form
an uncountable infinity). Therefore, it is possible, and even common, that the
method never generates a point x⁺ such that F(x⁺) = 0 exactly. Then, we must often
settle for a solution x⁺ such that F(x⁺) is "sufficiently close" to 0. In practice,
the user provides a measure of this desired proximity, denoted by ε, and the
algorithm is interrupted when

|F(x⁺)| ≤ ε.                                                     (7.4)

A typical value for ε is √εM, where εM is the machine epsilon, that is, an upper
bound on the relative error due to rounding in floating point arithmetic. A simple
way to compute εM is Algorithm 7.1. The loop stops when εM is so small that, when
added to 1, the result is also 1.

Algorithm 7.1: Machine epsilon


1 Objective
2 Find the machine epsilon εM .
3 Initialization
4 εM := 1.
5 while 1 + εM 6= 1 do
6 εM := εM /2.

Typical values are

• εM = 5.9605 · 10−8 for single precision floating point (so that ε = √εM =
  2.4414 · 10−4), and

• εM = 1.1102 · 10−16 for double precision floating point (so that ε = √εM =
  1.0537 · 10−8).
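For instance, a direct Python transcription of Algorithm 7.1 (a sketch; double
precision floats are assumed) reads:

    eps_m = 1.0
    while 1.0 + eps_m != 1.0:   # stop when adding eps_m to 1 changes nothing
        eps_m /= 2.0
    print(eps_m)                # 1.1102230246251565e-16 in double precision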

We now have all the elements in order to write Newton’s algorithm to solve an
equation with one unknown (Algorithm 7.2).

Abu Ja‘far Muhammad ibn Musa Al-Khwarizmi was a Persian


mathematician born before AD 800. Only a few details about
his life can be gleaned from Islamic literature. His name appears to indicate that
he was from the State of Khwarazm or Khorezm (currently Khiva in Uzbekhistan).
However, other sources suggest that he was born between the Tigris and Euphrates

in the Baghdad area. Al Khwarizmi was an astronomer in the House of Wisdom (Dar
al-Hikma) of caliph Abd Allah al Mahmoun. He is primarily known for his treatise al
Kitab almukhtasar fi hisab al-jabr w’al muqabala (that can be translated as “The
Compendious Book on Calculation by Completion and Balancing”), which provides
the origin of the word algebra (al-Jabr, used in the sense of transposition, became
algebra). He explained in Arabic the system of Indian decimal digits applied to
arithmetic operations. The Latin translation of this work, entitled Algoritmi de
numero Indorum gave rise to the word algorithm. Al Khwarizmi died after AD 847.
Figure 7.3: Al Khwarizmi

Algorithm 7.2: Newton’s method: one variable


1 Objective
2 Find (an approximation of) a solution to the equation F(x) = 0.
3 Input
4 The function F : R → R.
5 The derivative of the function F ′ : R → R.
6 A first approximation of the solution x0 ∈ R.
7 The required precision ε ∈ R, ε > 0.
8 Output
9 An approximation x∗ ∈ R to the solution.
10 Initialization
11 k := 0.
12 Repeat
13 xk+1 := xk − F(xk )/F ′ (xk ),
14 k := k + 1.
15 Until |F(xk)| ≤ ε
16 x∗ = xk
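A possible Python transcription of Algorithm 7.2 is sketched below (the function
names are assumptions, and a maximum number of iterations is added as a safeguard);
it is applied to Example 7.3, where F(x) = x² − 2.

    # Sketch of Algorithm 7.2: Newton's method for one equation.
    def newton(F, dF, x0, eps=1.0e-15, maxiter=100):
        x = x0
        for _ in range(maxiter):
            if abs(F(x)) <= eps:      # stopping test (7.4)
                break
            x = x - F(x) / dF(x)      # Newton iteration (7.3)
        return x

    root = newton(lambda x: x**2 - 2.0, lambda x: 2.0 * x, 2.0)
    print(root)                        # 1.4142135623730951, i.e., sqrt(2)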

Example 7.3 (Newton's method: one variable – I). Take the equation

F(x) = x² − 2 = 0.

We have F′(x) = 2x. We apply Newton's method (Algorithm 7.2) with x0 = 2 and
ε = 10−15. The iterations are listed in Table 7.1. The first two iterations are
portrayed in Figure 7.4. Figure 7.4(a) represents the first iteration, where
x0 = 2. The linear model at x0 is represented by a dotted line. It intersects the
x-axis at x1 = 1.5. Figure 7.4(b) represents the second iteration, where x1 = 1.5.
The linear model at x1 is represented by a dotted line. It intersects the x-axis
at x2 ≈ 1.4167.

Table 7.1: Iterations with Newton’s method for Example 7.3



k xk F(xk ) F ′ (xk )
0 +2.00000000E+00 +2.00000000E+00 +4.00000000E+00
1 +1.50000000E+00 +2.50000000E-01 +3.00000000E+00
2 +1.41666667E+00 +6.94444444E-03 +2.83333333E+00
3 +1.41421569E+00 +6.00730488E-06 +2.82843137E+00
4 +1.41421356E+00 +4.51061410E-12 +2.82842712E+00
5 +1.41421356E+00 +4.44089210E-16 +2.82842712E+00

[Figure 7.4: Newton's method for Example 7.3. Panel (a) shows the first iteration;
panel (b) the second iteration.]

According to Example 7.3, Newton’s method seems quite fast, as only 5 iterations
were necessary to converge. We characterize this speed below. Before that, however,
we illustrate by other examples that the method does not always work that well.
Example 7.4 (Newton’s method: one variable – II). Take the equation
F(x) = x − sin x = 0 .
We have F ′ (x) = 1 − cos x. We apply Newton’s method (Algorithm 7.2) with x0 = 1
and ε = 10−15 . The iterations are listed in Table 7.2. The number of iterations is
much larger than for the previous example. Note how the derivative F ′ (xk ) is getting
closer and closer to 0 as the iterations proceed. Actually, the root of this equation is
x∗ = 0, and the value of the derivative at the root is 0. As Newton’s method divides
by F ′ (xk ) at each iteration, the fact that F ′ (x∗ ) = 0 is the source of the slow behavior
of the method. The first two iterations are portrayed in Figure 7.5(a). The linear
model at the starting point x0 = 1 is represented by a dotted line and intersects the
x-axis at 0.65, which is the first iterate.

Table 7.2: Iterations with Newton’s method for Example 7.4


k xk F(xk ) F ′ (xk )
0 +1.00000000E+00 +1.58529015E-01 +4.59697694E-01
1 +6.55145072E-01 +4.58707860E-02 +2.07040452E-01

2 +4.33590368E-01 +1.34587380E-02 +9.25368255E-02


3 +2.88148401E-01 +3.97094846E-03 +4.12282985E-02
4 +1.91832312E-01 +1.17439692E-03 +1.83434616E-02
...
25 +3.84171966E-05 +9.44986548E-15 +7.37940486E-10
26 +2.56114682E-05 +2.79996227E-15 +3.27973648E-10
27 +1.70743119E-05 +8.29617950E-16 +1.45766066E-10

The linear model at that point is also represented by a dotted line, and
intersects the x-axis at 0.43, which is the second iterate. Figure 7.5(b) is a
zoom on the same figure.

[Figure 7.5: Newton's method for Example 7.4. Panel (a) shows two iterations;
panel (b) is a zoom.]

Even though Newton's method has managed to provide the desired precision in
5 iterations for Example 7.3, more than 5 times as many iterations are necessary
for Example 7.4. In the following example, we see that the method may sometimes
not work at all.

Example 7.5 (Newton's method: one variable – III). Take the equation

F(x) = arctan x = 0.

We have F′(x) = 1/(1 + x²). We apply Newton's method (Algorithm 7.2) with
x0 = 1.5 and ε = 10−15. The first 10 iterations are listed in Table 7.3. We note
that the absolute value of xk increases with each iteration, that the value of
F(xk) seems to oscillate, and that the value of F′(xk) closes in on 0. Therefore,
not only does the algorithm not approach the solution, but when the derivative
approaches 0, the main iteration cannot be performed due to the division by 0.
The first three iterations are portrayed in Figure 7.6.

Table 7.3: The ten first iterations with Newton’s method for Example 7.5
k xk F(xk ) F ′ (xk )
0 +1.50000000E+00 +9.82793723E-01 +3.07692308E-01
1 -1.69407960E+00 -1.03754636E+00 +2.58404230E-01
2 +2.32112696E+00 +1.16400204E+00 +1.56552578E-01
3 -5.11408784E+00 -1.37769453E+00 +3.68271300E-02
4 +3.22956839E+01 +1.53984233E+00 +9.57844131E-04
5 -1.57531695E+03 -1.57016153E+00 +4.02961851E-07
6 +3.89497601E+06 +1.57079607E+00 +6.59159364E-14
7 -2.38302890E+13 -1.57079633E+00 +1.76092712E-27
8 +8.92028016E+26 +1.57079633E+00 +1.25673298E-54
9 -1.24990460E+54 -1.57079633E+00 +6.40097701E-109
10 +2.45399464E+108 +1.57079633E+00 +1.66055315E-217

We now analyze in detail the aspects that influence the efficiency of the method.
The main result can be stated as follows:
• if the function is not too non linear,
• if the derivative of F at the solution is not too close to 0,
• if x0 is not too far from the root,
• then Newton’s method converges quickly toward the solution.
The central idea of the analysis is to measure the error that is committed when
the non linear function is replaced by the linear model. Intuitively, if the function
is almost linear, the error is small. While if the function is highly non linear, the
error is more significant. We use here the Lipschitz continuity of the derivative of
F to characterize the non linearity, as discussed in Section 2.4 (Definition 2.27).
Theorem 7.6 considers a linear model at x̂, and provides an upper bound on the error

at a point x⁺. This bound depends on the distance between x̂ and x⁺, and on the
Lipschitz constant that characterizes the non linearity of the function.

[Figure 7.6: Newton's method for Example 7.5. The plot of arctan x shows the
iterates x0, x1, x2, x3 moving away from the root.]

Theorem 7.6 (Error of the linear model: one variable). Consider an open interval
X ⊆ R and a function F for which the derivative is Lipschitz continuous over X,
where M is the Lipschitz constant. Then, for all x̂, x⁺ ∈ X,

|F(x⁺) − m_x̂(x⁺)| ≤ M (x⁺ − x̂)² / 2.                             (7.5)

Proof. We have

∫_{x̂}^{x⁺} (F′(z) − F′(x̂)) dz
    = ∫_{x̂}^{x⁺} F′(z) dz − ∫_{x̂}^{x⁺} F′(x̂) dz     linearity of the integral
    = F(x⁺) − F(x̂) − F′(x̂)(x⁺ − x̂)
    = F(x⁺) − m_x̂(x⁺)                                 from (7.1).

We take z = x̂ + t(x⁺ − x̂) and dz = (x⁺ − x̂) dt to obtain

F(x⁺) − m_x̂(x⁺) = ∫₀¹ (F′(x̂ + t(x⁺ − x̂)) − F′(x̂)) (x⁺ − x̂) dt.

Therefore,

|F(x⁺) − m_x̂(x⁺)|
    = |∫₀¹ (F′(x̂ + t(x⁺ − x̂)) − F′(x̂)) (x⁺ − x̂) dt|
    ≤ ∫₀¹ |F′(x̂ + t(x⁺ − x̂)) − F′(x̂)| |x⁺ − x̂| dt    from Theorem C.12
    = |x⁺ − x̂| ∫₀¹ |F′(x̂ + t(x⁺ − x̂)) − F′(x̂)| dt
    ≤ |x⁺ − x̂| ∫₀¹ M |t(x⁺ − x̂)| dt                   from Definition 2.27
    = M (x⁺ − x̂)² ∫₀¹ t dt
    = M (x⁺ − x̂)² / 2.

We now use this bound on the error to demonstrate the convergence of Newton's
method.

Theorem 7.7 (Convergence of Newton's method: one variable). Consider an open
interval X ⊆ R and a continuously differentiable function F such that its
derivative is Lipschitz continuous over X, where M is the Lipschitz constant.
Assume that there exists ρ > 0 such that

|F′(x)| ≥ ρ,  ∀x ∈ X.                                            (7.6)

Assume that there exists x∗ ∈ X such that F(x∗) = 0. There then exists η > 0
such that, if

|x0 − x∗| < η                                                    (7.7)

with x0 ∈ X, the sequence (xk)k defined by

xk+1 = xk − F(xk)/F′(xk),  k = 0, 1, . . . ,                      (7.8)

is well defined and converges toward x∗. Moreover,

|xk+1 − x∗| ≤ (M / 2ρ) |xk − x∗|².                                (7.9)


Proof. We provide a proof by induction. For k = 0, x1 is well defined as
F′(x0) ≠ 0 by assumption (7.6), since x0 ∈ X. We have

x1 − x∗ = x0 − F(x0)/F′(x0) − x∗                     from (7.8)
        = x0 − x∗ − (F(x0) − F(x∗))/F′(x0)           because F(x∗) = 0
        = (1/F′(x0)) (F(x∗) − m_{x0}(x∗))            from (7.1).

Then

|x1 − x∗| ≤ (1/|F′(x0)|) |F(x∗) − m_{x0}(x∗)|
          ≤ (M / 2|F′(x0)|) |x0 − x∗|²               from (7.5)
          ≤ (M / 2ρ) |x0 − x∗|²                      from (7.6),

which proves the result for k = 0.

We now need technical constants. Take τ such that 0 < τ < 1 and let r be the
radius of the largest interval contained in X and centered in x∗. We then define

η = min(r, 2ρτ/M).                                               (7.10)

Therefore, based on the hypothesis (7.7), we have

|x0 − x∗| ≤ η ≤ 2ρτ/M                                            (7.11)

and

|x1 − x∗| ≤ (M / 2ρ) |x0 − x∗|² ≤ (M / 2ρ)(2ρτ/M) |x0 − x∗| = τ |x0 − x∗| < η,

where the last inequality is the result of the fact that τ < 1 and |x0 − x∗| < η.
Since |x1 − x∗| < η, we also have that |x1 − x∗| < r (according to (7.10)) and
x1 ∈ X. Thus x1 satisfies the same assumptions as x0. We can now apply the
recurrence using the same arguments for x2, x3, and so forth.
We now comment on a summarized version of the result of Theorem 7.7:

If the function is not too non linear. This assumption is related to the Lipschitz
continuity. The closer M is to 0, the less non linear the function is.

If the derivative of F is not too close to 0. This is hypothesis (7.6). If this
assumption is not satisfied, the method may not be well defined (division by zero),
may not converge, or may converge slowly, as illustrated by Example 7.4.

If x0 is not too far from the root. This is hypothesis (7.7). If x0 is too far
from the root, the method may not converge, as shown in Example 7.5. It is
interesting to take a close look at the definition (7.10) of η, assuming that r
(a technical parameter) is sufficiently large such that η = 2ρτ/M. If the function
is close to being linear, then M is small and η is large. It means that the set of
starting points from which the method converges is large, and we can afford to
start from a point x0 farther away from x∗. In practice, as x∗ is not known, this
increases the chance of finding a valid starting point.

Newton's method converges quickly toward the solution. The speed is characterized
by (7.9). At each iteration, the new distance to the solution is of the order of
the square of the former. For instance, if the initial error is of the order of
10−1, it only takes three iterations for it to become of the order of 10−8. This
is illustrated in Example 7.3, for which the iterations are described in Table 7.1.
The method is said to converge q-quadratically.


Definition 7.8 (q-quadratic convergence). Take a sequence xk k in Rn that con-
verges toward x∗ . The sequence is said to converge q-quadratically toward x∗ if there
exists c ≥ 0 and b
k ∈ N such that
2
xk+1 − x∗ ≤ c xk − x∗ , ∀k ≥ b
k. (7.12)

In Definition 7.8, the prefix q signifies quotient. In practice, the other types of
convergence are rarely used, and the prefix could be omitted. More details can be
found in Ortega and Rheinboldt (1970).
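The q-quadratic behavior can be observed numerically. The short sketch below
(assumed code, not from the book) applies Newton's method to Example 7.3 and
prints the ratios |xk+1 − x∗| / |xk − x∗|², which remain bounded as (7.12)
requires.

    # Sketch: observing q-quadratic convergence on F(x) = x^2 - 2.
    import math

    x, errors = 2.0, []
    for _ in range(5):
        errors.append(abs(x - math.sqrt(2.0)))
        x = x - (x * x - 2.0) / (2.0 * x)
    print([e2 / e1**2 for e1, e2 in zip(errors, errors[1:])])
    # the ratios stabilize around 0.35, so (7.12) holds with c close to 0.35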

7.2 Systems of equations with multiple unknowns


We now generalize Newton’s method for systems of non linear equations with multiple
unknowns. The concepts are exactly the same. We start with the definition of the
linear model. Although in our context, the function F maps Rn into Rn , we provide
the most general definition for a function from Rn to Rm .

Definition 7.9 (Linear model of a function with n variables). Let F : Rn → Rm
be a continuously differentiable function. The linear model of F in x̂ is a
function m_x̂ : Rn → Rm defined by

m_x̂(x) = F(x̂) + ∇F(x̂)T (x − x̂) = F(x̂) + J(x̂)(x − x̂),            (7.13)

where ∇F(x̂) ∈ Rn×m is the gradient matrix of F in x̂ (Definition 2.17) and
J(x̂) = ∇F(x̂)T is the Jacobian matrix, of dimensions m × n (Definition 2.18).

As in the case with one variable, we determine a bound for the error committed
when replacing the function F by the linear model, and find a result similar to
that of Theorem 7.6. The proof is essentially the same.

Theorem 7.10 (Error of the linear model: n variables). Let F : Rn → Rm be a
continuously differentiable function over an open convex set X ⊂ Rn. Assume that
the Jacobian matrix of F is Lipschitz continuous over X (Definition 2.27, where M
is the Lipschitz constant, and the matrix norm is induced by the vector norm;
Definition B.27). Then, for all x̂, x⁺ ∈ X,

‖F(x⁺) − m_x̂(x⁺)‖ ≤ M ‖x⁺ − x̂‖² / 2.                              (7.14)

Proof. The structure of the proof is identical to that of Theorem 7.6. We have

F(x⁺) − m_x̂(x⁺)
    = F(x⁺) − F(x̂) − J(x̂)(x⁺ − x̂)                       from (7.13)
    = ∫₀¹ J(x̂ + t(x⁺ − x̂))(x⁺ − x̂) dt − J(x̂)(x⁺ − x̂)    Theorem C.11
    = ∫₀¹ (J(x̂ + t(x⁺ − x̂)) − J(x̂)) (x⁺ − x̂) dt.

Then,

‖F(x⁺) − m_x̂(x⁺)‖
    ≤ ∫₀¹ ‖J(x̂ + t(x⁺ − x̂)) − J(x̂)‖ ‖x⁺ − x̂‖ dt         Theorem C.12
    ≤ ∫₀¹ M ‖t(x⁺ − x̂)‖ ‖x⁺ − x̂‖ dt                     Definition 2.27
    = M ‖x⁺ − x̂‖² ∫₀¹ t dt
    = M ‖x⁺ − x̂‖² / 2.

Newton’s method for systems of equations is also essentially the same as for a
single variable. It is described by Algorithm 7.3. System (7.16) solved at step 13 of
the algorithm is often called the Newton equations. Note that we have intentionally
not written this step as
dk+1 = −J(xk )−1 F(xk ).
Indeed, from a numerical point of view, the calculation of dk+1 must be performed
by solving the system of linear equations, not by inverting the Jacobian matrix.

Algorithm 7.3: Newton’s method: n variables


1 Objective
2 To find (an approximation of) a solution to the system of equations

F(x) = 0 . (7.15)

3 Input
4 The function F : Rn → Rn .
5 The Jacobian matrix of the function J : Rn → Rn×n .
6 A first approximation of the solution x0 ∈ Rn .
7 The required precision ε ∈ R, ε > 0.
8 Output
9 An approximation x∗ ∈ Rn of the solution.
10 Initialization
11 k = 0.
12 Repeat
13 Calculate dk+1 solution of

J(xk )dk+1 = −F(xk ) . (7.16)

xk+1 := xk + dk+1 .
14 k := k + 1.
15 Until ‖F(xk)‖ ≤ ε
16 x∗ = xk

Example 7.11 (Newton's method: n variables). Consider the system of equations

(x1 + 1)² + x2² = 2
e^{x1} + x2³ = 2.                                                (7.17)

We apply Newton's method with

F(x) = ( (x1 + 1)² + x2² − 2 ;  e^{x1} + x2³ − 2 )

and

J(x) = [ 2(x1 + 1)   2x2
         e^{x1}      3x2² ].

If x0 = (1  1)T, we have

F(x0) = ( 3 ;  e − 1 ) ≈ ( 3 ;  1.7183 )

and

J(x0) = [ 4  2
          e  3 ].

The iterations of Newton's method are described in Table 7.4, with ε = 10−15,
where the first column reports the iteration number, the second column the current
iterate, the third the value of the function at the current iterate, and the last
its norm. The quadratic convergence of the method is well illustrated in this
example. Indeed, the value of xk converges rapidly to the solution (0, 1)T, and
the values of ‖F(xk)‖ decrease rapidly toward zero.

Table 7.4: Iterations of Newton’s method for Example 7.11


k    xk    F(xk)    ‖F(xk)‖
0 1.00000000e+00 3.00000000e+00 3.45723768e+00
1.00000000e+00 1.71828182e+00
1 1.52359213e-01 7.56629795e-01 1.15470870e+00
1.19528157e+00 8.72274931e-01
2 -1.08376809e-02 5.19684443e-02 1.14042557e-01
1.03611116e+00 1.01513475e-01
3 -8.89664601e-04 1.29445248e-03 3.94232975e-03
1.00153531e+00 3.72375572e-03
4 -1.37008875e-06 3.13724882e-06 8.07998556e-06
1.00000293e+00 7.44606181e-06
5 -5.53838974e-12 1.05133679e-11 2.88316980e-11
1.00000000e+00 2.68465250e-11
6 -1.53209346e-16 -2.22044604e-16 2.22044604e-16
1.00000000e+00 0.00000000e+00
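A compact numpy transcription of Algorithm 7.3 applied to this example is sketched
below (assumed code, not from the book); it reproduces the iterates of Table 7.4.

    # Sketch of Algorithm 7.3 on the system (7.17).
    import numpy as np

    def F(x):
        return np.array([(x[0] + 1.0)**2 + x[1]**2 - 2.0,
                         np.exp(x[0]) + x[1]**3 - 2.0])

    def J(x):
        return np.array([[2.0 * (x[0] + 1.0), 2.0 * x[1]],
                         [np.exp(x[0]), 3.0 * x[1]**2]])

    x = np.array([1.0, 1.0])
    while np.linalg.norm(F(x)) > 1.0e-15:
        d = np.linalg.solve(J(x), -F(x))   # Newton equations (7.16)
        x = x + d
    print(x)                                # close to the solution (0, 1)

The step is obtained with a linear solve, never by forming J(xk)−1 explicitly, as
emphasized after Algorithm 7.3.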

We now analyze the impact of the starting point on the solution identified by
Newton's method.

Example 7.12 (Newton fractal). Consider the system of equations

F(x) = ( x1³ − 3x1 x2² − 1 ;  x2³ − 3x1² x2 ) = 0.

It has three roots:

x∗(b) = (1, 0),   x∗(g) = (−1/2, √3/2),   x∗(w) = (−1/2, −√3/2).

We apply Newton's method to this problem, starting from different points. To
visualize the process, we take on the following convention:

• if Newton's method, when starting from the point x0, converges toward the
  solution x∗(b), the point x0 is colored in black;

• if Newton's method, when starting from the point x0, converges toward the
  solution x∗(g), the point x0 is colored in gray;

• if Newton's method, when starting from the point x0, converges toward the
  solution x∗(w), the point x0 is colored in white.
[Figure 7.7: Newton's method: relation between the starting point and the
solution. Panel (a): −2 ≤ x1 ≤ 2, −2 ≤ x2 ≤ 2; panel (b): −0.001 ≤ x1 ≤ 0.001,
−0.001 ≤ x2 ≤ 0.001.]

The result is presented in Figure 7.7(a), where the three roots are represented
by a + sign. We see that there is no direct relationship between the position of
the starting point and the root identified by the method. For example, look at the
gray areas at the bottom right of Figure 7.7(a). Although these starting points are
closer to the roots x∗ (b) and x∗ (w), Newton’s method converges towards x∗ (g) when
started from these areas. But the most noticeable feature of this figure is the shape
of the borders between each region. This type of configuration is called a fractal
(see Mandelbrot, 1982). The zoom presented in Figure 7.7(b) shows that two points
that are very close may be colored differently. This is an illustration of a chaotic
system, which exhibits a significantly different outcome when the starting conditions
are perturbed just a little bit.

We now generalize Theorem 7.7 for the case of n equations and n variables.

Theorem 7.13 (Convergence of Newton's method: n variables). Consider an open
convex set X ⊆ Rn and a function F : X → Rn. We assume that there exist
x∗ ∈ X, a sphere B(x∗, r) centered in x∗ with radius r, and a constant ρ > 0
such that F(x∗) = 0, B(x∗, r) ⊂ X, J(x∗) is invertible,

‖J(x∗)−1‖ ≤ 1/ρ,                                                 (7.18)

and J is Lipschitz continuous over B(x∗, r), where M is the Lipschitz constant.
There then exists η > 0 such that, if

x0 ∈ B(x∗, η),                                                   (7.19)

the sequence (xk)k defined by

xk+1 = xk − J(xk)−1 F(xk),  k = 0, 1, . . . ,                     (7.20)

is well defined and converges toward x∗. Moreover,

‖xk+1 − x∗‖ ≤ (M/ρ) ‖xk − x∗‖².                                   (7.21)

Proof. In order for the sequence to be well defined, the matrix J(xk) always has
to be invertible. By assumption, it is the case at x∗. We choose η such that J(x)
is invertible for all x in a sphere B(x∗, η) of radius η around x∗. We take

η = min(r, ρ/(2M)).                                              (7.22)

We first demonstrate that J(x0) is invertible, by using the theorem about the
inverse of a perturbed matrix (Theorem C.16), with A = J(x∗) and B = J(x0). The
hypothesis (C.28) on which the theorem is based is satisfied. Indeed,

‖J(x∗)−1 (J(x0) − J(x∗))‖ ≤ ‖J(x∗)−1‖ ‖J(x0) − J(x∗)‖
                          ≤ (1/ρ) ‖J(x0) − J(x∗)‖     from (7.18)
                          ≤ (M/ρ) ‖x0 − x∗‖           Lipschitz
                          ≤ (M/ρ) η                   from (7.19)
                          ≤ 1/2                       from (7.22).

Therefore, J(x0) is invertible, and x1 given by (7.20) is well defined. By using
this result, Theorem C.16, and by noting that if y ≤ 1/2, then 1/(1 − y) ≤ 2, we
obtain

‖J(x0)−1‖ ≤ ‖J(x∗)−1‖ / (1 − ‖J(x∗)−1 (J(x0) − J(x∗))‖)
          ≤ 2 ‖J(x∗)−1‖ ≤ 2/ρ.                                   (7.23)

We have

x1 − x∗ = x0 − J(x0)−1 F(x0) − x∗                     according to (7.20)
        = x0 − J(x0)−1 (F(x0) − F(x∗)) − x∗           because F(x∗) = 0
        = J(x0)−1 (F(x∗) − F(x0) − J(x0)(x∗ − x0))
        = J(x0)−1 (F(x∗) − m_{x0}(x∗))                from (7.13).

Consequently,

‖x1 − x∗‖ ≤ ‖J(x0)−1‖ ‖F(x∗) − m_{x0}(x∗)‖
          ≤ (2/ρ) ‖F(x∗) − m_{x0}(x∗)‖                from (7.23)
          ≤ (2/ρ) M ‖x0 − x∗‖² / 2                    from (7.14),

which proves (7.21) for k = 0. Since

‖x0 − x∗‖ ≤ η          from (7.19)
          ≤ ρ/(2M)     from (7.22),

we have

‖x1 − x∗‖ ≤ (M/ρ) (ρ/(2M)) ‖x0 − x∗‖ = (1/2) ‖x0 − x∗‖

and x1 ∈ B(x∗, η). The same reasoning can be applied recursively to prove the
result for k = 1, 2, 3, . . .
Newton's method constitutes an effective tool that is central in optimization
problems. However, it has two undesirable features:

1. it must be started close to the solution (which is not known in practice) and,
   therefore, does not work from any starting point (assumption (7.19));

2. it requires calculating the matrix of the derivatives at each iteration, which
   can involve a great deal of calculations in solving real problems.

Techniques that permit us to address the first issue are called globalization
techniques. A global algorithm exhibits convergence when started from any point.
We study such methods directly in the context of optimization in later chapters,
and refer the interested reader to Dennis and Schnabel (1996) for a comprehensive
description of these techniques in the context of solving systems of equations.
In Chapter 8, we address the second issue by presenting methods based on the same
idea as Newton's method, but without using the derivatives. Such methods are
called quasi-Newton methods.

The presentation of the proof of Theorems 7.6, 7.7, and 7.13 is inspired by Dennis
and Schnabel (1996).

7.3 Project
The general organization of the projects is described in Appendix D.

Objective
To analyze the impact of the starting point on the convergence of Newton’s method,
with inspiration taken from Example 7.12.

Approach
To create drawings similar to those of Figure 7.7, we use the following convention
(a code sketch follows the list):

• Associate a specific color with each solution. For instance, when we have three
  solutions, the RGB codes (255, 0, 0), (0, 255, 0) and (0, 0, 255) could be
  utilized.

• Define a maximum number of iterations K.

• Apply Newton's method from a starting point x0.

• If the method converges in k iterations toward the first solution, associate
  the color (255 − 255 k/K, 0, 0) with the point x0. Similarly, associate the
  color (0, 255 − 255 k/K, 0) or (0, 0, 255 − 255 k/K) if the algorithm converges
  toward the second or third solution, respectively.

• If the method does not converge, associate the color black (0, 0, 0) with the
  point x0.
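A possible sketch of the inner loop is given below (all names are assumptions; the
plotting itself is omitted).

    # Sketch: which root Newton's method reaches from x0, and how fast.
    import numpy as np

    def newton_basin(F, J, x0, roots, K=50, tol=1.0e-8):
        # Returns (index of the root reached, iterations), or (None, K).
        x = np.array(x0, dtype=float)
        for k in range(K):
            for i, r in enumerate(roots):
                if np.linalg.norm(x - r) < tol:
                    return i, k
            try:
                x = x + np.linalg.solve(J(x), -F(x))
            except np.linalg.LinAlgError:   # singular Jacobian: give up
                break
        return None, K

The pair (i, k) returned for each pixel x0 then selects the color channel and the
shade 255 − 255 k/K; the pixel is colored black when None is returned.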

Algorithm
Algorithm 7.3.

Problems

Exercise 7.1. The system

x2 = x1²
x1² + (x2 − 2)² = 4

has three roots: (0  0)T, (−√3  3)T, (√3  3)T. Note that these are the three
intersections between a circle and a parabola (draw the sketch). Note also that
the Jacobian is singular when x1 = 0.

Exercise 7.2. The system

3x1² + 2x2² = 35
4x1² − 3x2² = 24

has four solutions: (−3  −2)T, (−3  2)T, (3  −2)T, (3  2)T. Note that these are
the four intersections between an ellipse and a hyperbola (draw the sketch).

Exercise 7.3. The system

x1² − x1 x2 + x2² = 21
x1² + 2x1 x2 − 8x2² = 0

has four solutions: (−2√7  −√7)T, (2√7  √7)T, (−4  1)T, (4  −1)T.

Warning: when implementing these systems, one must not confuse the Jacobian and
the gradient matrix. Each row of the Jacobian corresponds to an equation and each
column to a variable.

Chapter 8

Quasi-Newton methods

“You cannot have your cake and eat it too,” says a popular proverb. In this chapter,
however, we try! The method developed here has an effectiveness close to that of
Newton’s method, without requiring the calculation of derivatives.

Contents

8.1 Equation with one unknown ......................... 201
8.2 Systems of equations with multiple unknowns ....... 208
8.3 Project ........................................... 216

When conditions so permit, Newton's method proves to be fast. However, it
requires that the Jacobian matrix be explicitly calculated at each iteration.
There are a number of cases where the function F is not specified by formulas,
but rather by experiments or determined by software. In these cases, the
analytical expression of the derivative is unavailable. Even if the problem
happens to have an analytical formulation, the calculation of the derivatives
can be prohibitive or even impossible when the analytical calculation and
implementation of the derivatives require excessively long work, into which
errors can easily slip.

In this chapter, we see that it is possible to use the ideas from Newton's method
without using the derivatives. This is of course done at the expense of
performance. However, this expense is often small compared with what we gain by
not having to calculate Jacobian matrices. We introduce the main ideas regarding
the simple problem of one equation with one unknown, before generalizing for
systems of equations.

8.1 Equation with one unknown


The main idea is based on the definition of the derivative:

F′(x) = lim_{s→0} (F(x + s) − F(x)) / s.                          (8.1)

Sister Caasi Newton, or Quasi Newton, is the twin sister of Sir Isaac
Newton. Caasi Newton tried to follow in the footsteps of her illustri-
ous brother, but was never able to understand the complex concept of
derivatives. Her striking resemblance to her brother and the complete
absence of any writings cast doubt on her existence.

Figure 8.1: Sister Caasi Newton.

To obtain a good approximation of the value of the derivative, we simply choose
a value of s that is close enough to zero and obtain

as(x) = (F(x + s) − F(x)) / s.                                    (8.2)

Geometrically, the derivative at x is the slope of the tangent to the function
at x. The above approximation replaces the tangent by a secant intersecting the
function at x and x + s, as illustrated in Figure 8.2. The model obtained from
this approximation is therefore called the secant linear model of the function.

[Figure 8.2: Secant linear model. The secant model m_x̂;s intersects F at x̂ and
x̂ + s.]

Definition 8.1 (Secant linear model of a function with one variable). Let
F : R → R be a differentiable function. The secant linear model of F in x̂ is a
function m_x̂;s : R → R defined by

m_x̂;s(x) = F(x̂) + ((F(x̂ + s) − F(x̂)) / s) (x − x̂),               (8.3)

where s ≠ 0.

We can now utilize the same principle as in Newton's method, replacing the
derivative in (7.3) by its secant approximation, and obtain

x⁺ = x̂ − F(x̂) / as(x̂).                                           (8.4)

To obtain an algorithm, we now need only define the value of s. As said above, a
natural idea is to choose s small so as to obtain a good approximation of the
derivative. For example, s can be defined as

s = τ x̂  if |x̂| ≥ 1,
s = τ    otherwise,                                               (8.5)

where τ is small, for instance equal to 10−7. For a more sophisticated calculation
of τ, taking into account the machine epsilon and the precision obtained when
calculating F, we refer the reader to Dennis and Schnabel (1996, Algorithm
A5.6.3). The algorithm based on this definition of s is called the finite
difference Newton's method and is presented as Algorithm 8.1.

Algorithm 8.1: Finite difference Newton’s method: one variable

 1  Objective
 2    To find (an approximation of) a solution to the equation F(x) = 0.
 3  Input
 4    The function F : R → R.
 5    A first approximation of the solution x0 ∈ R.
 6    A parameter τ > 0.
 7    The required precision ε ∈ R, ε > 0.
 8  Output
 9    An approximation of the solution x∗ ∈ R.
10  Initialization
11    k := 0.
12  Repeat
13    if |xk| ≥ 1 then
14      s := τ xk
15    else
16      s := τ
17    xk+1 := xk − s F(xk) / (F(xk + s) − F(xk)).
18    k := k + 1.
19  Until |F(xk)| ≤ ε
20  x∗ = xk.
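
For readers who wish to experiment, here is a minimal Python transcription of
Algorithm 8.1. It is a sketch, not the book’s companion code; the usage line assumes
F(x) = x² − 2, which is consistent with the iterates of Example 7.3.

    # Sketch of Algorithm 8.1: finite difference Newton's method, one variable.
    def finite_difference_newton(F, x0, tau=1e-7, eps=1e-15, maxiter=100):
        x = x0
        for _ in range(maxiter):              # guard against non-convergence
            s = tau * x if abs(x) >= 1 else tau
            x = x - s * F(x) / (F(x + s) - F(x))
            if abs(F(x)) <= eps:              # stopping test |F(x_k)| <= eps
                break
        return x

    print(finite_difference_newton(lambda x: x**2 - 2.0, 2.0))  # ~1.41421356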

The iterations of this algorithm applied to Example 7.3, with $\tau = 10^{-7}$, are
described in Table 8.1. The differences with the iterations of Newton’s method (Table
7.1) are almost imperceptible. What is even more interesting is that a higher value
of τ may still enable the algorithm to converge, even if this convergence is slow. Table
8.2 contains the iterations of the algorithm applied to Example 7.3, with τ = 0.1.
The first two iterations of this algorithm are illustrated in Figure 8.3. Intuitively, we
expect it to work well, with a relatively large s, when the function is not too nonlinear.


Figure 8.3: Finite difference Newton’s method for Example 7.3

Table 8.1: Iterations for finite difference Newton’s method ($\tau = 10^{-7}$) for Example 7.3
k xk F(xk )
0 +2.00000000E+00 +2.00000000E+00
1 +1.50000003E+00 +2.50000076E-01
2 +1.41666667E+00 +6.94446047E-03
3 +1.41421569E+00 +6.00768206E-06
4 +1.41421356E+00 +4.81081841E-12
5 +1.41421356E+00 +4.44089210E-16

In practice, there is no reason to take τ = 0.1 because, even if this choice provides
results, it slows down the convergence. The only motivation to take a larger s would
be to save on function evaluations. This is the idea of the secant method, that uses
a step based on the last two iterates, that is,
$$
s = x_{k-1} - x_k ,
$$
in such a way that (8.2) is written as
$$
a_s(x_k) = \frac{F(x_{k-1}) - F(x_k)}{x_{k-1} - x_k} .
$$
Therefore, no additional evaluation of the function is required, because $F(x_{k-1})$ has
already been calculated during the previous iteration. The secant method is described
as Algorithm 8.2.

Table 8.2: Iterations for finite difference Newton’s method (τ = 0.1) for Example 7.3

 k   xk                F(xk)
 0   +2.00000000E+00   +2.00000000E+00
 1   +1.52380952E+00   +3.21995465E-01
 2   +1.42318594E+00   +2.54582228E-02
 3   +1.41466775E+00   +1.28485582E-03
 4   +1.41423526E+00   +6.13706622E-05
 5   +1.41421460E+00   +2.92283950E-06
 6   +1.41421361E+00   +1.39183802E-07
 7   +1.41421356E+00   +6.62780186E-09
 8   +1.41421356E+00   +3.15609761E-10
 9   +1.41421356E+00   +1.50284230E-11
10   +1.41421356E+00   +7.15427717E-13
11   +1.41421356E+00   +3.41948692E-14
12   +1.41421356E+00   +1.33226763E-15
13   +1.41421356E+00   +4.44089210E-16

Note that this technique does not work at the first iteration (k = 0), as $x_{k-1}$ is
not defined. For the first iteration, an arbitrary value for $a_0$ is therefore selected.

Table 8.3 shows the iterations of the secant method, with $a_0 = 1$. Figure 8.4
illustrates the first two iterations of the method for this example. At the iterate
$x_0 = 2$, an arbitrary linear model, with slope 1, is first considered. It intersects
the x-axis at 0, which becomes the next iterate $x_1$. Then the secant method can start.
The secant intersecting the function at $x_0$ and $x_1$ is considered. It intersects the
x-axis at 1, which becomes iterate $x_2$. The next iteration is illustrated in Figure 8.5.
The secant intersecting the function at $x_1$ and $x_2$ crosses the x-axis at $x_3 = 2$.
Interestingly, by coincidence, this happens to be the same value as $x_0$. But it does
not mean that iteration 3 is the same as iteration 0. Indeed, between the two, the
algorithm has collected information about the function and accumulated it into $a_k$.
While $a_0 = 1$ is an arbitrary value, carrying no information about F, the value $a_3 = 3$
used for the next secant model has been calculated from explicit evaluations of the
function F. As a consequence, the secant intersecting the function at $x_2$ and $x_3$
crosses the x-axis at $x_4$, which happens not to be too far from the zero of the function.
Therefore, the convergence of the method from iteration 4 on is quite fast, as can be
seen in Table 8.3. Indeed, during the last iterations, $x_k$ and $x_{k-1}$ get closer and closer,
and $s = x_{k-1} - x_k$ becomes smaller and smaller. Geometrically, this means that the
secant gets closer and closer to the actual tangent, and the method becomes similar to
the finite difference Newton’s method. The rate of convergence is fast, and is
characterized as superlinear.

Algorithm 8.2: Secant method: one variable

 1  Objective
 2    To find (an approximation of) a solution to the equation F(x) = 0.
 3  Input
 4    The function F : R → R.
 5    A first approximation of the solution x0 ∈ R.
 6    A first approximation of the derivative a0 (by default: a0 = 1).
 7    The required precision ε ∈ R, ε > 0.
 8  Output
 9    An approximation of the solution x∗ ∈ R.
10  Initialization
11    k := 0.
12  Repeat
13    Update the current iterate: xk+1 := xk − F(xk) / ak.
14    Update the approximation of the derivative:
        ak+1 := (F(xk) − F(xk+1)) / (xk − xk+1).
15    k := k + 1.
16  Until |F(xk)| ≤ ε
17  x∗ = xk.
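
A minimal Python transcription of Algorithm 8.2 follows. Again, this is a sketch
rather than the book’s companion code, and the usage line assumes F(x) = x² − 2
from Example 7.3.

    # Sketch of Algorithm 8.2: secant method, one variable.
    def secant(F, x0, a0=1.0, eps=1e-15, maxiter=100):
        x, a = x0, a0
        for _ in range(maxiter):
            x_new = x - F(x) / a                  # update the current iterate
            a = (F(x) - F(x_new)) / (x - x_new)   # update the derivative approximation
            x = x_new
            if abs(F(x)) <= eps:                  # stopping test |F(x_k)| <= eps
                break
        return x

    print(secant(lambda x: x**2 - 2.0, 2.0))      # ~1.41421356, cf. Table 8.3

Printing the successive errors $|x_k - \sqrt{2}|$ along the way shows the ratio of
consecutive errors tending to zero, which is precisely the superlinear behavior
defined next.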


Definition 8.2 (Superlinear convergence). Consider a sequence xk k in Rn that
converges toward x∗ . The sequence is said to converge superlinearly toward x∗ if

xk+1 − x∗
lim = 0. (8.6)
k→∞ xk − x∗

Table 8.3: Iterations for the secant method ($a_0 = 1$) for Example 7.3

 k   xk                F(xk)             as(xk)
 0   +2.00000000E+00   +2.00000000E+00   +1.00000000E+00
 1   +0.00000000E+00   -2.00000000E+00   +2.00000000E+00
 2   +1.00000000E+00   -1.00000000E+00   +1.00000000E+00
 3   +2.00000000E+00   +2.00000000E+00   +3.00000000E+00
 4   +1.33333333E+00   -2.22222222E-01   +3.33333333E+00
 5   +1.40000000E+00   -4.00000000E-02   +2.73333333E+00
 6   +1.41463415E+00   +1.18976800E-03   +2.81463415E+00
 7   +1.41421144E+00   -6.00728684E-06   +2.82884558E+00
 8   +1.41421356E+00   -8.93145558E-10   +2.82842500E+00
 9   +1.41421356E+00   +8.88178420E-16   +2.82842706E+00

Figure 8.4: Iterations 0 and 1 of the secant method for Example 7.3

Figure 8.5: Iterations 2 and 3 of the secant method for Example 7.3

8.2 Systems of equations with multiple unknowns

We now generalize the concepts of Section 8.1 to systems of n equations with n
unknowns. Again, the ideas are based on a linear model.
Algorithm 8.3: Finite difference Newton’s method: n variables

 1  Objective
 2    To find (an approximation of) a solution to the system of equations
        F(x) = 0 .   (8.7)
 3  Input
 4    The function F : Rn → Rn.
 5    A first approximation of the solution x0 ∈ Rn.
 6    A parameter τ > 0.
 7    The required precision ε ∈ R, ε > 0.
 8  Output
 9    An approximation of the solution x∗ ∈ Rn.
10  Initialization
11    k := 0.
12  Repeat
13    for j = 1, . . . , n do
14      if |(xk)j| ≥ 1 then
15        sj := τ (xk)j
16      else if 0 ≤ (xk)j ≤ 1 then
17        sj := τ
18      else
19        sj := −τ
20    Form the matrix Ak whose jth column is
        (Ak)j := (F(xk + sj ej) − F(xk)) / sj ,  j = 1, . . . , n ,
      where ej ∈ Rn is the jth canonical vector, composed of 0, except at the jth
      place containing 1 instead.
21    Calculate dk+1, solution of Ak dk+1 = −F(xk).
22    xk+1 := xk + dk+1.
23    k := k + 1.
24  Until ‖F(xk)‖ ≤ ε
25  x∗ = xk.
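
The following NumPy sketch implements Algorithm 8.3. It is illustrative, not the
book’s companion code, and the toy system at the end is a hypothetical example,
not Example 7.11.

    # Sketch of Algorithm 8.3: finite difference Newton's method, n variables.
    import numpy as np

    def fd_newton(F, x0, tau=1e-7, eps=1e-12, maxiter=100):
        x = np.asarray(x0, dtype=float)
        n = x.size
        for _ in range(maxiter):
            Fx = F(x)
            if np.linalg.norm(Fx) <= eps:       # stopping test ||F(x_k)|| <= eps
                break
            A = np.empty((n, n))
            for j in range(n):
                # Step s_j, following the sign conventions of Algorithm 8.3.
                if abs(x[j]) >= 1:
                    s = tau * x[j]
                else:
                    s = tau if x[j] >= 0 else -tau
                e = np.zeros(n)
                e[j] = 1.0
                A[:, j] = (F(x + s * e) - Fx) / s   # jth column by finite differences
            x = x + np.linalg.solve(A, -Fx)         # d_{k+1} solves A_k d = -F(x_k)
        return x

    # Hypothetical toy system: F(x) = (x1^2 - 1, x1 + x2 - 2), with root (1, 1).
    F = lambda x: np.array([x[0]**2 - 1.0, x[0] + x[1] - 2.0])
    print(fd_newton(F, [2.0, 0.0]))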

Definition 8.3 (Linear secant model for a function with n variables). Let $F : \mathbb{R}^n \to
\mathbb{R}^m$ be a Lipschitz continuous function and $A \in \mathbb{R}^{m \times n}$ a matrix. The linear secant
model of F in $\hat{x}$ is a function $m_{\hat{x};A} : \mathbb{R}^n \to \mathbb{R}^m$ defined by
$$
m_{\hat{x};A}(x) = F(\hat{x}) + A(x - \hat{x}) . \tag{8.8}
$$

When m = n, Definition 8.3 is similar to Definition 7.9, where $J(\hat{x})$ is replaced
by A. As we did for problems with one variable, we now consider two methods to
determine A: the approximation of $J(\hat{x})$ by finite differences and the secant method
based on previous iterates.

Algorithm 8.3 describes the method based on the finite difference approximation. The
comments related to the problems with one variable remain valid. When τ is small, the
differences with the original Newton method are small (compare Table 7.4 and Table
8.4). When τ is large, the method still works, but with a much slower convergence
speed. Table 8.5 describes the iterations for τ = 0.1. We note that the choice of
τ = 0.1 is given only as an illustration. In practice, if the finite difference method is
adopted, a small value of τ should be used (see Dennis and Schnabel, 1996, for more
details).

Table 8.4: Iterations for the finite difference Newton’s method for Example 7.11
($\tau = 10^{-7}$)

 k   xk                 F(xk)              ‖F(xk)‖
 0   +1.00000000e+00    +3.00000000e+00    +3.45723769e+00
     +1.00000000e+00    +1.71828183e+00
 1   +1.52359228e-01    +7.56629845e-01    +1.15470878e+00
     +1.19528158e+00    +8.72274977e-01
 2   -1.08376852e-02    +5.19684806e-02    +1.14042632e-01
     +1.03611119e+00    +1.01513541e-01
 3   -8.89667761e-04    +1.29445824e-03    +3.94234579e-03
     +1.00153532e+00    +3.72377069e-03
 4   -1.37016733e-06    +3.13751967e-06    +8.08060994e-06
     +1.00000294e+00    +7.44662523e-06
 5   -5.68344146e-12    +1.09472431e-11    +2.98662028e-11
     +1.00000000e+00    +2.77875500e-11
 6   -9.93522913e-17    +0.00000000e+00    +4.44089210e-16
     +1.00000000e+00    +4.44089210e-16

(Each iterate xk and residual F(xk) is a vector with two components, displayed on
two lines.)

The main disadvantage of this method is that it uses n + 1 evaluations of the
function per iteration. This turns out to be prohibitive when n is large. Therefore,
we use the same idea as in the case involving a single variable: force the linear model
in $x_k$ to interpolate the function F in $x_k$ and in $x_{k-1}$. We immediately observe that
$m_{x_k;A_k}(x_k) = F(x_k)$ by Definition 8.3. We now need only impose
$$
m_{x_k;A_k}(x_{k-1}) = F(x_k) + A_k(x_{k-1} - x_k) = F(x_{k-1}) \tag{8.9}
$$

Table 8.5: Iterations for the finite difference Newton’s method for Example 7.11
(τ = 0.1)

 k   xk                 F(xk)              ‖F(xk)‖
 0   +1.00000000e+00    +3.00000000e+00    +3.45723769e+00
     +1.00000000e+00    +1.71828183e+00
 1   +1.64629659e-01    +8.02103265e-01    +1.21852778e+00
     +1.20238971e+00    +9.17300554e-01
 2   -1.45741083e-02    +8.85985792e-02    +1.88972898e-01
     +1.05713499e+00    +1.66916290e-01
 3   -5.72356301e-03    +8.21459268e-03    +2.52536228e-02
     +1.00976678e+00    +2.38802414e-02
 4   -4.76896360e-04    +1.48824845e-03    +3.51842725e-03
     +1.00122016e+00    +3.18817297e-03
 ⋮
14   -2.17152295e-13    +5.45341550e-13    +1.36591792e-12
     +1.00000000e+00    +1.25233157e-12
15   -2.49137961e-14    +6.26165786e-14    +1.56919411e-13
     +1.00000000e+00    +1.43884904e-13
16   -2.79620466e-15    +7.10542736e-15    +1.79018084e-14
     +1.00000000e+00    +1.64313008e-14
17   -2.35536342e-16    +8.88178420e-16    +1.98602732e-15
     +1.00000000e+00    +1.77635684e-15
18   -5.13007076e-17    +0.00000000e+00    +0.00000000e+00
     +1.00000000e+00    +0.00000000e+00

or
$$
A_k (x_k - x_{k-1}) = F(x_k) - F(x_{k-1}) .
$$
This equation is called the secant equation.

Definition 8.4 (Secant equation). A linear model satisfies the secant equation in $x_k$
and $x_{k-1}$ if the matrix A defining it is such that
$$
A(x_k - x_{k-1}) = F(x_k) - F(x_{k-1}) . \tag{8.10}
$$
By taking
$$
d_{k-1} = x_k - x_{k-1} , \qquad y_{k-1} = F(x_k) - F(x_{k-1}) , \tag{8.11}
$$
it is written as
$$
A d_{k-1} = y_{k-1} . \tag{8.12}
$$

Given $x_k$, $x_{k-1}$, $F(x_k)$ and $F(x_{k-1})$, the linear secant model is based on a matrix
A satisfying the system of equations (8.10) or (8.12). This system of n linear
equations has $n^2$ unknowns (the elements of A). Therefore, when n > 1, it is always
underdetermined and has an infinite number of solutions. From a geometrical point
of view, there are infinitely many hyperplanes passing through the two points.
The idea proposed by Broyden (1965) is to choose, among the infinite number of
linear models verifying the secant equation, the one that is the closest to the model
established during the previous iteration, thereby conserving to the largest possible
extent what has already been calculated. We now calculate the difference between two
successive models, that is, $m_{x_{k-1};A_{k-1}}(x)$, the model of the function in the previous
iterate $x_{k-1}$, and $m_{x_k;A_k}(x)$, the model of the function in the current iterate $x_k$.

Lemma 8.5. Let $m_{x_k;A_k}(x)$ and $m_{x_{k-1};A_{k-1}}(x)$ be linear secant models of a
function $F : \mathbb{R}^n \to \mathbb{R}^n$ in $x_k$ and $x_{k-1}$, respectively. If these models satisfy the secant
equation, we can characterize the difference between the two models by
$$
m_{x_k;A_k}(x) - m_{x_{k-1};A_{k-1}}(x) = (A_k - A_{k-1})(x - x_{k-1}) . \tag{8.13}
$$

Proof. The proof exploits the definition (8.8) of the secant model and the secant
equation (8.10):
$$
\begin{aligned}
m_{x_k;A_k}(x) - m_{x_{k-1};A_{k-1}}(x)
&= F(x_k) + A_k(x - x_k) - F(x_{k-1}) - A_{k-1}(x - x_{k-1}) && \text{from (8.8)} \\
&= F(x_k) + A_k(x - x_k) - F(x_{k-1}) - A_{k-1}(x - x_{k-1}) + A_k x_{k-1} - A_k x_{k-1} \\
&= F(x_k) - F(x_{k-1}) - A_k(x_k - x_{k-1}) + (A_k - A_{k-1})(x - x_{k-1}) \\
&= (A_k - A_{k-1})(x - x_{k-1}) && \text{from (8.10)} .
\end{aligned}
$$

We now need only establish which matrix Ak minimizes this difference.

Theorem 8.6 (Broyden update). Let $m_{x_{k-1};A_{k-1}}(x)$ be the linear secant model of
a function $F : \mathbb{R}^n \to \mathbb{R}^n$ in $x_{k-1}$ and let us take $x_k \in \mathbb{R}^n$, $x_k \neq x_{k-1}$. The linear
secant model of F in $x_k$ that satisfies the secant equation (8.10) and is as close
as possible to $m_{x_{k-1};A_{k-1}}(x)$ is
$$
m_{x_k;A_k}(x) = F(x_k) + A_k(x - x_k) , \tag{8.14}
$$
with
$$
A_k = A_{k-1} + \frac{(y_{k-1} - A_{k-1} d_{k-1})\, d_{k-1}^T}{d_{k-1}^T d_{k-1}} , \tag{8.15}
$$
where $d_{k-1} = x_k - x_{k-1}$ and $y_{k-1} = F(x_k) - F(x_{k-1})$.

Proof. According to Lemma 8.5, the difference between the two linear models is
$$
(A_k - A_{k-1})(x - x_{k-1}) . \tag{8.16}
$$
The secant equation imposes the behavior of the linear model solely in the direction
$d_{k-1}$. Therefore, the degrees of freedom should be explored in the directions that are
orthogonal to $d_{k-1}$. For all x, we can decompose
$$
x - x_{k-1} = \alpha d_{k-1} + s , \tag{8.17}
$$
where $s \in \mathbb{R}^n$ is such that $d_{k-1}^T s = 0$. Therefore, (8.16) is written as
$$
\alpha (A_k - A_{k-1}) d_{k-1} + (A_k - A_{k-1}) s . \tag{8.18}
$$
The secant equation imposes that the first term should be
$$
\alpha (A_k - A_{k-1}) d_{k-1} = \alpha (y_{k-1} - A_{k-1} d_{k-1}) .
$$
It does not depend on $A_k$, and no degrees of freedom are available here. However,
$d_{k-1}$ is not involved in the second term, and the secant equation is irrelevant for it.
We choose $A_k$ such that this second term disappears, so that the gap between the
two models is minimal. This is the case if $A_k - A_{k-1}$ is defined by
$$
A_k - A_{k-1} = u d_{k-1}^T , \tag{8.19}
$$
because $d_{k-1}^T s = 0$. In this way, the choice of $A_k$ depends on the choice of u. Once
again, it is the secant equation that enables its definition. We have
$$
u \, d_{k-1}^T d_{k-1} = (A_k - A_{k-1}) d_{k-1} = y_{k-1} - A_{k-1} d_{k-1} .
$$
Therefore,
$$
u = \frac{y_{k-1} - A_{k-1} d_{k-1}}{d_{k-1}^T d_{k-1}} \tag{8.20}
$$
and (8.15) is obtained directly from (8.19) and (8.20).
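
Numerically, the Broyden update (8.15) is a rank-one correction that costs one
matrix-vector product and one outer product. The following NumPy sketch (ours,
not the book’s code) applies the update and checks that the resulting matrix
satisfies the secant equation (8.12):

    # Sketch of the Broyden update (8.15).
    import numpy as np

    def broyden_update(A_prev, d, y):
        """A_k from A_{k-1}, d_{k-1} = x_k - x_{k-1}, y_{k-1} = F(x_k) - F(x_{k-1})."""
        return A_prev + np.outer(y - A_prev @ d, d) / (d @ d)

    # Check on arbitrary data that the updated matrix satisfies A d = y.
    rng = np.random.default_rng(0)
    A_prev = rng.standard_normal((3, 3))
    d, y = rng.standard_normal(3), rng.standard_normal(3)
    print(np.allclose(broyden_update(A_prev, d, y) @ d, y))  # True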
We now show that this update indeed generates the matrix satisfying the secant
equation that is the closest to Ak−1 .

Theorem 8.7 (Broyden optimality). Consider $A_{k-1} \in \mathbb{R}^{n \times n}$ and $d_{k-1}, y_{k-1} \in
\mathbb{R}^n$, $d_{k-1} \neq 0$. Let $S = \{ A \mid A d_{k-1} = y_{k-1} \}$ be the set of matrices satisfying the
secant equation. Then (8.15), i.e.,
$$
A_k = A_{k-1} + \frac{(y_{k-1} - A_{k-1} d_{k-1})\, d_{k-1}^T}{d_{k-1}^T d_{k-1}} ,
$$
is a solution to
$$
\min_{A \in S} \| A - A_{k-1} \|_2
$$
and the unique solution to
$$
\min_{A \in S} \| A - A_{k-1} \|_F .
$$

Proof. Let A be an arbitrary matrix in S. We have
$$
\begin{aligned}
\| A_k - A_{k-1} \|_2
&= \left\| \frac{(y_{k-1} - A_{k-1} d_{k-1})\, d_{k-1}^T}{d_{k-1}^T d_{k-1}} \right\|_2 && \text{from (8.15)} \\
&= \left\| \frac{(A d_{k-1} - A_{k-1} d_{k-1})\, d_{k-1}^T}{d_{k-1}^T d_{k-1}} \right\|_2 && \text{because } A \in S \\
&= \left\| (A - A_{k-1}) \frac{d_{k-1} d_{k-1}^T}{d_{k-1}^T d_{k-1}} \right\|_2
\end{aligned}
$$