
Modern Control Systems (Moderne Regelungstechnik)

Lecture Notes

Jürgen Pannek

April 7, 2023
Jürgen Pannek
Institute for Intermodal Transport and Logistic Systems
Hermann-Blenck-Str. 42
38519 Braunschweig
FOREWORD

During the winter term 2022/23, I give the lecture for the module Modern Control Systems (Moderne
Regelungstechnik) at the Technical University of Braunschweig. To structure the lecture and
support my students in their learning process, I prepared these lecture notes. As this is the first
edition, the notes are still incomplete and will be updated in due course of the lecture itself. Moreover,
I will integrate remarks and corrections throughout the term.
The aim of the module is to provide participating students with knowledge of advanced control
methods, which extend the range of control engineering. After having successfully completed
the lecture Modern Control Systems, students are able to define control methods for embedded
and networked systems, transfer them to models and applications and apply them. The students
can specify and explain the aspects of consistency, stability and robustness as well as areas of
application of methods. In addition, they are able to implement the integration of methods in
toolchains and apply them to real systems such as vehicles. Students can also reproduce processes
of parameter application and automated testing and transfer them to case studies.
To this end, the module will tackle the subject areas

optimal and robust control as well as

predictive and AI based control

for linear as well as nonlinear systems. In particular, we discuss the methods

LQR – linear quadratic control,

H2 regulator – output feedback control,

H∞ regulator – robust control,

MPC – model predictive control, and


DCS – distributed control systems

within the lecture and support understanding and application within the tutorial classes. The
module itself is accredited with 5 credits.
An electronic version of this script can be found at

https://www.tu-braunschweig.de/itl/lehre/skripte

During the preparation of the lecture, I utilized the books of E. Sontag [6] and D. Hinrichsen and
A. Pritchard [3] for terminology and linear systems. Regarding MPC, the lecture notes are based
on the book of L. Grüne and J. Pannek [2].
Contents

Contents iv

List of Tables v

List of Figures vii

List of Definitions and Theorems xi

1 Stability and observability 1


1.1 System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Observability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

I Linear systems 19

2 Optimal stabilization 21
2.1 Linear quadratic regulator — LQR . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2 H2 control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3 H∞ control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

3 Optimal observation 33
3.1 Recursive estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.2 Transformation of dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3 Kalman filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

II Nonlinear systems 43

4 Digitalization 45
4.1 Zero order hold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.2 Practical stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

4.3 Existence of stabilizing feedback . . . . . . . . . . . . . . . . . . . . . . . . . . 50


4.4 Intersample behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

5 Model predictive control 55


5.1 Introduction of constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.2 MPC approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.3 Recursive feasibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.4 Stability conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

6 Distributed control 69
6.1 Separation of systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
6.2 Sequential approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6.3 Hierarchical approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.4 Parallel approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

Bibliography 85
List of Tables
List of Figures

1.1 Term of a system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2


1.2 Sketch of a dynamic flow and a trajectory . . . . . . . . . . . . . . . . . . . . . 5
1.3 Flow of information for controllability and observability . . . . . . . . . . . . . 6
1.4 Connection of controllability and stability . . . . . . . . . . . . . . . . . . . . . 12
1.5 Connection of observability and detectability . . . . . . . . . . . . . . . . . . . 17

2.1 Connection of LQR results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

4.1 Zero order hold sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47


4.2 Zero order hold solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

5.1 Building blocks within the MPC Algorithm 5.9 . . . . . . . . . . . . . . . . . . 56


5.2 Sketch of a three stage supply network . . . . . . . . . . . . . . . . . . . . . . . 57
5.3 Definition of the driving path via splines for given routing points . . . . . . . . . 58
5.4 Sketch of a viability set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

6.1 Building blocks within the MPC Algorithm 5.9 . . . . . . . . . . . . . . . . . . 69


6.2 Communication structure for scheme of Richards and How . . . . . . . . . . . . 77
6.3 Sketch of exemplary communication (dashed) and dependency graph (line) . . . 79
6.4 Communication schedule for dual decomposition . . . . . . . . . . . . . . . . . 83
List of Definitions and Theorems

Definition 1.1 System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2


Definition 1.2 Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Definition 1.3 State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Definition 1.5 State space – continuous time system . . . . . . . . . . . . . . . . . . . 3
Definition 1.7 State space – discrete time system . . . . . . . . . . . . . . . . . . . . . 4
Definition 1.9 Linear control system . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Theorem 1.10 Solution of linear control system . . . . . . . . . . . . . . . . . . . . . 6
Corollary 1.11 Superposition and time shift . . . . . . . . . . . . . . . . . . . . . . . 7
Definition 1.12 Operating point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Definition 1.13 Stability and Controllability . . . . . . . . . . . . . . . . . . . . . . . 8
Theorem 1.14 Eigenvalue criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Theorem 1.16 Linear feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Theorem 1.17 Kalman criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Theorem 1.19 Separability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Theorem 1.20 Hautus criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Theorem 1.21 Controllable canonical form . . . . . . . . . . . . . . . . . . . . . . . . 11
Theorem 1.22 Assignable polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Theorem 1.23 Stabilizing polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Corollary 1.24 Polynomial for Hautus criterion . . . . . . . . . . . . . . . . . . . . . 11
Definition 1.26 Distinguishability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Lemma 1.27 Necessary and sufficient condition for distinguishability . . . . . . . . . . 13
Theorem 1.29 Kalman criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Theorem 1.30 Separability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Definition 1.31 Dual system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Theorem 1.32 Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Definition 1.34 Detectability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Theorem 1.35 Hautus criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Theorem 1.36 Observable canonical form . . . . . . . . . . . . . . . . . . . . . . . . 16
Theorem 1.37 Duality of detectability and stability . . . . . . . . . . . . . . . . . . . 16
Definition 2.1 Key performance criterion . . . . . . . . . . . . . . . . . . . . . . . . . 21

Definition 2.2 Cost function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22


Definition 2.3 Cost functional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Definition 2.4 Optimal control problem . . . . . . . . . . . . . . . . . . . . . . . . . 22
Definition 2.5 Null controlling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Corollary 2.6 Null controlling stability . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Definition 2.7 Quadratic cost function . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Definition 2.8 LQ problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Theorem 2.9 Null controlling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Theorem 2.10 LQR feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Theorem 2.11 Algebraic Riccati equation . . . . . . . . . . . . . . . . . . . . . . . . 24
Definition 2.14 L2 norm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Corollary 2.16 H2 norm equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Corollary 2.17 Laplace-transform impulse response . . . . . . . . . . . . . . . . . . . 28
Theorem 2.18 H2 norm equivalence for LTI . . . . . . . . . . . . . . . . . . . . . . . 28
Theorem 2.19 H2 stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Definition 2.20 H2 problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Theorem 2.22 H2 feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Definition 2.25 L∞ norm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Definition 2.27 H∞ problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Theorem 2.28 H∞ feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Definition 3.4 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Definition 3.5 Estimator dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Definition 3.6 Error function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Definition 3.7 Cost functional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Definition 3.8 Optimal estimation problem . . . . . . . . . . . . . . . . . . . . . . . . 37
Corollary 3.9 Null controlling observability . . . . . . . . . . . . . . . . . . . . . . . 37
Definition 3.10 Error dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Definition 3.12 Quadratic cost functional for observability . . . . . . . . . . . . . . . 39
Theorem 3.13 Time transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Definition 3.14 Kalman filter problem . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Theorem 3.15 Kalman filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Definition 4.2 Zero order hold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Definition 4.4 Zero order hold solution . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Definition 4.7 Practical stability/controllability . . . . . . . . . . . . . . . . . . . . . . 49
Lemma 4.9 Existence of feed forward . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Corollary 4.10 Existence of practically stabilizing feed forward . . . . . . . . . . . . . 49
Definition 4.12 Practical Control-Lyapunov functions . . . . . . . . . . . . . . . . . . 51

Theorem 4.14 Existence of feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . 51


Theorem 4.16 Existence of practical Control-Lyapunov function . . . . . . . . . . . . 52
Definition 4.17 Uniform boundedness . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Theorem 4.18 Asymptotic stability and uniform boundedness over T . . . . . . . . . . 53
Definition 5.2 Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Definition 5.6 Constrained optimal control problem . . . . . . . . . . . . . . . . . . . 60
Definition 5.8 Digital constrained optimal control problem . . . . . . . . . . . . . . . 61
Definition 5.11 Admissibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Definition 5.13 Feasibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Theorem 5.14 Recursive feasibility and admissibility . . . . . . . . . . . . . . . . . . 64
Definition 5.17 Terminal constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Definition 5.19 Feasibility set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Corollary 5.20 Feasibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Theorem 5.21 Recursive feasibility using terminal constraints . . . . . . . . . . . . . . 67
Theorem 5.22 Asymptotical stability using terminal constraints . . . . . . . . . . . . . 67
Definition 5.23 Terminal costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Theorem 5.24 Asymptotical stability using terminal costs . . . . . . . . . . . . . . . . 67
Theorem 5.25 Asymptotical stability using suboptimality . . . . . . . . . . . . . . . . 68
Definition 6.3 Projection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Definition 6.4 Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Definition 6.7 Neighboring index set . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Definition 6.8 Neighboring data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Corollary 6.11 Equivalent subsystem split . . . . . . . . . . . . . . . . . . . . . . . . 74
Definition 6.13 Projected digital constrained optimal control problem . . . . . . . . . . 74
Theorem 6.17 Recursive feasibility of distributed NMPC . . . . . . . . . . . . . . . . 76
Definition 6.20 Neighboring data extension . . . . . . . . . . . . . . . . . . . . . . . 77
Theorem 6.23 Stability of Richards and How Algorithm . . . . . . . . . . . . . . . . . 79
Corollary 6.24 Independence of systems . . . . . . . . . . . . . . . . . . . . . . . . . 80
Definition 6.25 List of parallel operational systems . . . . . . . . . . . . . . . . . . . 80
Definition 6.26 Priority and deordering rule . . . . . . . . . . . . . . . . . . . . . . . 80
Definition 6.31 Cost operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
CHAPTER 1

STABILITY AND OBSERVABILITY

In control engineering practice, the terms stability and observability are central properties of
systems. Abstractly speaking, stability of a system is given if for any bounded input the state and
output of the system remain bounded and, additionally, the impact of the input decays over time.
Observability, on the other hand, is given if for a known input and output history of a system the
state of the system can be computed and is unique.
Within this lecture, we discuss methods to enforce and evaluate these properties. For our dis-
cussion, we distinguish between linear and nonlinear systems. The reason for considering these
cases separately is that for linear systems it is possible to analytically evaluate scenarios without
simulation, and therefore also to use respective formulas to prove properties of methods. In the
nonlinear case, the options to rigorously show such results are limited, and the evaluation of systems
and methods requires complex simulations.
This chapter serves as basis for the terminology used within the lecture. We first introduce the
required terms from system theory and control theory. Thereafter, we define the concepts of
stability and observability.

1.1. System
The term system as such is typically not defined clearly. In certain areas, a system stands for a
connected graph, a dynamically evolving entity or even a simulation or an optimization. While
these interpretations are quite distinct, they all can be boiled down to the following:

A system is the connection of different interacting components to realize given tasks.

The interdependence of systems with their environment is given by so called inputs and outputs.
More formally, we define the following:

Definition 1.1 (System).


Consider two sets U and Y . Then a map Σ : U → Y is called a system.

The sets U and Y are called input and output sets. An element u ∈ U of the input set is called
an input; inputs act from the environment on the system and do not depend on the system
itself or its properties. We distinguish between inputs which are used to specifically manipulate
(or control) the system and inputs which are not manipulated on purpose. We call the first ones
control or manipulation inputs, and we refer to the second ones as disturbance inputs. An element
y ∈ Y of the output set is called an output. In contrast to an input, the output is generated by
the system and influences the environment. Here, we distinguish output variables depending on
whether we measure them or not. We call the measured ones measurement outputs.

Figure 1.1.: Term of a system (block diagram: the inputs u_1, . . . , u_{n_u} enter the system Σ, which produces the outputs y_1, . . . , y_{n_y})

In the literature, certain classes of systems are considered:

If the system is linear in inputs and outputs, then the system is called linear. Similarly, if it
is not linear in either the inputs or outputs, then the system is called nonlinear.

If all parameters are constants, then the system is called time invariant. It is termed time
varying if at least one parameter is time dependent.

If the outputs depend only on the inputs at the same time instant, we call such systems static.
If the output of the system depends not only on the input at the current time instant but also
on the history of the latter, we call these systems dynamic.

If the outputs depend on the history of the inputs only, then the system is called causal. If
future values are included, the system is called non-causal.

If inputs are directly mapped to outputs, then the map is called an input-output system. If the
input triggers changes of an internal variable and the output depends on the latter, then the
map is called a state space system.
If time is measured continuously, then we call the system a continuous time system. If time
is sampled, we refer to the system as a discrete time system.

To assess systems, we require a formal notation of time:

Definition 1.2 (Time).


A time set T is a subgroup of (R, +).

Within the lecture, we focus on state space systems, which are time invariant, dynamic and causal.
To introduce such systems, we first need to define what we referred to as internal variable:

Definition 1.3 (State).


Consider a system Σ : U → Y . If the output y(t) uniquely depends on the history of inputs
u(τ ) for t0 ≤ τ ≤ t and some x(t0 ), then the variable x(t) is called state of the system and the
corresponding set X is called state set.

Within Definition 1.3, input, output and state refer to tuples

u = [ u_1  u_2  . . .  u_{n_u} ]^⊤     (1.1a)
y = [ y_1  y_2  . . .  y_{n_y} ]^⊤     (1.1b)
x = [ x_1  x_2  . . .  x_{n_x} ]^⊤.     (1.1c)

where u_j is an element within the subset j of the input set U, y_j is an element within the subset j
of the output set Y and x_j is an element within the subset j of the state set X.

Remark 1.4
Here, we use this notation to allow for real valued and other entries such as gears, method
characteristics or switches. In the real valued setting, we have U ⊂ R^{n_u}, Y ⊂ R^{n_y} and X ⊂ R^{n_x}.

In the continuous time setting T = R, we can utilize the short form ẋ for dx/dt and obtain the
following compact notation:

Definition 1.5 (State space – continuous time system).


Consider a system Σ : U → Y in continuous time T = R satisfying the property from Definition 1.3.
If X is a vector space, then we call it state space and refer to

ẋ(t) = f(x(t), u(t), t),   x(t_0) = x_0     (1.2a)
y(t) = h(x(t), u(t), t).     (1.2b)

as continuous time system. Moreover, u, y and x are called input, output and state of the system.

The state of a system at time instant t can then be depicted as a point in the n_x–dimensional state
space. The curve of points for variable time t in the state space is called trajectory and is denoted
by x(·).

Remark 1.6
Systems with infinite dimensional states are called distributed parametric systems and are de-
scribed, e.g., via partial differential equations. Examples of such systems are beams, boards,
membranes, electromagnetic fields, heat etc..

Similarly, in discrete time T = Z we define the following:

Definition 1.7 (State space – discrete time system).


Consider a system Σ : U → Y in discrete time T = Z satisfying the property from Defini-
tion 1.3. If X is a vector space, then we refer to

x(k + 1) = f(x(k), u(k), k),   x(0) = x_0     (1.3a)
y(k) = h(x(k), u(k), k).     (1.3b)

as discrete time system. Again, u, y and x are called input, output and state of the system.

While we have t ∈ R in continuous time, for discrete time systems the matter of time refers to an
index k ∈ Z. Hence, trajectories are no longer curves but sequences of points in the respective
set. Discrete time systems are the typical result of digitalization, as sampling a continuous time
system, e.g. via an A/D and D/A converter, directly yields a discrete time system. The result of
such a digitalization is a time grid. The simplest case is equidistant sampling
with sampling time T, which gives us

T := {tk | tk := t0 + k · T } ⊂ R. (1.4)

where t0 is some fixed initial time stamp. Apart from equidistant sampling, other types such as
event based or sequence based are possible.

Remark 1.8
Note that the class of discrete time systems is larger and contains the class of continuous time
systems, i.e. for each continuous time system there exists a discrete time equivalent, but for some
discrete time systems no continuous time equivalent exists.

Note that in both discrete and continuous time, the map reveals a flow within the state space.
We obtain a trajectory if we specify an initial value and an input sequence. Figure 1.2 illustrates
the idea of flow and trajectory. In this case, the flow is colored to mark its intensity whereas the
arrows point in its direction. The trajectory is evaluated for a specific initial value and "follows"
the flow accordingly.

Figure 1.2.: Sketch of a dynamic flow and a trajectory (state space with coordinates x_1 and x_2)

As indicated in the introduction, stability refers to the property of being able to control a system
to achieve a certain goal like boundedness or convergence. To this end, the input must be able
to show an impact on the states, be it directly or indirectly. Observability, on the other hand,
refers to the ability of identifying the status of a system, that is, being able to directly or indirectly
measure the states. Figure 1.3 illustrates this context. The figure also shows that typically not all
states can be manipulated, not even indirectly, and not all states can be observed. Yet, we will see
that even in this case methods can be applied to ensure stability and observability.
In order to discuss the terms stability and observability in detail, we focus on the special class of
linear control systems:
Figure 1.3.: Flow of information for controllability and observability (the input u acts on some of the states x_1, . . . , x_{n_x} directly and on others only indirectly or not at all; the output y is generated from a subset of the states)

Definition 1.9 (Linear control system).


For matrices A ∈ R^{n_x×n_x}, B ∈ R^{n_x×n_u}, C ∈ R^{n_y×n_x}, D ∈ R^{n_y×n_u}, we call the system

ẋ(t) = A·x(t) + B·u(t),   x(0) = x_0     (1.5a)
y(t) = C·x(t) + D·u(t)     (1.5b)

a linear time invariant control system in continuous time with initial value x_0 ∈ R^{n_x}. The
discrete time equivalent reads

x(k + 1) = A·x(k) + B·u(k),   x(0) = x_0     (1.6a)
y(k) = C·x(k) + D·u(k).     (1.6b)

This class is of particular interest as we can directly give its solution

Theorem 1.10 (Solution of linear control system).


Consider a linear control system (1.5). Then for any initial condition x(t0 ) = x0 and any piece-
wise continuous control function u ∈ U there exists a unique solution

x(t; t_0, x_0, u) = exp(A·(t−t_0))·x_0 + ∫_{t_0}^{t} exp(A·(t−s))·B·u(s) ds.     (1.7)

In the discrete time case (1.6) the solution reads

x(k) = A^k·x_0 + ∑_{j=0}^{k−1} A^{k−1−j}·B·u(j).     (1.8)
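As an illustration, the discrete time solution formula (1.8) can be evaluated by iterating the underlying recursion x(k+1) = A·x(k) + B·u(k). The following minimal sketch (Python with NumPy, using a hypothetical two-dimensional example system) is meant purely as an illustration:

import numpy as np

def discrete_solution(A, B, x0, u_seq):
    # iterate x(k+1) = A x(k) + B u(k); after len(u_seq) steps this equals (1.8)
    x = np.array(x0, dtype=float)
    for u in u_seq:
        x = A @ x + B @ np.atleast_1d(u)
    return x

# hypothetical example data
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
x0 = np.array([1.0, 0.0])
print(discrete_solution(A, B, x0, [1.0, 1.0, 0.5]))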

From the solution, we directly obtain the so called superposition property and the time shifting
property:

Corollary 1.11 (Superposition and time shift).


Consider a linear control system from Definition 1.9. Then the superposition principle

x(t; t0 , x0 , u) = x(t; t0 , x0 , 0) + x(t; t0 , 0, u) (1.9)

and the time shift property

x(t; t0 , x0 , u) = x(t; s, x(s; t0 , x0 , u), u) = x(t − s; t0 − s, x0 , u(s + ·)) (1.10)

hold.

The superposition principle allows us to separate the uncontrolled solution (u = 0) and the
purely input-driven solution (x_0 = 0).

1.2. Stability
Stability is an essential property for control systems and is bound to certain points in the state
space, the so called operating points. An operating point is characterized by the dynamics being
zero at that point. In other terms, the input (as a control) should be chosen appropriately to
render this property true.

Definition 1.12 (Operating point).


For continuous time systems (1.2) the pairs (x⋆ , u⋆ ) satisfying

f (x⋆ , u⋆ ) = 0 (1.11)

are called operating points of the system. For discrete time systems (1.3) we call (x⋆ , u⋆ ) oper-
ating point if

f (x⋆ , u⋆ ) = x⋆ (1.12)

If (1.11) or (1.12) respectively holds true for any u⋆, then the operating point is called a strong or
robust operating point.

Note that for autonomous systems, that is (1.2) or (1.3) being independent of time t or k, the
control u ∈ Rnu is required to be constant and fixed to u = u⋆ in order to compute the operating
points.

Based on this definition, the property of stability can be characterized by boundedness and con-
vergence of solutions:

Definition 1.13 (Stability and Controllability).


For a system (1.2) we call x⋆

strongly or robustly stable operating point if, for each ε > 0, there exists a real number
δ = δ(ε) > 0 such that for all u we have

∥x0 − x⋆ ∥ ≤ δ =⇒ ∥x(t) − x⋆ ∥ ≤ ε ∀t ≥ 0 (1.13)

strongly or robustly asymptotically stable operating point if it is stable and there exists a
positive real constant r such that for all u

lim_{t→∞} ∥x(t) − x⋆∥ = 0     (1.14)

holds for all x_0 satisfying ∥x_0 − x⋆∥ ≤ r. If additionally r can be chosen arbitrarily large,
then x⋆ is called globally strongly or robustly asymptotically stable.

weakly stable or controllable operating point if, for each ε > 0, there exists a real number
δ = δ(ε) > 0 such that for each x0 there exists a control u guaranteeing

∥x0 − x⋆ ∥ ≤ δ =⇒ ∥x(t) − x⋆ ∥ ≤ ε ∀t ≥ 0. (1.15)

weakly asymptotically stable or asymptotically controllable operating point if there exists a
control u depending on x_0 such that (1.15) holds and there exists a positive constant r such
that

lim_{t→∞} ∥x(t) − x⋆∥ = 0   ∀ ∥x_0 − x⋆∥ ≤ r.     (1.16)

If additionally r can be chosen arbitrarily large, then x⋆ is called globally asymptotically
stable.

Stability is of particular interest as it allows the input to be considered as a disturbance while
still retaining the mentioned properties. Controllability, on the other hand, refers to inducing these
properties in the system by means of the input.
In the linear case, we can derive sufficient conditions for the system to be stable. The main
ingredient is the so called Eigenvalue criterion:

Theorem 1.14 (Eigenvalue criterion).


Consider a system (1.5) with u ≡ 0. Let λ1 , . . . , λ j ∈ C be the Eigenvalues of A.

Then the operating point x⋆ = 0 is stable iff all Eigenvalues have non-positive real part
and for all Eigenvalues with real part 0 the corresponding Jordan block is one-dimensional.

Then the operating point x⋆ = 0 is locally asymptotically stable iff all Eigenvalues have
negative real part.

Remark 1.15
If all Eigenvalues of a matrix A exhibit negative real part, then the matrix is called Hurwitz.
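As a small numerical illustration (a sketch in Python with NumPy, using an assumed example matrix), the Hurwitz property from Theorem 1.14 can be checked by inspecting the real parts of the eigenvalues:

import numpy as np

def is_hurwitz(A):
    # Theorem 1.14: asymptotic stability iff all eigenvalues have negative real part
    return bool(np.all(np.linalg.eigvals(A).real < 0))

# assumed example: a damped oscillator
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
print(is_hurwitz(A))   # True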

Given the Eigenvalue criterion, it is straightforward to derive an input, which induces the stability
property.

Theorem 1.16 (Linear feedback).


Consider a system (1.5) with u = F · x. Then the operating point x⋆ = 0 is locally asymptotically
stable iff all Eigenvalues of A + B · F for a feedback F have negative real part.

So technically, that would be it. Yet, we don’t know

1. whether or not it is actually possible that a feedback F can be constructed such that the
conditions of Theorem 1.16 hold, nor

2. how such a feedback can be constructed.

To answer the first question, we take a look at controllability of a system. Here, Kalman formulated
the idea of reaching points by combinations of dynamics and input, that is A and B. Since the
dimension of the set reachable by the dynamics cannot grow any larger after n_x − 1 iterations,
he introduced the so called Kalman criterion:

Theorem 1.17 (Kalman criterion).


The system (1.5) is controllable iff for the controllability matrix

rk [ B | A·B | . . . | A^{n_x−1}·B ] = n_x     (1.17)

holds. Then the pair ( A, B) is called controllable.
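The Kalman criterion (1.17) can be checked numerically by assembling the controllability matrix and computing its rank; the following sketch (Python with NumPy, assumed example system) illustrates this:

import numpy as np

def controllability_matrix(A, B):
    # stack [B | A B | ... | A^{n_x-1} B], cf. (1.17)
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])

def is_controllable(A, B):
    return np.linalg.matrix_rank(controllability_matrix(A, B)) == A.shape[0]

# assumed example: double integrator with force input
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
print(is_controllable(A, B))   # True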



Remark 1.18
The reachable set is typically defined as the set of points which can be reached from x0 = 0 within
a certain time t ≥ 0 via

R(t) := {x(t, 0, u) | u ∈ U } .

Similarly, the controllable set refers to those points x0 , for which a control u can be found to
drive the solution to the origin, i.e.

C(t) := {x0 | ∃u ∈ U : x(t, x0 , u) = 0} .

Unfortunately, in his criterion Kalman made the assumption that the control needs to affect all
dimensions of the state space in order for the system to be controllable. But if a part of the
system is already stable even without the control affecting it, then only controllability of
the remaining part needs to be ensured. To this end, Hautus introduced separability in the state
space:

Theorem 1.19 (Separability).


For any system (1.5), which is not controllable, there exists a linear transformation T such that
Ã := T^{−1}·A·T = \begin{pmatrix} A_1 & A_2 \\ 0 & A_3 \end{pmatrix},   B̃ := T^{−1}·B = \begin{pmatrix} B_1 \\ 0 \end{pmatrix}     (1.18)

where ( A1 , B1 ) is controllable.

Now, the idea is to simply apply the Kalman criterion to the separated part of the dynamics/state
space:

Theorem 1.20 (Hautus criterion).


Consider a system (1.5). Then ( A, B) is controllable iff

rk (λId − A | B) = n x (1.19)

holds

1. for all λ ∈ C or

2. for all eigenvalues λ ∈ C of A.
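The Hautus test (1.19) only needs to be evaluated at the eigenvalues of A; a short numerical sketch (Python with NumPy, assumed example system) reads:

import numpy as np

def hautus_controllable(A, B):
    # check rk(lambda*Id - A | B) = n_x for every eigenvalue lambda of A, cf. (1.19)
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        M = np.hstack([lam * np.eye(n) - A, B.astype(complex)])
        if np.linalg.matrix_rank(M) < n:
            return False
    return True

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # assumed example system
B = np.array([[0.0], [1.0]])
print(hautus_controllable(A, B))          # True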



Having answered the question whether or not a feedback can be constructed, we next focus on
how such a feedback can be computed. To this end, we apply basic linear algebra, which gives us
the so called controllable canonical form. Again, we start with the simpler Kalman case.

Theorem 1.21 (Controllable canonical form).


Consider a system (1.5). Then ( A, B) is controllable iff there exists a linear transformation T
with
   
Ã = T^{−1}·A·T = \begin{pmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ α_1 & α_2 & \cdots & α_{n_x} \end{pmatrix},   B̃ = T^{−1}·B = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}     (1.20)

with coefficients α_j of the assigned polynomial Ξ_A = z^{n_x} − α_{n_x}·z^{n_x−1} − · · · − α_2·z − α_1.

Based on the latter, we directly obtain controllability if we can assign any polynomial.

Theorem 1.22 (Assignable polynomial).


Consider a system (1.5). Then the pair ( A, B) is controllable iff every polynomial of degree n x is
assignable.

To enforce the stability property, we require that the roots of an assignable polynomial lie in the
left complex half-plane. Hence, if any polynomial is assignable, we choose one for which this
root criterion holds.

Theorem 1.23 (Stabilizing polynomial).


Consider a system (1.5). Then the operating point x⋆ = 0 is locally asymptotically stable iff there
exists an assignable polynomial, for which all roots in C have negative real part.

Coming back to Hautus’s case, we basically require that the uncontrollable part is already stable,
that is:

Corollary 1.24 (Polynomial for Hautus criterion).


For any system (1.5), the following is equivalent:

There exists an assignable polynomial, for which all roots in C have negative real part.

The pair ( A, B) is controllable or ( A, B) is not controllable but A3 has only eigenvalues


with negative real part.

Combining these lines of argumentation, Figure 1.4 provides an overview of the results.

Figure 1.4.: Connection of controllability and stability (Kalman: (A, B) is controllable; Hautus: (A, B) is controllable, or (A, B) is not controllable but A_3 has only eigenvalues with negative real part; via Theorem 1.22 respectively Corollary 1.24 an assignable polynomial (with roots of negative real part) exists; via Theorem 1.23, (A, B) is stabilizable)

Remark 1.25
Theorem 1.23 and Corollary 1.24 are often called pole shifting theorem as the roots of the char-
acteristic polynomial are equivalent to the poles of the transfer matrix of the system.
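In practice, the pole shifting point of view also suggests a direct way to compute a stabilizing feedback: choose the desired roots and assign them. The following sketch (Python with SciPy, assumed double integrator example; the routine place_poles returns a gain K with eig(A − B·K) equal to the chosen poles, so F = −K yields A + B·F as in Theorem 1.16) is meant as an illustration only:

import numpy as np
from scipy.signal import place_poles

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # assumed example: double integrator
B = np.array([[0.0], [1.0]])

res = place_poles(A, B, poles=[-1.0, -2.0])  # desired closed loop eigenvalues
F = -res.gain_matrix                          # feedback such that A + B F is Hurwitz
print(np.linalg.eigvals(A + B @ F))           # approximately [-1, -2]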

1.3. Observability
Similarly to controllability, in many cases not all but only a reasonable subset of manipulable
inputs are controlled. Regarding observability, we also have the case that in most cases not all
measurable outputs are actually measured. For our linear time invariant system (1.5) or (1.6) this
means that the matrices C, D are not full rank matrices. In practice, one typically only finds that
states are measured while inputs stay unmeasured, i.e. D = 0.
The task for observability is to derive information on the system from the outputs y(·) ∈ Y by
utilizing the values themselves and the history of values.

Definition 1.26 (Distinguishability).


For a system (1.2) we call

two states x1 , x2 ∈ X distinguishable if there exists an input u ∈ U such that

h(x1 (t), u(t)) ̸= h(x2 (t), u(t)) (1.21)

for some time t ∈ T .

the system observable if any two states x1 , x2 ∈ X are distinguishable.

As in the previous Section 1.2, we now focus on the linear time invariant case. For such systems,
we have that equation (1.21) reads

C · x(t, x1 , u(t)) ̸= C · x(t, x2 , u(t)). (1.22)

By superposition, we can simplify the latter using linearity:

Lemma 1.27 (Necessary and sufficient condition for distinguishability).


Consider the system (1.5). Then two states x1 , x2 ∈ X are distinguishable iff condition

C · x(t, x1 − x2 , 0) ̸= 0 (1.23)

holds for some t ≥ 0.

Note that the lemma states that distinguishability and observability do not depend on the input
u in the linear case.

Remark 1.28
The set of non-observable states is defined as those states x0 such that the output for u = 0 is
always zero, i.e.

N (t) := {x0 | C · x(t, x0 , 0) = 0 ∀t ≥ 0} .

So again as in Section 1.2, we

1. need to identify conditions to ensure that a system is observable, and

2. have to construct an observer.



Based on Lemma 1.27, we can apply the Kalman criterion from Theorem 1.17 to the pair
(A, C).

Theorem 1.29 (Kalman criterion).


The system (1.5) is observable and the pair (A, C) is called observable iff for the observability
matrix

rk [ C^⊤ | A^⊤·C^⊤ | . . . | (A^⊤)^{n_x−1}·C^⊤ ] = n_x     (1.24)

holds.
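Analogously to the controllability check, criterion (1.24) can be evaluated numerically; since transposition does not change the rank, one may equivalently stack C, C·A, . . . , C·A^{n_x−1}. A short sketch (Python with NumPy, assumed example system):

import numpy as np

def observability_matrix(A, C):
    # stack C, C A, ..., C A^{n_x-1}; its rank equals the rank in (1.24)
    n = A.shape[0]
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])

def is_observable(A, C):
    return np.linalg.matrix_rank(observability_matrix(A, C)) == A.shape[0]

# assumed example: double integrator with position measurement only
A = np.array([[0.0, 1.0], [0.0, 0.0]])
C = np.array([[1.0, 0.0]])
print(is_observable(A, C))   # True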

Following the approach of Hautus, an identical separation within the dynamics can be utilized to
widen the applicability of Kalman’s criterion.

Theorem 1.30 (Separability).


For any system (1.5), which is not observable, there exists a linear transformation T such that
Ã := T^{−1}·A·T = \begin{pmatrix} A_1 & A_2 \\ 0 & A_3 \end{pmatrix},   B̃ := T^{−1}·B = \begin{pmatrix} B_1 \\ 0 \end{pmatrix},   C̃ := C·T = \begin{pmatrix} 0 & C_2 \end{pmatrix}     (1.25)

where ( A3 , C2 ) is observable.

Now, however, we face the difficulty that an equivalent of stability in the context of observable
systems is missing. Yet, we have seen that there are remarkable similarities between controllability
and observability. These similarities also exist on a systemic level:

Definition 1.31 (Dual system).


Consider the system (1.5) defined by ( A, B, C ). Then we define the dual system as given by
( A ⊤ , C ⊤ , B ⊤ ).

Using this definition, we obtain

Theorem 1.32 (Duality).


Consider a system ( A, B, C ) and its dual ( A⊤ , C ⊤ , B⊤ ). Then we have

( A, B, C ) controllable ⇐⇒ ( A⊤ , C ⊤ , B⊤ ) observable (1.26)


( A, B, C ) observable ⇐⇒ ( A⊤ , C ⊤ , B⊤ ) controllable (1.27)

Remark 1.33
In particular, we have that the reachable set of the dual system is identical to the observable set
( ⋃_{t≥0} R(t) )^⊤ =: R^⊤ = N^⊥ := ( ⋂_{t≥0} N(t) )^⊥

and vice versa.

Using duality, we define the property detectability, which resembles stability of the dual system.

Definition 1.34 (Detectability).


A system (1.5) is called detectable if

lim_{t→∞} x(t, x_0, 0) = 0     (1.28)

holds for all non-observable states x_0 ∈ N (cf. Remarks 1.28 and 1.33).

Detectability therefore means that information on the non-observable part (cf. Theorem 1.30) is
not required as respective solutions are asymptotically stable.
Hence, we now have the means to transfer the Hautus criterion to observability.

Theorem 1.35 (Hautus criterion).


Consider a system (1.5). Then ( A, C ) is observable iff
 
rk ( λ·Id − A^⊤ | C^⊤ ) = n_x     (1.29)

holds

1. for all λ ∈ C or

2. for all eigenvalues λ ∈ C of A.

Similar to the canonical form for controllability, for observable systems a respective transforma-
tion can be found.

Theorem 1.36 (Observable canonical form).


Consider a system (1.5). Then ( A, C ) is observable iff there exists a linear transformation T with
 
Ã = T^{−1}·A·T = \begin{pmatrix} 0 & \cdots & \cdots & 0 & α_1 \\ 1 & \ddots & & \vdots & α_2 \\ \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & \cdots & 0 & 1 & α_{n_x} \end{pmatrix},   C̃ = C·T = \begin{pmatrix} 0 & \cdots & 0 & 1 \end{pmatrix}     (1.30)

with coefficients α_j of the assigned polynomial Ξ_A = z^{n_x} − α_{n_x}·z^{n_x−1} − · · · − α_2·z − α_1.

Using duality, we particularly have

Theorem 1.37 (Duality of detectability and stability).


A system ( A, C ) is detectable iff the system ( A⊤ , C ⊤ ) is stabilizable.

Combining these lines of argumentation together with the results on stability, Figure 1.5 provides
an overview.
We would like to point out that the properties controllability and observability are independent from one
another and only connected via the respective dual system. Consequently, there exist four classes
of systems

1. controllable and observable,

2. controllable and not observable,

3. not controllable and observable, and

4. not controllable and not observable.

These classes can also be seen in Figure 1.3, which served as starting point for these terms.

Figure 1.5.: Connection of observability and detectability (Kalman: (A, C) is observable; Hautus: (A, C) is observable, or (A, C) is not observable but A_1 has only eigenvalues with negative real part; via Theorem 1.32 these statements are equivalent to controllability statements for (A^⊤, C^⊤); via Theorem 1.22, Corollary 1.24 and Theorem 1.23, (A^⊤, C^⊤) is stabilizable; via Theorem 1.37, (A, C) is detectable)


Part I.

Linear systems
CHAPTER 2

OPTIMAL STABILIZATION

Regarding stabilization, we found the Eigenvalue criterion to compute stabilizing feedbacks in


Chapter 1. While this is sufficient to guarantee stability, it only addresses a qualitative property
whereas quantitative aspects like performance or the dynamics itself are not considered. Particular
examples of quantitative aspects, which should be avoided, are large overshoots of trajectories and
large control values.
To deal with such quantitative issues, we discuss methods which include the latter directly within
their construction. To this end, we first clarify what is good and what needs to be avoided, and
then quantify these aspects. This is achieved by using so called key performance indicators within
a cost function, which is optimized subject to the state and the dynamics of the problem.
Throughout this chapter, we consider the case of stabilizing an operating point. More general
settings will be considered for more advanced methods. Additionally, we limit ourselves to linear
time invariant systems of the form (1.5) and assume the full state to be measurable.

2.1. Linear quadratic regulator — LQR


The starting point of optimally designing a feedback is the quantification of good performance.
To this end, inputs, outputs and functional dependencies of the system can be used to derive a
quantification. Regarding LQR, we consider the state space representation, yet for H2 and H∞
regulators, the frequency representation is used. To handle both concepts, we use so called key
performance criteria.

Definition 2.1 (Key performance criterion).


A key performance criterion is a function, which measures defined information retrieved from the
system against a standard.

Focusing on the state space, we typically speak of cost functions. These combine information
on the state and input of the system to quantify the performance of the control.

Definition 2.2 (Cost function).


We call a key performance criterion given by a function ℓ : X × U → R0+ a cost function.

The value of a key performance criterion reveals a snapshot only, i.e. the evaluation at one time
instant t ∈ T . To obtain the performance, we need to evaluate it over the operating period of the
system. Since by doing so we define a function of a function, this is referred to as a functional.

Definition 2.3 (Cost functional).


Consider a key performance criterion ℓ : X × U → R0+ . Then we call

J(x_0, u) := ∫_0^∞ ℓ(x(t, x_0, u), u(t)) dt     (2.1)

cost functional.

Now we can combine the criteria to evaluate and optimize the dynamics over an operating period.
This allows us to quantify not only operating points, but also the transients from the current state
of the system to such an operating point.

Definition 2.4 (Optimal control problem).


Consider a system (1.2) and a cost functional (2.1). Then we call

min J(x_0, u) = ∫_0^∞ ℓ(x(t, x_0, u), u(t)) dt   over all u ∈ U     (2.2)
subject to ẋ(t) = f(x(t), u(t), t),   x(t_0) = x_0

an optimal control problem. The function

V(x_0) := inf_{u∈U} J(x_0, u)     (2.3)

is called optimal value function.

The idea of the optimal control problem is to enforce the stability property of a system and to
compute a feedback, which is optimal in the sense of the key performance indicator. As a simple
way to check whether a feedback stabilizes a system, the following condition can be used.

Definition 2.5 (Null controlling).


Consider a system (1.2) and a cost function J : X × U → R0+ . If the condition

J ( x0 , u ) < ∞ =⇒ x(t, x0 , u) → 0 for t → ∞ (2.4)

holds, then we call the optimal control problem null controlling.

The connection between condition (2.4) and stability is rather simple: If we design the key
performance criterion such that it is zero at the desired operating point, then once the operating
point is reached no additional costs will occur over the operating period. Hence, the state of the
system will remain at the operating point. Note that by Definition 1.12 for each operating point
there exists an input such that the state remains unchanged.

Corollary 2.6 (Null controlling stability).


If an optimal control problem is null controlling, then the system is stabilizable.

Now, we focus on the LTI case (1.5). For this particular case, it is sufficient to consider a norm-like
criterion, that is, a way to measure the distance from the current state to the operating point. The first
distance which we consider is the Euclidean distance.

Definition 2.7 (Quadratic cost function).


We call a key performance criterion ℓ : X × U → R0+ a quadratic cost function if it is given by
ℓ(x, u) = \begin{pmatrix} x^⊤ & u^⊤ \end{pmatrix} · \begin{pmatrix} Q & N \\ N^⊤ & R \end{pmatrix} · \begin{pmatrix} x \\ u \end{pmatrix}     (2.5)

where Q ∈ Rnx ×nx , N ∈ Rnx ×nu and R ∈ Rnu ×nu form a symmetric and positive definite
matrix in (2.5).

Combining linear dynamics with quadratic costs gives us the so called LQ problem.

Definition 2.8 (LQ problem).


Consider the optimal control problem given by the LTI system (1.5) and the quadratic cost func-
tion (2.5). Then we refer to this setting as linear quadratic problem or LQ problem.

The nice property of the LQ problem is that it is null controlling and therefore its solution
also stabilizes the system.

Theorem 2.9 (Null controlling).


The LQ problem is null controlling.

The central question now is to compute the solution of the LQ problem. In particular, we are
not simply interested in a solution but in a solution which can be evaluated based on the state of
system, i.e. a feedback. To this end, we utilize the idea of the value function and suppose it can
be chosen in the ansatz

V (x) = x⊤ · P · x (2.6)

for P ∈ Rnx ×nx . If this ansatz is right, we obtain the following:

Theorem 2.10 (LQR feedback).


If the LQ problem exhibits a value function of the form (2.6), then the solution to the LQ problem

u⋆ (t) = F · x(t, x⋆ , F ) (2.7)

is asymptotically stable with feedback matrix F ∈ R^{n_u×n_x} given by

F = −R^{−1} · ( B^⊤·P + N^⊤ )     (2.8)

and x(t, x⋆ , F ) represents the solution of the closed loop

ẋ(t) = ( A + B · F ) · x(t), x(0, x⋆ , F ) = x⋆ .

To evaluate the feedback, we require the matrix P of the value function ansatz. This matrix can
be computed using the so called algebraic Riccati equation. The idea of this equation is that the
solution reaches the operating point; to calculate the minimum of the ansatz (2.6), we take the
derivative and set it to zero. Since the ansatz is quadratic, the necessary condition is also sufficient
for optimality.

Theorem 2.11 (Algebraic Riccati equation).


The optimal value function of the LQ problem is given by (2.6) iff the matrix P ∈ R^{n_x×n_x} is semi
positive definite and solves the algebraic Riccati equation

P·A + A^⊤·P + Q − ( P·B + N ) · R^{−1} · ( B^⊤·P + N^⊤ ) = 0.     (2.9)

During computation of a solution P of (2.9), we have to be careful regarding the requirements of


the solution for the following reason: While the algebraic Riccati equation may exhibit more than
one solution, there exists at most one semi positive definite P. Combining the latter results, we
obtain the following procedure to compute the linear quadratic regulator (LQR):

Algorithm 2.12 (Computation of LQR)


Consider an LQ problem

min J(x_0, u) = ∫_0^∞ \begin{pmatrix} x(t)^⊤ & u(t)^⊤ \end{pmatrix} · \begin{pmatrix} Q & N \\ N^⊤ & R \end{pmatrix} · \begin{pmatrix} x(t) \\ u(t) \end{pmatrix} dt   over all u ∈ U     (2.10)
subject to ẋ(t) = A·x(t) + B·u(t),   x(t_0) = x_0

to be given. Then we obtain the LQR feedback F via

1. Compute a semipositive definite solution P of the algebraic Riccati equation (2.9)

P·A + A^⊤·P + Q − ( P·B + N ) · R^{−1} · ( B^⊤·P + N^⊤ ) = 0.

2. Compute the optimal linear feedback F via (2.8)

F = −R^{−1} · ( B^⊤·P + N^⊤ ).
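Numerically, both steps of Algorithm 2.12 are available in standard software; the following sketch (Python with SciPy, assumed example system, the cross term N being passed via the solver argument s) is meant as an illustration of the procedure:

import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # assumed example: double integrator
B = np.array([[0.0], [1.0]])
Q = np.eye(2)                             # state weight
R = np.array([[1.0]])                     # input weight
N = np.zeros((2, 1))                      # cross term

# Step 1: semipositive definite solution of the algebraic Riccati equation (2.9)
P = solve_continuous_are(A, B, Q, R, s=N)

# Step 2: optimal linear feedback (2.8)
F = -np.linalg.solve(R, B.T @ P + N.T)
print(np.linalg.eigvals(A + B @ F))       # closed loop eigenvalues have negative real part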

The connections between the latter results are visualized in Figure 2.1.

Figure 2.1.: Connection of LQR results (the value function V(x) is of the form (2.6) iff there exists a unique semipositive definite P solving (2.9) (Theorem 2.11); by Theorem 2.9 the LQ problem is null controlling, hence (A, B) is stabilizable; by Theorem 2.10 an optimal stabilizing linear feedback F exists)



Remark 2.13
The state based setting described within this section can be extended to the output based setting.
For this case, we utilize the quadratic cost function
ℓ(y, u) = \begin{pmatrix} y^⊤ & u^⊤ \end{pmatrix} · \begin{pmatrix} Q̃ & Ñ \\ Ñ^⊤ & R \end{pmatrix} · \begin{pmatrix} y \\ u \end{pmatrix}     (2.11)

with Q = C^⊤·Q̃·C and N = C^⊤·Ñ. Given output values y, we obtain that the respective LQ
problem is null controlling if the pair ( A, C ) is observable. In that case, the relations drawn in
Figure 2.1 hold.

2.2. H2 control
In contrast to LQR, which focuses on properties measured within the state space, the H2 formal-
ism considers a frequency domain idea. To get to this idea, we first introduce the 2-norm for
systems.

Definition 2.14 (L2 norm).


Consider a function v : R → Rny . Then we call
1 1
Z∞ ny Z∞
 
2 2

∥ v ∥2 =  ∑ v j (t)2 dt = v(t)⊤ · v(t)dt (2.12)


0 j =1 0

the L2 norm of the function. If

V(s) := v̂(s) = L(v(t)) = ∫_0^∞ exp(−s·t)·v(t) dt,   s = α + iω

denotes the Laplace transform of v, then we call


1 1
Z∞ ny Z∞
 
2 2
1 1
∥V ∥2 = 
2π ∑ |Vj (iω )| dω  =  2π
2 ⊤
V (iω ) · V (iω )dω  (2.13)
−∞ j =1 −∞

the L2 norm of the transform.



Remark 2.15
In the literature, the term L2 space is typically found to be the correct one. Yet, talking about
function which are bounded and analytic in the right half plane and exhibit finite L p norms on the
imaginary axis – which are fundamental for stable function – are called Hardy spaces, the term
H2 norm has become dominant.

By Parseval’s theorem we directly have

Corollary 2.16 (H2 norm equivalence).


Consider a function v : R → Rny and its Laplace transform V := v̂. Then

∥ v ∥2 = ∥V ∥2 (2.14)

holds for the H2 norms.

In order to apply this result, we reconsider our dynamics. For multivariable systems, we know
from control theory that a reformulation via the Laplace transform reveals a transfer matrix con-
necting inputs to outputs. In particular, for our LTI case (1.5)

ẋ(t) = A · x(t) + B · u(t)


y(t) = C · x(t) + D · u(t)

the frequency domain equivalent is given by

G(s) = C · (s·Id − A)^{−1} · B + D.

Computing the solution of the LTI system reveals

y(t) = C·exp(A·t)·x_0 + ∫_0^t H(t − τ)·u(τ) dτ     (2.15)

where H (t − τ ) is the impulse response



H(t) := \begin{cases} C·exp(A·t)·B + D, & \text{if } t ≥ 0 \\ 0, & \text{if } t < 0. \end{cases}

Combined, we obtain the Laplace transform of the impulse response:



Corollary 2.17 (Laplace-transform impulse response).


Consider an LTI system (1.5). Then

G(s) = ∫_0^∞ H(t)·exp(−s·t) dt     (2.16)

represents the transfer matrix of the system.

Now we can apply Corollary 2.16 to our dynamics and see the following:

Theorem 2.18 (H2 norm equivalence for LTI).


Consider an LTI system (1.5) and let G (s) be its Laplace transform. Then we have

∥ G ∥2 = ∥ H ∥2 (2.17)

where by (2.12) we have


1 1
Z∞ ny ny Z∞
 
2 2
 
∥ H ∥2 =  ∑ ∑ | Hjk (t)| 2
dt =  ⊤
tr H (t) · H (t) dt . (2.18)
− ∞ j =1 k =1 −∞

Equation (2.18) allows us to evaluate the H2 norm in frequency domain by means known in the
state domain. To this end, we only require the solution and the respective output, which we get
from (2.15). In particular, for the LTI case we have

∥G∥_2 = ∥H∥_2 = ( ∫_0^∞ tr( ( C·exp(A·t)·B + D )^⊤ · ( C·exp(A·t)·B + D ) ) dt )^{1/2}.     (2.19)

Here, we get the first result for a respective controller:

Theorem 2.19 (H2 stability).


Consider an LTI system (1.5). Then the system is stable iff its H2 norm is finite.
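Numerically, for an asymptotically stable system with D = 0 the integral in (2.19) does not have to be evaluated by quadrature: it equals tr(C·W_c·C^⊤), where the controllability Gramian W_c solves the Lyapunov equation A·W_c + W_c·A^⊤ + B·B^⊤ = 0. The following sketch (Python with SciPy, assumed example system; note the assumption D = 0, since otherwise the H2 norm is infinite) illustrates this:

import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[-1.0, 0.0], [1.0, -2.0]])   # assumed stable example system
B = np.array([[1.0], [0.0]])
C = np.array([[0.0, 1.0]])                  # D = 0 assumed

# controllability Gramian: A Wc + Wc A^T + B B^T = 0
Wc = solve_continuous_lyapunov(A, -B @ B.T)
h2_norm = np.sqrt(np.trace(C @ Wc @ C.T))
print(h2_norm)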

Having defined the connections between the norms, the aim of the H2 controller which we want to
compute is now to minimize the H2 norm of the closed loop. Note that the term associated with the
initial value x0 in (2.15) is constant and can therefore be omitted in an optimization.

Definition 2.20 (H2 problem).


Consider the optimal control problem given by the LTI system (1.5) and the cost functional
from (2.18)

J(x_0, u) := ∥H∥_2^2 = ∑_{j=1}^{n_u} ∫_0^∞ y(t)^⊤·y(t) dt     (2.20)

to be minimized over all u(t) = e j · δ(t) where δ(·) is the Dirac delta function. Then we refer to
this setting as H2 problem.

Within this setting, the input is modeled as noise, which is realized on the j-th input using the
Dirac delta and may occur at any time instant t.

Remark 2.21
If the covariance of the inputs is the identity matrix, then the input can be interpreted as white noise.
Moreover, the result of the H2 control converges in expectation as all frequencies are accounted
for in an equal manner. Therefore, the H2 control shows a stochastic characterization.

Having defined the H2 problem, we can solve it using an identical idea as in the LQR case, that
is to impose an algebraic Riccati equation. In particular, we obtain the following:

Theorem 2.22 (H2 feedback).


Consider the H2 problem and suppose ∥H∥_2 to be finite. Then the solution to the H2 problem

u⋆ (t) = F · x(t, x⋆ , F ) (2.21)

is asymptotically stable with feedback matrix F ∈ R^{n_u×n_x} given by

F = −( D^⊤·D )^{−1} · B^⊤ · P     (2.22)

where P is the unique symmetric positive semidefinite solution of the algebraic Riccati equation

A^⊤·P + P·A − P·B·( D^⊤·D )^{−1}·B^⊤·P + C^⊤·C = 0.     (2.23)

Similar to the LQR case, we again have to be careful to use the positive semidefinite solution of the
algebraic Riccati equation. The approach to evaluate the H2 feedback is almost identical to the
LQR case:

Algorithm 2.23 (Computation of H2 controller)


Consider an H2 problem

min J(x_0, u) = ∥H∥_2^2 = ∑_{j=1}^{n_u} ∫_0^∞ y(t)^⊤·y(t) dt   over all u(t) = e_j·δ(t)     (2.24)
subject to ẋ(t) = A·x(t) + B·u(t),   x(t_0) = x_0
           y(t) = C·x(t) + D·u(t)     (2.25)

to be given. Then we obtain the H2 feedback F via

1. Compute a semipositive definite solution P of the algebraic Riccati equation (2.23)

A^⊤·P + P·A − P·B·( D^⊤·D )^{−1}·B^⊤·P + C^⊤·C = 0.

2. Compute the optimal linear feedback F via (2.22)

F = −( D^⊤·D )^{−1} · B^⊤ · P.
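Analogously to the LQR case, the two steps can be carried out with a standard Riccati solver; a sketch (Python with SciPy, assumed example system in which D^⊤·D is invertible):

import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # assumed example system
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0], [0.0, 0.0]])   # output weights
D = np.array([[0.0], [1.0]])             # D^T D invertible

# Step 1: solve the algebraic Riccati equation (2.23) with Q = C^T C and R = D^T D
P = solve_continuous_are(A, B, C.T @ C, D.T @ D)

# Step 2: H2 feedback (2.22)
F = -np.linalg.solve(D.T @ D, B.T @ P)
print(np.linalg.eigvals(A + B @ F))       # closed loop eigenvalues have negative real part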

Remark 2.24
Note that in the LTI case we have that

J(x_0, u) = ∑_{j=1}^{n_u} ∫_0^∞ y(t)^⊤·y(t) dt
          = ∑_{j=1}^{n_u} ∫_0^∞ ( C·x(t) + D·u(t) )^⊤ · ( C·x(t) + D·u(t) ) dt.

If we choose C = Q^{1/2} and D = R^{1/2}, then we obtain that H2 is a special case of LQR.

2.3. H∞ control
The idea of the H∞ feedback is similar to the H2 feedback. Instead of the L2 norm, where the
aim is to minimize the deviation of the output along the trajectory, in the H∞ case the supremum
norm is used to minimize the highest deviation.

Definition 2.25 (L∞ norm).


Consider a function v : R → Rny . Then we call

∥v∥_∞ = sup_t ∥v(t)∥     (2.26)

the L∞ norm of the function. If

V(s) := v̂(s) = L(v(t)) = ∫_0^∞ exp(−s·t)·v(t) dt,   s = α + iω

denotes the Laplace transform of v, then we call

∥y∥_∞ = sup_v { ∥G(iω)·v∥ / ∥v∥  |  v ≠ 0, v ∈ C^{n_y} }     (2.27)

the L∞ norm of the transform.

Again, the terms L∞ and H∞ are used identically in the literature. In the case of H∞, we will not
go into depth but only highlight connections to H2. The first connection is about conservatism of
the controllers. Since we have
1 1
Z∞ Z∞
 
2 2

∥ G · v ∥2 =  ∥ G (iω ) · v(iω )∥2 dω  =  ∥ G (iω )∥2 · ∥v(iω )∥2 dω 


−∞ −∞
1
Z∞

2

≤ sup (σ ( G (iω ))) ·  ∥v(iω )∥2 dω  = ∥ G ∥∞ · ∥v∥2


ω
−∞

where σ (·) denotes the maximal singular value, we obtain

∥ G · v ∥2
∥ G ∥∞ ≥ ∀v ̸= 0.
∥ v ∥2

This can be interpreted as the concentrated impact of v close to the frequency range of ∥ G ∥∞ .
Hence, the H∞ norm gives the maximum factor by which the system magnifies the H2 norm of
any input. For this reason, ∥ G ∥∞ is also referred to as gain of the system.

Remark 2.26
As a consequence, the H∞ feedback is always more conservative than the H2 feedback as it aims
to hold down the maximal amplification.

Using the H∞ norm, we define the H∞ problem similar to the H2 problem:

Definition 2.27 (H∞ problem).


Consider the optimal control problem given by the LTI system (1.5) and the cost functional
from (2.18)
J(x_0, u) := ∥H∥_∞^2 = sup_t ∑_{j=1}^{n_u} ∥y(t)∥^2     (2.28)

to be minimized over all u(t) = e j · δ(t) where δ(·) is the Dirac delta function. Then we refer to
this setting as H∞ problem.

Regarding the solutions, again an algebraic Riccati equation is employed and we obtain:

Theorem 2.28 (H∞ feedback).


Consider the H∞ problem and suppose ∥H∥_∞ < γ to be finite. Then the feedback

u⋆ (t) = F · x(t, x⋆ , F ) (2.29)

asymptotically stabilizes the system with feedback matrix F ∈ R^{n_u×n_x} given by

F = −( D^⊤·D )^{−1} · B^⊤ · P     (2.30)

where P is the unique symmetric positive semidefinite solution of the algebraic Riccati equation

A^⊤·P + P·A − P·B·( D^⊤·D )^{−1}·B^⊤·P + γ^{−2}·P·B·B^⊤·P + C^⊤·C = 0.     (2.31)
CHAPTER 3

OPTIMAL OBSERVATION

In the previous chapters, we discussed stability as a system property and how we can manage
to ensure that a system is asymptotically stable by computing a feedback law. The feedback,
however, is based on the state of the system x. Since typically not all states are actually measured
but instead only a restricted output y is known, the feedback cannot be evaluated.
To close this gap, in this chapter we shift our focus to the task of estimating the state x based
on the output y. Similar to the LQR approach from Section 2.1, the aim is to derive a method
that provides us with an optimal state estimate x̂(t) ≈ x(t) and can be applied in realtime. The
latter requirement rules out all a posteriori methods minimizing over given data sets and instead
forces a recursive approach. Recursive means that estimates from previous time instances are
re-used and are updated using newly acquired output data. Such methods are typically referred to
as observers or filters.

3.1. Recursive estimation


A typical estimation problem is given by a set of data, a model of a system and a set of parameters
which shall be estimated. To illustrate the impact of the realtime requirement, we consider the
following example.

Task 3.1 (Mean value computation)


Suppose outputs y( j), j = 1, . . . , N to be given. Calculate the mean of the outputs.

Solution to Task 3.1: The estimate of the mean ŷ based on N outputs is given by

    ŷ(N) = (1/N) · ∑_{j=1}^{N} y(j).

The difficulty now arises if another output is available and the mean computation shall be updated.

Task 3.2 (Mean value update)


Consider the result from Task 3.1 to be given and an output y(N + 1) to be available. Compute
the mean of the outputs.

Solution to Task 3.2: Again, the mean is given by

    ŷ(N + 1) = (1/(N + 1)) · ∑_{j=1}^{N+1} y(j).

In this solution, the previous result from Solution 3.1 is not used. While such an approach is nu-
merically robust and requires no further insight, it may be computationally expensive depending
on the number of samples and the complexity of the computation process. Hence, reformulating
the problem such that only the newly required calculations are made, recuperating all the previous
results, may allow us to generate a more efficient solution method.

Task 3.3 (Real mean value update)


Consider the setting of Task 3.2. Reuse the results from Solution 3.1 to compute the mean
value.

Solution to Task 3.3: To recuperate the previous sum, we can equivalently evaluate

    ŷ(N + 1) = (1/(N + 1)) · ∑_{j=1}^{N} y(j) + (1/(N + 1)) · y(N + 1)
             = (N/(N + 1)) · ŷ(N) + (1/(N + 1)) · y(N + 1).

Although this form already meets our requirements of reusing previous computations, it is
possible to rearrange it to a more suitable expression:

    ŷ(N + 1) = ŷ(N) + (1/(N + 1)) · (y(N + 1) − ŷ(N))

Although this expression is very simple, it is very informative because almost every recursive
algorithm can be reduced to a similar form. Based on the latter, the following observations can
be made:

The new estimate ŷ(N + 1) equals the old estimate ŷ(N) plus a correction term, that is
(1/(N + 1)) · (y(N + 1) − ŷ(N)).

The correction term itself consists of two parts: a gain factor 1/(N + 1) and an error term.

The gain factor decreases towards zero as more outputs are already accumulated in the
previous estimate. This means that in the beginning of the experiment, less importance is
given to the old estimate ŷ( N ), and more attention is paid to the new incoming outputs.
When N starts to grow, the error term becomes small compared to the old estimate. The
algorithm relies more and more on the accumulated information in the old estimate ŷ( N )
and it does not vary it that much for accidental variations of the new outputs. The additional
bit of information in the new output becomes small compared with the information that is
accumulated in the old estimate.

The second term y( N + 1) − ŷ( N ) is an error term. It incorporates the difference between
the predicted value of the next output on the basis of the model and the output y( N + 1).

When properly initialized, i.e. ŷ(1) = y(1), the recursive result is exactly equal to the non-recursive
implementation. Moreover, from a numerical point of view, it is a very robust
procedure as calculation errors etc. are compensated in each step.
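As a small numerical illustration of this structure (gain factor times error term), the following snippet compares the recursive update with the batch computation; the data are randomly generated and purely illustrative.

```python
import numpy as np

def recursive_mean(outputs):
    """Recursive update: y_hat(N+1) = y_hat(N) + (y(N+1) - y_hat(N)) / (N+1)."""
    y_hat = 0.0
    for n, y in enumerate(outputs, start=1):
        y_hat += (y - y_hat) / n          # gain factor 1/n times error term
    return y_hat

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=0.5, size=1000)
print(recursive_mean(data), np.mean(data))   # both computations agree
```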

3.2. Transformation of dynamics


To derive the general optimal observation problem, we consider the nonlinear system

    ẋ(t) = f(x(t), u(t), t),   x(t0) = x0
    y(t) = h(x(t), u(t), t).    (1.2)

together with the known control u(t), t ≥ 0, given outputs y(t), t ≥ 0 and an estimate x̂0 of the
unknown initial state x0 .
Depending on the time instant of interest, we can classify the following problem classes:

Definition 3.4 (Filtering).


Consider x(·) to be a state trajectory of a system. Given a specific time instant t, we call the
problem of computing

x(τ ) with τ < t an interpolation problem,

x(τ ) with τ = t a filtering problem, and

x(τ) with τ > t a prediction (or extrapolation) problem.

Within this chapter, we are interested in computing realtime estimates, i.e. τ = t and therefore
work in the area of filtering problems. To solve the latter we apply the ansatz using the so called
estimator dynamics:

Definition 3.5 (Estimator dynamics).


Given a system (1.2), we call

x̂˙ (t) = f (x̂(t), u(t), t) + d(t), x̂(0) = x̂0 (3.1)

estimator dynamics where d : R → Rnx .

Based on the latter, we can quantify the mismatch between estimator and true system:

Definition 3.6 (Error function).


Consider a system (1.2) and an estimator (3.1). Then we call e : R × X → Rnx with

e(t, x̂0 ) := x̂(t, x̂0 , u) − x(t, x0 , u) (3.2)

error function of the estimator.

Similar to the optimal control problem, we can now define the optimal estimation problem. Yet, in
contrast to finding an optimal input u(·), we aim to find an estimator x̂(·) such that the estimation
error (3.2) becomes as small as possible in the sense of a key performance indicator. Moreover,
at time t the estimator shall be computable based on outputs y(τ), 0 ≤ τ ≤ t, known at time t
only.

Similar to the cost function for the control problem where the idea of the cost is to induce stability
via null-controlling, we formulate a cost function for the estimator using the error function. Here,
the idea is to use the null-controlling property to enforce stability of the error function and thereby
convergence of the estimator.

Definition 3.7 (Cost functional).


Consider a key performance criterion ℓ : X × U → R0+ . Then we call

Z∞
J ( x0 , u ) : = ℓ(e(t, x̂0 ), u(t))dt (3.3)
0

cost functional.

This gives us

Definition 3.8 (Optimal estimation problem).


Consider a system (1.2) and a cost functional (3.3). Then we call

    min J(x0, u) = ∫_0^∞ ℓ(e(t, x̂0), u(t)) dt   over all u ∈ U    (3.4)
    subject to e(t, x̂0) := x̂(t, x̂0, u) − x(t, x0, u)
               ẋ(t) = f(x(t), u(t), t),   x(t0) = x0
               x̂˙(t) = f(x̂(t), u(t), t) + d(t),   x̂(0) = x̂0

an optimal estimation problem.

Note that we can use this problem to directly transfer the null controlling property from Corol-
lary 2.6 for stability to observability. In this case, not the system but the error function of the
estimation is stabilized.

Corollary 3.9 (Null controlling observability).


If an optimal estimation problem is null controlling, then the system is observable.

The latter result suggests that the solution of the optimal estimation problem from Definition 3.8
could be identical to the optimal control problem from Definition 2.4. Unfortunately, there are
some slight differences:

1. In the optimal control problem, we consider the state to be stabilized, while in the optimal
estimation problem the error needs to be stabilized.

2. The solution computed by the optimal control problem is the control strategy, which in the
LTI case can be evaluated by a linear feedback law. For the optimal estimation problem,
we aim to compute the current state of the problem.

3. Last, the given data for the optimal estimation problem stems from past measurements,
which cannot directly be incorporated in the forward-in-time formulation of the optimal estimation problem.

In the following, we will address the integration problem of measurements from the past by
converting the optimal estimation problem. Then, similar to LQR, our aim now is to derive a
problem, for which the null controlling property can be shown.

3.3. Kalman filter


We now focus on the LTI case, where not only the dynamics are much simpler, but we can
also derive explicit dynamics for the error function of the estimator. More precisely, for the
LTI case

ẋ(t) = A · x(t) + B · u(t) (3.5a)


y(t) = C · x(t) + D · u(t) (3.5b)

with estimator

x̂˙(t) = A · x̂(t) + B · u(t) + d(t),   x̂(0) = x̂0    (3.6)

we obtain:

Definition 3.10 (Error dynamics).


Given an LTI system (1.5) with estimator dynamics (3.6) we call

ė(t) = A · e(t) + d(t) (3.7a)


ye (t) = C · e(t) (3.7b)

error dynamics.

Remark 3.11
The error dynamics are the dual wrt. the LTI system, cf. Definition 1.31. Hence, stability of the
dual system gives us observability of the primal system.
As a consequence, all the following computations can only be executed if the system ( A, C ) is
observable. Otherwise, no solution can be computed.

Based on the error dynamics, we can integrate the measurements, which are available for past
time instances. Hence, the cost functional we design aims to drive the error to zero but operates
on a time frame, which leads up to the current time instant.

Definition 3.12 (Quadratic cost functional for observability).


We call


    J(x0, d) := ∫_{−∞}^{τ} (C · e(t) − y_e(t))⊤ · Q̂ · (C · e(t) − y_e(t)) + d(t)⊤ · R · d(t) dt    (3.8)

quadratic cost functional for observability where Q̂ ∈ Rny ×ny and R ∈ Rnu ×nu are
(semi)positive definite matrices.

In order to convert the cost functional (3.8) to be in the form (3.3), we apply the following:

Theorem 3.13 (Time transformation).


Consider an LTI system (3.7) with cost functional (3.8) to be given. Given the transformation

    x^τ(t, x0, d) := x(τ − t, x0, d)    (3.9)
    y_e^τ(t) := y_e(τ − t)    (3.10)

the cost functional (3.8) is equivalent to

    J^τ(x0, d) := ∫_0^∞ (C · e^τ(t) − y_e^τ(t))⊤ · Q̂ · (C · e^τ(t) − y_e^τ(t)) + d(t)⊤ · R · d(t) dt    (3.11)

and the respective error dynamics are equivalent to

    ė^τ(t) = −A · e^τ(t) − d(τ − t)    (3.12a)
    y_e^τ(t) = C · e^τ(t).    (3.12b)

Definition 3.14 (Kalman filter problem).


Consider an LTI system (3.7) and outputs y(t), t ∈ (−∞, τ] to be given. Then we call

    min J^τ(x0, d) := ∫_0^∞ (C · e^τ(t) − y_e^τ(t))⊤ · Q̂ · (C · e^τ(t) − y_e^τ(t)) + d(t)⊤ · R · d(t) dt
    over all x0 ∈ X    (3.13)
    subject to ė^τ(t) = −A · e^τ(t) − d(τ − t),   e^τ(t0) = x0
               y_e^τ(t) = C · e^τ(t)

Kalman filter problem.

Now, we can impose the identical quadratic ansatz as in the LQR case

    V(e) = e^τ⊤ · P · e^τ    (3.14)

for P ∈ R^{n_x × n_x}. If this ansatz is valid, we obtain the following:

Theorem 3.15 (Kalman filter).


Consider an LTI system (3.7) with cost functional (3.8) to be given. Then the solution of the
optimal estimation problem is given by

    ė^τ(τ) = A · e^τ(τ) + L · (C · e^τ(τ) − y_e(τ))    (3.15)

where the gain matrix

    L := −S · C⊤ · Q̂    (3.16)

is defined via the solution S of the dual Riccati equation

    A · S + S · A⊤ − S · C⊤ · Q̂ · C · S + D · R^{−1} · D⊤ = 0    (3.17)

and the value function of the optimal estimation problem is given by (3.14) with P := S^{−1}.

Remark 3.16
In (3.15) we obtain the identical structure of the observer, which we designed in Task 3.3 for the
mean value update. For this reason, L is also called gain matrix.

Note that again a solution S of the dual Riccati equation (3.17) is not unique, yet there exists at
most one positive semidefinite S. Combining the latter results, we obtain the following procedure
to compute the Kalman filter:

Algorithm 3.17 (Computation of Kalman filter)


Consider a Kalman filter problem (3.13) to be given. Then we obtain the solution via

1. Compute a positive semidefinite solution S of the dual Riccati equation (3.17)

A · S + S · A⊤ − S · C ⊤ · Q̂ · C · S + D · R−1 · D ⊤ = 0

2. Compute the gain matrix L via (3.16)

L := −S · C ⊤ · Q̂.
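Numerically, the dual Riccati equation (3.17) can be solved with the same routine as in the LQR and H2 case by passing the transposed data. Note that the term D · R^{−1} · D⊤ must be an n_x × n_x matrix for (3.17) to be well defined; in the sketch below we therefore take D and R as n_x × n_x matrices weighting the disturbance d (an assumption made for dimensional consistency), and the remaining data are illustrative only.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative example data (assumption, not from the lecture)
A = np.array([[0.0, 1.0], [-2.0, -0.3]])
C = np.array([[1.0, 0.0]])
Q_hat = np.array([[10.0]])     # output weight
D = np.eye(2)                  # disturbance shaping (assumption: n_x x n_x)
R = np.eye(2)                  # disturbance weight (assumption: n_x x n_x)

# Step 1: dual Riccati equation (3.17), A S + S A' - S C' Q_hat C S + D R^{-1} D' = 0,
# solved with the standard solver applied to the transposed data
S = solve_continuous_are(A.T, C.T, D @ np.linalg.inv(R) @ D.T, np.linalg.inv(Q_hat))

# Step 2: gain matrix (3.16)
L = -S @ C.T @ Q_hat

# Sanity check: the error dynamics matrix A + L C should be Hurwitz
print(np.linalg.eigvals(A + L @ C))
```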

In practice, a Kalman filter is typically updated periodically, i.e. a dynamic for computing the
ansatz matrix S is applied to integrate newly obtained knowledge of outputs. S is also called the
covariance matrix of the system. In the literature, the dynamic of this matrix is split into an
a priori and an a posteriori covariance update as well as a prediction and a correction step of the
error dynamics, cf., e.g., [4]. As we focus on continuous time dynamics, this separation is beyond
the scope of the lecture.
Part II.

Nonlinear systems
CHAPTER 4

DIGITALIZATION

To deal with nonlinear systems, we follow a so called direct approach, which is quite different
from the direct approach we considered in Control engineering 2. Instead of analytically or
structurally dealing with the system or its solution, we first transfer the problem into the sphere
of digital control problems and then apply optimization to compute a control strategy.
In the present chapter, we focus on the first step and digitize the control system. Here, we follow
the most simple approach and consider a so called zero order hold. At this point, we already like
to stress that by definition such a control is not Lipschitz continuous. Hence, the feedback will be
very different from the ones we considered in Control engineering 2 and in particular will not be
in the form of a function. Moreover, we don’t aim to compute a feedback which is stabilizing for
all possible digitizations. Instead, we suppose a sampling to be given and then derive a stabilizing
controller.
To conclude stability of the original system, in the present chapter we additionally discuss how
stability of the digital feedback can be guaranteed for the original system as well. Throughout the
nonlinear part of the lecture, we focus on systems of the form

    ẋ(t) = f(x(t), u(t), t),   x(t0) = x0
    y(t) = h(x(t), u(t), t).    (1.2)
In the upcoming chapters, we will then design methods to compute and evaluate control laws,
which provide us with a stabilizing feedback for the digitized system.

4.1. Zero order hold


The most simple case of a discontinuous feedback is given by the so called zero order hold. The
idea is to sample the input, i.e. to fix a time grid T := {tk } ⊂ R and define the input to be
constant in between two sampling instances tk and tk+1 . Here, we further simplify the setting by
introducing a sampling period T and define the sampling instances to be equidistant, which we
already discussed in
T := {tk | tk := t0 + k · T } ⊂ R. (1.4)

Remark 4.1
There are two more general cases: For one, the sampling times may be defined by a function of
time, or secondly, the sampling times can be defined by a function of the states. The first one is
common in prediction and prescription of systems where actions in the far future are significantly less
important. Hence, one typically trades off exactness of the prediction against computational
complexity. The latter case is referred to as event driven control.

We still like to stress that in applications, the choice of T is not fixed right from the beginning,
but depends on the obtainable solution and stability properties. Note that the result of sampling
the control is not a discrete time system (see Definition 1.7), but a continuous time system (see
Definition 1.5) where the input u is of zero order hold. More formally, we formulate zero order
hold input and solution as a parametrization of operators with respect to T.

Definition 4.2 (Zero order hold).


Consider a nonlinear control system (1.2) and a feedback u : X → U such that ∥u(x)∥ ≤ γ(x)
holds for all x ∈ X and some continuous function γ : X → R. Moreover suppose a sampling
period T > 0 to be given, which defines the sampling times tk = k · T. Then we call the
piecewise constant function

u T (t) ≡ u(x(tk )), t ∈ [ t k , t k +1 ) (4.1)

zero order hold.

Remark 4.3
We like to point out that higher order holds are possible as well. In practice, however, such
higher order holds are not defined using a polynomial approximation but via additional differen-
tial equations for the control itself.

As a consequence of the latter definition, the input u T is not continuous but instead exhibits jumps
at the sampling times tk , cf. Figure 4.1.

Figure 4.1.: Zero order hold sampling (continuous input vs. sampled, piecewise constant input)

Still, the function is integrable, which is a requirement for existence of a solution of (1.2) for such
an input. This observation directly leads to the following:

Definition 4.4 (Zero order hold solution).


Given a nonlinear control system (1.2) and a zero order hold input u T : T → U . Then we call
the function x T : T → X satisfying

ẋ T (t) = f (x T (t), u T (t)) (4.2)

zero order hold solution.

In order to compute such a solution, we can simply concatenate solutions of subsequent sampling
intervals [tk , tk+1 ). Here, we can use the endpoint of the solution on one sampling interval to be
the initial point on the following one. Hence, the computation of x T is well defined, cf. Figure 4.2
for an illustration.
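The concatenation can be implemented directly with any standard ODE solver: on each sampling interval the feedback is evaluated once and held constant. The following sketch is illustrative only; dynamics and feedback are made-up examples and not part of the lecture material.

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(x, u):
    """Example dynamics (assumption): controlled oscillator."""
    return np.array([x[1], -x[0] + (1.0 - x[0]**2) * x[1] + u])

def feedback(x):
    """Example feedback (assumption)."""
    return -2.0 * x[1]

def zoh_solution(x0, T, steps):
    """Concatenate solutions on the sampling intervals [t_k, t_{k+1})."""
    x = np.asarray(x0, dtype=float)
    xs = [x]
    for _ in range(steps):
        u = feedback(x)                                   # evaluated at t_k only
        sol = solve_ivp(lambda t, y: f(y, u), (0.0, T), x, rtol=1e-8)
        x = sol.y[:, -1]                                  # endpoint = next initial value
        xs.append(x)
    return np.array(xs)

print(zoh_solution([2.0, 0.0], T=0.1, steps=50)[-1])
```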

Remark 4.5
Since the system is Lipschitz continuous on each interval [tk , tk+1 ), the solution is also unique.
Figure 4.2.: Zero order hold solution (piecewise constant control u_T and corresponding solution x_T)

Hence, identifying endpoint and initial point of subsequent sampling intervals is sufficient to
show that the zero order hold solution is unique. Yet, as a consequence of this concatenation, the
solution is not differentiable at the sampling points tk .

Remark 4.6
Note that despite u T to be piecewise constant, the zero order hold solution does not exhibit jumps
and shows nonlinear behavior.

4.2. Practical stability


We next introduce the concept of stability, which is equivalent to Definition 1.13. To this end,
we utilize the so called practical KL notation, which extends the standard KL concept using
comparison functions to cases where convergence can only be guaranteed to a certain neighborhood.
For the stability concept, we use the same simplification to shift the operating point (x⋆ , u⋆ ) to
the origin.

Definition 4.7 (Practical stability/controllability).


Consider a nonlinear control system (1.2) with f (0, 0) = 0 and T > 0. Then we call a feedback
u T to semiglobally practically asymptotically stabilize the operating point (x⋆ , u⋆ ) = (0, 0) if
there exists a function β ∈ KL and constants R > ε > 0 such that

∥x T (t)∥ ≤ max{ β(∥x0 ∥, t), ε} (4.3)

holds for all t > 0 and all initial values satisfying ∥x0∥ ≤ R.

Again, the main difference between our setting here and in Control engineering 2 is that we don’t aim
to compute a feedback which is stabilizing for all T ∈ (0, T ⋆ ]. Instead, we suppose a sampling
to be given and then derive a stabilizing controller.

Remark 4.8
The term „semiglobal“ refers to the constant R, which limits the range of the initial states for
which stability can be concluded. The term „practical“ refers to the constant ε, which is a
measure on how close the solution can be driven towards the operating point before oscillations
as in the case of the bang bang controller occur.

Different from the linear case where existence of a feedback and a feed forward control are
equivalent, in the nonlinear case we only have the following:

Lemma 4.9 (Existence of feed forward).


Consider a system (1.2) and let (x⋆ , u⋆ ) be an operating point. If a feedback u : X → U exists
such that the closed loop is asymptotically stable and additionally both the feedback and the
closed loop are Lipschitz, then there exists a feed forward u : T → U such that the system is
asymptotically controllable.

As a direct conclusion of Definition 4.7, we can apply Lemma 4.9 and obtain:

Corollary 4.10 (Existence of practically stabilizing feed forward).


Consider a nonlinear control system (1.2) with f (0, 0) = 0 and suppose a feedback u T ,
T > 0 to exist, which semiglobally practically asymptotically stabilizes the operating point
(x⋆ , u⋆ ) = (0, 0). Then there exists a feed forward u : T → U such that the system is practically
asymptotically controllable.

Definition 4.7 also shows the dilemma of digital control using fixed sampling periods: Both close
to the desired operating point and for initial values far away from it, the discontinuous evaluation
of the feedback u_T leads to a degradation of performance. Close to the operating point, a slow
evaluation leads to overshoots despite the dynamics typically being rather slow. Far away from
the operating point, the dynamics may be too fast to be captured in between two sampling points,
which leads to unstable behavior.
Still, it may even be possible to obtain asymptotic stability (not only practical asymptotic stability)
using fixed sampling periods T as shown in the following task:

Task 4.11
Consider the system

    ẋ1(t) = (−x1(t)² + x2(t)²) · u(t)
    ẋ2(t) = (−2 · x1(t) · x2(t)) · u(t).

Design a zero order hold control such that the system is practically asymptotically stable.

Solution to Task 4.11: We set

    u_T(t) = 1   if x1 ≥ 0,     u_T(t) = −1   if x1 < 0.

For this choice, the system is globally asymptotically stable for all T > 0 and even independently
of T. The reason for the latter is that the solutions never cross the switching line
x1 = 0, i.e. the applied input is constant along each solution, which renders the feedback
independent of T.

As described before, the behavior observed in Task 4.11 is the exception. In practice, the limitations
of semiglobality and practicality are typically the best we can expect for zero order hold inputs
of nonlinear systems.

4.3. Existence of stabilizing feedback


In order to show that a stabilizing zero order hold input exists, we utilize the concept of Control-
Lyapunov functions, which extend the standard Lyapunov approach.

Definition 4.12 (Practical Control-Lyapunov functions).


Consider a nonlinear control system (1.2) with operating point (x⋆ , u⋆ ) = (0, 0) such that
f (x⋆ , u⋆ ) = 0 and a neighborhood N (x⋆ ). Then the continuous function VT : Rnx → R0+
is called a semiglobal practical Control-Lyapunov function if there exist constants R̂ > ε̂ > 0 as
well as functions α1 , α2 , ∈ K∞ and a continuous function W : X → R+ \ {0} such that there
exists a control function u satisfying the inequalities

α1 (∥x∥) ≤ VT (x) ≤ α2 (∥x∥) (4.4)


inf VT (x T (tk+1 )) ≤ max {VT (x T (tk ) − T · W (x T (tk )), ε̂)} (4.5)
u∈U

for all x ∈ N \ {x⋆ } with VT (x) ≤ R̂.

The latter definition extends the concept of a Control-Lyapunov function in various ways. For
one, as the zero order hold solution is not differentiable, we can no longer assume VT to be
differentiable. Hence, the formulation of decrease in energy in inequality (4.5) is given along
a solution instead of its vector field. Moreover, the ideas of semiglobality and practicality are
integrated.

Remark 4.13
Comparing Definition 4.12 to Definition 4.7, we can identify the similarity of semiglobality be-
tween the constants R and R̂ as well as ε and ε̂. The difference between these two pairs lies in
their interpretation: For KL function, we utilize the state space, whereas for Control-Lyapunov
functions the energy space is used. Hence, both values are a transformation of one another using
the Control-Lyapunov function VT .

Now, supposing that a practical Control-Lyapunov function exists, we can directly derive the
existence of a zero order hold control.

Theorem 4.14 (Existence of feedback).


Consider a nonlinear control system (1.2) with operating point (x⋆ , u⋆ ) = (0, 0) such that
f(x⋆, u⋆) = 0 and T > 0. Let V_T be a semiglobal practical Control-Lyapunov function.
Then the minimizer

    u_T(t) := argmin_{u∈U} V_T(x_T(t_{k+1}))    (4.6)

is a semiglobally practically asymptotically stabilizing feedback.



Note that in (4.6), the right hand side depends on u implicitly as x T (tk+1 ) is defined using the
initial value x T (tk ) and the zero order hold input u. Hence, the definition (4.6) is proper.

Remark 4.15
The transfer from infimum in (4.5) to minimum in (4.6) is only possible as the input is constant in
between two sampling instances tk and tk+1 and therefore the solution x T (·) is continuous with
respect to u.

Unfortunately, the pure existence of a feedback does not help us in computing it. Additionally,
we still require the existence of a practical Control-Lyapunov function to conclude existence of
such a feedback. Here, we first address existence of a Control-Lyapunov function, for which the
following is known from the literature:

Theorem 4.16 (Existence of practical Control-Lyapunov function).


Consider a nonlinear control system (1.2) with operating point (x⋆ , u⋆ ) = (0, 0) such that
f(x⋆, u⋆) = 0. If the system is asymptotically controllable, then there exists a semiglobal practical
Control-Lyapunov function.

The most important aspect of Theorem 4.16 is the requirement regarding the control system. The
result only requires the system to be asymptotically controllable, i.e. controllability without digitalization.

4.4. Intersample behavior


Unfortunately, the results only hold true for the digitized system, i.e. only for time instances
tk ∈ T . The behavior of the system between these instances is called intersample behavior and
can be estimated using properties of the system dynamics. The main tool is the so called uniform
boundedness.

Definition 4.17 (Uniform boundedness).


Consider a nonlinear control system (1.2) with operating point (x⋆, u⋆) = (0, 0) together with
an input u_T : T → U. If there exists a function γ ∈ K and a constant η > 0 such that for all x ∈ X
with ∥x∥ ≤ η, the solutions exist on [0, T ] and the solutions satisfy

∥x T (t)∥ ≤ γ(∥x∥) (4.7)

for all t ∈ [0, T ] then the solutions are called uniformly bounded over T.

Using boundedness, it can be shown that the system will stay bounded in between sampling
instances.

Theorem 4.18 (Asymptotic stability and uniform boundedness over T).


Consider a nonlinear control system (1.2) with operating point (x⋆, u⋆) = (0, 0) together with
an input u_T : T → U. Then the system is semiglobally practically asymptotically stable iff
there exists a semiglobally practically asymptotically stabilizing feedback u T : T → U and the
solutions x T : T → X are uniformly bounded over T.

Concluding, if we can compute a semiglobally practically asymptotically stabilizing feedback
law for the discrete time system induced by the sampled data system, then the digitized continuous
time closed loop is also semiglobally practically asymptotically stable provided its solutions are
uniformly bounded over T.

In practice, however, the two tasks of deriving the feedback u_T and the Control-Lyapunov function V_T
are often carried out in reverse order. To this end, first a feedback u_T is derived, and then the
inequality (4.5) is shown to hold for this feedback

    V_T(x_T(t_{k+1})) ≤ max{ V_T(x_T(t_k)) − T · W(x_T(t_k)), ε̂ }.

The reason for using such a procedure is that Theorem 4.14 only requires a Control-Lyapunov
function for fixed R̂, ε̂ to exist for some T0 > 0 in order to conclude existence also for all
smaller sampling periods. Hence, if we find a constructive way to derive a feedback, then a
practical Control-Lyapunov function can be derived and stability properties of this feedback can
be concluded.
In the following chapters, we now focus on constructing such a feedback. To simplify the respec-
tive notation, we utilize the discrete time notation

    x(k + 1) = f(x(k), u(k), k),   x(0) = x0
    y(k) = h(x(k), u(k), k).    (1.3)

introduced in Definition 1.7. To this end, we assume that the differential equation is solved to
compute the state x(k + 1) based on the continuous time dynamics (1.2) and the zero order hold
control u(t) := u T (t) =: u(k ).
CHAPTER 5

MODEL PREDICTIVE CONTROL

Based on the previous Chapter 4 on digitalization, we now discuss one approach to compute a
zero order hold feedback for a nonlinear system. The approaches we considered so far are based
on the analytical solution of an optimal control problem using the Riccati approach for a quadratic
optimal value function ansatz V(x) = x⊤ · P · x. However, as soon as the cost is nonquadratic,
the dynamics are nonlinear or state and control constraints are introduced, the value function V
is no longer quadratic and the approach is in general no longer applicable. The same holds for the
optimal feedback law, which is typically a rather complicated function for which already the
storage poses problems and limits such approaches to low dimensions. Moreover, the approach is
only capable of computing a Lipschitz continuous feedback. Yet if no continuous feedback exists,
by controllability we know that some kind of control exists, for which stability can be shown, e.g.
a discontinuous one.
The model predictive control approach takes a step back from optimality over an infinite horizon
by approximating it via a series of finite horizon problems. The purpose of the present chapter is
twofold: For one, we discuss the construction of a basic MPC algorithm and the interplay of the
building blocks as outlined in Figure 5.1. Thereafter, we show how a feedback can be constructed
from such an approach and how stability of the closed loop can be guaranteed.

5.1. Introduction of constraints


In the previous chapters, we considered systems operating in sets such as the state set X , the
control set U and the output set Y . We then refined this general class of systems given in Defi-
nition 1.1 for continuous time systems (1.2) and discrete time systems (1.3) which led us to the
term state space, control space and output space.
For designing the LQR, H2 and H∞ controllers, we implicitly assumed that these spaces are

Figure 5.1.: Building blocks within the MPC Algorithm 5.9 (MPC, digitalization, simulation and optimization)

unbounded. In practical applications, however, we often face the problem that requirements need
to be met. To illustrate this point, we consider the following:

Task 5.1
Consider a supply chain as a multi stage network driven by the dynamics

    ṡ^p(t) = f_s(a^p(t), ℓ^p(t))        (Stock)
    ȯ_u^p(t) = f_o(o^p(t), a^p(t))      (Unfulfilled order to stock)
    ḃ^p(t) = f_b(d^p(t), ℓ^p(t))        (Backlog from stock)

where p ∈ S = {S, M, R} denotes the stages, cf. Figure 5.2. Typically, the stage set
contains supplier (S), manufacturer (M) and retailer (R). Moreover, a p , ℓ p , o p and d p
denote the arriving and leaving as well as the order and demand rates. Formulate the basic
constraints such a system needs to obey in order to be physically meaningful.

Solution to Task 5.1: For all times t ≥ 0 and stages p ∈ S, the system is subject to the
constraints

    0 ≤ o^p(t) ≤ o^p_max,          0 ≤ s^p(t) ≤ s^p_max,
    0 ≤ o_u^p(t) ≤ o^p_{u,max},    0 ≤ b^p(t) ≤ b^p_max

as well as unknown customer orders o^C and fixed delivery delays τ_ij, where i, j ∈ S represent
consecutive stages. The stages need to be linked since arrival/leaving as well as demand/order

Figure 5.2.: Sketch of a three stage supply network (supplier, manufacturer, retailer and customer, linked via arrival/leaving rates, order/demand rates and delivery delays)

information is required to evaluate the dynamics. Here, we use a^j(t + τ_ij) = ℓ^i(t) and
d^j(t) = o^i(t) for consecutive nodes i, j ∈ S and a^i(t + τ_ii) = o^i(t) for the supplier to define
these connections. The state of each stage can be defined via x^p := (s^p, o_u^p, b^p)⊤.

Hence, constraints arise naturally in practical problems as states need to be bounded, e.g. to
prevent the system from collapsing or hitting physical barriers, or the controls need to be bounded,
e.g., for energy reasons or actuator limitations, or outputs need to be bounded, e.g., due to sensor
limitations. To address these requirements formally, we define constraints for our system as
follows.

Definition 5.2 (Constraints).


Given the state, control and output sets X , U and Y , we call X ⊂ X state constraints, U ⊂ U
control constraints and Y ⊂ Y output constraints.

We like to stress that constraints always cause trouble in numerical computations. For this
reason, in many applications constraints are not formulated „hard“, that is as constraints that must
be satisfied, but instead „soft“ by adding a penalty term for their violation as a KPI to the cost
function.

Remark 5.3
Note that by definition soft constraints may be violated. Hence, such an approach is not applicable
for safety critical constraints.

Alternatively, modelers can focus on circumventing the usage of constraints as outlined in the
following task:

Task 5.4
Model cars going from an initial point x0 ∈ R2 to a target point x⋆ ∈ R2 via routing points
x j ∈ R2 , j = 1, . . . , M as illustrated in Figure 5.3 using a one dimensional system only.

Figure 5.3.: Definition of the driving path via splines for given routing points

Solution to Task 5.4: Define the route of each vehicle via routing points via interpolation
by splines. The car is then controlled along the arc of the spline. Then, we create a one
dimensional dynamics via the velocity along the arc length as a control.
To formalize this approach, we call M ∈ N the number of routing points. Denoting the
entire arc length by L, the routing points are interpolated via the cubic spline

    S(ℓ) = (S_x(ℓ), S_y(ℓ))⊤,   0 ≤ ℓ ≤ L,

which is parametrized by ℓ representing the position on the arc. The arc length is approximated by

    ℓ0 := 0,    ℓ_{j+1} := ℓ_j + sqrt((x_{j+1} − x_j)² + (y_{j+1} − y_j)²),    L := ℓ_M.

Last, we re-obtain the parametrized driving route via

    (x(ℓ), y(ℓ))⊤ := (S_x(ℓ), S_y(ℓ))⊤   for 0 ≤ ℓ ≤ L.

The spline gives us the route of each car, and its velocity is the time derivative of the current
position on the arc. Hence, driving along the route is equivalent to solving the initial value
problem

ℓ̇(t) = u(t), ℓ(0) = 0

where t denotes time and u(t) represents the velocity of the car at time instant t. By choosing
the velocity u ∈ U we can control the car along the route. The corresponding position at
time instant t is given by
    (x(ℓ(t)), y(ℓ(t)))⊤ = (S_x(ℓ(t)), S_y(ℓ(t)))⊤.
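A possible implementation of this parametrization is sketched below; the routing points are hypothetical, the spline interpolation is taken from SciPy, and ℓ̇(t) = u(t) is integrated with a simple explicit Euler step.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical routing points (x_j, y_j), j = 0, ..., M
pts = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 0.3], [3.0, 1.0], [4.0, 1.2]])

# Approximated arc length parameters l_0, ..., l_M and total length L
l = np.concatenate(([0.0], np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))))
L = l[-1]

# Cubic splines S_x and S_y interpolating the routing points over the arc length
Sx, Sy = CubicSpline(l, pts[:, 0]), CubicSpline(l, pts[:, 1])

# One-dimensional dynamics l'(t) = u(t), integrated with explicit Euler
dt, ell, path = 0.05, 0.0, []
while ell < L:
    u = 1.5                                          # velocity along the arc (control input)
    ell = min(ell + dt * u, L)
    path.append((float(Sx(ell)), float(Sy(ell))))    # position on the driving route

print(path[-1])   # ends at the last routing point
```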

Remark 5.5 Note that deriving the routing points in Task 5.4 is a different and decoupled
problem, which may be solved by a traffic guidance system. For simplicity, the center of
the traffic lane can be chosen. Regarding a production process or a single machine, these
routing points can be regarded as a feedforward control.

Instead of the velocity along the route, we could also use the acceleration or jerk. These
choices result in a differential equation of higher order. Additionally, the bounds on the
velocity are then state constraints, which drastically increase the complexity of the problem.

As mentioned before, we could also impose more complex models for each car and the
respective dynamics. However, these models would lead to an increase in the computational
cost. Since the modeled arcs are locally controlled by sublayer controllers of the car, these
arcs represent reality closely enough. Hence, such an approach is more efficient.

5.2. MPC approach


Having defined constraints, we can now generalize the setting from Chapter 2 to a nonlinear
constrained optimal control problem. Note that in Definition 2.4, we used the general nonlinear

form, which we later specified to LTI systems to discuss the LQR, H2 and H∞ controller.
Formally, we obtain

Definition 5.6 (Constrained optimal control problem).


Consider a system (1.2) and a cost functional (2.1). Then we call

    min J(x0, u) = ∫_0^∞ ℓ(x(t, x0, u), u(t)) dt   over all u ∈ U^∞    (5.1)
    subject to ẋ(t) = f(x(t), u(t), t),   x(t0) = x0
               x(t) ∈ X,   t ∈ [0, ∞)

a constrained optimal control problem. The function

    V(x0) := inf_{u∈U} J(x0, u)    (5.2)

is called optimal value function.

Since the continuous time formulation allows for infinitely many control changes, it is computationally
difficult or even intractable to solve. Additionally, actuators work in a sampled manner,
hence such a control is also not usable in practice. To address these issues, we apply the
following adaptations:

By applying digitalization, we can shift the problem to the discrete time formulation solv-
ing the sampling issue. Moreover, digitalization allows us to decouple optimization and
simulation.

Cutting the infinite horizon to a finite one allows us to address the computational issue. For
one, simulation techniques for digitalized or discrete time systems are very effective, and
secondly, optimization methods for finitely many inputs are well developed.

These are the ingredients linked in Figure 5.1, which allow us to divide the control problem (5.1)
accordingly. To formalize this procedure, we first introduce the following:

Notation 5.7 (Open/closed loop index)


In the context of MPC we denote the closed loop time index by n and the open loop time index
by k. Moreover, we denote the open loop horizon by N.

Now, the subproblems to solve take the following form:



Definition 5.8 (Digital constrained optimal control problem).


Consider a constrained optimal control problem (5.1). Applying digitalization, zero order hold
and horizon cutting, we call

N −1
min J (x0 , u) = ∑ ℓ(x(k, x0 , u), u(k)) over all u ∈ U N (5.3)
k =0
subject to x(k + 1) = f (x(k ), u(k), k), x (0) = x0
x(k ) ∈ X, k ∈ [0, N ]

a digital finite constrained optimal control problem.

While the problem is solvable now, it does not give us a solution of the original problem. To still
be able to at least approximate such a solution, MPC can be used. The idea of MPC is to split up
the problem over time and to only consider time windows on which the problem is to be solved.
This goes hand in hand with the digitalization idea, and the time windows are constructed such
that each window starts at a sampling instant. To capture long term system behavior, the length of
the time windows is longer than one sampling period and measured in multiples of the sampling
period. As the solution on the time window is longer than required, only a fraction of the solution is
applied.
Combined, MPC is a three step scheme:

Algorithm 5.9 (Basic MPC Algorithm)


For each closed loop time index n = 0, 1, 2 . . .:

(1) Obtain the state x(n) ∈ X of the system.

(2) Set x0 := x(n), solve the digital finite optimal control problem (5.3) and denote the ob-
tained optimal control sequence by u⋆ (·) ∈ U N .

(3) Define the MPC feedback µ N (x(n)) := u⋆ (0).
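To make the three steps concrete, the following sketch implements Algorithm 5.9 for the sampled car model from Task 5.15 with a quadratic stage cost. The cost, horizon and solver choice are assumptions made for illustration; state constraints are omitted for brevity and only the control constraints U = [−1, 1] are imposed.

```python
import numpy as np
from scipy.optimize import minimize

def f(x, u):
    """Sampled car model from Task 5.15."""
    return np.array([x[0] + x[1] + u / 2.0, x[1] + u])

def stage_cost(x, u):
    return x @ x + 0.1 * u**2

def mpc_feedback(x0, N=10):
    """Step (2) of Algorithm 5.9: solve the finite horizon problem (5.3) for x0."""
    def J(u_seq):
        x, cost = np.array(x0, dtype=float), 0.0
        for u in u_seq:
            cost += stage_cost(x, u)
            x = f(x, u)
        return cost
    res = minimize(J, np.zeros(N), method="SLSQP",
                   bounds=[(-1.0, 1.0)] * N)        # control constraints U = [-1, 1]
    return res.x[0]                                 # step (3): apply only the first element

# Closed loop: steps (1)-(3) repeated for n = 0, 1, 2, ...
x = np.array([0.6, 0.4])
for n in range(20):
    x = f(x, mpc_feedback(x))
print(x)                                            # approaches the origin
```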

While easily accessible and adaptable, the method behind Algorithm 5.9 exhibits some severe
flaws that need to be considered before putting it into practice:

1. Cutting the horizon to N < ∞ may result in infeasibility of the problem at closed loop time
indexes n > 0. A simple example is a car driving towards a wall. If the prediction horizon
is too small, the car is unable to stop before hitting the wall. Mathematically speaking, no
solution can be found satisfying all constraints. We address this issue in Section 5.3 and
show how feasibility can be guaranteed recursively.

2. Cutting the horizon may also result in destabilizing the system. Again, we can use the
car/wall example and put the target point behind the wall, i.e. the car needs to go around
the wall. If the wall is long compared to the prediction horizon, the car will not be able to
„see“ a possibility of going around the wall and stop in front of it. Hence the system is not
asymptotically stable. In Section 5.4, we address this issue using three different strategies.

5.3. Recursive feasibility


From the discussion above on existence of a solution throughout the MPC iterations we obtain
that we require for each n

existence of a solution for problem (5.3) at closed loop index n and

guarantee that the subsequent problem (5.3) at closed loop index n + 1 exhibits a solution.

Remark 5.10
At this point, we want to stress the fact that loss of feasibility is due to the method of MPC,
i.e. the cutting of the horizon. This problem does not exist for the original constrained optimal
control problem (5.1). However, if the latter does not exhibit a solution, then it is not possible to
approximate such a non-existing solution using MPC.

The first property is referred to as feasibility, the second as recursive feasibility. To formalize
these properties, we first introduce the following:

Definition 5.11 (Admissibility).


Consider a discrete time control system (1.3) with state and input constraints X ⊂ X and U ⊂ U .

The states x ∈ X are called admissible states and the inputs u ∈ U(x) are called admissi-
ble inputs for x. The elements of the set {(x, u) | x ∈ X, u ∈ U(x)} are called admissible
pairs.

For N ∈ N and initial value x0 ∈ X we call an input sequence u ∈ U N and the corre-
sponding trajectory xu (k, x0 ) admissible for x0 up to time N if
the running time constraint

(xu (k, x0 ), u(k)) are admissible pairs ∀k = 0, . . . , N − 1


5.3. R ECURSIVE FEASIBILITY 63

and the terminal constraint

xu ( N, x0 ) ∈ X

hold. We denote the respective set of admissible sequences by U^N_X(x0).

An input sequence u ∈ U^∞ and the corresponding trajectory x_u(k, x0) are called admissible
for x0 if they are admissible for x0 up to every time N ∈ N. We denote the set of admissible
input sequences for x0 by U^∞_X(x0).

A feedback µ : X → U is called admissible if µ(x) ∈ U^1_X(x) holds for all x ∈ X.

We like to note the slight difference between U(x) and U^1_X(x): By definition of admissibility for x
up to time 1, we have that

    U^1_X(x) := {u ∈ U(x) | f(x, u) ∈ X} ⊂ U(x).

This is essential especially for our definition of an admissible feedback, which ensures exactly
that.

Remark 5.12
Note that even if U(x) = U is independent of the actual state x, the set U N (x) may still depend
on x for some or all N ∈ N.

The property of admissibility is defined on sequences of states and inputs, yet not on the problem.
We now use admissibility to formalize the problem property of feasibility:

Definition 5.13 (Feasibility).


Consider a digital finite constrained optimal control problem (5.3).

We call an initial condition x0 ∈ X feasible for (5.3) if U^N_X(x0) ≠ ∅.

The MPC Algorithm 5.9 is called recursively feasible on a set A ⊂ X if each x ∈ A is
feasible for (5.3) and x ∈ A implies f(x, µ_N(x)) ∈ A.

In order to guarantee that Algorithm 5.9 is recursively feasible, the so called viability assumption
can be used.

Theorem 5.14 (Recursive feasibility and admissibility).


Consider the MPC Algorithm 5.9. If the viability assumption

∀x ∈ A ⊂ X : ∃u ∈ U(x) such that f (x, u) ∈ A ⊂ X (5.4)

holds, then the MPC Algorithm 5.9 is recursively feasible on A and the pairs
(xµ N (n), µ N (xµ N (n))) as well as the feedback µ N are admissible for all n ∈ N.

We like to point out that the viability assumption (5.4) looks simple, yet in practice it is rather
difficult to identify the set A.

Task 5.15
Consider the sampled data model of a car

    x(k + 1) = ( x1(k) + x2(k) + u(k)/2,  x2(k) + u(k) )⊤

on a one dimensional road with position x1 , speed x2 and piecewise constant acceleration u.
Assume all variables to be constrained to [−1, 1]. Compute the set A.

Solution to Task 5.15: Using the dynamics and the extreme values x1 = x2 = 1 we obtain

    x1(k + 1) = x1(k) + x2(k) + u(k)/2 ≥ 3/2 > 1

for any u ∈ U = [−1, 1]. Hence, such a state is not recursively feasible. Via elementary
computations, we can define

    A := { x ∈ R² | x1 ∈ [−1, 1], x2 ∈ [−1, 1] ∩ [−3/2 − x1, 3/2 − x1] }

for which the choice

    u := 1          if x2 < −1/2,
    u := −2 · x2    if x2 ∈ [−1/2, 1/2],
    u := −1         if x2 > 1/2

satisfies u ∈ [−1, 1] and f (x, u) ∈ A.
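The viability of the set A under the proposed feedback can also be verified numerically on a grid, as in the following sketch (a simple check, not a proof; the tolerance only guards against floating point rounding):

```python
import numpy as np

def in_A(x, tol=1e-9):
    """Membership test for the set A from the solution above."""
    x1, x2 = x
    return (-1.0 - tol <= x1 <= 1.0 + tol
            and max(-1.0, -1.5 - x1) - tol <= x2 <= min(1.0, 1.5 - x1) + tol)

def u_viability(x):
    """Feedback proposed in the solution above."""
    x2 = x[1]
    return 1.0 if x2 < -0.5 else (-1.0 if x2 > 0.5 else -2.0 * x2)

def f(x, u):
    """Sampled car dynamics from Task 5.15."""
    return (x[0] + x[1] + u / 2.0, x[1] + u)

# Check f(x, u(x)) in A for all grid points x in A
grid = np.linspace(-1.0, 1.0, 81)
violations = sum(1 for x1 in grid for x2 in grid
                 if in_A((x1, x2)) and not in_A(f((x1, x2), u_viability((x1, x2)))))
print(violations)   # expected: 0
```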



Figure 5.4 illustrates the viability condition (blue) in comparison to the state constraints (black),
where the difference occurs in the encircled regions (red).

Figure 5.4.: Sketch of a viability set

In practice, we are interested in computing a feedback which is not only admissible, but also
asymptotically stabilizes our system.

5.4. Stability conditions


To guarantee stability of the closed loop using the MPC feedback computed via Algorithm 5.9,
there are three different ideas in the literature. Two of them include the usage of so called terminal
conditions, that is conditions imposed to the end point of the open loop prediction horizon used
within MPC, and one based on Lyapunov functions. Here, we will not go into details regarding
the specifics of these methods, but discuss them from an application point of view.
Terminal conditions are conditions, which are added to the problem (5.3) at open loop time instant
k = N.

Remark 5.16
Note that as terminal conditions alter the problem, the solutions of the problem are in general
different.

The first approach uses so called terminal constraints:



Definition 5.17 (Terminal constraints).


Consider a digital finite constrained optimal control problem (5.3). Then we call

xu ( N, x0 ) ∈ X0 (5.5)

terminal constraint and X0 ⊂ X terminal constraint set.

The idea of terminal constraints is straightforward: By imposing a terminal constraint set, the set
of admissible pairs is limited, i.e. the set of initial values and controls to be chosen is reduced.
Hence, it is no longer necessary to compute the set A from the viability condition, but it is
implicitly imposed by choosing the right terminal conditions.

Remark 5.18
The right choice for terminal conditions can be made using ideas such as linearization around the
operating point x⋆. From Control engineering 2 we then know that there exists a linear feedback
such that the terminal constraint set is recursively feasible.

In fact, we obtain the following restriction:

Definition 5.19 (Feasibility set).


Consider a digital finite constrained optimal control problem (5.3) together with terminal con-
straint (5.5). Then we call
    X_N := { x0 ∈ X | ∃ u ∈ U^N(x0) : x_u(N, x0) ∈ X0 }    (5.6)

feasible set for horizon N and

    U^N_{X_N}(x0) := { u ∈ U^N(x0) | x_u(N, x0) ∈ X0 }    (5.7)

set of admissible control sequences for horizon N.

Combining terminal constraints and the MPC algorithm, we obtain the following:

Corollary 5.20 (Feasibility).


Consider the MPC Algorithm 5.9. For each x0 ∈ X N we have

f (x, µ N (x)) ∈ X N −1 . (5.8)



Based on the latter, we directly obtain:

Theorem 5.21 (Recursive feasibility using terminal constraints).


Consider the MPC Algorithm 5.9 with terminal constraint (5.5). Then the MPC Algorithm is
recursively feasible.

If we additionally know that for the region defined by the terminal constraint (5.5) there exists an
asymptotically stabilizing feedback, then the following can be concluded:

Theorem 5.22 (Asymptotical stability using terminal constraints).


Consider the MPC Algorithm 5.9. Suppose a terminal constraint (5.5) to be imposed on
problem (5.3) and furthermore an asymptotically stabilizing feedback to exist for all x ∈ X0. Then
the MPC Algorithm asymptotically stabilizes the system (1.3).

While being simple to use, the limitation of terminal constraints is the reduction of the set of admissible
controls, which can only be mitigated by enlarging the prediction horizon N. Since the latter
induces high computing times, it would be much simpler to increase the size of the terminal
constraint set, which stands at the center of the second approach.
Different from terminal constraints, the second approach appends a terminal cost to the cost
function in problem (5.3). The intention is to enlarge the terminal constraints by including costs
arising for the cutoff horizon [ N, ∞). These terminal costs are defined as follows:

Definition 5.23 (Terminal costs).


Consider a digital finite constrained optimal control problem (5.3). Then we call a function
L : X → R0+ terminal cost if it is added to the cost function of problem (5.3)

    min J(x0, u) = ∑_{k=0}^{N−1} ℓ(x(k, x0, u), u(k)) + L(x_u(N, x0)).    (5.9)

Again, we obtain asymptotic stability using the existence of an asymptotically stabilizing feed-
back in the terminal constraint set:

Theorem 5.24 (Asymptotical stability using terminal costs).


Consider the MPC Algorithm 5.9. Suppose a terminal constraint X0 and terminal costs L(·) to
be imposed on problem (5.3) and furthermore an asymptotically stabilizing feedback to exist for
all x ∈ X0 . Then the MPC Algorithm is recursively feasible and asymptotically stabilizes the
system (1.3).

The last idea to guarantee asymptotic stability of the MPC closed loop utilizes a control-Lyapunov
function based approach. Here, we can directly utilize the MPC formulation to check the require-
ments of Definition 4.12 for practical control-Lyapunov function:

Theorem 5.25 (Asymptotical stability using suboptimality).


Consider the MPC Algorithm 5.9 and suppose the viability condition (5.4) to hold. If there exists
a function V : X → R0+ such that there exist functions α1 , α2 ∈ K∞ and a constant α ∈ (0, 1]
such that

α1 (∥x − x⋆ ∥) ≤ V (x) ≤ α2 (∥x − x⋆ ∥) (5.10)


V (x) ≥ αℓ(x, µ N (x)) + V ( f (x, µ N (x))) (5.11)

holds, then the MPC Algorithm is recursively feasible and asymptotically stabilizes the sys-
tem (1.3).

The intention of the last approach is to avoid constructing terminal constraints or costs and to
also avoid alteration of the original control problem. While being technically simple to monitor,
conditions (5.10)–(5.11) are very hard to check analytically. For further details, we refer to [2].
From an energy point of view, the conditions of Theorem 5.25 state that energy is continuously
drawn from the system, hence any trajectory is driven towards the operating point x⋆ . Yet, it is
not equivalent to the standard notion of a Lyapunov function, which uses α = 1. The parameter α can
be interpreted as a measure of suboptimality, i.e. the tradeoff in optimality we have to accept for
cutting the horizon and making the problem computationally tractable.
CHAPTER 6

DISTRIBUTED CONTROL

So far, we considered systems and processes, for which one control unit can be used. In practice,
however, this may in some cases not be possible. For one, we may face the problem that a system
is either too large/complex such that it needs to be split up into smaller but possibly connected
problems. Examples for such systems are chemical plants, supply chains or production lines.
These problems also exist on a pure software level, e.g. in robotic process automation. Secondly,
there also exist problems which are naturally split. Such problems arise, e.g., if two units need to
work in a joint area. Examples range from autonomous cars to robots and companies working on
a seller/buyer basis.
In the present chapter, we focus on approaches using MPC to address such problems. Here, we
particularly focus on three ideas which allow us to split up or respectively keep the splitting while
still addressing the overall control problem. In Figure 6.1, we highlight that the connection we
seek is situated on the MPC level.

Figure 6.1.: Building blocks within the MPC Algorithm 5.9 (several MPC controllers connected with each other)

To this end, we consider three basic approaches. The first approach follows a first come first serve
principle where one controller takes its ground and the rest have to use the remaining opportuni-
ties. In the second approach, we highlight which of the systems do not interfere with one another,
i.e. which systems can actually work in parallel, and which cannot. The last and most insight-
ful approach considers a full parallelization of all systems. As we will see, the communication
requirements and also the information to be exchanged varies between these approaches.

Remark 6.1
At this point we like to stress that we focus on systems which are split in the space/control domain.
It is additionally possible to tackle complexity also via a timewise split of systems. To this end,
not the states are separated, but the prediction horizon of the system. From a system theory point
of view, the splits are rather similar, yet require a PDE perspective.

6.1. Separation of systems


Instead of considering only one system

    x(k + 1) = f(x(k), u(k), k),   x(0) = x0
    y(k) = h(x(k), u(k), k).    (1.3)

in this chapter we omit time variability and the output and consider a set of systems

    x^p(k + 1) = f^p(x^p(k), u^p(k), i^p(k)),   x^p(0) = x^p_0    (6.1)

where p ∈ P := {1, . . . , P} denotes the index of the respective subsystem and states and controls
satisfy x p (k ) ∈ X p and u p (k ) ∈ U p . Within these subsystems, we introduce the variable
i p (k ) ∈ I p in (6.1). The latter will allow us to link the set of systems on all levels and is therefore
called neighboring data and neighboring data set respectively. Note that the set depends on the
chosen element p ∈ P and may also vary over time.
Within the lecture, we will solely focus on the case of splitting the dynamics of the system. In
general, however, an MPC problem additionally contains the elements of constraints and costs,
which can also be split. As the splits can be built up on the same idea we outline here, we refer
to [2] for details on the general split.
To illustrate the idea, we first consider the following example:

Task 6.2
Reconsider the example from Task 5.15 with dynamics

    x1(k + 1) = x1(k) + x2(k) + u(k)/2
    x2(k + 1) = x2(k) + u(k)

and split the system into two subsystems using x^1 = x_1, x^2 = x_2 and u^2 = u.

Solution to Task 6.2: Setting x^1 = x_1, x^2 = x_2 and u^2 = u and leaving u^1 undefined, we
obtain

    x^1(k + 1) = x^1(k) + x^2(k) + u^2(k)/2        (x^2(k) and u^2(k) stem from subsystem 2)
    x^2(k + 1) = x^2(k) + u^2(k).

For that choice, subsystem 2 is independent of subsystem 1. However, to evaluate subsystem 1
the neighboring data i^1(k) is required, which provides x^2(k) and u^2(k) from subsystem 2. Note
that the connection depends on how the control input from the overall system is assigned to
the subsystems. Setting u^1 = u and leaving u^2 undefined, both subsystems depend on each
other.
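The required data exchange can be mimicked directly in code: subsystem 2 evaluates its dynamics locally and transmits its state and control as neighboring data, which subsystem 1 needs to evaluate its own dynamics. The following sketch is illustrative only and uses the split from the solution above.

```python
def f1(x1, neighbor):
    """Subsystem 1: requires the neighboring data (x^2(k), u^2(k))."""
    x2, u2 = neighbor
    return x1 + x2 + u2 / 2.0

def f2(x2, u2):
    """Subsystem 2: independent of subsystem 1."""
    return x2 + u2

# One step of the overall system with explicit data exchange
x1, x2 = 0.0, 1.0
u2 = -0.5                     # control chosen by the controller of subsystem 2
i1 = (x2, u2)                 # neighboring data i^1(k) transmitted to subsystem 1
x1, x2 = f1(x1, i1), f2(x2, u2)
print(x1, x2)
```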

The aim of a split is that by recombining the subsystems (6.1) we reobtain the overall system (1.3)

    x(k + 1) = f(x(k), u(k))    (6.2)

with state x(k) = (x^1(k)⊤, . . . , x^P(k)⊤)⊤ ∈ X = X^1 × . . . × X^P and control
u(k) = (u^1(k)⊤, . . . , u^P(k)⊤)⊤ ∈ U = U^1 × . . . × U^P. Within this chapter, we call (6.2) the (overall)
system, (6.1) the set of subsystems, and refer to p as a subsystem.
As we have seen in Task 6.2, it may be necessary to split up both the state set X as well as the
control set U . To do that in a coordinated manner, we introduce the following:

Definition 6.3 (Projection).


Given a set S, let π : S → S be a linear map which is idempotent, that is π ◦ π = π. We call
π a projection of S onto Im(π ) (along Ker(π )) where Im(π ) and Ker(π ) denote the image and
kernel of π.

Using a set of projections we define a decomposition of a vector space:

Definition 6.4 (Decomposition).


Consider a set S, a set P = {1, . . . , P} where P ∈ N, and a set of projections (π p ) p∈P where
S p := Im(π p ) is a subset of S for all p ∈ P to be given. If we have that

⟨(S p ) p∈P ⟩ = S and Sq ∩ ⟨(S p ) p∈P ,p̸=q ⟩ = {0} for all q ∈ P



hold, then we call the set (S p ) p∈P a decomposition of S.

Now we can use the decomposition to rewrite our overall system into subsystems defined on
subspaces. In particular, we require two projection sets for all p ∈ P, that is

    π^p_X : X → X   to split the state set such that Im(π^p_X) = X^p, and
    π^p_U : U → U   to split the control set such that Im(π^p_U) = U^p.

Unfortunately, these projections will in general not simply separate the state and control set.
We already saw the reason for this deficiency in Task 6.2: Subsystem dynamics may depend on
variables which we project into other subsystems. Hence, the projections in general leave us with
three components each, that is:

For the state projection, we obtain [X^p, X̃^p, X̄^p] where x^p ∈ X^p are our primary variables
of interest. In particular, we have that x̃^p ∈ X̃^p are the states of neighbors necessary to
evaluate the projected dynamics π^p_X ∘ f correctly.

For the control projection, we have [U^p, Ũ^p, Ū^p] where again u^p ∈ U^p is at the core of our
interest. Again, ũ^p ∈ Ũ^p is the necessary control information of neighbors to evaluate the
projected dynamics π^p_X ∘ f.

Remark 6.5
Note that the controls ũ^p ∈ Ũ^p are computed by different controllers. Hence, to include them in
the evaluation of another system, we have to transmit the respective data.

Different from X̃^p and Ũ^p, we find that π^p_X ∘ f is independent of x̄^p ∈ X̄^p and ū^p ∈ Ū^p. For
this reason, we call the latter independent states and controls.

Remark 6.6
In programming, x^p(k) ∈ X^p and u^p(k) ∈ U^p are called local or private variables, whereas x̃^p(k) ∈ X̃^p, x̄^p(k) ∈ X̄^p, ũ^p(k) ∈ Ũ^p and ū^p(k) ∈ Ū^p are termed interface or public variables.

Based on X̃^p and Ũ^p we can identify which information is required and, in particular, from which subsystem this information must be obtained. This leads to the following definition:

Definition 6.7 (Neighboring index set).


Consider a decomposition of system (6.2). Then we call I^p = {p_1, . . . , p_m} ⊂ P \ {p} a neighboring index set if it satisfies

    (X^{p_1} × . . . × X^{p_m}) × (U^{p_1} × . . . × U^{p_m}) ⊃ X̃^p × Ũ^p.    (6.3)

Here, we would like to stress that the above definition allows us to simply include all subsystems in the index set. However, regarding bandwidth constraints, it is typically a good idea to keep these sets as small as possible. The respective data is called neighboring data:

Definition 6.8 (Neighboring data).


Consider a neighboring index set I^p(k) of subsystem p ∈ P. We call the set

    i^p(k) = {(q, k^q, x^q(·), u^q(·)) | q ∈ I^p(k)} ∈ I^p    (6.4)

neighboring data. The neighboring data set is given by I^p = 2^Q with Q = (P \ {p}) × N_0 × X^{N+1} × U^N.

Task 6.9
Reconsider Task 6.2 and compute the neighboring index sets and the neighboring data.

Solution to Task 6.9: For our choice of variables we have I^1(k) = {2} and I^2(k) = ∅. As we have seen in the solution of Task 6.2, we require the information contained in the neighboring data i^1(k) = {(2, k, x^2(k), u^2(k))} to evaluate the system.
We would like to highlight that the immediate information as in Task 6.9 is not sufficient for running an MPC. To compute a respective trajectory, we require the state and control trajectories of those subsystems in the neighboring index set.

Remark 6.10
For simplicity, we assume that the prediction horizon length is identical for all subsystems. Hence, we do not include respective information in the neighboring data. Generalizations of this assumption are possible but require a neighboring data set of the form

    i^p(k) = {(q, k^q, N^q, x^q(·), u^q(·)) | q ∈ I^p(k)} ∈ I^p.    (6.5)

While the modification of the data to be transmitted is simple, the adaptations in the algorithms
and in the stability concepts are quite involved.

Using the construction via neighboring data, we directly obtain the following:

Corollary 6.11 (Equivalent subsystem split).


Suppose a system (6.2), a set P = {1, . . . , P} as well as projections (π_X^p)_{p∈P}, (π_U^p)_{p∈P} inducing decompositions ⟨(X^p)_{p∈P}⟩ and ⟨(U^p)_{p∈P}⟩ to be given. Then the overall system (6.2) is equivalent to a set of subsystems (6.1) given by the dynamics

    f^p(x^p, u^p, (x̃^p, ũ^p)) := [Id_{n_x^p × n_x^p} ; 0_{(n_x − n_x^p) × n_x^p}] ∘ π_X^p ∘ f((σ_X^p)^{−1}(x^p, x̃^p, 0), (σ_U^p)^{−1}(u^p, ũ^p, 0))    (6.6)

for permutations σ_X^p : X → X^p × X̃^p × X̄^p with σ_X^p(x) = (x^p, x̃^p, x̄^p) and σ_U^p : U → U^p × Ũ^p × Ū^p with σ_U^p(u) = (u^p, ũ^p, ū^p) for all p ∈ P.

Coming back to our definition of the neighboring index set, we see that the choice of the projections is not fixed, yet it is advisable to keep the neighboring index sets as small as possible. Moreover, the subsystems do not depend on the subspaces X̄^p, Ū^p, which should therefore be chosen as large as possible to reduce the computational complexity.
As outlined at the beginning of this section, the projection approach can also be applied to the
components of costs and constraints of the MPC problem.

Remark 6.12
We would like to note that in the presence of constraints the sets of costates and independent states as well as cocontrols and independent controls resulting from the projection depend on the overall system state x ∈ X.

Using these projections, we obtain the following local problems

Definition 6.13 (Projected digital constrained optimal control problem).


Consider a digital constrained optimal control problem (5.3), a set P = {1, . . . , P} as well as projections (π_X^p)_{p∈P}, (π_U^p)_{p∈P} inducing a decomposition. Then we call

    min J_N^p(x_0^p, u^p) = Σ_{k=0}^{N−1} ℓ^p(x^p(k, x_0^p, u^p), u^p(k))   over all u^p ∈ U^{p,N}_{X_0^p}    (6.7)

    subject to   x^p(k+1) = f^p(x^p(k), u^p(k)),   x^p(0) = x_0^p
                 x^p(k) ∈ X^p,   k ∈ [0, N]

a projected digital finite constrained optimal control problem.

Hence, a basic distributed MPC algorithm may look as follows:

Algorithm 6.14 (Basic MPC Algorithm)


For each closed loop time index n = 0, 1, 2, . . .:

(1) For each subsystem p ∈ P:

    Obtain the state x^p(n) ∈ X^p of the system.

(2) For each subsystem p ∈ P:

    a) Obtain the neighboring index set I^p(n) and collect the data i^p.
    b) Set x_0^p := x^p(n), solve the projected digital finite optimal control problem (6.7) and denote the obtained optimal control sequence by u^{p,⋆}(·) ∈ U^{p,N}_{X_0^p}(x_0^p, i^p).
    c) Send the data (p, x^p(·), u^p(·)) to all subsystems q ∈ P \ {p}.

    Repeat Step (2) until u^p(·) and i^p have converged for all p ∈ P.

(3) For each subsystem p ∈ P:

    Define the MPC feedback μ_N^p(x^p(n)) := u^{p,⋆}(0).

This algorithm will be the basis for our discussion regarding how to coordinate the subsystems
and subsystem computations in the following section.
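A minimal Python skeleton of one closed-loop step of Algorithm 6.14 could look as follows; the local solver call, the convergence test and the communication layer (here a plain dictionary) are placeholders and not part of the lecture material:

    # Sketch of the basic distributed MPC loop (Algorithm 6.14); all names are illustrative.

    def basic_dmpc_step(subsystems, states, solve_ocp, max_iter=20):
        """One closed-loop step: iterate local solves until controls/data converge."""
        data = {p: None for p in subsystems}       # last broadcast (x^p(.), u^p(.)) per subsystem
        controls = {p: None for p in subsystems}
        for _ in range(max_iter):                  # Step (2): iterate until convergence
            converged = True
            for p in subsystems:
                i_p = {q: data[q] for q in subsystems if q != p and data[q] is not None}
                x_traj, u_traj = solve_ocp(p, states[p], i_p)   # local problem (6.7)
                if controls[p] is None or any(abs(a - b) > 1e-6
                                              for a, b in zip(u_traj, controls[p])):
                    converged = False
                controls[p] = u_traj
                data[p] = (x_traj, u_traj)         # Step (2c): broadcast to all others
            if converged:
                break
        # Step (3): apply the first element of each optimal control sequence
        return {p: controls[p][0] for p in subsystems}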

Remark 6.15
We would like to note that there exists a variety of problems that fall under the scope of so-called distributed problems. On the extreme ends, there are centralized and decentralized problems. The first represents the case where only one system is considered (equivalent to combining all subsystems into one large system). The latter is the case where the subsystems are completely disconnected, i.e. for subsystem p the variables of all other subsystems q ∈ P are independent variables. In between, we distinguish between so-called cooperative and noncooperative settings, the first being characterized by identical KPIs while in the second the KPIs may differ.

The basic assumption which we have to make for any of the following approaches is to impose
feasibility of Algorithm 6.14:

Assumption 6.16 (Feasibility)


Given Algorithm 6.14, suppose that for each x(n) = (x^1(n), . . . , x^P(n)) obtained in Step (1) there exists i^p(n) for all p ∈ P such that U^{p,N}_{X_0^p}(x_0^p, i^p) ≠ ∅ and Step (2) terminates successfully.

The latter assumption makes sure that we can always apply Algorithm 6.14. Hence, we have
recursive feasibility:

Theorem 6.17 (Recursive feasibility of distributed NMPC).


Consider Algorithm 6.14 and suppose Assumption 6.16 to hold. Then the closed loop is recur-
sively feasible.

Unfortunately, it is in general not clear when this assumption holds true. In the following, we focus on a few less general cases for which it can be shown.

6.2. Sequential approach


The first approach we discuss is characterized by a time decoupling of the problem. The idea is that the problem is split into subproblems, and the subproblems are solved in a sequential way. While being simple to implement, such an approach has several shortcomings, which we address in this section as well.
The method of sequentially solving the distributed control task is also known as the algorithm of Richards and How [5]. The idea of the method is to form a simple line between the subsystems, i.e. first we compute a control for subproblem 1 and transmit it to all others, then for subproblem 2 and so forth, cf. Figure 6.2.

Remark 6.18
The order of systems is a free choice within this setting.

As indicated by the red lines within the figure, the only difficulty arising is that information transmitted from subsystem p at time instant n can only be used by subsystems q < p at the subsequent time instant n + 1.

Remark 6.19
For a typical MPC implementation as we discuss it here, the lack of data equals one time step at
the end of the neighboring data. This may differ in practice depending on the type of horizon shift
used within the MPC.

Figure 6.2.: Communication structure for the scheme of Richards and How (subsystems x^1, x^2, . . . , x^p exchanging data between time instants n and n + 1)

To cope with this issue and fill the gap in the data, we define the following:

Definition 6.20 (Neighboring data extension).


Consider a neighboring index set I^p(k) of subsystem p ∈ P. We call the set

    ĩ^p(k) = {(q, k^q, x^q(·), u^q(·)) | q ∈ I^p(k)} ∈ Ĩ^p(i^p)    (6.8)

a neighboring data extension if

    x^q(·) and u^q(·) are defined for k = 0, . . . , N − 1, and
    for all undefined entries of x^q(·) and u^q(·) an admissible solution is substituted.

In other words, an admissible extension of both control and state must exist and be added to the neighboring data sequence in order to continue computing within the distributed setting.
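A possible realization of such an extension in code is sketched below, where we assume, purely for illustration, that the missing tail entries can be filled by holding an admissible operating point (x_eq, u_eq), cf. the following remark:

    # Sketch of a neighboring data extension (Definition 6.20); names are illustrative.
    # Missing tail entries are filled with an admissible operating point (x_eq, u_eq),
    # which is assumed to exist for the neighboring subsystem.

    def extend_neighboring_data(x_traj, u_traj, N, x_eq, u_eq):
        """Pad state/control trajectories so that u is defined for k = 0,...,N-1."""
        x_ext = list(x_traj) + [x_eq] * (N + 1 - len(x_traj))
        u_ext = list(u_traj) + [u_eq] * (N - len(u_traj))
        return x_ext, u_ext

    # Example: a neighbor transmitted only the first entries of its data
    x_ext, u_ext = extend_neighboring_data([0.4, 0.2, 0.1], [0.3, 0.2], N=3, x_eq=0.0, u_eq=0.0)
    print(x_ext, u_ext)   # [0.4, 0.2, 0.1, 0.0] [0.3, 0.2, 0.0]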

Remark 6.21
Note that for the stability approach based on terminal conditions such an extension exists, as the terminal point is an operating point of the subsystem. Regarding terminal costs, an extension exists if the neighborhood of the terminal point is designed such that all solutions within this neighborhood always represent independent variables for all other subsystems.

Combining the communication structure given by Figure 6.2 and the neighboring data extension
with our basic Algorithm 6.14, we obtain the following:

Algorithm 6.22 (Richards and How Algorithm for Distributed NMPC)


Initialization:

(1) For each subsystem p ∈ P:

    Obtain the state x^p(0) ∈ X^p of the system.

(2) For each subsystem p ∈ P:

    a) Find a control sequence u^{p,⋆}(·) ∈ U^{p,N}_{X_0^p}(x_0^p) such that the overall system is feasible.
    b) Send the data (p, x^p(·), u^p(·)) to all subsystems q ∈ P \ {p}.

(3) For each subsystem p ∈ P:

    a) Define the MPC feedback μ_N^p(x^p(0)) := u^{p,⋆}(0).

Feedback loop: For each closed loop time index n = 1, 2, . . .:

(1) For each subsystem p ∈ P:

    Obtain the state x^p(n) ∈ X^p of the system.

(2) For each subsystem p ∈ P do sequentially:

    a) Collect the neighboring data i^p from all subsystems and extend the neighboring data for all subsystems j > p.
    b) Set x_0^p := x^p(n), solve the projected digital finite optimal control problem (6.7) and denote the obtained optimal control sequence by u^{p,⋆}(·) ∈ U^{p,N}_{X_0^p}(x_0^p, i^p).
    c) Send the data (p, x^p(·), u^p(·)) to all subsystems q ∈ P \ {p}.

    Repeat until u^p(·) and i^p have converged for all p ∈ P.

(3) For each subsystem p ∈ P:

    a) Define the MPC feedback μ_N^p(x^p(n)) := u^{p,⋆}(0).
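A sketch of how the sequential sweep in Step (2) of the feedback loop could be organized is given below; the local solver, the admissible extension of Definition 6.20 and the data containers are again illustrative placeholders:

    # Sketch of one feedback-loop step of the Richards-and-How scheme (Algorithm 6.22).
    # Subsystems are processed in a fixed order; data of subsystems later in the order
    # is only available from the previous time step and has to be extended (Def. 6.20).

    def richards_how_step(order, states, data_prev, solve_ocp, extend):
        """order: list of subsystem indices; data_prev: trajectories from time n-1."""
        data_new = {}
        controls = {}
        for p in order:
            i_p = {}
            for q in order:
                if q == p:
                    continue
                if q in data_new:                 # already solved at the current time step
                    i_p[q] = data_new[q]
                else:                             # older data from time n-1, extend its tail
                    i_p[q] = extend(data_prev[q])
            x_traj, u_traj = solve_ocp(p, states[p], i_p)
            data_new[p] = (x_traj, u_traj)        # transmit to all other subsystems
            controls[p] = u_traj[0]               # MPC feedback of subsystem p
        return controls, data_new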

Regarding recursive feasibility, we can convert our stability results for centralized MPC problems from Chapter 5 to obtain the following:

Theorem 6.23 (Stability of Richards and How Algorithm).


Consider Algorithm 6.22. If the initialization phase exhibits a solution, then we have

    U^{p,N}_{X_0^p}(x_0^p, i^p) ≠ ∅   for all n ∈ N.    (6.9)

If additionally the stability conditions from either Theorem 5.22, Theorem 5.24 or Theorem 5.25 hold for each subsystem p ∈ P, then the closed loop of the overall system is asymptotically stable.

While being simple to apply, the serial solution of the optimal control problems leads to long waiting times for the other subsystems. This is particularly hurtful for subsystems which are independent of one another, a property we exploit in the following section.

6.3. Hierarchical approach


In the previous section we discussed how a sequential approach can operate. The hierarchical approach builds on the same idea but sorts the subsystems into a dependency tree. Similar to the sequential approach, the order of the subsystems is a choice. The main difference lies in identifying which subsystems may operate in parallel. To this end, the communication graph and the dependency graph must be decoupled, cf. Figure 6.3 for an exemplary sketch.

Figure 6.3.: Sketch of an exemplary communication graph (dashed) and dependency graph (solid) for subsystems x^1, x^2, x^3, x^4

To make use of this decoupling, we must identify those subsystems which are independent of one another. Using the notation from our projection, we directly obtain:

Corollary 6.24 (Independence of systems).


Consider a decomposition of system (6.2) using a set of projections (π^p)_{p∈P}. Given a current state of the overall system x ∈ X, the subsystems p and q are independent if

    X̃^p = ∅, Ũ^p = ∅ and X̃^q = ∅, Ũ^q = ∅.    (6.10)

Using this independence, we know that certain sets of systems may operate in parallel. More
formally

Definition 6.25 (List of parallel operational systems).


Consider a decomposition of system (6.2) using a set of projections (π^p)_{p∈P}. Then we call a set L ∈ 2^P satisfying

    L := {p ∈ P | (6.10) holds}    (6.11)

a list of parallel operational systems.
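As an illustration, the following sketch forms such a list from a given dependency structure; the dictionary of current dependencies is an assumed input and the grouping rule is a simplified reading of condition (6.10):

    # Sketch: build a list of parallel operational systems from current dependencies.
    # 'depends_on' maps each subsystem to the set of subsystems it currently needs data
    # from (empty set: condition (6.10) holds); the example values are illustrative.

    depends_on = {1: set(), 2: {3}, 3: set(), 4: {1}}

    # All subsystems without dependencies may operate in parallel (Definition 6.25).
    parallel = {p for p, deps in depends_on.items() if not deps}
    dependent = [p for p in depends_on if p not in parallel]

    print(parallel)    # {1, 3} -> can be solved simultaneously
    print(dependent)   # [2, 4] -> must wait for data of their neighbors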

Since we used the power set in the above definition, there exists quite a large number of possibilities for such lists. This corresponds to the situation that subsystems 1 and 2 may be independent of one another, but subsystem 2 may depend on subsystem 3. In that case, either subsystems 1 and 3 or subsystems 1 and 2 may operate in parallel. To obtain a concise order, we introduce the following two operators.

Definition 6.26 (Priority and deordering rule).


We call the operator Π : 2^P → 2^P a priority rule and the operator Δ : 2^P → 2^P a deordering rule.

The priority rule can be used to sort subsystems within lists of parallel operational systems.

Task 6.27
Give an example of a priority rule.

Solution to Task 6.27: The lexicographical order <_N is a priority rule sorting subsystems by their index. It additionally chooses that list of parallel operational systems for which subsystems are sorted to their lowest possible list element.

The idea of the deordering rule is different: depending on the overall system state, dependencies of subsystems on one another may occur. In that case, the subsystems are sorted into list elements of parallel operational systems. However, if the dependency no longer exists, the sorting should also be revoked. Unfortunately, we cannot detect this using the solutions of the subsystems. The reason for the latter is simple: if dependencies via constraints occur, feasibility of a solution resolves this dependency. Hence, no potential violation occurs. Yet, we cannot say whether there is a potential for a violation or not; we can only detect a violation once it occurs. For this reason, one typically uses a simple forget rule.

Task 6.28
Give an example of a deordering rule.

Solution to Task 6.28: The operator Δ(L) = ∅ is a deordering rule. It basically removes all dependencies, which then have to be rebuilt before taking the next optimization step.

Remark 6.29
Instead of fully forgetting any structure, a more structure-preserving idea is to delete one dependency at random.
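Connecting the two operators with the solutions of Tasks 6.27 and 6.28, a minimal sketch could look as follows; the list and dictionary representations are assumed conventions:

    import random

    # Sketch of priority and deordering rules (Definition 6.26); representations are illustrative.

    def priority_rule(L):
        """Lexicographic priority rule Pi: sort subsystem indices within a list (Task 6.27)."""
        return sorted(L)

    def deordering_forget_all(dependencies):
        """Deordering rule Delta from Task 6.28: forget every dependency."""
        return {p: set() for p in dependencies}

    def deordering_forget_one(dependencies):
        """Structure-preserving variant (Remark 6.29): drop a single dependency at random."""
        candidates = [(p, q) for p, deps in dependencies.items() for q in deps]
        if candidates:
            p, q = random.choice(candidates)
            dependencies[p] = dependencies[p] - {q}
        return dependencies

    print(priority_rule([3, 1, 2]))                    # [1, 2, 3]
    print(deordering_forget_all({1: {2}, 2: {3}}))     # {1: set(), 2: set()}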

Combining the latter two operators with our Algorithm 6.14, we obtain the following:

Algorithm 6.30 (Hierarchical DMPC Algorithm)


For each closed loop time index n = 0, 1, 2, . . .:

(1) For each subsystem p ∈ P:

    Obtain the state x^p(n) ∈ X^p of the system.

(2a) Deordering
    For each j from 2 to P:
        For k from 1 to #L_j:
            i. Set I^{p_k}(n) := Δ(I^{p_k}(n)).
            ii. If I^{p_k}(n) = ∅, remove p_k from L_j and set L_1 := (L_1, p_k).
                Else, if m̃ := min{m | q ∈ L_m for some q ∈ I^{p_k}(n)} < j, remove p_k from L_j and set L_m̃ := (L_m̃, p_k).
        a) Set x_0^p := x^p(n), solve the projected digital finite optimal control problem (6.7) and denote the obtained optimal control sequence by u^{p,⋆}(·) ∈ U^{p,N}_{X_0^p}(x_0^p, i^p).
        b) Send the data (p, x^p(·), u^p(·)) to all subsystems q ∈ L_k with k > j.

(2b) Priority
    For each j from 1 to P do:
        a) If #L_j ∈ {0, 1}, go to Step (3). Else sort the indices via L_j := Π(L_j).
        b) Collect the neighboring data i^p for all subsystems.
        c) For k from 2 to #L_j do:
            If p_k exhibits a costate/cocontrol of some p_{k'} with k' < k, set L_{j+1} := (L_{j+1}, p_k) and L_j := L_j \ L_{j+1}.
        d) Solve the projected digital finite optimal control problem (6.7) and denote the obtained optimal control sequence by u^{p,⋆}(·) ∈ U^{p,N}_{X_0^p}(x_0^p, i^p).
        e) Send the data (p, x^p(·), u^p(·)) to all subsystems q ∈ L_k with k ≥ j.

(3) For each subsystem p ∈ P:

    Define the MPC feedback μ_N^p(x^p(n)) := u^{p,⋆}(0).

We would like to stress that in Step (2b) the sending of neighboring information addresses subsystems on the same or a higher hierarchy level, while in Step (2a) only higher levels are addressed. The reason for the latter is that, for establishing the dependency graph, we must be able to assess whether or not a subsystem on the same level provides a costate/cocontrol for the present subsystem. If such a variable exists, then by priority the subsystems will be sorted to higher levels.
While aiming at parallel computation, the hierarchical approach ultimately fails if subsystems remain dependent on one another.

6.4. Parallel approach


A completely different idea of decoupling the subsystems is to consider dynamics and constraints as costs, which renders the overall system unconstrained, and then to decouple the unconstrained problem. Here, we focus on the so-called dual decomposition method. In contrast to the sequential and the hierarchical approach, an additional server is required, resulting in the communication structure displayed in Figure 6.4.

Figure 6.4.: Communication schedule for dual decomposition (subsystems x^p and x^q communicating with the server between time instants n and n + 1)

Apart from introducing a server, the communication also differs in how often information is transmitted: several transmissions to the server and back to the subsystems are required per closed loop step.


To formalize the setting, we first introduce the following:

Definition 6.31 (Cost operator).


Consider a control problem (5.3) with n constraints given by state constraints, control constraints and dynamics. Then we call an operator Γ : X × U → R^{n_x + n} a cost operator if it satisfies

    Γ(x, u) = 0    (6.12)

if and only if the conditions

    x(k+1) = f(x(k), u(k), k),   x(0) = x_0    (6.13)
    x(k) ∈ X,   k ∈ [0, N]    (6.14)

hold.

Using the Lagrangian idea, we obtain the combined cost

    L(x_0, u, λ) := J_N(x_0, u) + λ^⊤ · Γ(x_0, u)    (6.15)

for which g(λ) := min_{u∈U} L(x_0, u, λ) is the dual of the control problem (5.3).

Now, we can additively distribute the Lagrangian problem (6.15), which leads to the following
algorithm:

Algorithm 6.32 (Dual decomposition)


For each closed loop time index n = 0, 1, 2, . . .:

At the subsystems:

(1) For each subsystem p ∈ P:

    Obtain the state x^p(n) ∈ X^p of the system and set λ_0 = 0 and j = 0.

(2) For each subsystem p ∈ P do:

    (2a) Collect the data (0, n, λ_j).
    (2b) Compute a minimizer of the Lagrangian (6.15) and denote the solution by u^{p,j+1}(·).
    (2c) Send the data (p, n, x_0^p, u^{p,j+1}(·)) to the central entity.

At the central entity:

    (2a) Collect the neighboring data i^p of all subsystems.
    (2b) Update the Lagrange multiplier

        λ_{j+1} := λ_j + ρ_j · Γ(x_0, u^j)

    (2c) Send the Lagrange multiplier (0, n, λ_{j+1}) to all subsystems p ∈ P. Set j := j + 1 and go to Step (2) unless a termination criterion is satisfied.

At the subsystems:

(3) For each subsystem p ∈ P:

    a) Define the MPC feedback μ_N^p(x^p(n)) := u^{p,⋆}(0).

The big advantage of Algorithm 6.32 is that it can be applied to basically any optimal control problem. It allows us to split the problem into subproblems, where the split is not necessarily according to constraints or dynamics, but can be chosen arbitrarily. On the downside, a central entity is required, which coordinates the progress of the overall system. Here, the iterator j indicates that a number of intermediate steps may be (and typically is) necessary to reach a termination criterion. Regarding the Lagrange multiplier update, we included the factor ρ_j, which can be adapted via a line search.
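To illustrate the information flow of Algorithm 6.32, the following sketch runs a dual decomposition iteration for a toy problem with two scalar subsystems coupled by one linear constraint; the cost functions, the constraint and the step size ρ are illustrative choices and not taken from the lecture:

    # Sketch of dual decomposition (Algorithm 6.32) for a toy problem with two subsystems:
    #   min (u1 - 1)^2 + (u2 + 1)^2   s.t.  u1 + u2 = 1   (coupling handled as a cost)
    # Each subsystem minimizes its local Lagrangian, the server updates the multiplier.

    def local_min_1(lam):
        # argmin_u (u - 1)^2 + lam * u  ->  u = 1 - lam / 2
        return 1.0 - lam / 2.0

    def local_min_2(lam):
        # argmin_u (u + 1)^2 + lam * u  ->  u = -1 - lam / 2
        return -1.0 - lam / 2.0

    lam, rho = 0.0, 0.5
    for j in range(50):                      # intermediate steps within one closed-loop step
        u1, u2 = local_min_1(lam), local_min_2(lam)   # parallel local solves
        residual = u1 + u2 - 1.0             # cost operator value Gamma(x0, u^j)
        lam = lam + rho * residual           # multiplier update at the central entity
        if abs(residual) < 1e-8:             # termination criterion
            break

    print(u1, u2, lam)                       # approx. 1.5, -0.5, -1.0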
BIBLIOGRAPHY

[1] Barto, A.; Sutton, R.S.: Reinforcement Learning: An Introduction. 2nd edition. MIT Press, 2018.

[2] Grüne, L.; Pannek, J.: Nonlinear Model Predictive Control: Theory and Algorithms. 2nd edition. Springer, 2017.

[3] Hinrichsen, D.; Pritchard, A.J.: Mathematical System Theory I: Modelling, State Space Analysis and Robustness. Springer, 2010.

[4] Ljung, L.: System Identification: Theory for the User. Pearson Education, 1998. ISBN 0132440539.

[5] Richards, A.; How, J.: A Decentralized Algorithm for Robust Constrained Model Predictive Control. 2004, pp. 4241–4266.

[6] Sontag, E.D.: Mathematical Control Theory: Deterministic Finite Dimensional Systems. Springer, 1998. ISBN 0387984895.