Modern Control Systems (Moderne Regelungstechnik)
Lecture Notes
Jürgen Pannek
April 7, 2023
Jürgen Pannek
Institute for Intermodal Transport and Logistic Systems
Hermann-Blenck-Str. 42
38519 Braunschweig
FOREWORD
During winter term 2022/23 I give the lecture for the module Modern Control Systems (Moderne
Regelungstechnik) at the Technical University of Braunschweig. To structure the lecture and
support my students in their learning process, I prepared these lecture notes. As it is the first
edition, the notes are still incomplete and are updated in due course of the lecture itself. Moreover,
I will integrate remarks and corrections throughout the term.
The aim of the module is to provide participating students with knowledge of advanced control
methods, which extend the range of control engineering. After having successfully completed
the lecture Modern Control Systems, students are able to define control methods for embedded
and networked systems, transfer them to models and applications and apply them. The students
can specify and explain the aspects of consistency, stability and robustness as well as areas of
application of methods. In addition, they are able to implement the integration of methods in
toolchains and apply them to real systems such as vehicles. Students can also reproduce processes
of parameter application and automated testing and transfer them to case studies.
To this end, the module will tackle the subject areas
within the lecture and support understanding and application within the tutorial classes. The
module itself is accredited with 5 credits.
An electronic version of this script can be found at
https://www.tu-braunschweig.de/itl/lehre/skripte
During the preparation of the lecture, I utilized the books of E. Sontag [6] and D. Hinrichsen and
A. Pritchard [3] for terminology and linear systems. Regarding MPC, the lecture notes are based
on the book of L. Grüne and J. Pannek [2].
Contents

I Linear systems

2 Optimal stabilization
  2.1 Linear quadratic regulator — LQR
  2.2 H2 control
  2.3 H∞ control

3 Optimal observation
  3.1 Recursive estimation
  3.2 Transformation of dynamics
  3.3 Kalman filter

II Nonlinear systems

4 Digitalization
  4.1 Zero order hold
  4.2 Practical stability

6 Distributed control
  6.1 Separation of systems
  6.2 Sequential approach
  6.3 Hierarchical approach
  6.4 Parallel approach

Bibliography
In control engineering practice, the terms stability and observability are central properties of
systems. Abstractly speaking, stability of a system is given if for any bounded input the state and
output of the system remain bounded, and additionally that the impact of the input is decaying
over time. Observability, on the other hand, is given if for known input and output history of a
system the state of the system can be computed and is unique.
Within this lecture, we discuss methods to enforce and evaluate these properties. For our dis-
cussion, we distinguish between linear and nonlinear systems. The reason for considering these
cases separately is that for linear systems it is possible to analytically evaluate scenarios without
simulation, and therefore also to use respective formulas to prove properties of methods. In the
nonlinear case, the options to rigorously show such results are limited, and evaluation of systems
and methods require complex simulations.
This chapter serves as basis for the terminology used within the lecture. We first introduce the
required terms from system theory and control theory. Thereafter, we define the concepts of
stability and observability.
1.1. System
The term system as such is typically not defined clearly. In certain areas, a system stands for a
connected graph, a dynamically evolving entity or even a simulation or an optimization. While
the intentions of the latter are quite distinct, they can all be boiled down to the following:
The interdependence of systems with their environment is given by so called inputs and outputs.
More formally, we define the following:
The sets U and Y are called input and output sets. An element from the input set u ∈ U is called
an input, which acts from the environment on the system and does not depend on the system
itself or its properties. We distinguish between inputs, which are used to specifically manipulate
(or control) the system, and inputs, which are not manipulated on purpose. We call the first ones
control or manipulation inputs, and we refer to the second ones as disturbance inputs. An element
from the output set y ∈ Y is called an output. In contrast to an input, the output is generated by
the system and influences the environment. Here, we distinguish output variables depending on
whether we measure them or not. We call the measured ones measurement outputs.
[Figure 1.1: A system Σ with inputs u_1, ..., u_{n_u} and outputs y_1, ..., y_{n_y}]
If the system is linear in inputs and outputs, then the system is called linear. Similarly, if it
is not linear in either the inputs or outputs, then the system is called nonlinear.
If all parameters are constants, then the system is called time invariant. It is termed time
varying if at least one parameter is time dependent.
If the outputs depend only on the inputs at the same time instant, we call such systems
static. If the output of the system depends not only on the input at the current time instant but also
on the history of the latter, we call these systems dynamic.
If the outputs depend on the history of the inputs only, then the system is called causal. If
future values are included, the system is called non-causal.
If inputs are directly mapped to outputs, then the map is called input output system. If the
input triggers changes of an internal variable and the output depends on the latter, then the
map is called state space system.
If time is measured continuously, then we call the system a continuous time system. If time
is sampled, we refer to the system as a discrete time system.
Within the lecture, we focus on state space systems, which are time invariant, dynamic and causal.
To introduce such systems, we first need to define what we referred to as internal variable:
u = [u_1  u_2  ...  u_{n_u}]^⊤   (1.1a)
y = [y_1  y_2  ...  y_{n_y}]^⊤   (1.1b)
x = [x_1  x_2  ...  x_{n_x}]^⊤   (1.1c)

where u_j is an element of the j-th subset of the input set U, y_j is an element of the j-th subset
of the output set Y and x_j is an element of the j-th subset of the state set X.
Remark 1.4
Here, we use this notation to allow for real valued and other entries such as gears, method
characteristics or switches. In the real valued setting, we have U ⊂ R^{n_u}, Y ⊂ R^{n_y} and X ⊂ R^{n_x}.
In the continuous time setting T = R, we can utilize the short form ẋ for (d/dt) x and obtain the
following compact notation:

ẋ(t) = f(x(t), u(t), t),   x(0) = x_0   (1.2a)
y(t) = h(x(t), u(t), t)   (1.2b)

with the sets U, Y and X from Definition 1.3. If X is a vector space, then we call it state space and refer to (1.2)
as continuous time system. Moreover, u, y and x are called input, output and state of the system.
The state of a system at time instant t can then be depicted as a point in the n x –dimensional state
space. The curve of points for variable time t in the state space is called trajectory and is denoted
by x(·).
Remark 1.6
Systems with infinite dimensional states are called distributed parametric systems and are de-
scribed, e.g., via partial differential equations. Examples of such systems are beams, boards,
membranes, electromagnetic fields, heat etc..
x(k+1) = f(x(k), u(k), k),   x(0) = x_0   (1.3a)
y(k) = h(x(k), u(k), k).   (1.3b)
as discrete time system. Again, u, y and x are called input, output and state of the system.
While we have t ∈ R in continuous time, for discrete time systems the matter of time refers to an
index k ∈ Z. Hence, trajectories are no longer curves but sequences of points in the respective
set. Discrete time systems are the typical result of digitalization as sampling continuous time
systems, e.g. via an A/D and D/A converter, directly reveals a discrete time system. The result of
such a digitalization is a time grid. The simplest case is equidistant sampling
with sampling time T which gives us
T := {tk | tk := t0 + k · T } ⊂ R. (1.4)
where t0 is some fixed initial time stamp. Apart from equidistant sampling, other types such as
event based or sequence based are possible.
Remark 1.8
Note that the class of discrete time systems is larger and contains the class of continuous time
systems, i.e. for each continuous time system there exists a discrete time equivalent, but for some
discrete time systems no continuous time equivalent exists.
Note that in both discrete and continuous time, the map reveals a flow within the state space.
We obtain a trajectory if we specify an initial value and an input sequence. Figure 1.2 illustrates
the idea of flow and trajectory. In this case, the flow is colored to mark its intensity whereas the
arrows point into its direction. The trajectory is evaluated for a specific initial value and „follows“
the flow accordingly.
[Figure 1.2: Flow (intensity and direction) and a trajectory in the (x_1, x_2) state space]
As indicated in the introduction, stability refers to the property of being able to control a system
to achieve a certain goal like boundedness or convergence. To this end, the input must be able
to show an impact on the states, may it be directly or indirectly. Observability on the other hand
refers to the ability of identifying the status of a system, that is to be directly or indirectly able
to measure states. Figure 1.3 illustrates this context. The figure also shows that typically not all
states can be manipulated, not even indirectly, and not all states can be observed. Yet, we will see
that even in this case methods can be applied to ensure stability and observability.
In order to discuss the terms stability and observability in details, we focus on the special class of
linear control systems:
[Figure 1.3: States x_1, ..., x_{n_x} of a system, of which only some are affected by the input u and only some influence the output y]
We call

ẋ(t) = A · x(t) + B · u(t),   x(0) = x_0   (1.5a)
y(t) = C · x(t) + D · u(t)   (1.5b)

a linear time invariant control system in continuous time with initial value x_0 ∈ R^{n_x}. The
discrete time equivalent reads

x(k+1) = A · x(k) + B · u(k),   x(0) = x_0   (1.6a)
y(k) = C · x(k) + D · u(k).   (1.6b)
The solution of the continuous time system (1.5) is given by

x(t; t_0, x_0, u) = exp(A·(t−t_0)) · x_0 + ∫_{t_0}^{t} exp(A·(t−s)) · B · u(s) ds,   (1.7)

while for the discrete time system (1.6) we obtain

x(k) = A^k · x_0 + Σ_{j=0}^{k−1} A^{k−1−j} · B · u(j).   (1.8)
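The solution formulas can be evaluated numerically. The following Python/NumPy sketch (system matrices, initial value and input sequence are arbitrary choices for illustration) compares the closed-form discrete time solution (1.8) with the iteration of the dynamics (1.6a):

```python
import numpy as np

# illustrative data: a stable 2-state system with one input (arbitrary choice)
A = np.array([[0.9, 0.2],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
x0 = np.array([1.0, -1.0])
u = [np.array([0.1]) for _ in range(10)]   # input sequence u(0), ..., u(9)

# closed-form solution (1.8): x(k) = A^k x0 + sum_j A^(k-1-j) B u(j)
def solution(k):
    x = np.linalg.matrix_power(A, k) @ x0
    for j in range(k):
        x += np.linalg.matrix_power(A, k - 1 - j) @ B @ u[j]
    return x

# iterating the dynamics (1.6a) gives the same trajectory
x = x0.copy()
for k in range(10):
    x = A @ x + B @ u[k]
assert np.allclose(x, solution(10))
```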
From the solution, we directly obtain the so called superposition property and the time shifting
property:
hold.
The superposition principle allows us to separate the uncontrolled solution (u = 0) and the
solution for zero initial value (x_0 = 0).
1.2. Stability
Stability is an essential property for control systems and is bound to certain points in the state
space, the so called operating points. An operating point is characterized by the dynamics being
zero at this point. In other terms, the input (as a control) should be chosen appropriately to
render this property to hold true.
f (x⋆ , u⋆ ) = 0 (1.11)
are called operating points of the system. For discrete time systems (1.3) we call (x⋆ , u⋆ ) oper-
ating point if
f (x⋆ , u⋆ ) = x⋆ (1.12)
If (1.11) or (1.12) hold true respectively for any u⋆ , then the operating point is called strong or
robust operating point.
Note that for autonomous systems, that is (1.2) or (1.3) being independent of time t or k, the
control u ∈ Rnu is required to be constant and fixed to u = u⋆ in order to compute the operating
points.
Based on this definition, the property of stability can be characterized by boundedness and con-
vergence of solutions:
strongly or robustly stable operating point if, for each ε > 0, there exists a real number
δ = δ(ε) > 0 such that for all u we have
strongly or robustly asymptotically stable operating point if it is stable and there exists a
positive real constant r such that for all u
holds for all x0 satisfying ∥x0 − x⋆ ∥ ≤ r. If additionally r can be chosen arbitrary large,
then x⋆ is called globally strongly or robustly asymptotically stable.
weakly stable or controllable operating point if, for each ε > 0, there exists a real number
δ = δ(ε) > 0 such that for each x0 there exists a control u guaranteeing
Then the operating point x⋆ = 0 is stable iff all Eigenvalues have non-positive real part
and for all Eigenvalues with real part 0 the corresponding Jordan block is one-dimensional.
Then the operating point x⋆ = 0 is locally asymptotically stable iff all Eigenvalues have
negative real part.
Remark 1.15
If all Eigenvalues of a matrix A exhibit negative real part, then the matrix is called Hurwitz.
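As a quick numerical illustration of the eigenvalue criterion and Remark 1.15, the Hurwitz property of a matrix can be checked directly via its eigenvalues. A minimal Python/NumPy sketch with an arbitrarily chosen matrix:

```python
import numpy as np

def is_hurwitz(A, tol=1e-12):
    """Check whether all eigenvalues of A have negative real part."""
    return bool(np.all(np.linalg.eigvals(A).real < -tol))

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])   # eigenvalues -1 and -2
print(is_hurwitz(A))           # True, so x* = 0 is asymptotically stable
```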
Given the Eigenvalue criterion, it is straightforward to derive an input, which induces the stability
property.
1. whether or not it is actually possible that a feedback F can be constructed such that the
conditions of Theorem 1.16 hold, nor

2. how such a feedback F can be computed.
To answer the first question, we take a look at controllability of a system. Here, Kalman formulated
the idea of reaching points by combinations of dynamics and input, that is A and B. Since the
dimension of the set reachable by the dynamics only cannot grow larger after n_x − 1 iterations,
he introduced the so called Kalman criterion:
Remark 1.18
The reachable set is typically defined as the set of points, which can be reached from x_0 = 0 within
a certain time t ≥ 0 via

R(t) := {x(t, 0, u) | u ∈ U}.

Similarly, the controllable set refers to those points x_0, for which a control u can be found to
drive the solution to the origin, i.e.

C(t) := {x_0 ∈ X | there exists u ∈ U with x(t, x_0, u) = 0}.
Unfortunately, in his criterion Kalman made the assumption that the control needs to affect all
dimensions of the state space in order for the system to be controllable. But if a part of the
system already behaves as desired, i.e. is stable, even without the control affecting it, then only
controllability of the remaining part needs to be ensured. To this end, Hautus introduced separability
in the state space:
where ( A1 , B1 ) is controllable.
Now, the idea is to simply apply the Kalman criterion to the separated part of the dynamics/state
space:
rk(λ · Id − A | B) = n_x   (1.19)

holds

1. for all λ ∈ C, or

2. for all λ ∈ C with Re(λ) ≥ 0.
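Numerically, the Hautus condition only needs to be checked at the eigenvalues of A, since for all other λ the matrix λ·Id − A already has full rank. A Python/NumPy sketch (illustrative matrices; restricting the check to eigenvalues with nonnegative real part yields a stabilizability test):

```python
import numpy as np

def hautus_ok(A, B, only_unstable=False, tol=1e-9):
    """Check rank([lambda*Id - A, B]) = nx at the eigenvalues of A."""
    nx = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if only_unstable and lam.real < -tol:
            continue                      # condition 2: only Re(lambda) >= 0 matters
        M = np.hstack([lam * np.eye(nx) - A, B])
        if np.linalg.matrix_rank(M, tol=tol) < nx:
            return False
    return True

A = np.array([[-1.0, 0.0],
              [0.0, 2.0]])
B = np.array([[0.0],
              [1.0]])
print(hautus_ok(A, B))                       # False: the mode lambda = -1 is not controllable
print(hautus_ok(A, B, only_unstable=True))   # True: (A, B) is stabilizable
```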
Having answered the question whether or not a feedback can be constructed, we next focus on
how such a feedback can be computed. To this end, we apply basic linear algebra, which gives us
the so called controllable canonical form. Again, we start with the more simple Kalman case.
Based on the latter, we directly obtain controllability if we can assign any polynomial.
To enforce the stability property, we require that the roots of an assignable polynomial lie in the
open left complex half-plane. Hence, if any polynomial is assignable, we choose one for which the
root criterion holds.
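In practice, this pole shifting procedure is available in standard software. A Python/SciPy sketch using scipy.signal.place_poles (system matrices chosen arbitrarily; the desired poles are assumed to lie in the open left half-plane):

```python
import numpy as np
from scipy.signal import place_poles

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])       # double integrator, not Hurwitz
B = np.array([[0.0],
              [1.0]])
desired_poles = [-1.0, -2.0]     # roots of the desired characteristic polynomial

K = place_poles(A, B, desired_poles).gain_matrix
F = -K                                 # feedback u = F x
print(np.linalg.eigvals(A + B @ F))    # approximately [-1, -2]
```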
Coming back to Hautus’s case, we basically require that the uncontrollable part is already stable,
that is:
There exists an assignable polynomial, for which all roots in C have negative real part.
Combining these lines of argumentation, Figure 1.4 provides an overview of the results.
[Figure 1.4: Overview of the results (Kalman vs. Hautus): (A, B) is controllable (Theorem 1.22); (A, B) is controllable, or (A, B) is not controllable but A_3 has only eigenvalues with negative real part, which by Theorem 1.23 is equivalent to (A, B) being stabilizable.]
Remark 1.25
Theorem 1.23 and Corollary 1.24 are often called pole shifting theorem as the roots of the char-
acteristic polynomial are equivalent to the poles of the transfer matrix of the system.
1.3. Observability
Similarly to controllability, in many cases not all but only a reasonable subset of manipulable
inputs are controlled. Regarding observability, we also have the case that in most cases not all
measurable outputs are actually measured. For our linear time invariant system (1.5) or (1.6) this
means that the matrices C, D are not full rank matrices. In practice, one typically only finds that
states are measured while inputs stay unmeasured, i.e. D = 0.
The task for observability is to derive information on the system from the outputs y(·) ∈ Y by
utilizing the values themselves and the history of values.
As in the previous Section 1.2, we now focus on the linear time invariant case. For such systems,
we have that equation (1.21) reads
C · x(t, x1 − x2 , 0) ̸= 0 (1.23)
Note that the lemma states that distinguishability and observability do not depend on the input
u in the linear case.
Remark 1.28
The set of non-observable states is defined as those states x_0 such that the output for u = 0 is
always zero, i.e. N(t) := {x_0 ∈ X | C · x(τ, x_0, 0) = 0 for all 0 ≤ τ ≤ t}.
Based on Lemma 1.27, we can apply the Eigenvalue criterion from Theorem 1.14 to the pair
( A, C ).
holds.
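The corresponding Kalman rank criterion for observability requires rank[C; CA; ...; CA^{n_x−1}] = n_x, i.e. the controllability check applied to the pair (A^⊤, C^⊤). A short Python/NumPy sketch with illustrative matrices:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
C = np.array([[1.0, 0.0]])                     # only the first state is measured

# observability matrix [C; CA; ...; CA^(nx-1)]
O = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(A.shape[0])])
print(np.linalg.matrix_rank(O) == A.shape[0])  # True: (A, C) is observable
```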
Following the approach of Hautus, an identical separation within the dynamics can be utilized to
widen the applicability of Kalman’s criterion.
where ( A3 , C2 ) is observable.
Now, however, we face the difficulty that the equivalent of stability in the context of observable
systems is missing. Yet, we have seen that there are remarkable similarities between controllability
and observability. These similarities also exist on a systemic level:
Remark 1.33
In particular, we have that the reachable set of the dual system is identical to the observable set
( ⋃_{t≥0} R^⊤(t) ) =: R^⊤ = N^⊥ := ( ⋂_{t≥0} N(t) )^⊥.
Using duality, we define the property detectability, which resembles stability of the dual system.
Detectability therefore means that information on the non-observable part (cf. Theorem 1.30) is
not required as respective solutions are asymptotically stable.
Hence, we now have the means to transfer the Hautus criterion to observability.
rk(λ · Id − A^⊤ | C^⊤) = n_x

holds

1. for all λ ∈ C, or

2. for all λ ∈ C with Re(λ) ≥ 0.
Similar to the canonical form for controllability, for observable systems a respective transforma-
tion can be found.
Combining these lines of argumentation together with the core of stability, Figure 1.5 provides
an overview of the results.
We like to point out that the properties controllability and observability are independent from one
another and only connected for the respective dual system. Consequently, there exist four classes
of systems: controllable and observable, controllable but not observable, not controllable but
observable, and neither controllable nor observable.
These classes can also be seen in Figure 1.3, which served as starting point for these terms.
[Figure 1.5: Overview of the results (Kalman vs. Hautus, via duality): (A, C) is observable iff (A^⊤, C^⊤) is controllable (Theorem 1.22); (A, C) is observable, or (A, C) is not observable but A_1 has only eigenvalues with negative real part, iff (A^⊤, C^⊤) is controllable, or not controllable but A_1 has only eigenvalues with negative real part, i.e. (A^⊤, C^⊤) is stabilizable (Theorem 1.23), which by Theorem 1.37 means that (A, C) is detectable.]
Part I.
Linear systems
CHAPTER 2
OPTIMAL STABILIZATION
Focusing on the state space, we typically speak of cost functions. These combine information
on the state and input of the system to quantify the performance of the control.
The value of a key performance criterion reveals a snapshot only, i.e. the evaluation at one time
instant t ∈ T . To obtain the performance, we need to evaluate it over the operating period of the
system. Since by doing so we define a function of a function, this is referred to as a functional.
J(x_0, u) := ∫_0^∞ ℓ(x(t, x_0, u), u(t)) dt   (2.1)
cost functional.
Now we can combine the criteria to evaluate and optimize the dynamics over an operating period.
This allows us to quantify not only operating points, but also the transients from the current state
of the system to such an operating point.
min J(x_0, u) = ∫_0^∞ ℓ(x(t, x_0, u), u(t)) dt   over all u ∈ U   (2.2)
subject to ẋ(t) = f(x(t), u(t), t),   x(t_0) = x_0
The idea of the optimal control problem is to enforce the stability property of a system and to
compute a feedback, which is optimal in the sense of the key performance indicator. As a simple
way to check whether a feedback stabilizes a system, the following condition can be used.
The connection between condition (2.4) and stability is rather simple: If we design the key
performance criterion such that it is zero at the desired operating point, then once the operating
point is reached no additional costs will occur over the operating period. Hence, the state of the
system will remain at the operating point. Note that by Definition 1.12 for each operating point
there exists an input such that the state remains unchanged.
Now, we focus on the LTI case (1.5). For this particular case, it is sufficient to consider a norm
like criterion, that is a way to measure the distance from current state to operating point. The first
distance which we consider is the Euclidean distance.
where Q ∈ R^{n_x×n_x}, N ∈ R^{n_x×n_u} and R ∈ R^{n_u×n_u} form a symmetric and positive definite
matrix in (2.5).
Combining linear dynamics with quadratic costs gives us the so called LQ problem.
The nice property of the LQ problem is that its solution is null controlling and therefore the
solution also stabilizes the system.
The central question now is to compute the solution of the LQ problem. In particular, we are
not simply interested in a solution but in a solution which can be evaluated based on the state of
the system, i.e. a feedback. To this end, we utilize the idea of the value function and suppose it can
be chosen in the ansatz
V (x) = x⊤ · P · x (2.6)
To evaluate the feedback, we require the matrix P of the value function ansatz. This matrix can
be computed using the so called algebraic Riccati equation. The idea of this equation is to assume
that the solution reaches the operating point and to calculate the minimum of the ansatz (2.6), i.e. to
take the derivative and set it to zero. Since the ansatz is quadratic, the necessary condition is also
sufficient for optimality.
min J(x_0, u) = ∫_0^∞ [x(t)^⊤  u(t)^⊤] · [ Q  N ; N^⊤  R ] · [x(t) ; u(t)] dt   over all u ∈ U   (2.10)
subject to ẋ(t) = A · x(t) + B · u(t),   x(t_0) = x_0
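A common way to obtain the LQR feedback numerically is to solve the algebraic Riccati equation with standard software. The following Python/SciPy sketch (matrices chosen arbitrarily; N = 0 is assumed for simplicity) computes P and the feedback matrix and verifies that the closed loop is Hurwitz:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])          # double integrator
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)                        # state weight
R = np.array([[1.0]])                # input weight, N = 0 assumed

P = solve_continuous_are(A, B, Q, R)          # solves A'P + PA - PBR^{-1}B'P + Q = 0
F = -np.linalg.solve(R, B.T @ P)              # feedback u = F x
print(np.linalg.eigvals(A + B @ F).real < 0)  # closed loop is asymptotically stable
```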
The connections between the latter results are visualized in Figure 2.1.
[Figure 2.1: Connection of the results — Theorem 2.9 relates the value function V(x) being of the form (2.6) and (A, B) being stabilizable.]
Remark 2.13
The state based setting described within this section can be extended to the output based setting.
For this case, we utilize the quadratic cost function
! " #
h i Qe N
e y
ℓ(y, u) = y⊤ u⊤ · · (2.11)
e⊤ R
N u
2.2. H2 control
In contrast to LQR, which focuses on properties measured within the state space, the H2 formal-
ism considers a frequency domain idea. To get to this idea, we first introduce the 2-norm for
systems.
V(s) := v̂(s) = L(f(t)) = ∫_0^∞ exp(−s·t) · f(t) dt,   s = α + iω
Remark 2.15
In the literature, the term L2 space is typically found to be the correct one. Yet, since spaces of
functions which are bounded and analytic in the right half plane and exhibit finite L_p norms on the
imaginary axis – which are fundamental for stable functions – are called Hardy spaces, the term
H2 norm has become dominant.
∥ v ∥2 = ∥V ∥2 (2.14)
In order to apply this result, we reconsider our dynamics. For multivariable systems, we know
from control theory that a reformulation via the Laplace transform reveals a transfer matrix con-
necting inputs to outputs. In particular, for our LTI case (1.5)
y(t) = C · exp(A·t) · x_0 + ∫_0^t H(t − τ) · u(τ) dτ   (2.15)

G(s) = ∫_0^∞ H(t) · exp(−s·t) dt   (2.16)
Now we can apply Corollary 2.16 to our dynamics and see the following:
∥ G ∥2 = ∥ H ∥2 (2.17)
Equation (2.18) allows us to evaluate the H2 norm in frequency domain by means known in the
state domain. To this end, we only require the solution and the respective output, which we get
from (2.15). In particular, for the LTI case we have
∥G∥_2 = ∥H∥_2 = ( ∫_0^∞ tr[ (C · exp(A·t) · B + D)^⊤ · (C · exp(A·t) · B + D) ] dt )^{1/2}.   (2.19)
Having defined the connections between the norms, the aim of the H2 controller we want to
compute now is to minimize the H2 norm of the closed loop. Note that the term associated to the
initial value x0 in (2.15) is a constant and therefore can be omitted in an optimization.
J(x_0, u) := ∥H∥_2^2 = Σ_{j=1}^{n_u} ∫_0^∞ y(t)^⊤ · y(t) dt   (2.20)
to be minimized over all u(t) = e j · δ(t) where δ(·) is the Dirac delta function. Then we refer to
this setting as H2 problem.
Within this setting, the input is modeled as noise, which is realized on the j-th input using the
Dirac delta and may occur at any time instant t.
Remark 2.21
If the covariance of the inputs is a unitary matrix, then the input can be interpreted as white noise.
Moreover, the result of the H2 control converges in the expected value as all frequencies are accounted
for in an equal manner. Therefore, the H2 control shows a stochastic characterization.
Having defined the H2 problem, we can solve it using an identical idea as in the LQR case, that
is to impose an algebraic Riccati equation. In particular, we obtain the following:
where P is the unique symmetric positive semidefinite solution of the algebraic Riccati equation
A^⊤ · P + P · A − P · B · (D^⊤ · D)^{−1} · B^⊤ · P + C^⊤ · C = 0.   (2.23)
Similar to the LQR case, we again have to be careful to use the positive definite solution of the
algebraic Riccati equation. The approach to evaluate the H2 feedback is almost identical to the
LQR case:
min J(x_0, u) = ∥H∥_2^2 = Σ_{j=1}^{n_u} ∫_0^∞ y(t)^⊤ · y(t) dt   over all u(t) = e_j · δ(t)   (2.24)
Remark 2.24
Note that in the LTI case we have that
J(x_0, u) = Σ_{j=1}^{n_u} ∫_0^∞ y(t)^⊤ · y(t) dt
          = Σ_{j=1}^{n_u} ∫_0^∞ (C · x(t) + D · u(t))^⊤ · (C · x(t) + D · u(t)) dt.

If we choose C = Q^{1/2} and D = R^{1/2}, then we obtain that H2 is a special case of LQR.
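Under the assumptions of the theorem above (in particular D^⊤D invertible), the H2 feedback can be computed with the same Riccati machinery as for LQR. A Python/SciPy sketch with arbitrarily chosen matrices:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0],
              [-1.0, -1.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.1]])

# Riccati equation (2.23): A'P + PA - PB(D'D)^{-1}B'P + C'C = 0
P = solve_continuous_are(A, B, C.T @ C, D.T @ D)
F = -np.linalg.solve(D.T @ D, B.T @ P)        # H2 feedback u = F x
print(np.linalg.eigvals(A + B @ F).real < 0)  # closed loop is asymptotically stable
```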
2.3. H∞ control
The idea of the H∞ feedback is similar to the H2 feedback. Instead of the L2 norm, where the
aim is to minimize the deviation of the output along the trajectory, in the H∞ case the supremum
norm is used to minimize the highest deviation.
V(s) := v̂(s) = L(f(t)) = ∫_0^∞ exp(−s·t) · f(t) dt,   s = α + iω
∥y∥_∞ = sup_v { ∥G(iω) · v∥ / ∥v∥  |  v ≠ 0, v ∈ C^{n_y} }   (2.27)
Again, the terms L∞ and H∞ are used identically in the literature. In the case of H∞, we will not
go into depth but only highlight connections to H2. The first connection is about conservatism of
the controllers. We have

∥G∥_∞ ≥ ∥G · v∥_2 / ∥v∥_2   for all v ≠ 0.
This can be interpreted as the concentrated impact of v close to the frequency range of ∥ G ∥∞ .
Hence, the H∞ norm gives the maximum factor by which the system magnifies the H2 norm of
any input. For this reason, ∥ G ∥∞ is also referred to as gain of the system.
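The gain interpretation suggests a simple (approximate) way to evaluate ∥G∥_∞ numerically: grid the frequency axis and take the largest singular value of G(iω). A Python/NumPy sketch with an illustrative system:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-1.0, -0.1]])        # lightly damped oscillator
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

def hinf_norm_estimate(A, B, C, D, omegas):
    """Approximate ||G||_inf = sup_w sigma_max(G(iw)) on a frequency grid."""
    n = A.shape[0]
    gains = []
    for w in omegas:
        G = C @ np.linalg.solve(1j * w * np.eye(n) - A, B) + D
        gains.append(np.linalg.svd(G, compute_uv=False)[0])
    return max(gains)

print(hinf_norm_estimate(A, B, C, D, np.logspace(-2, 2, 2000)))
```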
Remark 2.26
As a consequence, the H∞ feedback is always more conservative than the H2 feedback as it aims
to hold down the maximal amplification.
to be minimized over all u(t) = e j · δ(t) where δ(·) is the Dirac delta function. Then we refer to
this setting as H∞ problem.
Regarding the solutions, again an algebraic Riccati equation is employed and we obtain:
asymptotically stabilizes the system with feedback matrix F ∈ R^{n_u×n_x} given by

F = −(D^⊤ · D)^{−1} · B^⊤ · P   (2.30)

where P is the unique symmetric positive semidefinite solution of the algebraic Riccati equation

A^⊤ · P + P · A − P · B · (D^⊤ · D)^{−1} · B^⊤ · P + γ^{−2} · P · B · B^⊤ · P + C^⊤ · C = 0.   (2.31)
CHAPTER 3
OPTIMAL OBSERVATION
In the previous chapters, we discussed stability as a system property and how we can manage
to ensure that a system is asymptotically stable by computing a feedback law. The feedback,
however, is based on the state of the system x. Since typically not all states are actually measured
but instead only a restricted output y is known, the feedback cannot be evaluated.
To close this gap, in this chapter we shift our focus to the task of estimating the state x based
on the output y. Similar to the LQR approach from Section 2.1, the aim is to derive a method
that provides us with an optimal state estimation x̂(t) ≈ x(t) and can be applied in realtime. The
latter requirement rules out all aposteriori methods minimizing over given data sets, but instead
forces a recursive approach. Recursive means that estimates from previous time instances are
re-used and are updated using newly acquired output data. Such methods are typically referred to
as observers or filters.
Solution to Task 3.1: The estimate of the mean ŷ based on N outputs is given by
ŷ(N) = (1/N) · Σ_{j=1}^{N} y(j).
The difficulty now arises if another output is available and the mean computation shall be updated.
ŷ(N + 1) = (1/(N+1)) · Σ_{j=1}^{N+1} y(j).
In this solution, the previous result from Solution 3.1 is not used. While such an approach is nu-
merically robust and requires no further insight, it may be computationally expensive depending
on the number of samples and the complexity of the computation process. Hence, reformulating
the problem such that only the newly required calculations are made, recuperating all the previous
results, may allow us to generate a more efficient solution method.
Solution to Task 3.3: To recuperate the previous sum, we can equivalently evaluate
ŷ(N + 1) = (1/(N+1)) · Σ_{j=1}^{N} y(j) + (1/(N+1)) · y(N + 1)
          = (N/(N+1)) · ŷ(N) + (1/(N+1)) · y(N + 1).
Although this form already meets our requirements of reusing previous computations, it is
possible to rearrange it to a more suitable expression:
ŷ(N + 1) = ŷ(N) + (1/(N+1)) · (y(N + 1) − ŷ(N))
Although this expression is very simple, it is very informative because almost every recursive
algorithm can be reduced to a similar form. Based on the latter, the following observations can
be made:
The new estimate ŷ(N + 1) equals the old estimate ŷ(N) plus a correction term, that is
(1/(N+1)) · (y(N + 1) − ŷ(N)).

The correction term consists of two parts: a gain factor 1/(N+1) and an error term.
The gain factor decreases towards zero as more outputs are already accumulated in the
previous estimate. This means that in the beginning of the experiment, less importance is
given to the old estimate ŷ( N ), and more attention is paid to the new incoming outputs.
When N starts to grow, the error term becomes small compared to the old estimate. The
algorithm relies more and more on the accumulated information in the old estimate ŷ( N )
and it does not vary it that much for accidental variations of the new outputs. The additional
bit of information in the new output becomes small compared with the information that is
accumulated in the old estimate.
The second term y( N + 1) − ŷ( N ) is an error term. It incorporates the difference between
the predicted value of the next output on the basis of the model and the output y( N + 1).
When properly initiated, i.e. ŷ(1) = y(1), this recursive result is exactly equal to the non
recursive implementation. However, from a numerical point of view, it is a very robust
procedure as calculation errors etc. are compensated in each step.
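The recursion is easy to implement and to compare against the batch mean. A minimal Python/NumPy sketch with synthetic output samples:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=0.5, size=100)   # synthetic output samples

y_hat = y[0]                                   # proper initialization: y_hat(1) = y(1)
for N in range(1, len(y)):
    # new estimate = old estimate + gain * error
    y_hat = y_hat + (y[N] - y_hat) / (N + 1)

print(np.isclose(y_hat, np.mean(y)))           # recursive and batch mean coincide
```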
together with the known control u(t), t ≥ 0, given outputs y(t), t ≥ 0 and an estimate x̂0 of the
unknown initial state x0 .
Depending on the time instant of interest τ, we can classify the following problem classes: prediction (τ > t), filtering (τ = t) and smoothing (τ < t).
Within this chapter, we are interested in computing realtime estimates, i.e. τ = t and therefore
work in the area of filtering problems. To solve the latter we apply the ansatz using the so called
estimator dynamics:
Based on the latter, we can quantify the mismatch between estimator and true system:
Similar to the optimal control problem, we can now define the optimal estimation problem. Yet, in
contrast of finding an optimal input u(·), we aim to find an estimator x̂(·) such that the estimated
error (3.2) becomes as small as possible in the sense of a key performance indicator. Moreover,
at time t the estimator shall be computable based on outputs y(τ ), 0 ≤ τ ≤ t known at time t
only.
Similar to the cost function for the control problem where the idea of the cost is to induce stability
via null-controlling, we formulate a cost function for the estimator using the error function. Here,
the idea is to use the null-controlling property to enforce stability of the error function and thereby
convergence of the estimator.
J(x_0, u) := ∫_0^∞ ℓ(e(t, x̂_0), u(t)) dt   (3.3)
cost functional.
This gives us
min J(x_0, u) = ∫_0^∞ ℓ(e(t, x̂_0), u(t)) dt   over all u ∈ U   (3.4)
subject to e(t, x̂_0) := x̂(t, x̂_0, u) − x(t, x_0, u)
           ẋ(t) = f(x(t), u(t), t),   x(t_0) = x_0
           d/dt x̂(t) = f(x̂(t), u(t), t) + d(t),   x̂(0) = x̂_0
Note that we can use this problem to directly transfer the null controlling property from Corol-
lary 2.6 for stability to observability. In this case, not the system but the error function of the
estimation is stabilized.
The latter result suggests that the solution of the optimal estimation problem from Definition 3.8
could be identical to the optimal control problem from Definition 2.4. Unfortunately, there are
some slight differences:
1. In the optimal control problem, we consider the state to be stabilized, while in the optimal
estimation problem the error needs to be stabilized.
2. The solution computed by the optimal control problem is the control strategy, which in the
LTI case can be evaluated by a linear feedback law. For the optimal estimation problem,
we aim to compute the current state of the system.
3. Last, the given data for the optimal estimation problem stems from past measurements,
which cannot be used in the formulation of the optimal estimation problem.
In the following, we will address the integration problem of measurements from the past by
converting the optimal estimation problem. Then, similar to LQR, our aim now is to derive a
problem, for which the null controlling property can be shown.
with estimator
we obtain:
error dynamics.
Remark 3.11
The error dynamics are the dual wrt. the LTI system, cf. Definition 1.31. Hence, stability of the
dual system gives us observability of the primal system.
As a consequence, all the following computations can only be executed if the system ( A, C ) is
observable. Otherwise, no solution can be computed.
Based on the error dynamics, we can integrate the measurements, which are available for past
time instances. Hence, the cost functional we design aims to drive the error to zero but operates
on a time frame, which leads up to the current time instant.
J(x_0, d) := ∫_{−∞}^{τ} (C · e(t) − y_e(t))^⊤ · Q̂ · (C · e(t) − y_e(t)) + d(t)^⊤ · R · d(t) dt   (3.8)

quadratic cost functional for observability where Q̂ ∈ R^{n_y×n_y} and R ∈ R^{n_u×n_u} are
(semi)positive definite matrices.
In order to convert the cost functional (3.8) to be in the form (3.3), we apply the following:
J^τ(x_0, d) := ∫_0^∞ (C · e^τ(t) − y_e^τ(t))^⊤ · Q̂ · (C · e^τ(t) − y_e^τ(t)) + d(t)^⊤ · R · d(t) dt   (3.11)

min J^τ(x_0, d) := ∫_0^∞ (C · e^τ(t) − y_e^τ(t))^⊤ · Q̂ · (C · e^τ(t) − y_e^τ(t)) + d(t)^⊤ · R · d(t) dt
over all x_0 ∈ X   (3.13)
subject to ė^τ(t) = −A · e^τ(t) − d(τ − t),   e^τ(t_0) = x_0
           y_e^τ(t) = C · e^τ(t)
V(e) = (e^τ)^⊤ · P · e^τ   (3.14)

ė^τ(τ) = A · e^τ(τ) + L · (C · e^τ(τ) − y_e(τ))   (3.15)

L := −S · C^⊤ · Q̂   (3.16)

A · S + S · A^⊤ − S · C^⊤ · Q̂ · C · S + D · R^{−1} · D^⊤ = 0   (3.17)

and the value function of the optimal estimation problem is given by (3.14) with P := S^{−1}.
Remark 3.16
In (3.15) we obtain the identical structure of the observer, which we designed in Task 3.3 for the
mean value update. For this reason, L is also called gain matrix.
Note that again a solution P of the dual Riccati equation (3.17) is not unique, yet there exists at
most one semi positive definite S. Combining the latter results, we obtain the following procedure
to compute the Kalman filter:
A · S + S · A^⊤ − S · C^⊤ · Q̂ · C · S + D · R^{−1} · D^⊤ = 0

L := −S · C^⊤ · Q̂.
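Numerically, the filter Riccati equation (3.17) is again a standard algebraic Riccati equation, now posed for the pair (A^⊤, C^⊤). A Python/SciPy sketch with illustrative matrices (the choices of Q̂, R and the disturbance input matrix D are assumptions for the example):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0],
              [-2.0, -0.5]])
C = np.array([[1.0, 0.0]])          # only the first state is measured
D = np.eye(2)                        # disturbance input matrix (assumed)
Q_hat = np.array([[10.0]])           # output weight
R = np.eye(2)                        # disturbance weight

# (3.17) is the ARE for (A^T, C^T): A S + S A' - S C' Q_hat C S + D R^{-1} D' = 0
S = solve_continuous_are(A.T, C.T, D @ np.linalg.inv(R) @ D.T, np.linalg.inv(Q_hat))
L = -S @ C.T @ Q_hat                 # gain matrix as in (3.16)
print(np.linalg.eigvals(A + L @ C).real < 0)   # estimation error dynamics are stable
```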
In practice, a Kalman filter is typically updated periodically, i.e. a dynamic for computing the
ansatz matrix S is applied to integrate newly obtained knowledge of outputs. S is also called
covariance matrix of the system. In the literature, the dynamic of this matrix is split into an
a priori and an a posteriori covariance update as well as a prediction and a correction step of the
error dynamics, cf., e.g., [4]. As we focus on continuous time dynamics, this separation is beyond
the scope of the lecture.
Part II.
Nonlinear systems
CHAPTER 4
DIGITALIZATION
To deal with nonlinear systems, we follow a so called direct approach, which is quite different
from the direct approach we considered in Control engineering 2. Instead of analytically or
structurally dealing with the system or its solution, we first transfer the problem into the sphere
of digital control problems and then apply optimization to compute a control strategy.
In the present chapter, we focus on the first step and digitize the control system. Here, we follow
the most simple approach and consider a so called zero order hold. At this point, we already like
to stress that by definition such a control is not Lipschitz continuous. Hence, the feedback will be
very different from the ones we considered in Control engineering 2 and in particular will not be
in the form of a function. Moreover, we don’t aim to compute a feedback which is stabilizing for
all possible digitizations. Instead, we suppose a sampling to be given and then derive a stabilizing
controller.
To conclude stability of the original system, in the present chapter we additionally discuss how
stability of the digital feedback can be guaranteed for the original system as well. Throughout the
nonlinear part of the lecture, we focus on systems of the form
Remark 4.1
There are two more general cases: For one, the sampling times may be defined by a function of
time, or secondly, the sampling times can be defined by a function of states. The first one is common
in prediction and prescription of systems where actions in the far future are significantly less
important. Hence, one typically trades off exactness of the prediction against computational
complexity. The latter case is referred to as event driven control.
We still like to stress that in applications, the choice of T is not fixed right from the beginning,
but depends on the obtainable solution and stability properties. Note that the result of sampling
the control is not a discrete time system (see Definition 1.7), but a continuous time system (see
Definition 1.5) where the input u is of zero order hold. More formally, we formulate zero order
hold input and solution as a parametrization of operators with respect to T.
Remark 4.3
We like to point out that higher order holds are possible as well. In practice, however, such
higher order holds are not defined using a polynomial approximation but via additional differen-
tial equations for the control itself.
As a consequence of the latter definition, the input u T is not continuous but instead exhibits jumps
at the sampling times tk , cf. Figure 4.1.
[Figure 4.1: A continuous input and its sampled zero order hold counterpart over time t]
Still, the function is integrable, which is a requirement for existence of a solution of (1.2) for such
an input. This observation directly leads to the following:
In order to compute such a solution, we can simply concatenate solutions of subsequent sampling
intervals [tk , tk+1 ). Here, we can use the endpoint of the solution on one sampling interval to be
the initial point on the following one. Hence, the computation of x T is well defined, cf. Figure 4.2
for an illustration.
Remark 4.5
Since the system is Lipschitz continuous on each interval [tk , tk+1 ), the solution is also unique.
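A zero order hold solution can be computed exactly in this concatenated fashion by restarting a numerical integrator at every sampling instant. A Python/SciPy sketch for an illustrative nonlinear system, an assumed sampling period T and an arbitrary input sequence:

```python
import numpy as np
from scipy.integrate import solve_ivp

T = 0.5                                                     # sampling period (assumed)
f = lambda t, x, u: np.array([x[1], -np.sin(x[0]) + u])     # pendulum-like dynamics

def zoh_solution(x0, u_sequence, T):
    """Concatenate solutions over the sampling intervals [t_k, t_{k+1})."""
    x = np.array(x0, dtype=float)
    trajectory = [x]
    for k, u_k in enumerate(u_sequence):
        sol = solve_ivp(f, (k * T, (k + 1) * T), x, args=(u_k,), rtol=1e-8)
        x = sol.y[:, -1]                 # endpoint becomes the next initial point
        trajectory.append(x)
    return np.array(trajectory)

print(zoh_solution([1.0, 0.0], [0.0, -0.5, -0.5, 0.0], T))
```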
[Figure 4.2: Zero order hold control u_T and corresponding solution x_T over time t]
Hence, identifying endpoint and initial point of subsequent sampling intervals is sufficient to
show that the zero order hold solution is unique. Yet, as a consequence of this concatenation, the
solution is not differentiable at the sampling points tk .
Remark 4.6
Note that despite u_T being piecewise constant, the zero order hold solution does not exhibit jumps
and shows nonlinear behavior.
holds for all t > 0 and all initial values satisfying ∥x_0∥ ≤ R.
Again, the main difference between our setting here and in Control engineering 2 is that we don't aim
to compute a feedback which is stabilizing for all T ∈ (0, T ⋆ ]. Instead, we suppose a sampling
to be given and then derive a stabilizing controller.
Remark 4.8
The term „semiglobal“ refers to the constant R, which limits the range of the initial states for
which stability can be concluded. The term „practical“ refers to the constant ε, which is a
measure on how close the solution can be driven towards the operating point before oscillations
as in the case of the bang bang controller occur.
Different from the linear case where existence of a feedback and a feed forward control are
equivalent, in the nonlinear case we only have the following:
As a direct conclusion of Definition 4.7, we can apply Lemma 4.9 and obtain:
Definition 4.7 also shows the dilemma of digital control using fixed sampling periods: Both close
to the desired operating point and for initial values far away from it, the discontinuous evaluation
of the feedback u T leads to a degradation of performance. Close to the operating point, a slow
evaluation leads to overshoots despite the dynamics typically being rather slow. Far away from
the operating point, the dynamics is too fast to be captured in between two sampling points which
leads to unstable behavior.
Still, it may even be possible to obtain asymptotic stability (not only practical asymptotic stability)
using fixed sampling periods T as shown in the following task:
Task 4.11
Consider the system
ẋ_1(t) = (−x_1(t)² + x_2(t)²) · u(t)
ẋ_2(t) = (−2 · x_1(t) · x_2(t)) · u(t).
Design a zero order hold control such that the system is practically asymptotically stable.
For this choice, the system is globally asymptotically stable for all T > 0 and even inde-
pendent from T. The reason for the latter is that the solutions never cross the switching line
x1 = 0, i.e. the input to be applied is always constant, which leads to independence of the
feedback from T.
As described before, the behavior observed in Task 4.11 is the exception. In practice, the limitations
of semiglobality and practicality are typically the best we can expect for zero order hold inputs
of nonlinear systems.
The latter definition extends the concept of a Control-Lyapunov function in various ways. For
one, as the zero order hold solution is not differentiable, we can no longer assume VT to be
differentiable. Hence, the formulation of decrease in energy in inequality (4.5) is given along
a solution instead of its vector field. Moreover, the ideas of semiglobality and practicality are
integrated.
Remark 4.13
Comparing Definition 4.12 to Definition 4.7, we can identify the similarity of semiglobality be-
tween the constants R and R̂ as well as ε and ε̂. The difference between these two pairs lies in
their interpretation: For KL function, we utilize the state space, whereas for Control-Lyapunov
functions the energy space is used. Hence, both values are a transformation of one another using
the Control-Lyapunov function VT .
Now, supposing that a practical Control-Lyapunov function exists, we can directly derive the
existence of a zero order hold control.
Note that in (4.6), the right hand side depends on u implicitly as x T (tk+1 ) is defined using the
initial value x T (tk ) and the zero order hold input u. Hence, the definition (4.6) is proper.
Remark 4.15
The transfer from infimum in (4.5) to minimum in (4.6) is only possible as the input is constant in
between two sampling instances tk and tk+1 and therefore the solution x T (·) is continuous with
respect to u.
Unfortunately, the pure existence of a feedback does not help us in computing it. Additionally,
we still require the existence of a practical Control-Lyapunov function to conclude existence of
such a feedback. Here, we first address existence of a Control-Lyapunov function, for which the
following is known from the literature:
The most important aspect of Theorem 4.16 is the requirement regarding the control system. The
result only requires the system to be asymptotically controllable, i.e. without digitalization.
for all t ∈ [0, T ] then the solutions are called uniformly bounded over T.
Using boundedness, it can be shown that the system will stay bounded in between sampling
instances.
In practice, however, the two tasks of deriving feedback u T and Control-Lyapunov function VT
are often done in the inverse sequence. To this end, first a feedback u T is derived, and then the
inequality (4.5) is shown to hold for this feedback.
The reason for using such a procedure is that Theorem 4.14 only requires a Control-Lyapunov
function for fixed R̂, ε̂ to exists for some T0 > 0 in order to conclude existence also for all
smaller sampling periods. Hence, if we find a constructive way to derive a feedback, then a
practical Control-Lyapunov function can be derived and stability properties of this feedback can
be concluded.
In the following chapters, we now focus on constructing such a feedback. To simplify the respec-
tive notation, we utilize the discrete time notation
x ( k + 1) = f ( x ( k ), u ( k ), k ), x (0) = x0
(1.3)
y ( k ) = h ( x ( k ), u ( k ), k ).
introduced in Definition 1.7. To this end, we assume that the differential equation is solved to
compute the state x(k + 1) based on the continuous time dynamics (1.2) and the zero order hold
control u(t) := u T (t) =: u(k ).
CHAPTER 5
MODEL PREDICTIVE CONTROL
Based on the previous Chapter 4 on digitalization, we now discuss one approach to compute a
zero order hold feedback for a nonlinear system. The approaches we considered so far are based
on the analytical solution of an optimal control problem using the Riccati approach for a quadratic
optimal value function ansatz V(x) = x^⊤ · P · x. However, as soon as the cost is nonquadratic,
the dynamics are nonlinear or state and control constraints are introduced, the value function V
is no longer quadratic and the approach is in general no longer possible. The same holds for the
optimal feedback law, which is typically a rather complicated function for which already the
storage poses problems and limits such approaches to low dimensions. Moreover, the approach is
only capable of computing a Lipschitz continuous feedback. Yet if no continuous feedback exists,
by controllability we know that some kind of control exists, for which stability can be shown, e.g.
a discontinuous one.
The model predictive control approach takes a step back from optimality over an infinite horizon
by approximating it via a series of finite horizon problems. The purpose of the present chapter is
twofold: For one, we discuss the construction of a basic MPC algorithm and the interplay of the
building blocks as outlined in Figure 5.1. Thereafter, we show how a feedback can be constructed
from such an approach and how stability of the closed loop can be guaranteed.
[Figure 5.1: Building blocks of MPC — digitalization, simulation and optimization]
unbounded. In practical applications, however, we often face the problem that requirements need
to be met. To illustrate this point, we consider the following:
Task 5.1
Consider a supply chain as multi stage network driven by the dynamics
where p ∈ S = {S, M, R} denotes the stages, cf. Figure 5.2. Typically, the stage set
contains supplier (S), manufacturer (M) and retailer (R). Moreover, a p , ℓ p , o p and d p
denote the arriving and leaving as well as the order and demand rates. Formulate the basic
constraints such a system needs to obey in order to be physically meaningful.
Solution to Task 5.1: For all times t ≥ 0 and stages p ∈ S , the system is subject to the
constraints
0 ≤ o^p(t) ≤ o^p_max,   0 ≤ s^p(t) ≤ s^p_max,
0 ≤ o_u^p(t) ≤ o^p_{u,max},   0 ≤ b^p(t) ≤ b^p_max
as well as unknown costumer orders oC and fixed delivery delays τij , where i, j ∈ S represent
consecutive stages. The stages need to be linked since arrival/leaving as well as demand/order
[Figure 5.2: Sketch of a three stage supply network — supplier, manufacturer, retailer and customer with arrival rates a^p, leaving rates ℓ^p, order rates o^p, demand rates d^p and delivery delays τ_SM, τ_MR, τ_RC, τ_SS]
information is required to evaluate the dynamics. Here, we use a^j(t + τ_ij) = ℓ^i(t) and
d^j(t) = o^i(t) for consecutive nodes i, j ∈ S and a^i(t + τ_ii) = o^i(t) for the supplier to define
these connections. The state for each stage can be defined via x^p := (s^p, o_u^p, b^p)^⊤.
Hence, constraints arise naturally in practical problems as states need to be bounded, e.g. to
prevent the system from collapsing or hitting physical barriers, or the controls need to be bounded,
e.g., for energy reasons or actuator limitations, or outputs need to be bounded, e.g., due to sensor
limitations. To address these requirements formally, we define constraints for our system as
follows.
We like to stress that constraints are always causing trouble in numerical computations. For this
reason, in many applications constraints are not formulated „hard“, that is as constraints that must
be satisfied, but instead as „soft“ by adding them as KPI to the cost function by penalizing the
violation of constraints.
Remark 5.3
Note that by definition soft constraints may be violated. Hence, such an approach is not applicable
for safety critical constraints.
Alternatively, modelers can focus on circumventing the usage of constraints as outlined in the
following task:
Task 5.4
Model cars going from an initial point x0 ∈ R2 to a target point x⋆ ∈ R2 via routing points
x j ∈ R2 , j = 1, . . . , M as illustrated in Figure 5.3 using a one dimensional system only.
Figure 5.3.: Definition of the driving path via splines for given routing points
Solution to Task 5.4: Define the route of each vehicle via routing points via interpolation
by splines. The car is then controlled along the arc of the spline. Then, we create a one
dimensional dynamics via the velocity along the arc length as a control.
To formalize this approach, we call M ∈ N the number of routing points. Denoting the
entire arc length by L, the routing points are interpolated via the cubic spline

S(ℓ) = (S_x(ℓ), S_y(ℓ))^⊤,   0 ≤ ℓ ≤ L,
which is parametrized by ℓ representing the position on the arc. The arc length is approxi-
mated by
ℓ_0 := 0,   ℓ_{j+1} := ℓ_j + √((x_{j+1} − x_j)² + (y_{j+1} − y_j)²),   L := ℓ_M.
The spline gives us the route of each car, and its velocity is the time derivative of the current
position on the arc. Hence, driving along the route is equivalent to solving the initial value
problem

ℓ̇(t) = u(t),   ℓ(0) = 0,
where t denotes time and u(t) represents the velocity of the car at time instant t. By choosing
the velocity u ∈ U we can control the car along the route. The corresponding position at
time instant t is given by
(x(ℓ(t)), y(ℓ(t)))^⊤ = (S_x(ℓ(t)), S_y(ℓ(t)))^⊤
Remark 5.5 Note that deriving the routing points in Task 5.4 is a different and decoupled
problem, which may be solved by a traffic guidance system. For simplicity, the center of
the traffic lane can be chosen. Regarding a production process or a single machine, these
routing points can be regarded as a feedforward control.
Instead of the velocity along the route, we could also use the acceleration or jerk. These
choices result in a differential equation of higher order. Additionally, the bounds on the
velocity are then state constraints, which drastically increase the complexity of the problem.
As mentioned before, we could also impose more complex models for each car and the
respective dynamics. However, these models would lead to an increase in the computational
cost. Since the modeled arcs are locally controlled by sublayer controllers of the car, these
arcs represent reality close enough. Hence, such an approach is more efficient.
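A minimal Python/SciPy sketch of this reduction (the routing points are chosen arbitrarily; the chord lengths approximate the arc length as above):

```python
import numpy as np
from scipy.interpolate import CubicSpline

# illustrative routing points (x_j, y_j)
pts = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 0.0], [3.0, 1.0]])

# approximate arc length parameter l_0, ..., l_M
ell = np.concatenate([[0.0], np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))])
L = ell[-1]

Sx = CubicSpline(ell, pts[:, 0])     # S_x(l)
Sy = CubicSpline(ell, pts[:, 1])     # S_y(l)

# one dimensional dynamics: dl/dt = u(t); Euler steps with a constant velocity
T, u, l = 0.1, 0.8, 0.0
for _ in range(20):
    l = min(l + T * u, L)
print(Sx(l), Sy(l))                  # position of the car on the route
```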
In Chapter 2, we introduced the optimal control problem in its general form, which we later specified to LTI systems to discuss the LQR, H2 and H∞ controllers.
Formally, we obtain
min J(x_0, u) = ∫_0^∞ ℓ(x(t, x_0, u), u(t)) dt   over all u ∈ U^∞   (5.1)
subject to ẋ(t) = f(x(t), u(t), t),   x(t_0) = x_0
           x(t) ∈ X,   t ∈ [0, ∞)
Since the continuous time formulation allows for infinitely many control changes, it is not only
computationally difficult or intractable to solve. Additionally, actuators work in a sampled man-
ner, hence such a control is practically also not usable. To address these issues, we apply the
following adaptations:
By applying digitalization, we can shift the problem to the discrete time formulation solv-
ing the sampling issue. Moreover, digitalization allows us to decouple optimization and
simulation.
Cutting the infinite horizon to a finite one allows us to address the computational issue. For
one, simulation techniques for digitalized or discrete time systems are very effective, and
secondly, optimization methods for finitely many inputs are well developed.
These are the ingredients linked in Figure 5.1, which allow us to divide the control problem (5.1)
accordingly. To formalize this procedure, we first introduce the following:
min J(x_0, u) = Σ_{k=0}^{N−1} ℓ(x(k, x_0, u), u(k))   over all u ∈ U^N   (5.3)
subject to x(k+1) = f(x(k), u(k), k),   x(0) = x_0
           x(k) ∈ X,   k ∈ [0, N]
While the problem is solvable now, it does not give us a solution of the original problem. To still
be able to at least approximate such a solution, MPC can be used. The idea of MPC is to split up
the problem over time and only consider time windows, for which the problem is to be solved.
This goes hand in hand with the digitalization idea and the time windows are constructed such
that each window starts at a sampling instant. To capture long term system behavior, the length of
the time windows is longer than one sampling period and measured in multiples of the sampling
period. As the time windows solution is longer than required, only a fraction of the solution is
applied.
Combined, MPC is a three step scheme:

(1) Measure the current state x(n) of the system at closed loop time index n.

(2) Set x0 := x(n), solve the digital finite optimal control problem (5.3) and denote the ob-
tained optimal control sequence by u⋆ (·) ∈ U N .

(3) Apply the first element of u⋆(·) as input during the next sampling period, set n := n + 1 and go to (1).
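A compact Python/SciPy sketch of this loop for the sampled car model of Task 5.15 below (the quadratic stage cost is an assumption for the example; only the input bound u ∈ [−1, 1] is imposed and state constraints are omitted for brevity; the finite horizon problem (5.3) is solved with a generic NLP solver):

```python
import numpy as np
from scipy.optimize import minimize

A = np.array([[1.0, 1.0], [0.0, 1.0]])    # car model: position, speed
B = np.array([[0.5], [1.0]])
N = 5                                      # prediction horizon

def stage_cost(x, u):
    return float(x @ x + 0.1 * u * u)      # assumed quadratic running cost

def open_loop_cost(u_seq, x0):
    x, J = np.array(x0), 0.0
    for u in u_seq:
        J += stage_cost(x, u)
        x = A @ x + B.flatten() * u
    return J

def mpc_feedback(x0):
    # step (2): solve the finite horizon problem with input constraints u in [-1, 1]
    res = minimize(open_loop_cost, np.zeros(N), args=(x0,),
                   bounds=[(-1.0, 1.0)] * N, method="SLSQP")
    return res.x[0]                        # step (3): apply only the first input

x = np.array([0.5, 0.3])
for n in range(20):                        # closed loop: measure, optimize, apply
    u = mpc_feedback(x)
    x = A @ x + B.flatten() * u
print(x)                                   # state is driven towards the origin
```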
While easily accessible and adaptable, the method behind Algorithm 5.9 exhibits some severe
flaws that need to be considered before putting it into practice:
1. Cutting the horizon to N < ∞ may result in infeasibility of the problem at closed loop time
indexes n > 0. A simple example is a car driving towards a wall. If the prediction horizon
is too small, the car is unable to stop before hitting the wall. Mathematically speaking, no
solution can be found satisfying all constraints. We address this issue in Section 5.3 and
show how feasibility can be guaranteed recursively.
2. Cutting the horizon may also result in destabilizing the system. Again, we can use the
car/wall example and put the target point behind the wall, i.e. the car needs to go around
the wall. If the wall is long compared to the prediction horizon, the car will not be able to
„see“ a possibility of going around the wall and stop in front of it. Hence the system is not
asymptotically stable. In Section 5.4, we address this issue using three different strategies.
3. Even if the problem (5.3) is feasible at closed loop index n, this does not in general
guarantee that the subsequent problem (5.3) at closed loop index n + 1 exhibits a solution.
Remark 5.10
At this point, we want to stress the fact that loss of feasibility is due to the method of MPC,
i.e. the cutting of the horizon. This problem does not exist for the original constrained optimal
control problem (5.1). However, if the latter does not exhibit a solution, then it is not possible to
approximate such a non-existing solution using MPC.
The first property is referred to as feasibility, the second as recursive feasibility. To formalize
these properties, we first introduce the following:
The states x ∈ X are called admissible states and the inputs u ∈ U(x) are called admissi-
ble inputs for x. The elements of the set {(x, u) | x ∈ X, u ∈ U(x)} are called admissible
pairs.
For N ∈ N and initial value x0 ∈ X we call an input sequence u ∈ U N and the corre-
sponding trajectory xu (k, x0 ) admissible for x0 up to time N if
the running constraints u(k) ∈ U(x_u(k, x_0)) for k = 0, ..., N − 1 and the terminal constraint

x_u(N, x_0) ∈ X

are satisfied.
An input sequence u ∈ U^∞ and the corresponding trajectory x_u(k, x_0) are called admissible
for x_0 if they are admissible for x_0 up to every time N ∈ N. We denote the set of admissible
input sequences for x_0 by U_X^∞(x_0).
We like to note the slight difference between U and U^1(x): By definition of admissibility for x
up to time 1, we have that
This is essential especially for our definition of an admissible feedback, which ensures exactly
that.
Remark 5.12
Note that even if U(x) = U is independent of the actual state x, the set U N (x) may still depend
on x for some or all N ∈ N.
The property of admissibility is defined on sequences of states and inputs, yet not on the problem.
We now use admissibility to formalize the problem property of feasibility:
In order to guarantee that Algorithm 5.9 is recursively feasible, the so called viability assumption
can be used.
If the viability assumption (5.4), i.e. for each x ∈ A there exists u ∈ U(x) such that f(x, u) ∈ A, holds, then the MPC Algorithm 5.9 is recursively feasible on A and the pairs (x_{µ_N}(n), µ_N(x_{µ_N}(n))) as well as the feedback µ_N are admissible for all n ∈ N.
We like to point out that the viability assumption (5.4) looks simple, yet in practice it is rather
difficult to identify the set A.
Task 5.15
Consider the sampled-data model of a car
x(k+1) = \begin{pmatrix} x_1(k) + x_2(k) + u(k)/2 \\ x_2(k) + u(k) \end{pmatrix}
on a one dimensional road with position x1 , speed x2 and piecewise constant acceleration u.
Assume all variables to be constrained to [−1, 1]. Compute the set A.
Solution to Task 5.15: Using the dynamics and the extreme values x_1 = x_2 = 1 we obtain x_1(k+1) = 1 + 1 + u(k)/2 ≥ 3/2 > 1 for any u ∈ U = [−1, 1]. Hence, such a state is not recursively feasible. Via elementary computations, we can define
A := { x ∈ R^2 | x_1 ∈ [−1, 1], x_2 ∈ [−1, 1] ∩ [−3/2 − x_1, 3/2 − x_1] }.
Figure 5.4 illustrates the viability condition (blue) in comparison to the state constraints (black),
where the difference occurs in the encircled regions (red).
[Figure 5.4: viability set A (blue) and state constraint set (black) in the (x_1, x_2)-plane; both coordinates range over [−1, 1].]
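As a complementary numerical check (an illustration added here; grid resolution, tolerance and helper names are choices made freely), we can verify on a grid that every state in A admits a control u ∈ [−1, 1] whose successor state again lies in A:

import numpy as np

TOL = 1e-9  # tolerance to avoid spurious boundary violations in floating point

def in_A(x1, x2):
    """Membership test for the set A from Task 5.15."""
    return (-1 - TOL <= x1 <= 1 + TOL) and (-1 - TOL <= x2 <= 1 + TOL) \
        and (-1.5 - x1 - TOL <= x2 <= 1.5 - x1 + TOL)

def has_admissible_successor(x1, x2, n_u=201):
    """Check whether some u in [-1, 1] keeps the successor state in A."""
    for u in np.linspace(-1.0, 1.0, n_u):
        x1p = x1 + x2 + u / 2.0   # position update
        x2p = x2 + u              # speed update
        if in_A(x1p, x2p):
            return True
    return False

# grid check: every state in A should admit a control keeping the state in A
grid = np.linspace(-1.0, 1.0, 81)
violations = [(a, b) for a in grid for b in grid
              if in_A(a, b) and not has_admissible_successor(a, b)]
print("states in A without admissible successor:", len(violations))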
In practice, we are interested in computing a feedback which is not only admissible, but also asymptotically stabilizes our system.
Remark 5.16
Note that as terminal conditions alter the problem, the solutions of the problem are in general
different.
x_u(N, x_0) ∈ X_0    (5.5)
The idea of terminal constraints is straightforward: By imposing a terminal constraint set, the set of admissible pairs is limited, i.e. the set of initial values and the controls to be chosen are reduced. Hence, it is no longer necessary to compute the set A from the viability condition; instead, it is implicitly imposed by choosing the right terminal conditions.
Remark 5.18
The right choice for terminal conditions can be made using ideas such as linearization around the operating point x⋆. From Control Engineering 2 we then know that there exists a linear feedback such that the terminal constraint set is recursively feasible.
Combining terminal constraints and the MPC algorithm, we obtain the following:
If we additionally know that for the region defined by the terminal constraint (5.5) there exists an
asymptotically stabilizing feedback, then the following can be concluded:
While being simple to use, the limitation of terminal constraints is the reduction of the set of admissible controls, which can only be counteracted by enlarging the prediction horizon N. Since the latter induces high computing times, it would be much simpler to increase the size of the terminal constraint set, which is at the center of the second approach.
Different from terminal constraints, the second approach appends a terminal cost to the cost function in problem (5.3). The intention is to enlarge the terminal constraint set by including the costs arising on the cutoff horizon [N, ∞). These terminal costs are defined as follows:
min J(x_0, u) = \sum_{k=0}^{N-1} ℓ(x_u(k, x_0), u(k)) + L(x_u(N, x_0))    (5.9)
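A common choice, sketched below purely for illustration, is a quadratic terminal cost L(x) = xᵀPx, where P solves the discrete-time algebraic Riccati equation of the linearization around the operating point; the matrices A, B, Q, R, the quadratic stage cost and the use of SciPy are assumptions made here and not taken from the notes:

import numpy as np
from scipy.linalg import solve_discrete_are

def terminal_cost_matrix(A, B, Q, R):
    """Quadratic terminal cost L(x) = x' P x via the discrete-time algebraic
    Riccati equation of the linearization (assumes (A, B) stabilizable)."""
    return solve_discrete_are(A, B, Q, R)

def cost_with_terminal(x_traj, u_traj, Q, R, P):
    """Finite-horizon cost in the spirit of (5.9) for quadratic stage costs
    l(x, u) = x'Qx + u'Ru plus terminal cost L(x_N) = x_N' P x_N."""
    J = 0.0
    for x, u in zip(x_traj[:-1], u_traj):
        J += x @ Q @ x + u @ R @ u
    xN = x_traj[-1]
    return J + xN @ P @ xN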
Again, we obtain asymptotic stability using the existence of an asymptotically stabilizing feed-
back in the terminal constraint set:
The last idea to guarantee asymptotic stability of the MPC closed loop utilizes a control-Lyapunov
function based approach. Here, we can directly utilize the MPC formulation to check the require-
ments of Definition 4.12 for a practical control-Lyapunov function:
If conditions (5.10)–(5.11) hold, then the MPC Algorithm is recursively feasible and asymptotically stabilizes the system (1.3).
The intention of the last approach is to avoid constructing terminal constraints or costs and to
also avoid alteration of the original control problem. While being technically simple to monitor,
conditions (5.10)–(5.11) are very hard to check analytically. For further details, we refer to [2].
From an energy point of view, the conditions of Theorem 5.25 state that energy is continuously drawn from the system, hence any trajectory is driven towards the operating point x⋆. Yet, the condition is not equivalent to the standard Lyapunov notion, which corresponds to α = 1. The parameter α can be interpreted as a measure of suboptimality, i.e. the tradeoff in optimality we have to accept for cutting the horizon and thereby making the problem computationally tractable.
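To illustrate how such a decrease condition can be monitored along the closed loop (a sketch under the assumption that the optimal value function V_N, the stage cost ℓ, the feedback µ_N and the dynamics f are available as callables; the concrete inequality is only an example of the monitoring idea, not a restatement of (5.10)–(5.11)):

def monitor_suboptimality(V_N, ell, mu_N, f, x, alpha=0.5):
    """Online check of a relaxed Lyapunov decrease condition along one
    closed-loop step (illustration only).  Returns True if the observed
    decrease of the value function covers alpha times the stage cost."""
    u = mu_N(x)                      # MPC feedback value at x
    x_next = f(x, u)                 # successor state under the feedback
    decrease = V_N(x) - V_N(x_next)  # observed decrease of the value function
    return decrease >= alpha * ell(x, u)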
CHAPTER 6
DISTRIBUTED CONTROL
So far, we considered systems and processes for which one control unit can be used. In practice, however, this may not always be possible. For one, a system may be too large or too complex, so that it needs to be split up into smaller but possibly connected problems. Examples of such systems are chemical plants, supply chains or production lines.
These problems also exist on a pure software level, e.g. in robotic process automation. Secondly,
there also exist problems which are naturally split. Such problems arise, e.g., if two units need to
work in a joint area. Examples range from autonomous cars to robots and companies working on
a seller/buyer basis.
In the present chapter, we focus on approaches using MPC to address such problems. Here, we
particularly focus on three ideas which allow us to split up or respectively keep the splitting while
still addressing the overall control problem. In Figure 6.1, we highlight that the connection we
seek is situated on the MPC level.
[Figure 6.1: coupling of the subsystems established at the MPC level]
To this end, we consider three basic approaches. The first approach follows a first-come-first-serve principle where one controller stakes its claim first and the others have to make do with the remaining opportunities. In the second approach, we highlight which of the systems do not interfere with one another,
i.e. which systems can actually work in parallel, and which cannot. The last and most insight-
ful approach considers a full parallelization of all systems. As we will see, the communication
requirements and also the information to be exchanged varies between these approaches.
Remark 6.1
At this point we like to stress that we focus on systems which are split in the space/control domain. It is also possible to tackle complexity via a timewise split of systems, where not the states but the prediction horizon of the system is separated. From a system theoretic point of view, the splits are rather similar, yet the timewise split requires a PDE perspective.
Instead of the general system
x(k+1) = f(x(k), u(k), k),  x(0) = x_0,
y(k) = h(x(k), u(k), k),    (1.3)
in this chapter we omit time variability and the output and consider a set of systems
x^p(k+1) = f^p(x^p(k), u^p(k), i^p(k)),  x^p(0) = x_0^p,    (6.1)
where p ∈ P := {1, . . . , P} denotes the index of the respective subsystem and states and controls
satisfy x p (k ) ∈ X p and u p (k ) ∈ U p . Within these subsystems, we introduce the variable
i p (k ) ∈ I p in (6.1). The latter will allow us to link the set of systems on all levels and is therefore
called neighboring data and neighboring data set respectively. Note that the set depends on the
chosen element p ∈ P and may also vary over time.
Within the lecture, we will solely focus on the case of splitting the dynamics of the system. In
general, however, an MPC problem additionally contains the elements of constraints and costs,
which can also be split. As the splits can be built up on the same idea we outline here, we refer
to [2] for details on the general split.
To illustrate the idea, we first consider the following example:
Task 6.2
Reconsider the example from Task 5.15 with dynamics
\begin{pmatrix} x_1(k+1) \\ x_2(k+1) \end{pmatrix} = \begin{pmatrix} x_1(k) + x_2(k) + u(k)/2 \\ x_2(k) + u(k) \end{pmatrix}
and split it into two subsystems.
Choosing the scalar subsystems x^1 := x_1 and x^2 := x_2 and assigning the control to the second subsystem, i.e. u^2 := u, subsystem 2 is independent from subsystem 1. However, to evaluate subsystem 1, the neighboring data i^1(k) is required to provide x^2(k) and u^2(k) from subsystem 2. Note that the connection depends on how the control input from the overall system is assigned to the subsystems. Setting u^1 = u and leaving u^2 undefined instead, both subsystems depend on each other.
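The split from Task 6.2 can be written down directly; the following sketch (function names chosen here for illustration) assigns the control to subsystem 2 and transports x_2(k) and u(k) to subsystem 1 via the neighboring data i^1(k):

def f1(x1, i1):
    """Subsystem 1 (position): needs neighboring data i1 = (x2, u)
    from subsystem 2 to evaluate its dynamics."""
    x2, u = i1
    return x1 + x2 + u / 2.0

def f2(x2, u2):
    """Subsystem 2 (speed): independent of subsystem 1."""
    return x2 + u2

# one step of the split system, assuming u is assigned to subsystem 2
x1, x2, u = 0.0, 0.5, -0.2
x2_next = f2(x2, u)
x1_next = f1(x1, i1=(x2, u))   # i1(k) transports x2(k) and u(k)
# recombining the subsystems reproduces the overall dynamics (1.3)
assert (x1_next, x2_next) == (x1 + x2 + u / 2.0, x2 + u)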
The aim of a split is that by recombining the subsystems (6.1) we reobtain the overall system (1.3).
Now we can use the decomposition to rewrite our overall system into subsystems defined on subspaces. In particular, we require two projections for all p ∈ P, that is
π_X^p : X → X to split the state set such that Im(π_X^p) = X^p, and
π_U^p : U → U to split the control set such that Im(π_U^p) = U^p.
Unfortunately, these projections will in general not simply separate the state and control sets. We already saw the reason for this deficiency in Task 6.2: Subsystem dynamics may depend on variables which we project into other subsystems. Hence, the projections in general leave us with three components each, that is:
For the state projection, we obtain [X^p, X̃^p, X̄^p] where x^p ∈ X^p are our primary variables of interest. In particular, x̃^p ∈ X̃^p are the states of neighbors necessary to evaluate the projected dynamics π_X^p ∘ f correctly.
For the control projection, we have [U^p, Ũ^p, Ū^p] where again u^p ∈ U^p is at the core of our interest. Again, ũ^p ∈ Ũ^p is the necessary control information of neighbors to evaluate the projected dynamics π_X^p ∘ f.
Remark 6.5
Note that the controls ũ^p ∈ Ũ^p are computed by different controllers. Hence, to include them in the evaluation of another system, we have to transmit the respective data.
Different from X̃^p and Ũ^p, we find that π_X^p ∘ f is independent of x̄^p ∈ X̄^p and ū^p ∈ Ū^p. For this reason, we call the latter independent states and controls.
Remark 6.6
In programming, x^p(k) ∈ X^p and u^p(k) ∈ U^p are called local or private variables, whereas x̃^p(k) ∈ X̃^p, x̄^p(k) ∈ X̄^p, ũ^p(k) ∈ Ũ^p and ū^p(k) ∈ Ū^p are termed interface or public variables.
Based on X̃^p and Ũ^p we can identify which information is required, and in particular from which subsystem this information is required. This reveals
(X^{p_1} × · · · × X^{p_m}) × (U^{p_1} × · · · × U^{p_m}) ⊃ (X̃^p × Ũ^p).    (6.3)
Here, we like to stress that the above definition allows us to simply define all systems as part of
the index set. However, regarding bandwidth constraints, it is typically a good idea to keep these
sets as small as possible. The respective data is called neighboring data:
Task 6.9
Reconsider Task 6.2 and compute neighboring index set and neighboring data.
Solution to Task 6.9: For our choice of variables we have I^1(k) = {2} and I^2(k) = ∅. As we have seen in the solution of Task 6.2, we require the information contained in the neighboring data i^1(k) = (2, k, x^2(k), u^2(k)) to evaluate the system.
We like to highlight that the immediate information as in Task 6.9 is not sufficient for running
an MPC. To compute a respective trajectory, we require the state and control trajectories of those
subsystems in the neighboring index set.
Remark 6.10
For simplicity, we assume that the prediction horizon length will always be identical. Hence, we
do not include respective information in the neighboring data. Generalizations of this assumption are possible but require a neighboring data set that additionally contains this horizon information.
While the modification of the data to be transmitted is simple, the adaptations in the algorithms
and in the stability concepts are quite involved.
Using the construction via neighboring data, we directly obtain the following:
for permutations σ_X^p : X → X^p × X̃^p × X̄^p with σ_X^p(x) = (x^p, x̃^p, x̄^p) and σ_U^p : U → U^p × Ũ^p × Ū^p with σ_U^p(u) = (u^p, ũ^p, ū^p) for all p ∈ P.
Coming back to our definition of the neighboring index set, we see that the choice of the projections is not fixed, yet it is advisable to keep the neighboring index set as small as possible. Moreover, the subsystems do not depend on the subspaces X̄^p, Ū^p, which should therefore be maximized to reduce computational complexity.
As outlined at the beginning of this section, the projection approach can also be applied to the
components of costs and constraints of the MPC problem.
Remark 6.12
We like to note that in the case of constraints, the sets of costates and independent states as well as cocontrols and independent controls obtained from the projection depend on the overall system state x ∈ X.
min J^p(x_0^p, u^p) = \sum_{k=0}^{N-1} ℓ^p(x^p(k, x_0^p, u^p), u^p(k))  over all u^p ∈ U^{p,N}_{X^p}(x_0^p)    (6.7)
subject to x^p(k+1) = f^p(x^p(k), u^p(k)),  x^p(0) = x_0^p,
x^p(k) ∈ X^p,  k ∈ [0, N]
This algorithm will be the basis for our discussion regarding how to coordinate the subsystems
and subsystem computations in the following section.
Remark 6.15
We like to note that there exists a variety of problems that fall under the scope of so-called distributed problems. On the extreme ends, there are centralized and decentralized problems. The first represents the case where only one system is considered (equivalently, all systems are combined into one big system). The latter is the case where the systems are completely disconnected, i.e. for system p the variables of all other systems q ∈ P are independent variables. In between, we distinguish between so-called cooperative and noncooperative settings, the first characterized by identical KPIs, while for the second the KPIs may differ.
The basic assumption which we have to make for any of the following approaches is to impose
feasibility of Algorithm 6.14:
The latter assumption makes sure that we can always apply Algorithm 6.14. Hence, we have
recursive feasibility:
Unfortunately, it is not clear in general when this assumption holds true. Here, we focus on a few less general cases, for which the latter can be shown.
Remark 6.18
The order of systems is a free choice within this setting.
As indicated by the red lines within the figure, the only difficulty arising is that information transmitted from a subsystem x^p at time instant n can only be used by subsystems x^q with q < p at the subsequent time instant n + 1.
Remark 6.19
For a typical MPC implementation as we discuss it here, the missing data amounts to one time step at the end of the neighboring data. This may differ in practice depending on the type of horizon shift
used within the MPC.
[Figure 6.2: sequential communication of subsystems x^1, x^2, . . . , x^P between sampling instants n and n + 1]
To cope with this issue and fill the data gap, we define the following:
In other words, an admissible extension regarding control and state must exist and be added to the neighboring data sequence in order to continue computing within the distributed setting.
Remark 6.21
Note that for the stability idea of terminal constraints such an extension exists, as the terminal point is an operating point of the subsystem. Regarding terminal costs, an extension exists if the neighborhood of the terminal point is designed such that all solutions within the neighborhood always represent independent variables for all other subsystems.
Combining the communication structure given by Figure 6.2 and the neighboring data extension
with our basic Algorithm 6.14, we obtain the following:
Regarding recursive feasibility, we can convert our stability results for centralized MPC problems from Chapter 5 to obtain the following:
U^{p,N}_{X_0^p}(x_0^p, i^p) ≠ ∅   ∀ n ∈ N.    (6.9)
If additionally the stability conditions from either Theorem 5.22, Theorem 5.24 or Theorem 5.25
hold for each subsystem p ∈ P , then the closed loop of the overall system is asymptotically
stable.
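As a rough illustration of the sequential scheme discussed above (this sketch is an assumption-laden simplification, not a restatement of the algorithm in the notes; solve_local_ocp and extend are hypothetical callables), one closed-loop step could be organized as follows:

def sequential_step(subsystems, neighbor_data, solve_local_ocp, extend):
    """One closed-loop step of a sequential distributed MPC sketch.

    subsystems      -- list of subsystem indices, processed in a fixed order
    neighbor_data   -- dict p -> latest published trajectories of subsystem p
    solve_local_ocp -- hypothetical solver for the local problem (6.7),
                       taking the subsystem index and the neighboring data
    extend          -- appends one admissible step to a published trajectory
                       to fill the data gap described in Remark 6.19
    """
    for p in subsystems:
        # data of later subsystems stems from the previous instant and
        # must be extended by one admissible step before it can be used
        i_p = {q: (traj if subsystems.index(q) < subsystems.index(p)
                   else extend(traj))
               for q, traj in neighbor_data.items() if q != p}
        x_traj, u_traj = solve_local_ocp(p, i_p)
        neighbor_data[p] = (x_traj, u_traj)   # publish for subsequent systems
    return neighbor_data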
While being simple to apply, the serial solution of the optimal control problems leads to long waiting times for the other subsystems. This is particularly detrimental for systems which are independent from one another, which we exploit in the following section.
[Figure 6.3: Sketch of exemplary communication (dashed) and dependency (solid) graph for subsystems x^1, x^2, x^3, x^4]
To make use of this decoupling, we must identify those systems which are independent from one another. Using the notation from our projection, we directly obtain:
X̃^p = ∅, Ũ^p = ∅ and X̃^q = ∅, Ũ^q = ∅.    (6.10)
Using this independence, we know that certain sets of systems may operate in parallel. More
formally
Since we used the power set in the above definition, we see that there exists quite a large number of possibilities for such lists. This reflects the possibility that subsystems 1 and 2 may be independent from one another, while 2 depends on 3. In that case, subsystems 1 and 3 or subsystems 1 and 2 may operate in parallel. To obtain a concise order, we introduce the following two operators.
The priority rule can be used to sort subsystems within lists of parallel operational systems.
Task 6.27
Give an example of a priority rule.
Solution to Task 6.27: The lexicographical order <N is a priority rule sorting subsystems
by their index. It additionally chooses the list of parallel operational systems in which each subsystem is sorted to its lowest possible list element.
The idea of the deordering rule is different: Depending on the overall system state, dependencies of subsystems on one another may occur. In that case, the subsystems are sorted into list elements of parallel operational systems. However, if the dependency no longer exists, the sorting should also be revoked. Unfortunately, we cannot detect this from the solution of the subsystems. The reason for the latter is simple: If dependencies via constraints occur, feasibility of a solution resolves this dependency, hence no violation occurs. Yet, we cannot say whether there is a potential for a violation or not; we can only detect it once it occurs. For this reason, one typically uses a simple forget rule.
Task 6.28
Give an example of a deordering rule.
Solution to Task 6.28: The operator ∆(L) = ∅ is a deordering rule. It basically removes all dependencies, which will have to be rebuilt before taking the next optimization step.
Remark 6.29
Instead of fully forgetting any structure, a more structure-preserving idea is to delete one dependency at random.
Combining the latter two operators with our Algorithm 6.14, we obtain the following:
(2a) Deordering
For each j from 2 to P
For k from 1 to #L_j
i. Set I^{p_k}(n) := ∆(I^{p_k}(n))
ii. If I^{p_k}(n) = ∅, remove p_k from L_j and set L_1 := (L_1, p_k).
Else if m̃ := min{ m | L_m ∩ I^{p_k}(n) ≠ ∅ } < j, remove p_k from L_j and set L_{m̃} := (L_{m̃}, p_k).
a) Set x_0^p := x^p(n), solve the projected digital finite optimal control problem (6.7) and denote the obtained optimal control sequence by u^{p,⋆}(·) ∈ U^{p,N}_{X_0^p}(x_0^p, i^p).
(2b) Priority
For each j from 1 to P do
a) If #L_j ∈ {0, 1}, go to Step 3. Else sort the indices via L_j := Π(L_j).
b) Collect neighboring data i^p for all subsystems.
c) For k from 2 to #L_j do
If p_k exhibits a costate/cocontrol of some p_{k̄} with k̄ < k, set L_{j+1} := (L_{j+1}, p_k) and L_j := L_j \ L_{j+1}.
d) Solve the projected digital finite optimal control problem (6.7) and denote the obtained optimal control sequence by u^{p,⋆}(·) ∈ U^{p,N}_{X_0^p}(x_0^p, i^p).
We like to stress that in Step 2b the sending of neighboring information addresses subsystems
on equal or higher hierarchy level while in Step 2a only higher levels are addressed. The reason
for the latter is that for establishing the dependency graph, we must be able to assess whether or
not a subsystem on the same level poses a costate/cocontrol for the present subsystem. If such a
variable exists, then by priority the subsystems will be sorted to higher levels.
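To illustrate the grouping into lists of parallel operational subsystems (a sketch added here; the simple rules used below only mimic the role of the ∆ and Π operators and are not taken from the notes), a hierarchy can be built from the neighboring index sets as follows:

def build_hierarchy(neighbor_index):
    """Group subsystems into lists L_1, L_2, ... of parallel operational
    systems based on their neighboring index sets (illustrative sketch).

    neighbor_index -- dict p -> set of subsystems p depends on
    """
    levels = []                      # levels[j] contains subsystems of level j+1
    placed = {}                      # p -> level index it was assigned to
    remaining = set(neighbor_index)
    while remaining:
        # a subsystem may be placed once all its dependencies sit on lower levels
        ready = {p for p in remaining
                 if all(q in placed for q in neighbor_index[p])}
        if not ready:                # circular dependencies: group the rest together
            ready = set(remaining)
        levels.append(sorted(ready))       # lexicographic priority rule (Task 6.27)
        for p in ready:
            placed[p] = len(levels) - 1
        remaining -= ready
    return levels

# example: 1 and 2 independent, 2 depends on 3 -> levels [[1, 3], [2]]
print(build_hierarchy({1: set(), 2: {3}, 3: set()}))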
While trying to address parallel computing, the hierarchical approach ultimately fails if subsys-
tems remain dependent on one another.
[Figure: communication between the central server and the subsystems x^p, x^q within the sampling interval from n to n + 1]
Γ(x, u) = 0    (6.12)
x(k+1) = f(x(k), u(k), k),  x(0) = x_0    (6.13)
x(k) ∈ X,  k ∈ [0, N]    (6.14)
hold.
for which g(λ) = argmin_{u∈U} L(x_0, u, λ) is the dual of the control problem (5.3).
Now, we can additively distribute the Lagrangian problem (6.15), which leads to the following
algorithm:
At central entity:
(2a) Collect neighboring data i p for all subsystems.
(2b) Update the Lagrange multiplier
λ_{j+1} := λ_j + ρ_j · Γ(x_0, u_j, λ_j)
(2c) Send the Lagrange multiplier (0, n, λ_{j+1}) to all subsystems p ∈ P. Set j := j + 1 and go to (2) unless a termination criterion is satisfied.
At subsystem:
The big advantage of Algorithm 6.32 is that it can be applied to basically any optimal control problem. It allows us to split the problem into subproblems, where the split is not necessarily performed along constraints or dynamics, but can be chosen arbitrarily. On the downside, a central entity is required which coordinates the progress of the overall system. Here, the iterator j indicates that a number of intermediate steps may be (and typically are) necessary to reach the termination criterion. Regarding the Lagrange multiplier update, we included the factor ρ_j, which can be adapted via a line search.
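The central entity iteration of Algorithm 6.32 can be sketched as follows; the function names, the constant step size and the stopping criterion are assumptions made here for illustration and not part of the notes:

import numpy as np

def dual_coordination(solve_local, coupling, x0, n_subsystems, m,
                      rho=0.5, max_iter=100, tol=1e-6):
    """Sketch of the central-entity iteration of a dual decomposition scheme.

    solve_local -- hypothetical local solver: (p, lam) -> input sequence of subsystem p
    coupling    -- Gamma(x0, u), residual of the coupling constraint (6.12)
    m           -- dimension of the coupling constraint / multiplier
    """
    lam = np.zeros(m)                  # Lagrange multiplier lambda_0
    u = None
    for j in range(max_iter):
        # subsystems respond to the current multiplier (in parallel)
        u = [solve_local(p, lam) for p in range(n_subsystems)]
        residual = coupling(x0, u)
        # (2b) gradient-type multiplier update with step size rho
        lam = lam + rho * residual
        # (2c) stop once the coupling constraint is approximately satisfied
        if np.linalg.norm(residual) < tol:
            break
    return lam, u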
BIBLIOGRAPHY
[1] Barto, A.; Sutton, R.S.: Reinforcement Learning: An Introduction. 2nd edition. MIT Press, 2018
[2] Grüne, L.; Pannek, J.: Nonlinear Model Predictive Control: Theory and Algorithms. 2nd edition. Springer, 2017
[3] Hinrichsen, D.; Pritchard, A.J.: Mathematical System Theory I: Modeling, State Space Analysis and Robustness. Springer, 2010
[4] Ljung, L.: System Identification: Theory for the User. Pearson Education, 1998. ISBN 0132440539
[5] Richards, A.; How, J.: A Decentralized Algorithm for Robust Constrained Model Predictive Control, 2004, pp. 4241–4266
[6] Sontag, E.D.: Mathematical Control Theory: Deterministic Finite Dimensional Systems. Springer, 1998. 531 pp. ISBN 0387984895