Optimal control
Overview of optimal control
Linear quadratic regulator
Linear quadratic tracking control
Examples
Overview
What is optimal control?
Optimal control is the process of finding control and
state histories for a system to minimize a performance
index
The optimal control problem is to find a control u that
forces the system \dot{x} = Ax + Bu to follow an optimal
trajectory x that minimizes the performance criterion,
or cost function,
J = \int_0^{t_1} h(x, u)\, dt
For discrete-time systems, it is
x_{k+1} = A x_k + B u_k, \quad J = \sum_{k=0}^{k_1} h(x_k, u_k)
t_1, k_1 are called the optimization horizon
Example: Autopilot of a yacht
Autopilot designed for course-keeping, i.e. minimize the
heading error e = \psi_d - \psi in the presence of disturbances (wind,
waves)
What are the objectives?
Keep on track as much as possible, i.e. minimize e
(save time)
Use as little fuel f as possible (save cost)
Minimize rudder activity (save cost)
Example: Autopilot of a yacht
Define a quadratic performance index:
J = \int_{t_0}^{t_1} \left( q e^2 + r_1 f^2 + r_2 \delta^2 \right) dt, \quad q, r_1, r_2 > 0
  = \int_{t_0}^{t_1} \left( e\, q\, e + \begin{bmatrix} f & \delta \end{bmatrix} \begin{bmatrix} r_1 & 0 \\ 0 & r_2 \end{bmatrix} \begin{bmatrix} f \\ \delta \end{bmatrix} \right) dt
  = \int_{t_0}^{t_1} \left( x^T Q x + u^T R u \right) dt
where \delta denotes the rudder angle
Q, R are state and control weighting matrices, always
square and symmetric
Optimal control seeks to find u that minimizes J
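As a small illustration (a sketch; the numerical weights are made up for this example), the weighting matrices can be assembled directly from the scalar weights:
>> q = 1; r1 = 0.5; r2 = 2;      % illustrative weights
>> Q = q;                        % one penalized state: the error e
>> R = [r1 0; 0 r2];             % two inputs: fuel f and rudder angle delta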
Linear Quadratic Regulator (LQR)
Consider a state-space system
\dot{x} = Ax + Bu \quad (1)
y = Cx (2)
where dim(x) = n_x, dim(y) = n_y, dim(u) = n_u
Our objective is to make x \to 0 (regulate the system) as
fast as possible, but using as little control effort (u) as
possible
Question: How do we design the control u so that the
states converge as fast as possible with as little control
effort as possible?
LQR
Define the value function
f = \min_u \int_{t_0}^{t_1} \left( x^T Q x + u^T R u \right) dt > 0 \quad (3)
where Q, R are symmetric positive definite weighting
matrices
A positive definite matrix Q will satisfy x^T Q x > 0 for all
nonzero values of x
Mathematical representation: if Q > 0 then
x^T Q x > 0 \;\; \forall x \neq 0
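A quick numerical check of positive definiteness (a sketch; the matrix is an arbitrary symmetric example): all eigenvalues of Q must be positive.
>> Q = [2 0; 0 3];
>> all(eig(Q) > 0)    % returns 1 (true), so Q is positive definite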
LQR
Then differentiate with respect to t to get
-\frac{\partial f}{\partial t} = \min_u \left[ x^T Q x + u^T R u + \left( \frac{\partial f}{\partial x} \right)^T \frac{\partial x}{\partial t} \right] \quad (4)
Define f = x^T P x where P = P^T > 0, so we get
\frac{\partial f}{\partial x} = 2Px, \quad \left( \frac{\partial f}{\partial x} \right)^T = 2x^T P \quad (5)
\frac{\partial f}{\partial t} = x^T \frac{\partial P}{\partial t} x \quad (6)
LQR
Substitute for \frac{\partial f}{\partial t}, \frac{\partial f}{\partial x}, \frac{\partial x}{\partial t}
from (6), (5), (1) into (4) to get
-x^T \frac{\partial P}{\partial t} x = \min_u \left[ x^T Q x + u^T R u + 2x^T P (Ax + Bu) \right] \quad (7)
To minimize the RHS of (7) w.r.t. u, take the partial
derivative
\frac{\partial}{\partial u} \left[ x^T Q x + u^T R u + 2x^T P (Ax + Bu) \right] = 2u^T R + 2x^T P B \quad (8)
Equate it to zero to get
u = -R^{-1} B^T P x \quad (9)
LQR
Substitute (9) into (7) to get
-x^T \frac{\partial P}{\partial t} x = x^T \left( Q + 2PA - PBR^{-1}B^T P \right) x \quad (10)
Since 2x^T P A x = x^T (PA + A^T P) x, (10) becomes
-x^T \frac{\partial P}{\partial t} x = x^T \left( Q + PA + A^T P - PBR^{-1}B^T P \right) x
and
\frac{\partial P}{\partial t} = -\left( Q + PA + A^T P - PBR^{-1}B^T P \right) \quad (11)
LQR
It can be shown that the solution P will converge, and
hence \frac{\partial P}{\partial t} \to 0
If the final time t_1 is very far away from t_0 (infinite horizon),
then (11) reduces to the algebraic Riccati equation
PA + A^T P + Q - PBR^{-1}B^T P = 0 \quad (12)
LQR
Summary
Given a state-space system (1), to design the optimal
controller
Select the weighting matrices Q, R as in (3)
The size of the weights corresponds to how much you want to penalize x and u:
to make x converge faster, make Q bigger; to use less input, make R bigger
Solve the Riccati equation (12) to get P
You can solve this in Matlab using the command care or lqr
Set u as in (9)
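Putting the three steps together in Matlab (a minimal sketch using the system and weights of Example 1 below; lqr returns both the gain and the Riccati solution):
>> A = [0 1; -3 -2]; B = [0; 1];          % the plant (1)
>> Q = [2 0; 0 3]; R = 1;                 % step 1: choose the weights
>> [K, P] = lqr(A, B, Q, R);              % step 2: K = inv(R)*B'*P, P solves (12)
>> norm(P*A + A'*P + Q - P*B*(R\B')*P)    % check: residual of (12) is ~0
The control is then u = -K*x, as in (9).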
LQR
One good thing about this design method is that
stability is guaranteed if (A, B) is stabilizable
Just need to choose Q, R and the system will be stable
The Riccati equation (12) can be re-written as
P(A - BR^{-1}B^T P) + (A - BR^{-1}B^T P)^T P + Q + PBR^{-1}B^T P = 0
Recall that K = R^{-1}B^T P (so (9) is u = -Kx); re-arrange to get
P(A - BK) + (A - BK)^T P = \underbrace{-\left( Q + PBR^{-1}B^T P \right)}_{\text{negative definite}} \quad (13)
LQR
Quoting Lyapunov theory: if there exists a matrix
P = P^T > 0 such that
PA + A^T P < 0
then the matrix A is stable
Apply the same argument to (13); therefore A - BK is
stable
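This can be checked numerically for the Example 1 design (a sketch reusing A, B, Q, R and [K, P] = lqr(A, B, Q, R) from the snippet above):
>> M = P*(A - B*K) + (A - B*K)'*P;   % left-hand side of (13)
>> eig(M)                            % both eigenvalues negative, so A - BK is stable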
LQR
The weights Q and R can be used to tune the size of K
If Q is chosen to be larger, then K will also be larger
If R is chosen to be larger, then K will be smaller
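A small sweep makes the trend visible (a sketch, again reusing A, B, Q, R from above):
% Scaling Q up inflates the gain; scaling R up would shrink it
for s = [1 10 100]
    K = lqr(A, B, s*Q, R);
    fprintf('s = %3d: K = [%7.3f %7.3f]\n', s, K(1), K(2));
end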
Example 1
Consider a state-space system where
A = \begin{bmatrix} 0 & 1 \\ -3 & -2 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}
It is desired to minimize the cost function
J = \int_0^\infty \left( x^T Q x + u^T R u \right) dt
where
Q = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}, \quad R = 1
Example 1
The Matlab commands are
>> A=[0 1;-3 -2]; B=[0;1];
>> Q=[2 0;0 3]; R=1;
>> P=care(A,B,Q,R);
and we get
P = \begin{bmatrix} 3.1633 & 0.3166 \\ 0.3166 & 0.7628 \end{bmatrix}
K = K_{\text{opt}} = R^{-1} B^T P = \begin{bmatrix} 0.3166 & 0.7628 \end{bmatrix}
\lambda(A - BK) = -1.3814 \pm j1.1867
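The gain and the closed-loop poles follow directly from P (continuing the session above):
>> K = R\(B'*P)      % Kopt = [0.3166 0.7628]
>> eig(A - B*K)      % -1.3814 +/- 1.1867i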
Example 1
Let's try another design to make \lambda(A - BK) deeper in the
LHP; the following choice of K will give
\lambda(A - BK) = -4, -5 (pole placement)
Solve \det(sI - A + BK) = (s + 4)(s + 5) = 0 to find K
K = K_1 = \begin{bmatrix} 17 & 7 \end{bmatrix}
Let's now simulate the system; set an initial condition of
x_0 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}
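The simulations below can be reproduced along these lines (a sketch; it assumes the Control System Toolbox, with K set to Kopt or K1):
>> sysCL = ss(A - B*K, zeros(2,1), eye(2), 0);  % closed-loop system, no external input
>> x0 = [2; 3];
>> [~, t, x] = initial(sysCL, x0, 10);          % state histories over 10 s
>> u = -(K*x.').';                              % recover the input history u = -Kx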
Example 1
[Figure: the states x_1 and x_2 over 0-10 s, for K = K_{\text{opt}} and K = K_1]
Example 1
[Figure: the input u over 0-10 s, for K = K_{\text{opt}} and K = K_1]
Example 1
[Figure: the cost function J over 0-10 s, for K = K_{\text{opt}} and K = K_1]
Example 1
So what can we observe/conclude?
The non-optimal controller K_1 makes x converge
faster because the eigenvalues \lambda(A - BK_1) are more negative
The non-optimal controller K1 causes a larger control
effort u
Ultimately the cost function with K1 is higher because it
is not optimal
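These observations can also be verified without simulating (a sketch): under the optimal gain the cost is x0'*P*x0, while for any other stabilizing gain the cost matrix solves a Lyapunov equation.
>> Jopt = x0'*P*x0                        % optimal cost from the Riccati solution
>> K1 = [17 7];
>> P1 = lyap((A - B*K1)', Q + K1'*R*K1);  % (A-B*K1)'*P1 + P1*(A-B*K1) + Q + K1'*R*K1 = 0
>> J1 = x0'*P1*x0                         % exceeds Jopt, confirming the plot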
Example 1
The Riccati equation can actually be solved by hand
Let P = \begin{bmatrix} p_1 & p_2 \\ p_2 & p_3 \end{bmatrix}
Substitute A, B, Q, R into the Riccati equation (12):
\begin{bmatrix} p_1 & p_2 \\ p_2 & p_3 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ -3 & -2 \end{bmatrix} + \begin{bmatrix} 0 & -3 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} p_1 & p_2 \\ p_2 & p_3 \end{bmatrix} + \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} - \begin{bmatrix} p_1 & p_2 \\ p_2 & p_3 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} p_1 & p_2 \\ p_2 & p_3 \end{bmatrix} = 0
Example 1
We get the following equations:
-6p_2 + 2 - p_2^2 = 0 \quad (14)
p_1 - 2p_2 - 3p_3 - p_2 p_3 = 0 \quad (15)
2p_2 - 4p_3 + 3 - p_3^2 = 0 \quad (16)
Solve (14) to get p_2 = 0.3166, -6.3166
Substitute p_2 = 0.3166 into (16) and solve:
p_3 = 0.7628, -4.7628 (take only the positive root)
Substitute p_2 = -6.3166 into (16) and solve:
p_3 = -2 \pm j2.3634 (reject this set of p_2, p_3)
Finally solve (15) to get p_1 = 3.1631 (3.1633 with unrounded p_2, p_3, matching care)
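The hand solution can be cross-checked against the numerical one (a sketch, reusing A, B, Q from Example 1; here R = 1):
>> P = [3.1633 0.3166; 0.3166 0.7628];
>> norm(P*A + A'*P + Q - P*B*B'*P)   % nearly zero (only rounding of the displayed digits)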
LQ tracking
What was demonstrated earlier was a regulation problem
(to make x \to 0)
In reality, it is desired that x follows a desired state
trajectory r
So now the quadratic performance index is
J = \int_{t_0}^{t_1} \left[ (r - x)^T Q (r - x) + u^T R u \right] dt \quad (17)
Q, R have the same function as before
LQ tracking
The optimal control u is given by
u = -R^{-1}B^T P x - R^{-1}B^T s \quad (18)
\dot{s} = -(A - BR^{-1}B^T P)^T s + Qr \quad (19)
\dot{P} = -PA - A^T P - Q + PBR^{-1}B^T P \quad (20)
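For the infinite horizon with a constant reference r, \dot{P} = 0 and \dot{s} = 0, so (19) reduces to a linear equation for s. A minimal sketch under those assumptions (using the DC motor of Example 2 below; the sign conventions follow (18) - (19) as reconstructed here, and the reference value is illustrative):
A = [-2 0; 1 0]; B = [10; 0];      % DC motor from Example 2
Q = [1 0; 0 5];  R = 1;
[K, P] = lqr(A, B, Q, R);          % steady-state P; K = inv(R)*B'*P
r = [0; 1];                        % constant reference: unit position, zero speed
s = (A - B*K)' \ (Q*r);            % from (19) with sdot = 0
u = @(x) -K*x - (R\(B'*s));        % tracking law (18)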
Example 2
Consider a DC motor modelled by
\begin{bmatrix} \dot{\omega} \\ \dot{\theta} \end{bmatrix} = \begin{bmatrix} -2 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \omega \\ \theta \end{bmatrix} + \begin{bmatrix} 10 \\ 0 \end{bmatrix} V
where \omega is the motor speed and \theta the position. Let
x = \begin{bmatrix} \omega & \theta \end{bmatrix}^T and define r to be the reference for x. Find
the optimal control V (for infinite horizon) such that the
following cost function is minimized:
J = \int_0^\infty \left[ (r - x)^T Q (r - x) + V^2 \right] dt, \quad Q = \begin{bmatrix} 1 & 0 \\ 0 & 5 \end{bmatrix}
Example 2
Based on A, B, Q, R = 1, solve the Riccati equation:
P = \begin{bmatrix} 0.1020 & 0.2236 \\ 0.2236 & 2.7269 \end{bmatrix}
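This P can be reproduced numerically (a sketch):
>> A = [-2 0; 1 0]; B = [10; 0]; Q = [1 0; 0 5]; R = 1;
>> P = care(A, B, Q, R)    % returns the matrix above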
Implement the optimal controller in (18) - (19) and get the
following results:
Example 2
[Figure: motor position \theta (solid) and its reference (dashed) over 0-50 s]
Example 2
[Figure: input voltage V over 0-50 s]
Example 2
[Figure: cost function J over 0-50 s]
Example 2
The response of the position \theta is rather sluggish, but it
saves on the control input V
To make \theta respond faster, change the weight to
Q = \begin{bmatrix} 1 & 0 \\ 0 & 50 \end{bmatrix}
Solve the Riccati equation to get P = \begin{bmatrix} 0.1367 & 0.7071 \\ 0.7071 & 11.0775 \end{bmatrix}
\theta now converges faster, but at a higher cost of V
Example 2
[Figure: motor position \theta (solid) and its reference (dashed) over 0-50 s, with the larger Q]
Optimal Control p. 32/52
Example 2
Input voltage
2.5
1.5
0.5
0.5
1.5
2.5
0 5 10 15 20 25 30 35 40 45 50
Optimal Control p. 33/52
LQ tracking
The block diagram is shown below:
[Figure: block diagram of the LQ tracking controller (18) - (19)]
Example 2
Notice that \theta converges much faster now, but at a
higher cost of V
Note that there is no steady-state error because this is
a type 1 system; for optimal tracking, there may still be
steady-state error in order to save the input cost