Model-Based Output-Difference Feedback Optimal Control
1 Introduction
This document investigates a model-based method to design the optimal Output-Difference Feedback Controller (ODFC). We begin by assuming the presence of an observer that provides an unbiased estimate of the state, represented mathematically as:
\[ \hat{x}_k = x_k + \epsilon_k, \qquad \epsilon_k \sim \mathcal{N}(0, \Sigma_\epsilon) \]
2 Theorem 3.1: Optimal Control Problem
Consider the optimal control problem defined by equations (2)-(5). The optimal state feedback controller gain $K^*$ is given by:
\[ K^* = (\bar{R} + B^T P^* B)^{-1} (B^T P^* A + \bar{N}^T) \]
where $P^* > 0$ is the solution to the Algebraic Riccati Equation (ARE):
\[ A^T P^* A - P^* - (A^T P^* B + \bar{N})(\bar{R} + B^T P^* B)^{-1}(B^T P^* A + \bar{N}^T) + \bar{Q} = 0 \]
Here, $\bar{Q} = \bar{A}^T Q_x \bar{A}$, $\bar{R} = B^T Q_x B + R$, $\bar{N} = \bar{A}^T Q_x B$, and $\bar{A} = A - I$.
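For concreteness, the following minimal Python sketch (an illustration, not part of the paper) builds the barred matrices and solves this ARE with SciPy, whose `solve_discrete_are` accepts the cross-weighting term via its `s` argument; the system data $A$, $B$, $Q_x$, $R$ are placeholders.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Placeholder system and weights (illustrative only).
A = np.array([[0.8, 0.2],
              [0.0, 0.95]])
B = np.array([[0.1],
              [0.1]])
Qx = np.eye(2)
R = np.array([[0.5]])

Abar = A - np.eye(2)         # Abar = A - I
Qbar = Abar.T @ Qx @ Abar    # Qbar = Abar^T Qx Abar
Rbar = B.T @ Qx @ B + R      # Rbar = B^T Qx B + R
Nbar = Abar.T @ Qx @ B       # Nbar = Abar^T Qx B

# solve_discrete_are(a, b, q, r, s=...) solves
#   a^T X a - X - (a^T X b + s)(r + b^T X b)^{-1}(b^T X a + s^T) + q = 0,
# which matches the ARE above with q = Qbar, r = Rbar, s = Nbar.
P = solve_discrete_are(A, B, Qbar, Rbar, s=Nbar)

# K* = (Rbar + B^T P* B)^{-1} (B^T P* A + Nbar^T)
K = np.linalg.solve(Rbar + B.T @ P @ B, B.T @ P @ A + Nbar.T)
```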
3 Average Cost
The average cost associated with $K^*$ is given by:
\[ \lambda_{K^*} = \mathrm{Tr}(A_{\mathrm{eff}}^T Q_x A_{\mathrm{eff}} \Sigma_\epsilon) + \mathrm{Tr}(Q_x W_w) + 2\,\mathrm{Tr}(Q_y W_v) + \mathrm{Tr}(K^{*T} B^T P^* K^* W_v) + \mathrm{Tr}(P^*(W_w + \Sigma_\epsilon)) - \mathrm{Tr}((A - BK^*)^T P^* (A - BK^*) \Sigma_\epsilon) \]
3.1 Deriving Each Term
1. Estimation Error Cost: $\mathrm{Tr}(A_{\mathrm{eff}}^T Q_x A_{\mathrm{eff}} \Sigma_\epsilon)$ captures the cost associated with the state estimation error.
2. Process Noise Cost: $\mathrm{Tr}(Q_x W_w)$ reflects the cost of the process noise affecting the state.
3. Output Noise Cost: $2\,\mathrm{Tr}(Q_y W_v)$ represents the cost linked to the output noise.
4. Feedback Gain Cost: $\mathrm{Tr}(K^{*T} B^T P^* K^* W_v)$ captures the cost incurred by the control action based on the feedback gain $K^*$.
5. Covariance Cost: $\mathrm{Tr}(P^*(W_w + \Sigma_\epsilon))$ accounts for the combined effect of the process noise covariance and the estimation error covariance.
6. Adjustment for Feedback: $-\mathrm{Tr}((A - BK^*)^T P^* (A - BK^*) \Sigma_\epsilon)$ corrects for the effect of the feedback control on the closed-loop state dynamics.
4 Proof Overview
The proof resembles results for linear stochastic systems with state-dependent quadratic costs, following procedures similar to those found in [?]. The optimal feedback gain $K^*$ is derived by minimizing the Bellman equation, which leads to equations (9) and (10) being satisfied.
5 Theorem 3.2: Iterative Algorithm
Let $K_0$ be any stabilizing state feedback controller gain and $P_i > 0$ be the solution of the Lyapunov equation:
\[ A_i^T P_i A_i - P_i + \bar{Q} + K_i^T \bar{R} K_i - K_i^T \bar{N}^T - \bar{N} K_i = 0 \]
where $i = 0, 1, 2, \ldots$ and $A_i = A - BK_i$. With $K_{i+1}$ calculated as
\[ K_{i+1} = (\bar{R} + B^T P_i B)^{-1}(B^T P_i A + \bar{N}^T), \]
the following holds:
• $A - BK_{i+1}$ is Schur.
• $P^* \le P_{i+1} \le P_i$
• $\lim_{i \to \infty} P_i = P^*$, $\lim_{i \to \infty} K_i = K^*$
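A compact Python sketch of this iteration (assuming the barred matrices of Theorem 3.1 and a user-supplied stabilizing initial gain `K0`):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def policy_iteration(A, B, Qbar, Rbar, Nbar, K0, tol=1e-10, max_iter=1000):
    """Alternate Lyapunov solves and gain updates until P_i converges."""
    K, P_prev = K0, None
    for _ in range(max_iter):
        Ai = A - B @ K
        Qi = Qbar + K.T @ Rbar @ K - K.T @ Nbar.T - Nbar @ K
        # solve_discrete_lyapunov(a, q) solves a X a^T - X + q = 0,
        # so a = Ai^T gives Ai^T P Ai - P + Qi = 0 as required.
        P = solve_discrete_lyapunov(Ai.T, Qi)
        # K_{i+1} = (Rbar + B^T P_i B)^{-1} (B^T P_i A + Nbar^T)
        K = np.linalg.solve(Rbar + B.T @ P @ B, B.T @ P @ A + Nbar.T)
        if P_prev is not None and np.max(np.abs(P - P_prev)) < tol:
            break
        P_prev = P
    return P, K
```

The monotone bound $P^* \le P_{i+1} \le P_i$ makes the change in $P_i$ a natural stopping criterion.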
6 Proof Overview
The proof follows arguments similar to those in [?] (Theorem 3.1) and is therefore omitted here.
7 Theorem 3.3: Parameterized Observer
A parameterized observer is introduced to estimate the system state xk from
the output difference measurement. The observer can be combined with (8) to
provide a solution for the optimal control problem.
The state parametrization is given as:
\[ \bar{x}_k = \Gamma_u \alpha_k + \Gamma_y \beta_k \]
This converges exponentially in mean to the state $x_k$ as $k \to \infty$ for an observable system. The estimation error is given by:
\[ \tilde{x}_k \equiv x_k - \bar{x}_k \sim \mathcal{N}(0, \Sigma_\epsilon) \]
where $\Sigma_\epsilon$ is a bounded error covariance matrix.
7.1 Matrices and Updates
The matrices Γu and Γy contain system-dependent transfer function coefficients.
The updates for $\alpha_k$ and $\beta_k$ are defined as follows:
\[ \alpha_{k+1}^i = A \alpha_k^i + B u_k^i, \quad \forall i = 1, 2, \ldots, m \]
\[ \beta_k^i = C \sigma_k^i + D(y_k^i - y_{k-1}^i), \quad \forall i = 1, 2, \ldots, p \]
where $u^i$ and $y^i$ are the $i$-th input and output, respectively.
7.2 Existence of the Observer
The existence of the parametrization is equivalent to the difference-feedback
state observer:
\[ \bar{x}_{k+1} = (A - LCA + LC)\bar{x}_k + (B - LCB)u_k + L(y_{k+1} - y_k) \]
where L is the observer gain. The mean and covariance of the estimation
error can be determined using this formulation.
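As a minimal sketch (with a user-supplied observer gain `L`; how `L` is designed is not fixed here), one update of this recursion in Python:

```python
import numpy as np

def observer_step(xbar, u, y_next, y, A, B, C, L):
    """xbar_{k+1} = (A - LCA + LC) xbar_k + (B - LCB) u_k + L (y_{k+1} - y_k)."""
    Ao = A - L @ C @ A + L @ C   # observer state-transition matrix
    Bo = B - L @ C @ B           # observer input matrix
    return Ao @ xbar + Bo @ u + L @ (y_next - y)
```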
8 Derivation of Discrete-Time ARE
The Algebraic Riccati Equation (ARE) is a fundamental equation in optimal
control theory, particularly for discrete-time linear systems. Below, we derive
the discrete-time ARE from the principles of optimal control.
8.1 Discrete-Time Linear System
Consider a discrete-time linear system described by:
\[ x_{k+1} = A x_k + B u_k \]
where:
• xk is the state vector at time k,
• uk is the control input,
• A is the state transition matrix,
• B is the input matrix.
8.2 Cost Function
We want to minimize a quadratic cost function of the form:
\[ J = \sum_{k=0}^{\infty} \left( x_k^T Q x_k + u_k^T R u_k + 2 x_k^T N u_k \right) \]
where:
• Q is a positive semi-definite matrix,
• R is a positive definite matrix,
• N is a matrix that captures the coupling between the state and control
inputs.
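As a numerical aside (an illustration, not part of the derivation), the infinite sum can be approximated by a long finite truncation under a linear policy $u_k = -Kx_k$:

```python
import numpy as np

def truncated_cost(A, B, Q, R, N, K, x0, horizon=10_000):
    """Approximate J by truncating the infinite sum under u_k = -K x_k."""
    x, J = np.asarray(x0, dtype=float), 0.0
    for _ in range(horizon):
        u = -K @ x
        J += x @ Q @ x + u @ R @ u + 2.0 * (x @ (N @ u))
        x = A @ x + B @ u
    return J
```

For a stabilizing $K$, the truncated cost converges as the horizon grows.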
8.3 Bellman Equation
The optimal control problem can be formulated using the Bellman equation.
The value function $V(x)$ represents the minimum cost-to-go from state $x$:
\[ V(x) = \min_u \left\{ x^T Q x + u^T R u + 2 x^T N u + V(Ax + Bu) \right\} \]
Assuming a quadratic form for the value function:
\[ V(x) = x^T P x \]
where $P$ is a positive semi-definite matrix, we can write:
\[ V(Ax + Bu) = (Ax + Bu)^T P (Ax + Bu) \]
8.4 Substituting into the Bellman Equation
Substituting back into the Bellman equation, we have:
\[ V(x) = \min_u \left\{ x^T Q x + u^T R u + 2 x^T N u + x^T A^T P A x + x^T A^T P B u + u^T B^T P A x + u^T B^T P B u \right\} \]
Grouping terms, we get:
\[ V(x) = \min_u \left\{ x^T (Q + A^T P A) x + u^T (R + B^T P B) u + 2 x^T (A^T P B + N) u \right\} \]
8.5 Minimizing the Cost Function
To minimize this quadratic expression with respect to $u$, we take the derivative and set it to zero:
\[ \frac{\partial V}{\partial u} = 2(R + B^T P B)u + 2(B^T P A + N^T)x = 0 \]
Solving for $u$ gives:
\[ u^* = -(R + B^T P B)^{-1}(B^T P A + N^T)x \]
8.6 Substituting Back into the Cost Function
Substituting $u^*$ back into the cost function:
\[ J^* = x^T (Q + A^T P A) x - x^T (A^T P B + N)(R + B^T P B)^{-1}(B^T P A + N^T) x \]
This leads to the equation:
\[ J^* = x^T \left( Q + A^T P A - (A^T P B + N)(R + B^T P B)^{-1}(B^T P A + N^T) \right) x \]
Since $V(x) = x^T P x$ must hold for every state $x$, the bracketed matrix must equal $P$, which yields:
\[ A^T P A - P - (A^T P B + N)(R + B^T P B)^{-1}(B^T P A + N^T) + Q = 0 \]
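One simple way to solve this fixed-point equation numerically (a sketch, not the paper's prescribed method) is to iterate the right-hand side as a Riccati recursion:

```python
import numpy as np

def riccati_recursion(A, B, Q, R, N, iters=5000):
    """Iterate P <- Q + A^T P A - (A^T P B + N)(R + B^T P B)^{-1}(B^T P A + N^T)."""
    P = Q.copy()
    for _ in range(iters):
        gain = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A + N.T)
        P = Q + A.T @ P @ A - (A.T @ P @ B + N) @ gain
    return P
```

Under stabilizability and detectability assumptions, this recursion converges to the same $P$ as a direct ARE solver.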
8.7 Algebraic Riccati Equation
This is the discrete-time Algebraic Riccati Equation (ARE); in the special case of no cross-weighting ($N = 0$), it reduces to the standard form:
\[ A^T P A - P - A^T P B (R + B^T P B)^{-1} B^T P A + Q = 0 \]
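As a quick sanity check (illustrative data, not from the paper), SciPy's DARE solver should make the left-hand side above vanish up to floating-point error:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[0.8, 0.2], [0.0, 0.95]])   # placeholder system
B = np.array([[0.1], [0.1]])
Q, R = np.eye(2), np.array([[0.5]])

P = solve_discrete_are(A, B, Q, R)        # standard DARE (N = 0)
residual = (A.T @ P @ A - P + Q
            - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A))
print(np.max(np.abs(residual)))           # ~ 1e-15, i.e. zero numerically
```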
9 Conclusion
The discrete-time ARE is a key result in optimal control, allowing us to compute the optimal feedback gain matrix $K^*$ using:
\[ K^* = (R + B^T P B)^{-1}(B^T P A + N^T) \]
The solution $P$ can be found using various numerical methods, such as iterative algorithms or matrix factorizations.