ARMAX
This approach is called the prediction error method. (Block diagram for prediction error, Figure 7.1 [3].) The identification data is D = {(u(t), y(t)) | t = 1, . . . , N }.
Recall that the ARMAX model is given by
y(t) + a1 y(t − 1) + · · · + an y(t − n) = b1 u(t − 1) + · · · + bm u(t − m) + e(t) + c1 e(t − 1) + · · · + cl e(t − l). (2)
Let
θ = (a1 , . . . , an , b1 , . . . , bm , c1 , . . . , cl )T ,
G(q, θ) = B(q −1 )/A(q −1 ),   H(q, θ) = C(q −1 )/A(q −1 ).
This model consists of the moving average (MA) part C(q −1 )e(t), the autoregressive (AR) part A(q −1 )y(t),
and the exogenous input (X) part B(q −1 )u(t). The ARMAX model cannot be written as a linear regression.
Notice that
y(t) = G(q, θ)u(t) + H(q, θ)e(t). (3)
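For concreteness, data from this model can be generated directly from the difference equation (2). The sketch below (Python/NumPy; the coefficients a1 = −0.7, b1 = 0.5, c1 = 0.3 and the signal lengths are illustrative assumptions, not values from the text) simulates a first-order ARMAX system:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative first-order ARMAX coefficients (assumed values).
a1, b1, c1 = -0.7, 0.5, 0.3
N = 200

u = rng.standard_normal(N)   # exogenous input u(t)
e = rng.standard_normal(N)   # zero-mean white noise e(t)
y = np.zeros(N)
y[0] = e[0]                  # at t = 0 there are no past terms

# Difference equation (2): y(t) + a1*y(t-1) = b1*u(t-1) + e(t) + c1*e(t-1)
for t in range(1, N):
    y[t] = -a1 * y[t - 1] + b1 * u[t - 1] + e[t] + c1 * e[t - 1]
```

Since |a1| < 1, the AR part is stable and the simulated output stays bounded.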
Let
v(t) = H(q)e(t) (4)
where H(q) is deterministic and e(t) is zero-mean white noise, i.e., E[e(t)] = 0, E[e(t)e(s)] =
λδ(t − s). Note that
v(t) = H(q)e(t) = y(t) − G(q)u(t). (5)
Assume that H(q) is monic, i.e., H(q) = 1 + Σ_{k=1}^{∞} h(k)q −k . Therefore,
v(t) = e(t) + Σ_{k=1}^{∞} h(k)e(t − k) = e(t) + m(t − 1) (6)
where
m(t − 1) := Σ_{k=1}^{∞} h(k)e(t − k) = (H(q) − 1)e(t). (7)
Since H(q) is monic, H −1 (q) is also monic. The best predictor of v(t) given data up to time t − 1 is v̂(t|t − 1) = m(t − 1). Notice that v(t) − v̂(t|t − 1) = e(t) ⊥ v̂(t|t − 1), indicating
that v̂(t|t − 1) is the best MSE estimator of v(t).
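This orthogonality can be checked numerically. The sketch below assumes a first-order monic H(q) = 1 + 0.6q −1 (the coefficient 0.6 is an arbitrary illustrative choice): the prediction error v(t) − v̂(t|t − 1) recovers e(t) exactly, and its sample correlation with the predictor is near zero.

```python
import numpy as np

rng = np.random.default_rng(1)

h1 = 0.6                      # assumed coefficient of monic H(q) = 1 + 0.6 q^-1
N = 100_000
e = rng.standard_normal(N)    # white noise e(t)

v = e.copy()
v[1:] += h1 * e[:-1]          # v(t) = H(q)e(t) = e(t) + h1*e(t-1)

vhat = np.zeros(N)
vhat[1:] = h1 * e[:-1]        # vhat(t|t-1) = m(t-1) = (H(q) - 1)e(t)

err = v - vhat                # prediction error: equals e(t) exactly
corr = np.corrcoef(err, vhat)[0, 1]   # sample correlation, close to 0
```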
The output can be estimated as
ŷ(t|t − 1) = H −1 (q)G(q)u(t) + (1 − H −1 (q))y(t). (10)
Notice that the one-step prediction error y(t) − ŷ(t|t − 1) = e(t) is white noise, which is unpredictable. Therefore, we have used all
predictable information in ŷ(t|t − 1). The one-step predictor in (10) is optimal. Observe that
ŷ(t|θ) = (B(q −1 )/C(q −1 ))u(t) + (1 − A(q −1 )/C(q −1 ))y(t) (12)
⇒ C(q −1 )ŷ(t|θ) = B(q −1 )u(t) + (C(q −1 ) − A(q −1 ))y(t)
⇒ ŷ(t|θ) = (1 − C(q −1 ))ŷ(t|θ) + B(q −1 )u(t) + (C(q −1 ) − A(q −1 ))y(t). (13)
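The recursion (13) can be evaluated directly: given θ, ŷ(t|θ) is computed from past predictions, inputs, and outputs. A minimal sketch, assuming zero initial conditions (the function name and argument layout are illustrative):

```python
import numpy as np

def one_step_predictor(theta, u, y, n, m, l):
    """Evaluate the recursion (13):
    yhat(t) = (1 - C(q^-1)) yhat(t) + B(q^-1) u(t) + (C(q^-1) - A(q^-1)) y(t),
    i.e. yhat(t) = sum_k c_k (y(t-k) - yhat(t-k)) + sum_k b_k u(t-k)
                   - sum_k a_k y(t-k),
    with values before t = 0 taken as zero."""
    a, b, c = theta[:n], theta[n:n + m], theta[n + m:n + m + l]
    yhat = np.zeros(len(y))
    for t in range(len(y)):
        s = 0.0
        # (1 - C)yhat and (C - A)y combine into c_k*(y(t-k) - yhat(t-k)) - a_k*y(t-k)
        for k in range(1, l + 1):
            if t - k >= 0:
                s += c[k - 1] * (y[t - k] - yhat[t - k])
        for k in range(1, m + 1):
            if t - k >= 0:
                s += b[k - 1] * u[t - k]
        for k in range(1, n + 1):
            if t - k >= 0:
                s -= a[k - 1] * y[t - k]
        yhat[t] = s
    return yhat
```

Note that the C-polynomial terms combine as c_k (y(t − k) − ŷ(t − k|θ)), which is exactly the residual regressor that reappears in the pseudo-linear regression vector (18) below; with C(q −1 ) = 1 the recursion reduces to the ARX predictor.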
Since ŷ(t|θ) is not linear in the parameters, ŷ(t|θ) ≠ φT (t)θ. However, it can be written in a pseudo-linear
regression form as follows.
Define
ε(t|θ) := y(t) − ŷ(t|θ). (15)
Then,
θ := (a1 , . . . , an , b1 , . . . , bm , c1 , . . . , cl )T (17)
φ(t|θ) := (−y(t − 1), . . . , −y(t − n), u(t − 1), . . . , u(t − m), ε(t − 1|θ), . . . , ε(t − l|θ))T . (18)
Thus, we obtain the prediction in pseudo-linear regression form ŷ(t|θ) = φT (t|θ)θ. This is a nonlinear
optimization problem
θ̂(N ) = argminθ (1/N ) Σ_{t=1}^{N} (y(t) − ŷ(t|θ))² (19)
where ŷ(t|θ) = φT (t|θ)θ.
1. Assume C(q −1 ) = 1 and solve the ARX problem with the same order of A(q −1 ) and B(q −1 ) to get
an initial estimate θ̄ = (a1 , . . . , an , b1 , . . . , bm ).
2. Set θ(0) := (θ̄, 0, . . . , 0)T and compute ε(t − 1|θ(0) ), . . . , ε(t − l|θ(0) ).
3. For i = 1 to M ,
Construct φ(t|θ̂(i−1) ), t = 1, . . . , N .
Compute the LSE
θ̂(i) = (Σ_{t=t∗}^{N} φ(t|θ̂(i−1) )φT (t|θ̂(i−1) ))−1 Σ_{t=t∗}^{N} φ(t|θ̂(i−1) )y(t), where t∗ = max(m, n, l).
Compute ε(t − 1|θ̂(i) ), . . . , ε(t − l|θ̂(i) ).
End.
This is not a convex optimization problem, so we may only find a local solution.
Exercise: Take a first order model and implement the algorithm.
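One possible implementation sketch for this exercise, in the first-order case n = m = l = 1: the true parameters, noise level, data length, and the iteration count M = 20 below are all assumed values chosen for illustration, and the predictor recursion follows (13).

```python
import numpy as np

rng = np.random.default_rng(2)

# --- Simulate a first-order ARMAX system (true parameters are assumptions) ---
a1, b1, c1 = -0.6, 1.0, 0.4
N = 5000
u = rng.standard_normal(N)
e = 0.1 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = -a1 * y[t - 1] + b1 * u[t - 1] + e[t] + c1 * e[t - 1]

# --- Step 1: ARX least squares (C = 1) for an initial estimate (a1, b1) ---
Phi = np.column_stack([-y[:-1], u[:-1]])             # regressors for t = 1..N-1
theta = np.linalg.lstsq(Phi, y[1:], rcond=None)[0]
theta = np.append(theta, 0.0)                        # step 2: append c1 = 0

# --- Step 3: iterate the pseudo-linear regression, M = 20 ---
for _ in range(20):
    a, b, c = theta
    # One-step predictions yhat(t|theta) via the recursion (13)
    yhat = np.zeros(N)
    for t in range(1, N):
        yhat[t] = -c * yhat[t - 1] + b * u[t - 1] + (c - a) * y[t - 1]
    eps = y - yhat                                   # residuals eps(t|theta)
    # Regressor phi(t|theta) = (-y(t-1), u(t-1), eps(t-1|theta))
    Phi = np.column_stack([-y[:-1], u[:-1], eps[:-1]])
    theta = np.linalg.lstsq(Phi, y[1:], rcond=None)[0]

print(theta)   # expect estimates near the assumed (a1, b1, c1)
```

With this noise level and data length the iteration settles close to the simulated parameters; with stronger noise or an unfavorable C(q −1 ) it may converge to a local solution, as noted above.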
References
[1] H. Asada, Identification, Estimation and Learning, MIT, lecture notes and video lectures, 2021.
[2] L. Ljung, System Identification: Theory for the User, 2nd edition, Prentice Hall, 1999.
[5] M. Diehl, Lecture Notes on Modeling and System Identification, lecture notes and video lectures, 2020.