Neural Network Model Predictive Controller
Received 1 November 2005; received in revised form 9 June 2006; accepted 11 June 2006
Abstract
A neural network controller is applied to the optimal model predictive control of constrained nonlinear systems. The control law is
represented by a neural network function approximator, which is trained to minimize a control-relevant cost function. The proposed
procedure can be applied to construct controllers with arbitrary structures, such as optimal reduced-order controllers and decentralized
controllers.
Parisini and Zoppoli [20] and Zoppoli et al. [24] study stochastic optimal control problems, where the controller is parameterized as a function of past inputs and outputs using neural network approximators. Seong and Widrow [23] consider an optimal control problem where the initial state has a random distribution, and the controller is a state feedback described by a neural network. In both studies a stochastic approximation type algorithm is used to train the network. Similar methods have been presented by Ahmed and Al-Dajani [2] and Nayeri et al. [15], using steepest descent methods to train the neural network controllers.

In many applications it is relevant to design controllers which have a specified structure. For complex systems it may, for example, be of interest to use a reduced-order controller, or to apply decentralized control for systems with many inputs and outputs. In model predictive control the complete system model is used to predict the future system trajectory, and the optimal control signal is therefore implicitly a function of the whole state of the system model. Hence, model predictive control is not applicable to fixed-structure control problems. In contrast, controllers based on neural network function approximators can be applied to optimal fixed-structure control by imposing the appropriate structure on the neural network approximator.

In this paper, the approach studied in [20,2,24,23,15] is formulated for constrained MPC type nonlinear optimal control problems with structural constraints. The control law is represented by a neural network approximator, which is trained off line to minimize a control-relevant cost function. The design of optimal low-order controllers is accomplished by specifying the input to the neural network controller appropriately. Decentralized and other fixed-structure neural network controllers are constructed by introducing appropriate constraints on the network structure. The performance of the neural network controller is evaluated on numerical examples and compared to optimal model predictive controllers.

2. Problem formulation

We consider the control of a discrete-time nonlinear system described by

x(k+1) = f(x(k), u(k))
y(k) = h(x(k))                                                        (1)

where y(k) ∈ R^p is the controlled output, u(k) ∈ R^r is the manipulated input, and x(k) ∈ R^n is a state vector. The control objective is to keep the output close to a specified reference trajectory y_r(i) in such a way that large control signal variations are avoided and possible hard constraints on the states and inputs are satisfied. A commonly used, quantitative formulation of the control objective is to minimize the cost

J_N(U_N(k), k) = \sum_{i=k}^{k+N-1} \big[ (\hat{y}(i+1|k) - y_r(i+1))^T Q\, (\hat{y}(i+1|k) - y_r(i+1)) + \Delta u(i)^T R\, \Delta u(i) \big] + q_N(\hat{x}(k+N|k))        (2)

subject to the constraints

g_x(\hat{x}(i+1|k)) \le 0
g_u(u(i)) \le 0                                                        (3)
g_\Delta(\Delta u(i)) \le 0, \qquad i = k, k+1, \ldots, k+N-1

where \hat{x}(\cdot|k) and \hat{y}(\cdot|k) denote the predicted state and output as functions of future inputs and the state at time instant k. Here, U_N(k) denotes the future inputs,

U_N(k) = \{u(k), u(k+1), \ldots, u(k+N-1)\}                            (4)

and Δu denotes the incremental input,

\Delta u(k) = u(k) - u(k-1)                                            (5)

In Eq. (2), Q and R are symmetric positive (semi)definite output and input weight matrices, and q_N(·) is a non-negative terminal cost.

From the optimality principle of dynamic programming [6] it follows that minimization of the cost (2) gives the solution of the infinite-horizon optimal control problem obtained in the limiting case N → ∞, if the terminal cost q_N(\hat{x}(k+N|k)) is taken as the minimum cost from time instant k + N to infinity. The finite-horizon cost (2) is therefore well motivated. For nonlinear systems the optimal control problem has, however, in general no closed-form solution. Therefore, various brute-force and suboptimal methods have been studied.

In model predictive control (MPC) the control signal is determined by minimizing the cost (2) numerically with respect to the input sequence U_N(k) at each sampling instant. Only the first element u(k) of the optimal input sequence is applied to the system. In the next sampling period, a new optimization problem is solved to give the optimal control signal at sampling instant k + 1, etc. In this approach the terminal cost q_N(·) can be taken as an approximation of the minimum cost from time instant k + N to infinity, but is in practice more often used as a means to ensure closed-loop stability, or is omitted altogether if the optimization horizon is sufficiently long. The model predictive control approach has the drawback that a nonlinear, constrained optimization problem must be solved at each sampling period, which may be computationally too demanding for on-line implementation.
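To make the receding-horizon computation concrete, the following Python sketch minimizes the cost (2) numerically over the input sequence U_N(k) at a single sampling instant and returns only the first move. It is a minimal illustration under simplifying assumptions: the constraints (3) and the terminal cost q_N are omitted, a general-purpose unconstrained optimizer is used, and the plant functions f and h, the weight matrices and the data layout are placeholders rather than anything taken from the paper.

```python
import numpy as np
from scipy.optimize import minimize

def mpc_cost(u_flat, x, u_prev, y_ref, f, h, Q, R):
    """Finite-horizon cost (2) for a candidate input sequence U_N(k).

    y_ref : array of shape (N, p) holding y_r(k+1), ..., y_r(k+N)
    """
    u_seq = u_flat.reshape(len(y_ref), -1)                     # N stacked input vectors
    J = 0.0
    for i in range(len(y_ref)):
        du = u_seq[i] - (u_prev if i == 0 else u_seq[i - 1])   # Delta u(i), Eq. (5)
        x = f(x, u_seq[i])                                     # predicted state, Eq. (1)
        e = h(x) - y_ref[i]                                    # predicted tracking error
        J += e @ Q @ e + du @ R @ du                           # stage cost of Eq. (2)
    return J                                                   # terminal cost q_N omitted

def mpc_step(x, u_prev, y_ref, f, h, Q, R):
    """One receding-horizon step: optimize U_N(k) and apply only u(k)."""
    N, r = len(y_ref), len(u_prev)
    u0 = np.tile(u_prev, N)                                    # warm start: hold last input
    res = minimize(mpc_cost, u0, args=(x, u_prev, y_ref, f, h, Q, R))
    return res.x.reshape(N, r)[0]
```

Repeating mpc_step at every sampling instant gives the receding-horizon controller; the cost of solving this optimization on line is precisely the computational burden that motivates the explicit formulations discussed next.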
In order to reduce the computational burden in model predictive control, explicit MPC has been proposed. In this approach, part of the computations are performed off line. For nonlinear systems, the optimal MPC strategy may be mapped by off-line computations, and described by a function approximator. More precisely, the control strategy which minimizes the cost (2) defines the control signal, or equivalently, its increment Δu(k), as a function,

\Delta u_{\mathrm{opt}}(k) = g(I_{\mathrm{MPC}}(k))                     (6)

where I_MPC(k) = \{\hat{x}(k|k), y_r(k+1), y_r(k+2), \ldots, y_r(k+N)\} is the information used to compute the cost (2) as a function of U_N(k). The functional relationship (6) can be evaluated for any I_MPC(k) by minimizing the cost (2). The optimal strategy can therefore be approximated by a function approximator which can be trained off line.

Although this approach has been found very useful, it has some limitations. As the MPC strategy is based on the information I_MPC(k) used to calculate the predicted outputs, this approach is not well suited for representing reduced-order or fixed-structure controllers. In addition, the computational effort required to generate training data may be extensive, as each training data point requires the solution of an MPC optimization problem.

3. A neural network optimal controller

In this section a procedure for constructing a neural network model predictive controller for the control problem described in Section 2 is presented. Here we adopt a procedure in which the controller is trained directly to minimize the cost (2) for a training data set, without having to compute the optimal MPC control signals by off-line optimizations.

The controller is represented as

\Delta u(k) = g_{\mathrm{NN}}(I(k); w)                                  (7)

where g_NN(I(k); w) is a (neural network) function approximator, I(k) denotes the information which is available to the controller at time instant k, and w denotes a vector of approximator parameters (neural network weights).

If complete state information is assumed, i.e., I(k) = I_MPC(k), the controller (7) can be considered as a functional approximation of the optimal MPC strategy (6). The approach studied here is, however, not restricted to controllers with full state information, and typically the set I(k) is taken to consist of a number of past inputs u(k − i) and outputs y(k − i) as well as information about the setpoint or reference trajectory y_r(k + i). In this way it is possible to construct low-complexity controllers for high-order systems. Various ways to select the set I(k) are illustrated in the examples presented in Section 4.

Remark 1. Besides allowing for controllers of reduced complexity, the controller structure may be fixed as well by imposing a structure on the mapping g_NN(·;·). For example, assuming that the information has the decomposition I(k) = [I_1(k), I_2(k), \ldots, I_r(k)], a decentralized controller Δu_i(k) = g_NN,i(I_i(k); w_i), i = 1, \ldots, r, is obtained by requiring that the controller has the structure

g_{\mathrm{NN}}(I(k); w) = \big[ g_{\mathrm{NN},1}^T(I_1(k); w_1),\; g_{\mathrm{NN},2}^T(I_2(k); w_2),\; \ldots,\; g_{\mathrm{NN},r}^T(I_r(k); w_r) \big]^T        (8)
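A decentralized controller of the form (8) can be realized simply by letting each loop have its own small network and concatenating the outputs. The sketch below assumes one-hidden-layer feedforward subnetworks with tanh activations; the function names and layer structure are illustrative assumptions, not a description of the networks used in the paper.

```python
import numpy as np

def mlp_forward(I_i, W1, b1, W2, b2):
    """One-hidden-layer subnetwork g_NN,i(I_i; w_i) -> Delta u_i."""
    hidden = np.tanh(W1 @ I_i + b1)      # hidden-layer activations
    return W2 @ hidden + b2              # linear output layer

def decentralized_controller(I_blocks, weights):
    """Eq. (8): stack the outputs of r independent subnetworks.

    I_blocks : list of the r information vectors I_1(k), ..., I_r(k)
    weights  : list of r tuples (W1, b1, W2, b2), one per loop
    """
    du = [mlp_forward(I_i, *w_i) for I_i, w_i in zip(I_blocks, weights)]
    return np.concatenate(du)            # Delta u(k) = [Delta u_1; ...; Delta u_r]
```

Because subnetwork i only receives I_i(k), the resulting map from I(k) to Δu(k) is block-structured by construction.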
In order to determine the controller parameters w in such a way that the control law (7) minimizes the cost (2), it is required that the cost is minimized for a set of training data,

V^{(m)}(k) = \big\{ x^{(m)}(k),\; u^{(m)}(k-1),\; y_r^{(m)}(k+1), \ldots, y_r^{(m)}(k+N) \big\}, \qquad m = 1, 2, \ldots, M        (9)

Using the control strategy (7), the system evolution for the initial state x^{(m)}(k) is given by

x^{(m)}(i+1) = f(x^{(m)}(i), u^{(m)}(i))
\Delta u^{(m)}(i) = g_{\mathrm{NN}}(I(i); w)                            (10)
y^{(m)}(i) = h(x^{(m)}(i)), \qquad i = k, k+1, \ldots

Define the cost associated with the training data (9),

J_N^{(m)}(w) = \sum_{i=k}^{k+N-1} \big[ (y^{(m)}(i+1) - y_r^{(m)}(i+1))^T Q\, (y^{(m)}(i+1) - y_r^{(m)}(i+1)) + \Delta u^{(m)}(i)^T R\, \Delta u^{(m)}(i) \big] + q_N(x^{(m)}(k+N))        (11)

The training of the approximator (7) now consists of solving the nonlinear least-squares optimization problem

\min_w \sum_{m=1}^{M} J_N^{(m)}(w)                                      (12)

subject to the constraints

g_x(x^{(m)}(i+1)) \le 0
g_u(u^{(m)}(i)) \le 0                                                   (13)
g_\Delta(\Delta u^{(m)}(i)) \le 0, \qquad i = k, k+1, \ldots, k+N-1

The training problem can be solved by a gradient-based nonlinear least-squares optimization procedure, such as the Levenberg–Marquardt algorithm. From Eq. (11), the cost function gradients are given by

\frac{\mathrm{d} J_N^{(m)}(w)}{\mathrm{d} w} = \sum_{i=k}^{k+N-1} 2 \left[ (y^{(m)}(i+1) - y_r^{(m)}(i+1))^T Q \frac{\partial y^{(m)}(i+1)}{\partial w^T} + \Delta u^{(m)}(i)^T R \frac{\partial \Delta u^{(m)}(i)}{\partial w^T} \right]        (14)

where

\frac{\partial y^{(m)}(i+1)}{\partial w^T} = \frac{\partial h(x^{(m)}(i+1))}{\partial x^{(m)}(i+1)^T} \frac{\partial x^{(m)}(i+1)}{\partial w^T}
\frac{\partial \Delta u^{(m)}(i)}{\partial w^T} = \frac{\partial g_{\mathrm{NN}}(I(i); w)}{\partial w^T} + \frac{\partial g_{\mathrm{NN}}(I(i); w)}{\partial x^{(m)}(i)^T} \frac{\partial x^{(m)}(i)}{\partial w^T}        (15)

where ∂g_NN(I(i); w)/∂w^T is the partial derivative of the neural network output with respect to the network parameters, which depends on the network structure, and ∂x^{(m)}(i)/∂w^T is obtained recursively according to

\frac{\partial x^{(m)}(i+1)}{\partial w^T} = \frac{\partial f(x^{(m)}(i), u^{(m)}(i))}{\partial x^{(m)}(i)^T} \frac{\partial x^{(m)}(i)}{\partial w^T} + \frac{\partial f(x^{(m)}(i), u^{(m)}(i))}{\partial u^{(m)}(i)^T} \frac{\partial u^{(m)}(i)}{\partial w^T}
\frac{\partial u^{(m)}(i)}{\partial w^T} = \frac{\partial u^{(m)}(i-1)}{\partial w^T} + \frac{\partial \Delta u^{(m)}(i)}{\partial w^T}        (16)
\frac{\partial u^{(m)}(k-1)}{\partial w^T} = 0, \qquad i = k, k+1, \ldots, k+N-1
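The training procedure of Eqs. (10)–(12) amounts to simulating the closed loop over the horizon for each training initial state, accumulating the cost (11), and minimizing the sum over the network weights. The sketch below illustrates this under simplifying assumptions: the constraints (13) are not enforced, a quasi-Newton routine with numerically approximated gradients replaces the Levenberg–Marquardt algorithm with the analytic recursions (14)–(16), and f, h, g_nn and build_info are placeholder functions supplied by the user.

```python
import numpy as np
from scipy.optimize import minimize

def rollout_cost(w, f, h, g_nn, build_info, x0, u_prev, y_ref, Q, R):
    """Closed-loop training cost J_N^(m)(w) of Eq. (11) for one initial state.

    f, h       : system functions of Eq. (1)
    g_nn       : neural controller, Delta u = g_nn(I, w), cf. Eq. (7)
    build_info : user-supplied map (x, u_prev, y_ref, i) -> I(i)
    y_ref      : array of reference outputs y_r(k+1), ..., y_r(k+N)
    """
    x, u, J = np.asarray(x0, float), np.asarray(u_prev, float), 0.0
    for i in range(len(y_ref)):
        du = g_nn(build_info(x, u, y_ref, i), w)   # controller move, Eq. (10)
        u = u + du                                  # incremental input, Eq. (5)
        x = f(x, u)                                 # state update, Eq. (1)
        e = h(x) - y_ref[i]                         # tracking error
        J += e @ Q @ e + du @ R @ du                # stage cost of Eq. (11)
    return J                                        # terminal cost q_N omitted

def train(w0, training_set, sim_fns, Q, R):
    """Approximately solve Eq. (12); the constraints (13) are not enforced here."""
    total = lambda w: sum(rollout_cost(w, *sim_fns, *data, Q, R)
                          for data in training_set)
    return minimize(total, w0, method="BFGS").x
```

Here training_set would hold one (x0, u_prev, y_ref) tuple per training initial state x^{(m)}(k), and sim_fns the tuple (f, h, g_nn, build_info).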
Remark 2. The cost function (11) used to train the neural network controller is similar to the costs used in model predictive control. The proposed controller can therefore be regarded as an explicit model predictive controller. Notice that for a given controller complexity, the computational effort required to optimize the controller parameters w does not depend critically on the length N of the control horizon. It may therefore be feasible to use longer control horizons than in model predictive control, where the number of parameters to be optimized increases in proportion to the length of the control horizon.

4. Numerical examples

white noise disturbances v(k) and n(k) with variances R_v and R_n. Using the sampling time 0.2 min, the time delay is L = 5, and the system matrices can be represented as functions of the output according to

F(y) = \begin{bmatrix} p_1(y)+0.96 & p_2(y) & p_1(y)+0.96 & 0 \\ p_1(y) & p_2(y)+0.96 & p_1(y) & 0 \\ 0 & 0 & 0 & 0 \\ p_3(y) & p_4(y) & p_3(y) & 1 \end{bmatrix}

G_0(y) = 0.2 \begin{bmatrix} p_1(y)+0.96 \\ p_1(y) \\ 1 \\ p_3(y) \end{bmatrix}, \qquad
G_1(y) = 0.2 \begin{bmatrix} p_2(y) \\ p_2(y)+0.96 \\ 0 \\ p_4(y) \end{bmatrix}, \qquad
G_v(y) = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}

H = \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix}
Fig. 1. Responses obtained in Example 1 with the neural network controller (solid lines) and model predictive control (dashed lines) for setpoint changes.

Fig. 2. Responses obtained in Example 1 with the neural network controller (solid lines) and model predictive control (dashed lines) for setpoint changes when there is measurement noise.

Fig. 3. Responses obtained in Example 1 with the neural network controller (solid lines) and model predictive control (dashed lines) for a sequence of setpoint changes. The total cost obtained when using the neural network controller is 2.60 and 1.60 when using the optimal MPC strategy.

Fig. 4. Responses obtained in Example 1 with the neural network controller (solid lines) and model predictive control (dashed lines) for step disturbances when there is measurement noise. The total cost is 22.6 when using the neural network controller and 21.8 when using the optimal MPC strategy.

using step disturbances. Elimination of steady-state offsets was guaranteed by imposing the condition Δu(k) = g_NN(I(k); w) = 0 at the steady states, cf. [1].

The simulation results show that the neural network controller achieves near-optimal control performance for various disturbance types. It should be noted that the neural network controller requires only 238 flops at each sampling interval, which is less than 0.1% of the average number of operations required by the model predictive controller. This is in accordance with the results in [1].

Example 2. In this example both centralized and decentralized neural network model predictive controllers are designed for a simulated multivariable non-isothermal continuous stirred tank reactor with a van de Vusse reaction scheme [8]. The process involves the reactant A, the desired product B, as well as two by-products C and D which are also produced in the reaction. The reactor is modelled by a system of four coupled differential equations,

\frac{\mathrm{d}c_A}{\mathrm{d}t} = \frac{\dot{V}}{V_R}(c_{A0} - c_A) - k_1(\vartheta) c_A - k_3(\vartheta) c_A^2
\frac{\mathrm{d}c_B}{\mathrm{d}t} = -\frac{\dot{V}}{V_R} c_B + k_1(\vartheta) c_A - k_2(\vartheta) c_B
\frac{\mathrm{d}\vartheta}{\mathrm{d}t} = -\frac{1}{\rho C_p}\big[ k_1(\vartheta) c_A \Delta H_{R_{AB}} + k_2(\vartheta) c_B \Delta H_{R_{BC}} + k_3(\vartheta) c_A^2 \Delta H_{R_{AD}} \big] + \frac{\dot{V}}{V_R}(\vartheta_0 - \vartheta) + \frac{k_w A_R}{\rho C_p V_R}(\vartheta_K - \vartheta)
\frac{\mathrm{d}\vartheta_K}{\mathrm{d}t} = \frac{1}{m_K C_{pK}}\big[ \dot{Q}_K + k_w A_R (\vartheta - \vartheta_K) \big]        (22)
where c_A and c_B are the concentrations of components A and B, respectively, ϑ is the reactor temperature and ϑ_K is the coolant temperature. The concentration of A in the feed stream is c_A0 and the temperature of the feed stream is ϑ_0. The reactor volume is V_R, V̇ is the feed flow rate and Q̇_K is the rate of heat addition or removal. The reaction coefficients k_1, k_2 and k_3 are given by the Arrhenius equation,

k_i(\vartheta) = k_{0i} \exp\!\left( \frac{-E_{Ai}}{\vartheta + 273.15\,^\circ\mathrm{C}} \right), \qquad i = 1, 2, 3        (23)

Numerical values of the model parameters are given in Table 2.

Table 2
Parameters of the van de Vusse reactor

k_{01} = 1.287 · 10^12 h^-1        ΔH_{R_AB} = 4.2 kJ/mol
k_{02} = 1.287 · 10^12 h^-1        ΔH_{R_BC} = -11.0 kJ/mol
k_{03} = 9.043 · 10^9 l/(mol h)    ΔH_{R_AD} = -41.85 kJ/mol
E_{A1} = 9758.3 K                  ρ = 0.9342 kg/l
E_{A2} = 9758.3 K                  C_p = 3.01 kJ/(kg K)
E_{A3} = 8560 K                    k_w = 4032 kJ/(h m² K)
A_R = 0.215 m²                     V_R = 0.01 m³
m_K = 5.0 kg                       C_{pK} = 2.0 kJ/(kg K)

The control objective is to control the concentration c_B and the reactor temperature ϑ by manipulating the feed flow rate V̇ and the rate of heat addition or removal Q̇_K. The concentration c_B and the reactor temperature ϑ are available from measurements. The feed concentration c_A0 and the feed temperature ϑ_0 are treated as disturbances, with the nominal values 5.1 mol/l and 130 °C, respectively. The feed flow V̇ is constrained to the interval 0.05 m³/h ≤ V̇ ≤ 0.35 m³/h and the rate of heat removal Q̇_K lies in the range −9000 kJ/h ≤ Q̇_K ≤ 0 kJ/h.

A discrete-time system representation of the form (1) was constructed by using Euler's approximation to discretize the model (22) with the sampling time 20 s. The controlled outputs were taken as y_1 = c_B [mol/l] and y_2 = ϑ [°C] and the inputs were defined as u_1 = V̇ [m³/h] and u_2 = Q̇_K [kJ/h].

The procedure in Section 3 was used to design both centralized and decentralized neural network controllers for the reactor system. Training data were generated in analogy with Example 1, with the outputs and setpoints in the region 0.8 mol/l ≤ c_B ≤ 1.0 mol/l, 125 °C ≤ ϑ ≤ 135 °C. For trajectory following, the initial states were taken to correspond to steady states associated with the outputs y_1^m(k), y_2^m(k) and setpoints y_sp,1^m, y_sp,2^m. Four equally spaced values in the considered regions were selected for each output. For each initial steady state, eight combinations of setpoint values (y_sp,1^m, y_sp,2^m) = (y_1^m(k) ± 0.1, y_2^m(k) ± 5), (y_1^m(k), y_2^m(k) ± 5), (y_1^m(k) ± 0.1, y_2^m(k)) were selected. Excluding values falling outside the selected variable range, this results in 84 initial states. The length of the prediction horizon was set to N = 60 steps. For disturbance rejection, training data were constructed by taking constant reference signals y_r1^m(i) = y_sp,1^m, y_r2^m(i) = y_sp,2^m, i = k, k+1, ..., and initial states corresponding to steady states with y_1^m(k) = y_sp,1^m ± 0.02 and y_2^m(k) = y_sp,2^m ± 1. Using four equally spaced values for each setpoint, 64 initial states are obtained. The length of the prediction horizon was N = 25. Thus, the total number of initial states in the training set was M = 148, and the number of training data points was 6640. The test data set consisted of four setpoint changes between the values c_B = 0.85 and 0.95 mol/l and the values ϑ = 128 and 133 °C (cf. Figs. 5–7), and was used as in Example 1 to select the network sizes.
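To make the discretization step concrete, the following sketch implements the reactor model (22)–(23) with the parameter values of Table 2 and a single forward-Euler step of 20 s, yielding a discrete-time model of the form (1). The state ordering, function names and the use of one Euler step per sample (rather than a finer sub-sampled integration) are assumptions of the sketch, not details taken from the paper.

```python
import numpy as np

# Parameters from Table 2 (van de Vusse reactor)
K0 = np.array([1.287e12, 1.287e12, 9.043e9])   # k_01, k_02 [1/h], k_03 [l/(mol h)]
EA = np.array([9758.3, 9758.3, 8560.0])        # activation temperatures [K]
dHAB, dHBC, dHAD = 4.2, -11.0, -41.85          # reaction enthalpies [kJ/mol]
rho, Cp = 0.9342, 3.01                         # density [kg/l], heat capacity [kJ/(kg K)]
kw, AR, VR = 4032.0, 0.215, 0.01               # kJ/(h m^2 K), m^2, m^3
mK, CpK = 5.0, 2.0                             # kg, kJ/(kg K)
cA0, theta0 = 5.1, 130.0                       # nominal feed concentration and temperature

def arrhenius(theta):
    """Reaction coefficients k_1, k_2, k_3 of Eq. (23)."""
    return K0 * np.exp(-EA / (theta + 273.15))

def reactor_ode(x, u):
    """Right-hand side of the four coupled ODEs (22); x = [cA, cB, theta, thetaK]."""
    cA, cB, theta, thetaK = x
    V, QK = u                                  # feed flow [m^3/h], heat removal [kJ/h]
    k1, k2, k3 = arrhenius(theta)
    dcA = V / VR * (cA0 - cA) - k1 * cA - k3 * cA**2
    dcB = -V / VR * cB + k1 * cA - k2 * cB
    dth = (-(k1 * cA * dHAB + k2 * cB * dHBC + k3 * cA**2 * dHAD) / (rho * Cp)
           + V / VR * (theta0 - theta)
           + kw * AR / (rho * Cp * VR) * (thetaK - theta))
    dthK = (QK + kw * AR * (theta - thetaK)) / (mK * CpK)
    return np.array([dcA, dcB, dth, dthK])

def f_discrete(x, u, Ts=20.0 / 3600.0):
    """Euler discretization of (22) with sampling time 20 s (time unit: hours)."""
    return x + Ts * reactor_ode(x, u)

def h(x):
    """Controlled outputs y_1 = cB [mol/l], y_2 = theta [deg C]."""
    return np.array([x[1], x[2]])
```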
Fig. 5. Responses obtained in Example 2 to four setpoint changes when using a neural network controller (solid lines) and an optimal model predictive controller (dashed lines) designed for the weights in Eq. (25).
Fig. 6. Responses obtained in Example 2 to four setpoint changes used as test data when using a centralized neural network controller (solid lines) and an optimal model predictive controller (dashed lines) designed for the weights in Eq. (26).
Fig. 7. Responses obtained in Example 2 to four setpoint changes used as test data when using a decentralized neural network controller (solid lines) and an optimal model predictive controller (dashed lines).
The neural network controllers were taken to be functions of past outputs y(k − i) and input increments Δu(k − i), the state x_r(k) of the reference model (cf. Eq. (20)) and the current setpoint. Both outputs had identical reference models, which were taken as first-order systems with unit stationary gains and poles at 0.8. In the centralized controller case the controller in Eq. (7) was taken as a function of

I(k) = \big\{ y(k), \ldots, y(k - n_y + 1),\; \Delta u(k-1), \ldots, \Delta u(k - n_u),\; x_r(k),\; y_{sp}(k) \big\}        (24)

where n_y = 3 and n_u = 3. No significant improvement could be obtained by increasing the values n_y and n_u.
For the multivariable example system, the training of the neural network controllers turned out to be more demanding than for the single-input single-output system studied in Example 1. In particular, the optimization of the neural network weights could become stuck in local minima. These were avoided by starting the optimization from a number of random initial points.

It was observed that in the multivariable case, the choice of the relative magnitudes of the weights on the outputs in the cost function (11) has a significant effect on the network training performance. If the contribution to the cost function from one output dominates over the contributions from the other outputs, the network will only learn to control the dominating output, whereas satisfactory control of the other outputs is not achieved. This happens despite the fact that the optimal model predictive controller based on the same weights achieves good control performance for all outputs. For the inputs a similar situation holds. The set of cost function weights for which a neural network controller can be efficiently trained by the procedure in Section 3 is therefore limited. This behaviour is illustrated in Fig. 5, which shows the closed-loop responses obtained with the optimal MPC strategy and the best neural network approximator which could be found when the weight matrices in the cost function (11) were taken as

Q = \begin{bmatrix} 400 & 0 \\ 0 & 10 \end{bmatrix}, \qquad R = \begin{bmatrix} 10 & 0 \\ 0 & 10^{-7} \end{bmatrix}        (25)

The contribution of the first output c_B to the cost is much smaller than the contribution of the second output ϑ. Consequently, the second output dominates and poor control of the first output is obtained. Therefore, the weight on the first output should be increased. A similar phenomenon, although less prominent, can be seen for the inputs. The controllers considered below will be based on the weight matrices

Q = \begin{bmatrix} 2500 & 0 \\ 0 & 1 \end{bmatrix}, \qquad R = \begin{bmatrix} 90 & 0 \\ 0 & 10^{-7} \end{bmatrix}        (26)

For these weights, the contributions to the cost from the individual outputs (inputs) will have the same orders of magnitude when using the optimal strategy.

The best performance on the test data when using a controller with the input in Eq. (24) was achieved with a network having four hidden layer neurons and 78 weights. The cost on the training data was 2336 and on the test data the cost was 128.3. For comparison, with an optimal model predictive controller the training data cost was 1719 and the test data cost was 125.8. The responses of the neural network controller and model predictive control on the test data sequences are shown in Fig. 6. Notice that the system has inverse response characteristics, and both controllers give inverse responses.

An optimal decentralized neural network controller was designed as follows. For the chemical reactor it is natural to use a decentralized controller where the feed flow u_1 is a function of the product concentration y_1 and the rate of heat exchange u_2 is a function of the reactor temperature y_2. Hence a decentralized neural network controller consisting of the individual controllers

\Delta u_i(k) = g_{\mathrm{NN},i}(I_i(k); w_i), \qquad i = 1, 2        (27)

was used, where the information available to the individual controllers consisted of local variables only,

I_i(k) = \big\{ y_i(k), \ldots, y_i(k - n_{y,i} + 1),\; \Delta u_i(k-1), \ldots, \Delta u_i(k - n_{u,i}),\; x_{r,i}(k),\; y_{sp,i}(k) \big\}, \qquad i = 1, 2        (28)

where n_{y,i} = n_{u,i} = 3, i = 1, 2. No significant improvement could be obtained by increasing the values n_{y,i} and n_{u,i}. The network sizes were determined as above. A neural network controller with four hidden layer neurons was used to control the concentration, while a network with three hidden layer neurons was used to control the temperature. The total number of weights is 72. The network gives the cost 3145 on the training data and 139.2 on the test data. Fig. 7 gives the closed-loop responses for the test data, showing that the performance of the decentralized controller is remarkably good for setpoint changes, despite the fact that there are considerable interactions between the loops (cf. Eq. (22)).

It should be observed that although the decentralized controllers are represented by separate networks, they cannot be trained independently, because due to the interactions between the loops one controller affects the output controlled by the other and vice versa. The simultaneous training of the networks can be performed in a straightforward way by defining the global parameter vector w = [w_1^T, w_2^T]^T and by applying a standard gradient-based nonlinear least-squares approach of the form described in Section 3. The training is further complicated by the fact that the information I_i(k) used to determine the control variable u_i(k) does not define the state of the system uniquely. For these reasons the training of the decentralized neural network controllers proved to be more demanding, both computationally and with respect to the quality of training data required. In this example, the whole training data set was essential in order to train the decentralized controller properly, whereas the centralized controller could be trained satisfactorily even when the number of data points was reduced.
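The simultaneous training described above can be organized by stacking the two weight vectors into one global parameter vector and letting a single optimizer update both subnetworks at once. The fragment below sketches the packing and unpacking step; closed_loop_cost is a hypothetical routine that simulates the full two-loop system under the controllers (27)–(28) and returns the summed training cost (11).

```python
import numpy as np
from scipy.optimize import minimize

def pack(w1, w2):
    """Global parameter vector w = [w1^T, w2^T]^T."""
    return np.concatenate([w1, w2])

def unpack(w, n1):
    """Split the global vector back into the two subnetwork weight vectors."""
    return w[:n1], w[n1:]

def train_decentralized(w1_init, w2_init, closed_loop_cost):
    """Jointly optimize both decentralized controllers over one parameter vector.

    closed_loop_cost(w1, w2) is assumed to simulate the full two-loop system
    under the controllers (27)-(28) and return the summed training cost (11).
    """
    n1 = len(w1_init)
    objective = lambda w: closed_loop_cost(*unpack(w, n1))
    res = minimize(objective, pack(w1_init, w2_init),
                   method="BFGS")  # stand-in for the Levenberg-Marquardt step
    return unpack(res.x, n1)
```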
5. Conclusion

Optimal neural network control of constrained nonlinear systems has been studied. The neural network controller is designed by minimizing an MPC type cost function off-line for a set of training data. In this way the procedure is closely related to MPC, and it can be considered as a form of explicit model predictive control.

The neural network controller has a number of distinct advantages over standard nonlinear model predictive control. In analogy with other explicit MPC methods, the neural network controller has substantially reduced on-line computational requirements. In addition, the computational effort involved in the network training depends mainly on the network complexity, and not on the length of the control horizon. This makes it feasible to design controllers with a longer control horizon than might be possible in MPC. Moreover, the structure of the neural network controller can be fixed, so that controllers with a specified structure, such as decentralized controllers, can be designed. The main limitation of the neural network controller is that substantial off-line computations may be needed in order to train it properly, and for some choices of cost functions it may not even be feasible to achieve satisfactory accuracies.

Numerical examples show that the neural network model predictive controller can be trained to achieve near-optimal control performance (when compared to the optimal MPC strategy) using both centralized and decentralized controller structures.

Acknowledgement

This work was supported by the Academy of Finland (Grant 206750). Bernt M. Åkesson was supported by the Finnish Graduate School in Chemical Engineering (GSCE).

References

[1] B.M. Åkesson, H.T. Toivonen, J.B. Waller, R.H. Nyström, Neural network approximation of a nonlinear model predictive controller applied to a pH neutralization process, Computers & Chemical Engineering 29 (2) (2005) 323–335.
[2] M.S. Ahmed, M.A. Al-Dajani, Neural regulator design, Neural Networks 11 (9) (1998) 1695–1709.
[3] F. Allgöwer, T.A. Badgwell, S.J. Qin, J.B. Rawlings, S.J. Wright, Nonlinear predictive control and moving horizon estimation – an introductory overview, in: P.M. Frank (Ed.), Advances in Control: Highlights of ECC'99, Springer-Verlag, Berlin, 1999, pp. 391–449 (Chapter 12).
[4] S.N. Balakrishnan, R.D. Weil, Neurocontrol: a literature survey, Mathematical and Computer Modelling 23 (1–2) (1996) 101–117.
[5] A. Bemporad, M. Morari, V. Dua, E.N. Pistikopoulos, The explicit linear quadratic regulator for constrained systems, Automatica 38 (1) (2002) 3–20.
[6] D.P. Bertsekas, J.N. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, Belmont, MA, 1996.
[7] L. Cavagnari, L. Magni, R. Scattolini, Neural network implementation of nonlinear receding-horizon control, Neural Computing & Applications 8 (1) (1999) 86–92.
[8] S. Engell, K.-U. Klatt, Nonlinear control of a non-minimum phase CSTR, in: Proceedings of the American Control Conference, San Francisco, CA, USA, 1993, pp. 2941–2945.
[9] J. Gómez Ortega, E.F. Camacho, Mobile robot navigation in a partially structured static environment, using neural predictive control, Control Engineering Practice 4 (12) (1996) 1669–1679.
[10] T.K. Gustafsson, B.O. Skrifvars, K.V. Sandström, K.V. Waller, Modeling of pH for control, Industrial & Engineering Chemistry Research 34 (3) (1995) 820–827.
[11] M.A. Henson, Non-linear model predictive control: current status and future directions, Computers & Chemical Engineering 23 (2) (1998) 187–202.
[12] K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural Networks 2 (5) (1989) 359–366.
[13] K.J. Hunt, D. Sbarbaro, R. Zbikowski, P.J. Gawthrop, Neural networks for control—a survey, Automatica 28 (6) (1992) 1083–1112.
[14] M.A. Hussain, Review of the applications of neural networks in chemical process control – simulation and online implementation, Artificial Intelligence in Engineering 13 (1) (1999) 55–68.
[15] M.R.D. Nayeri, A. Alasty, K. Daneshjou, Neural optimal control of flexible spacecraft slew maneuver, Acta Astronautica 55 (10) (2004) 817–827.
[16] M. Nørgaard, O. Ravn, N.K. Poulsen, L.K. Hansen, Neural Networks for Modelling and Control of Dynamic Systems, Springer-Verlag, London, 2000.
[17] R.H. Nyström, B.M. Åkesson, H.T. Toivonen, Gain-scheduling controllers based on velocity-form linear parameter-varying models applied to an example process, Industrial & Engineering Chemistry Research 41 (2) (2002) 220–229.
[18] R.H. Nyström, K.V. Sandström, T.K. Gustafsson, H.T. Toivonen, Multimodel robust control of nonlinear plants: a case study, Journal of Process Control 9 (2) (1999) 135–150.
[19] T. Parisini, R. Zoppoli, A receding-horizon regulator for nonlinear systems and neural approximation, Automatica 31 (10) (1995) 1443–1451.
[20] T. Parisini, R. Zoppoli, Neural approximations for multistage optimal control of nonlinear stochastic systems, IEEE Transactions on Automatic Control 41 (6) (1996) 889–895.
[21] W.H. Press, B.P. Flannery, S.A. Teukolsky, W.T. Vetterling, Numerical Recipes in C: The Art of Scientific Computing, Cambridge University Press, Cambridge, 1992.
[22] S.J. Qin, T.A. Badgwell, An overview of nonlinear model predictive control applications, in: F. Allgöwer, A. Zheng (Eds.), Nonlinear Model Predictive Control, Birkhäuser Verlag, Basel, 2000.
[23] C.-Y. Seong, B. Widrow, Neural dynamic optimization for control systems—Part II: Theory, IEEE Transactions on Systems, Man, and Cybernetics 31 (4) (2001) 490–501.
[24] R. Zoppoli, M. Sanguineti, T. Parisini, Can we cope with the curse of dimensionality in optimal control by using neural approximators?, in: Proceedings of the 40th IEEE Conference on Decision and Control, Orlando, FL, USA, 2001, pp. 3540–3545.