
Journal of Process Control 16 (2006) 937–946

www.elsevier.com/locate/jprocont

A neural network model predictive controller


Bernt M. Åkesson, Hannu T. Toivonen *

Department of Chemical Engineering, Åbo Akademi University, FIN-20500 Åbo, Finland

Received 1 November 2005; received in revised form 9 June 2006; accepted 11 June 2006

Abstract

A neural network controller is applied to the optimal model predictive control of constrained nonlinear systems. The control law is
represented by a neural network function approximator, which is trained to minimize a control-relevant cost function. The proposed
procedure can be applied to construct controllers with arbitrary structures, such as optimal reduced-order controllers and decentralized
controllers.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Model predictive control; Neural networks; Nonlinear control

1. Introduction

Due to the complexity of nonlinear control problems it is in general necessary to apply various computational or approximative procedures for their solution. In this context a widely used approach is model predictive control (MPC), which relies on solving a numerical optimization problem on line. Another approach is to apply function approximators, such as artificial neural networks, which can be trained off line to represent the optimal control law.

In model predictive control the control signal is determined by minimizing a future cost on-line at each time instant. It has found widespread industrial use for control of constrained multivariable systems and nonlinear processes [3,11,22]. A potential drawback of the MPC methodology is that the optimization problem may be computationally quite demanding, especially for nonlinear systems. In order to reduce the on-line computational requirements, explicit model predictive control has been introduced for linear MPC problems [5], where part of the computations are performed off line.

The powerful function approximator properties of neural networks make them useful for representing nonlinear models or controllers, cf. for example [4,13,14]. Methods based on model-following control have been particularly popular in neural network control. A limitation of this approach is, however, that it is not well suited for systems with unstable inverses. It is therefore well motivated to study methods based on optimal control techniques.

A number of neural network-based methods have been suggested for optimal control problems, where the control objective is to minimize a control-relevant cost function. One approach is to apply a neural network to approximate the solution of the dynamic programming equation associated with the optimal control problem [6]. A more direct approach is to mimic the MPC methodology and train a neural network controller in such a way that the future cost over a prediction horizon is minimized. One way to achieve this follows the explicit MPC technique, using a neural network to approximate a model predictive control strategy, which is mapped by off-line calculations [1,7,9,19]. Instead of training a neural network to approximate an optimal model predictive control strategy, an alternative and more direct approach is to train the neural network controller to minimize the cost directly, without the need to calculate a model predictive controller. Various versions of this procedure have been presented. Parisini

* Corresponding author. Tel.: +358 2 2154451; fax: +358 2 2154479.
E-mail addresses: bakesson@abo.fi (B.M. Åkesson), htoivone@abo.fi, Hannu.Toivonen@abo.fi (H.T. Toivonen).

0959-1524/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jprocont.2006.06.001

and Zoppoli [20] and Zoppoli et al. [24] study stochastic optimal control problems, where the controller is parameterized as a function of past inputs and outputs using neural network approximators. Seong and Widrow [23] consider an optimal control problem where the initial state has a random distribution, and the controller is a state feedback described by a neural network. In both studies a stochastic approximation type algorithm is used to train the network. Similar methods have been presented by Ahmed and Al-Dajani [2] and Nayeri et al. [15] using steepest descent methods to train the neural network controllers.

In many applications it is relevant to design controllers which have a specified structure. For complex systems it may for example be of interest to use a reduced-order controller, or to apply decentralized control for systems with many inputs and outputs. In model predictive control the complete system model is used to predict the future system trajectory, and the optimal control signal is therefore implicitly a function of the whole state of the system model. Hence, model predictive control is not applicable to fixed-structure control problems. In contrast, controllers based on neural network function approximators can be applied to optimal fixed-structure control by imposing the appropriate structure on the neural network approximator.

In this paper, the approach studied in [20,2,24,23,15] is formulated for constrained MPC type nonlinear optimal control problems with structural constraints. The control law is represented by a neural network approximator, which is trained off line to minimize a control-relevant cost function. The design of optimal low-order controllers is accomplished by specifying the input to the neural network controller appropriately. Decentralized and other fixed-structure neural network controllers are constructed by introducing appropriate constraints on the network structure. The performance of the neural network controller is evaluated on numerical examples and compared to optimal model predictive controllers.

2. Problem formulation

We consider the control of a discrete-time nonlinear system described by

  x(k+1) = f(x(k), u(k))
  y(k) = h(x(k))                                                          (1)

where y(k) ∈ R^p is the controlled output, u(k) ∈ R^r is the manipulated input, and x(k) ∈ R^n is a state vector. The control objective is to keep the output close to a specified reference trajectory y_r(i) in such a way that large control signal variations are avoided and possible hard constraints on the states and inputs are satisfied. A commonly used, quantitative formulation of the control objective is to minimize the cost

  J_N(U_N(k), k) = Σ_{i=k}^{k+N−1} [(ŷ(i+1|k) − y_r(i+1))ᵀ Q (ŷ(i+1|k) − y_r(i+1)) + Δu(i)ᵀ R Δu(i)] + q_N(x̂(k+N|k))   (2)

subject to the constraints

  g_x(x̂(i+1|k)) ≤ 0
  g_u(u(i)) ≤ 0                                                           (3)
  g_Δ(Δu(i)) ≤ 0,  i = k, k+1, ..., k+N−1

where x̂(·|k) and ŷ(·|k) denote the predicted state and output as functions of future inputs and the state at time instant k. Here, U_N(k) denotes the future inputs,

  U_N(k) = {u(k), u(k+1), ..., u(k+N−1)}                                  (4)

and Δu denotes the incremental input,

  Δu(k) = u(k) − u(k−1)                                                   (5)

In Eq. (2), Q and R are symmetric positive (semi)definite output and input weight matrices, and q_N(·) is a non-negative terminal cost.

From the optimality principle of dynamic programming [6] it follows that minimization of the cost (2) gives the solution of the infinite-horizon optimal control problem obtained in the limiting case N → ∞, if the terminal cost q_N(x̂(k+N|k)) is taken as the minimum cost from time instant k+N to infinity. The finite-horizon cost (2) is therefore well motivated. For nonlinear systems the optimal control problem has, however, in general no closed-form solution. Therefore, various brute-force and suboptimal methods have been studied.

In model predictive control (MPC) the control signal is determined by minimizing the cost (2) numerically with respect to the input sequence U_N(k) at each sampling instant. Only the first element u(k) of the optimal input sequence is applied to the system. In the next sampling period, a new optimization problem is solved to give the optimal control signal at sampling instant k+1, etc. In this approach the terminal cost q_N(·) can be taken as an approximation of the minimum cost from time instant k+N to infinity, but is in practice more often used as a means to ensure closed-loop stability, or is omitted altogether if the optimization horizon is sufficiently long. The model predictive control approach has the drawback that a nonlinear, constrained optimization problem should be solved at each sampling period, which may be computationally too demanding for on-line implementation.

In order to reduce the computational burden in model predictive control, explicit MPC has been proposed. In this approach, part of the computations are performed off line. For nonlinear systems, the optimal MPC strategy may be mapped by off-line computations, and described by a function approximator. More precisely, the control strategy which minimizes the cost (2) defines the control signal, or equivalently, its increment Δu(k), as a function,

  Δu_opt(k) = g(I_MPC(k))                                                 (6)

where I_MPC(k) = {x̂(k|k), y_r(k+1), y_r(k+2), ..., y_r(k+N)} is the information used to compute the cost (2) as a function of U_N(k). The functional relationship (6) can be evaluated for any I_MPC(k) by minimizing the cost (2). The optimal strategy can therefore be approximated by a function approximator which can be trained off line. Although this approach has been found very useful, it has some limitations. As the MPC strategy is based on the information I_MPC(k) used to calculate the predicted outputs, this approach is not well suited for representing reduced-order or fixed-structure controllers. In addition, the computational effort required to generate training data may be extensive, as each training data point requires the solution of an MPC optimization problem.

3. A neural network optimal controller

In this section a procedure for constructing a neural network model predictive controller for the control problem described in Section 2 is presented. Here we adopt a procedure in which the controller is trained directly to minimize the cost (2) for a training data set, without having to compute the optimal MPC control signals by off-line optimizations.

The controller is represented as

  Δu(k) = g_NN(I(k); w)                                                   (7)

where g_NN(I(k); w) is a (neural network) function approximator, I(k) denotes the information which is available to the controller at time instant k, and w denotes a vector of approximator parameters (neural network weights).

If complete state information is assumed, i.e., I(k) = I_MPC(k), the controller (7) can be considered as a functional approximation of the optimal MPC strategy (6). The approach studied here is, however, not restricted to controllers with full state information, and typically the set I(k) is taken to consist of a number of past inputs u(k−i) and outputs y(k−i) as well as information about the setpoint or reference trajectory y_r(k+i). In this way it is possible to construct low-complexity controllers for high-order systems. Various ways to select the set I(k) are illustrated in the examples presented in Section 4.

Remark 1. Besides allowing for controllers of reduced complexity, the controller structure may be fixed as well by imposing a structure on the mapping g_NN(·;·). For example, assuming that the information has the decomposition I(k) = [I₁(k), I₂(k), ..., I_r(k)], a decentralized controller Δu_i(k) = g_NN,i(I_i(k); w_i), i = 1, ..., r, is obtained by requiring that the controller has the structure

  g_NN(I(k); w) = [g_NN,1ᵀ(I₁(k); w₁), g_NN,2ᵀ(I₂(k); w₂), ..., g_NN,rᵀ(I_r(k); w_r)]ᵀ   (8)

In order to determine the controller parameters w in such a way that the control law (7) minimizes the cost (2) it is required that the cost is minimized for a set of training data,

  V^(m)(k) = {x^(m)(k), u^(m)(k−1), y_r^(m)(k+1), ..., y_r^(m)(k+N)},  m = 1, 2, ..., M   (9)

Using the control strategy (7), the system evolution for the initial state x^(m)(k) is given by

  x^(m)(i+1) = f(x^(m)(i), u^(m)(i))
  Δu^(m)(i) = g_NN(I(i); w)                                               (10)
  y^(m)(i) = h(x^(m)(i)),  i = k, k+1, ...

Define the cost associated with the training data (9),

  J_N^(m)(w) = Σ_{i=k}^{k+N−1} [(y^(m)(i+1) − y_r^(m)(i+1))ᵀ Q (y^(m)(i+1) − y_r^(m)(i+1)) + Δu^(m)(i)ᵀ R Δu^(m)(i)] + q_N(x^(m)(k+N))   (11)

The training of the approximator (7) now consists of solving the nonlinear least-squares optimization problem

  min_w Σ_{m=1}^{M} J_N^(m)(w)                                            (12)

subject to the constraints

  g_x(x^(m)(i+1)) ≤ 0
  g_u(u^(m)(i)) ≤ 0                                                       (13)
  g_Δ(Δu^(m)(i)) ≤ 0,  i = k, k+1, ..., k+N−1

The training problem can be solved by a gradient-based nonlinear least-squares optimization procedure, such as the Levenberg–Marquardt algorithm. From Eq. (11), the cost function gradients are given by

  dJ_N^(m)(w)/dw = 2 Σ_{i=k}^{k+N−1} [(y^(m)(i+1) − y_r^(m)(i+1))ᵀ Q ∂y^(m)(i+1)/∂wᵀ + Δu^(m)(i)ᵀ R ∂Δu^(m)(i)/∂wᵀ]   (14)

where

  ∂y^(m)(i+1)/∂wᵀ = [∂h(x^(m)(i+1))/∂x^(m)(i+1)ᵀ] ∂x^(m)(i+1)/∂wᵀ        (15)
  ∂Δu^(m)(i)/∂wᵀ = ∂g_NN(I(i); w)/∂wᵀ + [∂g_NN(I(i); w)/∂x^(m)(i)ᵀ] ∂x^(m)(i)/∂wᵀ

where ∂g_NN(I(i); w)/∂wᵀ is the partial derivative of the neural network output with respect to the network parameters, which depends on the network structure, and ∂x^(m)(i)/∂wᵀ is obtained recursively according to

oxðmÞ ði þ 1Þ of ðxðmÞ ðiÞ; uðmÞ ðiÞÞ oxðmÞ ðiÞ white noise disturbances v(k) and n(k) with variances Rv
¼
owT oxðmÞ ðiÞ
T
owT and Rn. Using the sampling time 0.2 min, the time delay
is L = 5, and the system matrices can be represented as
of ðxðmÞ ðiÞ; uðmÞ ðiÞÞ ouðmÞ ðiÞ
þ T
functions of the output according to
ouðmÞ ðiÞ owT ð16Þ 2 3
p1 ðyÞ þ 0:96 p2 ðyÞ p1 ðyÞ þ 0:96 0
ouðmÞ ðiÞ ouðmÞ ði  1Þ oDuðmÞ ðiÞ 6
¼ þ 6 p1 ðyÞ p2 ðyÞ þ 0:96 p1 ðyÞ 077
owT owT owT F ðyÞ ¼ 6 7;
ðmÞ 4 0 0 0 05
ou ðk  1Þ
¼ 0; i ¼ k; k þ 1; . . . ; k þ N  1 p3 ðyÞ p4 ðyÞ p3 ðyÞ 1
owT 2 3
p1 ðyÞ þ 0:96
6 p1 ðyÞ 7
Remark 2. The cost function (11) used to train the neural 6 7
G0 ðyÞ ¼ 0:26 7;
network controller is similar to the costs used in model 4 1 5
predictive control. The proposed controller can therefore p3 ðyÞ
be regarded as an explicit model predictive controller. 2 3 2 3
p2 ðyÞ 1
Notice that for a given controller complexity, the compu- 6 p ðyÞ þ 0:96 7 617
tational effort required to optimize the controller param- 6 2 7 6 7
G1 ðyÞ ¼ 0:26 7; Gv ðyÞ ¼ 6 7;
eters w does not depend critically on the length N of the 4 0 5 405
control horizon. It may therefore be feasible to use longer p4 ðyÞ 0
control horizons than in model predictive control, where
the number of parameters to be optimized increases in
proportion to length of the control horizon. H ¼ ½0 0 0 1

The state-dependent parameters are given by the


4. Numerical examples expressions
0:23
In this section the neural network model predictive con- p1 ðyÞ ¼
troller presented in Section 3 is illustrated on numerical 1 þ e4:69yþ24:59
examples. In all examples, the control law (7) is represented 0:59
p2 ðyÞ ¼
using a feedforward neural network with one hidden layer 1 þ e3:59yþ19:14
2
with hyperbolic tangent activation functions. This type of p3 ðyÞ ¼ 0:64eðy4:58Þ =0:40
 0:11
network can approximate any continuous nonlinear func- 2
p4 ðyÞ ¼ 1:31eðy4:53Þ =0:43
þ 0:17
tion to arbitrary accuracy [12]. The networks have the ele-
ments of I(k) as inputs and Du(k) as outputs. The networks which are functional representations of the tabulated sys-
were trained using the Levenberg–Marquardt algorithm tem parameters used in previous studies [1].
[21] to solve the nonlinear least-squares problem (12).
The numerical computations were performed using the By Eq. (17), the predicted outputs ^y ðk þ 1jkÞ; . . . ;
^y ðk þ L þ N jkÞ can be determined according to
routine lsqnonlin of MATLAB’s Optimization Toolbox.
In all examples, the optimization horizon N is taken suf- ^y ðk þ ljkÞ ¼ H^xðk þ l j kÞ; l ¼ 0; 1; . . . ; L þ N ð18Þ
ficiently large for the closed-loop systems to reach steady
where the state estimates are given by the state-dependent
state. Hence, a zero terminal cost is used in the cost func-
Kalman filter [1]
tion (11).
Example 1. In this example we consider the simulated pH ^xðkjkÞ ¼ ^xðkjk  1Þ þ KðkÞ½yðkÞ  H^xðkjk  1Þ
neutralization process studied in [10,18,17,1]. In [1] an ^xðk þ l þ 1jkÞ ¼ F ð^y ðk þ ljkÞÞ^xðk þ ljkÞ
explicit MPC scheme was applied to the process by using a þ G1 ð^y ðk þ ljkÞÞDuðk þ l  LÞ
neural network to approximate the optimal MPC strategy.
The process is described by nonlinear differential equa- l = 0, 1, . . . , L + N, where K(k) is the Kalman filter gain,
tions. By velocity-based linearization and integration over given by
the sampling period, the process can be described by the  1
state-dependent parameter model [17,1] KðkÞ ¼ P ðkÞH T HP ðkÞH T þ Rn
xðk þ 1Þ ¼ F ðyðkÞÞxðkÞ þ G0 ðyðkÞÞdðkÞ P ðk þ 1Þ ¼ F ð^y ðk j kÞÞðI  KðkÞH ÞP ðkÞF T ð^y ðk j kÞÞ þ Gv Rv GTv
þ G1 ðyðkÞÞDuðk  LÞ þ Gv ðyðkÞÞvðkÞ ð17Þ ð19Þ
yðkÞ ¼ HxðkÞ þ nðkÞ The optimal control strategy is a function of the predicted
where the output y(k) is the controlled pH value and u(k) is state ^xðk þ L j kÞ and the reference trajectory yr(k + L + i),
the input flow used for control. The system is subject to a i = 1, . . . , N. It will be assumed that the reference trajectory
deterministic disturbance d(k) and independent stochastic is given by a reference model

xr ðk þ 1Þ ¼ F r xr ðkÞ þ Gr y sp ðkÞ Table 1


ð20Þ Training results in Example 1 for networks with various numbers of
y r ðkÞ ¼ H r xr ðkÞ þ J r y sp ðkÞ hidden nodes (nh)

where ysp(k) denotes the setpoint, which is subject to step nh nw Cost


changes. It follows that the control strategy can be taken Training Test
as a function of the predicted system state ^xðk þ LjkÞ, the NN 5 41 4.03 3.21
state xr(k + Ljk) of the reference model, and the setpoint NN 10 81 3.73 2.94
ysp(k + L) at time instant k + L, i.e., the information I(k) NN 11 89 3.71 2.93
NN 12 97 3.71 2.93
available to the controller (7) can be taken as
  MPC – – 3.70 2.91
IðkÞ ¼ ^xðk þ LjkÞ; xr ðk þ LÞ; y sp ðk þ LÞ ð21Þ The total number of weights (nw) is also given. Results obtained with the
MPC strategy are included for comparison.
The design parameters were selected in accordance with [1].
The weights in the cost function (11) were Q = 1 and
R = 1. The white noise variances Rv = Rn = 0.001 were function on the test data is achieved using a network with 11
used in the Kalman filter equation (19), and the reference hidden layer nodes, and no further reduction could be
model (20) was taken as a second-order system with unit obtained by increasing the network size. Therefore this
stationary gain and a double pole at 0.9. network structure is used in the simulations below.
Due to the nonlinear system dynamics, rather large data These results agree with those in [1], where a network of
sets are required in order to capture the optimal controller the same size was found to be optimal when used to
characteristics accurately in the whole operating region. approximate the MPC strategy. For comparison, Table 1
This can be compared to nonlinear system identification, also shows the results obtained with a nonlinear MPC
where long training sequences are also required [16]. The strategy. The prediction and control horizon of the MPC
training data (9) were constructed so as to train the con- strategy was N = 20 steps.
troller for both trajectory following and disturbance rejec- In order to avoid convergence to local optima, the opti-
tion in the output (pH) range 3 6 y 6 7. Therefore two mization of the neural network weights was performed
types of training data were generated. For trajectory fol- using different initial points. In this example no problems
lowing, training data were generated by taking the initial with local optima were encountered. On average, the neu-
states as the stationary states corresponding to y ðmÞ ðlÞ ¼ ral network weights converged in 400–500 iterations. This
y ðmÞ ðmÞ
sp ðlÞ ¼ y r ðlÞ ¼ y
ðmÞ
ðkÞ; l < k and introducing the set- is comparable to the number of iterations required by a
point changes y sp ðiÞ ¼ y ðmÞ ðkÞ  1; i ¼ k; k þ 1; . . . Eight
ðmÞ
direct approximation of the optimal MPC strategy with a
equally spaced initial pH values and fourteen setpoints neural network of the same size [1] and using the same data
were selected in the chosen pH range, giving a total of 14 points. However, as the trajectories (10) and associated
reference trajectories. The control horizon in the cost func- cost (11) are evaluated at each iteration, the computational
tion (11) was set to N = 150, which is sufficient to achieve a burden associated with the training is heavier than in the
transition to the new setpoint according to the trajectory direct approximation. On the other hand, the direct
defined by the reference model (20). For disturbance rejec- approximation method requires the calculation of the opti-
tion, training data were constructed by taking constant mal MPC control action for each data point, and its overall
setpoints and reference signals y ðmÞ ðmÞ
r ðiÞ ¼ y sp ðiÞ; i ¼ k; k þ computational requirements are therefore heavier.
1; . . . ; k þ N and initial states corresponding to steady The closed-loop responses obtained with the neural net-
states with y ðmÞ ðkÞ ¼ y ðmÞ
sp ðkÞ  0:1. The same fourteen differ- work controller and the optimal MPC strategy for setpoint
ent constant setpoints which were used for trajectory changes are given in Fig. 1. Fig. 2 shows the responses when
following were selected, resulting in a total of 28 initial there is also a white measurement noise with variance
states. In this case the control horizon taken as N = 25 Rn = 0.001. Notice that the setpoint changes in Figs. 1 and
steps, which corresponds to the time required for the sys- 2 are distinct from the ones included in the training data.
tems to reach the setpoint after an initial offset. The total In order to further test the generalization properties of the
number of initial states and reference trajectories compris- neural network controller to situations not included in the
ing the training set is thus M = 42. As each element of the training data, a sequence of setpoint changes was applied,
training set contains N data points, the total number of where the changes take place before the new steady state
data points is 2800. has been reached. The responses in Fig. 3 show that the
The optimal network size was selected by using a sepa- neural network controller performs well in this case as well.
rate test data set. The test data consisted of 36 sequences Fig. 4 shows the responses to step disturbances in d(k) at
and 2400 data points, which were generated in a similar various steady-state pH values in the region 3–7. Step
way as the training data, using six initial pH values in the changes from 10 to 11 mmol/l occurred at time instant
range 3.5 6 y 6 6.5 and setpoints in the range 3 6 ysp 6 7. k = 75 and back to 10 mmol/l at time k = 325. Measure-
Networks of different sizes were trained in order to find ment noise was also present in the simulations. Notice that
the one with the smallest test data cost. The training results the disturbance d(k) is unknown to the controller and the
are summarized in Table 1. The minimum value of the cost neural network controller had not been specifically trained

7 7 7

6 6 6
pH

5 5

pH
5
4 4
4
3 3
3
0 20 40 60 80 100 0 20 40 60 80 100
0 20 40 60 80 100 120 140 160 180 200

10 10
10
u (mmol/l)

8 8

u (mmol/l)
8
6 6

6
4 4

0 20 40 60 80 100 0 20 40 60 80 100 4
k k
0 20 40 60 80 100 120 140 160 180 200
Fig. 1. Responses obtained in Example 1 with the neural network k
controller (solid lines) and model predictive control (dashed lines) for
setpoint changes. Fig. 3. Responses obtained in Example 1 with the neural network
controller (solid lines) and model predictive control (dashed lines) for a
sequence of setpoint changes. The total cost obtained when using the
neural network controller is 2.60 and 1.60 when using the optimal MPC
7 7 strategy.
6 6
pH

5 5 7
4 4
6
3 3
pH

5
0 20 40 60 80 100 0 20 40 60 80 100
4

10 10 3
u (mmol/l)

0 50 100 150 200 250 300 350 400 450 500


8 8

6 6
10

4 4
8
u (mmol/l)

0 20 40 60 80 100 0 20 40 60 80 100
k k
6
Fig. 2. Responses obtained in Example 1 with the neural network
controller (solid lines) and model predictive control (dashed lines) for 4

setpoint changes when there is measurement noise. 0 50 100 150 200 250 300 350 400 450 500
k

using step disturbances. Elimination of steady-state offsets Fig. 4. Responses obtained in Example 1 with the neural network
was guaranteed by imposing the condition Du(k) = controller (solid lines) and model predictive control (dashed lines) for step
disturbances when there is measurement noise. The total cost is 22.6 when
gNN(I(k); w) = 0 at the steady states, cf. [1].
using the neural network controller and 21.8 when using the optimal MPC
The simulation results show that the neural network strategy.
controller achieves near-optimal control performance for
various disturbance types. It should be noted that the neu-
ral network controller requires only 238 flops at each sam- dcA V_
pling interval, which is less than 0.1% of the average ¼ ðcA0  cA Þ  k 1 ð#ÞcA  k 3 ð#Þc2A
dt VR
number of operations required by the model predictive dcB V_
controller. This is in accordance with the results in [1]. ¼ cB þ k 1 ð#ÞcA  k 2 ð#ÞcB
dt VR
Example 2. In this example both centralized and decen- d# 1  
¼ k 1 ð#ÞcA DH RAB þ k 2 ð#ÞcB DH RBC þ k 3 ð#Þc2A DH RAD
tralized neural network model predictive controllers are dt qC p
designed for a simulated multivariable non-isothermal V_ k w AR
continuous stirred tank reactor with a van de Vusse þ ð#0  #Þ þ ð#K  #Þ
VR qC p V R
reaction scheme [8]. The process involves the reactant A,
d#K 1 _ 
the desired product B, as well as two by-products C and D ¼ QK þ k w AR ð#  #K Þ
dt mK C p K
which are also produced in the reaction. The reactor is
modelled by a system of four coupled differential equations, ð22Þ

where cA and cB are the concentrations of components A range 9000 kJ=h 6 Q_ K 6 0 kJ=h.
and B, respectively, # is the reactor temperature and #K A discrete-time system representation of the form (1)
is the coolant temperature. The concentration of A in the was constructed by using Euler’s approximation to discret-
feed stream is cA0 and the temperature of the feed stream ize the model (22) with the sampling time 20 s. The con-
is #0. The reactor volume is VR, V_ is the feed flow rate trolled outputs were taken as y1 = cB [mol/l] and y2 =
and Q_ K is the rate of heat addition or removal. The reac- # [C] and the inputs were defined as u1 ¼ V_ [m3/h] and
tion coefficients k1, k2 and k3 are given by the Arrhenius u2 ¼ Q_ K [kJ/h].
equation, The procedure in Section 3 was used to design both cen-
  tralized and decentralized neural network controllers for
EAi the reactor system. Training data were generated in analogy
k i ð#Þ ¼ k 0i  exp ; i ¼ 1; 2; 3 ð23Þ
# þ 273:15  C with Example 1, with the outputs and setpoints in the region
0.8 mol/l 6 cB 6 1.0 mol/l, 125 C 6 # 6 135 C. For tra-
Numerical values of the model parameters are given in jectory following, the initial states were taken to correspond
Table 2. to steady states associated with the outputs y m1 ðkÞ; y m2 ðkÞ and
setpoints y msp;1 , y msp;2 . Four equally spaced values in the con-
The control objective is to control the concentration cB
sidered regions were selected for each output. For each ini-
and the reactor temperature # by manipulating the feed
tial steady state, eight combinations of setpoint values
flow rate V_ and the rate of heat addition or removal Q_ K .
ðy msp;1 ; y msp;2 Þ ¼ ðy m1 ðkÞ  0:1; y m2 ðkÞ  5Þ; ðy m1 ðkÞ; y m2 ðkÞ  5Þ;
The concentration cB and the reactor temperature # are
ðy m1 ðkÞ 0:1; y m2 ðkÞÞ were selected. Excluding values falling
available from measurements. The feed concentration cA0
outside the selected variable range, this results in 84 initial
and the feed temperature #0 are treated as disturbances,
states. The length of the prediction horizon was set to
with the nominal values 5.1 mol/l and 130 C, respectively.
N = 60 steps. For disturbance rejection, training data were
The feed flow V_ is constrained to the interval 0:05 m3 =h 6
constructed by taking constant reference signals y mr1 ðiÞ ¼
V_ 6 0:35 m3 =h and the rate of heat removal Q_ K lies in the
y msp;1 ; y mr2 ðiÞ ¼ y msp;2 ; i ¼ k; k þ 1; . . . and initial states corre-
sponding to steady states with y m1 ðkÞ ¼ y msp;1  0:02 and
Table 2 y m2 ðkÞ ¼ y msp;2  1. Using four equally spaced values for
Parameters of the van de Vusse reactor
each setpoint, 64 initial states are obtained. The length of
k01 = 1.287 · 1012 h1 DH RAB ¼ 4:2 kJ=mol the prediction horizon was N = 25. Thus, the total number
k02 = 1.287 · 1012 h1 DH RBC ¼ 11:0 kJ=mol
k03 = 9.043 · 109 l/(mol h) DH RAD ¼ 41:85 kJ=mol
of initial states in the training set was M = 148, and the
EA1 = 9758.3 K q = 0.9342 kg/l number of training data points was 6640. The test data set
EA2 = 9758.3 K Cp = 3.01 kJ/(kg K) consisted of four setpoint changes between the values
EA3 = 8560 K kw = 4032 kJ/(h m2 K) cB = 0.85 and 0.95 mol/l and the values # = 128 and
AR = 0.215 m2 VR = 0.01 m3 133 C (cf. Figs. 5–7), and was used as in Example 1 to select
mK = 5.0 kg C pK ¼ 2:0 kJ=ðkg KÞ
the network sizes.

1
c (mol/l)

0.9
B

0.8
0 10 20 0 10 20 0 10 20 0 10 20
135
ϑ (°C)

130

125
0 10 20 0 10 20 0 10 20 0 10 20

0.3
V (m3/h)

0.2

0.1
0 10 20 0 10 20 0 10 20 0 10 20
0
Q (MJ/h)

–5
K

–10
0 10 20 0 10 20 0 10 20 0 10 20
k k k k

Fig. 5. Responses obtained in Example 2 to four setpoint changes when using a neural network controller (solid lines) and an optimal model predictive
controller (dashed lines) designed for the weights in Eq. (25).
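The controllers compared in these figures were obtained by minimizing the summed cost (12) over closed-loop training trajectories, as described in Section 3. A minimal sketch of how the per-trajectory cost (11) can be evaluated under a one-hidden-layer tanh controller network (Eq. (7)) is given below. The information set I(k) used here is a simplified illustrative choice, all names are mine, and the paper instead used MATLAB's lsqnonlin with analytic gradients (Eqs. (14)–(16)).

```python
import numpy as np

def nn_controller(I, w, n_hidden, n_out):
    """One-hidden-layer tanh network g_NN(I(k); w), Eq. (7). The weight
    vector w packs W1 (n_hidden x dim I), b1, W2 (n_out x n_hidden), b2."""
    n_in = I.size
    i1 = n_hidden * n_in
    W1 = w[:i1].reshape(n_hidden, n_in)
    b1 = w[i1:i1 + n_hidden]
    i2 = i1 + n_hidden
    W2 = w[i2:i2 + n_out * n_hidden].reshape(n_out, n_hidden)
    b2 = w[i2 + n_out * n_hidden:]
    return W2 @ np.tanh(W1 @ I + b1) + b2

def training_cost(w, x0, u_prev, y_ref, f, h, Q, R, n_hidden):
    """Cost J_N^(m)(w) of Eq. (11) for one training initial state:
    simulate the closed loop (10) under the controller (7) and
    accumulate output-error and input-increment penalties."""
    x = np.asarray(x0, dtype=float).copy()
    u = np.asarray(u_prev, dtype=float).copy()
    J = 0.0
    for i in range(len(y_ref)):
        # illustrative choice of I(k): current output, previous input,
        # current reference value
        I = np.concatenate([h(x), u, y_ref[i]])
        du = nn_controller(I, w, n_hidden, u.size)
        u = u + du                    # Eq. (5)
        x = f(x, u)                   # system evolution, Eq. (10)
        e = h(x) - y_ref[i]
        J += float(e @ Q @ e + du @ R @ du)
    return J
```

The training objective (12) is then the sum of `training_cost` over all M training elements, which can be handed to a Levenberg–Marquardt-type nonlinear least-squares routine.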

Fig. 6. Responses obtained in Example 2 to four setpoint changes used as test data when using a centralized neural network controller (solid lines) and an optimal model predictive controller (dashed lines) designed for the weights in Eq. (26).

1
c (mol/l)

0.9
B

0.8
0 10 20 0 10 20 0 10 20 0 10 20
135
ϑ (°C)

130

125
0 10 20 0 10 20 0 10 20 0 10 20

0.3
V (m3/h)

0.2

0.1
0 10 20 0 10 20 0 10 20 0 10 20
0
Q (MJ/h)

–5
K

–10
0 10 20 0 10 20 0 10 20 0 10 20
k k k k

Fig. 7. Responses obtained in Example 2 to four setpoint changes used as test data when using a decentralized neural network controller (solid lines)
and an optimal model predictive controller (dashed lines).

The neural network controllers were taken to be functions of past outputs y(k − i) and input increments Δu(k − i), the state x_r(k) of the reference model (cf. Eq. (20)) and the current setpoint. Both outputs had identical reference models, which were taken as first-order systems with unit stationary gains and poles at 0.8. In the centralized controller case the controller in Eq. (7) was taken as a function of

I(k) = [ y(k), ..., y(k − n_y + 1), Δu(k − 1), ..., Δu(k − n_u), x_r(k), y_sp(k) ]    (24)

where n_y = 3 and n_u = 3. No significant improvement could be obtained by increasing the values n_y and n_u.

For the multivariable example system, the training of the neural network controllers turned out to be more demanding than for the single-input single-output system studied in Example 1. In particular, the optimization of the neural network weights could get stuck in local minima. These were avoided by starting the optimization from a number of random initial points.
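The controller input of Eq. (24) and the first-order reference model can be sketched as follows. This is a minimal illustration under stated assumptions: the class and variable names are ours, and the two-output, two-input dimensions follow the example.

```python
from collections import deque
import numpy as np

class ReferenceModel:
    # First-order reference model with unit stationary gain and a pole
    # at 0.8, as used for both outputs:
    #   x_r(k+1) = 0.8 * x_r(k) + 0.2 * y_sp(k)
    def __init__(self, n_outputs):
        self.x = np.zeros(n_outputs)

    def step(self, y_sp):
        self.x = 0.8 * self.x + 0.2 * np.asarray(y_sp, dtype=float)
        return self.x

def controller_input(y_hist, du_hist, x_r, y_sp):
    # Assemble I(k) of Eq. (24): the n_y most recent outputs, the n_u
    # most recent input increments, the reference-model state and the
    # current setpoint.
    return np.concatenate([np.ravel(list(y_hist)), np.ravel(list(du_hist)),
                           np.ravel(x_r), np.ravel(y_sp)])

# Toy usage with n_y = n_u = 3 and two outputs / two inputs:
y_hist = deque([np.zeros(2)] * 3, maxlen=3)   # y(k), ..., y(k - n_y + 1)
du_hist = deque([np.zeros(2)] * 3, maxlen=3)  # du(k-1), ..., du(k - n_u)
ref = ReferenceModel(2)
y_sp = np.array([0.9, 130.0])
I_k = controller_input(y_hist, du_hist, ref.step(y_sp), y_sp)
print(I_k.size)  # 3*2 + 3*2 + 2 + 2 = 16 entries
```

The deques keep the sliding histories fixed at length three, so the input vector fed to the network has constant dimension at every step.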
It was observed that in the multivariable case, the choice of the relative magnitudes of the weights on the outputs in the cost function (11) has a significant effect on the network training performance. If the contribution to the cost function from one output dominates over the contributions from the other outputs, the network will only learn to control the dominating output, whereas satisfactory control of the other outputs is not achieved. This happens despite the fact that the optimal model predictive controller based on the same weights achieves good control performance for all outputs. For the inputs a similar situation holds. The set of cost function weights for which a neural network controller can be efficiently trained by the procedure in Section 3 is therefore limited. This behaviour is illustrated in Fig. 5, which shows the closed-loop responses obtained with the optimal MPC strategy and the best neural network approximator which could be found when the weight matrices in the cost function (11) were taken as

Q = [ 400  0 ;  0  10 ],   R = [ 10  0 ;  0  10^-7 ]    (25)

The contribution of the first output cB to the cost is much smaller than the contribution of the second output ϑ. Consequently, the second output dominates and poor control of the first output is obtained. Therefore, the weight on the first output should be increased. A similar phenomenon, although less prominent, can be seen for the inputs. The controllers considered below will be based on the weight matrices

Q = [ 2500  0 ;  0  1 ],   R = [ 90  0 ;  0  10^-7 ]    (26)

For these weights, the contributions to the cost from the individual outputs (inputs) will have the same orders of magnitude when using the optimal strategy.

The best performance on the test data when using a controller with the input in Eq. (24) was achieved with a network having four hidden layer neurons, and 78 weights. The cost on the training data was 2336 and on the test data the cost was 128.3. For comparison, with an optimal model predictive controller the training data cost was 1719 and the test data cost was 125.8. The responses of the neural network controller and model predictive control on the test data sequences are shown in Fig. 6. Notice that the system has inverse response characteristics, and both controllers give inverse responses.

An optimal decentralized neural network controller was designed as follows. For the chemical reactor it is natural to use a decentralized controller where the feed flow u1 is a function of the product concentration y1 and the rate of heat exchange u2 is a function of the reactor temperature y2. Hence a decentralized neural network controller consisting of the individual controllers

Δu_i(k) = g_NN,i(I_i(k), w_i),   i = 1, 2    (27)

was used, where the information available to the individual controllers consisted of local variables only,

I_i(k) = [ y_i(k), ..., y_i(k − n_y,i + 1), Δu_i(k − 1), ..., Δu_i(k − n_u,i), x_r,i(k), y_sp,i(k) ],   i = 1, 2    (28)

where n_y,i = n_u,i = 3, i = 1, 2. No significant improvement could be obtained by increasing the values n_y,i and n_u,i. The network sizes were determined as above. A neural network controller with four hidden layer neurons was determined to control the concentration, while a network with three hidden layer neurons was used to control the temperature. The total number of weights is 72. The network gives the cost 3145 on the training data and 139.2 on the test data. Fig. 7 gives the closed-loop responses for the test data, showing that the performance of the decentralized controller is remarkably good for setpoint changes, despite the fact that there are considerable interactions between the loops (cf. Eq. (22)).

It should be observed that although the decentralized controllers are represented by separate networks, they cannot be trained independently. This is because, due to the interactions between the loops, one controller affects the output controlled by the other and vice versa. The simultaneous training of the networks can be performed in a straightforward way by defining the global parameter vector w = [w_1^T, w_2^T]^T and by applying a standard gradient-based nonlinear least-squares approach of the form described in Section 3. The training is further complicated by the fact that the information I_i(k) used to determine the control variable u_i(k) does not define the state of the system uniquely. For these reasons the training of the decentralized neural network controllers proved to be more demanding both computationally and with respect to the quality of training data required. In this example, the whole training data set was essential in order to train the decentralized controller properly, whereas the centralized controller could be trained satisfactorily even when the number of data points was reduced.
5. Conclusion

Optimal neural network control of constrained nonlinear systems has been studied. The neural network controller is designed by minimizing an MPC type cost function off-line for a set of training data. In this way the procedure is closely related to MPC, and it can be considered as a form of explicit model predictive control.

The neural network controller has a number of distinct advantages over standard nonlinear model predictive control. In analogy with other explicit MPC methods, the neural network controller has substantially reduced on-line computational requirements. In addition, the computational effort involved in the network training depends mainly on the network complexity, and not on the length of the control horizon. This makes it feasible to design controllers with a longer control horizon than might be possible in MPC. Moreover, the structure of the neural network controller can be fixed, so that controllers with a specified structure, such as decentralized controllers, can be designed. The main limitation of the neural network controller is that substantial off-line computations may be needed in order to train it properly, and for some choices of cost functions it may not even be feasible to achieve satisfactory accuracies.

Numerical examples show that the neural network model predictive controller can be trained to achieve near-optimal control performance (when compared to the optimal MPC strategy) using both centralized and decentralized controller structures.

Acknowledgement

This work was supported by the Academy of Finland (Grant 206750). Bernt M. Åkesson was supported by the Finnish Graduate School in Chemical Engineering (GSCE).