Neural Algorithms for Solving Differential Equations

H. LEE AND IN SEOK KANG
I. INTRODUCTION
algorithms for solving the finite difference equations. In Section II, the basic idea of
neural networks is reviewed, and general continuous or discrete neural minimiza-
tion algorithms are introduced. As an example, in Section III, the differential equation u' = f(u) is considered to show how the neural minimization algorithms can solve the equation, and the result of a numerical simulation of the equation u' = u is
described. In Section IV, general continuous and discrete neural algorithms for
solving a wide range of complex partial differential equations are derived. In
Section V, implementation schemes of neural algorithms utilizing high-capacity
optical interconnecting devices are described. Characteristic features of the neural
algorithms compared with conventional algorithms are discussed in Section VI.
FIG. 1. Schematic diagram of a simple neural network. $T_{ij}$, $T_{ik}$, and $T_{jk}$ are connection strengths between the pairs of neurons $(i, j)$, $(i, k)$, and $(j, k)$.
of the specific problem. Then, an initial trial state for the network representing the
initial trial solution may converge to the final state of the network according to the
neural dynamics, and it may give the solution of the problem. This method of com-
putation is based on the collective interaction between the neurons, and it exhibits
a high degree of parallelism. Another characteristic of the neural network is that the processing performed by each neuron is extremely simple; however, the large number of neurons and interconnections yields enormous computational power.
$$V_i = g(u_i), \qquad (2.2)$$

where $t$ is the continuous time which corresponds to the updating parameter, $T_{ij}$ is the interconnection strength, and $g(x)$ is a nonlinear function whose form can be taken to be
and stop at minima of this function. The updating rule represented by Eqs. (2.1),
(2.2), and (2.3) for minimizing the energy function (2.4) is highly parallel, and it has
been implemented by utilizing optics [7].
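The following short sketch illustrates this type of updating rule. It assumes the standard Hopfield form of Eqs. (2.1)-(2.4), namely $du_i/dt = \sum_j T_{ij}V_j$, $V_i = g(u_i)$ with a sigmoid nonlinearity, and the quadratic energy $E = -\tfrac12\sum_{ij}T_{ij}V_iV_j$; since those equations are not restated in full above, the precise form, the choice $g = \tanh$, and the Euler step size are assumptions of this sketch rather than the authors' exact definitions.

```python
import numpy as np

# Minimal sketch of continuous Hopfield dynamics (assumed form of Eqs. (2.1)-(2.4)):
#   du_i/dt = sum_j T_ij V_j,   V_i = g(u_i),   E = -1/2 sum_ij T_ij V_i V_j.
# The tanh nonlinearity and the Euler step dt are illustrative choices.

rng = np.random.default_rng(0)
N = 8
T = rng.normal(size=(N, N))
T = (T + T.T) / 2            # symmetric connection strengths
np.fill_diagonal(T, 0.0)

def g(u):
    return np.tanh(u)        # nonlinear thresholding function

def energy(V):
    return -0.5 * V @ T @ V  # quadratic energy function

u = rng.normal(size=N)       # initial trial state of the network
dt = 0.01
for step in range(2000):
    V = g(u)
    u += dt * (T @ V)        # assumed updating rule du_i/dt = sum_j T_ij V_j
print("final energy:", energy(g(u)))
```

Along this flow $dE/dt = -\sum_i g'(u_i)\bigl(\sum_j T_{ij}V_j\bigr)^2 \le 0$, so the state always moves downhill in $E$, which is the property exploited throughout the paper.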
The Hopfield model described above can be applied only to minimization of
quadratic energy functions which have symmetric T matrices. However, the
Hopfield model for arbitrary energy functions has been investigated recently [8].
Consider an energy function for a minimization problem that can be written as

$$E = F(V_i), \qquad (2.5)$$
where F is a non-singular and bounded function of variables Vi, and the partial
derivatives with respect to Vi are assumed to be well defined. E is assumed to be
always positive or zero, which is valid because E is bounded. The time evolution
of the energy function defined by Eq. (2.5) is given by
Equation (2.10) is always negative or zero. Therefore, the change of the energy function in time according to the updating rules, Eqs. (2.7) and (2.8), guarantees minimization of the energy function.
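A minimal sketch of this continuous minimization for a non-quadratic energy follows. It assumes the gradient form $du_i/dt = -\partial E/\partial V_i$ with $V_i = g(u_i)$, which is consistent with the requirement that the time derivative of the energy (Eq. (2.10)) be non-positive; the particular polynomial energy and the choice $g = \tanh$ are illustrative assumptions only.

```python
import numpy as np

# Sketch of continuous minimization of a general (non-quadratic) energy E = F(V),
# assuming an updating rule of the gradient form du_i/dt = -dE/dV_i, V_i = g(u_i),
# so that dE/dt = -sum_i g'(u_i) (dE/dV_i)^2 <= 0.
# The quartic F below and the tanh nonlinearity are illustrative assumptions.

def F(V):
    return np.sum((V**2 - 0.5)**2)          # bounded, non-negative polynomial energy

def dF_dV(V):
    return 4.0 * V * (V**2 - 0.5)           # partial derivatives dE/dV_i

g = np.tanh
u = np.array([1.5, -0.2, 0.7])              # initial trial state
dt = 0.01
for _ in range(5000):
    V = g(u)
    u -= dt * dF_dV(V)                      # assumed rule: du_i/dt = -dE/dV_i
print("V =", g(u), " E =", F(g(u)))         # E has decreased toward a local minimum
```

Along the flow $dE/dt = -\sum_i g'(u_i)\,(\partial E/\partial V_i)^2 \le 0$, mirroring the argument above.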
where B’s are the state variables and the total number of state variables is N.
Partially synchronous minimization is considered for the most general case.
Totally synchronous or totally asynchronous minimizations are specific examples of
the general case. Assume that, at each step, M state variables are selected randomly,
and minimization is carried out by updating the M state variables simultaneously
and leaving all the other state variables unchanged. M can be any integer from 1
to N, and the minimization algorithm becomes totally asynchronous or totally
$$B_i' = B_i + \Delta B_i, \qquad (2.12)$$

where $i \in P$. The updated state variables are also binary variables having values 1 and $-1$. Therefore, the possible values of $\Delta B_i$ are

$$\Delta B_i = -2,\ 0,\ 2. \qquad (2.13)$$
The incremental change in energy $\Delta E$ due to the updated state variables given by Eq. (2.12) is considered to develop an algorithm which minimizes the energy function described by Eq. (2.11). $\Delta E$ is defined as
where $D[B_{i_1} \cdots B_{i_m}]E$ is a partial derivative with respect to the state variables $B_{i_1}, \ldots, B_{i_m}$, evaluated at $\Delta B_i = 0$ for all $i \in P$. The total number of terms in the summation of
Eq. (2.16) is finite because E is a polynomial.
To reduce the products of changes in Eq. (2.16) to linear forms, the following
relations are derived. For an arbitrary state variable B and a positive integer n,
$(\Delta B)^n$ satisfies

$$(\Delta B)^n = (-2B)^{n-1}\,\Delta B, \qquad (2.17)$$
which is the same as the left-hand side of Eq. (2.17) in the case of $\Delta B = -2$.
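As a quick check of Eq. (2.17) (an added example, not part of the original argument): take $n = 2$ and $B = 1$. If the variable flips, $\Delta B = -2B = -2$, and

$$(\Delta B)^2 = 4 = (-2B)^{2-1}\,\Delta B = (-2)(-2),$$

while for $\Delta B = 0$ both sides vanish. The identity therefore converts any power of $\Delta B$ into an expression linear in $\Delta B$, which is what the reduction below relies on.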
Consider next a product of changes given by $A\,\Delta B_{i_1}\cdots\Delta B_{i_m}$ with an arbitrary coefficient $A$. Assume that $m > 1$. The product can be written as
AABi, ~~~AB,=~A)(1/2)[-{S(A)ABil-AB,~~~~AB,)*
+ (ABil)* + (ABiT*..AB,)*]p (2.20)
G IAI(lP)C(ABil)* + (AB,..~ABim)21, (2.21)
where S(A) is the sign of A. The first term in Eq. (2.20) is always negative or zero,
and Eq. (2.21) follows. The second term in Eq. (2.21) involves a product of a smaller number of changes than the left-hand side of Eq. (2.20). Therefore, this technique can be applied iteratively to reduce the product of changes in Eq. (2.20) to a sum of individual changes, and it is given by
$$A\,\Delta B_{i_1}\cdots\Delta B_{i_m} \le |A|\Bigl[\,\sum_{j=1}^{m-1} 2^{-j}\,(\Delta B_{i_j})^{2^j} + 2^{-(m-1)}\,(\Delta B_{i_m})^{2^{(m-1)}}\Bigr]. \qquad (2.22)$$

The corresponding bound for a general term of Eq. (2.16), with the coefficient $A$ replaced by $D[B_{i_1}\cdots B_{i_m}]E$, is Eq. (2.23).
where

$$Z(m) = -4\Bigl[\,\sum_{j=1}^{m-1} 2^{(2^j - j - 1)} + 2^{(2^{(m-1)} - m)}\Bigr]. \qquad (2.25)$$
If Eqs. (2.24) and (2.25) are used in Eq. (2.16), the incremental energy change is
given by
$$\Delta E \le -\sum_{i \in P} G_i\,\Delta B_i, \qquad (2.26)$$
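The following sketch illustrates the totally asynchronous special case ($M = 1$) in the most direct way: a randomly selected binary variable is flipped whenever the flip does not increase the energy, which guarantees $\Delta E \le 0$ at every step. This accept-if-the-energy-does-not-increase test stands in for the thresholding rule of Eqs. (2.27)-(2.29), which is not restated here, and the particular polynomial energy is an illustrative assumption.

```python
import numpy as np

# Sketch of totally asynchronous discrete minimization over binary variables
# B_i in {+1, -1} (the M = 1 special case).  The polynomial energy below is an
# illustrative assumption; a flip is accepted only if the energy does not
# increase, so Delta E <= 0 at every step.

rng = np.random.default_rng(1)

def energy(B):
    # example polynomial energy in binary variables
    return (B[0]*B[1] + B[1]*B[2] - B[0]*B[2] + B[0] - 2.0*B[2])**2

B = rng.choice([-1, 1], size=3)
for step in range(200):
    i = rng.integers(len(B))            # select one state variable at random
    trial = B.copy()
    trial[i] = -trial[i]                # Delta B_i = -2 B_i
    if energy(trial) <= energy(B):      # accept only if the energy does not increase
        B = trial
print("B =", B, " E =", energy(B))
```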
where K and K' are positive integers, and the V's are binary variables.
where A is a positive constant and the updating rule is derived from the time
derivative
$$\frac{dE}{dt} = \sum_{a,b,s} 2A\Bigl[\frac{U_{a+1} - U_{a-1}}{2h} - f(U_a)\Bigr]\Bigl\{\frac{1}{2h}\frac{\partial U_{a+1}}{\partial V_{bs}} - \frac{1}{2h}\frac{\partial U_{a-1}}{\partial V_{bs}} - \frac{\partial f(U_a)}{\partial V_{bs}}\Bigr\}\frac{dV_{bs}}{dt}. \qquad (3.6)$$

Collecting, for each pair $(b, s)$, the terms that multiply $dV_{bs}/dt$ and reversing the sign gives

$$-\frac{\partial E}{\partial V_{bs}} = -2A\Bigl\{\Bigl[\frac{U_b - U_{b-2}}{2h} - f(U_{b-1})\Bigr]\frac{1}{2h} - \Bigl[\frac{U_{b+2} - U_b}{2h} - f(U_{b+1})\Bigr]\frac{1}{2h} - \Bigl[\frac{U_{b+1} - U_{b-1}}{2h} - f(U_b)\Bigr]\frac{df(U_b)}{dU_b}\Bigr\}\frac{\partial U_b}{\partial V_{bs}}. \qquad (3.7)$$
Therefore, the continuous neural algorithm for solving Eq. (3.1) consists of

$$\frac{dW_{bs}}{dt} = -2A\Bigl\{\Bigl(\frac{1}{2h}\Bigr)^2\bigl(-U_{b+2} + 2U_b - U_{b-2}\bigr) + \frac{1}{2h}\Bigl[f(U_{b+1}) - f(U_{b-1}) - (U_{b+1} - U_{b-1})\frac{df(U_b)}{dU_b}\Bigr] + f(U_b)\frac{df(U_b)}{dU_b}\Bigr\}\frac{\partial U_b}{\partial V_{bs}} \qquad (3.8)$$
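Under the form of Eq. (3.8) shown above (partly a reconstruction, and therefore an assumption of this note), the specific equation of the next subsection, $f(u) = u$, simplifies the bracketed combination considerably: $f(U_{b+1}) - f(U_{b-1}) - (U_{b+1} - U_{b-1})\,df(U_b)/dU_b = 0$ and $f(U_b)\,df(U_b)/dU_b = U_b$, so the updating rule reduces to

$$\frac{dW_{bs}}{dt} = -2A\Bigl[\Bigl(\frac{1}{2h}\Bigr)^2\bigl(2U_b - U_{b+2} - U_{b-2}\bigr) + U_b\Bigr]\frac{\partial U_b}{\partial V_{bs}},$$

which is the form discretized in time as Eq. (3.15) below (compare Eq. (6.1), where the corresponding last coefficient is $\lambda^2$).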
B. Numerical Simulation
Numerical simulation of the continuous neural algorithm described in Section III.A was carried out to explain in detail how it works. The differential equation is
chosen as
du/dx = u. (3.10)
The finite difference equation for Eq. (3.10) is given by
[Figure 2 appears here: plots of the trial solution at iteration steps k = 1, 25, 50, 75, 100, and 150, approaching the value 2.7182 at x = 1.]
FIG. 2. Convergence of the initial trial solution U_i = 1 for i = 1 to 100. The solution was stabilized after 170 iterations.
Equations (3.15)-(3.18) are utilized to obtain the solution for the differential equation.

Computer simulation based on Eqs. (3.15)-(3.18) has been performed. The parameters were chosen as h = 0.01, N = 100, and ΔtA = 0.0001. Here, ΔtA was chosen as the maximum value satisfying the condition that the final converged solution does not depend on ΔtA, which characterizes the discretization of the differential equation (3.13) with respect to the parameter t. For simplicity, the algorithm without thresholding was considered, and an AT&T PC6300 was used. The initial value was U_0 = 1. Two sets of values for U_i, i = 1 to N, at the k = 0 iteration step, i.e., initial trial solutions, were selected to show that both trial solutions converge to the correct solution. Figure 2 shows convergence of the trial solution given by U_i = 1 for i = 1 to N. Another trial solution, defined by U_i = 1.5hi + 1 for i = 1 to N, is shown to converge to the correct solution in Fig. 3.
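The following sketch reproduces the simulation just described, assuming the update $U_b(k+1) = U_b(k) + 2\,\Delta t A\,[(1/2h)^2(U_{b+2} - 2U_b + U_{b-2}) - U_b]$ and boundary (ghost) values supplied at every iteration from the finite difference relation $(U_{a+1} - U_{a-1})/2h = U_a$, as prescribed in Section IV.B. A somewhat smaller step than the quoted $\Delta t A = 0.0001$ is used as a safety margin for this simplified reconstruction, so the iteration count need not match the 170 iterations reported for Fig. 2.

```python
import numpy as np

# Simulation of du/dx = u, u(0) = 1, on 0 <= x <= 1 by continuous neural minimization.
# Reconstructed update (see the text above); parameters from the text: h = 0.01,
# N = 100, initial trial solution U_i = 1.  dtA is taken smaller than the quoted
# 0.0001 purely as a safety margin for this simplified reconstruction.

h, N = 0.01, 100
dtA = 5.0e-5
U0 = 1.0                      # initial condition u(0) = 1
U = np.ones(N)                # trial solution U_1 ... U_N

for k in range(200_000):
    # boundary (ghost) values from (U_{a+1} - U_{a-1})/2h = U_a at the boundary
    Um1 = U[0] - 2 * h * U0             # U_{-1}
    Up1 = U[N - 2] + 2 * h * U[N - 1]   # U_{N+1}
    Up2 = U[N - 1] + 2 * h * Up1        # U_{N+2}
    ext = np.concatenate(([Um1, U0], U, [Up1, Up2]))         # U_{-1} ... U_{N+2}
    lap = (ext[4:] - 2 * ext[2:-2] + ext[:-4]) / (2 * h)**2  # (U_{b+2}-2U_b+U_{b-2})/(2h)^2
    U_new = U + 2 * dtA * (lap - U)
    if np.max(np.abs(U_new - U)) < 1e-10:
        break
    U = U_new

print("iterations:", k, "  U(x=1) =", U[-1], "  e =", np.exp(1))
```

The converged profile should approximate $e^x$, with the value at $x = 1$ close to 2.7182 (cf. Fig. 2).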
C. Discrete Neural Algorithm
Derivation of the discrete neural algorithm for solving the differential equation u' = u is presented to illustrate the algorithm introduced in Section II.C. A totally asynchronous algorithm, i.e., one in which a single randomly selected state variable is updated at a time, is considered for simplicity. The energy function for the discrete model is given by
$$E = A\sum_a \Bigl[\frac{U_{a+1} - U_{a-1}}{2h} - U_a\Bigr]^2, \qquad U_a = \sum_{s=-K'}^{K} 2^s B_{as}, \qquad (3.19)$$
FIG. 3. Convergence of the initial trial solution U_i = 1.5ih + 1 for i = 1 to 100 and h = 0.01. In this case, the initial trial solution is closer to the correct solution than the initial trial solution used in Fig. 2. However, the solution was stabilized after 160 iterations.
where K and K' are positive integers, and $B_{as}$ are binary variables having values 1 and $-1$. The incremental change in energy due to the variable $B_{as}$ is
Equation (3.22) together with the step thresholding function gives the discrete
neural algorithm.
A. General Algorithms
The general form of differential equations to be considered in this section is
represented by
$$(D_{ml}U_j)_a = D_m\Bigl[\frac{U_{j,a+l'} - U_{j,a-l'}}{2h}\Bigr] = \Bigl(\frac{1}{2h}\Bigr)^2 \sum_b \bigl\{\delta_{a+m'+l',b} - \delta_{a+m'-l',b} - \delta_{a-m'+l',b} + \delta_{a-m'-l',b}\bigr\}\,U_{jb},$$
where $\delta$ is a unit matrix, $x_{l0}$ is the initial value for the coordinate $x_l$, and $m'$ and $l'$ are $M$-dimensional vectors satisfying the conditions $m'(m) = 1$ and $m' = 0$ otherwise, and $l'(l) = 1$ and $l' = 0$ otherwise. The finite difference equation for the differential equation (4.1) is then given by
where $A$ is a positive constant. If the energy function given by Eq. (4.11) is minimized with respect to the variables $V_{ias}$, the solution of the differential equation is given by the values $V_{ias}(\infty)$ which minimize Eq. (4.11). The time derivative of Eq. (4.11) is given by
and

$$\frac{\partial F^2_{kc}}{\partial U_{jb}} = \frac{\partial F^2_{kc}}{\partial U_{jb}} + \sum_{m,p}\Bigl[\frac{\partial F^2_{kc}}{\partial (D_m U_p)}\Bigr]\frac{\partial (D_m U_p)}{\partial U_{jb}} + \sum_{m,l,p}\Bigl[\frac{\partial F^2_{kc}}{\partial (D_{ml} U_p)}\Bigr]\frac{\partial (D_{ml} U_p)}{\partial U_{jb}}, \qquad (4.18)$$

where the first term on the right-hand side denotes the explicit dependence of $F^2_{kc}$ on $U_{jb}$,
and

$$\frac{\partial (D_{ml}U_p)_c}{\partial U_{jb}} = \Bigl(\frac{1}{2h}\Bigr)^2 \sum_e \bigl\{\delta_{c+m'+l',e} - \delta_{c+m'-l',e} - \delta_{c-m'+l',e} + \delta_{c-m'-l',e}\bigr\}\,\frac{\partial U_{pe}}{\partial U_{jb}} = \Bigl(\frac{1}{2h}\Bigr)^2 \delta_{jp}\bigl\{\delta_{c+m'+l',b} - \delta_{c+m'-l',b} - \delta_{c-m'+l',b} + \delta_{c-m'-l',b}\bigr\}. \qquad (4.19)$$
$$+ \sum_{m,p}\Bigl[\frac{\partial F^2_{kc}}{\partial (D_m U_p)}\Bigr]\frac{1}{2h}\,\delta_{jp}\bigl(\delta_{c+m',b} - \delta_{c-m',b}\bigr) + \sum_{m,l,p}\Bigl[\frac{\partial F^2_{kc}}{\partial (D_{ml} U_p)}\Bigr]\Bigl(\frac{1}{2h}\Bigr)^2 \delta_{jp}\bigl\{\delta_{c+m'+l',b} - \delta_{c+m'-l',b} - \delta_{c-m'+l',b} + \delta_{c-m'-l',b}\bigr\}$$
$$= \frac{\partial F^2_{kc}}{\partial U_{jb}} + \sum_{m}\Bigl[\frac{\partial F^2_{kc}}{\partial (D_m U_j)}\Bigr]\frac{1}{2h}\bigl(\delta_{c+m',b} - \delta_{c-m',b}\bigr) + \sum_{m,l}\Bigl[\frac{\partial F^2_{kc}}{\partial (D_{ml} U_j)}\Bigr]\Bigl(\frac{1}{2h}\Bigr)^2\bigl\{\delta_{c+m'+l',b} - \delta_{c+m'-l',b} - \delta_{c-m'+l',b} + \delta_{c-m'-l',b}\bigr\}$$
$$\frac{\partial E}{\partial V_{ias}} = A\sum_{j,b,k,c}\Bigl\{\frac{\partial F^2_{kc}}{\partial U_{jb}} + \sum_{m}\Bigl[\frac{\partial F^2_{kc}}{\partial (D_m U_j)}\Bigr]\frac{1}{2h}\bigl(\delta_{c+m',b} - \delta_{c-m',b}\bigr) + \sum_{m,l}\Bigl[\frac{\partial F^2_{kc}}{\partial (D_{ml} U_j)}\Bigr]\Bigl(\frac{1}{2h}\Bigr)^2\bigl(\delta_{c+m'+l',b} - \delta_{c+m'-l',b} - \delta_{c-m'+l',b} + \delta_{c-m'-l',b}\bigr)\Bigr\}\frac{\partial U_{jb}}{\partial V_{ias}}$$
$$= A\sum_{k,c}\Bigl\{\frac{\partial F^2_{kc}}{\partial U_{ia}} + \sum_{m}\frac{1}{2h}\bigl(\delta_{a-m',c} - \delta_{a+m',c}\bigr)\Bigl[\frac{\partial F^2_{kc}}{\partial (D_m U_i)}\Bigr] + \sum_{m,l}\Bigl(\frac{1}{2h}\Bigr)^2\bigl(\delta_{a-m'-l',c} - \delta_{a-m'+l',c} - \delta_{a+m'-l',c} + \delta_{a+m'+l',c}\bigr)\Bigl[\frac{\partial F^2_{kc}}{\partial (D_{ml} U_i)}\Bigr]\Bigr\}\frac{\partial U_{ia}}{\partial V_{ias}}, \qquad (4.23)$$
where Eq. (4.9) was used to sum over j and b, and a and c in the unit matrices δ
were rearranged. Utilizing the relations (4.4) and (4.5) in Eq. (4.23), the algorithm
for solving the differential equation is finally given by
$$\frac{dW_{ias}}{dt} = -A\sum_{k}\Bigl\{\frac{\partial F^2_{ka}}{\partial U_{ia}} - \sum_{m} D_m\Bigl[\frac{\partial F^2_{k}}{\partial (D_m U_i)}\Bigr]_a + \sum_{m,l} D_{ml}\Bigl[\frac{\partial F^2_{k}}{\partial (D_{ml} U_i)}\Bigr]_a + \text{higher-order terms}\Bigr\}\,\frac{\partial U_{ia}}{\partial V_{ias}}, \qquad (4.24)$$
and
where the $B$'s are binary variables having values 1 and $-1$. The discrete algorithm can be obtained by using Eqs. (2.27) and (2.29).
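As an illustration of the structure of Eq. (4.24), the sketch below applies the residual-squared minimization to the single equation $d^2u/dx^2 = s(x)$ with $u(0) = u(1) = 0$. For simplicity it uses the ordinary three-point second difference rather than the $2h$-spaced operator $D_{ml}$ above, so it should be read as an analogue of the general algorithm rather than a literal transcription; the source term, mesh, and step size are illustrative assumptions. Because the residual depends on $U$ only through a self-adjoint difference operator, the gradient of $E = \sum_c F_c^2$ is obtained by applying the same stencil to $2F$, which is the discrete counterpart of the chain-rule manipulations in Eqs. (4.18)-(4.23).

```python
import numpy as np

# Residual-squared minimization for d^2u/dx^2 = s(x), u(0) = u(1) = 0, as a simple
# analogue of the general algorithm of Section IV.A.  The three-point second
# difference (spacing h) is used here instead of the 2h-spaced operator D_ml;
# s(x), h, and dtA are illustrative choices.

h = 0.1
x = np.arange(0.0, 1.0 + h / 2, h)            # mesh x_0 ... x_N
s = -np.pi**2 * np.sin(np.pi * x)             # exact solution is then u = sin(pi*x)
U = np.zeros_like(x)                          # trial solution; U[0], U[-1] fixed at 0

def second_diff(V):
    """(V_{a+1} - 2 V_a + V_{a-1}) / h^2 with zero extension outside the mesh."""
    Vp = np.pad(V, 1)
    return (Vp[2:] - 2 * Vp[1:-1] + Vp[:-2]) / h**2

interior = np.zeros_like(x, dtype=bool)
interior[1:-1] = True                         # residuals imposed at interior points only

dtA = 3e-6
for k in range(30000):
    F = np.where(interior, second_diff(U) - s, 0.0)   # residual F_c at the mesh points
    grad = 2.0 * second_diff(F)                       # dE/dU_b: same self-adjoint stencil applied to 2F
    grad[0] = grad[-1] = 0.0                          # boundary values held fixed
    U -= dtA * grad                                   # gradient-descent step (discrete-time analogue of Eq. (4.24))
print("max error vs sin(pi*x):", np.max(np.abs(U - np.sin(np.pi * x))))
```

The converged profile agrees with the exact solution up to the discretization error of the three-point scheme on this coarse mesh.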
B. Boundary Conditions
Dynamical equations, i.e., updating rule, which yield solutions for differential
equations have been developed. However, the initial or boundary conditions of the
differential equations have not been considered. In this section, the method of
incorporating the initial or boundary conditions in the updating rule is discussed.
First, it is shown that initial conditions can be considered as boundary condi-
tions. As described in Section III.B, consider a first-order differential equation of
one independent variable x. The initial condition may be written as u(x_0) = u_0,
where u is the dependent variable and x0 is the initial point. In this case, the final
point x1 should be specified to solve the differential equation numerically. There-
fore, if 0 and N+ 1, which correspond to discretized coordinates of x0 and x1 + h,
are selected as the boundary points, the variables of the problem are U(1), U(2), ..., U(N) and the boundary conditions are given by U(0) and U(N+1). In this
case, U(N+ 1) is not a fixed value but varies subject to the original finite difference
equation. Therefore, U(N+ 1) can be written as a function of U(O), U(l), .... U(N).
Boundary conditions including initial conditions can be incorporated as follows.
First, select the boundary for a problem. Then, the domain surrounded by the
boundary is discretized. The boundary values are assigned to the discrete variables
for the boundary points. However, the dynamical equations presented in
Section IV.A are chosen only for the variables inside the boundary. The values for
the boundary variables are supplied by utilizing the finite difference equations at the
boundary at each iteration.
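As a concrete illustration (using the first-order equation $u' = u$ of Section III; the relations below are simply Eqs. (6.2)-(6.4) with $\lambda = -1$, restated here as an example rather than new material), the finite difference equation $(U_{a+1} - U_{a-1})/2h = U_a$ applied at the boundary points supplies the boundary variables at iteration $k$ as

$$U_{-1}(k) = U_1(k) - 2hU_0, \qquad U_{N+1}(k) = U_{N-1}(k) + 2hU_N(k), \qquad U_{N+2}(k) = U_N(k) + 2hU_{N+1}(k),$$

so that only $U_1, \ldots, U_N$ are updated by the dynamical equations while the boundary values follow the interior solution at every step.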
phenomena [10] or an electronic method. Finally, the updated values for the
neurons represented by a two-dimensional optical wave are fed back to the incident
port of the optical system. Gain is included in the feedback loop to compensate
the loss. The solution of the differential equation is obtained when the iteration
converges. The above scheme of optical implementation is illustrated in Fig. 4.
Optical implementation of the interconnection based on volume holography
utilizing photorefractive crystals has been extensively investigated [4]. The motiva-
tion for using volume holograms comes from their ability to store information in
three dimensions. Using volume holographic techniques, it is possible to impress a
grating pattern into such a crystal so as to transfer a light signal from an input
point to another point in an output plane. Another input-output connection can
similarly be implemented by a second grating oriented inside the crystal at a dif-
ferent angle from the first one. The maximum number of connections that can be
specified if a volume hologram is used is upper-bounded by the degrees of freedom available in the volume of the crystal, which is equal to $V_H/\lambda^3$, where $V_H$ is the volume of the hologram and $\lambda$ is the optical wavelength. As a numerical example, consider an optical wave with wavelength 1 µm, a 1 cm square input plane, and a 1 cm cube volume hologram. Then, the total number of neurons which can be processed in the system is $10^8$, and the total number of interconnections becomes $10^{12}$.
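Spelling out the arithmetic behind these two figures (using only the quantities quoted above; the neuron count is set by the number of resolvable spots on the input plane):

$$N_{\text{neurons}} = \Bigl(\frac{1\ \text{cm}}{1\ \mu\text{m}}\Bigr)^2 = (10^4)^2 = 10^8, \qquad N_{\text{interconnections}} = \frac{V_H}{\lambda^3} = \frac{(10^{-2}\ \text{m})^3}{(10^{-6}\ \text{m})^3} = 10^{12}.$$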
The proposed optical system fully exploits the advantages of optical parallel com-
puting. It makes use of a two-dimensional optical wave to represent discretized
independent variables. High-capacity volume holograms are used to interconnect a
huge number of variables. Furthermore, the system is one of the most parallel
systems that can be implemented to solve the finite difference equations. The total
[Figure 4 appears here; labeled blocks: INPUT, INTERCONNECTION (VOLUME HOLOGRAM), MULTIPLICATION AND SUM, THRESHOLD, OUTPUT, and FEEDBACK.]
FIG. 4. Optical system for solving differential equations. Mirrors and beam splitters for the optical wave are not drawn.
VI. DISCUSSIONS
The main purpose of this paper is to introduce the concept of neural networks
and to develop neural algorithms for solving differential equations. The detailed
analysis on the algorithms will be published separately. However, general features
of the neural algorithms will be discussed in this section. The remarkable collective
computational properties such as recognition from partial input, robustness, and
error-correction capability have been demonstrated recently. As an example,
Hopfield introduced associative memory in the neural network described in Section II.A. The memory vectors $B_i^{(m)}$, where $i$ labels the neurons and $m$ labels the memory vectors, are stored in the weights of the interconnections as $T_{ij} = \sum_m B_i^{(m)} B_j^{(m)}$, summed over the memory vectors. The computational function of the Hopfield neural network is to reconstruct the whole original
memory given an initial partial memory. In simulations, correct convergence was
obtained for the total number of four memory vectors in the case of 30 neurons
when the initial trial memory vector differed from the original memory vector by
seven bits [6]. Therefore, the radius of convergence of the neural network is in
general very large. This property is very important in solving differential equations.
In conventional algorithms, a good initial guess of the solution is essential in
solving differential equations effectively. However, the radius of convergence in
neural algorithms is expected to be very large, and thus, selection of an initial trial
solution is not so sensitive. This has been demonstrated in the results of numerical
simulation shown in Figs. 2 and 3.
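The figures quoted above (30 neurons, four stored memories, recall from an initial state differing in seven bits, convergence in roughly five iterations) can be illustrated with a short sketch. The Hebbian weights follow the rule $T_{ij} = \sum_m B_i^{(m)} B_j^{(m)}$ stated above; the random memories, the asynchronous update order, and the treatment of ties are assumptions of this sketch and need not match the simulation of Ref. [6] in detail.

```python
import numpy as np

# Illustrative sketch of associative recall: 30 neurons, 4 stored memories,
# Hebbian weights T_ij = sum_m B_i^(m) B_j^(m), and recall from an initial
# state that differs from one stored memory in 7 bits.

rng = np.random.default_rng(2)
n_neurons, n_memories = 30, 4
memories = rng.choice([-1, 1], size=(n_memories, n_neurons))
T = memories.T @ memories                      # Hebbian interconnection weights
np.fill_diagonal(T, 0)

state = memories[0].copy()
flip = rng.choice(n_neurons, size=7, replace=False)
state[flip] *= -1                              # corrupt 7 bits of the stored memory

for sweep in range(10):                        # a few sweeps usually suffice
    for i in rng.permutation(n_neurons):       # asynchronous thresholding updates
        state[i] = 1 if T[i] @ state >= 0 else -1

print("bits recovered:", int(np.sum(state == memories[0])), "of", n_neurons)
```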
Another important characteristic of the neural network is nonlinear dynamics
introduced by nonlinear thresholding. This property is expected to yield a large
processing gain in neural computation. As an example, for the neural associative
memory described above in this section, the number of iterations which yielded the
correct original memory was approximately 5. Therefore, it is expected that the rate
of convergence in neural algorithms is much faster than the conventional algo-
rithms. The second property of neural algorithms which affects the rate of con-
vergence is the collective processing inherent in the neural networks. This is because
the greater the problem size, the more neurons participate collectively in solving the
differential equations. To illustrate this, consider a stiff differential equation. To
solve this equation, the number of mesh points should be very large so that dis-
cretization of the equation is accurate and smooth. This means the total time, i.e.,
the rate of convergence, required for solving the equation increases rapidly if the
conventional serial algorithms are used. In the neural algorithms proposed in this
paper, the equation can be solved by increasing the total number of neurons.
However, the time for solving the equation actually decreases even if the total
number of neurons increases, because at each iteration all the neurons interact with
each other in parallel, i.e., collectively. This point has been demonstrated by con-
sidering the specific example described in Fig. 2. We solved the same problem
assuming all the same conditions except decreasing the total number of mesh points
between 0 and 1. Fifty mesh points instead of 100 mesh points were chosen, and h
was 0.02 in this case. The constant AtA was 0.0001. The result of the numerical
simulation is shown in Fig. 5. Comparing Figs. 2 and 5, it is clear that as the number of neurons increases, the time required for convergence decreases. Specifically, the convergence time for this example decreased by half when the number of neurons was doubled. The third property which decreases the time for solving differential
equations is the optical feedback used in implementing the neural algorithms. This
means communication between processors, i.e., neurons, is performed by the fastest
physical method. Therefore, optical implementation reduces the computation time
even if the total number of iterations becomes very large.
In the above paragraph, the rate of convergence for the stiff differential equations
has been investigated by changing the total number of mesh points, i.e., neurons.
However, it is also important to study the rate of convergence for the stiff differen-
tial equations maintaining the same total number of mesh points and changing the
degree of stiffness. As an example, the differential equation $du/dx = -\lambda u$ was investigated in the interval $0 \le x \le 1$ with initial condition $u(0) = 1$, where $\lambda$ is a positive constant and serves as a stiffness parameter. The updating rule for this case can be obtained from Eq. (3.8):
$$U_b(k+1) = U_b(k) + 2\,\Delta t A\Bigl\{\Bigl(\frac{1}{2h}\Bigr)^2\bigl[U_{b+2}(k) - 2U_b(k) + U_{b-2}(k)\bigr] - \lambda^2 U_b(k)\Bigr\}, \qquad (6.1)$$
[Figure 5 appears here: curves for k = 1 and k = 50, approaching 2.7182 at x = 1.]
FIG. 5. Convergence of the same problem as described in Fig. 2 when the number of mesh points
was decreased to 50.
FIG. 6. Convergence of the differential equation u' = -0.5u. The converged solution was obtained at k = 138.
which is the same as Eq. (3.15) except for the $\lambda^2$ factor. The boundary conditions are given by

$$U_{-1}(k) = U_1(k) + 2h\lambda U_0, \qquad (6.2)$$
$$U_{N+1}(k) = U_{N-1}(k) - 2h\lambda U_N(k), \qquad (6.3)$$
$$U_{N+2}(k) = U_N(k) - 2h\lambda\bigl[U_{N-1}(k) - 2h\lambda U_N(k)\bigr]. \qquad (6.4)$$
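A sketch of this experiment is given below, assuming the update of Eq. (6.1) together with the boundary relations (6.2)-(6.4). The step is taken somewhat smaller than the quoted $\Delta t A = 0.0001$ as a safety margin for this simplified reconstruction, so the iteration counts and solution quality need not match Figs. 6-8 exactly.

```python
import numpy as np

# Stiffness experiment of Eqs. (6.1)-(6.4): u' = -lam*u, u(0) = 1, on 0 <= x <= 1,
# solved by the continuous neural minimization update.  Parameters follow the text
# (h = 0.01, N = 100, trial solution U_i = 1); dtA is reduced from the quoted
# 0.0001 as a safety margin for this simplified reconstruction.

h, N, U0 = 0.01, 100, 1.0
dtA = 5.0e-5

def solve(lam, max_iter=60_000, tol=1e-9):
    U = np.ones(N)                                  # initial trial solution
    for k in range(max_iter):
        Um1 = U[0] + 2 * h * lam * U0               # Eq. (6.2): U_{-1}
        Up1 = U[N - 2] - 2 * h * lam * U[N - 1]     # Eq. (6.3): U_{N+1}
        Up2 = U[N - 1] - 2 * h * lam * Up1          # Eq. (6.4): U_{N+2}
        ext = np.concatenate(([Um1, U0], U, [Up1, Up2]))
        lap = (ext[4:] - 2 * ext[2:-2] + ext[:-4]) / (2 * h)**2
        U_new = U + 2 * dtA * (lap - lam**2 * U)    # Eq. (6.1)
        if np.max(np.abs(U_new - U)) < tol:
            return U_new, k
        U = U_new
    return U, max_iter

for lam in (0.5, 1.0, 3.0):
    U, k = solve(lam)
    print(f"lam = {lam}:  U(1) = {U[-1]:.4f}   exp(-lam) = {np.exp(-lam):.4f}   iterations = {k}")
```

As the text describes, the quality of the converged profile degrades as $\lambda$ (and hence the stiffness) grows while the mesh is kept fixed.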
FIG. 7. Convergence of the differential equation u' = -u. The converged solution is not good compared with Fig. 6.
FIG. 8. Convergence of the differential equation u' = -3u. The converged solution was obtained at k = 156. However, the solution looks very bad compared with Figs. 6 and 7.
Computer simulations based on Eqs. (6.1)-(6.4) have been carried out for three different values of λ = 0.5, 1, and 3. The other parameters were chosen as h = 0.01, N = 100, and ΔtA = 0.0001 for the three different values of λ, which are the same parameters as those used in Fig. 2. The same initial trial solution for the three different values of λ was chosen as U_i(0) = 1, i = 1, ..., 100. The numerical results are shown in Figs. 6-8. Figure 6 represents the case of λ = 0.5, and the algorithm converged at k = 138. Figure 7 shows the result for λ = 1. The solution in this case converged at k = 140. However, as shown in the figure, the converged solution is not as good as in the case of λ = 0.5. When the stiffness parameter λ was increased to 3, the algorithm converged at k = 156, as shown in Fig. 8. However, this solution is much worse than in the case of λ = 1. From this simple example, we see that the rate of convergence and the effectiveness of the proposed neural algorithms deteriorate when the differential equations become stiffer while the total number of mesh points, i.e., neurons, is kept the same.
Finally, the neural algorithms proposed in this paper have several advantages
over the conventional parallel algorithms. First, the neural algorithms are much
simpler than others. The only algebraic operations required in the neural algo-
rithms are addition and multiplication. Inverse or other complicated logic opera-
tions are not needed. Second, the neural algorithms are the most parallel, compared
with conventional parallel algorithms. For example, the algorithms for a hypercube
have limited parallelism due to a small number of interconnections. Simplicity in
implementation of the neural algorithms is the third advantage.
VII. SUMMARY
Highly parallel neural algorithms for solving finite difference equations have been
developed. They can be applied to a wide range of differential equations. Nonlinear
ACKNOWLEDGMENTS
The authors acknowledge valuable comments by the reviewers. H. Lee would like to acknowledge
support by the National Science Foundation Grant No. EET-8810288 and the Center for Advanced
Technology in Telecommunications. We thank W. S. Baek for computer simulation.
REFERENCES