Chapter 4
Artificial Neural Networks
4.1. Introduction
Artificial neural networks (ANNs) are, as their name indicates, computational networks that attempt to simulate, in a gross manner, the networks of nerve cells (neurons) of the biological (human or animal) central nervous system. The simulation is a gross cell-by-cell (neuron-by-neuron, element-by-element) one [58]. The idea of ANNs is not to replicate the operation of biological systems but to make use of what is known about the functionality of biological networks to solve complex problems [59]. A complex system may be decomposed into simpler elements (nodes in ANNs) in order to understand it; conversely, simple elements may be gathered to produce a complex system. ANNs allow very simple computational operations (addition, multiplication, and fundamental logic elements) to be used to solve complex, mathematically ill-defined, and nonlinear problems. ANN models go by many names, such as connectionist models, parallel distributed processing models, and neuromorphic systems. Whatever the name, all these models attempt to achieve good performance via dense interconnection of simple computational elements [60].
A biological neuron may receive many signals simultaneously. These signals may either assist (excite) or inhibit the firing of the neuron. This simplified mechanism of signal transfer constituted the fundamental step of early neurocomputing development and underlies the operation of the building unit of ANNs [59].
4.5. Models of a Neuron
A neuron is an information-processing unit that is fundamental to the operation of a neural network. The block diagram of Fig. (4.2) shows the model of a neuron, which forms the basis for designing artificial neural networks. Here we identify three basic elements of the neuronal model [64].
[Fig. (4.2): Model of a neuron — inputs x_1, …, x_n with synaptic weights w_1, …, w_n, a bias b, a summing junction producing net, and an activation function f(net) producing the output y.]

The output of the neuron is:

y = f(\mathrm{net}) = f\left( \sum_{i=1}^{n} w_i x_i + b \right)    (4.2)
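As a concrete illustration of Eq. (4.2), the neuron model can be sketched in a few lines of Python (a minimal sketch for illustration only; the thesis models themselves were built in MATLAB, and the function name `neuron_output` is hypothetical):

```python
import numpy as np

def neuron_output(x, w, b, f=np.tanh):
    """Single-neuron model of Eq. (4.2): y = f(sum_i w_i * x_i + b)."""
    net = np.dot(w, x) + b   # summing junction: weighted inputs plus bias
    return f(net)            # activation function applied to the net input

# Example: a neuron with three inputs
x = np.array([0.5, -1.0, 2.0])   # inputs x1, x2, x3
w = np.array([0.2, 0.4, -0.1])   # synaptic weights w1, w2, w3
b = 0.3                          # bias
y = neuron_output(x, w, b)
```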
1. Linear function
In this case the transfer function has unit slope and is shown in Fig. (4.3-a), where:

y = f(\mathrm{net}) = \mathrm{net}    (4.3)

The expression for the output y in the linear case, with saturation, can be written as:

y = \begin{cases} 1 & \text{if } \mathrm{net} \ge +1/2 \\ \mathrm{net} & \text{if } -1/2 < \mathrm{net} < +1/2 \\ 0 & \text{if } \mathrm{net} \le -1/2 \end{cases}    (4.4)

2. Hard limit function
In this case the output is 1 whenever the net input is non-negative and 0 otherwise; the transfer function is shown in Fig. (4.3-b):

y = \begin{cases} 1 & \text{if } \mathrm{net} \ge 0 \\ 0 & \text{if } \mathrm{net} < 0 \end{cases}    (4.5)
3. Sigmoid function
In this case the net neuron input is mapped into values between 0 and +1. The neuron transfer function is shown in Fig. (4.3-c) and is given by:

y = \frac{1}{1 + \exp(-\mathrm{net}/T)}    (4.6)

where T is a constant.
4. Tansigmoid function
In this case the net neuron input is mapped into values between -1 and +1. The neuron transfer function is shown in Fig. (4.3-d) and is given by:

y = \tanh(\mathrm{net}) = \frac{1 - \exp(-2\,\mathrm{net})}{1 + \exp(-2\,\mathrm{net})}    (4.7)
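The four transfer functions of Eqs. (4.3)-(4.7) can be written directly in Python; the following is a minimal sketch (the function names are illustrative, not from the thesis):

```python
import numpy as np

def linear(net):
    # Eq. (4.3): identity transfer function
    return net

def saturating_linear(net):
    # Eq. (4.4): linear between -1/2 and +1/2, saturating at 0 and 1
    return np.where(net >= 0.5, 1.0, np.where(net <= -0.5, 0.0, net))

def hard_limit(net):
    # Eq. (4.5): step function
    return np.where(net >= 0, 1.0, 0.0)

def sigmoid(net, T=1.0):
    # Eq. (4.6): output between 0 and +1; T is a constant
    return 1.0 / (1.0 + np.exp(-net / T))

def tansigmoid(net):
    # Eq. (4.7): output between -1 and +1
    return np.tanh(net)
```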
[Fig. (4.3): Neuron transfer functions: (a) linear, (b) hard limit, (c) sigmoid, (d) tansigmoid.]
[Fig.: A multilayer feedforward neural network — the inputs x_1, …, x_n feed a first layer of m_1 neurons with net inputs net_{1,1}, …, net_{1,m_1}; their outputs pass through the second and third layers to produce the network outputs y.]
The weight matrix and bias vector of the second layer, for example, are:

W^2 = \begin{bmatrix} w^2_{1,1} & w^2_{1,2} & \cdots & w^2_{1,m_2} \\ w^2_{2,1} & w^2_{2,2} & \cdots & w^2_{2,m_2} \\ \vdots & \vdots & \ddots & \vdots \\ w^2_{m_1,1} & w^2_{m_1,2} & \cdots & w^2_{m_1,m_2} \end{bmatrix} \quad \text{and} \quad b^2 = \begin{bmatrix} b^2_1 \\ b^2_2 \\ \vdots \\ b^2_{m_2} \end{bmatrix}    (4.8)
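In matrix form, the forward pass through one layer is a matrix-vector product followed by the elementwise activation function. A minimal sketch, using the dimension convention of Eq. (4.8) (an assumption, since the thesis gives no code):

```python
import numpy as np

def layer_forward(y_prev, W, b, f=np.tanh):
    """Forward pass of one layer.

    y_prev : outputs of the previous layer, shape (m1,)
    W      : weight matrix of shape (m1, m2), as laid out in Eq. (4.8)
    b      : bias vector of shape (m2,)
    """
    net = W.T @ y_prev + b   # net input of each of the m2 neurons
    return f(net)            # layer output

# Example: a layer with m1 = 3 inputs and m2 = 2 neurons
rng = np.random.default_rng(0)
W2 = rng.normal(size=(3, 2))
b2 = np.zeros(2)
y1 = np.array([0.1, -0.4, 0.7])
y2 = layer_forward(y1, W2, b2)
```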
a- Supervised Learning:
In this type, the network is presented with input patterns together with their desired outputs, and an external teacher adjusts the input weights until acceptable network accuracy is reached. So, training consists of presenting input and output data to the network [69].
b- Unsupervised Learning:
This type does not use external influence to adjust its weights. Instead, the network internally monitors its performance, looks for regularities or trends in the input signals, and makes adaptations according to the function of the network. Even without being told whether it is right or wrong, the network still must have some information about how to organize itself. This information is built into the network topology and the learning rules [69].
There are four basic types of learning rules: Hebbian, Delta Rule,
Boltzmann, and competitive learning.
For Hebbian learning, the weight adjustment is:

\Delta w_{ij} = \eta \, y_j(n) \, x_i(n)    (4.9)

where n is the time step, \eta is a positive constant (the learning rate coefficient), and x_i(n) and y_j(n) are the input and output signals of the synapse, respectively.
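In code, the Hebbian update of Eq. (4.9) is a single outer product (an illustrative sketch, assuming the weights are stored as a matrix w[i, j]):

```python
import numpy as np

def hebbian_update(w, x, y, eta=0.01):
    """Eq. (4.9): w_ij <- w_ij + eta * y_j * x_i for every synapse."""
    return w + eta * np.outer(x, y)   # outer product yields x_i * y_j per pair (i, j)
```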
For the delta rule, the error signal of output neuron j at time step n is:

e_j(n) = d_j(n) - y_j(n)    (4.11)
In other words, the delta rule may be stated as: "The adjustment made
to a synaptic weight of a neuron is proportional to the product of the error
signal and the input signal of the synapse in question." [64].
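Stated as code, the delta rule scales each synaptic input by the error signal of Eq. (4.11). The weight-update equation itself is missing from the text, so the standard form Δw_ij = η e_j(n) x_i(n) is assumed in this sketch:

```python
import numpy as np

def delta_rule_update(w, x, d, y, eta=0.01):
    """Delta rule: dw_ij = eta * e_j * x_i, with e_j = d_j - y_j (Eq. (4.11))."""
    e = d - y                        # error signal of each output neuron
    return w + eta * np.outer(x, e)  # proportional to error times synaptic input
```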
For Boltzmann learning, the weight adjustment is:

\Delta w_{ij} = \eta \left( \rho^{+}_{ij} - \rho^{-}_{ij} \right)

where \eta is the learning rate, and \rho^{+}_{ij} and \rho^{-}_{ij} are the correlations between the states of units i and j when the network operates in the clamped mode and in the free-running mode, respectively.
\Delta w_{ij} = \begin{cases} \eta \,( x_i - w_{ij} ) & \text{if neuron } j \text{ wins the competition} \\ 0 & \text{if neuron } j \text{ loses the competition} \end{cases}    (4.14)

Note that only the weights of the winner unit get updated. The effect of this learning rule is to move the stored pattern in the winner unit (weights) a little bit closer to the input pattern [61].
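A minimal sketch of the competitive rule of Eq. (4.14); here the winner is taken to be the unit whose weight vector is closest to the input (a common choice, assumed rather than stated in the text):

```python
import numpy as np

def competitive_update(W, x, eta=0.1):
    """Eq. (4.14): only the winning unit's weights are updated.

    W : weight matrix with one row of weights per competing unit.
    """
    winner = np.argmin(np.linalg.norm(W - x, axis=1))  # closest stored pattern wins
    W[winner] += eta * (x - W[winner])                 # move winner toward the input
    return winner
```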
Here, d is the desired output for the given input and y is the output produced by the neural network.
A total MSE sums the error over all individual examples and all the
output neurons in the network.
\text{MSE} = \frac{1}{p\,m} \sum_{k=1}^{p} \sum_{j=1}^{m} \left( d_{jk} - y_{jk} \right)^2    (4.16)
where y_{jk} is the value of the j-th output of the network when the k-th training example is presented; d_{jk} is the desired value of the j-th output for the k-th training example; p is the number of training examples in the training data; and m is the number of output neurons in the neural network. The root-mean-square error (RMSE) is the square root of the MSE [72].
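Eq. (4.16) and the RMSE translate directly into code; a short sketch:

```python
import numpy as np

def mse_rmse(d, y):
    """Eq. (4.16): MSE over p training examples and m output neurons.

    d, y : arrays of shape (p, m) holding desired and actual outputs.
    """
    mse = np.mean((d - y) ** 2)   # equals the double sum divided by p * m
    return mse, np.sqrt(mse)      # RMSE is the square root of the MSE
```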
where:
y_j : the actual output of hidden neuron j;
f : the activation function.
Step 5: Sum the weighted outputs of the hidden layer and apply the activation function to compute the output of the output layer:

y_k = f\left( \sum_{j} y_j \, w_{jk} \right)    (4.18)

where y_k is the output of neuron k.
Step 8: Sum the delta inputs for each hidden unit and calculate the error term:

\delta_j = \left( \sum_{k} \delta_k \, w_{jk} \right) f'\!\left( \sum_{i} x_i \, w_{ij} \right)    (4.21)
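Taken together, the steps above amount to one pass of backpropagation for a network with a single hidden layer. The following compact sketch assumes sigmoid activations and omits the biases for brevity; it illustrates the error terms of Eqs. (4.18) and (4.21), and is not the thesis program itself:

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def backprop_epoch(X, D, W_ih, W_ho, eta=0.1):
    """One epoch of backpropagation for a one-hidden-layer network.

    X: (p, n) inputs; D: (p, m) desired outputs;
    W_ih: (n, h) input-to-hidden weights; W_ho: (h, m) hidden-to-output weights.
    """
    for x, d in zip(X, D):
        h = sigmoid(x @ W_ih)                     # hidden-layer outputs
        y = sigmoid(h @ W_ho)                     # output layer, cf. Eq. (4.18)
        delta_k = (d - y) * y * (1 - y)           # output error terms
        delta_j = (W_ho @ delta_k) * h * (1 - h)  # hidden error terms, cf. Eq. (4.21)
        W_ho += eta * np.outer(h, delta_k)        # update hidden-to-output weights
        W_ih += eta * np.outer(x, delta_j)        # update input-to-hidden weights
    return W_ih, W_ho
```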
[Flowchart: the network training procedure, from network initiation to termination.]
Evaluating the performance of a trained network involves two main criteria: (1) how well the neural network recalls the predicted response from data sets used to train the network (called the recall step); and (2) how well the network predicts responses from data sets that were not used in training (called the generalization step). In the recall step, an input pattern is introduced to the trained network, and the network attempts to recall the corresponding output. If the network has been trained sufficiently, the network output will differ only slightly from the actual output
data. In testing the network, the weight factors are not changed: they are
frozen at their last values when training ceased. Recall testing is so named
because it measures how well the network can recall what it has learned.
Generalization testing measures how well the network can generalize what it has learned and form rules with which to make
decisions about data it has not previously seen. The network generalizes well
when it sensibly interpolates these new patterns [78].
4.13. Artificial Neural Network Models of the Present Study
The neural networks were implemented using the Neural Network Toolbox available in MATLAB 2007. This toolbox implements several different neural network algorithms, including the backpropagation algorithm. The configuration and training of neural networks is a trial-and-error process, owing to such undetermined parameters as the number of hidden layers, the number of nodes in the hidden layers, the learning parameter, and the number of training patterns.
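Because the configuration must be found by trial and error, the search reduces to a loop over candidate architectures, training each one and recording the training MSE and testing regression. The sketch below uses scikit-learn purely as a stand-in for the MATLAB Neural Network Toolbox actually used in this study; the parameter choices are illustrative:

```python
from itertools import product
from sklearn.metrics import mean_squared_error
from sklearn.neural_network import MLPRegressor

def architecture_search(X_train, y_train, X_test, y_test):
    """Trial-and-error search over two hidden-layer sizes (3 to 14 nodes each)."""
    best = None
    for n1, n2 in product(range(3, 15), repeat=2):
        net = MLPRegressor(hidden_layer_sizes=(n1, n2), activation='tanh',
                           max_iter=2000, random_state=0)
        net.fit(X_train, y_train)
        train_mse = mean_squared_error(y_train, net.predict(X_train))
        test_r2 = net.score(X_test, y_test)   # R^2, a proxy for the regression R
        if best is None or test_r2 > best[0]:
            best = (test_r2, train_mse, (n1, n2))
    return best
```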
[First model — inputs: a) HCl concentration (%), b) immersion duration (h), c) ACX concentration (mg/l); outputs: 1) corrosion rate, 2) inhibition efficiency (%).]
[Second model — inputs: % HCl, ACX concentration (mg/l), and time; outputs: scale removal density and Fe content.]
where:
a_n and t_n are the normalized target and output data, respectively;
st_a and st_t are the standard deviations of the target and output data, respectively; and
N is the number of data points in the target vector.
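The normalization equation itself did not survive extraction; a standardization by mean and standard deviation, consistent with the quantities described above, would look like this (an assumed form, not necessarily the exact one used):

```python
import numpy as np

def standardize(v):
    """Normalize a data vector to zero mean and unit standard deviation."""
    return (v - np.mean(v)) / np.std(v)
```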
In the present study the network was tested with two hidden layers, with different activation functions for the first hidden layer, the second hidden layer, and the output layer. Different numbers of nodes in each hidden layer, from 3 to 14 nodes, were used.
Tables (4.2) and (4.3) show how the number of nodes in the two hidden layers affects the response of the network for the first and second models, respectively, with different types and arrangements of activation functions. The training function selected here is conjugate gradient backpropagation with Polak-Ribière updates (TRAINCGP).
Table (4.2) Training MSE and testing regression for the two-hidden-layer networks with different types and arrangements of activation functions for the first model.

Network type | Metric      | (tansig, purelin, purelin) | (tansig, tansig, purelin) | (tansig, tansig, tansig) | (purelin, tansig, tansig) | (tansig, purelin, tansig)
3-3          | MSE (train) | 0.006867 | 0.004648 | 0.866685 | 0.267799 | 0.269498
3-3          | R (test)    | 0.99187  | 0.99444  | 0.43771  | 0.97524  | 0.98271
3-4          | MSE (train) | 0.006904 | 0.002827 | 0.278386 | 0.605891 | 1.39711
3-4          | R (test)    | 0.99185  | 0.9962   | 0.96397  | 0.71592  | 0.59332
4-6          | MSE (train) | 0.005792 | 0.001373 | 1.16271  | 0.349531 | 1.08988
4-6          | R (test)    | 0.99061  | 0.99736  | 0.83163  | 0.96549  | 0.36873
5-8          | MSE (train) | 0.002436 | 0.000367 | 1.08849  | 0.265321 | 0.425579
5-8          | R (test)    | 0.996    | 0.99936  | 0.58482  | 0.97633  | 0.97621
6-9          | MSE (train) | 0.003906 | 0.000964 | 0.265067 | 0.265536 | 0.634028
6-9          | R (test)    | 0.99559  | 0.99693  | 0.9754   | 0.97433  | 0.62809
7-8          | MSE (train) | 0.001613 | 0.000372 | 0.26474  | 1.84901  | 0.265453
7-8          | R (test)    | 0.99499  | 0.99839  | 0.95426  | 0.66551  | 0.975
5-11         | MSE (train) | 0.010721 | 0.000538 | 0.26506  | 0.265586 | 0.265967
5-11         | R (test)    | 0.9926   | 0.99466  | 0.97785  | 0.97607  | 0.97361
9-4          | MSE (train) | 0.001418 | 0.000954 | 0.264058 | 0.267059 | 0.265829
9-4          | R (test)    | 0.99545  | 0.99866  | 0.95685  | 0.97839  | 0.97314
Table (4.3) Training MSE and testing regression for the two-hidden-layer networks with different types and arrangements of activation functions for the second model.

Network type | Metric      | (tansig, purelin, purelin) | (tansig, tansig, purelin) | (tansig, tansig, tansig) | (purelin, tansig, tansig) | (tansig, purelin, tansig)
3-4          | MSE (train) | 0.005669 | 0.004197 | 0.009120 | 0.0148931 | 0.007648
3-4          | R (test)    | 0.9869   | 0.9866   | 0.98015  | 0.97807   | 0.98064
4-3          | MSE (train) | 0.005890 | 0.004286 | 0.004875 | 0.006775  | 0.005282
4-3          | R (test)    | 0.98173  | 0.98382  | 0.98128  | 0.98081   | 0.9838
5-4          | MSE (train) | 0.004994 | 0.002972 | 0.006238 | 0.137416  | 0.004874
5-4          | R (test)    | 0.99084  | 0.99513  | 0.98043  | 0.87353   | 0.984
6-5          | MSE (train) | 0.003719 | 0.001376 | 0.003583 | 0.143473  | 0.726964
6-5          | R (test)    | 0.91766  | 0.96658  | 0.9611   | 0.966     | 0.58213
7-7          | MSE (train) | 0.003210 | 0.001467 | 0.001734 | 0.050627  | 0.004298
7-7          | R (test)    | 0.986    | 0.98608  | 0.97633  | 0.90539   | 0.98636
7-9          | MSE (train) | 0.003764 | 0.001302 | 0.57335  | 0.002855  | 0.003853
7-9          | R (test)    | 0.98968  | 0.99542  | 0.759446 | 0.98378   | 0.98665
4-11         | MSE (train) | 0.005060 | 0.001666 | 0.001754 | 0.002030  | 0.006140
From the tables above, it can be seen that the networks with the (tansig, tansig, purelin) activation functions give the best performance and testing regression. The network (7-9) gives the best performance and regression for both training and testing. The architecture of the neural network for this model is given in Fig. (4.10). The pickling-test results predicted by the artificial neural network were found to be in good agreement with the experimental values of the present study, i.e. a correlation coefficient of R = 0.99542, as shown in Fig. (4.11).
Table (4.4) shows the statistical parameters used to describe the quality of training of the artificial neural network modeling in the two proposed models.
Table (4.4) Statistical parameters for first and second models.
[Fig.: Architecture of the neural network for the first model — input layer (3 nodes: % HCl, [ACX], time); first hidden layer (5 nodes); second hidden layer (8 nodes); output layer (2 nodes: CR, %IE).]
[Fig.: ANN results versus experimental results for the first model, showing the experimental results and the best linear fit.]
[Fig.: Second model — inputs: % HCl, [ACX], time; outputs: scale removal density, [Fe] content.]
[Fig. (4.11): ANN results versus experimental results for the second model, showing the best linear fit (R = 0.9954).]
The structure of the program used to train the ANN is illustrated in the following chart:
[Flowchart: structure of the ANN training program.]