
CHAPTER FOUR

Artificial Neural Networks

4.1. Introduction
Artificial neural networks (ANNs) are, as their name indicates, computational networks which attempt to simulate, in a gross manner, the networks of nerve cells (neurons) of the biological (human or animal) central nervous system. This simulation is a gross cell-by-cell (neuron-by-neuron, element-by-element) simulation [58]. The idea of ANNs is not to replicate the operation of biological systems but to make use of what is known about the functionality of biological networks for solving complex problems [59]. A complex system may be decomposed into simpler elements (nodes in ANNs) in order to understand it; conversely, simple elements may be combined to produce a complex system. ANNs allow very simple computational operations (addition, multiplication, and fundamental logic elements) to be used to solve complex, mathematically ill-defined, and nonlinear problems. ANN models go by many names, such as connectionist models, parallel distributed processing models, and neuromorphic systems. Whatever the name, all these models attempt to achieve good performance via dense interconnection of simple computational elements [60].

4.2. Brief Historical Review


ANN research has experienced three periods of extensive activity. The first peak, in the 1940s, was due to the pioneering work of McCulloch and Pitts. The second occurred in the 1960s with Rosenblatt's perceptron convergence theorem and Minsky and Papert's work showing the limitations of a simple perceptron. Since the early 1980s, ANNs have received considerable renewed interest. The major developments behind this resurgence include Hopfield's


energy approach in 1982 and the backpropagation learning algorithm, first proposed by Werbos and then popularized by Rumelhart et al. in 1986 [61].

4.3. Biological Neural Networks


The brain consists of a large number (approximately 10^11) of highly connected elements (approximately 10^4 connections per element) called neurons. A neuron (or nerve cell) is a special biological cell that processes information [58]. It is composed of a cell body (or soma) and two types of out-reaching tree-like branches: the axon and the dendrites (Fig. 4.1-a). The cell body has a nucleus that contains information about hereditary traits and a plasma that holds the molecular equipment for producing material needed by the neuron. A neuron receives signals (impulses) from other neurons through its dendrites (receivers) and transmits signals generated by its cell body along the axon (transmitter), which eventually branches into strands and substrands.
At the terminals of these strands are the synapses. A synapse is an elementary structural and functional unit between two neurons (an axon strand of one neuron and a dendrite of another) [61]. An impulse travels within the dendrites
and through the cell towards the pre-synaptic membrane of the synapse. Upon
arrival at the membrane, a neurotransmitter (chemical) is released from the
vesicles in quantities proportional to the strength of the incoming signal (Fig.
4.1-b). The neurotransmitter diffuses within the synaptic gap towards the
post-synaptic membrane, and eventually into the dendrites of neighboring
neurons, thus forcing them (depending on the threshold of the receiving
neuron) to generate a new electrical signal. The generated signal passes
through the second neuron(s) in a manner identical to that just described. The
amount of signal that passes through a receiving neuron depends on the
intensity of the signal emanating from each of the feeding neurons, their
synaptic strengths, and the threshold of the receiving neuron. Because a
neuron has a large number of dendrites/synapses, it can receive and transfer

many signals simultaneously. These signals may either assist (excite) or inhibit
the firing of the neuron. This simplified mechanism of signal transfer
constituted the fundamental step of early neuron computing development and
the operation of the building unit of ANNs [59].

Fig. (4.1) a- Interconnection of biological neural nets [58]; b- Mechanism of signal transfer between two biological neurons [59].


4.4. Characteristics of Neural Networks


Neural networks have many powerful characteristics which are useful in problem solving across many fields and applications. Specifically, the characteristics of a neural net include self-organization, fault tolerance, adaptive learning, and, most importantly, the ability to deal effectively with the contradictions, errors, and inexactitudes of real-world knowledge. Thus, neural nets promise to excel beyond many current systems [62].

The benefits of the neural network can be summarized as follows [63]:

1. Nonlinearity: A neuron is basically a nonlinear device. Consequently, a neural network, made up of an interconnection of neurons, is itself nonlinear. Moreover, the nonlinearity is of a special kind in the sense that it is distributed throughout the network.
2. Input-output mapping: A popular paradigm of learning called supervised
learning involves the modification of the synaptic weights of a neural
network by applying a set of training samples. Each sample consists of a
unique input signal and the corresponding desired response. The
previously applied training samples may be reapplied during the training
session, usually in a different order. Thus, the network learns from the
samples by constructing an input-output mapping for the problem at hand.
3. Adaptivity: Neural networks have a built-in capability to adapt their
synaptic weights to changes in the surrounding environment. In particular,
a neural network trained to operate in a specific environment can be easily
retrained to deal with minor changes in the operating environmental
conditions. Moreover, when operating in a nonstationary environment, a neural network can be designed to change its synaptic weights in real time.

4. Fault tolerance: A neural network, implemented in hardware form, has the potential to be inherently fault tolerant in the sense that its performance degrades gracefully under adverse operating conditions. For example, if a neuron or its connecting links are damaged, recall of a stored pattern is impaired in quality. However, owing to the distributed nature of information in the network, the damage has to be extensive before the overall response of the network is degraded seriously.

4.5. Models of a Neuron
A neuron is an information-processing unit that is fundamental to the operation of a neural network. The block diagram of Fig. (4.2) shows the model of a neuron, which forms the basis for designing artificial neural networks. Here we identify three basic elements of the neuronal model [64].

1. A set of synapses or connecting links, each of which is characterized by a weight or strength of its own. Specifically, a signal x_i at the input of synapse i connected to neuron j is multiplied by the synaptic weight w_ij.
2. An adder for summing the input signals, weighted by the respective synapses of the neuron.
3. An activation or transfer function for limiting the amplitude of the output of a neuron. Typically, the normalized amplitude range of the output of a neuron is written as the closed unit interval [0, 1] or alternatively [-1, 1].


[Fig. (4.2): block diagram of a single artificial neuron, showing inputs x_1 ... x_n with synaptic weights w_1 ... w_n, a bias b, a summing junction producing net, and an activation function f(net) producing the output y.]

Fig. (4.2) Structure of a Single Artificial Neuron [65]

The neuronal model of Fig. (4.2) also includes an externally applied bias, denoted by b. The bias b has the effect of increasing or lowering the net input of the activation function, depending on whether it is positive or negative, respectively [64].

Generally, the inputs, weights, thresholds (bias), and neuron output may be real-valued, binary, or bipolar. All inputs are multiplied by their corresponding weights and added together to form the net input to the neuron, called net. The mathematical expression for net can be written simply as:

net = Σ_{i=1}^{n} w_i x_i + b = w_1 x_1 + w_2 x_2 + w_3 x_3 + ... + w_n x_n + b        (4.1)

The neuron applies an activation or mapping function f(net) to produce an output y, which can be expressed as:

y = f(net) = f( Σ_{i=1}^{n} w_i x_i + b )        (4.2)

where f is called the neuron activation function or the neuron transfer function [65].
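Equations (4.1) and (4.2) can be sketched in a few lines of Python (an illustrative sketch, not code from the present study; the function name neuron_output and the sample values are assumptions):

```python
import math

def neuron_output(x, w, b, f=math.tanh):
    # net = sum_i w_i * x_i + b, as in Eq. (4.1)
    net = sum(wi * xi for wi, xi in zip(w, x)) + b
    # y = f(net), as in Eq. (4.2)
    return f(net)

# three inputs with illustrative weights and bias
y = neuron_output(x=[1.0, 0.5, -0.25], w=[0.2, 0.4, 0.1], b=0.05)
```

Passing f=lambda net: net recovers the linear neuron described in the next section.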

Some examples of the neuron activation functions are [65]:


1. Linear activation function

In this case the transfer function has unit slope, f(net) = net, as shown in Fig. (4.3-a), where:

y = f(net) = Σ_{i=1}^{n} w_i x_i + b = net        (4.3)

In the saturating (hard-limited) linear case, the output y can be written as:

y = { 1,    if net ≥ +1/2
      net,  if -1/2 < net < +1/2
      0,    if net ≤ -1/2 }        (4.4)

2. Threshold activation function

In this case, the output is hard-limited to two values, +1 and -1 (sometimes 0), depending on the sign of net, as shown in Fig. (4.3-b). The output y in this case can be written as:

y = { 1,  if net ≥ 0
      0,  if net < 0 }        (4.5)
3. Sigmoid function

In this case the net neuron input is mapped into values between 0 and +1. The neuron transfer function is shown in Fig. (4.3-c) and is given by:

y = 1 / (1 + exp(-net / T))        (4.6)

where T is a constant.

4. Tansigmoid function

In this case the net neuron input is mapped into values between -1 and +1. The neuron transfer function is shown in Fig. (4.3-d) and is given by:

y = tanh(net) = (1 - exp(-2 net)) / (1 + exp(-2 net))        (4.7)


[Fig. (4.3): plots of the four activation functions, output y against net: (a) linear, (b) threshold, (c) sigmoid, (d) tansigmoid.]

Fig. (4.3) Activation Functions [65].
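The four activation functions of Eqs. (4.3)-(4.7) translate directly into Python (a minimal sketch; the function names are illustrative, not from the present study):

```python
import math

def linear(net):
    # Eq. (4.3): unit-slope identity
    return net

def saturating_linear(net):
    # Eq. (4.4): linear between -1/2 and +1/2, clipped outside
    if net >= 0.5:
        return 1.0
    if net <= -0.5:
        return 0.0
    return net

def threshold(net):
    # Eq. (4.5): hard limiter
    return 1.0 if net >= 0 else 0.0

def sigmoid(net, T=1.0):
    # Eq. (4.6): logistic function with slope constant T
    return 1.0 / (1.0 + math.exp(-net / T))

def tansigmoid(net):
    # Eq. (4.7): hyperbolic tangent, output in (-1, +1)
    return math.tanh(net)
```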

4.6. Network Topologies


Based on the connection pattern (architecture), ANNs can be grouped
into two categories (see Fig.(4.4)) [66]:
 Feed-forward networks, where the data flow from input to output units
is strictly feedforward. The data processing can extend over multiple
(layers of) units, but no feedback connections are present, that is,
connections extending from outputs of units to inputs of units in the
same layer or previous layers.
 Recurrent networks that do contain feedback connections. Contrary to
feed-forward networks, the dynamical properties of the network are
important. In some cases, the activation values of the units undergo a
relaxation process such that the network will evolve to a stable state in
which these activations do not change anymore. In other applications,
the changes of the activation values of the output neurons are
significant, such that the dynamical behavior constitutes the output of
the network.


[Fig. (4.4): taxonomy tree. Neural networks divide into feed-forward networks (single-layer perceptron, multilayer perceptron, radial basis function nets) and recurrent/feedback networks (competitive networks, ART models, Hopfield network).]

Fig. (4.4) A Taxonomy of Feed-Forward and Recurrent/Feedback Network Architectures [61]

4.7. Neural Networks Architecture

The arrangement of neurons into layers and the connection patterns within and between layers is called the net architecture.

Since ANNs are frequently used as nonlinear function approximators, the activation function f is usually a nonlinear function. The most common type of ANN is the multi-layer feed-forward neural network, which consists of a group of interconnected neurons organized in layers: an input layer, hidden layers, and an output layer, where each layer consists of a group of neurons [65]. Each layer has a weight matrix W, a bias vector b, and an output vector y. To distinguish between the weight matrices, output vectors, etc., for each of these layers, the number of the layer is appended as a superscript to the variable of interest. This layer notation is used in the three-layer network of Fig. (4.5) and in its accompanying equations. The number of hidden layers and the number of neurons in each layer depend entirely on the complexity of the problem being solved by the network [65].


[Fig. (4.5): a three-layer feed-forward network. Inputs x_1 ... x_n feed layer 1 (m1 neurons with net inputs net^1_1 ... net^1_m1), whose outputs feed layer 2, whose outputs feed layer 3 to produce the network outputs y.]

Fig. (4.5) Architecture of Multilayer Feed-forward Neural Network [65].

The network shown above is feed-forward because signals propagate only in a forward direction, from the input nodes to the output nodes, and no signals are allowed to be fed back among the neurons. This structure is commonly used in system identification and nonlinear function approximation applications [65]. It has n inputs, m1 neurons in the first layer, m2 neurons in the second layer, and so on. It is common for different layers to have different numbers of neurons.


A constant input of 1 is fed to the bias of each neuron. Note that the outputs of each intermediate layer are the inputs to the following layer [67]. Typically, the neurons in each layer of the network have as their inputs the output signals of the preceding layer only. The set of output signals of the neurons in the output layer constitutes the overall response of the network to the activation pattern supplied by the source nodes in the input layer [63]. Thus layer 2 can be analyzed as a one-layer network with m1 inputs, m2 neurons, and an m1 × m2 weight matrix W^2 [67]. The layer-2 input vector elements enter the layer through its weight matrix W^2, whose weight and bias matrices can be written as:

W^2 = [ w^2_{1,1}   w^2_{1,2}   ⋯   w^2_{1,m2}
        w^2_{2,1}   w^2_{2,2}   ⋯   w^2_{2,m2}
          ⋮           ⋮                ⋮
        w^2_{m1,1}  w^2_{m1,2}  ⋯   w^2_{m1,m2} ]   and   b^2 = [ b^2_1  b^2_2  ⋯  b^2_{m2} ]^T        (4.8)

The input to layer 2 is y^1; the output is y^2. Now that we have identified all the vectors and matrices of layer 2, we can treat it as a single-layer network on its own. This approach can be taken with any layer of the network.

The layers of a multilayer network play different roles. A layer that produces the network output is called an output layer. All other layers are called hidden layers. The three-layer network shown earlier has one output layer (layer 3) and two hidden layers (layers 1 and 2). Some authors refer to the inputs as a fourth layer [67].
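The layer-by-layer view above, in which each layer is a single-layer network whose output feeds the next, can be sketched in Python (an illustrative sketch; the (inputs × neurons) weight-matrix layout follows Eq. (4.8), and all names and weight values are assumptions):

```python
import math

def layer_forward(y_prev, W, b, f):
    # one layer: y_j = f( sum_i y_prev[i] * W[i][j] + b[j] ); W is (inputs x neurons)
    n_neurons = len(b)
    return [f(sum(y_prev[i] * W[i][j] for i in range(len(y_prev))) + b[j])
            for j in range(n_neurons)]

def mlp_forward(x, layers, f=math.tanh):
    # layers: list of (W, b) pairs; the output of each layer is the input of the next
    y = x
    for W, b in layers:
        y = layer_forward(y, W, b, f)
    return y

# a toy 2-3-1 network with illustrative weights
layers = [
    ([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], [0.0, 0.0, 0.0]),  # layer 1: 2 inputs -> 3 neurons
    ([[0.1], [0.2], [0.3]], [0.1]),                          # layer 2: 3 -> 1 output
]
out = mlp_forward([1.0, -1.0], layers)
```

Because each layer is handled by the same function, any layer can indeed be analyzed on its own, as the text notes for layer 2.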
4.8. Learning in Artificial Neural Network
Learning is a significant ability that is closely associated with adaptation. It can be defined as the modification of behavior following upon, and induced by, interaction with the environment, resulting from experiences that lead to the establishment of new patterns of response to external stimuli [68]. In the case of ANNs, learning is the process of changing the weights in order to achieve some desired result. There are mainly two types of learning: supervised and unsupervised.
a- Supervised Learning:
The vast majority of artificial neural network solutions have been
trained with supervision. In this mode, the actual output of a neural network is
compared to the desired output. Weights, which are usually randomly set to
begin with, are then adjusted by the network so that the next iteration, or
cycle, will produce a closer match between the desired and the actual output.
This global error reduction is achieved over time by continuously modifying the input weights until an acceptable network accuracy is reached. Thus, training consists of presenting input and output data to the network [69].
b- Unsupervised Learning:
This type does not use external influence to adjust its weights. Instead, the network internally monitors its performance, looks for regularities or trends in the input signals, and makes adaptations according to its function. Even without being told whether it is right or wrong, the network must still have some information about how to organize itself. This information is built into the network topology and learning rules [69].

There are four basic types of learning rules: Hebbian, Delta Rule,
Boltzmann, and competitive learning.

4.8.1. Hebbian Learning Rule

In the brain, learning is partially done by changes in synaptic strength. The first person to comment upon this was the psychologist Donald Hebb in 1949. An important property of this rule is that learning is done locally; that is, the change in a synapse's weight depends only on the activities of the two neurons connected by it [70]. In Hebb's own words: "when an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased" [64].

To formulate Hebbian learning in mathematical terms, consider a synaptic weight w_ij of neuron j with input signal x_i(n) and output signal y_j(n). The simplest form of Hebbian learning is described by [64]:

Δw_ij = η y_j(n) x_i(n)        (4.9)

where n is a time step and η is a positive constant (learning-rate coefficient), and

w_ij(n+1) = w_ij(n) + Δw_ij(n)        (4.10)


Hebbian learning mechanisms directly or indirectly form the basis of many of the learning mechanisms used to train ANNs.
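Equations (4.9) and (4.10) amount to a one-line update; a minimal Python sketch (the function name and the co-active pattern are illustrative assumptions):

```python
def hebbian_update(w, x, y, eta=0.1):
    # Eq. (4.9): delta_w_i = eta * y * x_i;  Eq. (4.10): w_i <- w_i + delta_w_i
    return [wi + eta * y * xi for wi, xi in zip(w, x)]

w = [0.0, 0.0]
x, y = [1.0, -1.0], 1.0          # assumed pre-/post-synaptic activities
for _ in range(5):
    w = hebbian_update(w, x, y)
# the weight grows for the correlated input and shrinks for the anti-correlated one
```

The update is purely local: each w_i changes using only x_i and y, exactly the property the text highlights.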

4.8.2. Delta Learning Rule

In the supervised learning paradigm, the network is given a desired output for each input pattern. During the learning process, the actual output y generated by the network may not equal the desired output d. The basic principle of delta rules (or error-correction learning rules) is to use the error signal (d − y) to modify the connection weights so as to gradually reduce this error [56]. Instead of the output y_j(n), one uses the error e_j(n), where e_j(n) is the difference between the desired output d_j(n) and the output actually reached at the time of learning:

e_j(n) = d_j(n) − y_j(n)        (4.11)

According to the delta rule, the adjustment Δw_jk(n) applied to the synaptic weight w_jk at time step n is defined by:

Δw_jk(n) = η e_k(n) x_j(n)        (4.12)

In other words, the delta rule may be stated as: "The adjustment made
to a synaptic weight of a neuron is proportional to the product of the error
signal and the input signal of the synapse in question." [64].
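For a single linear neuron, the delta rule of Eqs. (4.11)-(4.12) can be sketched as follows (an illustrative sketch; the target w·x = d problem and the function name are assumptions):

```python
def delta_update(w, x, d, eta=0.1):
    # e = d - y (Eq. 4.11);  w_j <- w_j + eta * e * x_j (Eq. 4.12)
    y = sum(wj * xj for wj, xj in zip(w, x))
    e = d - y
    return [wj + eta * e * xj for wj, xj in zip(w, x)], e

w = [0.0, 0.0]
for _ in range(200):
    w, e = delta_update(w, x=[1.0, 1.0], d=1.0)
# the error shrinks toward zero as w converges to a solution of w . x = d
```

Each step moves the weights proportionally to the product of the error signal and the input signal, as the quoted statement of the rule requires.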

4.8.3. Boltzmann Learning Rule

Boltzmann machines are symmetric recurrent networks consisting of binary units. By symmetric, we mean that the weight on the connection from unit i to unit j is equal to the weight on the connection from unit j to unit i. The objective of Boltzmann learning is to adjust the connection weights so that the states of the visible units satisfy a particular desired probability distribution. According to this rule, the change in the connection weight w_ij is given by:

Δw_ij = η (ρ̄_ij − ρ_ij)        (4.13)



where η is the learning rate, and ρ̄_ij and ρ_ij are the correlations between the states of units i and j when the network operates in the clamped mode and the free-running mode, respectively.

Boltzmann learning can be viewed as a special case of error-correction


learning in which error is measured not as the direct difference between
desired and actual outputs, but as the difference between the correlations
among the outputs of two neurons under clamped and free running conditions
[61].
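The weight update of Eq. (4.13) itself is simple once the two correlations are available; a minimal sketch (in practice ρ̄_ij and ρ_ij would be estimated by sampling the machine in clamped and free-running modes, which is omitted here, and the numerical values are illustrative assumptions):

```python
def boltzmann_update(rho_clamped, rho_free, eta=0.1):
    # Eq. (4.13): delta_w_ij = eta * (rho_clamped_ij - rho_free_ij)
    return eta * (rho_clamped - rho_free)

# units i and j co-occur more often clamped than free, so the weight increases
dw = boltzmann_update(rho_clamped=0.8, rho_free=0.3)
```

The sign of the update shows the error-correction reading given in the text: the weight moves to bring the free-running correlation toward the clamped one.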

4.8.4. Competitive Learning Rule

Unlike Hebbian learning (in which multiple output units can fire simultaneously), competitive-learning output units compete among themselves for activation. As a result, only one output unit is active at any given time. This phenomenon is known as winner-take-all. Competitive learning has been found to exist in biological neural networks [56]. A simple competitive learning rule can be stated as [64]:

Δw_ij = { η (x_i − w_ij),  if neuron j wins the competition
          0,               if neuron j loses the competition }        (4.14)

Note that only the weights of the winning unit are updated. The effect of this learning rule is to move the stored pattern in the winning unit (its weights) a little closer to the input pattern [61].
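A winner-take-all step following Eq. (4.14) can be sketched in Python (an illustrative sketch; choosing the winner by smallest Euclidean distance to the input, and the initial weights, are assumptions):

```python
def competitive_update(W, x, eta=0.5):
    # pick the unit whose weight vector is closest to x (the "winner") ...
    dists = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in W]
    winner = dists.index(min(dists))
    # ... and move only its weights toward the input, Eq. (4.14)
    W[winner] = [wi + eta * (xi - wi) for wi, xi in zip(W[winner], x)]
    return winner

W = [[0.0, 0.0], [1.0, 1.0]]          # two units, illustrative initial weights
winner = competitive_update(W, x=[0.9, 1.0])
```

Only `W[winner]` changes; the losing unit's weights are untouched, matching the second branch of Eq. (4.14).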

4.9. Training of Multi-Layer Perceptron

The Multi-Layer Perceptron (MLP), an example of which is given in Fig. (4.5), is a nonparametric technique for performing a wide variety of estimation tasks. Error Back-Propagation (EBP) is one of the most important and widely used algorithms for training multilayer perceptrons [71]. The training process of MLP networks continues until a certain number of iterations or a desired error rate is reached. The most common error measure used in MLP

networks is the mean square error (MSE), defined for a single example by the following formula:

Err = (d − y)^2 / 2        (4.15)

where d is the desired output for the given input and y is the output produced by the neural network.

The total MSE sums the error over all individual examples and all the output neurons in the network:

MSE = ( Σ_{k=1}^{p} Σ_{j=1}^{m} (d_j(k) − y_j(k))^2 ) / (p · m)        (4.16)

where y_j(k) is the value of the j-th output of the network when the k-th training example is presented; d_j(k) is the desired value of the j-th output for the k-th training example; p is the number of training examples in the training data; and m is the number of output neurons in the neural network. The root-mean-square error (RMSE) is the square root of the MSE [72].
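Equation (4.16) and the RMSE can be written directly in Python (a minimal sketch; the sample targets and outputs are illustrative assumptions):

```python
import math

def mse(desired, outputs):
    # Eq. (4.16): sum of squared errors over p examples and m output neurons,
    # divided by p * m
    p, m = len(desired), len(desired[0])
    total = sum((d - y) ** 2
                for dk, yk in zip(desired, outputs)
                for d, y in zip(dk, yk))
    return total / (p * m)

def rmse(desired, outputs):
    # RMSE is the square root of the MSE
    return math.sqrt(mse(desired, outputs))

# p = 2 examples, m = 2 output neurons (illustrative values)
err = mse([[1.0, 0.0], [0.0, 1.0]], [[0.8, 0.1], [0.2, 0.7]])
```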

4.10. Error Backpropagation (EBP)

MLPs have been applied successfully to solve some difficult and diverse problems by training them in a supervised manner with a popular algorithm known as the EBP algorithm [63]. The BP algorithm was first defined by Werbos (1974) and later improved by Rumelhart et al. (1986) [73], who referred to it as the generalized delta rule. The BP generalized delta rule is a gradient-descent method that minimizes the total squared error of the output computed by the net [74].
The backpropagation algorithm is used in layered feed-forward ANNs. This means that the artificial neurons are organized in layers and send their signals "forward", and then the errors are propagated backwards [75].
The training procedure consists of two main steps: feed-forward and back-propagation [73]. During the forward pass, the synaptic weights of the network are all fixed. During the backward pass, on the other hand, the synaptic weights are all adjusted in accordance with the error-correction rule.

Specifically, the actual response of the network is subtracted from a desired (target) response to produce an error signal. This error signal is then propagated backward through the network, against the direction of the synaptic connections, hence the name "error back-propagation" [63, 64].

4.11. Backpropagation Training Algorithm

The backpropagation (BP) training algorithm involves three stages: the feed-forward of the input training pattern, the calculation and back-propagation of the associated error, and the adjustment of the weights. The training algorithm is shown below [76].
Step 1: Initialize the weights.
Step 2: While the squared error is greater than a tolerance, execute steps 3 to 11.
Step 3: For each training pair, do steps 4 to 11.
Step 4: Sum the weighted inputs and apply the activation function to compute the output of the hidden layer:
y_j = f( Σ_i x_i w_ij )        (4.17)

where:
y_j: actual output of hidden neuron j.
x_i: input signal of input neuron i.
w_ij: weight between input neuron i and hidden neuron j.
f: activation function.

Step 5: Sum the weighted outputs of the hidden layer and apply the activation function to compute the output of the output layer:
y_k = f( Σ_j y_j w_jk )        (4.18)
y_k: the output of neuron k.

Step 6: Compute the backpropagation error:
δ_k = (d_k − y_k) f'( Σ_j y_j w_jk )        (4.19)
d_k: the desired output of neuron k.
f': the derivative of the activation function.


Step 7: Calculate the weight correction term (with momentum coefficient α):
Δw_jk(n) = η δ_k y_j + α Δw_jk(n−1)        (4.20)

Step 8: Sum the delta inputs for each hidden unit and calculate its error term:
δ_j = ( Σ_k δ_k w_jk ) f'( Σ_i x_i w_ij )        (4.21)

Step 9: Calculate the weight correction term for the input-to-hidden weights:
Δw_ij(n) = η δ_j x_i + α Δw_ij(n−1)        (4.22)

Step 10: Update the weights:
w_jk(new) = w_jk(old) + Δw_jk        (4.23)
w_ij(new) = w_ij(old) + Δw_ij        (4.24)

Step 11: Compute the sum squared error.


The program flow chart for training a network with the BP algorithm is shown in Fig. (4.6).
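The eleven steps above can be sketched as a small standalone Python program for a one-hidden-layer MLP with sigmoid activations (the present study used the MATLAB toolbox; this sketch, including the name train_bp, the toy OR training set, and all hyperparameter values, is an illustrative assumption, not the thesis code):

```python
import math
import random

def sigmoid(net):
    # Eq. (4.6) with T = 1
    return 1.0 / (1.0 + math.exp(-net))

def d_sigmoid(y):
    # derivative of the sigmoid expressed through its output: f'(net) = y (1 - y)
    return y * (1.0 - y)

def train_bp(samples, n_in, n_hidden, eta=0.5, alpha=0.5, epochs=3000, seed=1):
    # Step 1: initialize the weights (small random values) and momentum buffers
    rng = random.Random(seed)
    w_ij = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    b_j = [0.0] * n_hidden
    w_jk = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_k = 0.0
    dw_ij_prev = [[0.0] * n_hidden for _ in range(n_in)]
    dw_jk_prev = [0.0] * n_hidden
    for _ in range(epochs):                 # Step 2 (fixed epoch count here)
        for x, d in samples:                # Step 3: each training pair
            # Steps 4-5: forward pass, Eqs. (4.17)-(4.18)
            y_h = [sigmoid(sum(x[i] * w_ij[i][j] for i in range(n_in)) + b_j[j])
                   for j in range(n_hidden)]
            y_out = sigmoid(sum(y_h[j] * w_jk[j] for j in range(n_hidden)) + b_k)
            # Step 6: output error term, Eq. (4.19)
            delta_k = (d - y_out) * d_sigmoid(y_out)
            # Step 8: hidden error terms, Eq. (4.21)
            delta_j = [delta_k * w_jk[j] * d_sigmoid(y_h[j]) for j in range(n_hidden)]
            # Steps 7, 9, 10: corrections with momentum and updates, Eqs. (4.20)-(4.24)
            for j in range(n_hidden):
                dw = eta * delta_k * y_h[j] + alpha * dw_jk_prev[j]
                w_jk[j] += dw
                dw_jk_prev[j] = dw
                for i in range(n_in):
                    dwi = eta * delta_j[j] * x[i] + alpha * dw_ij_prev[i][j]
                    w_ij[i][j] += dwi
                    dw_ij_prev[i][j] = dwi
                b_j[j] += eta * delta_j[j]
            b_k += eta * delta_k
    def predict(x):
        y_h = [sigmoid(sum(x[i] * w_ij[i][j] for i in range(n_in)) + b_j[j])
               for j in range(n_hidden)]
        return sigmoid(sum(y_h[j] * w_jk[j] for j in range(n_hidden)) + b_k)
    return predict

# toy logical-OR problem (illustrative)
or_set = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]
net = train_bp(or_set, n_in=2, n_hidden=3)
```

Step 2 is simplified to a fixed number of epochs rather than an error tolerance; swapping in a while-loop over the summed squared error of Step 11 would follow the listing more literally.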


[Fig. (4.6): BP training flow chart. Begin → network initiation → initialize the learning sample → compute the output of each neuron in each network layer → compute the training error → modify the network weights → if the error precision is not met, repeat; otherwise end.]

Fig. (4.6) BP Algorithm Flow Chart [76]


The BP stages are illustrated in Fig. (4.7) shown below.

Fig. (4.7) Backpropagation Training Cycle [77]

4.12. Testing the Trained Network

An important aspect of developing neural networks is determining how well the network performs once training is completed. Checking the performance of a trained network involves two main criteria: (1) how well the neural network recalls the predicted response from data sets used to train the network (the recall step); and (2) how well the network predicts responses from data sets that were not used in training (the generalization step). In the recall step, the network's performance is evaluated by recalling the specific inputs used in training: a previously used input pattern is introduced to the trained network, and the network attempts to predict the corresponding output. If the network has been trained sufficiently, the network output will differ only slightly from the actual output data. In testing the network, the weight factors are not changed: they are frozen at their last values when training ceased. Recall testing is so named because it measures how well the network can recall what it has learned. Generalization testing is so named because it measures how well the network can generalize what it has learned and form rules with which to make decisions about data it has not previously seen. The network generalizes well when it sensibly interpolates these new patterns [78].
4.13. Artificial Neural Network Models of the Present Study
The neural networks are implemented using the neural network toolbox available in the Matlab 2007 software. This toolbox implements several different neural network algorithms, including the backpropagation algorithm. The configuration and training of neural networks is a trial-and-error process due to such undetermined parameters as the number of hidden layers, the number of nodes in the hidden layers, the learning parameter, and the number of training patterns.

4.13.1. Selection of the Training and Testing Patterns

The total experimental data are divided into two sets: a training set and a testing set. The training set is used for computing the gradient and updating the network weights and biases to diminish the training error, and to find the relationship between the input and output parameters. Hence, the learning process is a crucial phase in NN modeling. The testing set is used to evaluate the generalization ability of the learning process. In this study, the artificial neural network modeling is divided into two models. The first ANN model is used to predict the corrosion behavior of metal in acid solution; the network is trained for corrosion rate and inhibition efficiency. The inputs are: a) the concentration of the hydrochloric acid solution (corrosive solution) in %, b) the immersion duration (h), and c) the concentration of the ACX inhibitor (mg/l); the outputs are: 1) the corrosion rate (mpy) and 2) the inhibition efficiency (%). The randomly selected data used to train and test the first neural network number 77 and 19, respectively. The second model is used to predict the pickling rate of a scale-containing metal surface in acid solution. The inputs for the second model are: a) the concentration of the hydrochloric acid solution (%), b) the immersion duration (h), and c) the concentration of ACX (mg/l); the outputs are: 1) the scale density (mg/cm2) and 2) the amount of iron content dissolved (mg/l). The numbers of data used to train and test the latter network are 70 and 14, respectively. The training and testing data used for both models were taken from the experimental work results of the present study.
Table (4.1) The input variables and the predicted outputs of the proposed artificial neural network models.

Model | Input Variables | Predicted Outputs (Target)
First Model: Prediction of Corrosion Behavior | a) Hydrochloric acid concentration (%); b) Immersion duration (h); c) ACX concentration (mg/l) | 1) Corrosion rate (mpy); 2) Inhibition efficiency (%)
Second Model: Prediction of Pickling Rate | a) Hydrochloric acid concentration (%); b) Immersion duration (h); c) ACX concentration (mg/l) | 1) Scale density (mg/cm2); 2) Iron content (mg/l)

4.13.2. Optimization Technique and Error Estimates

Neural network functions depend nonlinearly on their weights, so the minimization of the corresponding error function requires the use of iterative nonlinear optimization algorithms. These algorithms make use of the derivatives of the error function with respect to the weights of the network. The resilient backpropagation algorithm is the optimization technique employed in building the present ANNs. After completing the training process, the model is tested using another batch of data which has not been used in the training set.

The following statistical parameters of significance are calculated at the end of the training and testing calculations:

1. Mean square error (MSE): a statistical measure of the differences between the output values in the training set and the output values the network predicts. The goal is to minimize the value of the MSE.
2. Correlation coefficient (R): a measure of how well the actual and predicted values correlate with each other. The goal is to maximize the value of R. The correlation coefficient function can be described as follows [74]:
R = ( Σ_{n=1}^{N} a_n · t'_n ) / ( (N − 1) · st_a · st_t )        (4.25)


Where:
a_n and t'_n are the normalized target and output data, respectively;
st_a and st_t are the standard deviations of the target and output data,
respectively; and N is the number of data points in the target vector.
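The two quality measures above can be computed directly. The sketch below is
illustrative (the function names and sample vectors are assumptions); the R
formula follows the spirit of Eq. (4.25), with mean-centred data standing in
for the normalized data and sample standard deviations (ddof=1).

```python
import numpy as np

def mse(targets, outputs):
    """Mean square error between network outputs and targets."""
    t, o = np.asarray(targets, dtype=float), np.asarray(outputs, dtype=float)
    return np.mean((t - o) ** 2)

def corr_coef(targets, outputs):
    """Correlation coefficient R in the spirit of Eq. (4.25):
    R = sum(a_n * t'_n) / ((N - 1) * st_a * st_t)."""
    a = np.asarray(targets, dtype=float)
    t = np.asarray(outputs, dtype=float)
    N = a.size
    a_c, t_c = a - a.mean(), t - t.mean()      # mean-centred data
    return np.sum(a_c * t_c) / ((N - 1) * a.std(ddof=1) * t.std(ddof=1))

# Illustrative targets and predictions (not data from this study).
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
```

For identical target and output vectors this R evaluates to exactly 1,
matching the "perfect correlation" criterion used later in this section.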
4.13.3. Number of Hidden Layers, Number of Nodes in Hidden Layer and Type of
Activation Function
The choice of the number of hidden layers, the number of nodes in each hidden
layer and the activation functions depends on the network application. Two
hidden layers were used in the neural network modeling of the present study
because this configuration performs significantly better than one hidden
layer. Although a single hidden layer may be sufficient for many function
approximation problems, some problems are easier to solve with a two hidden
layer configuration [74].
It is usual to start with a relatively small number of hidden units and
increase it until the approximation quality of the network is satisfactory.
Unfortunately, the network needs to be fully retrained after each modification
of its structure. The number of nodes in the hidden layer(s) drastically
affects the outcome of the network training. Therefore, a trial-and-error
approach was carried out to choose an adequate number of hidden layers and
number of nodes in each hidden layer. The number of nodes in the hidden layers
was selected according to the following rules:
1- The maximum error of the output network parameters should be as small as
possible for both training patterns and testing patterns.

2- The correlation coefficient should be as high as possible. It is a measure
of how well the variation in the output is explained by the targets; if this
number is equal to 1, there is perfect correlation between targets and
outputs.
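The trial-and-error selection described above amounts to a grid search over
candidate hidden layer sizes, retraining from scratch for each candidate. A
minimal sketch follows; `train_eval` and `fake_eval` are hypothetical
stand-ins for the real train-and-test procedure, and the "optimal" (5, 8)
architecture of the stub is contrived purely so the example has a defined
answer.

```python
import itertools

def grid_search_architecture(train_eval, n1_range, n2_range):
    """Trial-and-error selection of the two hidden layer sizes.

    `train_eval(n1, n2)` must fully retrain the network for the
    architecture (n1, n2) and return (test_mse, test_R). The candidate
    with the highest test R is kept, with MSE breaking ties, matching
    the two selection rules stated in the text.
    """
    best = None
    for n1, n2 in itertools.product(n1_range, n2_range):
        mse_val, r_val = train_eval(n1, n2)
        key = (-r_val, mse_val)            # maximize R, then minimize MSE
        if best is None or key < best[0]:
            best = (key, (n1, n2))
    return best[1]

# Hypothetical evaluator for illustration only: pretends (5, 8) is optimal.
def fake_eval(n1, n2):
    penalty = abs(n1 - 5) + abs(n2 - 8)
    return 0.0004 + 0.01 * penalty, 0.999 - 0.01 * penalty

best_arch = grid_search_architecture(fake_eval, range(3, 15), range(3, 15))
```

In practice each `train_eval` call is expensive, which is why the text notes
that the network must be fully retrained after every structural change.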
In the present study, the network was tested with two hidden layers and with
different activation functions for the first hidden layer, the second hidden
layer and the output layer. Different numbers of nodes in each hidden layer,
from 3 to 14 nodes, were used.
Tables (4.2) and (4.3) show how the number of nodes in the two hidden layers
affects the response of the network for the first and second models,
respectively, with different types and arrangements of activation functions.
The training function selected here is conjugate gradient backpropagation with
Polak-Ribiere updates (TRAINCGP).
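The Polak-Ribiere update rule itself can be sketched in a few lines. This is
an illustrative simplification, not the TRAINCGP implementation: a small fixed
step replaces the line search a real trainer performs, and the function names
and toy quadratic error surface are assumptions.

```python
import numpy as np

def cg_polak_ribiere(grad_fn, w, n_iter=50, lr=0.05):
    """Conjugate gradient descent with Polak-Ribiere updates (sketch).

    The Polak-Ribiere coefficient
        beta = g_new . (g_new - g_old) / (g_old . g_old)
    mixes the new gradient into the previous search direction; clamping
    beta at zero (the "PR+" variant) restarts the search along the
    steepest descent direction when beta would go negative.
    """
    g = grad_fn(w)
    d = -g                               # initial search direction
    for _ in range(n_iter):
        w = w + lr * d                   # fixed step stands in for line search
        g_new = grad_fn(w)
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))
        d = -g_new + beta * d            # conjugate direction update
        g = g_new
    return w

# Toy quadratic error surface with known minimum w*.
w_star = np.array([2.0, -1.0, 0.5])
w_fit = cg_polak_ribiere(lambda w: 2 * (w - w_star), np.zeros(3))
```

On a quadratic surface the method behaves well even with this crude step
rule; real trainers pair the beta update with a proper line search.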
Table (4.2) Training MSE and testing regression for the two hidden layer
networks with different types and arrangements of activation functions, first
model.

Network  Metric       (tansig,    (tansig,    (tansig,    (purelin,   (tansig,
type                  purelin,    tansig,     tansig,     tansig,     purelin,
                      purelin)    purelin)    tansig)     tansig)     tansig)
3-3      MSE (train)  0.006867    0.004648    0.866685    0.267799    0.269498
         R (test)     0.99187     0.99444     0.43771     0.97524     0.98271
3-4      MSE (train)  0.006904    0.002827    0.278386    0.605891    1.39711
         R (test)     0.99185     0.9962      0.96397     0.71592     0.59332
4-6      MSE (train)  0.005792    0.001373    1.16271     0.349531    1.08988
         R (test)     0.99061     0.99736     0.83163     0.96549     0.36873
5-8      MSE (train)  0.002436    0.000367    1.08849     0.265321    0.425579
         R (test)     0.996       0.99936     0.58482     0.97633     0.97621
6-9      MSE (train)  0.003906    0.000964    0.265067    0.265536    0.634028
         R (test)     0.99559     0.99693     0.9754      0.97433     0.62809
7-8      MSE (train)  0.001613    0.000372    0.26474     1.84901     0.265453
         R (test)     0.99499     0.99839     0.95426     0.66551     0.975
5-11     MSE (train)  0.010721    0.000538    0.26506     0.265586    0.265967
         R (test)     0.9926      0.99466     0.97785     0.97607     0.97361
9-4      MSE (train)  0.001418    0.000954    0.264058    0.267059    0.265829
         R (test)     0.99545     0.99866     0.95685     0.97839     0.97314
3-11     MSE (train)  0.020705    0.001533    0.265846    0.275163    0.266173
         R (test)     0.99656     0.99737     0.97458     0.97606     0.97209
5-14     MSE (train)  0.002466    0.00109     0.265149    0.265511    0.266637
         R (test)     0.99727     0.99889     0.97297     0.97995     0.97744
From the table above, it can be seen that the networks with the (tansig,
tansig, purelin) activation functions give the best performance and testing
regression, and that the (5-8) network gives the best performance and
regression for both training and testing. The architecture of the neural
network for this model is given in Fig. (4.8). The corrosion test results
predicted by the artificial neural network are in good agreement with the
experimental values of the present study, i.e. correlation coefficient
R = 0.99936, as shown in Fig. (4.9).
Table (4.3) Training MSE and testing regression for the two hidden layer
networks with different types and arrangements of activation functions, second
model.

Network  Metric       (tansig,    (tansig,    (tansig,    (purelin,   (tansig,
type                  purelin,    tansig,     tansig,     tansig,     purelin,
                      purelin)    purelin)    tansig)     tansig)     tansig)
3-4      MSE (train)  0.005669    0.004197    0.009120    0.0148931   0.007648
         R (test)     0.9869      0.9866      0.98015     0.97807     0.98064
4-3      MSE (train)  0.005890    0.004286    0.004875    0.006775    0.005282
         R (test)     0.98173     0.98382     0.98128     0.98081     0.9838
5-4      MSE (train)  0.004994    0.002972    0.006238    0.137416    0.004874
         R (test)     0.99084     0.99513     0.98043     0.87353     0.984
6-5      MSE (train)  0.003719    0.001376    0.003583    0.143473    0.726964
         R (test)     0.91766     0.96658     0.9611      0.966       0.58213
7-7      MSE (train)  0.003210    0.001467    0.001734    0.050627    0.004298
         R (test)     0.986       0.98608     0.97633     0.90539     0.98636
7-9      MSE (train)  0.003764    0.001302    0.57335     0.002855    0.003853
         R (test)     0.98968     0.99542     0.759446    0.98378     0.98665
4-11     MSE (train)  0.005060    0.001666    0.001754    0.002030    0.006140
         R (test)     0.99067     0.9943      0.96221     0.98884     0.98255
12-3     MSE (train)  0.002104    0.002004    0.002123    0.007967    0.231318
         R (test)     0.98926     0.99399     0.98504     0.98169     0.9841
4-13     MSE (train)  0.006205    0.001630    0.001639    0.005308    0.005560
         R (test)     0.98317     0.98574     0.98454     0.97952     0.98334
14-3     MSE (train)  0.001541    0.001325    0.39879     0.214616    0.001663
         R (test)     0.97081     0.97142     0.91446     0.87307     0.96807
From the table above, it can be seen that the networks with the (tansig,
tansig, purelin) activation functions give the best performance and testing
regression, and that the (7-9) network gives the best performance and
regression for both training and testing. The architecture of the neural
network for this model is given in Fig. (4.10). The pickling test results
predicted by the artificial neural network are in good agreement with the
experimental values of the present study, i.e. correlation coefficient
R = 0.99542, as shown in Fig. (4.11).
Table (4.4) shows the statistical parameters that characterize good training
of the artificial neural network modeling in the two proposed models.
Table (4.4) Statistical parameters for the first and second models.

Model          Mean Square Error (MSE)   Correlation Coefficient (R)
First Model    3.670×10^-4               0.99936
Second Model   1.302×10^-3               0.99542

[Figure: feed-forward network diagram. Input layer (3 nodes): %HCl, [ACX],
Time; first hidden layer (5 nodes); second hidden layer (8 nodes); output
layer (2 nodes): CR, %IE.]

Fig. (4.8): The architecture of the first ANN model.
[Figure: scatter plot of ANN results versus experimental results (0 to 300)
with the best linear fit; R = 0.99936.]

Fig. (4.9): Comparison between ANN results and experimental results using
conjugate gradient backpropagation with Polak-Ribiere updates.

[Figure: feed-forward network diagram. Input layer (3 nodes): %HCl, [ACX],
Time; first hidden layer (7 nodes); second hidden layer (9 nodes); output
layer (2 nodes): scale density removal, [Fe] content.]

Fig. (4.10): The architecture of the second ANN model.
[Figure: scatter plot of ANN results versus experimental results (10 to 60)
with the best linear fit; R = 0.99542.]

Fig. (4.11): Comparison between ANN results and experimental results using
conjugate gradient backpropagation with Polak-Ribiere updates.

4.13.4. ANN Training Program Structure

The structure of the program used to train the ANN is illustrated in the
following chart:

Begin
1. Input training data.
2. Normalization (pre-processing) of training data.
3. Choose the number of hidden layers.
4. Choose the number of nodes in each hidden layer.
5. Choose the activation functions for the hidden layers and the output layer.
6. Choose the training function.
7. Train the ANN.
8. Simulation (predicted output).
9. Normalization (post-processing) of the predicted output.
10. Does the network meet the MSE and R precision? If NO, adjust the network
structure and repeat the training; if YES, End.

Fig. (4.12): The structure of the neural network program.
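The pre- and post-processing steps of the chart can be sketched as below.
This is a minimal illustration assuming min-max scaling to [-1, 1], a common
choice for tansig networks; the thesis's actual scaling ranges and trainer
interface are not specified here, so `train_once` and the sample data are
hypothetical.

```python
import numpy as np

def normalize(x, lo, hi):
    """Pre-process raw data into [-1, 1] before training."""
    return 2 * (x - lo) / (hi - lo) - 1

def denormalize(y, lo, hi):
    """Post-process predicted output back into engineering units."""
    return (y + 1) * (hi - lo) / 2 + lo

def train_until_precise(train_once, mse_goal, r_goal, max_restarts=10):
    """Outer loop of the chart: repeat training until the MSE and R
    precision goals are met, up to a restart limit.

    `train_once()` must return (mse_val, r_val, trained_net)."""
    for _ in range(max_restarts):
        mse_val, r_val, net = train_once()
        if mse_val <= mse_goal and r_val >= r_goal:
            return net
    return None  # precision never met within the allowed restarts

# Round-trip check of the scaling on illustrative data.
x = np.array([10.0, 20.0, 40.0])
x_n = normalize(x, 10.0, 40.0)
x_back = denormalize(x_n, 10.0, 40.0)
```

The round trip `denormalize(normalize(x))` recovers the original values
exactly, which is what lets the network train on scaled data while the
reported predictions stay in physical units.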