
UNIT-4

FEED FORWARD ANN

In multilayer feed forward networks, the network consists of a set of sensory units
(source nodes) that constitute the input layer, one or more hidden layers of
computation nodes, and an output layer of computation nodes. The input signal
propagates through the network in a forward direction, on a layer-by-layer basis.
These neural networks are commonly referred to as multilayer perceptrons (MLPs).

A multilayer perceptron has three distinctive characteristics:

1. The model of each neuron in the network includes a nonlinear activation function.
A commonly used form of nonlinearity that satisfies this requirement is a sigmoidal
nonlinearity defined by the logistic function
Yj = 1 / (1 + exp(-Vj))
where Vj is the induced local field (i.e., the weighted sum of all synaptic inputs plus the
bias) of neuron j, and Yj is the output of the neuron. The presence of nonlinearities is
important because otherwise the input-output relation of the network could be reduced to
that of a single-layer perceptron. (A small numerical sketch of this activation follows this list.)

2. The network contains one or more layers of hidden neurons that are not part of the input
or output of the network. These hidden neurons enable the network to learn complex tasks
by extracting progressively more meaningful features from the input patterns (vectors).
3. The network exhibits high degrees of connectivity, determined by the synapses of
the network. A change in the connectivity of the network requires a change in the
population of synaptic connections or their weights.
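A minimal numerical sketch of the logistic activation described in characteristic 1, written in Python; the neuron's weights, inputs and bias values are made up purely for illustration.

```python
import math

def logistic(v_j):
    """Logistic (sigmoidal) activation: maps the induced local field v_j to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v_j))

# The induced local field is the weighted sum of the synaptic inputs plus the bias.
weights = [0.4, -0.6]
inputs = [1.0, 0.5]
bias = 0.1
v_j = sum(w * x for w, x in zip(weights, inputs)) + bias   # v_j = 0.2 here
y_j = logistic(v_j)                                        # output of neuron j, about 0.550
print(y_j)
```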
Feed-forward networks have the following characteristics:

1. Perceptrons are arranged in layers, with the first layer taking in inputs and the
last layer producing outputs. The middle layers have no connection with the external
world, and hence are called hidden layers.

2. Each perceptron in one layer is connected to every perceptron in the next layer.
Hence information is constantly "fed forward" from one layer to the next, and this
explains why these networks are called feed-forward networks.

3. There is no connection among perceptrons in the same layer.


STRUCTURE OF MULTI-LAYER FEED FORWARD NETWORKS
• A Multi-layer feed forward network consists of a layer of input units, one or more
layers of hidden units and a layer of output units.
• The input signal propagates through the network in a forward direction on a
layer-by-layer basis.
• These networks are called feed forward because the output of one layer of neurons
feeds forward onto the next layer of neurons.
• The following Figure shows the architectural graph of a multilayer perceptron
with two hidden layers and an output layer.
• The network shown here is fully connected.
• This means that a neuron in any layer of the network is connected to all the
nodes/neurons in the previous layer.
• Signal flow through the network progresses in a forward direction, from left to
right and on a layer-by-layer basis.
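The following is a rough Python sketch, not part of the original notes, of how the function signal flows forward through a fully connected network with two hidden layers; the layer sizes, weight values and the helper name forward_layer are assumptions made only for illustration.

```python
import math

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward_layer(prev_outputs, weights, biases):
    """Every neuron in this layer receives the output of every neuron in the previous layer."""
    return [logistic(sum(w * x for w, x in zip(w_row, prev_outputs)) + b)
            for w_row, b in zip(weights, biases)]

# Source nodes (3 inputs) -> first hidden layer (4 neurons) -> second hidden layer (3) -> output layer (2).
x = [0.5, -1.2, 0.3]
W1, b1 = [[0.1, -0.2, 0.3]] * 4, [0.0] * 4
W2, b2 = [[0.2, 0.1, -0.1, 0.05]] * 3, [0.0] * 3
W3, b3 = [[0.3, -0.3, 0.2]] * 2, [0.0] * 2

h1 = forward_layer(x, W1, b1)   # function signal after the first hidden layer
h2 = forward_layer(h1, W2, b2)  # function signal after the second hidden layer
y = forward_layer(h2, W3, b3)   # output signal of the network
print(y)
```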
Two kinds of signals are identified in this network.
1. Function Signals.
• A function signal is an input signal (stimulus) that comes in at the input end of the
network, propagates forward (neuron by neuron) through the network, and
emerges at the output end of the network as an output signal. We refer to such a
signal as a "function signal" for two reasons.
• First, it is presumed to perform a useful function at the output of the network.
• Second, at each neuron of the network through which a function signal passes, the
signal is calculated as a function of the inputs and associated weights applied to
that neuron. The function signal is also referred to as the input signal.
2. Error Signals.
• An error signal originates at an output neuron of the network, and propagates
backward (layer by layer) through the network.
• We refer to it as an "error signal" because its computation by every neuron of the
network involves an error-dependent function in one form or another.
• The output neurons (computational nodes) constitute the output layers of the
network.
• The remaining neurons (computational nodes) constitute hidden layers of the
network.
• Thus the hidden units are not part of the output or input of the network, hence their
designation as "hidden."
• The first hidden layer is fed from the input layer made up of sensory units (source
nodes); the resulting outputs of the first hidden layer are in turn applied to the next
hidden layer; and so on for the rest of the network.
Each hidden or output neuron of a multilayer perceptron is designed to perform two
computations:

1. The computation of the function signal appearing at the output of a neuron, which
is expressed as a continuous nonlinear function of the input signal and synaptic
weights associated with that neuron.

2. The computation of an estimate of the gradient vector (i.e., the gradients of the
error surface with respect to the weights connected to the inputs of a neuron), which
is needed for the backward pass through the network.
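As an informal illustration of these two computations for a single output neuron, the sketch below assumes a logistic activation (whose derivative is y * (1 - y)) and a squared-error measure; all names and values are illustrative, not taken from the notes.

```python
import math

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

# Computation 1: the function signal (forward pass) of one output neuron.
inputs, weights, bias = [0.8, 0.2], [0.5, -0.3], 0.1
v = sum(w * x for w, x in zip(weights, inputs)) + bias
y = logistic(v)

# Computation 2: an estimate of the gradient of the error surface with respect to
# the weights feeding this neuron (needed for the backward pass).
d = 1.0                                    # desired response
e = d - y                                  # error signal at the output neuron
delta = e * y * (1.0 - y)                  # local gradient (logistic derivative is y * (1 - y))
gradients = [-delta * x for x in inputs]   # dE/dw_i for the squared error E = 0.5 * e**2
print(y, gradients)
```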
What is Backpropagation?

• Back-propagation is the essence of neural net training. It is the method of
fine-tuning the weights of a neural net based on the error rate obtained in the previous
epoch (i.e., iteration). Proper tuning of the weights allows you to reduce error rates
and to make the model reliable by increasing its generalization.

• Backpropagation is a short form for "backward propagation of errors." It is a
standard method of training artificial neural networks. This method helps to
calculate the gradient of a loss function with respect to all the weights in the
network.
1. Inputs X arrive through the preconnected path.
2. The input is modeled using real weights W. The weights are usually randomly
selected.
3. Calculate the output for every neuron from the input layer, through the hidden layers,
to the output layer.
4. Calculate the error in the outputs:
Error = Actual Output – Desired Output
5. Travel back from the output layer to the hidden layers to adjust the weights such
that the error is decreased.
6. Keep repeating the process until the desired output is achieved.
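A minimal sketch of steps 1-6 for a single logistic neuron with two inputs (no hidden layer), assuming gradient-descent weight adjustments; the learning rate, random seed and training pairs are invented for illustration only.

```python
import math
import random

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

random.seed(0)
weights = [random.uniform(-0.5, 0.5) for _ in range(2)]   # step 2: small random weights W
bias, lr = 0.0, 0.5
training_set = [([0.0, 1.0], 1.0), ([1.0, 0.0], 0.0)]     # (inputs X, desired output)

for epoch in range(1000):                                  # step 6: keep repeating
    for x, desired in training_set:                        # step 1: inputs X arrive
        v = sum(w * xi for w, xi in zip(weights, x)) + bias
        actual = logistic(v)                               # step 3: output of the neuron
        error = actual - desired                           # step 4: actual - desired output
        delta = error * actual * (1.0 - actual)            # step 5: error signal travels back ...
        weights = [w - lr * delta * xi for w, xi in zip(weights, x)]
        bias -= lr * delta                                 # ... and the weights are adjusted

print(weights, bias)
```

With these toy pairs the error shrinks over the epochs, which is the convergence behaviour the steps above describe.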
Why We Need Backpropagation?
Most prominent advantages of Backpropagation are:

• Backpropagation is fast, simple and easy to program.
• It has no parameters to tune apart from the number of inputs.
• It is a flexible method as it does not require prior knowledge about the network.
• It is a standard method that generally works well.
• It does not need any special mention of the features of the function to be learned.
Types of Backpropagation Networks
Two Types of Backpropagation Networks are:
1. Static Back-propagation
2. Recurrent Backpropagation
1. Static back-propagation:
It is a kind of backpropagation network that produces a mapping from a static input
to a static output. It is useful for solving static classification problems such as
optical character recognition.
2. Recurrent Backpropagation:
Recurrent backpropagation is fed forward until a fixed value is achieved. After that,
the error is computed and propagated backward.
The main difference between these two methods is that the mapping is immediate in
static back-propagation, whereas it is not immediate in recurrent backpropagation.
Training Algorithm
For training, BPN uses the binary sigmoid activation function. The training of BPN
consists of the following three phases.
Phase 1 − Feed Forward Phase
Phase 2 − Back Propagation of error
Phase 3 − Updating of weights
All these steps are combined in the algorithm as follows.
Step 1 − Initialize the following to start the training −
Weights
Learning rate α
For easy calculation and simplicity, take some small random values.
Step 2 − Continue steps 3-11 while the stopping condition is not true.
Step 3 − Continue steps 4-10 for every training pair.
BACK PROPAGATION TRAINING AND CONVERGENCE
For a given training set, back-propagation learning may thus proceed in one of two
basic ways:
1. Sequential Mode.
• The sequential mode of back-propagation learning is also referred to as on-line,
pattern, or stochastic mode.
• In this mode of operation weight updating is performed after the presentation of each
training example.
• One complete presentation of the entire training set during the learning process is
called an epoch.
• The learning process is maintained on an epoch-by-epoch basis until the synaptic
weights and bias levels of the network stabilize and the average squared error over the
entire training set converges to some minimum value.
• To be specific, consider an epoch consisting of N training examples (patterns)
arranged in the order (x(1), d(1)), ..., (x(N), d(N)).
• The first example pair (x(1), d(1)) in the epoch is presented to the network, and the
sequence of forward and backward computations described previously is
performed, resulting in certain adjustments to the synaptic weights and bias levels
of the network.
• Then the second example pair (x(2), d(2)) in the epoch is presented, and the
sequence of forward and backward computations is repeated, resulting in further
adjustments to the synaptic weights and bias levels.
• This process is continued until the last example pair (x(N), d(N)) in the epoch is
accounted for.
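A toy sketch of the sequential mode, using a single linear neuron so that the bookkeeping stays visible; the data and learning rate are made up, and the point is only that the weights are adjusted after every example pair.

```python
# Sequential (on-line) mode: the weights are adjusted after every single training
# example, so an epoch of N examples produces N separate weight updates.
epoch = [([0.0], 0.0), ([1.0], 1.0), ([2.0], 2.0)]   # toy pairs (x(n), d(n))
w, b, lr = 0.0, 0.0, 0.1                             # a single linear neuron, for illustration only

for x, d in epoch:
    y = w * x[0] + b        # forward computation for this example
    e = d - y               # error signal for this example
    w += lr * e * x[0]      # weights (and bias) are adjusted immediately,
    b += lr * e             # before the next example pair is presented
print(w, b)
```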
2. Batch Mode.
• In the batch mode of back-propagation learning, weight updating is performed
after the presentation of all the training examples that constitute an epoch.
• For a particular epoch, we define the cost function as the average squared error

Eav = (1 / (2N)) * Σ(n = 1 to N) Σ(j) ej²(n)

• where the error signal ej(n) pertains to output neuron j for training example n.
• The error ej(n) equals the difference dj(n) − yj(n), where dj(n) is the jth element of
the desired response vector d(n) and yj(n) is the corresponding value of the network
output.
• Here the inner summation with respect to j is performed over all the neurons in the
output layer of the network, whereas the outer summation with respect to n is
performed over the entire training set in the epoch at hand.
• For a learning-rate parameter η, the adjustment applied to synaptic weight wji
connecting neuron i to neuron j is defined by the delta rule:

Δwji = −η * ∂Eav / ∂wji
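For comparison, the same toy neuron trained in batch mode: the squared errors are averaged over the epoch and the delta-rule adjustment is applied once per epoch. The gradient expressions below follow from the linear model assumed in this sketch, not from the notes themselves.

```python
# Batch mode: errors are accumulated over the whole epoch, the cost is the
# average squared error, and the delta-rule adjustment is applied once per epoch.
epoch = [([0.0], 0.0), ([1.0], 1.0), ([2.0], 2.0)]   # toy pairs (x(n), d(n))
w, b, lr = 0.0, 0.0, 0.1
N = len(epoch)

grad_w, grad_b, E_av = 0.0, 0.0, 0.0
for x, d in epoch:
    y = w * x[0] + b
    e = d - y
    E_av += e * e / (2 * N)        # average squared error over the epoch
    grad_w += -e * x[0] / N        # dE_av/dw, accumulated over all examples
    grad_b += -e / N

w -= lr * grad_w                   # delta rule: w <- w - lr * dE_av/dw,
b -= lr * grad_b                   # applied once, after the whole epoch
print(E_av, w, b)
```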
• To give a brief overview of what Neural Networks are: a neural network is simply a
collection of neurons (also known as activations) that are connected through various layers.
• It attempts to learn the mapping of input data to output data, on being provided a training set.
• The training of the neural network later facilitates the predictions made by it on a testing data
of the same distribution.
• This mapping is attained by a set of trainable parameters called weights, distributed over
different layers.
• The weights are learned by the backpropagation algorithm whose aim is to minimize a loss
function.
• A loss function measures how distant the predictions made by the network are from the
actual values.
• Every layer in a neural network is followed by an activation layer that performs some
additional operations on the neurons.
The Universal Approximation Theorem
• Mathematically speaking, any neural network architecture aims at finding any
mathematical function y= f(x) that can map attributes(x) to output(y).
• The accuracy of this function i.e. mapping differs depending on the distribution of
the dataset and the architecture of the network employed.
• The function f(x) can be arbitrarily complex.
• The Universal Approximation Theorem tells us that Neural Networks have a kind
of universality, i.e. no matter what f(x) is, there is a network that can
approximately represent it and do the job! This result holds for any
number of inputs and outputs.
• If we observe the neural network above, considering the input attributes provided
as weight and height, our job is to predict the gender of the person.
• If we exclude all the activation layers from the above network, we realize that h₁
is a linear function of both weight and height with parameters w₁, w₂, and the
bias term b₁.
• Therefore mathematically,
h₁ = w₁*weight + w₂*height + b₁
Similarly,
h₂ = w₃*weight + w₄*height + b₂
• Going along these lines, we realize that o₁ is also a linear function of h₁ and h₂,
and therefore depends linearly on the input attributes weight and height as well.
• An activation layer is applied right after a linear layer in the Neural Network to
provide non-linearities.
• Non-linearities help Neural Networks perform more complex tasks.
• An activation layer operates on activations (h₁, h₂ in this case) and modifies them
according to the activation function provided for that particular activation layer.
• Activation functions are generally non-linear except for the identity function.
• Some commonly used activation functions are ReLU, sigmoid, softmax, etc.
• With the introduction of non-linearities along with linear terms, it becomes possible
for a neural network to model any given function approximately, given
appropriate parameters (w₁, w₂, b₁, etc. in this case); a small sketch of this
linear-plus-activation computation follows this list.
• The parameters converge to appropriateness on training suitably.
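A small sketch of the point made above: without an activation layer, h₁ and h₂ stay linear in the input attributes, and applying a sigmoid right after the linear layer is what introduces the non-linearity. All parameter and attribute values are invented for illustration.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# Illustrative parameters and (normalized) input attributes; the values are made up.
w1, w2, w3, w4, b1, b2 = 0.3, -0.2, 0.5, 0.4, 0.1, -0.1
weight, height = 0.7, 0.6

# Without an activation layer, h1 and h2 (and hence o1) remain linear functions
# of the input attributes, so the whole network collapses to a linear mapping.
h1_linear = w1 * weight + w2 * height + b1
h2_linear = w3 * weight + w4 * height + b2

# Applying a non-linear activation right after the linear layer introduces the
# non-linearity that lets the network model more complex mappings.
h1 = sigmoid(h1_linear)
h2 = sigmoid(h2_linear)
print(h1_linear, h2_linear, h1, h2)
```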
PRACTICAL AND DESIGN ISSUES OF BACK PROPAGATION
LEARNING
• The universal approximation theorem is important from a theoretical viewpoint,
because it provides the necessary mathematical tool for the viability of feed
forward networks with a single hidden layer as a class of approximate solutions.
• Without such a theorem, we could conceivably be searching for a solution that
cannot exist.
• However, the theorem is not constructive, that is, it does not actually specify how
to determine a multilayer perceptron with the stated approximation properties.
• The universal approximation theorem assumes that the continuous function to be
approximated is given and that a hidden layer of unlimited size is available for the
approximation.
• Both of these assumptions are violated in most practical applications of multilayer
perceptrons.
• The problem with multilayer perceptrons using a single hidden layer is that the
neurons therein tend to interact with each other globally.
• In complex situations this interaction makes it difficult to improve the
approximation at one point without worsening it at some other point.
• On the other hand, with two hidden layers the approximation process becomes more
manageable. In particular, we may proceed as follows:
1. Local features are extracted in the first hidden layer. Specifically, some neurons
in the first hidden layer are used to partition the input space into regions, and other
neurons in that layer learn the local features characterizing those regions.
2. Global features are extracted in the second hidden layer. Specifically, a neuron in
the second hidden layer combines the outputs of neurons in the first hidden layer
operating on a particular region of the input space, and thereby learns the global
features for that region and outputs zero elsewhere.
