Single Neuron Model

The document outlines the step-by-step working of Artificial Neural Networks (ANNs), detailing the processes of forward propagation and back propagation, including the role of weights, biases, and activation functions. It explains how errors are minimized using Gradient Descent, describes various types of activation functions, and highlights different types of neural networks: feed-forward, radial basis function, Kohonen self-organizing, recurrent, and convolutional neural networks.


Step by Step Working of the Artificial Neural Network

SCO 411
Dr. Elijah Maseno

1. In the first step, the input units are passed, i.e. the data,
with some weights attached to it, is passed to the hidden
layer. We can have any number of hidden layers. The inputs
x1, x2, x3, …, xn are passed in.
2. Each hidden layer consists of neurons. All the inputs are
connected to each neuron.
3. After passing on the inputs, all the computation is
performed in the hidden layer.
The computation performed in the hidden layers is done in two
steps, which are as follows:
• First of all, the inputs are multiplied by their weights. A
weight is the coefficient of each input variable; it shows the
strength of that particular input. After the weighted inputs
are summed, a bias is added. The bias is a constant that
helps the model fit the data as well as possible.
Z1 = W1*In1 + W2*In2 + W3*In3 + W4*In4 + W5*In5 + b
W1, W2, W3, W4, W5 are the weights assigned to the inputs
In1, In2, In3, In4, In5, and b is the bias.
• Then, in the second step, the activation function is applied
to the linear combination Z1. The activation function is a
nonlinear transformation that is applied to this value before
it is sent to the next layer of neurons. The importance of
the activation function is to inculcate nonlinearity into the
model. Both steps are sketched in code below.
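As an illustration, here is a minimal Python sketch of both steps for a single neuron; the input, weight, and bias values are made-up examples, and sigmoid is just one possible choice of activation:

import math

def neuron_output(inputs, weights, bias):
    # Step 1: weighted sum of the inputs plus the bias (Z1 above)
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Step 2: apply a nonlinear activation (sigmoid, as an example)
    return 1.0 / (1.0 + math.exp(-z))

inputs = [0.5, 0.1, 0.9, 0.3, 0.7]     # In1..In5 (example values)
weights = [0.4, -0.2, 0.1, 0.8, -0.5]  # W1..W5 (example values)
bias = 0.05                            # b
print(neuron_output(inputs, weights, bias))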

4. The whole process described in point 3 is performed in each
hidden layer. After passing through every hidden layer, we
move to the last layer, i.e. the output layer, which gives us
the final output.
The process explained above is known as forward propagation.
5. After getting the predictions from the output layer, the error is
calculated, i.e. the difference between the actual and the predicted
output.
If the error is large, steps are taken to minimize it, and for this
purpose Back Propagation is performed.
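For instance, with the squared-error measure (one common choice; the slides do not fix a particular error function), the error over a set of outputs can be computed like this:

def mean_squared_error(actual, predicted):
    # Average squared difference between actual and predicted outputs
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

print(mean_squared_error([1.0, 0.0], [0.8, 0.3]))  # about 0.065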

What is Back Propagation and How Does it Work?
• In the network or model, each link is assigned some weight.
• A weight is simply a number that controls the signal
between two neurons. If the network generates a “good
or desired” output, there is no need to adjust the weights.
• However, if the network generates a “poor or undesired”
output, i.e. an error, then the system updates the weights in
order to improve subsequent results.
Back Propagation with Gradient
Descent
Gradient Descent is one of the optimizers that helps in
calculating the new weights. Let’s understand step by step
how Gradient Descent optimizes the cost function.
Picture the cost function as a curve: our aim is to minimize
the error so that Jmin, i.e. the global minimum, is reached.

• First, the weights are initialized randomly, i.e. random values
of the weights and intercepts are assigned to the model
during forward propagation, and the errors are calculated
after all the computation. (As discussed above.)
• Then the gradient is calculated, i.e. the derivative of the error
w.r.t. the current weights.
• Then the new weights are calculated using the standard
gradient descent update,
  W_new = W_old - a * (dError / dW_old),
where a is the learning rate, the parameter also known as
step size, which controls the speed or steps of the
backpropagation. It gives additional control over how fast we
want to move along the curve to reach the global minimum.

• This process of calculating the new weights, then the errors with
the new weights, and then updating the weights again continues until
we reach the global minimum and the loss is minimized. A sketch of
this loop in code follows below.
A point to note here is that the learning rate, i.e. a in our
weight-update equation, should be chosen wisely. The learning
rate is the amount of change, or step size, taken towards
reaching the global minimum. It should not be very small, since
convergence would then take a long time, and it should not be so
large that the global minimum is never reached at all. Therefore,
the learning rate is a hyperparameter that we have to choose
based on the model.
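A minimal sketch of the update loop, assuming a single weight and a simple made-up cost function for illustration:

def gradient_descent(gradient_fn, w, learning_rate=0.1, steps=100):
    # Repeatedly move the weight against the gradient of the error
    for _ in range(steps):
        w = w - learning_rate * gradient_fn(w)
    return w

# Example cost J(w) = (w - 3)^2 has derivative dJ/dw = 2 * (w - 3)
# and its global minimum at w = 3
print(gradient_descent(lambda w: 2 * (w - 3), w=0.0))  # converges near 3.0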

Activation Functions
• Activation functions are attached to each neuron. They are
mathematical functions that determine whether a neuron should
be activated or not, based on whether the neuron’s input is
relevant for the model’s prediction. The purpose of the
activation function is to introduce nonlinearity into the model.

Various types of activation functions are:

• Sigmoid Activation Function
• TanH / Hyperbolic Tangent Activation Function
• Rectified Linear Unit Function (ReLU)
• Leaky ReLU
• Softmax
Sigmoid Function
The sigmoid function is used when the model is predicting a
probability, since its output always lies between 0 and 1.
ReLU (Rectified Linear Unit) Function
• The ReLU function passes its input through unchanged when the
input is positive, and outputs 0 when the input is less than 0,
i.e. f(x) = max(0, x). The ReLU function is the most commonly
used these days.
Hyperbolic Tangent Function
The hyperbolic tangent function is similar to the sigmoid
function but has a range of -1 to 1.
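Each of these can be written in a few lines of Python (a simple sketch using only the standard math module):

import math

def sigmoid(x):
    # Maps any real input into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    # Passes positive inputs through unchanged; negative inputs become 0
    return max(0.0, x)

def tanh(x):
    # Similar shape to sigmoid, but with a range of (-1, 1)
    return math.tanh(x)

print(sigmoid(0.0), relu(-2.0), tanh(1.0))  # 0.5 0.0 0.7615...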
Types of Neural Networks
The different types of neural networks are discussed below:
• Feed-forward Neural Network: This is the simplest form of
ANN (artificial neural network); data travels in only one
direction (input to output). This is the example we just looked
at. When you actually use it, it’s fast; training it, however,
takes a while. Almost all vision and speech recognition
applications use some form of this type of neural network
(a small sketch follows this list).
• Radial Basis Function Neural Network: This model classifies
a data point based on its distance from a center point. If you
don’t have labeled training data, for example, you’ll want to
group things and create a center point. The network looks for
data points that are similar to each other and groups them. One
application of this is power restoration systems.
• Kohonen Self-organizing Neural Network: Vectors of random
input are presented to a discrete map composed of neurons. These
vectors are also called dimensions or planes. Applications include
recognizing patterns in data, such as in medical analysis.
• Recurrent Neural Network: In this type, the hidden layer saves
its output to be used in future predictions; the output becomes
part of the next input. Applications include text-to-speech
conversion.
• Convolutional Neural Network: In a convolutional neural network,
the input features are taken in batches, as if passing through a
filter. This allows the network to remember an image in parts.
Applications include signal and image processing, such as facial
recognition.
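To make the feed-forward idea concrete, here is a minimal sketch of one forward pass through a network with a single hidden layer; the layer sizes, weights, and biases are arbitrary example values:

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    # Each neuron: weighted sum of the inputs plus a bias, then activation
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Two inputs -> two hidden neurons -> one output; data flows one way only
x = [0.5, 0.9]
hidden = layer(x, [[0.1, 0.4], [0.7, -0.2]], [0.0, 0.1])
output = layer(hidden, [[0.6, -0.3]], [0.05])
print(output)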
