Artificial Neural Networks

by
Dr. A. Sharmila, SELECT

ANN

According to the father of Artificial Intelligence, John McCarthy, it is

"The science and engineering of making intelligent machines, especially intelligent computer programs."

Artificial Intelligence is a way of making a computer, a computer-controlled robot, or software think intelligently, in a manner similar to how intelligent humans think.

AI is accomplished by studying how the human brain thinks, and how humans learn, decide, and work while trying to solve a problem, and then using the outcomes of this study as a basis for developing intelligent software and systems.

From a practical point of view, an ANN is just a parallel computational system consisting of many simple processing elements connected together in a specific way in order to perform a particular task.
Why Study Artificial Neural Networks?
 They are extremely powerful computational devices (Turing equivalent, universal computers).

 Massive parallelism makes them very efficient.

 They can learn and generalize from training data – so there is no need for enormous feats of programming.

 They are particularly fault tolerant – this is equivalent to the "graceful degradation" found in biological systems.

 They are very noise tolerant – so they can cope with situations where normal symbolic systems would have difficulty.

 In principle, they can do anything a symbolic/logic system can do, and more. (In practice, getting them to do it can be rather difficult…)
What are Artificial Neural Networks Used for?
 As with the field of AI in general, there are two basic goals for neural network research:

 Brain modeling: the scientific goal of building models of how real brains work.
 This can potentially help us understand the nature of human intelligence, formulate better teaching strategies, or better remedial actions for brain-damaged patients.

 Artificial system building: the engineering goal of building efficient systems for real-world applications.
 This may make machines more powerful, relieve humans of tedious tasks, and may even improve upon human performance.
What are Artificial Neural Networks Used for?
 Brain modeling
  Models of human development – help children with developmental problems
  Simulations of adult performance – aid our understanding of how the brain works
  Neuropsychological models – suggest remedial actions for brain-damaged patients

 Real-world applications
  Financial modeling – predicting stocks, shares, currency exchange rates
  Other time series prediction – climate, weather, airline marketing tactician
  Computer games – intelligent agents, backgammon, first-person shooters
  Control systems – autonomous adaptable robots, microwave controllers
  Pattern recognition – speech recognition, hand-writing recognition, sonar signals
  Data analysis – data compression, data mining
  Noise reduction – function approximation, ECG noise reduction
  Bioinformatics – protein secondary structure, DNA sequencing
Learning in Neural Networks

 There are many forms of neural networks. Most operate by passing neural 'activations' through a network of connected neurons.

 One of the most powerful features of neural networks is their ability to learn and generalize from a set of training data. They adapt the strengths/weights of the connections between neurons so that the final output activations are correct.
Learning in Neural Networks

 There are three broad types of learning:

1. Supervised learning (i.e. learning with a teacher)
2. Reinforcement learning (i.e. learning with limited feedback)
3. Unsupervised learning (i.e. learning with no help)
A Brief History
 1943 McCulloch and Pitts proposed the McCulloch-Pitts neuron model

 1949 Hebb published his book The Organization of Behavior, in which the Hebbian learning rule was proposed.

 1958 Rosenblatt introduced the simple single layer networks now called Perceptrons.

 1969 Minsky and Papert’s book Perceptrons demonstrated the limitation of single layer perceptrons, and almost
the whole field went into hibernation.

 1982 Hopfield published a series of papers on Hopfield networks.

 1982 Kohonen developed the Self-Organizing Maps that now bear his name.

 1986 The Back-Propagation learning algorithm for Multi-Layer Perceptrons was re-discovered and the whole
field took off again.

 1990s The sub-field of Radial Basis Function Networks was developed.

 2000s The power of Ensembles of Neural Networks and Support Vector Machines becomes apparent.
Overview
 Artificial Neural Networks are powerful computational systems consisting of many simple processing elements connected together to perform tasks analogously to biological brains.

 They are massively parallel, which makes them efficient, robust, fault tolerant and noise tolerant.

 They can learn from training data and generalize to new situations.

 They are useful for brain modeling and real-world applications involving pattern recognition, function approximation, prediction, …
The Nervous System

 The human nervous system can be divided into three stages that may be represented in block diagram form as:

 Stimulus → Receptors → Neural Network → Effectors → Response

 The receptors collect information from the environment – e.g. photons on the retina.
 The effectors generate interactions with the environment – e.g. activate muscles.
 The flow of information/activation is represented by arrows – feed forward and feedback.
Levels of Brain Organization

 The brain contains both large scale and small scale anatomical
structures and different functions take place at higher and lower
levels. There is a hierarchy of interwoven levels of organization:
1. Molecules and Ions
2. Synapses
3. Neuronal microcircuits
4. Dendritic trees
5. Neurons
6. Local circuits
7. Inter-regional circuits
8. Central nervous system

 The ANNs we study in this module are crude approximations to levels 5 and 6.
Brains vs. Computers
 There are approximately 10 billion neurons in the human cortex, compared with tens of thousands of processors in the most powerful parallel computers.

 Each biological neuron is connected to several thousands of other neurons, similar to the connectivity in powerful parallel computers.

 Lack of processing units can be compensated by speed. The typical operating speed of biological neurons is measured in milliseconds (10⁻³ s), while a silicon chip can operate in nanoseconds (10⁻⁹ s).

 The human brain is extremely energy efficient, using approximately 10⁻¹⁶ joules per operation per second, whereas the best computers today use around 10⁻⁶ joules per operation per second.

 Brains have been evolving for tens of millions of years; computers have been evolving for tens of decades.
Structure of a Human Brain

The Human Brain
• The brain contains about 10¹⁰ basic units called neurons. Each neuron, in turn, is connected to about 10⁴ other neurons.
• A neuron is a small cell that receives electro-chemical signals from its various sources and in turn responds by transmitting electrical impulses to other neurons.
Training vs. Inference

• Training: acquiring knowledge

• Inference: solving a problem using the acquired knowledge
Biological Neural Networks

 (Figure: two biological neurons connected via synapses, with dendrites, soma and axon labeled.)

 A biological neuron has three main types of components: dendrites, soma (or cell body) and axon.

 The majority of neurons encode their outputs or activations as a series of brief electrical pulses (i.e. spikes or action potentials).

 Dendrites are the receptive zones that receive activation from other neurons, i.e. they accept input.

 The cell body (soma) of the neuron processes the incoming activations and converts them into output activations, i.e. it processes the inputs.

 Axons are transmission lines that send activation to other neurons, i.e. they turn the processed inputs into outputs.

 Synapses allow weighted transmission of signals (using neurotransmitters) between axons and dendrites to build up large neural networks. A synapse is an electrochemical contact between neurons.
Neural network: Definition

• Neural network: an information processing paradigm inspired by biological nervous systems, such as our brain.

• Structure: a large number of highly interconnected processing elements (neurons) working together.

• Like people, they learn from experience (by example).
Artificial Neural Network: Definition

• The idea of ANN: NNs learn the relationship between cause and effect, or organize large volumes of data into orderly and informative patterns.

• Definition of ANN: "A data processing system consisting of a large number of simple, highly interconnected processing elements (artificial neurons) in an architecture inspired by the structure of the cerebral cortex of the brain" (Tsoukalas & Uhrig, 1997).
Artificial Neurons
• ANNs have been developed as generalizations of
mathematical models of neural biology, based on the
assumptions that:
1. Information processing occurs at many simple elements called neurons.
2. Signals are passed between neurons over connection links.
3. Each connection link has an associated weight, which, in a typical neural net, multiplies the signal transmitted.
4. Each neuron applies an activation function to its net input to determine its output signal.
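Taken together, these four assumptions translate almost directly into code. Below is a minimal sketch of a single artificial neuron in Python (the function names, example weights and the choice of step/sigmoid activations are illustrative assumptions, not from the slides):

```python
import math

def step(x, threshold=0.0):
    """Binary step activation: fire (1) if the net input reaches the threshold."""
    return 1 if x >= threshold else 0

def sigmoid(x):
    """Binary sigmoid activation: squashes the net input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def artificial_neuron(inputs, weights, activation=step):
    """Assumptions 1-4: weighted links, summed net input, activation function."""
    net_input = sum(w * x for w, x in zip(weights, inputs))   # assumption 3
    return activation(net_input)                              # assumption 4

# Example: two inputs with illustrative weights 0.5 and -0.3
print(artificial_neuron([1, 1], [0.5, -0.3]))           # step output: 1
print(artificial_neuron([1, 1], [0.5, -0.3], sigmoid))  # graded output ~ 0.55
```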
Analogy of ANN With Biological NN
 (Figure: two biological neurons connected via a synapse, with dendrites, soma and axon labeled.)

• Dendrites receive signals from other neurons.
• The soma sums the incoming signals. When sufficient input is received, the cell fires; that is, it transmits a signal over its axon to other cells.

• Associated terminologies of biological and artificial neural nets:

 Biological Neural Network → Artificial Neural Network
 Cell Body (Soma) → Neuron
 Dendrite → Input
 Synapse → Weight
 Axon → Output
Typical Architecture of ANNs
• A typical neural network contains a large number of artificial neurons
called units arranged in a series of layers.

Typical Architecture of ANNs (cont.)
• Input layer — contains those units (artificial neurons) which receive input from the outside world, on which the network will learn, recognize, or otherwise process.
• Output layer — contains units that give the network's response, reflecting what it has learned about the task.
• Hidden layer — these units sit between the input and output layers. The job of a hidden layer is to transform the input into something that the output units can use.

 Most neural networks are fully connected, which means each hidden neuron is connected to every neuron in the previous (input) layer and in the next (output) layer.
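Concretely, a forward pass through such a fully connected network is just a matrix-vector product per layer followed by an activation. A minimal NumPy sketch, assuming an arbitrary 3-2-1 layout and random illustrative weights:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# A 3-2-1 network: 3 input units, one hidden layer of 2 units, 1 output unit.
W_hidden = rng.normal(size=(2, 3))   # weights from input layer to hidden layer
W_output = rng.normal(size=(1, 2))   # weights from hidden layer to output layer

def forward(x):
    hidden = sigmoid(W_hidden @ x)       # hidden layer transforms the input
    return sigmoid(W_output @ hidden)    # output layer produces the response

print(forward(np.array([1.0, 0.0, 1.0])))
```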
Popular ANNs Architectures (Sample)

http://www.asimovinstitute.org/neural-network-zoo/
Popular ANNs Architectures (cont.)
 Single layer perceptron — a neural network having two input units and one output unit, with no hidden layers.
 Multilayer perceptron — these networks use one or more hidden layers of neurons, unlike the single layer perceptron. They are also known as deep feedforward neural networks.
 Hopfield network — a fully interconnected network of neurons in which each neuron is connected to every other neuron. The network is trained with an input pattern by setting the values of the neurons to the desired pattern; then its weights are computed. The weights are not changed afterwards. Once trained for one or more patterns, the network will converge to the learned patterns.
Popular ANNs Architectures (cont.)
Deep learning neural network — a feedforward neural network with a big structure (many hidden layers and a large number of neurons in each layer), used for deep learning.
Recurrent neural network — a type of neural network in which hidden layer neurons have self-connections. Recurrent neural networks possess memory: at any instant, a hidden layer neuron receives activation from the layer below as well as its own previous activation value.
Long Short-Term Memory network (LSTM) — a type of neural network in which a memory cell is incorporated inside the hidden layer neurons.
Convolutional neural network — a class of deep, feed-forward artificial neural networks that has been successfully applied to analyzing visual imagery.
How are ANNs being used in solving problems?
• The problem variables are mainly: inputs, weights and outputs.
• Examples (training data) represent a solved problem, i.e. both the inputs and outputs are known.
• There are many different algorithms that can be used when training artificial neural networks, each with its own advantages and disadvantages.
• The learning process within ANNs is a result of altering the network's weights and biases (thresholds) with some kind of learning algorithm.
• The objective is to find a set of weight matrices which, when applied to the network, map any input to a correct output.
• For a new problem, we then have the inputs and the weights, so we can easily compute the outputs.
Learning Techniques in ANNs

Supervised Learning

• In supervised learning, the training data is input to the network and the desired output is known; weights and biases are adjusted until the output yields the desired value.

Unsupervised Learning

• The input data is used to train the network, but the desired output is unknown. The network classifies the input data and adjusts the weights by extracting features of the input data.

Reinforcement Learning

• Here the desired value of the output is unknown, but the network receives feedback on whether its output is right or wrong. It is a form of semi-supervised learning.
Learning Algorithms
Learning rules differ in what drives the weight update:

• Hebbian learning — depends on the input–output correlation:

 w = Σ_{j=1}^{m} X_j Y_j^T

• Gradient descent — minimization of the error E:

 Δw_ij = −η ∂E/∂w_ij

• Competitive learning — only the output neuron with the highest net input is updated (winner-take-all strategy).

• Stochastic learning — weights are adjusted in a probabilistic fashion, e.g. simulated annealing.
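As an illustration of the Hebbian rule above, the outer-product update can be written in a few lines of NumPy (a sketch; the bipolar example vectors and the chosen orientation of the outer product are my assumptions):

```python
import numpy as np

def hebbian_weights(inputs, outputs):
    """Accumulate w as a sum of outer products over all training pairs."""
    n_out, n_in = outputs.shape[1], inputs.shape[1]
    w = np.zeros((n_out, n_in))
    for x, y in zip(inputs, outputs):
        w += np.outer(y, x)  # y x^T: correlation of each output with each input
    return w

# Two training pairs with 3 inputs and 2 outputs each
X = np.array([[1, -1, 1], [-1, 1, 1]])
Y = np.array([[1, -1], [-1, 1]])
print(hebbian_weights(X, Y))
```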
Learning Algorithms
Gradient Descent

• This is the simplest training algorithm used in supervised training models.
• If the actual output differs from the target output, the difference, or error, is found.
• The gradient descent algorithm changes the weights of the network in such a manner as to minimize this error.

Back propagation

• It is an extension of the gradient-based delta learning rule.
• Here, after finding the error (the difference between the desired and actual output), the error is propagated backward from the output layer to the input layer via the hidden layers.
• It is used in the case of multilayer neural networks.
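A minimal sketch of gradient descent for a single linear neuron (the delta rule that back propagation extends to hidden layers); the learning rate, epoch count and example data are illustrative assumptions:

```python
import numpy as np

def train_delta_rule(X, t, lr=0.1, epochs=50):
    """Gradient descent on squared error for one linear neuron: y = w . x."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, target in zip(X, t):
            y = w @ x               # forward pass
            error = target - y      # difference between desired and actual
            w += lr * error * x     # move the weights downhill on the error
    return w

# Learn y = 2*x1 - 1*x2 from four consistent examples
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
t = np.array([2.0, -1.0, 1.0, 3.0])
print(train_delta_rule(X, t))  # approaches [2, -1]
```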
Learning Data Sets in ANN
Training set

• A training dataset is a set of examples used for learning, that is, to fit the parameters (e.g., weights) of a classifier.
• One epoch comprises one full training cycle on the training set.

Validation set (development set)

• A validation dataset is a set of examples used to tune the hyperparameters (e.g., the number of hidden units) of a classifier.
• The validation set should follow the same probability distribution as the training dataset.

Test set

• A test set is a set of examples used only to assess the performance (i.e. generalization) of a fully specified classifier.
• A better fit to the training dataset than to the test dataset usually points to overfitting.
• The test set should follow the same probability distribution as the training dataset.
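In practice, the three sets are typically obtained by randomly partitioning the available examples. A sketch, assuming a common (but not universal) 60/20/20 split:

```python
import numpy as np

def split_dataset(X, y, train=0.6, val=0.2, seed=0):
    """Shuffle, then carve the data into training, validation and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(train * len(X))
    n_val = int(val * len(X))
    tr, va, te = np.split(idx, [n_train, n_train + n_val])
    return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])

X, y = np.arange(20).reshape(10, 2), np.arange(10)
train_set, val_set, test_set = split_dataset(X, y)
print(len(train_set[0]), len(val_set[0]), len(test_set[0]))  # 6 2 2
```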
Applications of ANNs
• Signal processing
• Pattern recognition, e.g. handwritten character or face identification
• Diagnosis, or mapping symptoms to a medical case
• Speech recognition
• Human emotion detection
• Educational loan forecasting
• Computer vision
• Deep learning
The McCulloch-Pitts Neuron

 The First Artificial Neuron

 As mentioned in the research history, McCulloch and Pitts (1943) produced the first neural network, which was based on their artificial neuron. Although this work was developed in the early forties, many of the principles can still be seen in the neural networks of today.

We can make the following statements about a McCulloch-Pitts network:

 The activation of a neuron is binary. That is, the neuron either fires (activation of one) or does not fire (activation of zero).
 For the network shown in the figure, the activation function for unit Y is

  f(y_in) = 1 if y_in ≥ T, else 0

 where y_in is the total input signal received and T is the threshold for Y.
 Neurons in a McCulloch-Pitts network are connected by directed, weighted paths.
 If the weight on a path is positive the path is excitatory, otherwise it is inhibitory.
 All excitatory connections into a particular neuron have the same weight, although differently weighted connections can be input to different neurons.
 Each neuron has a fixed threshold. If the net input into the neuron is greater than the threshold, the neuron fires.
 The threshold is set such that any non-zero inhibitory input will prevent the neuron from firing.
 It takes one time step for a signal to pass over one connection.
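These statements can be condensed into a short sketch (an illustration, not code from the slides; the absolute-inhibition rule implements the statement that any non-zero inhibitory input prevents firing):

```python
def mcculloch_pitts(excitatory, inhibitory, threshold):
    """McCulloch-Pitts unit: binary inputs, fixed threshold, absolute inhibition."""
    if any(inhibitory):          # any active inhibitory input vetoes firing
        return 0
    y_in = sum(excitatory)       # all excitatory weights are equal (taken as 1)
    return 1 if y_in >= threshold else 0

# Two excitatory inputs, threshold 2: fires only when both are active (AND)
print(mcculloch_pitts([1, 1], [], 2))   # 1
print(mcculloch_pitts([1, 0], [], 2))   # 0
print(mcculloch_pitts([1, 1], [1], 2))  # 0 - inhibitory input blocks firing
```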
The McCulloch-Pitts Neuron
 This vastly simplified model of real neurons is also known as a Threshold Logic Unit:
 A set of synapses (i.e. connections) brings in activations from other neurons.
 A processing unit sums the inputs, and then applies a non-linear activation function (i.e. squashing/transfer/threshold function).
 An output line transmits the result to other neurons.
Networks of McCulloch-Pitts Neurons
 Artificial neurons have the same basic components as biological neurons. The simplest ANNs consist of a set of McCulloch-Pitts neurons labeled by indices k, i, j, with activation flowing between them via synapses with strengths w_ki, w_ij.
Some Useful Notation
 We often need to talk about ordered sets of related numbers – we call them vectors, e.g.
 x = (x₁, x₂, x₃, …, xₙ), y = (y₁, y₂, y₃, …, yₘ)

 The components xᵢ can be added up to give a scalar (number), e.g.
 s = x₁ + x₂ + x₃ + … + xₙ = Σ_{i=1}^{n} xᵢ

 Two vectors of the same length may be added to give another vector, e.g.
 z = x + y = (x₁ + y₁, x₂ + y₂, …, xₙ + yₙ)

 Two vectors of the same length may be multiplied to give a scalar (the dot product), e.g.
 p = x·y = x₁y₁ + x₂y₂ + … + xₙyₙ = Σ_{i=1}^{n} xᵢyᵢ
Some Useful Functions

 Common activation functions

 Identity function
 f(x) = x for all x

 Binary step function (with threshold θ)
 f(x) = 1 if x ≥ θ; 0 if x < θ
Some Useful Functions

 Binary sigmoid
 f(x) = 1 / (1 + e⁻ˣ)

 Bipolar sigmoid
 g(x) = 2f(x) − 1 = (1 − e⁻ˣ) / (1 + e⁻ˣ)
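Each of these functions is a one-liner in code. The sketch below also checks the stated identity g(x) = 2f(x) − 1 numerically:

```python
import math

def identity(x):
    return x

def binary_step(x, theta=0.0):
    return 1 if x >= theta else 0

def binary_sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bipolar_sigmoid(x):
    return (1.0 - math.exp(-x)) / (1.0 + math.exp(-x))

# The bipolar sigmoid is a rescaled binary sigmoid: g(x) = 2 f(x) - 1
x = 0.7
assert abs(bipolar_sigmoid(x) - (2 * binary_sigmoid(x) - 1)) < 1e-12
```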
The McCulloch-Pitts Neuron Equation
 Using the above notation, we can now write down a simple equation for the output of a McCulloch-Pitts neuron as a function of its n inputs inᵢ:

 out = 1 if Σ_{i=1}^{n} inᵢ ≥ θ, else 0

 where θ is the neuron's threshold.
Review
 Biological neurons, consisting of a cell body, axons, dendrites and synapses, are able to process and transmit neural activation.

 The McCulloch-Pitts neuron model (Threshold Logic Unit) is a crude approximation to real neurons that performs a simple summation and thresholding function on activation levels.

 Appropriate mathematical notation facilitates the specification and programming of artificial neurons and networks of artificial neurons.
Networks of McCulloch-Pitts Neurons
 One neuron can't do much on its own. Usually we will have many neurons labeled by indices k, i, j, with activation flowing between them via synapses with strengths w_ki, w_ij.

The Artificial Neuron
The McCulloch-Pitts Neuron
• In the context of neural networks, a McCulloch-Pitts unit is an artificial neuron using the step function as the activation function.
• It is also called a Threshold Logic Unit.
• Threshold step function:

 F(x) = 0 for x < T; 1 for x ≥ T
The McCulloch-Pitts Neuron (cont.)

• In simple words, the output of the McCulloch-Pitts neuron equals 1 if the weighted sum of its inputs is greater than or equal to T; otherwise the output equals zero, where T is a threshold value.
Example 1
• If a McCulloch-Pitts neuron has 3 inputs (x1 = 1, x2 = 1, x3 = 1), the weights are (w1 = 1, w2 = −1, w3 = −1), and there is no bias, find the output.

 (Figure: inputs X1, X2, X3 feed a unit with threshold T; the indicated output is 0.)

 Sum = (1 × 1) + (1 × (−1)) + (1 × (−1)) + 0 = −1
 The net input −1 is below the threshold T, so the output = 0.
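The same computation in code (a sketch; the slide leaves the threshold T unspecified, so any T > −1 gives the same result — 0 is used here):

```python
def mp_output(xs, ws, threshold=0.0, bias=0.0):
    """Weighted sum plus bias, then threshold step."""
    net = sum(x * w for x, w in zip(xs, ws)) + bias
    return 1 if net >= threshold else 0

# Example 1: net input = 1*1 + 1*(-1) + 1*(-1) = -1, so the unit stays off
print(mp_output([1, 1, 1], [1, -1, -1]))  # 0
```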
Features of McCulloch-Pitts model

• Allows binary (0, 1) states only.
• Operates under a discrete-time assumption.
• Weights and the neurons' thresholds are fixed in the model, and there is no interaction among network neurons (no learning).
• It is just a primitive model.
• We can use multiple layers of McCulloch-Pitts neurons to implement the basic logic gates, as sketched below. All we need to do is find the appropriate connection weights and neuron thresholds to produce the right outputs for each set of inputs.
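For instance, single McCulloch-Pitts units with suitably chosen weights and thresholds realize AND, OR and NOT, and two layers give XOR (the specific weight/threshold choices below are standard textbook values, given here as an illustration):

```python
def mp_unit(xs, ws, threshold):
    """McCulloch-Pitts unit: thresholded weighted sum of binary inputs."""
    return 1 if sum(x * w for x, w in zip(xs, ws)) >= threshold else 0

def AND(a, b):  # fires only when both inputs are on
    return mp_unit([a, b], [1, 1], threshold=2)

def OR(a, b):   # fires when at least one input is on
    return mp_unit([a, b], [1, 1], threshold=1)

def NOT(a):     # inhibitory weight flips the input
    return mp_unit([a], [-1], threshold=0)

# XOR needs two layers: x XOR y = (x AND NOT y) OR (y AND NOT x)
def XOR(a, b):
    return OR(AND(a, NOT(b)), AND(b, NOT(a)))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b), XOR(a, b))
```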
Activation Functions
• Assume S is the net input produced by a neuron, i.e. the weighted sum of its inputs.

• S can be anything, ranging from −∞ to +∞. The neuron really doesn't know the bounds of the value, so how do we decide whether the neuron should fire or not (output = 1 or 0)?
• So we add "activation functions" for this purpose: to check the S value produced by a neuron and decide whether outside connections should consider this neuron as "fired" or not – or rather, let's say, "activated" or not.
• The activation function serves as a threshold and is also called a "transfer function".
Activation Functions (cont.)

• Activation functions can be broadly divided into 2 types:

1. Linear activation function
2. Non-linear activation functions
 (unit step, sigmoid, tanh, ReLU, leaky ReLU, softmax, …)

• In most cases, activation functions are non-linear; that is, the role of the activation function is to make neural networks non-linear.
Activation Functions (cont.)

• There have been many kinds of activation functions (over 640 different
activation function proposals) that have been proposed over the years.
• However, best practice confines the use to only a limited kind of
activation functions.
• Next we will explore the most important and widely used activation
function.

• But, the most important question is ”how do we know which one to


use?”.
• Answer: Depending on best practice and nature of the problem
50
Popular Activation Functions

• Linear or identity
• Step activation function (previously explained)
• Sigmoid or logistic activation function
• Tanh or hyperbolic tangent activation function
• ReLU (Rectified Linear Unit) activation function
• Leaky ReLU
• Softmax function

https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6
Linear or Identity Activation Function
• The function is a line, i.e. linear.
• Therefore, the output of the function is not confined to any range.
Step Activation Function
• Used in the McCulloch-Pitts neuron:

 F(x) = 0 for x < T; 1 for x ≥ T

• The hard limiter activation function is a special case of the step function with threshold 0:

 F(x) = hardlim(x) = 0 for x < 0; 1 for x ≥ 0

• The sign activation function is a special case of the step function with outputs −1 and +1:

 F(x) = sign(x) = −1 for x < 0; +1 for x ≥ 0
Sigmoid or Logistic Activation Function
• The sigmoid function curve looks like an S-shape.
• It exists between 0 and 1, and is used in binary classifiers to predict a probability (0 to 1) as the output.

 f(x) = 1 / (1 + e⁻ˣ)
Tanh or hyperbolic tangent Activation Function

• The tanh function curve also looks like an S-shape.
• The range of the tanh function is from −1 to 1.

 f(x) = tanh(x) = (eˣ − e⁻ˣ) / (eˣ + e⁻ˣ)
ReLU (Rectified Linear Unit) Activation Function

• The ReLU is the most widely used activation function.
• Any negative input given to the ReLU activation function is turned into zero immediately.

 f(x) = max(0, x)
Leaky ReLU

• Leaky ReLUs allow a small, non-zero gradient when the unit is not active (negative values).
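A sketch of both rectifiers side by side (the leak coefficient 0.01 is a common default, assumed here rather than taken from the slides):

```python
def relu(x):
    """Zero for negative inputs, identity for non-negative inputs."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Like ReLU, but passes a small fraction of negative inputs through."""
    return x if x >= 0 else alpha * x

print(relu(-2.0), relu(3.0))              # 0.0 3.0
print(leaky_relu(-2.0), leaky_relu(3.0))  # -0.02 3.0
```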
Softmax Activation Function

• The softmax function is a generalization of the sigmoid (logistic) function.
• The softmax function is used in multiclass classification methods: it turns a vector of scores into probabilities that sum to 1.
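The slides give no formula, so for reference the standard definition is softmax(x)ᵢ = e^{xᵢ} / Σⱼ e^{xⱼ}; a sketch with the usual max-subtraction for numerical stability:

```python
import numpy as np

def softmax(scores):
    """Exponentiate and normalize so the outputs form a probability distribution."""
    shifted = scores - np.max(scores)   # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs, probs.sum())  # approx [0.659 0.242 0.099], sums to 1.0
```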
