Introduction To Artificial Neural Networks and Perceptron
• The inputs and output are now numbers (instead of binary on/off
values) and each input connection is associated with a weight.
• wi,j is the connection weight between the ith input neuron and the
jth output neuron.
• xi is the ith input value of the current training instance.
• ŷj is the output of the jth output neuron for the current training instance.
• yj is the target output of the jth output neuron for the current training
instance.
• η is the learning rate.
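For reference, the Perceptron weight-update (learning) rule that uses these quantities is commonly written, in LaTeX notation, as

    w_{i,j}^{(\text{next step})} = w_{i,j} + \eta \, (y_j - \hat{y}_j) \, x_i

i.e., each weight is nudged in proportion to its input xi and to the output error (yj – ŷj).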
• The decision boundary of each output neuron is linear, so Perceptrons
are incapable of learning complex patterns (just like Logistic Regression
classifiers).
The Exclusive OR problem
Minsky & Papert (1969) offered a solution to the XOR problem by
combining perceptron unit responses using a second layer of
units: piecewise linear classification using an MLP with
threshold (perceptron) units.
• In particular, an MLP can solve the XOR problem: with inputs (0, 0) or (1, 1)
the network outputs 0, and with inputs (0, 1) or (1, 0) it outputs 1
(see the sketch below).
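A minimal Python sketch of this idea, hand-wiring a two-layer network of threshold (step) units that computes XOR. The particular weights and thresholds are one illustrative choice (an OR unit, an AND unit, and an "OR but not AND" output unit), not necessarily those of the original figure:

def step(z):
    # Threshold (perceptron) unit: fires iff its weighted input is >= 0.
    return 1 if z >= 0 else 0

def xor_mlp(x1, x2):
    h_or  = step(x1 + x2 - 0.5)      # hidden unit 1: x1 OR x2
    h_and = step(x1 + x2 - 1.5)      # hidden unit 2: x1 AND x2
    return step(h_or - h_and - 0.5)  # output: OR and not AND  ->  XOR

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor_mlp(a, b))       # prints 0, 1, 1, 0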
Multi-Layer Perceptron and Its Properties
• For each training instance the backpropagation algorithm:
1. First makes a prediction (forward pass),
2. Measures the error,
3. Then goes through each layer in reverse to
measure the error contribution from each
connection (reverse pass),
4. and finally slightly tweaks the connection weights
to reduce the error (Gradient Descent step).
• This reverse pass efficiently measures the error gradient across all the
connection weights in the network by propagating the error gradient
backward through the network (hence the name of the algorithm).
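A minimal NumPy sketch of these four steps for a tiny network with one sigmoid hidden layer and a squared-error loss. The 2-2-1 architecture, the learning rate, and the XOR training data are assumptions made for this illustration, not taken from the slides:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)   # hidden-layer weights and biases
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)   # output-layer weights and biases
eta = 0.5                                       # learning rate

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

for epoch in range(10000):
    for x, t in zip(X, y):
        # 1. Forward pass: make a prediction.
        h = sigmoid(x @ W1 + b1)
        o = sigmoid(h @ W2 + b2)
        # 2. Measure the error.
        err = o - t
        # 3. Reverse pass: propagate the error gradient backward, layer by layer.
        delta_o = err * o * (1 - o)               # local gradient at the output unit
        delta_h = (delta_o @ W2.T) * h * (1 - h)  # local gradients at the hidden units
        # 4. Gradient Descent step: slightly tweak the connection weights.
        W2 -= eta * np.outer(h, delta_o); b2 -= eta * delta_o
        W1 -= eta * np.outer(x, delta_h); b1 -= eta * delta_h

# After training, outputs should be close to [0, 1, 1, 0]; how close depends on
# the random initialisation and the number of epochs.
print(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).ravel().round(2))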
Key change to the MLP’s architecture
• For the algorithm to work properly, the step function is replaced
with the logistic function, σ(z) = 1 / (1 + exp(–z)).
• This is because:
• The step function contains only flat segments, so Gradient
Descent cannot move on a flat surface.
• The logistic function has a well-defined nonzero derivative
everywhere, allowing Gradient Descent to make some
progress at every step.
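Concretely, the logistic function’s derivative is strictly positive for every input, which is what gives Gradient Descent a usable slope; in LaTeX notation:

    \sigma'(z) = \sigma(z)\,(1 - \sigma(z)) > 0 \quad \text{for all } z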
• The backpropagation algorithm can also be used with two other
popular activation functions:
• The hyperbolic tangent function: tanh(z) = 2σ(2z) – 1
• It is S-shaped, continuous, and differentiable, but its output
value ranges from –1 to 1 (instead of 0 to 1 in the case of the
logistic function),
• which tends to make each layer’s output more or less
normalized (i.e., centered around 0) at the beginning of
training. This often helps speed up convergence.
• The ReLU function: ReLU(z) = max(0, z).
• It is continuous but unfortunately not differentiable at z = 0
(the slope changes abruptly, which can make Gradient
Descent bounce around).
• In practice it works very well and has the advantage of being
fast to compute.
• Most importantly, the fact that it does not have a maximum
output value also helps reduce some issues during Gradient
Descent.
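A short Python sketch of the three activation functions discussed above, together with two of their derivatives. The ReLU derivative is undefined at z = 0; returning 0 there is merely a common convention adopted in this sketch:

import numpy as np

def sigmoid(z):                 # logistic function: output in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):                    # hyperbolic tangent: output in (-1, 1)
    return 2.0 * sigmoid(2.0 * z) - 1.0   # equivalent to np.tanh(z)

def relu(z):                    # ReLU: output in [0, +inf), cheap to compute
    return np.maximum(0.0, z)

def d_sigmoid(z):               # nonzero everywhere
    s = sigmoid(z)
    return s * (1.0 - s)

def d_relu(z):                  # undefined at z = 0; 0 is used here by convention
    return (z > 0).astype(float)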
Conceptually: Forward Activity - Backward Error
Forward Propagation of Activity
• Step 1: Initialise weights at random, choose a
learning rate η
• Until network is trained:
• For each training example i.e. input pattern and
target output(s):
• Step 2: Do forward pass through net (with fixed
weights) to produce output(s)
• i.e., in Forward Direction, layer by layer:
• Inputs applied
• Multiplied by weights
• Summed
• ‘Squashed’ by sigmoid activation function
• Output passed to each neuron in next layer
• Repeat above until network output(s) produced
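In matrix form, the per-layer computation just described (multiply by weights, sum, squash, pass on) can be written, in LaTeX notation, as

    a^{(\ell)} = \sigma\left(W^{(\ell)} a^{(\ell-1)} + b^{(\ell)}\right), \qquad a^{(0)} = \text{the input pattern},

where W^{(\ell)} and b^{(\ell)} are layer ℓ’s weight matrix and bias vector (the bias vector is an added assumption of this notation).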
Step 3: Back-propagation of error
Compute the error (delta, or local gradient) δk for each output unit.
Layer by layer, compute the error (delta, or local gradient) δj for each
hidden unit by backpropagating the errors (as shown previously).
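For sigmoid units and a squared-error loss, these local gradients take the standard generalized-delta-rule form (tk is the target of output unit k, ok and oj are unit outputs, and wjk is the weight from hidden unit j to output unit k); in LaTeX notation:

    \delta_k = (t_k - o_k)\, o_k (1 - o_k)                 (output unit k)
    \delta_j = o_j (1 - o_j) \sum_k w_{jk}\, \delta_k      (hidden unit j)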
‘Back-prop’ algorithm summary (with NO Maths!)
‘Back-prop’ algorithm summary (with Maths!) (Not Examinable)
MLP/BP: A worked example
Worked example: Forward Pass
Worked example: Backward Pass
Worked example: Update Weights Using Generalized Delta Rule (BP)
Update = LearningFactor · (DesiredOutput − ActualOutput) · Input
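Written with the local gradients δj from Step 3, the same update for a general weight wij (from unit i to unit j) is commonly stated, in LaTeX notation, as

    \Delta w_{ij} = \eta\, \delta_j\, o_i, \qquad w_{ij} \leftarrow w_{ij} + \Delta w_{ij},

where oi is the output of unit i (the value carried into the connection) and η is the learning rate.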
Similarly for all the other weights wij.
Verification that it works
• An MLP is often used for classification, with each output
corresponding to a different binary class
• e.g., spam/ham, urgent/not-urgent, and so on.
• When the classes are exclusive (e.g., classes 0 through 9 for
digit image classification), the output layer is typically
modified by replacing the individual activation functions by a
shared softmax function.
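A minimal Python sketch of such a softmax output layer; subtracting the maximum logit is a standard numerical-stability trick and does not change the result:

import numpy as np

def softmax(z):
    # z: the output layer's weighted sums (logits), one per class.
    z = z - np.max(z)      # numerical stability only
    e = np.exp(z)
    return e / e.sum()     # non-negative values that sum to 1

print(softmax(np.array([2.0, 1.0, 0.1])))  # the highest score gets the largest probability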
• Note that the signal flows only in one direction (from the
inputs to the outputs), so this architecture is an example of
a feedforward neural network (FNN).
• Biological neurons seem to implement a roughly sigmoid (S-shaped)
activation function, so researchers stuck to sigmoid functions for a very
long time.
• But it turns out that the ReLU activation function generally works better
in ANNs.