Lecture 3
Vivekanand Education Society's Institute of Technology, Chembur, Mumbai
[Figure: architecture of a typical artificial neural network — input signals enter the input layer, pass through the middle layer, and leave the output layer as output signals]
[Figure: the neuron as a simple computing element — inputs x1, x2, …, xn, weighted by w1, w2, …, wn, are combined by the neuron to produce output Y]
[Figure: common activation functions of a neuron, each plotted as output Y (between −1 and +1) against net input X]
[Figure: single-layer two-input perceptron — inputs x1 and x2, weighted by w1 and w2, feed a linear combiner followed by a hard limiter with threshold θ to produce output Y]
[Figure: linear separability in the perceptron — the decision boundary separates Class A1 from Class A2 in the (x1, x2) input space]
The error at iteration $p$ is
$$e(p) = Y_d(p) - Y(p), \qquad p = 1, 2, 3, \ldots$$
The perceptron learning rule:
$$w_i(p+1) = w_i(p) + \alpha \cdot x_i(p) \cdot e(p)$$
where $p = 1, 2, 3, \ldots$ and $\alpha$ is the learning rate, a positive constant less than unity.
The perceptron learning rule was first proposed by
Rosenblatt in 1960. Using this rule we can derive the
perceptron training algorithm for classification tasks.
Step 2: Activation
Activate the perceptron by applying inputs x1(p),
x2(p),…, xn(p) and desired output Yd (p).
Calculate the actual output at iteration $p = 1$:
$$Y(p) = \mathrm{step}\!\left[\sum_{i=1}^{n} x_i(p)\, w_i(p) - \theta\right]$$
where $n$ is the number of perceptron inputs and step is a step activation function.
Step 3: Weight training
Update the weights of the perceptron:
$$w_i(p+1) = w_i(p) + \Delta w_i(p)$$
where the weight correction at iteration $p$ is computed by the delta rule:
$$\Delta w_i(p) = \alpha \cdot x_i(p) \cdot e(p)$$
Step 4: Iteration
Increase iteration p by one, go back to Step 2 and
repeat the process until convergence.
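A minimal sketch of this training algorithm in Python follows. The helper names are illustrative, and the step function is assumed to fire at X ≥ 0 (with a small floating-point tolerance), which reproduces the AND example worked out in the table below:

```python
import numpy as np

def step(x):
    """Step activation: 1 if x >= 0 (within float tolerance), else 0."""
    return 1 if x >= -1e-9 else 0

def train_perceptron(samples, w, theta=0.2, alpha=0.1, max_epochs=100):
    """Perceptron training: activate, compute the error, apply the delta rule."""
    for epoch in range(max_epochs):
        converged = True
        for x, y_d in samples:
            y = step(np.dot(x, w) - theta)      # Step 2: Y(p) = step[sum x_i w_i - theta]
            e = y_d - y                         # e(p) = Yd(p) - Y(p)
            if e != 0:
                converged = False
            w = w + alpha * np.array(x) * e     # Step 3: delta rule
        if converged:                           # Step 4: repeat until convergence
            return w, epoch + 1
    return w, max_epochs

# Logical AND: ((x1, x2), desired output), starting from w1 = 0.3, w2 = -0.1
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, epochs = train_perceptron(and_samples, w=np.array([0.3, -0.1]))
print(weights, epochs)   # ~[0.1, 0.1] after 5 epochs, matching the table below
```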
Example of perceptron learning: the logical operation AND
Epoch | Inputs x1 x2 | Desired output Yd | Initial weights w1, w2 | Actual output Y | Error e | Final weights w1, w2
------|--------------|-------------------|------------------------|-----------------|---------|----------------------
  1   |    0   0     |         0         |       0.3, −0.1        |        0        |    0    |     0.3, −0.1
      |    0   1     |         0         |       0.3, −0.1        |        0        |    0    |     0.3, −0.1
      |    1   0     |         0         |       0.3, −0.1        |        1        |   −1    |     0.2, −0.1
      |    1   1     |         1         |       0.2, −0.1        |        0        |    1    |     0.3,  0.0
  2   |    0   0     |         0         |       0.3,  0.0        |        0        |    0    |     0.3,  0.0
      |    0   1     |         0         |       0.3,  0.0        |        0        |    0    |     0.3,  0.0
      |    1   0     |         0         |       0.3,  0.0        |        1        |   −1    |     0.2,  0.0
      |    1   1     |         1         |       0.2,  0.0        |        1        |    0    |     0.2,  0.0
  3   |    0   0     |         0         |       0.2,  0.0        |        0        |    0    |     0.2,  0.0
      |    0   1     |         0         |       0.2,  0.0        |        0        |    0    |     0.2,  0.0
      |    1   0     |         0         |       0.2,  0.0        |        1        |   −1    |     0.1,  0.0
      |    1   1     |         1         |       0.1,  0.0        |        0        |    1    |     0.2,  0.1
  4   |    0   0     |         0         |       0.2,  0.1        |        0        |    0    |     0.2,  0.1
      |    0   1     |         0         |       0.2,  0.1        |        0        |    0    |     0.2,  0.1
      |    1   0     |         0         |       0.2,  0.1        |        1        |   −1    |     0.1,  0.1
      |    1   1     |         1         |       0.1,  0.1        |        1        |    0    |     0.1,  0.1
  5   |    0   0     |         0         |       0.1,  0.1        |        0        |    0    |     0.1,  0.1
      |    0   1     |         0         |       0.1,  0.1        |        0        |    0    |     0.1,  0.1
      |    1   0     |         0         |       0.1,  0.1        |        0        |    0    |     0.1,  0.1
      |    1   1     |         1         |       0.1,  0.1        |        1        |    0    |     0.1,  0.1

Threshold: θ = 0.2; learning rate: α = 0.1.
[Figure: two-dimensional plots of the basic logical operations in the (x1, x2) plane, with each axis running from 0 to 1]
[Figure: multilayer perceptron with two hidden layers — input signals enter the input layer, pass through the first and second hidden layers, and leave the output layer as output signals]
[Figure: three-layer back-propagation network — input neurons i = 1…n feed hidden neurons j = 1…m through weights w_ij, hidden neurons feed output neurons k = 1…l through weights w_jk to produce outputs y_k, and error signals propagate backwards from the output layer through the hidden layer]
The weights and threshold levels are set to random numbers uniformly distributed inside a small range:
$$\left( -\frac{2.4}{F_i},\; +\frac{2.4}{F_i} \right)$$
where Fi is the total number of inputs of neuron i
in the network. The weight initialisation is done
on a neuron-by-neuron basis.
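A small sketch of this neuron-by-neuron initialisation is shown below; the layer sizes and function name are illustrative assumptions:

```python
import numpy as np

def init_layer_weights(fan_in, n_neurons, rng=np.random.default_rng()):
    """Draw each neuron's weights uniformly from (-2.4/F_i, +2.4/F_i)."""
    bound = 2.4 / fan_in                    # F_i = number of inputs of neuron i
    return rng.uniform(-bound, +bound, size=(fan_in, n_neurons))

w_hidden = init_layer_weights(fan_in=2, n_neurons=2)   # 2 inputs -> 2 hidden neurons
w_output = init_layer_weights(fan_in=2, n_neurons=1)   # 2 hidden -> 1 output neuron
```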
Step 3: Weight training
(a) Calculate the error gradient for the neurons in the output layer:
$$\delta_k(p) = y_k(p)\,\big[1 - y_k(p)\big]\, e_k(p)$$
where
$$e_k(p) = y_{d,k}(p) - y_k(p)$$
Calculate the weight corrections:
$$\Delta w_{jk}(p) = \alpha \, y_j(p)\, \delta_k(p)$$
Update the weights at the output neurons:
$$w_{jk}(p+1) = w_{jk}(p) + \Delta w_{jk}(p)$$
Step 3: Weight training (continued)
(b) Calculate the error gradient for the neurons in
the hidden layer:
$$\delta_j(p) = y_j(p)\,\big[1 - y_j(p)\big] \sum_{k=1}^{l} \delta_k(p)\, w_{jk}(p)$$
Calculate the weight corrections:
$$\Delta w_{ij}(p) = \alpha \, x_i(p)\, \delta_j(p)$$
Update the weights at the hidden neurons:
$$w_{ij}(p+1) = w_{ij}(p) + \Delta w_{ij}(p)$$
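As a compact sketch of one such training iteration for a three-layer network with sigmoid activations (the variable names, array shapes, and NumPy layout are assumptions, not taken from the lecture):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(x, y_d, w_ij, theta_j, w_jk, theta_k, alpha=0.1):
    """One forward pass plus one back-propagation weight update."""
    # Forward pass
    y_j = sigmoid(x @ w_ij - theta_j)              # hidden-layer outputs
    y_k = sigmoid(y_j @ w_jk - theta_k)            # output-layer outputs

    # (a) Error gradients at the output layer
    e_k = y_d - y_k                                # e_k(p) = y_d,k(p) - y_k(p)
    delta_k = y_k * (1 - y_k) * e_k                # delta_k(p)

    # (b) Error gradients at the hidden layer
    delta_j = y_j * (1 - y_j) * (w_jk @ delta_k)   # sum over output neurons

    # Weight corrections and updates (thresholds use a fixed input of -1)
    w_jk += alpha * np.outer(y_j, delta_k)
    theta_k += alpha * (-1) * delta_k
    w_ij += alpha * np.outer(x, delta_j)
    theta_j += alpha * (-1) * delta_j
    return w_ij, theta_j, w_jk, theta_k
```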
[Figure: three-layer back-propagation network for the Exclusive-OR operation — inputs x1 and x2 feed hidden neurons 3 and 4 through weights w13, w14, w23, w24; the hidden neurons feed output neuron 5 through weights w35 and w45; thresholds θ3, θ4 and θ5 are applied through fixed inputs of −1]
◼ The effect of the threshold applied to a neuron in the hidden or output layer is represented by its weight, θ, connected to a fixed input equal to −1.
◼ The initial weights and threshold levels are set randomly as follows:
w13 = 0.5, w14 = 0.9, w23 = 0.4, w24 = 1.0, w35 = −1.2, w45 = 1.1, θ3 = 0.8, θ4 = −0.1 and θ5 = 0.3.
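As a quick numerical check of the first forward pass with inputs x1 = x2 = 1 and the initial weights above (assuming the sigmoid activation implied by the y(1 − y) factors in the gradient formulas):

```python
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

x1, x2 = 1, 1
y3 = sigmoid(x1 * 0.5 + x2 * 0.4 - 0.8)            # hidden neuron 3
y4 = sigmoid(x1 * 0.9 + x2 * 1.0 - (-0.1))         # hidden neuron 4
y5 = sigmoid(y3 * (-1.2) + y4 * 1.1 - 0.3)         # output neuron 5
print(round(y3, 4), round(y4, 4), round(y5, 4))    # 0.525 0.8808 0.5097
```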
[Figure: learning curve — sum-squared error (logarithmic scale, 10^0 down to 10^-4) plotted against epoch (0 to 200)]
[Figure: a network solving the Exclusive-OR operation — connection weights of +1.0 throughout, except the −2.0 connection from neuron 3 to neuron 5; thresholds of +1.5 (neuron 3), +0.5 (neuron 4) and +0.5 (neuron 5) applied through fixed inputs of −1]
Decision boundaries
[Figure: (a) decision boundary constructed by hidden neuron 3: x1 + x2 − 1.5 = 0; (b) decision boundary constructed by hidden neuron 4: x1 + x2 − 0.5 = 0; (c) decision boundaries constructed by the complete network, each shown in the (x1, x2) plane with axes from 0 to 1]
Weight correction with a momentum term (the generalised delta rule):
$$\Delta w_{jk}(p) = \beta \, \Delta w_{jk}(p-1) + \alpha \, y_j(p)\, \delta_k(p)$$
where $\beta$ is the momentum constant.
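A brief sketch of this momentum-augmented update; the function name, the stored previous correction, and the choice of β are illustrative assumptions (β is typically set close to 1, e.g. 0.95):

```python
import numpy as np

def momentum_update(w_jk, dw_prev, y_j, delta_k, alpha=0.1, beta=0.95):
    """Generalised delta rule: add a fraction of the previous correction."""
    dw = beta * dw_prev + alpha * np.outer(y_j, delta_k)   # delta_w_jk(p)
    return w_jk + dw, dw                                   # keep dw for the next iteration
```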
[Figures: training runs of the back-propagation network — panels plotting sum-squared error (logarithmic scale, 10^1 down to 10^-4) and learning rate against epoch, with the runs converging within roughly 80 to 140 epochs]
[Figure: single-layer n-neuron Hopfield network — inputs x1, x2, …, xi, …, xn feed neurons 1, 2, …, i, …, n, which produce outputs y1, y2, …, yi, …, yn]
$$Y^{\mathrm{sign}} = \begin{cases} +1, & \text{if } X > 0 \\ -1, & \text{if } X < 0 \\ Y, & \text{if } X = 0 \end{cases}$$
$$Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$
[Figure: the possible states of the three-neuron Hopfield network form the vertices (±1, ±1, ±1) of a cube in (y1, y2, y3) space]
◼ The stable state-vertex is determined by the weight
matrix W, the current input vector X, and the
threshold matrix θ. If the input vector is partially
incorrect or incomplete, the initial state will converge
into the stable state-vertex after a few iterations.
◼ Suppose, for instance, that our network is required to
memorise two opposite states, (1, 1, 1) and (−1, −1, −1).
Thus,
$$Y_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \qquad Y_2 = \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix} \qquad \text{or} \qquad Y_1^T = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix}, \qquad Y_2^T = \begin{bmatrix} -1 & -1 & -1 \end{bmatrix}$$
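A short sketch of how these two states can be stored and recalled. The storage rule W = Σ Ym Ymᵀ − M·I and the zero thresholds are assumptions based on the standard Hopfield formulation, since the weight formula itself is not shown in this excerpt:

```python
import numpy as np

Y1 = np.array([1, 1, 1])
Y2 = np.array([-1, -1, -1])
M = 2
W = np.outer(Y1, Y1) + np.outer(Y2, Y2) - M * np.eye(3)   # assumed storage rule

def activate(x, y_prev):
    """Sign activation: output unchanged when the net input is exactly zero."""
    y = np.sign(x)
    return np.where(y == 0, y_prev, y)

X = np.array([1, 1, -1])      # a partially incorrect input
Y = X.copy()
for _ in range(10):           # iterate until the state is stable
    Y = activate(W @ Y, Y)
print(Y)                      # converges to the stored state [1, 1, 1]
```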
[Figure: BAM operation — (a) forward direction: input vector X(p), applied to the n-neuron input layer, produces output vector Y(p) at the m-neuron output layer; (b) backward direction: Y(p) is propagated back to give X(p+1) at the input layer]
To store $M$ pattern pairs, the BAM weight (correlation) matrix is computed as
$$W = \sum_{m=1}^{M} X_m Y_m^T$$
where $M$ is the number of pattern pairs to be stored in the BAM.
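A small sketch of BAM storage and recall under this correlation-matrix rule; the example pattern pairs and the sign-based recall loop are illustrative assumptions, not taken from the lecture:

```python
import numpy as np

# Two bipolar pattern pairs (X_m, Y_m) to be associated
X_pairs = [np.array([1, 1, 1, -1]), np.array([-1, -1, -1, 1])]
Y_pairs = [np.array([1, -1]),       np.array([-1, 1])]

# W = sum over pattern pairs of X_m Y_m^T
W = sum(np.outer(x, y) for x, y in zip(X_pairs, Y_pairs))

x = np.array([1, 1, -1, -1])      # a noisy version of the first input pattern
y = np.sign(W.T @ x)              # forward direction: recall Y from X
x_recalled = np.sign(W @ y)       # backward direction: recall X from Y
print(y, x_recalled)              # recovers the stored pair ([1, -1], [1, 1, 1, -1])
```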