MODULE-IV
STATISTICAL METHODS IN ANN
Module 4
Statistical Methods: Boltzmann training - Cauchy training - Artificial specific heat methods - Applications to general non-linear optimization problems
Statistical methods are used for:
Training an ANN
Producing output from a trained network
Training Methods
Deterministic Methods
Statistical Training Methods
Deterministic Training Methods
Follow a step-by-step procedure.
Weights are changed based on their current values, the desired output, and the actual output.
E.g.: the Perceptron training algorithm, the Back-Propagation algorithm, etc.
Statistical Training Methods
Make pseudo-random changes in the weights.
Retain only those changes which result in improvements.
GENERAL PROCEDURE
(FOR STATISTICAL TRAINING METHODS)
Apply a set of inputs and compute the resulting output.
Compare the result with the target and find the error. The objective of training is to minimize this error.
Select a weight at random and adjust it by a small random amount.
If the adjustment improves the objective, retain the change; otherwise return the weight to its previous value.
Repeat the procedure until the network is trained to the desired level.
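This procedure can be written as a short loop. Below is a minimal sketch in Python; the single-layer tanh network and squared-error objective are illustrative assumptions, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)

def error(weights, inputs, targets):
    # Hypothetical single-layer network with a squared-error objective
    outputs = np.tanh(inputs @ weights)
    return np.sum((targets - outputs) ** 2)

def statistical_train(weights, inputs, targets, steps=10_000, step_size=0.01):
    best = error(weights, inputs, targets)
    for _ in range(steps):
        i = rng.integers(weights.size)                         # pick one weight at random
        old = weights.flat[i]
        weights.flat[i] += rng.uniform(-step_size, step_size)  # small random adjustment
        new = error(weights, inputs, targets)
        if new < best:
            best = new                                         # improvement: retain the change
        else:
            weights.flat[i] = old                              # otherwise: restore the old value
    return weights, best
```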
The local minima problem
The objective function minimization problem can get trapped in a poor solution.
[Figure: objective function plotted against a weight, showing a shallow local minimum at point A and the deeper global minimum at point B]
If the objective function is at A and the random weight changes are small, then every weight adjustment will be rejected, since any small step out of A increases the error.
The superior weight setting at point B will never be found, and the system will be trapped in the local minimum at A instead of the global minimum at point B.
If the random weight changes are large, both points A and B are visited frequently, but so is every other point. The weights change so drastically that they never settle at the desired point.
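To make the trapping concrete, here is a small illustrative experiment in Python with a hypothetical one-dimensional objective (not from the original notes) that has a local minimum near w = +1 and a deeper global minimum near w = -1. A purely greedy search with small random steps never escapes the local basin:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(w):
    # Toy objective: local minimum near w = +1 (point A),
    # deeper global minimum near w = -1 (point B)
    return (w**2 - 1)**2 + 0.3 * w

w = 1.0                                     # start in the basin of point A
best = f(w)
for _ in range(10_000):
    trial = w + rng.uniform(-0.05, 0.05)    # small random change
    if f(trial) < best:                     # greedy: keep only improvements
        w, best = trial, f(trial)
print(w, best)  # settles near w = +0.96, never finding the minimum near -1
```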
Solution & Explanation
Statistical methods overcome the local minima problem through a suitable weight adjustment strategy.
Example:
Let the figure represent a ball on a surface inside a box.
If the box is shaken violently, the ball moves rapidly from one side to the other; the probability of occupying any point on the surface is equal for all points.
If the violence of the shaking is gradually reduced, the ball spends most of its time near points A and B.
If the shaking is reduced further, it will finally settle at point B, the deeper minimum.
ANNs are trained in the same way, through random weight adjustments.
At first, large random adjustments are made.
Weight changes that improve the objective function are retained.
The average step size is then gradually reduced to reach the global minimum.
Annealing [ Boltzmann Law ]
Annealing: if a metal is raised to a temperature above its melting point, the atoms are in violent random motion. The atoms always tend to reach a minimum energy state. As the metal is gradually cooled, the atoms enter the minimum possible energy state corresponding to each temperature.
P(e) ∝ exp(−e / kT)
where P(e) is the probability that the system is in a state with energy e, k is Boltzmann's constant, and T is the temperature.
Simulated Annealing [Boltzmann Training]
Define a variable T that represents an artificial temperature. (Start with T at a large value.)
Apply a set of inputs to the network, and calculate the outputs and the objective function.
Make a random weight change and recalculate the network output and the new objective function.
If the objective function is reduced, retain the
weight change.
If the weight change results in an increase in the objective function, calculate the probability of accepting it:
P(c) = exp(−c / kT)
where P(c) is the probability of accepting a change of c in the objective function, k is Boltzmann's constant, and T is the artificial temperature.
Select a random number r from a uniform distribution between zero and one.
If P(c) is greater than r, retain the change; otherwise return the weight to its previous value.
This allows the system occasionally to take a step in a direction that worsens the objective function, and hence to escape from local minima.
Repeat the weight change process over each of the weights in the network, gradually reducing the temperature T until an acceptably low value of the objective function is obtained.
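A minimal sketch of this Boltzmann training loop in Python. The objective function is left abstract, Boltzmann's constant k is folded into the artificial temperature (a common simplification), and the schedule parameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def anneal(weights, objective, t0=10.0, steps=5_000, step_size=0.1):
    current = objective(weights)
    for t in range(1, steps + 1):
        temp = t0 / np.log(1.0 + t)        # inverse logarithmic cooling schedule
        i = rng.integers(weights.size)     # pick a weight at random
        old = weights.flat[i]
        weights.flat[i] += rng.uniform(-step_size, step_size)
        c = objective(weights) - current   # change in the objective function
        if c < 0 or np.exp(-c / temp) > rng.uniform():
            current += c                   # improvement, or uphill step accepted with P(c)
        else:
            weights.flat[i] = old          # rejected: return weight to previous value
    return weights, current
```

The test `np.exp(-c / temp) > rng.uniform()` implements the P(c) > r rule above; as the temperature falls, uphill steps become rarer and the search settles into a minimum.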
How to select the weight change size / artificial temperature for training
The size of the random weight change can be selected by various methods.
E.g.: P(w) ∝ exp(−w² / T²)
where P(w) is the probability of a weight change of size w and T is the artificial temperature.
To reach the global minimum as quickly as possible, the cooling rate is usually expressed as
T(t) = T₀ / log(1 + t)
where T₀ is the initial temperature and t is the artificial time (training step).
The main disadvantage of Boltzmann training is the very slow cooling rate this schedule requires, and hence the long computation time. A Boltzmann machine usually takes an impractically long time to train.
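A quick illustration of how slowly the inverse logarithmic schedule cools, in Python with an assumed T₀ = 100:

```python
import math

T0 = 100.0
for t in (10, 100, 10_000, 1_000_000):
    print(t, T0 / math.log(1 + t))
# 10        -> 41.7
# 100       -> 21.7
# 10_000    -> 10.9
# 1_000_000 -> 7.2
```

Even after a million steps the temperature has dropped by barely an order of magnitude, which is why Boltzmann training is so slow.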
Cauchy Training
The Cauchy training method is more rapid than Boltzmann training.
Cauchy training substitutes the Cauchy distribution for the Boltzmann distribution.
The Cauchy distribution has longer "tails", hence a higher probability of larger step sizes.
The temperature reduction rate is changed to inverse linear. (For Boltzmann training it was inverse logarithmic.)
The Cauchy distribution is
P(x) = T(t) / [T(t)² + x²]
The inverse linear relationship for temperature reduction reduces the training time:
T(t) = T₀ / (1 + t)
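A sketch of the Cauchy variant in Python. Step sizes are drawn from the Cauchy distribution above via its inverse CDF, and the temperature follows the inverse linear schedule; the parameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def cauchy_step(temp):
    # Sample a step from the Cauchy distribution P(x) ∝ T / (T² + x²)
    # using the inverse CDF: x = T · tan(π (u − 1/2))
    return temp * np.tan(np.pi * (rng.uniform() - 0.5))

T0 = 100.0
for t in (10, 100, 10_000, 1_000_000):
    temp = T0 / (1 + t)   # inverse linear cooling: far faster than logarithmic
    print(t, temp, cauchy_step(temp))
```

The long tails of the Cauchy distribution occasionally produce very large steps, so the network can still jump out of local minima even as the temperature falls quickly.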