AI Unit II Lec Notes Deep Learning

This document provides an overview of feedforward networks, particularly focusing on multilayer perceptrons (MLPs) and their significance in machine learning. It discusses the architecture of MLPs, the backpropagation algorithm for training neural networks, and various types of gradient descent optimization methods. Additionally, it covers the Kohonen Self-Organizing Feature Map as a competitive learning model in neural networks.

Uploaded by

Praneeth B

UNIT II

Feedforward Networks

P Jyothi,Asst. Prof., CSE Dept.




Introduction

• Deep feedforward networks are also often called feedforward neural networks or multilayer perceptrons (MLPs).
• These models are called feedforward because information flows through the function being evaluated from the input x, through the intermediate computations used to define f, and finally to the output y.
• There are no feedback connections in which outputs of the model are fed back into itself. When feedforward neural networks are extended to include feedback connections, they are called recurrent neural networks.

Multilayer Perceptron

• Feedforward networks are of extreme importance to machine learning practitioners. They form the basis of many important commercial applications. For example, the convolutional networks used for object recognition from photos are a specialized kind of feedforward network. Feedforward networks are also a conceptual stepping stone on the path to recurrent networks, which power many natural language applications.
• Feedforward neural networks are called networks because they are typically represented by composing together many different functions. The model is associated with a directed acyclic graph describing how the functions are composed together.
• For example, we might have three functions $f^{(1)}$, $f^{(2)}$, and $f^{(3)}$ connected in a chain to form $f(x) = f^{(3)}(f^{(2)}(f^{(1)}(x)))$. These chain structures are the most commonly used structures of neural networks. In this case, $f^{(1)}$ is called the first layer of the network, $f^{(2)}$ is called the second layer, and so on.
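The chain structure described above can be sketched in a few lines of Python. The three layer functions below are hypothetical placeholders chosen only for illustration (the notes do not specify them); the point is that the model is their composition and that the length of the chain is the depth.

```python
import math

# Three hypothetical layer functions (illustrative choices, not from the notes).
def f1(x):
    return 2.0 * x          # first layer

def f2(h):
    return h + 1.0          # second layer

def f3(h):
    return math.tanh(h)     # third layer, here also the output layer

layers = [f1, f2, f3]

def f(x):
    """Evaluate the chain f3(f2(f1(x))) by applying the layers in order."""
    for layer in layers:
        x = layer(x)
    return x

depth = len(layers)  # the length of the chain is the depth of the model
print(depth, f(0.5))
```

Storing the layers in a list makes the depth explicit and keeps the order of evaluation identical to the order of the chain.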
• The overall length of the chain gives the depth of the model. It is from this terminology that the name “deep learning” arises. The final layer of a feedforward network is called the output layer.
• Feedforward networks introduce the concept of a hidden layer, and this requires us to choose the activation functions that will be used to compute the hidden layer values.
• We must also design the architecture of the network, including how many layers the network should contain, how these layers should be connected to each other, and how many units should be in each layer.
• Learning in deep neural networks requires computing the gradients of complicated functions. The back-propagation algorithm and its modern generalizations can be used to compute these gradients efficiently.


Multilayer Perceptron

Perceptrons resemble biological neurons: a perceptron takes inputs and produces an output in much the same fashion as a neuron does. Perceptrons are the building blocks of deep-learning architectures. The input given to the perceptron's activation function is the dot product of the weights and the input vector. The function takes this value and produces an output: if the value is greater than 0, the final output ($\hat{y}$) is 1; otherwise it is 0. Other functions can also be chosen as the activation function.
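As a minimal sketch of the perceptron rule just described (dot product of weights and input, then a threshold at 0), the following example uses NumPy. The bias term `b` and the particular weights, which happen to implement a logical AND, are assumptions added for illustration and do not come from the notes.

```python
import numpy as np

def perceptron(x, w, b=0.0):
    """Return 1 if the weighted sum w . x + b is greater than 0, else 0."""
    z = np.dot(w, x) + b   # b is an assumed bias term, not mentioned in the notes
    return 1 if z > 0 else 0

# Hypothetical weights that make the perceptron behave like a logical AND.
w = np.array([1.0, 1.0])
b = -1.5
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [perceptron(np.array(p), w, b) for p in inputs]
print(outputs)
```

Only the input (1, 1) produces a weighted sum above 0 (0.5), so it is the only input mapped to 1.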
Figure: Multilayer perceptron (MLP)


• An MLP has multiple layers of perceptrons. MLPs are feed-forward artificial neural networks. An MLP has at least 3 layers: the first layer is called the input layer, the next ones are called hidden layers, and the last one is called the output layer. The nodes in the input layer have no activation function; they simply represent the data point. If the data point is represented by a d-dimensional vector, then the input layer has d nodes.


• In the above diagram, we have one input layer, 2 hidden layers, and the final output layer. All layers are fully connected, which means that each node is connected to every node in the previous layer. Each layer has a weight matrix that stores all the weights for that layer. These weight matrices are essentially what we obtain once training is over. All of these weights are updated during training using back-propagation.
• The multilayer perceptron falls under the category of feedforward algorithms because inputs are combined with the initial weights in a weighted sum and subjected to the activation function, just as in the perceptron. The difference is that the result of each linear combination is propagated to the next layer.
• Each layer feeds the next one with the result of its computation, its internal representation of the data. This goes all the way through the hidden layers to the output layer.
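The layer-by-layer propagation described above can be sketched as follows. This is an illustrative forward pass only, assuming sigmoid activations, zero biases, random weights, and layer sizes of 4, 5, 3, and 2 (one input layer, two hidden layers, one output layer); none of these specific choices come from the notes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, weights, biases):
    """Return the values of every layer, starting with the input itself."""
    activations = [x]  # the input layer has no activation function
    for W, b in zip(weights, biases):
        # weighted sum of the previous layer, then the activation function
        activations.append(sigmoid(W @ activations[-1] + b))
    return activations

rng = np.random.default_rng(0)
sizes = [4, 5, 3, 2]  # assumed: d = 4 inputs, two hidden layers, 2 outputs
weights = [rng.normal(size=(n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

acts = mlp_forward(rng.normal(size=4), weights, biases)
print([a.shape for a in acts])
```

Each weight matrix has one row per node in its layer and one column per node in the previous layer, which is what "fully connected" means in matrix form.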

Gradient Descent

• Gradient descent is one of the most commonly used optimization algorithms for minimizing the error between actual and expected results. It is also widely used to train neural networks.
• In mathematical terminology, optimization refers to the task of minimizing or maximizing an objective function f(x) with respect to x. Similarly, in machine learning, optimization is the task of minimizing the cost function with respect to the model's parameters. The main objective of gradient descent is to minimize this function through iterative parameter updates; when the function is convex, the minimum it approaches is the global minimum. Once machine learning models are optimized, they can be used as powerful tools for artificial intelligence and various computer science applications.
• Gradient descent is also called steepest descent.
• It is one of the most commonly used iterative optimization algorithms for training machine learning and deep learning models. It helps in finding a local minimum of a function.


• The direction of movement relative to the gradient determines whether a local minimum or a local maximum of a function is approached:
  • If we move in the direction of the negative gradient, that is, away from the gradient of the function at the current point, we move towards a local minimum of that function.
  • If we move in the direction of the positive gradient, that is, towards the gradient of the function at the current point, we move towards a local maximum of that function. This procedure is known as gradient ascent.
• The main objective of a gradient descent algorithm is to minimize the cost function iteratively. To achieve this goal, it performs two steps repeatedly:
  • Calculate the first-order derivative of the function to obtain the gradient (slope) of that function at the current point.
  • Move in the direction opposite to the gradient, stepping from the current point by alpha times the gradient, where alpha is the learning rate. The learning rate is a tuning parameter of the optimization process that determines the length of the steps.
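The two steps above can be sketched as a short loop. The example function $f(x) = (x - 3)^2$, the starting point, the learning rate, and the number of steps are assumptions chosen for illustration.

```python
def gradient_descent(grad, x0, alpha=0.1, steps=100):
    """Repeat: compute the gradient, then step against it by alpha times the gradient."""
    x = x0
    for _ in range(steps):
        x = x - alpha * grad(x)
    return x

# Assumed example: f(x) = (x - 3)^2 has derivative 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)
```

With alpha = 0.1 the distance to the minimum shrinks by a factor of 0.8 per step for this function; a much larger alpha would overshoot, and a much smaller one would need more steps.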
• The cost function measures the difference, or error, between actual values and expected values at the current position, expressed as a single real number. It improves machine learning efficiency by providing feedback to the model so that it can reduce the error and find a local or global minimum. Gradient descent iterates along the direction of the negative gradient until the cost function stops decreasing, that is, until the gradient approaches zero. At this point, the model stops learning.
Types of Gradient Descent

• Based on how much of the training data is used to compute each update, gradient descent can be divided into batch gradient descent, stochastic gradient descent, and mini-batch gradient descent.


1. Batch Gradient Descent:
• Batch gradient descent (BGD) computes the error for each point in the training set and updates the model only after all training examples have been evaluated. One such pass over the whole training set is known as a training epoch. In simple words, it is a greedy approach in which we sum over all examples for each update.
• Advantages of batch gradient descent:
  • It produces less noise than the other types of gradient descent.
  • It produces stable gradient descent convergence.
  • It is computationally efficient, as all resources are used to process all training samples together.


2. Stochastic Gradient Descent:
• Stochastic gradient descent (SGD) is a type of gradient descent that processes one training example per iteration. In other words, within each training epoch it goes through the examples of the dataset one at a time and updates the model's parameters after each example. Since it requires only one training example at a time, it is easier to fit in allocated memory.
• However, it loses some computational efficiency in comparison to batch gradient descent, because it performs many more updates. Due to these frequent updates, its gradient is also noisy. This noise can sometimes be helpful for escaping a local minimum and finding the global minimum.
• Advantages of stochastic gradient descent: in SGD, learning happens on every example, which gives it a few advantages over the other types of gradient descent.
  • It is easier to fit in allocated memory.
  • Each update is faster to compute than in batch gradient descent.
  • It is more efficient for large datasets.


3. Mini-Batch Gradient Descent:
• Mini-batch gradient descent is a combination of batch gradient descent and stochastic gradient descent. It divides the training dataset into small batches and performs an update on each batch separately. Splitting the training dataset into smaller batches strikes a balance between the computational efficiency of batch gradient descent and the speed of stochastic gradient descent. Hence, we obtain a type of gradient descent with higher computational efficiency and a less noisy gradient.
• Advantages of mini-batch gradient descent:
  • It is easier to fit in allocated memory.
  • It is computationally efficient.
  • It produces stable gradient descent convergence.
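The three variants differ only in how many examples contribute to each update, so one hedged way to illustrate them is a single training loop with a `batch_size` parameter. The sketch below fits a linear model with a mean squared error cost on synthetic, noiseless data; the data, the learning rate, and the function name `train_linear` are all assumptions made for this example.

```python
import numpy as np

def train_linear(X, y, batch_size, alpha=0.05, epochs=200, seed=0):
    """Fit y ~ X @ w by gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):                      # one epoch = one pass over the data
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            error = X[idx] @ w - y[idx]
            grad = 2.0 * X[idx].T @ error / len(idx)
            w -= alpha * grad                    # one update per batch
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(64, 2))
true_w = np.array([2.0, -1.0])                   # assumed ground-truth weights
y = X @ true_w                                   # noiseless targets

w_batch = train_linear(X, y, batch_size=64)      # batch: 1 update per epoch
w_sgd = train_linear(X, y, batch_size=1)         # stochastic: 64 updates per epoch
w_mini = train_linear(X, y, batch_size=8)        # mini-batch: 8 updates per epoch
print(w_batch, w_sgd, w_mini)
```

On this easy problem all three settings recover the same weights; on real data the choice mainly trades the number of updates per epoch against the noise in each update.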


Backpropagation

• Backpropagation is one of the important concepts of a neural network. Our task is to classify our data as well as possible. For this, we have to update the weights and biases, but how can we do that in a deep neural network? In the linear regression model, we use gradient descent to optimize the parameters. Similarly, here we also use the gradient descent algorithm, with the gradients computed by backpropagation.


• The main feature of backpropagation is the iterative, recursive, and efficient method by which it calculates the updated weights, improving the network until it is able to perform the task for which it is being trained.


• Backpropagation is the essence of neural network training. It is the method of fine-tuning the weights of a neural network based on the error rate obtained in the previous epoch (i.e., iteration). Proper tuning of the weights reduces the error rate and makes the model more reliable by improving its generalization.
• Backpropagation is short for “backward propagation of errors.” It is a standard method of training artificial neural networks. This method calculates the gradient of a loss function with respect to all the weights in the network.
How Backpropagation Algorithm Works

• The backpropagation algorithm computes the gradient of the loss function with respect to each weight by the chain rule. It computes the gradient efficiently, one layer at a time, unlike a naive direct computation. It computes the gradient, but it does not define how the gradient is used. It generalizes the computation in the delta rule.
1. Inputs X arrive through the preconnected path.
2. The input is modeled using real weights W. The weights are usually selected randomly.
3. Calculate the output of every neuron, from the input layer, through the hidden layers, to the output layer.
4. Calculate the error in the outputs.
5. Travel back from the output layer to the hidden layers to adjust the weights such that the error is decreased.
• Keep repeating the process until the desired output is achieved.
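The five steps above can be sketched for a very small network. This is an illustrative example only: it assumes a 2-3-1 network without bias terms, sigmoid activations, a squared-error loss, XOR-style toy data, and a learning rate of 0.1, none of which are prescribed by the notes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, W2):
    H = sigmoid(X @ W1)                  # step 3: hidden-layer outputs
    Y = sigmoid(H @ W2)                  # step 3: output-layer outputs
    return H, Y

def loss(Y, T):
    return 0.5 * np.sum((Y - T) ** 2)    # step 4: error in the outputs

def backward(X, H, Y, T, W2):
    """Apply the chain rule one layer at a time, from the output backwards."""
    delta_out = (Y - T) * Y * (1.0 - Y)              # error signal at the output layer
    grad_W2 = H.T @ delta_out
    delta_hid = (delta_out @ W2.T) * H * (1.0 - H)   # error signal at the hidden layer
    grad_W1 = X.T @ delta_hid
    return grad_W1, grad_W2

rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])   # step 1: inputs
T = np.array([[0.], [1.], [1.], [0.]])                   # assumed toy targets
W1 = rng.normal(size=(2, 3))                             # step 2: random weights
W2 = rng.normal(size=(3, 1))

initial_loss = loss(forward(X, W1, W2)[1], T)
for _ in range(2000):                                    # repeat the process
    H, Y = forward(X, W1, W2)
    gW1, gW2 = backward(X, H, Y, T, W2)
    W1 -= 0.1 * gW1                                      # step 5: adjust the weights
    W2 -= 0.1 * gW2
final_loss = loss(forward(X, W1, W2)[1], T)
print(initial_loss, final_loss)
```

Note that `backward` only computes gradients; the update rule in the loop (plain gradient descent here) is a separate choice, which matches the remark that backpropagation does not define how the gradient is used.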


The most prominent advantages of backpropagation are:
• Backpropagation is fast, simple, and easy to program.
• It has few parameters of its own to tune apart from the number of inputs, although the optimizer that uses its gradients (for example, the learning rate of gradient descent) still has to be tuned.
• It is a flexible method, as it does not require prior knowledge about the network.
• It is a standard method that generally works well.
• It does not need any special mention of the features of the function to be learned.


• The two types of backpropagation networks are:
  • Static backpropagation
  • Recurrent backpropagation
• Static backpropagation: a kind of backpropagation network that produces a mapping of a static input to a static output. It is useful for solving static classification problems such as optical character recognition.
• Recurrent backpropagation: the activations are fed forward until a fixed value is reached. After that, the error is computed and propagated backward.
• The main difference between the two methods is that the mapping is rapid in static backpropagation, while it is non-static in recurrent backpropagation.


Kohonen Self-Organizing Feature Map

• The Kohonen self-organizing feature map (SOM) is a neural network that is trained using competitive learning. In basic competitive learning, a competition process takes place before each cycle of learning. In the competition process, some criterion selects a winning processing element. After the winning processing element is selected, its weight vector is adjusted according to the learning law in use.
• The self-organizing map is typically represented as a two-dimensional sheet of processing elements, as described in the figure. Each processing element has its own weight vector, and the learning of the SOM depends on the adaptation of these vectors.
• The processing elements of the network are made competitive in a self-organizing process, and a specific criterion picks the winning processing element whose weights are updated. Generally, the criterion is to minimize the Euclidean distance between the input vector and the weight vector.
• The SOM differs from basic competitive learning in that, instead of adjusting only the weight vector of the winning processing element, the weight vectors of neighboring processing elements are also adjusted. At first, the neighborhood is large, which produces a rough ordering of the SOM, and its size is diminished as time goes on.
• In the end, only the winning processing element is adjusted, which makes the fine-tuning of the SOM possible. The use of a neighborhood makes the topological ordering procedure possible and, together with competitive learning, makes the process non-linear.
• The self-organizing map is an unsupervised learning model proposed for applications in which maintaining a topology between the input and output spaces is important.
• It is fundamentally a method for dimensionality reduction, as it maps high-dimensional inputs to a low-dimensional discretized representation and preserves the basic structure of its input space.


• The entire learning process occurs without supervision because the nodes are self-organizing. SOMs are also known as feature maps, as they essentially retain the features of the input data and simply group themselves according to the similarity between one another. They have practical value for visualizing complex or large quantities of high-dimensional data and showing the relationships between them in a low-dimensional, usually two-dimensional, field in order to check whether the given unlabeled data have any structure.
• A self-organizing map uses competitive learning instead of error-correction learning to modify its weights. This means that only a single node is activated at each cycle in which the features of an instance of the input vector are presented to the neural network, as all nodes compete for the privilege of responding to the input.
• The figure shows the architecture of the self-organizing map with two clusters and n input features of any sample.
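The SOM procedure described in this section (pick the winner by Euclidean distance, then adjust the winner and its neighbors with a neighborhood that shrinks over time) can be sketched as follows. The 5×5 grid, the Gaussian neighborhood, the linear decay schedules, and the function names `train_som` and `best_matching_unit` are assumptions made for this sketch, not details given in the notes.

```python
import numpy as np

def train_som(data, grid_shape=(5, 5), epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    weights = rng.random((rows, cols, data.shape[1]))   # one weight vector per element
    # grid coordinates of every processing element, shape (rows, cols, 2)
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                  indexing="ij"), axis=-1)
    total = epochs * len(data)
    t = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            frac = t / total
            lr = lr0 * (1.0 - frac)                     # learning rate decays over time
            sigma = sigma0 * (1.0 - frac) + 1e-3        # neighborhood shrinks over time
            # competition: the winner has the smallest Euclidean distance to x
            dists = np.linalg.norm(weights - x, axis=-1)
            winner = np.unravel_index(np.argmin(dists), dists.shape)
            # the winner and its grid neighbors are moved towards x
            grid_dist2 = np.sum((coords - np.array(winner)) ** 2, axis=-1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights += lr * h[..., None] * (x - weights)
            t += 1
    return weights

def best_matching_unit(weights, x):
    dists = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dists), dists.shape)

# Assumed toy data: two well-separated clusters in the unit square.
rng = np.random.default_rng(1)
cluster_a = rng.normal(0.2, 0.03, size=(30, 2))
cluster_b = rng.normal(0.8, 0.03, size=(30, 2))
data = np.clip(np.vstack([cluster_a, cluster_b]), 0.0, 1.0)

som = train_som(data)
bmu_a = best_matching_unit(som, np.array([0.2, 0.2]))
bmu_b = best_matching_unit(som, np.array([0.8, 0.8]))
print(bmu_a, bmu_b)
```

Towards the end of training the neighborhood is so narrow that effectively only the winner is updated, which corresponds to the fine-tuning phase described above.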


Learning Vector Quantization (LVQ)

• This algorithm stands at the intersection of clustering and classification, offering a unique approach to solving multi-class classification problems.
• Architecture


• The LVQ network is a two-layered network. The first layer is called the competitive layer and the second layer is called the linear layer. The layers are named after the activation function used in each of them. The LVQ network is illustrated in the diagram.
• Notation used in this diagram:
  R is the size of the input vector
  W is the weight matrix
  S is the number of neurons
  n is the net input to the activation function
  a is the net output of the activation function
  All the superscript numbers denote the neural network layer.
  Example: W¹ is the weight matrix of the first layer and W² is the weight matrix of the second layer.
• Activation Functions
• Since LVQ is a two-layered network, it has two activation functions, one for each layer.
1. Competitive
• The main purpose of the competitive activation function is to identify the “winning” neuron or prototype vector that best matches the input data.
2. Linear
• The linear activation function is very straightforward: the output of the layer is the same as the net input to the layer.


• LVQ serves well for simpler classification tasks with moderate-sized datasets and distinct class separations. However, its constraints should be taken into account when tackling more intricate scenarios.
• Advantages of Using LVQ:
1. Ease of Understanding: LVQ’s straightforward nature makes it accessible and suitable for those new to machine learning.
2. Clear Interpretation: By assigning labels to prototypes, LVQ offers insights into the reasoning behind classification decisions.
3. Partial Labeling Support: LVQ can handle situations where only a portion of the data is labeled, enhancing its applicability.


Disadvantages of the LVQ Algorithm:
1. Sensitivity to Starting Points: LVQ’s performance can vary based on where
prototypes are initially placed, affecting results.
2. Complex Boundary Limitation: When dealing with intricate decision
boundaries, LVQ might struggle to accurately model them using existing
prototypes.
3. Susceptibility to Noise: Noise within training data may misposition prototypes,
leading to compromised classification quality.
4. Scalability Challenges: Managing an adequate number of prototypes
becomes demanding as the number of classes or features increases.
5. Bias towards Dominant Classes: In cases of imbalanced training data, LVQ
may exhibit a bias towards the more prevalent classes.
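To make the prototype idea concrete, here is a minimal sketch of the basic LVQ1 update rule, which the notes do not spell out: the winning prototype (found by the competitive layer) moves towards the input if their labels agree and away from it otherwise. The learning rate, the toy data, and the function names `train_lvq1` and `predict_lvq` are assumptions for illustration.

```python
import numpy as np

def train_lvq1(X, y, prototypes, proto_labels, alpha=0.1, epochs=30, seed=0):
    rng = np.random.default_rng(seed)
    P = prototypes.astype(float).copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            # competitive layer: the prototype closest to the input wins
            winner = np.argmin(np.linalg.norm(P - X[i], axis=1))
            # attract the winner if its label matches, repel it otherwise
            direction = 1.0 if proto_labels[winner] == y[i] else -1.0
            P[winner] += direction * alpha * (X[i] - P[winner])
    return P

def predict_lvq(X, P, proto_labels):
    d = np.linalg.norm(X[:, None, :] - P[None, :, :], axis=2)
    return proto_labels[np.argmin(d, axis=1)]   # label of the nearest prototype

# Assumed toy data: two distinct, well-separated classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(3.0, 0.3, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
proto_labels = np.array([0, 1])
init = np.vstack([X[0], X[20]])     # one initial prototype per class

P = train_lvq1(X, y, init, proto_labels)
accuracy = np.mean(predict_lvq(X, P, proto_labels) == y)
print(P, accuracy)
```

The initial prototypes are taken from the data here; as the first disadvantage above notes, a different starting placement can change the result.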

COUNTER-PROPAGATION (CPN) NETWORK

• Data compression reduces the amount of data that has to be sent or stored, usually with the possibility of its full reproduction (decompression). Image data compression is used to encode large amounts of image data for transmission over limited-capacity channels. Neural network algorithms have been developed for data compression, and in some cases they have been reported to outperform classical techniques.
• For efficient compression, the image data are passed through a network that produces binary vectors as their compressed version. After decompression, the input image data and the output data are expected to be very close. The compression ratio depends greatly on the tolerated amount of error. The algorithm consists of two major steps, namely network training (or learning) and processing (data compression and decompression).


• Assuming that each pixel has a gray level between 0 and 255 (0 means absolutely dark and 255 means absolutely white), a pixel can be stored with one byte (8 bits) of information. The hidden layer is composed of $q$ neurons ($q \ll n$, e.g. $q = 16$) and realizes the data compression.
• In the above-mentioned example, we get a reduction factor of 4, since $n = 64$ input elements are represented by $q = 16$ hidden values. The input layer and the hidden layer can be treated as a transmitter. The output layer again contains $n$ neurons, and decompression takes place there, reproducing the original 64-element pattern samples. The output layer can be treated as a receiver. The most popular method for training such a network is the back-propagation algorithm.
• It has been shown that a neural network called the counter-propagation network can, for some applications, perform even better than a back-propagation network. The architecture of the counter-propagation network is a combination of the self-organizing map of Kohonen and the outstar structure of Grossberg. It has two layers, similar to feedforward networks, but it uses a different learning strategy.
• The counter-propagation (CPN) network is composed of two feedforward layers: the Kohonen layer and the Grossberg layer. The number of neurons in these layers can differ. The layers are fully interconnected: every Kohonen neuron is connected to each Grossberg neuron. The Kohonen layer is connected to the network input and operates under the winner-takes-all (WTA) rule. CPN is useful in pattern mapping and association, data compression, and classification.


• For the normalized input vector $X = [x_1, x_2, \ldots, x_k]$, each Kohonen neuron $k_i$ obtains at its input the weighted sum

$$NET_i = \sum_{j=1}^{k} w_{ij} x_j$$

• for $i = 1, 2, \ldots, N$, where $N$ is the number of Kohonen neurons, $w_{ij}$ is the weight between the $j$th input node and the $i$th Kohonen neuron, and $x_j$ is the normalized $j$th component of the real input.


• The WTA winner is the neuron with the maximal value of $NET_i$. This neuron generates an output signal equal to 1, blocks the rest of the neurons, and activates the Grossberg neurons via the weights $v_{ij}$. Each Grossberg neuron $G_i$ obtains the weighted sum of the Kohonen outputs through these weights; because only the winner outputs 1, this sum reduces to the weight that connects the winning Kohonen neuron to that Grossberg neuron.


• The compression method: the method follows the conventional image processing scheme built on an orthogonal basis. Large pictorial data are decomposed into frames of subimages, which are then analyzed to extract frequencies and identify pictorial features in order to build a mapping table that can be used to reconstruct the image from the coordinate information.
• When a CPN is used for this purpose, the Kohonen network identifies the pattern class that each subimage belongs to and generates a class index given by the winning node. The indices are generated in the same sequence in which the subimages are fed in. The outstar network approximates the class vectors that represent these classes. The generated class index sequence and the weight matrix of the outstar layer are used to restore the image.
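The compression scheme just described can be sketched with a simplified, forward-only counter-propagation network. The Kohonen layer picks the winner with the largest weighted sum $NET_i$ on normalized inputs, and the outstar (Grossberg) weights of the winner learn a class vector. The learning rates `alpha` and `beta`, the initialization from data samples, the 4-element toy blocks, and all function names are assumptions made for this illustration, and the blocks are assumed to be nonzero so that they can be normalized.

```python
import numpy as np

def normalize(blocks):
    return blocks / np.linalg.norm(blocks, axis=1, keepdims=True)

def train_cpn(blocks, n_kohonen=2, alpha=0.3, beta=0.3, epochs=50, seed=0):
    rng = np.random.default_rng(seed)
    unit = normalize(blocks)
    # Kohonen weights start from randomly chosen normalized samples (assumption)
    W = unit[rng.choice(len(unit), n_kohonen, replace=False)].copy()
    V = np.zeros((n_kohonen, blocks.shape[1]))      # outstar (Grossberg) weights
    for _ in range(epochs):
        for i in rng.permutation(len(blocks)):
            win = np.argmax(W @ unit[i])            # WTA: largest NET_i wins
            W[win] += alpha * (unit[i] - W[win])    # move the winner towards the input
            W[win] /= np.linalg.norm(W[win])
            V[win] += beta * (blocks[i] - V[win])   # outstar learns the class vector
    return W, V

def compress(blocks, W):
    """Return the class index (winning node) for every subimage, in input order."""
    return np.argmax(normalize(blocks) @ W.T, axis=1)

def decompress(indices, V):
    """Restore each subimage from its class index and the outstar weight matrix."""
    return V[indices]

# Assumed toy "subimages": noisy copies of two 4-pixel gray-level patterns.
rng = np.random.default_rng(1)
pattern_a = np.array([200.0, 200.0, 10.0, 10.0])
pattern_b = np.array([10.0, 10.0, 200.0, 200.0])
blocks = np.vstack([pattern_a + rng.normal(0, 2, (20, 4)),
                    pattern_b + rng.normal(0, 2, (20, 4))])

W, V = train_cpn(blocks)
codes = compress(blocks, W)
restored = decompress(codes, V)
print(codes, np.mean(np.abs(restored - blocks)))
```

Each subimage is transmitted as a single class index, and the receiver only needs the index sequence plus the outstar weight matrix, at the cost of a reconstruction error that depends on how well the class vectors represent the subimages.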

Adaptive Resonance Theory

• Adaptive Resonance Theory (ART) was introduced as a hypothesis about human cognitive information processing. The hypothesis has led to neural models for pattern recognition and unsupervised learning. ART systems have been used to explain different types of cognitive and brain data.
• Adaptive Resonance Theory addresses the stability-plasticity dilemma of a system (stability refers to retaining what has been learned, and plasticity refers to remaining flexible enough to gain new information): how can learning proceed in response to significant input patterns without, at the same time, losing stability for irrelevant patterns?


• In other words, the stability-plasticity dilemma is concerned with how a system can adapt to new data while keeping what was learned before. For this task, a feedback mechanism is included among the layers of the ART neural network. In this network, the data, in the form of processing element outputs, are reflected back and forth among the layers. If an appropriate pattern builds up, resonance is reached, and adaptation can occur during this period.


• The formal analysis of how to overcome the learning instability exhibited by a competitive learning model led to the presentation of an expanded theory, called adaptive resonance theory (ART). This formal investigation indicated that a specific type of top-down learned feedback and matching mechanism could significantly overcome the instability issue.

ART1 Implementation process:

• ART1 is a self-organizing neural network whose input and output neurons are mutually coupled by bottom-up and top-down adaptive weights that perform recognition. To start, the system is first trained as per adaptive resonance theory by feeding reference pattern data, in the form of 5×5 matrices, into the neurons for clustering within the output neurons.
• Next, the maximum number of nodes in L2 is defined, followed by the vigilance parameter. The input pattern registers itself as short-term memory activity over a field of nodes L1. Converging and diverging pathways from L1 to the coding field L2, each weighted by an adaptive long-term memory trace, transform this activity into a net signal vector T. Internal competitive dynamics at L2 further transform T, creating a compressed code, or content-addressable memory. With strong competition, activation is concentrated at the L2 node that receives the maximal L1 → L2 signal. The process is divided into four phases: comparison, recognition, search, and learning.
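The four phases named above can be illustrated with a heavily simplified, fast-learning ART1 clustering routine for binary patterns. This is a sketch under assumptions, not the implementation the notes refer to: the choice function with parameter `beta`, the vigilance value, the short 6-element patterns (instead of 5×5 matrices), and the function name `art1_cluster` are all chosen for illustration.

```python
import numpy as np

def art1_cluster(patterns, vigilance=0.6, max_categories=10, beta=0.5):
    prototypes = []   # top-down weights: one binary vector per committed L2 node
    labels = []
    for p in patterns:
        p = np.asarray(p, dtype=bool)
        # recognition: rank the L2 nodes by their bottom-up signal
        scores = [np.sum(p & w) / (beta + np.sum(w)) for w in prototypes]
        chosen = None
        for j in np.argsort(scores)[::-1]:            # search, best candidate first
            # comparison: fraction of the input matched by the node's prototype
            match = np.sum(p & prototypes[j]) / max(np.sum(p), 1)
            if match >= vigilance:                    # resonance
                prototypes[j] = p & prototypes[j]     # learning (fast learning)
                chosen = int(j)
                break
        if chosen is None:
            if len(prototypes) >= max_categories:     # no free L2 node is left
                labels.append(-1)
                continue
            prototypes.append(p.copy())               # commit a new L2 node
            chosen = len(prototypes) - 1
        labels.append(chosen)
    return labels, prototypes

# Assumed toy patterns (binary vectors of length 6).
patterns = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
labels, prototypes = art1_cluster(patterns)
print(labels)
```

A higher vigilance value makes the comparison stricter and therefore tends to create more, finer clusters, while `max_categories` plays the role of the maximum number of L2 nodes.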
Advantages of Adaptive Resonance Theory (ART):

• It can be combined with other techniques to give more precise outcomes.
• It is designed to remain stable when forming clusters, so that learning new patterns does not erase what was learned before.
• It can be used in different fields such as face recognition, embedded systems, robotics, target recognition, medical diagnosis, and signature verification.


Applications of ART:
• ART stands for Adaptive Resonance Theory. ART neural networks, used for fast, stable learning and prediction, have been applied in different areas. The applications include target recognition, face recognition, medical diagnosis, signature verification, and mobile robot control.


Target recognition:
• A fuzzy ARTMAP neural network can be used for automatic classification of targets based on their radar range profiles. Tests on synthetic data show that fuzzy ARTMAP can result in substantial savings in memory requirements compared to k-nearest-neighbor (kNN) classifiers.
Medical diagnosis:
• Medical databases present many of the challenges found in general information management settings, where speed, usability, efficiency, and accuracy are the prime concerns.


Signature verification:

• Automatic signature verification is a well-known and active area of research with various applications such as bank check confirmation and ATM access. The network is trained using ART1, which takes global features as its input vector, and the verification and recognition phase uses a two-step process. In the first step, the input vector is matched against the stored reference vector, which was used as the training set, and in the second step, cluster formation takes place.
Mobile robot control:

• Nowadays, we encounter a wide range of robotic devices. Their software, referred to as artificial intelligence, is still an active field of research. The human brain is an interesting model for such an intelligent system. Inspired by the structure of the human brain, the artificial neural network emerged.
• Similar to the brain, an artificial neural network contains numerous simple computational units, neurons, that are interconnected to allow signals to be transferred from neuron to neuron. Artificial neural networks are used to solve different problems, with good outcomes compared to other decision algorithms.