AI17-Neural Networks

Artificial Neural Networks (ANNs) are computational models inspired by biological neural systems, designed to understand intelligent behavior through interconnected units. Learning in ANNs can be achieved through algorithms like perceptron and backpropagation, allowing for the representation of complex functions and patterns. The structure of neural networks can vary, including single-layer and multi-layer configurations, with considerations for overfitting and optimal architecture during the learning process.


Artificial Neural Networks

Neural Networks
• Analogy to biological neural systems
• Attempt to understand natural biological systems through computational modeling
• Intelligent behavior as an “emergent” property of large number of simple units rather
than from explicitly encoded symbolic rules and algorithms
• A neural network is just a collection of units connected together; the properties of the
network are determined by its topology and the properties of the “neurons”
• Researchers in AI and statistics became interested in the more abstract properties of
neural networks, such as their ability to perform distributed computation, to tolerate
noisy inputs, and to learn
• Hence they aimed to create artificial neural networks. (Other names include
connectionism, parallel distributed processing, and neural computation.)
Real Neuron

• A neuron is a cell in the brain whose principal function is the collection, processing, and dissemination of electrical signals
• Brain's information-processing capacity is from networks of such neurons
Neural Network Learning
• Learning approach based on modeling adaptation in biological neural systems
• Perceptron: Initial algorithm for learning simple neural networks (single layer)
developed in the 1950’s.
• Backpropagation: More complex algorithm for learning multi-layer neural
networks developed in the 1980’s.
Units in neural networks
• Neural networks are composed of nodes or units connected by directed links
• A link from unit i to unit j serves to propagate the activation ai from i to j
• Each link has a numeric weight, Wi,j associated with it, which determines the strength and sign of
the connection
• Each unit j first computes a weighted sum of its inputs (including a bias weight w0,j on a fixed bias input a0):

inj = Σi wi,j ai

• an activation function g is then applied to this sum to derive the output:

aj = g(inj) = g(Σi wi,j ai)


• The activation function g is typically either a hard threshold, in which case the
unit is called a perceptron, or a logistic function, in which case the term sigmoid
perceptron is sometimes used.
• Both of these nonlinear activation functions ensure the important property that
the entire network of units can represent a nonlinear function.
• Logistic activation function has the added advantage of being differentiable.
• The activation function g is designed to meet two desiderata:
• we want the unit to be "active" (near +1) when the "right" inputs are given, and "inactive"
(near 0) when the "wrong" inputs are given
• the activation needs to be nonlinear; otherwise the entire neural network collapses into a
simple linear function
• the bias weight W0,j sets the actual threshold for the unit, in the sense that the unit is
activated when the weighted sum of "real" inputs (i.e., excluding the bias input) exceeds W0,j
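As a minimal sketch of a single unit's computation (weighted sum followed by an activation function), assuming a logistic activation and a fixed bias input of −1 so that the bias weight acts as the threshold (the weights below are illustrative, not from the slides):

```python
import math

def logistic(x):
    # Logistic (sigmoid) activation: differentiable, output in (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def unit_output(w, inputs):
    # w[0] is the bias weight W0,j, paired with a fixed bias input a0 = -1,
    # so the unit is "active" when the weighted sum of real inputs exceeds w[0]
    in_j = -w[0] + sum(wi * ai for wi, ai in zip(w[1:], inputs))
    return logistic(in_j)

# With threshold 1.5 and unit weights, the unit acts like AND on 0/1 inputs
print(unit_output([1.5, 1.0, 1.0], [1, 1]))  # > 0.5 ("active")
print(unit_output([1.5, 1.0, 1.0], [0, 1]))  # < 0.5 ("inactive")
```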
Network structures
• acyclic or feed-forward networks
• has connections only in one direction
• Every node receives input from “upstream” nodes and delivers output to “downstream”
nodes; there are no loops
• represents a function of its current input
• it has no internal state other than the weights themselves
• cyclic or recurrent networks
• feeds its outputs back into its own inputs
• means that the activation levels of the network form a dynamical system that may reach
a stable state or exhibit oscillations or even chaotic behavior
• the response of the network to a given input depends on its initial state, which may
depend on previous inputs
• can support short-term memory
• Feed-forward networks are usually arranged in layers, such that each unit receives
input only from units in the immediately preceding layer.
• single layer networks, which have no hidden units
• multilayer networks, which have one or more layers of hidden units that are not directly connected to
the outputs of the network
• neural networks can be used in cases where multiple outputs are appropriate
• A neural network can be used for classification or regression
• For Boolean classification with continuous outputs (e.g., with sigmoid units), it is
traditional to have a single output unit, with a value over 0.5 interpreted as one
class and a value below 0.5 as the other
• For k-way classification, one could divide the single output unit's range into k
portions, but it is more common to have k separate output units, with the value of
each one representing the relative likelihood of that class given the current input
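With k separate output units, the predicted class is simply the most active unit. A trivial sketch (the function name is mine):

```python
def classify_k_way(outputs):
    # outputs: activations of the k output units; the most active unit wins
    return max(range(len(outputs)), key=lambda k: outputs[k])

print(classify_k_way([0.2, 0.7, 0.1]))  # class 1 has the highest activation
```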
Single layer feed-forward neural networks

a3 = g(w0,3 + w1,3 a1 + w2,3 a2)
   = g(w0,3 + w1,3 x1 + w2,3 x2)
a4 = g(w0,4 + w1,4 a1 + w2,4 a2)
   = g(w0,4 + w1,4 x1 + w2,4 x2)

A simple two-input, two-output perceptron network.
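The two equations above can be sketched directly in Python (logistic activation; the weight values are illustrative, not from the slides):

```python
import math

def g(x):
    # Logistic activation
    return 1.0 / (1.0 + math.exp(-x))

def single_layer(x1, x2, w):
    # w maps (source, destination) unit indices to weights; source 0 is the bias term
    a3 = g(w[(0, 3)] + w[(1, 3)] * x1 + w[(2, 3)] * x2)
    a4 = g(w[(0, 4)] + w[(1, 4)] * x1 + w[(2, 4)] * x2)
    return a3, a4

# Illustrative weights (not from the slides)
w = {(0, 3): -0.5, (1, 3): 1.0, (2, 3): 1.0,
     (0, 4):  0.5, (1, 4): -1.0, (2, 4): 1.0}
a3, a4 = single_layer(1.0, 0.0, w)
```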
Multi-layer feed-forward neural networks

A neural network with two inputs, one hidden layer of two units, and one output unit.

a5 = g(w0,5 + w3,5 a3 + w4,5 a4)
   = g(w0,5 + w3,5 g(w0,3 + w1,3 a1 + w2,3 a2) + w4,5 g(w0,4 + w1,4 a1 + w2,4 a2))
   = g(w0,5 + w3,5 g(w0,3 + w1,3 x1 + w2,3 x2) + w4,5 g(w0,4 + w1,4 x1 + w2,4 x2))
Multi-layer feed-forward neural networks

A neural network with two inputs, one hidden layer of two units, and one output layer of two units.

a5 = g(w0,5 + w3,5 a3 + w4,5 a4)
   = g(w0,5 + w3,5 g(w0,3 + w1,3 a1 + w2,3 a2) + w4,5 g(w0,4 + w1,4 a1 + w2,4 a2))
   = g(w0,5 + w3,5 g(w0,3 + w1,3 x1 + w2,3 x2) + w4,5 g(w0,4 + w1,4 x1 + w2,4 x2))
a6 = g(w0,6 + w3,6 a3 + w4,6 a4)
   = g(w0,6 + w3,6 g(w0,3 + w1,3 a1 + w2,3 a2) + w4,6 g(w0,4 + w1,4 a1 + w2,4 a2))
   = g(w0,6 + w3,6 g(w0,3 + w1,3 x1 + w2,3 x2) + w4,6 g(w0,4 + w1,4 x1 + w2,4 x2))
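The nested expressions above are just a forward pass through the hidden layer and then the output layer. A minimal sketch, assuming a logistic activation and made-up weights:

```python
import math

def g(x):
    # Logistic activation
    return 1.0 / (1.0 + math.exp(-x))

def mlp_forward(x1, x2, w):
    # w maps (source, destination) unit indices to weights; source 0 is the bias term
    a3 = g(w[(0, 3)] + w[(1, 3)] * x1 + w[(2, 3)] * x2)   # hidden unit 3
    a4 = g(w[(0, 4)] + w[(1, 4)] * x1 + w[(2, 4)] * x2)   # hidden unit 4
    a5 = g(w[(0, 5)] + w[(3, 5)] * a3 + w[(4, 5)] * a4)   # output unit 5
    a6 = g(w[(0, 6)] + w[(3, 6)] * a3 + w[(4, 6)] * a4)   # output unit 6
    return a5, a6

# Illustrative weights (not from the slides)
w = {(0, 3): 0.0, (1, 3): 1.0, (2, 3): 1.0,
     (0, 4): 0.0, (1, 4): 1.0, (2, 4): -1.0,
     (0, 5): 0.0, (3, 5): 1.0, (4, 5): 1.0,
     (0, 6): 0.0, (3, 6): -1.0, (4, 6): 1.0}
a5, a6 = mlp_forward(1.0, 0.0, w)
```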
Single layer feed-forward neural networks
(perceptrons)
• A network with all the inputs connected directly to the outputs is called a single-
layer neural network , or a perceptron network.
• Each output unit is independent of the others: each weight affects only one
of the outputs
• With a threshold activation function, we can view the perceptron as representing
a Boolean function
• it can represent some quite "complex" Boolean functions (e.g., the majority function) very compactly,
but cannot represent others (e.g., XOR, which is not linearly separable)

• defines a hyperplane in the input space, so the perceptron returns 1 if and only if the input is
on one side of that hyperplane
• depending on the type of activation function used, the training process will be
either the perceptron learning rule or the gradient descent rule for logistic
regression
• linearly separable functions constitute just a small fraction of all Boolean
functions
• Each cycle through the examples is called an epoch.
• Epochs are repeated until some stopping criterion is reached-typically, that the
weight changes have become very small.
Re-visiting weight updates
• Perceptron learning rule (threshold activation):

wi ← wi + α (y − hw(x)) × xi

• Sigmoid perceptron learning rule (logistic activation; the extra factor is the derivative g′(in)):

wi ← wi + α (y − hw(x)) × g′(in) × xi

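The perceptron learning rule, wi ← wi + α (y − h(x)) × xi, can be sketched as a simple training loop (illustrative code, not from the slides):

```python
def threshold(x):
    # Hard threshold activation: "active" (1) when the weighted sum is non-negative
    return 1 if x >= 0 else 0

def train_perceptron(examples, alpha=1, epochs=50):
    # examples: list of (inputs, target); a fixed bias input of 1 is prepended
    n = len(examples[0][0])
    w = [0] * (n + 1)
    for _ in range(epochs):           # each pass over the examples is one epoch
        for x, y in examples:
            xb = [1] + list(x)
            h = threshold(sum(wi * xi for wi, xi in zip(w, xb)))
            # Perceptron learning rule: w_i <- w_i + alpha * (y - h) * x_i
            w = [wi + alpha * (y - h) * xi for wi, xi in zip(w, xb)]
    return w

# Logical AND is linearly separable, so the rule converges on it
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(data)
```

Here the stopping criterion is simply a fixed number of epochs; in practice one would stop when the weight changes become very small, as the slides note.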
Expressiveness of MLPs
• The advantage of adding hidden layers is that it enlarges the space of hypotheses
that the network can represent
• With more hidden units, we can produce more bumps of different sizes in more
places
• With a single, sufficiently large hidden layer, it is possible to represent any
continuous function of the inputs with arbitrary accuracy
• With two layers, even discontinuous functions can be represented
• Unfortunately, for any particular network structure, it is harder to characterize
exactly which functions can be represented and which ones cannot.
• The problem of choosing the right number of hidden units in advance is still not
well understood
Learning in multilayer networks
• One minor complication arises in multilayer networks: interactions among the
learning problems when the network has multiple outputs.
• In such cases, we should think of the network as implementing a vector function
hw rather than a scalar function; for example, the network returns a vector [a5, a6].
• the target output will be a vector y.
• Whereas a perceptron network decomposes into m separate learning problems for an m-
output problem, this decomposition fails in a multilayer network.
• For example, both a5 and a6 depend on all of the input-layer weights, so updates to those
weights will depend on errors in both a5 and a6.
• this dependency is very simple in the case of any loss function that is additive
across the components of the error vector y − hw(x).
• For the L2 loss, we have, for any weight w,

∂/∂w Loss(w) = ∂/∂w |y − hw(x)|² = ∂/∂w Σk (yk − ak)² = Σk ∂/∂w (yk − ak)²

where the index k ranges over nodes in the output layer


• The major complication comes from the addition of hidden layers to the network.
• Whereas the error y − hw at the output layer is clear, the error at the hidden
layers seems mysterious because the training data do not say what value the
hidden nodes should have.
• It turns out that we can back-propagate the error from the output layer to the
hidden layers. The back-propagation process emerges directly from a derivation
of the overall error gradient.
• Idea is that hidden node j is "responsible" for some fraction of the error in each
of the output nodes to which it connects
• Δk values are divided according to the strength of the connection between the
hidden node and the output node and are propagated back to provide the Δj
values for the hidden layer
• We have multiple output units, so let Errk be the kth component of the error
vector y − hw.
• Let us define a modified error Δk = Errk × g′(ink), so that the weight-update rule
becomes

wj,k ← wj,k + α × aj × Δk

• The propagation rule for the Δ values is the following:

Δj = g′(inj) Σk wj,k Δk

• the weight-update rule for the weights between the inputs and the hidden layer
is essentially identical to the update rule for the output layer:

wi,j ← wi,j + α × ai × Δj

• The back-propagation process can be summarized as follows:
• The gradient of the loss with respect to weights connecting the hidden layer to the
output layer will be zero except for weights wj,k that connect to the kth output
unit. For those weights, we have

∂Lossk/∂wj,k = −2 (yk − ak) × g′(ink) × aj = −2 aj Δk

(the constant factor 2 can be absorbed into the learning rate α)
• To obtain the gradient with respect to the wi,j weights connecting the input layer
to the hidden layer, we have to expand out the activations aj and reapply the
chain rule.
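As a minimal sketch of these update rules for the two-input, two-hidden-unit, two-output network (assuming logistic activations; the weight layout, weight values, and variable names are my own, not from the slides):

```python
import math

def g(x):
    # Logistic activation
    return 1.0 / (1.0 + math.exp(-x))

def g_prime(in_x):
    # Derivative of the logistic function: g'(in) = g(in) * (1 - g(in))
    gx = g(in_x)
    return gx * (1.0 - gx)

def backprop_step(x, y, wh, wo, alpha=0.5):
    # wh[j]: weights into hidden unit j, [w0, w1, w2] (index 0 is the bias)
    # wo[k]: weights into output unit k, [w0, w3, w4] (index 0 is the bias)
    xb = [1.0] + list(x)                       # prepend fixed bias input
    in_h = [sum(wi * ai for wi, ai in zip(row, xb)) for row in wh]
    a_h = [1.0] + [g(v) for v in in_h]         # bias activation for output layer
    in_o = [sum(wi * ai for wi, ai in zip(row, a_h)) for row in wo]
    a_o = [g(v) for v in in_o]

    # Output layer: Delta_k = Err_k * g'(in_k)
    delta_o = [(yk - ak) * g_prime(ik) for yk, ak, ik in zip(y, a_o, in_o)]
    # Hidden layer: Delta_j = g'(in_j) * sum_k w_j,k * Delta_k
    delta_h = [g_prime(in_h[j]) *
               sum(wo[k][j + 1] * delta_o[k] for k in range(len(wo)))
               for j in range(len(wh))]

    # Weight updates: w <- w + alpha * activation * Delta
    for k, row in enumerate(wo):
        for j in range(len(row)):
            row[j] += alpha * a_h[j] * delta_o[k]
    for j, row in enumerate(wh):
        for i in range(len(row)):
            row[i] += alpha * xb[i] * delta_h[j]
    return a_o                                 # outputs before this update

# Repeated steps on one (made-up) example drive the outputs toward the target
wh = [[0.1, 0.2, -0.1], [-0.2, 0.1, 0.3]]
wo = [[0.05, 0.2, -0.3], [-0.1, 0.4, 0.1]]
for _ in range(100):
    out = backprop_step((1.0, 0.0), (1.0, 0.0), wh, wo)
```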
Learning neural network structures
• How to find the best network structure?
• neural networks are subject to overfitting when there are too many parameters in the model
• For fully connected networks, the only choices to be made concern the number of hidden
layers and their sizes
• try several and keep the best
• cross-validation techniques are needed
• if the network is not fully connected, we need an effective search method through the very
large space of possible connection topologies
• the optimal brain damage algorithm begins with a fully connected network and removes connections
from it
• After the network is trained for the first time, an information-theoretic approach identifies an
optimal selection of connections that can be dropped
• The network is then retrained, and if its performance has not decreased then the process is repeated
• It is also possible to remove units that are not contributing much to the result
Learning neural network structures …
• Several algorithms have been proposed for growing a larger network from a
smaller one.
• Tiling algorithm
• The idea is to start with a single unit that does its best to produce the correct output on as
many of the training examples as possible.
• Subsequent units are added to take care of the examples that the first unit got wrong.
• The algorithm adds only as many units as are needed to cover all the examples.
