Unit – III
A Survey of Neural Network Models
Dr L K Sharma, Rungta College of Engineering and Technology, Bhilai (CG)
Single Layer Perceptron
The perceptron, the first adaptive network architecture, was invented
by Frank Rosenblatt in 1957.
It can be used for the classification of patterns that are linearly
separable.
Fundamentally, it consists of a single neuron with adjustable
weights and bias.
This algorithm is suitable for binary or bipolar input vectors with
bipolar targets.
Architecture of Single Layer Perceptron
Algorithm
Step 1: Initialize all weights and the bias to zero and set the learning rate α (0 < α ≤ 1).
- wi = 0 for i = 1 to n, where n is the number of input neurons, and b = 0.
Step 2: While the stopping condition is false, do Steps 3-7.
Step 3: For each input training pair s:t, do Steps 4-6.
Step 4: Set the activations of the input units to the input vector.
- xi = si (i = 1 to n)
Step 5: Compute the output unit response:
- y-in = b + ∑i xi wi
The activation function used is the bipolar step function with threshold θ:
- y = 1 if y-in > θ; y = 0 if –θ ≤ y-in ≤ θ; y = –1 if y-in < –θ
Algorithm…
Step 6: The weights and bias are updated if the target is not equal to
the output response.
If t ≠ y and the value of xi is not zero
- wi(new) = wi(old) + α xi t
- b(new) = b(old) + α t
Else
- wi(new) = wi(old)
- b(new) = b(old)
Step 7: Test for the stopping condition (e.g., no weight changes occurred in Step 6).
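As a concrete illustration (not part of the original slides), the following Python sketch implements Steps 1-7 for the bipolar AND patterns of Problem 1 below; the function names, the choice θ = 0, and the stopping test (no weight change in a full epoch) are my own assumptions.

```python
# Minimal perceptron training sketch; names and theta = 0 are illustrative choices.

def step(y_in, theta=0.0):
    """Bipolar step activation with threshold theta."""
    if y_in > theta:
        return 1
    if y_in < -theta:
        return -1
    return 0

def train_perceptron(samples, alpha=1.0, theta=0.0, max_epochs=100):
    """Train on (input vector, bipolar target) pairs until the weights stop changing."""
    n = len(samples[0][0])
    w = [0.0] * n                        # Step 1: weights start at zero
    b = 0.0                              # ...and so does the bias
    for _ in range(max_epochs):          # Step 2: stopping condition
        changed = False
        for x, t in samples:             # Step 3: each training pair s:t
            y_in = b + sum(xi * wi for xi, wi in zip(x, w))   # Step 5
            y = step(y_in, theta)
            if y != t:                   # Step 6: update only on a wrong response
                for i in range(n):
                    if x[i] != 0:
                        w[i] += alpha * x[i] * t
                b += alpha * t
                changed = True
        if not changed:                  # Step 7: no changes in a full epoch -> stop
            break
    return w, b

# Bipolar AND patterns and targets (Problem 1 below).
and_patterns = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
w, b = train_perceptron(and_patterns)
```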
Problems
1. Apply the perceptron to the training patterns that define the AND
function with bipolar inputs and targets.
2. Apply the perceptron to the training patterns that define the OR function
(inputs and targets).
3. Classify the two-dimensional input patterns (representing letters)
using the perceptron rule (the T-C problem).
4. Classify the two-dimensional input patterns (representing letters)
using the perceptron rule (the I-J problem).
5. Apply the perceptron to the training patterns that define the XOR
function (inputs and targets).
Application Procedure
Step 1: The weights used here are taken from the training
algorithm.
Step 2: For each input vector x to be classified, do Steps 3-4.
Step 3: Set the activations of the input units.
Step 4: Calculate the response of the output unit.
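A matching classification sketch, reusing the hypothetical `step` function and the weights returned by `train_perceptron` above:

```python
# Apply trained weights to new input vectors (Steps 2-4 of the application procedure).
def classify(x, w, b, theta=0.0):
    y_in = b + sum(xi * wi for xi, wi in zip(x, w))   # Step 4: output response
    return step(y_in, theta)

print(classify((1, 1), w, b))    # the trained AND net should answer +1 here
print(classify((-1, 1), w, b))   # ...and -1 here
```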
More Problems
For the following noisy versions of the training patterns, identify the
response of the network by classifying each as correct, incorrect, or
indefinite.
(0 -1 -1) , ( 0 1 -1), (0 0 1), (0 0 -1), (0 1 0), (1 0 1),
( 1 0 -1), ( 1 -1 0), (1 0 0), (1 1 0), (0 -1 0), (1 1 1)
Solution (Hints)
- If x1w1 + x2w2 + x3w3 > 0 then the response is correct
- If x1w1 + x2w2 + x3w3 < 0 then the response is incorrect
- If x1w1 + x2w2 + x3w3 = 0 then the response is indefinite
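A sketch of the hint; the weight values below are placeholders to be replaced by the weights actually obtained from training.

```python
# Segregate noisy patterns by the sign of the net input (weights here are placeholders).
def response(x, w):
    net = sum(xi * wi for xi, wi in zip(x, w))
    if net > 0:
        return "correct"
    if net < 0:
        return "incorrect"
    return "indefinite"

noisy = [(0, -1, -1), (0, 1, -1), (0, 0, 1), (0, 0, -1), (0, 1, 0), (1, 0, 1),
         (1, 0, -1), (1, -1, 0), (1, 0, 0), (1, 1, 0), (0, -1, 0), (1, 1, 1)]
w = (1, 1, 1)   # placeholder; substitute the trained weights w1, w2, w3
for x in noisy:
    print(x, response(x, w))
```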
More Problems…
Using the perceptron learning rule, find the weights required to
perform the following classification: vectors (1 1 1 1) and (-1 1 -1 -1)
are members of the class (target value 1); vectors (1 1 1 -1) and
(1 -1 -1 1) are not members of the class (target value -1).
Use a learning rate of 1 and starting weights of zero.
Using each of the training vectors as input, test the response of
the net.
Least Mean Square
This rule is also referred to as the Delta Learning Rule or the Widrow-Hoff Rule.
The delta rule is valid only for continuous activation functions and in
the supervised training mode.
The learning signal for this rule is called delta.
The adjustment made to a synaptic weight of a neuron is proportional
to the product of the error signal and the input signal of the synapse.
The delta rule can be applied to a single output unit or to several output
units.
Derivation of Delta Rule
The delta rule changes the weights of the connections so as to minimize the
difference between the net input to the output unit, y-in, and the target
value t.
The delta rule is given by
∆wi = α(t – y-in) xi
- where x is the vector of activations of the input units
- y-in is the net input to the output unit
- t is the target value
- α is the learning rate
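A one-pattern sketch of this update for a single output unit; the bias update mirrors the weight update, as in the Adaline algorithm later in this unit, and the names are illustrative.

```python
# One delta-rule update, Delta wi = alpha * (t - y_in) * xi, for a single output unit.
def delta_rule_step(x, t, w, b, alpha):
    y_in = b + sum(xi * wi for xi, wi in zip(x, w))   # net input to the output unit
    err = t - y_in                                    # (t - y_in)
    w = [wi + alpha * err * xi for wi, xi in zip(w, x)]
    b = b + alpha * err
    return w, b
```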
Derivation of Delta Rule…
The mean square error for a particular training pattern is
E = ∑j (tj – y-inj)²
The gradient of E is a Vector consisting of the partial derivatives of E
with respect to each of the weights. The error can be reduced rapidly
by adjusting weight wij
Taking the partial derivative of E with respect to wIJ:
∂E/∂wIJ = ∂/∂wIJ ∑j (tj – y-inj)²
= ∂/∂wIJ (tJ – y-inJ)²
Derivation of Delta Rule…
Since the weight wij influence the error only at output unit yJ
Thus the error will be reduced rapidly depending upon the given
learning by adjusting the weights:
Multilayer Perceptron
Hidden layers of computation nodes
input propagates in a forward direction, on a layer-by-layer basis
- also called a Multilayer Feedforward Network (MLP)
Error back-propagation algorithm
- supervised learning algorithm
- error-correction learning algorithm
- Forward pass
o input vector is applied to input nodes
o its effects propagate through the network layer-by-layer
o with fixed synaptic weights
- backward pass
o synaptic weights are adjusted in accordance with error signal
o the error signal propagates backward, in a layer-by-layer fashion
MLP Distinctive Characteristics
Non-linear activation function: yj = 1 / (1 + exp(–vj)) (a short code sketch follows this list)
- differentiable
- sigmoidal function, logistic function
- nonlinearity prevents reduction to a single-layer perceptron
One or more layers of hidden neurons
- progressively extracting more meaningful features from input
patterns
High degree of connectivity
The nonlinearity and high degree of connectivity make theoretical analysis
difficult
Learning process is hard to visualize
BP is a landmark in NN: computationally efficient training
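A minimal sketch of the logistic activation and its derivative f′(v) = f(v)(1 − f(v)), which the back-propagation algorithm below relies on:

```python
import math

# Binary sigmoid (logistic) activation and its derivative.
def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def sigmoid_prime(v):
    y = sigmoid(v)
    return y * (1.0 - y)   # f'(v) = f(v) * (1 - f(v))
```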
Preliminaries
Function signal
- input signals come in at the input end of the network
- propagates forward to output nodes
Error signal
- originates from output neuron
- propagates backward to input nodes
Two computations in Training
- computation of function signal
- computation of an estimate of gradient vector
o gradient of error surface with respect to the weights
Back Propagation Network (BPN)
Back propagation is a systematic method for training multi-layer
artificial neural networks.
It has a strong, if not highly practical, mathematical foundation.
It is a multilayer feedforward network that uses an extended, gradient-descent-based
delta learning rule.
Back propagation provides a computationally efficient method for
changing the weights in a feedforward network with differentiable
activation function units, so as to learn a training set of input-output pairs.
Architecture
Training Algorithm
The training algorithm of back propagation involves four stages:
1. Initialization of weights
2. Feed forward
3. Back Propagation of errors
4. Update of the weights and biases
Parameters
x: input training vector
x = (x1, x2, …, xn)
t: Output target vector
t = (t1, t2, …, tm)
α : learning rate
δk : error at output unit yk
δj : error at hidden unit zj
yk : output unit k
Algorithm
Step 1: Initialize the weights to small random values.
Step 2: While the stopping condition is false, do Steps 3-10.
Step 3: For each training pair, do Steps 4-9.
Feed Forward
Step 4: Each input unit receives the input signal xi and transmits this
signal to all units in the layer above (the hidden units).
Step 5: Each hidden unit (zj, j = 1,…, p) sums its weighted input signals,
- z-inj = v0j + ∑i xi vij
and applies the activation function: zj = f(z-inj).
Algorithm..
Step 6: Each output unit (yk, k = 1,…, m) sums its weighted input signals,
- y-ink = w0k + ∑j zj wjk
and applies the activation function: yk = f(y-ink).
Algorithm…
Back Propagation of Errors
Step 7: Each output unit (yk, k = 1,…, m) receives a target pattern
corresponding to the input pattern, and its error information term is calculated
as:
- δk = (tk – yk) f′(y-ink)
Step 8: Each hidden unit (zj, j = 1,…, p) sums its delta inputs from the units
in the layer above,
- δ-inj = ∑k δk wjk
- and its error information term is calculated as
- δj = δ-inj f′(z-inj)
Algorithm…
Updating of weight and biases
Step 9: Each output unit (yk, k = 1, …, m) updates its bias and weights
(j = 0, …, p)
The weight correction term is given by
- ∆wjk = αδkzj
And the bias correction term is given by
- ∆w0k = αδk
Therefore
- wjk(new) = wjk(old) + ∆wjk
- w0k(new) = w0k(old) + ∆w0k
Algorithm…
Each hidden unit (zj, j = 1,…, p) updates its bias and weights (i =
0,…, n)
The weight correction term is
- ∆vij = αδjxi
The bias correction term is
- ∆v0j = αδj
Therefore
- vij(new) = vij(old) + ∆vij
- v0j(new) = v0j(old) + ∆v0j
Step 10: Test for the stopping condition.
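A compact NumPy sketch of Steps 4-9 for one training pair and one hidden layer, assuming the binary sigmoid activation; the array layout (v[i, j] = vij, w[j, k] = wjk) and the function name are my own choices, not part of the original slides.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def backprop_step(x, t, v, v0, w, w0, alpha):
    """One feedforward + back-propagation pass for a single training pair.

    x: (n,) input, t: (m,) target, v: (n, p) input-to-hidden weights,
    v0: (p,) hidden biases, w: (p, m) hidden-to-output weights, w0: (m,) output biases.
    """
    # Feed forward (Steps 4-6)
    z_in = v0 + x @ v                      # z-inj = v0j + sum_i xi vij
    z = sigmoid(z_in)
    y_in = w0 + z @ w                      # y-ink = w0k + sum_j zj wjk
    y = sigmoid(y_in)

    # Back propagation of errors (Steps 7-8); f'(.) = f(.)(1 - f(.)) for the sigmoid
    delta_k = (t - y) * y * (1.0 - y)      # delta_k = (tk - yk) f'(y-ink)
    delta_in = w @ delta_k                 # delta-inj = sum_k delta_k wjk
    delta_j = delta_in * z * (1.0 - z)     # delta_j = delta-inj f'(z-inj)

    # Weight and bias updates (Step 9)
    w = w + alpha * np.outer(z, delta_k)   # Delta wjk = alpha delta_k zj
    w0 = w0 + alpha * delta_k
    v = v + alpha * np.outer(x, delta_j)   # Delta vij = alpha delta_j xi
    v0 = v0 + alpha * delta_j
    return v, v0, w, w0, y
```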
Application Algorithm
Step 1: Initialize the weights, taking them from the training algorithm.
Step 2: For each input vector, do Steps 3-5.
Step 3: For i = 1, …, n, set the activation of input unit xi.
Step 4: For j = 1,…, p
- z-inj = v0j + ∑i xi vij
- zj = f(z-inj)
Step 5: For k = 1, …, m
- y-ink = w0k + ∑j zj wjk
- yk = f(y-ink)
Merits of Back Propagation
1. The mathematical formulation presented here can be applied to any
network and does not require any special treatment of the features of
the function to be learnt.
2. The computing time is reduced if the weights chosen are small at the
beginning.
3. Batch updating of the weights is possible, which provides a smoothing
effect on the weight correction terms.
Demerits of Back Propagation
1. The number of learning steps may be high, and the learning
phase involves intensive calculations.
2. The selection of the number of hidden nodes in the network is a
problem. If the number of hidden nodes is small, the function to
be learnt may not be representable, as the capacity of the network is
small. If the number of hidden neurons is increased, the number of
independent variables of the error function also increases and the
computing time grows rapidly.
3. For complex problems it may take days or weeks to train the
network, or it may not train at all. Long training time can result from a
non-optimum step size.
4. The network may get trapped in a local minimum even though there is a
much deeper minimum nearby.
5. The training may sometimes cause temporal instability in the system.
Problem
Find the new weights for the network illustrated in the following figure.
The input pattern is [0.6 0.8 0] and the target output is 0.9. Use a learning
rate of α = 0.3 and the binary sigmoid activation function.
Solution
Step 1: Initialize the weights and biases
- w = [-1 1 2], w0 = [-1]
- v = [ 2 1 0
        1 2 2
        0 3 1 ],  v0 = [0 0 -1]
Step 3: For each training pair
- x = [0.6 0.8 0]
- t = [0.9]
The rest of the solution is worked out on the board.
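As a check, the problem's numbers can be pushed through the hypothetical `backprop_step` sketch given after the training algorithm; the layout v[i, j] = vij is assumed to match how the values are read off the figure.

```python
import numpy as np

x  = np.array([0.6, 0.8, 0.0])
t  = np.array([0.9])
v  = np.array([[2.0, 1.0, 0.0],
               [1.0, 2.0, 2.0],
               [0.0, 3.0, 1.0]])
v0 = np.array([0.0, 0.0, -1.0])
w  = np.array([[-1.0], [1.0], [2.0]])
w0 = np.array([-1.0])

v, v0, w, w0, y = backprop_step(x, t, v, v0, w, w0, alpha=0.3)
print("network output:", y)
print("updated w:", w.ravel(), "updated w0:", w0)
```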
Problem
Apply back-propagation learning to the training patterns that
define the XOR function (inputs and targets).
Important points for the selection of parameters
Initial Weights
- The initial weights influence whether the net reaches a global (or only a local)
minimum of the error and, if so, how rapidly it converges.
- If the initial weights are too large, the initial input signals to each hidden
or output unit will fall in the saturation region, where the derivative of the
sigmoid has a very small value.
- If the initial weights are too small, the net input to a hidden or output unit
will be close to zero, which also causes extremely slow learning.
- For better results, the weights (and biases) are set to random numbers
between -0.5 and 0.5 or between -1 and 1 (see the sketch below).
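A small sketch of this initialization; the layer sizes are hypothetical placeholders.

```python
import numpy as np

n, p, m = 3, 3, 1                          # example layer sizes (placeholders)
rng = np.random.default_rng()

# Weights and biases drawn uniformly from [-0.5, 0.5], as suggested above.
v  = rng.uniform(-0.5, 0.5, size=(n, p))   # input-to-hidden weights
v0 = rng.uniform(-0.5, 0.5, size=p)        # hidden biases
w  = rng.uniform(-0.5, 0.5, size=(p, m))   # hidden-to-output weights
w0 = rng.uniform(-0.5, 0.5, size=m)        # output biases
```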
Important points for the selection of parameters
Number of hidden units
If the activation function can vary with the function, it can be
shown that an n-input, m-output function requires at most 2n+1 hidden
units.
If more hidden layers are present, the calculations of the δ terms are
repeated for each additional layer.
Important points for the selection of parameters…
Selection of learning rate
A high learning rate leads to rapid learning but the weights may
oscillate, while a lower learning rate leads to slower learning. Methods
suggested for adapting the learning rate are:
Start with a high learning rate and steadily decrease it. Changes in the
weight vector must be kept small in order to reduce oscillation or
divergence.
A simple method is to increase the learning rate whenever it improves
performance and to decrease it whenever performance worsens.
Another method is to double the learning rate until the error value
worsens.
Application of Back Propagation
Optical Character recognition
Image Compression
Data Compression
Load forecasting problem in power system area
Control problems
Non-linear simulation
Fault detection problems, etc.
Adaline
It was developed by Widrow and Hoff.
It uses bipolar activations for its input signals and target outputs.
The weights and the bias of the Adaline are adjustable.
The learning rule used is known as the Delta Rule, the Least Mean
Square (LMS) rule, or the Widrow-Hoff rule.
The rule can be applied to a network with a single output unit or with several output units.
Architecture
Training Algorithm
Step 1: Initialize weights (not zero but small random values are used). Set
learning rate α.
Step 2: While the stopping condition is false, do Steps 3-7.
Step 3: For each bipolar training pair s:t perform Steps 4-6.
Step 4: Set activations of input units xi = si for i = 1 to n.
Step 5: Compute the net input to the output unit: y-in = b + ∑i xi wi
Step 6: Update the bias and weights, i = 1 to n:
- wi(new) = wi(old) + α (t – y-in)xi
- b(new) = b(old) + α (t – y-in)
Step 7: Test for stopping condition
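A minimal sketch of this training loop; the stopping test (largest weight change below a tolerance) and the names are illustrative choices, not part of the original slides.

```python
import random

def train_adaline(samples, alpha=0.1, tol=1e-3, max_epochs=1000):
    """Train on (input vector, bipolar target) pairs with the delta (LMS) rule."""
    n = len(samples[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]   # Step 1: small random weights
    b = random.uniform(-0.5, 0.5)
    for _ in range(max_epochs):                          # Step 2: stopping condition
        largest_change = 0.0
        for x, t in samples:                             # Steps 3-4
            y_in = b + sum(xi * wi for xi, wi in zip(x, w))   # Step 5: net input
            err = t - y_in
            for i in range(n):                           # Step 6: delta-rule updates
                w[i] += alpha * err * x[i]
            b += alpha * err
            largest_change = max(largest_change, abs(alpha * err))
        if largest_change < tol:                         # Step 7: changes negligible
            break
    return w, b
```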
Application Algorithm
Step 1: Initialize weights obtained from the training algorithm.
Step 2: For each bipolar input vector x perform Steps 3-5.
Step 3: Set activation of input unit.
Step 4: Calculate the net input to output unit y-in = b + ∑ xi wi
Step 5: Finally apply the activation to obtain the output y.
y = f(y-in) = 1 if y-in ≥ 0; –1 if y-in < 0