Lect 4

The document describes gradient descent learning and backpropagation algorithms for training neural networks. It discusses how gradient descent aims to find the minimum error by computing derivatives of the error function with respect to weights. Batch and incremental training modes are described. Backpropagation is introduced as a method for calculating error gradients for multi-layer networks using generalized delta rule. Sigmoid activation functions and their use in backpropagation are also mentioned.

University of Khartoum
Department of Electronics & Electrical Engineering
Software & Control Engineering

EEE52511: NEURAL NETWORKS & FUZZY SYSTEMS
By: Dr. Hiba Hassan Sayed
Lecture 4
30/1/2023 U of K: Dr. Hiba Hassan 2

GRADIENT DESCENT LEARNING


Gradient Descent Learning in NN

• The gradient is the rate of change of f(x) at a particular value of x.

• Hence, for a function of one variable it is the derivative of f(x) with
respect to x; for a function of several variables it is the vector of
partial derivatives.
• This leads to Gradient Descent Learning, whose aim is to find the
minimum error by computing the derivative of the error function
with respect to the weights and moving the weights in the opposite
direction. It is sometimes called Gradient Descent Minimization.
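
As a minimal sketch of the idea above, the following Python snippet minimizes a one-dimensional function by repeatedly stepping against its derivative. The function f(x) = (x - 3)^2, the step size eta and the helper names are illustrative assumptions, not taken from the lecture.

```python
def gradient_descent(grad, x0, eta=0.1, steps=100):
    """Follow the negative gradient from x0 for a fixed number of steps."""
    x = x0
    for _ in range(steps):
        x = x - eta * grad(x)  # step against the slope
    return x


# Illustrative function (an assumption): f(x) = (x - 3)^2, so f'(x) = 2 (x - 3).
def f_prime(x):
    return 2.0 * (x - 3.0)


x_min = gradient_descent(f_prime, x0=0.0)
print(x_min)  # close to 3, the minimizer of f
```

In a neural network the same step is applied to every weight, with the error function in place of f.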

Finding the minimum of a function: gradient descent


Cont.
• For a target (t) & an actual output (o), the error is given by the
following mean square error cost function,

• Where D is the set of training examples.

• There are 2 modes of gradient descent on this cost function, batch and
incremental; they are stated next.
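
A common form of the cost function described above, written with the same symbols (target t, actual output o, training set D), is sketched below. The factor 1/2 and the subscript d for a single training example are assumptions rather than the lecture's exact notation.

```latex
E(\vec{w}) = \frac{1}{2} \sum_{d \in D} (t_d - o_d)^2
```

Under this form, one mode follows the gradient of the full sum over D, while the other follows the gradient of a single example's term at a time.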

Batch Training
• Batch Training: In batch mode the weights and biases of the
network are updated only after the entire training set has been
applied to the network. The gradients calculated at each training
example are added together to determine the change in the weights
and biases.
• Batch Gradient Descent: In the batch steepest descent training
function the weights and biases are updated in the direction of the
negative gradient of the performance function.
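
The batch mode described above can be sketched for a single linear unit as follows. The data set, the learning rate and the function name are hypothetical; the point is only that the per-example gradients are added together and the weights change once per pass over the training set.

```python
def batch_gradient_descent(examples, eta=0.05, epochs=1000):
    """Train a linear unit o = w*x + b, updating once per pass over the data."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w, grad_b = 0.0, 0.0
        for x, t in examples:
            o = w * x + b
            # Accumulate the gradient of 1/2 (t - o)^2 for every example.
            grad_w += -(t - o) * x
            grad_b += -(t - o)
        # Single update per epoch, in the direction of the negative gradient.
        w -= eta * grad_w
        b -= eta * grad_b
    return w, b


# Hypothetical data generated from t = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = batch_gradient_descent(data)
print(round(w, 3), round(b, 3))
```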

Batch Gradient Descent with Momentum

• This algorithm often provides faster convergence.

• Momentum allows a network to respond not only to the local
gradient, but also to recent trends in the error surface.
• Acting like a low-pass filter, momentum allows the network to ignore
small features in the error surface.
• Without momentum a network may get stuck in a shallow local
minimum, such as shown in the next figure.
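
A minimal sketch of a momentum update, assuming the common form "new step = -eta * gradient + alpha * previous step". The quadratic error surface and the parameter values are illustrative assumptions; on this simple surface there is no local minimum to escape, so the snippet only shows the mechanics of the update.

```python
def momentum_descent(grad, x0, eta=0.1, alpha=0.9, steps=500):
    """Gradient descent where each step also keeps a fraction of the last step."""
    x = x0
    prev_step = 0.0
    for _ in range(steps):
        # New step = local gradient term + alpha * previous step.
        step = -eta * grad(x) + alpha * prev_step
        x += step
        prev_step = step
    return x


# Illustrative error surface (an assumption): E(x) = (x - 3)^2.
x_final = momentum_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_final)  # close to 3
```

Because the previous step is carried forward, a sequence of gradients pointing the same way builds up speed, while small bumps in the surface change the step only slightly.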

Cont.
[Figure: local and global minima; effect of adding momentum]

Incremental Mode Gradient Descent

• When we use the gradient with respect to one training example at
a time, gradient descent becomes the Widrow-Hoff delta rule, which
is given by,

Δw_i = η (t − o) x_i
• Also called the Least Mean Square (LMS) method.

LMS Learning Rule

Mean Square Error:
• Like the perceptron learning rule, the least mean square (LMS) - the delta
rule - algorithm is an example of supervised training, in which the learning
rule is provided with a set of examples of desired network behavior:
{p1, t1}, {p2, t2}, ... , {pQ, tQ}
• We want to minimize the average of the squared errors
between target & actual network output:

mse = (1/Q) Σ_{k=1}^{Q} e(k)^2 = (1/Q) Σ_{k=1}^{Q} (t(k) − a(k))^2

LMS Algorithm/ Widrow-Hoff rule

• The LMS algorithm was presented by Widrow and Hoff, hence, it is
called Widrow-Hoff learning algorithm.
• As seen before, it is based on an approximate steepest descent
procedure.
• Widrow and Hoff decided that they could estimate the mean
square error by using the squared error at each iteration.
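
The LMS idea above can be sketched as follows: the weights are changed after every single example, using that example's error in place of the mean square error. The data, learning rate and function name are hypothetical, and the update is written in the Δw = η (t − o) x form of the delta rule.

```python
def lms_train(examples, eta=0.05, epochs=2000):
    """Widrow-Hoff (LMS) training of a linear unit a = w*p + b."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for p, t in examples:
            e = t - (w * p + b)  # error on this single example
            # Update immediately, using the squared error of this example
            # as the estimate of the mean square error.
            w += eta * e * p
            b += eta * e
    return w, b


# Hypothetical data generated from t = 2p + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
print(lms_train(data))
```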

Comparing Perceptron & Delta Rules

• Perceptron rule
  • Thresholded output.
  • Converges after a finite number of iterations to a hypothesis that perfectly
  classifies the training data, provided the training examples are linearly
  separable.
  • Requires linearly separable data.
• Delta rule
  • Unthresholded output.
  • Converges toward the error minimum, possibly requiring unbounded time,
  but converges regardless of whether the training data are linearly
  separable or not.
  • Also works on data that are not linearly separable.

Adaptive Linear Neuron Network Architecture (ADALINE)

• The ADALINE network is a single layer neural network with multiple
nodes, where each node accepts multiple inputs to generate one output.
• ADALINE networks are similar to the perceptron, but their transfer
function is linear rather than hard-limiting. This allows their outputs to
take on any value, whereas the perceptron output is limited to either 0 or
1.
• Both the ADALINE and the perceptron can only solve linearly separable
problems.
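
The difference between the two transfer functions can be sketched as below. The weights, inputs and bias values are hypothetical; both units compute the same net input and differ only in how it is turned into an output.

```python
def net_input(weights, inputs, bias):
    """Weighted sum of the inputs plus the bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def adaline_output(weights, inputs, bias):
    """Linear transfer function: the output is the net input itself."""
    return net_input(weights, inputs, bias)


def perceptron_output(weights, inputs, bias):
    """Hard-limit transfer function: 1 if the net input is >= 0, else 0."""
    return 1 if net_input(weights, inputs, bias) >= 0 else 0


weights, inputs = [0.5, -0.25], [1.0, 2.0]
print(adaline_output(weights, inputs, 0.1))     # any real value is possible
print(perceptron_output(weights, inputs, 0.1))  # only 0 or 1
```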

Cont.
• An adaptive linear system responds to changes in its environment
as it is operating.
• These networks are often used in error cancellation, signal
processing, and control systems. For example, they are used by
many long distance phone lines for echo cancellation.
• The pioneering work in this field was done by Widrow and Hoff,
who gave the name ADALINE to adaptive linear elements.

The ADALINE Neural Network


Cont.
• Multiple layer ADALINE is called MADALINE.
• The Widrow-Hoff rule can only train single-layer linear networks.
• This is not much of a disadvantage; single-layer linear networks are
just as capable as multilayer linear networks.
• For every multilayer linear network, there is an equivalent single-
layer linear network.
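
The equivalence stated above can be checked numerically: two linear layers applied in sequence give the same output as one layer whose weight matrix is their product. The matrices and input below are hypothetical, and biases are left out to keep the sketch short.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul(A, B):
    """Multiply matrix A by matrix B."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


W1 = [[1, 2], [3, 4]]  # first linear layer (hypothetical weights)
W2 = [[5, 6]]          # second linear layer (hypothetical weights)
x = [7, 8]

two_layer = matvec(W2, matvec(W1, x))  # pass through both layers
W_eq = matmul(W2, W1)                  # equivalent single-layer weights
one_layer = matvec(W_eq, x)
print(two_layer, one_layer)            # the two outputs are identical
```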

BACKPROPAGATION ALGORITHM

BackPropagation Algorithm
• The backpropagation algorithm was made popular by Rumelhart,
Hinton and Williams in 1986 ["Learning Internal Representations by
Error Propagation", in Rumelhart, David E.; McClelland, James
L. (eds.), Parallel Distributed Processing: Explorations in the
Microstructure of Cognition, Vol. 1: Foundations. Cambridge: MIT
Press. ISBN 0-262-18120-7].
• The researchers used semi-linear neurons with differentiable activation
functions in the hidden neurons (logistic activation functions or
sigmoids).

Cont.
• The error between the target and actual output is calculated at
every iteration and is back propagated through the layers of the
ANN to adapt the weights.
• The weights are adapted such that the error is minimized.
• Once the error has reached an acceptably small value, the training
is stopped.
• Among the first applications of the BP algorithm was NETtalk, a speech
synthesis system developed by Terrence Sejnowski and Charles Rosenberg
[Sejnowski & Rosenberg, 1987, “Parallel Networks that Learn to
Pronounce English Text”, Complex Systems 1, 145-168].

Cont.
• The configuration for training a neural network using the BP
algorithm is shown in the figure below.

The Generalized Delta Rule (G.D.R.)

• In the BP algorithm, as in other learning algorithms, the goal is to find
the change in each adaptive weight (Δw) at every iteration; the rule that
gives this change is known as the G.D.R.
• Consider the following ANN model:

Cont.
• We need to obtain the following algorithm to adapt the weights
between the output (k) and hidden (j) layers:

• Where the weights are adapted as follows:

• Here t is the iteration number, and the error signal between the
output and hidden layers is given by:

Cont.
• Adaptation between input (i) and hidden (j) layers :

• The new weight is thus:

• and the error signal through layer j is:

• Where,

• And,
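
The update equations referred to on these slides can be sketched in a standard form of the G.D.R. The notation below (O for neuron outputs, δ for error signals, sigmoid activations with c = 1) is an assumption chosen to agree with the learning rate η, the momentum term α and the worked XOR example later in the lecture; it is not a reproduction of the original slides. Note that t denotes the iteration number in parentheses and t_k the target of output neuron k.

```latex
% Weights between the output (k) and hidden (j) layers
\Delta w_{kj}(t+1) = \eta\,\delta_k\,O_j + \alpha\,\Delta w_{kj}(t)
w_{kj}(t+1) = w_{kj}(t) + \Delta w_{kj}(t+1)
\delta_k = O_k\,(1 - O_k)\,(t_k - O_k)

% Weights between the hidden (j) and input (i) layers
\Delta w_{ji}(t+1) = \eta\,\delta_j\,O_i + \alpha\,\Delta w_{ji}(t)
w_{ji}(t+1) = w_{ji}(t) + \Delta w_{ji}(t+1)
\delta_j = O_j\,(1 - O_j) \sum_k \delta_k\,w_{kj}
```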

Backpropagation Algorithm
• The following ANN model is used to derive the backpropagation
algorithm:

BP (cont.)
• The backpropagation has two steps,
  • Forward propagation, and
  • Backward propagation.
• Our ANN model has the following assumptions:
  • A two-layer multilayer NN model, i.e. with 1 set of hidden neurons.
  • Neurons in layer i are fully connected to layer j and neurons in
  layer j are fully connected to layer k.
  • Input layer neurons have linear activation functions and hidden
  and output layer neurons have logistic activation functions
  (sigmoids).

Note: Sigmoid Function

• Sigmoids have a parameter c that controls their firing angle, i.e. the
steepness of the curve.

Cont.
• When c is large, the sigmoid becomes like a threshold function and
when c is small, the sigmoid becomes more like a straight line
(linear).
• When c is large, learning is much faster but a lot of information is
lost; when c is small, learning is very slow but information
is retained.
• Since this function is differentiable, it enables the B.P. algorithm to
adapt the lower layers of weights in a multilayer neural network.
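
A minimal sketch of a sigmoid with a steepness parameter, assuming the common logistic form f(x) = 1 / (1 + exp(-c x)); the exact expression used on the slide is not shown in this text, so this form and the function names are assumptions.

```python
import math


def sigmoid(x, c=1.0):
    """Logistic function with steepness parameter c."""
    return 1.0 / (1.0 + math.exp(-c * x))


def sigmoid_derivative(x, c=1.0):
    """Derivative with respect to x, written in terms of the output."""
    s = sigmoid(x, c)
    return c * s * (1.0 - s)


for c in (0.1, 1.0, 10.0):
    # Larger c pushes the output at x = 1 toward 1 (threshold-like);
    # smaller c keeps it near 0.5 (almost linear around the origin).
    print(c, sigmoid(1.0, c), sigmoid_derivative(0.0, c))
```

The derivative is what the B.P. algorithm uses when it passes the error back through a layer, which is why a differentiable activation is needed.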

Cont.
• The firing angle used here is c=1.
• Bias weights are used with bias signals of 1 for hidden (j) and output
layer (k) neurons.
• In many ANN models, bias weights (θ) with bias signals of 1 are used to
speed up the convergence process.
• The learning parameter is given by the symbol η and is usually fixed at a
value between 0 and 1; however, in many applications nowadays an
adaptive η is used.
• Usually η is set large in the initial stage of learning and reduced to a
small value at the final stage of learning.
• A momentum term α is also used in the G.D.R. to help avoid local minima.

Steps of BP Algorithm
• Step 1: Obtain a set of training patterns.
• Step 2: Set up neural network model: No. of Input neurons, Hidden
neurons, and Output Neurons.
• Step 3: Set learning rate η and momentum rate α
• Step 4: Initialize all connection weights Wji, Wkj and bias weights θj, θk to
random values.
• Step 5: Set minimum error, Emin
• Step 6: Start training by applying input patterns one at a time and
propagate through the layers then calculate total error.

Cont.
• Step 7: Backpropagate error through output and hidden layer and
adapt weights.
• Step 8: Backpropagate error through hidden and input layer and
adapt weights.
• Step 9: Check if Error < Emin
• If not repeat Steps 6-9. If yes stop training.
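
The steps above can be sketched in Python for a network with one hidden layer and a single output neuron. This is an illustrative sketch under stated assumptions, not the lecture's implementation: the dictionary layout, function names, weight range and default parameter values are all hypothetical, and the error signals use the standard sigmoid form with c = 1.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_in, n_hid, seed=0):
    """Step 4: random initial weights; each row's last entry is the bias weight."""
    rng = random.Random(seed)
    return {
        "w_ji": [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                 for _ in range(n_hid)],
        "w_kj": [rng.uniform(-0.5, 0.5) for _ in range(n_hid + 1)],
        "dw_ji": [[0.0] * (n_in + 1) for _ in range(n_hid)],  # previous changes
        "dw_kj": [0.0] * (n_hid + 1),
    }


def forward(net, x):
    """Forward propagation; a constant 1 is appended as the bias signal."""
    xi = list(x) + [1.0]
    o_j = [sigmoid(sum(w * v for w, v in zip(row, xi))) for row in net["w_ji"]]
    oj = o_j + [1.0]
    o_k = sigmoid(sum(w * v for w, v in zip(net["w_kj"], oj)))
    return xi, oj, o_k


def backprop_step(net, x, t, eta=0.1, alpha=0.9):
    """Steps 6-8 for one pattern; returns the squared error before the update."""
    xi, oj, o_k = forward(net, x)
    delta_k = o_k * (1.0 - o_k) * (t - o_k)
    # Hidden error signals use the output weights before they are changed.
    delta_j = [oj[j] * (1.0 - oj[j]) * delta_k * net["w_kj"][j]
               for j in range(len(net["w_ji"]))]
    for j in range(len(oj)):  # Step 7: weights between output and hidden layer
        net["dw_kj"][j] = eta * delta_k * oj[j] + alpha * net["dw_kj"][j]
        net["w_kj"][j] += net["dw_kj"][j]
    for j, row in enumerate(net["w_ji"]):  # Step 8: hidden and input layer
        for i in range(len(row)):
            net["dw_ji"][j][i] = (eta * delta_j[j] * xi[i]
                                  + alpha * net["dw_ji"][j][i])
            row[i] += net["dw_ji"][j][i]
    return 0.5 * (t - o_k) ** 2


def train(net, patterns, e_min=0.01, max_epochs=10000, eta=0.1, alpha=0.9):
    """Step 9: repeat until the total error is below e_min or epochs run out."""
    total = float("inf")
    for _ in range(max_epochs):
        total = sum(backprop_step(net, x, t, eta, alpha) for x, t in patterns)
        if total < e_min:
            break
    return total
```

A typical call would be `train(init_network(2, 2), patterns)` with `patterns` given as a list of (inputs, target) pairs. Whether training reaches Emin depends on the initial weights and parameter values, which is why the loop also has an epoch limit.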

Solving an XOR Problem

• In this example we use the BP algorithm on a 2-bit XOR problem.
• The training patterns of this ANN are the four XOR input/target pairs,
given in the next table.
• For simplicity, the ANN model has only 4 neurons (2 inputs, 1 hidden and
1 output) and has no bias weights. A network this small cannot represent
XOR exactly; it is used here only to keep the hand calculation short.
• The input neurons have linear functions and the hidden and output
neurons have sigmoid functions.
• The weights are initialized randomly.
• We train the ANN by presenting patterns #1 to #4 repeatedly
until the error is minimized.

Cont.
• The training patterns of this ANN are the XOR examples given in
the following table:

Pattern   Input 1   Input 2   Target
#1        0         0         0
#2        0         1         1
#3        1         0         1
#4        1         1         0

Cont.
• The ANN model and its initial weights,

• Training begins when the pattern#1 and its target are provided to the
ANN.
• 1st pattern: 0, 0 target : 0

Compute the error by comparing this value to the target,


Cont.
• This error is now backpropagated through the layers following the
error signal equations given as follows:
• Between output (k) and hidden (j) layer

• Thus
• Between hidden (j) and input (i) layer :

• error signal of layer j = -0.0035

Cont.
• Now we have calculated the error signal between layers (k) and (j)

• Suppose we choose the learning rate and momentum term as follows:

• η = 0.1 and α = 0.9
• and the previous change in weight is 0 and Oj = 0.5
• Then, the change in the weight between layers k and j is

Δwkj = -0.0064

Cont.
• This is the increment of the weight after the first iteration for the
weight between layers k and j.
• Now this change in weight is added to the actual weight as follows

• and thus the weight between layers k and j has been adapted.

Cont.
• Similarly for the weights between layers j and i, the adaptation follows

• Now this change in weight is added to the actual weight as follows:

• and this is the adapted weight between layers j and i after pattern#1 is
seen by the ANN in the first iteration.
• The whole calculation is then repeated for the next pattern (pattern#2 =
[0, 1]) with tk=1.
• After all the 4 patterns have been completed the whole process is
repeated for pattern#1 again.
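
The first iteration can be retraced numerically as a sketch. The initial weights appear only in a figure that is not reproduced in this text, so the values w_kj = 0.11 and w_ji = [0.2, 0.3] below are assumptions, as is the standard sigmoid form of the error signals. With w_kj = 0.11 the results round to the -0.0035 and -0.0064 quoted in the worked example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


eta, alpha = 0.1, 0.9
x = [0.0, 0.0]     # pattern #1
t_k = 0.0          # its target
w_ji = [0.2, 0.3]  # assumed input-to-hidden weights (no effect for x = [0, 0])
w_kj = 0.11        # assumed hidden-to-output weight

# Forward propagation (no bias weights).
o_j = sigmoid(w_ji[0] * x[0] + w_ji[1] * x[1])  # 0.5 for a zero input
o_k = sigmoid(w_kj * o_j)

# Error signals of the output neuron and of the hidden neuron.
delta_k = o_k * (1.0 - o_k) * (t_k - o_k)
delta_j = o_j * (1.0 - o_j) * delta_k * w_kj

# Weight changes; the previous changes are 0 in the first iteration.
dw_kj = eta * delta_k * o_j + alpha * 0.0
dw_ji = [eta * delta_j * xi + alpha * 0.0 for xi in x]

# Adapted weights after pattern #1.
w_kj_new = w_kj + dw_kj
w_ji_new = [w + dw for w, dw in zip(w_ji, dw_ji)]
print(round(delta_j, 4), round(dw_kj, 4))
```

Because both inputs of pattern #1 are zero, the changes to the weights between layers j and i are zero in this step; they only start to move once a pattern with a non-zero input is presented.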

UNSUPERVISED LEARNING

Unsupervised Learning
• Unsupervised learning is the process of finding structure, patterns
or correlation in the given data.
• Many times this type of learning depends on associative learning
procedures.
• We focus on two main approaches:
  • Unsupervised Hebbian learning
    • Principal component analysis
  • Unsupervised competitive learning
    • Clustering

Types of Analysis used in Unsupervised Learning
• Correlational analysis
  • Identifying the correlations among features.
  • Accomplished via Hebbian learning.
• Cluster analysis
  • Identifying the relational structure of the data.
  • Accomplished via competitive learning.

• Cluster analysis is a form of categorization, whereas
correlational analysis is a form of simplification.

Hebbian Learning
• An association principle was proposed by Hebb in 1949 in the
context of biological neurons.
• Hebb’s principle
When a neuron repeatedly excites another neuron, then the
threshold of the latter neuron is decreased, or the synaptic
weight between the neurons is increased, in effect increasing
the likelihood of the second neuron to be excited by the first.

Hebbian Learning as Correlation Learning

• Hebbian learning is a form of associative learning: it associates things that
occur together.
• Thus Hebbian learning can be thought of as learning the auto-
correlation of the input space.
• Example: a child recognizes a banana by its shape & wants to eat it.
He then smells it, and after a couple of such exposures he starts
drooling as soon as he smells it, even without seeing it.
• Conclusion: the child has associated the smell with the banana &
produced a response (hunger effect) even without seeing its shape.

Cont.
• Brilliant idea by Hebb (1949): cells that fire together, wire
together.

[Figure: banana-smell neuron connected to hungry neuron]

Hebbian Learning Neural Network

[Figure: input signals enter neuron i, which is connected to neuron j; neuron j produces the output signals]

Banana Associator Example


Example (cont.)
• The inputs are defined as follows:

• If we want the network to associate the response to the shape of
the banana & not its smell, w0 is assigned a value greater than –b,
while w is assigned a value less than –b.
• Hence we choose w0 = 1 & w = 0.
• The output of the network reduces to;
a = hardlim(p0 - 0.5)
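
A sketch of this associator, assuming that p0 = 1 when the banana shape is detected and p = 1 when its smell is detected (0 otherwise), and that the bias is b = -0.5, as implied by a = hardlim(p0 - 0.5) with w0 = 1. These input definitions and the function names are assumptions, since the slide defining the inputs is not reproduced in this text.

```python
def hardlim(n):
    """Hard-limit transfer function: 1 if n >= 0, else 0."""
    return 1 if n >= 0 else 0


def banana_response(p0, p, w0=1.0, w=0.0, b=-0.5):
    """a = hardlim(w0*p0 + w*p + b); p0 is the shape input, p the smell input."""
    return hardlim(w0 * p0 + w * p + b)


print(banana_response(p0=1, p=0))  # shape only: the network responds
print(banana_response(p0=0, p=1))  # smell only: no response while w = 0
```

With w = 0 the smell input has no effect; raising w above -b through learning is what would let the smell alone trigger the response.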

Hebbian Learning
• Hebbian learning rule Δwji = ηyjxi
• Consider the update of a single weight w,
w(n + 1) = w(n) + ηy(n)x(n)
• For a linear activation function, y(n) = w(n)x(n), so
w(n + 1) = w(n)[1 + ηx²(n)]
• Weights increase without bounds. If initial weight is negative, then
it will increase in the negative range. If it is positive, then it will
increase in the positive range.
• Hebbian learning is naturally unstable.
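
The unbounded growth can be seen in a short sketch for one linear neuron with a constant input. The input value, learning rate and function name are illustrative assumptions.

```python
def hebbian_updates(w0, x, eta=0.1, steps=10):
    """Plain Hebbian rule for one linear neuron: y = w*x, w <- w + eta*y*x."""
    w = w0
    history = [w]
    for _ in range(steps):
        y = w * x            # linear activation
        w = w + eta * y * x  # same as w * (1 + eta * x**2)
        history.append(w)
    return history


print(hebbian_updates(0.1, 1.0))   # grows in the positive range
print(hebbian_updates(-0.1, 1.0))  # grows in the negative range
```

Each update multiplies the weight by 1 + ηx², which is greater than 1 for any non-zero input, so the magnitude never shrinks.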

Oja’s Learning Rule

• To solve the problem of the simple Hebbian rule, which causes the
weights to increase (or decrease) without bounds, the weights need to
be normalized to unit length as follows,
wji(n + 1) = [wji(n) + ηyj(n)xi(n)] / √(Σi [wji(n) + ηyj(n)xi(n)]²)
• This equation effectively imposes a constraint on the weights.
• Oja approximated the normalization (for small η) as:

Oja’s Rule (continued)

wji(n + 1) = wji(n) + ηyj(n)[xi(n) – yj(n)wji(n)]

• This rule is also known as the generalized Hebbian rule.

• The 2nd term, –ηyj²(n)wji(n), is called a weight decay term or a ‘forgetting term’.
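
A minimal sketch of Oja's rule for a single linear neuron, showing that the weight vector settles near unit length instead of growing without bound. The four training samples, the initial weights and the learning rate are hypothetical values chosen for illustration.

```python
import math


def oja_train(samples, w, eta=0.01, epochs=500):
    """Oja's rule for one linear neuron: w_i <- w_i + eta*y*(x_i - y*w_i)."""
    w = list(w)
    for _ in range(epochs):
        for x in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))  # linear output
            # Hebbian term eta*y*x_i minus the decay term eta*y*y*w_i.
            w = [wi + eta * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w


# Hypothetical zero-mean data whose main direction of variation is (1, 1).
samples = [(2.0, 1.0), (1.0, 2.0), (-2.0, -1.0), (-1.0, -2.0)]
w = oja_train(samples, w=[0.2, 0.1])
norm = math.sqrt(sum(wi * wi for wi in w))
print(w, norm)  # the norm stays close to 1
```

For this data the weights also line up with the direction of largest variance, which is the link to principal component analysis mentioned earlier.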
