
Module III: SPECIAL LEARNING NETWORK

Competitive learning is a form of unsupervised learning in artificial neural networks, in which nodes compete for the right to respond to a subset of the input data. A variant of Hebbian learning, competitive learning works by increasing the specialization of each node in the network. It is well suited to finding clusters within data.
Models and algorithms based on the principle of competitive learning include vector quantization and self-organizing maps (Kohonen maps).

There are three basic elements to a competitive learning rule:

 A set of neurons that are all the same except for some randomly distributed synaptic weights, and which therefore respond differently to a given set of input patterns
 A limit imposed on the "strength" of each neuron
 A mechanism that permits the neurons to compete for the right to respond to a given subset of inputs, such that only one output neuron (or only one neuron per group) is active (i.e., "on") at a time. The neuron that wins the competition is called a "winner-take-all" neuron.

Accordingly, the individual neurons of the network learn to specialize on ensembles of similar
patterns and in so doing become 'feature detectors' for different classes of input patterns.

By recoding sets of correlated inputs onto one of a few output neurons, competitive networks essentially remove the redundancy in the representation, an operation that is an essential part of processing in biological sensory systems.

Basic Concept of Competitive Network

This network is just like a single-layer feed-forward network with feedback connections between the outputs. The connections between the outputs are of the inhibitory type, shown by dotted lines, which means the competitors never support themselves.

Fig: Basic Concept of Competitive Learning Rule


As said earlier, there is competition among the output nodes, so the main concept is this: during training, the output unit that has the highest activation for a given input pattern is declared the winner. This rule is also called winner-takes-all because only the winning neuron is updated and the rest of the neurons are left unchanged.
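To make the rule concrete, the following is a minimal sketch of a single winner-takes-all update in Python/NumPy; the function name, learning rate, and distance-based winner selection are illustrative assumptions rather than a prescription from the text.

import numpy as np

def competitive_update(weights, x, lr=0.1):
    """One winner-takes-all learning step.

    weights : (m, n) array, one weight vector per output unit
    x       : (n,)   input pattern
    lr      : learning rate
    """
    # The winner is the unit whose weight vector is closest to the input
    distances = np.linalg.norm(weights - x, axis=1)
    winner = int(np.argmin(distances))
    # Only the winning unit is updated; every other unit is left unchanged
    weights[winner] += lr * (x - weights[winner])
    return winner

Repeated over many input presentations, each output unit's weight vector drifts toward the centre of one cluster of inputs, which is how the units become "feature detectors".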

What is K-Means Algorithm?

K-Means Clustering is an unsupervised learning algorithm which groups the unlabelled dataset into different clusters. Here K defines the number of pre-defined clusters that need to be created in the process; if K=2, there will be two clusters, for K=3, there will be three clusters, and so on.
It is an iterative algorithm that divides the unlabelled dataset into k different clusters in such a way that each data point belongs to only one group, and points in the same group have similar properties.
It allows us to cluster the data into different groups and provides a convenient way to discover the categories in an unlabelled dataset on its own, without the need for any labelled training data.
It is a centroid-based algorithm, where each cluster is associated with a centroid. The main aim of this algorithm is to minimize the sum of distances between the data points and their corresponding cluster centroids.
The algorithm takes the unlabelled dataset as input, divides it into k clusters, and repeats the process until the cluster assignments stop changing. The value of k should be predetermined in this algorithm.
The k-means clustering algorithm mainly performs two tasks:
o Determines the best positions for the K centre points (centroids) by an iterative process.
o Assigns each data point to its closest k-centre. The data points that are near a particular k-centre form a cluster.
Hence each cluster has data points with some commonalities and is kept away from the other clusters.
The below diagram explains the working of the K-means Clustering Algorithm:
How does the K-Means Algorithm Work?
The working of the K-Means algorithm is explained in the below steps:
Step-1: Select the number K to decide the number of clusters.
Step-2: Select K random points as the initial centroids. (They may be points other than those in the input dataset.)
Step-3: Assign each data point to its closest centroid, which will form the predefined K clusters.
Step-4: Calculate the variance and place a new centroid for each cluster.
Step-5: Repeat the third step, i.e., reassign each data point to the new closest centroid of its cluster.
Step-6: If any reassignment occurs, then go to step-4, else go to FINISH.
Step-7: The model is ready.
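The following is a minimal sketch of these steps in Python/NumPy; the function name, the random initialization, and the convergence test are illustrative assumptions rather than part of the original description.

import numpy as np

def k_means(X, k, max_iters=100, seed=0):
    """Minimal K-Means clustering. X: (N, d) data matrix, k: number of clusters."""
    rng = np.random.default_rng(seed)
    # Step 2: pick k random data points as the initial centroids
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = None
    for _ in range(max_iters):
        # Step 3: assign every point to its closest centroid (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Step 6: if no reassignment occurred, the clusters are stable -> finish
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of the points assigned to it
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

For example, k_means(X, 3) returns three centroids and a cluster label for each row of X.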

Feature mapping
Feature mapping is a process which converts the patterns of arbitrary dimensionality into a
response of one or two-dimensional arrays of neurons, i.e., it converts a wide pattern space
into a typical feature space. The network performing such a mapping is called feature map.
Apart from its capability to reduce the higher dimensionality, it has to preserve the neighbourhood relations of the input patterns, i.e., it has to obtain a topology-preserving map.
For obtaining such feature maps, it is required to find a self-organizing neural array which
consists of neurons arranged in a one-dimensional array or a two-dimensional array. To depict
this, a typical network structure where each component of the input vector x is connected to
each of the nodes is shown in Figure 5-5.

On the other hand, if the input vector is two-dimensional, the inputs, say x(a, b), can arrange
themselves in a two-dimensional array defining the input space (a, b) as in Figure 5-5. Here,
the two layers are fully connected.

The topological preserving property is observed in the brain, but not found in any other
artificial neural network. Here, there are m output cluster units arranged in a one- or two-
dimensional array and the input signals are n-tuples. The cluster (output) unit’s weight vector
serves as an exemplar of the input pattern that is associated with that cluster. At the time of
self-organization, the weight vector of the cluster unit which matches the input pattern very
closely is chosen as the winner unit. The closeness of weight vector of cluster unit to the input
pattern may be based on the square of the minimum Euclidean distance. The weights are
updated for the winning unit and its neighbouring units. It should be noted that the weight vectors of the neighbouring units are not, in general, close to the input pattern, and the connection weights do not multiply the signal sent from the input units to the cluster units unless the dot product measure of similarity is being used.

Flowchart: The flowchart for KSOFM is shown in Figure 5-11, which indicates the flow of the training process. The process is continued for a particular number of epochs or till the learning rate reduces to a very small value. The architecture consists of two layers: the input layer and the output (cluster) layer. There are "n" units in the input layer and "m" units in the output layer. Basically, the winner unit is identified by using either the dot product or the Euclidean distance method, and the weight updation using the Kohonen learning rule is performed on the winning cluster unit.
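As an illustration, here is a minimal sketch of one KSOFM training step for a one-dimensional array of cluster units, written in Python/NumPy; the neighbourhood definition and the parameter names are illustrative assumptions.

import numpy as np

def ksofm_step(weights, x, lr, radius):
    """One Kohonen SOM step for a 1-D array of cluster units.

    weights : (m, n) array, one weight vector per cluster unit
    x       : (n,)   input vector
    lr      : current learning rate
    radius  : integer neighbourhood radius around the winner
    """
    # Winner = unit with the minimum squared Euclidean distance to the input
    winner = int(np.argmin(np.sum((weights - x) ** 2, axis=1)))
    # Kohonen rule: move the winner and its neighbours toward the input
    lo, hi = max(0, winner - radius), min(len(weights), winner + radius + 1)
    weights[lo:hi] += lr * (x - weights[lo:hi])
    return winner

In a full training loop, both the learning rate and the neighbourhood radius are gradually reduced over the epochs, as described above.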
Hebb Network

For a neural net, the Hebb learning rule is a simple one. Donald Hebb stated in 1949 that in the brain, learning is performed by the change in the synaptic gap. Hebb explained it:

"When an axon of cell A is near enough to excite cell B, and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."

According to the Hebb rule, the weight vector is found to increase proportionately to the product of the input and the learning signal. Here the learning signal is equal to the neuron's output. In Hebb learning, if two interconnected neurons are 'on' simultaneously, then the weights associated with these neurons can be increased by the modification made in their synaptic gap (strength). The weight update in the Hebb rule is given by

wi(new) = wi(old) + xi y
The Hebb rule is better suited to bipolar data than to binary data. If binary data is used, the above weight updation formula cannot distinguish between two conditions, namely:

1. A training pair in which an input unit is "on" and the target value is "off."

2. A training pair in which both the input unit and the target value are "off."

Thus, there are limitations in Hebb rule application over binary data. Hence, the
representation using bipolar data is advantageous.

Flowchart of Training Algorithm

The training algorithm is used for the calculation and adjustment of weights. Here S:t refers to each training input and target output pair. As long as there exists a pair of training input and target output, the training process takes place; otherwise, it is stopped.

Training Algorithm:

The training algorithm of Hebb network is given below:

Step 0: First initialize the weights. Basically, in this network they may be set to zero, i.e.,

wi = 0 for i = 1 to n

where "n" is the total number of input neurons.

Step 1: Steps 2-4 have to be performed for each input training vector and target output pair,
S:t

Step 2: Input unit activations are set. Generally, the activation function of the input layer is the identity function: xi = si for i = 1 to n.
Step 3: Output unit activations are set: y = t

Step 4: Weight adjustments and bias adjustments are performed:

wi(new) = wi(old) + xi y

b(new) = b(old) + y
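To illustrate, the following is a minimal sketch of the Hebb training algorithm in Python/NumPy, assuming bipolar (+1/−1) training pairs; the function and variable names are illustrative.

import numpy as np

def hebb_train(samples, targets):
    """Hebb rule training.

    samples : (p, n) array of bipolar input vectors
    targets : (p,)   array of bipolar target values
    Returns the weight vector w and bias b.
    """
    p, n = samples.shape
    w = np.zeros(n)                      # Step 0: initialize weights to zero
    b = 0.0                              # and the bias to zero
    for x, t in zip(samples, targets):   # Step 1: for each training pair s:t
        y = t                            # Step 3: output activation is set to the target
        w += x * y                       # Step 4: wi(new) = wi(old) + xi*y
        b += y                           #         b(new)  = b(old)  + y
    return w, b

For example, training on the four bipolar AND patterns yields w = [2, 2] and b = −2.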
Application area:

The Hebb rule can be used for pattern association, pattern categorization, pattern classification, and a range of other areas.

Hopfield neural network


Hopfield neural network was invented by Dr. John J. Hopfield in 1982. It consists of a single
layer which contains one or more fully connected recurrent neurons. The Hopfield network is
commonly used for auto-association and optimization tasks.

Discrete Hopfield Network: A discrete Hopfield network operates in a discrete fashion; in other words, the input and output patterns are discrete vectors, which can be either binary (0, 1) or bipolar (+1, −1) in nature. The network has symmetrical weights with no self-connections, i.e., wij = wji and wii = 0.

Architecture
Following are some important points to keep in mind about discrete Hopfield network −
 This model consists of neurons with one inverting and one non-inverting output.
 The output of each neuron should be the input of other neurons but not the input of
self.
 Weight/connection strength is represented by wij.
 Connections can be excitatory as well as inhibitory. A connection is excitatory if the output of the neuron is the same as the input; otherwise it is inhibitory.
 Weights should be symmetrical, i.e. wij = wji
The output from Y1 going to Y2, Yi and Yn have the weights w12, w1i and w1n respectively.
Similarly, other arcs have the weights on them.
Training Algorithm
During training of a discrete Hopfield network, the weights are computed from the patterns to be stored. As we know, the input vectors can be binary as well as bipolar. In both cases, the weights can be obtained with the following (Hebbian) relation: for binary patterns s(p), wij = Σp [2si(p) − 1][2sj(p) − 1], and for bipolar patterns, wij = Σp si(p) sj(p), in each case for i ≠ j and with wii = 0.
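A minimal sketch of this storage rule and of an asynchronous recall step for bipolar patterns is given below (Python/NumPy); the update order and the number of recall iterations are illustrative assumptions.

import numpy as np

def hopfield_store(patterns):
    """Build the weight matrix from bipolar patterns (rows of +1/-1)."""
    P = np.asarray(patterns, dtype=float)
    W = P.T @ P               # wij = sum over patterns of si(p) * sj(p)
    np.fill_diagonal(W, 0.0)  # no self-connections: wii = 0
    return W

def hopfield_recall(W, x, iters=5):
    """Asynchronous recall: update one unit at a time for a few sweeps."""
    y = np.array(x, dtype=float)
    rng = np.random.default_rng(0)
    for _ in range(iters):
        for i in rng.permutation(len(y)):
            net = W[i] @ y
            if net != 0:                 # keep the previous state when net input is 0
                y[i] = 1.0 if net > 0 else -1.0
    return y

Presenting a noisy version of a stored pattern to hopfield_recall typically returns the stored pattern it is closest to, which is the auto-association behaviour mentioned above.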

Continuous Hopfield Network


In comparison with Discrete Hopfield network, continuous network has time as a continuous
variable. It is also used in auto association and optimization problems such as travelling salesman
problem.

Associate Memory Network

These kinds of neural networks work on the basis of pattern association, which means they can store different patterns and, when given an input, they can produce one of the stored patterns by matching it with the given input pattern. These types of memories are also called Content-Addressable Memory (CAM). Associative memory makes a parallel search with the stored patterns as data files.

Following are the two types of associative memories we can observe −

 Auto Associative Memory

 Hetero Associative memory

Auto Associative Memory


This is a single layer neural network in which the input training vector and the output target vectors
are the same. The weights are determined so that the network stores a set of patterns.

Architecture

As shown in the following figure, the architecture of Auto Associative memory network
has ‘n’ number of input training vectors and similar ‘n’ number of output target vectors.

Training Algorithm

For training, this network is using the Hebb or Delta learning rule.

Step 1 − Initialize all the weights to zero: wij = 0 for i = 1 to n, j = 1 to n.

Step 2 − Perform steps 3-4 for each input vector.

Step 3 − Activate each input unit as follows −
xi = si (i = 1 to n)
Step 4 − Activate each output unit as follows −
yj = sj (j = 1 to n)
Step 5 − Adjust the weights as follows −
wij(new) = wij(old) + xi yj
Testing Algorithm
Step 1 − Set the weights obtained during training for Hebb’s rule.

Step 2 − Perform steps 3-5 for each input vector.

Step 3 − Set the activation of the input units equal to that of the input vector.

Step 4 − Calculate the net input to each output unit, j = 1 to n:
yinj = Σi xi wij
Step 5 − Apply the following activation function to calculate the output:
yj = +1 if yinj > 0, and yj = −1 if yinj ≤ 0
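The training and testing steps above can be summarized in a short Python/NumPy sketch, assuming bipolar patterns; the function names are illustrative.

import numpy as np

def auto_assoc_train(patterns):
    """Hebb rule: W accumulates s s^T over the stored bipolar patterns."""
    S = np.asarray(patterns, dtype=float)    # shape (p, n)
    return S.T @ S                           # wij = sum over patterns of si(p) * sj(p)

def auto_assoc_recall(W, x):
    """Testing: net input y_in = x W, then a bipolar threshold activation."""
    y_in = np.asarray(x, dtype=float) @ W
    return np.where(y_in > 0, 1.0, -1.0)

Recalling a stored pattern, or a slightly corrupted version of it, should reproduce the stored pattern itself.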

Hetero Associative memory


Similar to Auto Associative Memory network, this is also a single layer neural network. However, in
this network the input training vector and the output target vectors are not the same. The weights
are determined so that the network stores a set of patterns. Hetero associative network is static in
nature, hence, there would be no non-linear and delay operations.

Architecture

As shown in the following figure, the architecture of Hetero Associative Memory network
has ‘n’ number of input training vectors and ‘m’ number of output target vectors.

Training Algorithm
For training, this network is using the Hebb or Delta learning rule.
Step 1 − Initialize all the weights to zero: wij = 0 for i = 1 to n, j = 1 to m.
Step 2 − Perform steps 3-4 for each input vector.
Step 3 − Activate each input unit as follows −
xi = si (i = 1 to n)
Step 4 − Activate each output unit as follows −
yj = tj (j = 1 to m)
Step 5 − Adjust the weights as follows −
wij(new) = wij(old) + xi yj

Testing Algorithm
Step 1 − Set the weights obtained during training for Hebb’s rule.

Step 2 − Perform steps 3-5 for each input vector.

Step 3 − Set the activation of the input units equal to that of the input vector.

Step 4 − Calculate the net input to each output unit, j = 1 to m:
yinj = Σi xi wij
Step 5 − Apply the following activation function to calculate the output:
yj = +1 if yinj > 0, yj = 0 if yinj = 0, and yj = −1 if yinj < 0
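For comparison with the auto-associative case, here is a minimal Python/NumPy sketch of hetero-associative training and recall, again assuming bipolar patterns and illustrative names; the only structural difference is that the weight matrix is n × m instead of n × n.

import numpy as np

def hetero_assoc_train(inputs, targets):
    """Hebb rule for hetero-association: W = S^T T.

    inputs  : (p, n) bipolar input patterns
    targets : (p, m) bipolar target patterns
    """
    S = np.asarray(inputs, dtype=float)
    T = np.asarray(targets, dtype=float)
    return S.T @ T                 # wij = sum over patterns of si(p) * tj(p)

def hetero_assoc_recall(W, x):
    """Net input y_in = x W followed by the +1 / 0 / -1 activation above."""
    y_in = np.asarray(x, dtype=float) @ W
    return np.sign(y_in)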

Boltzmann Machine
Boltzmann machines are stochastic learning processes with a recurrent structure, and they are the basis of the early optimization techniques used in ANNs. The Boltzmann Machine was invented by Geoffrey Hinton and Terry Sejnowski in 1985. Hinton's own words on the Boltzmann Machine provide more clarity:

“A surprising feature of this network is that it uses only locally available information. The change of
weight depends only on the behavior of the two units it connects, even though the change optimizes
a global measure” - Ackley, Hinton 1985.

Some important points about Boltzmann Machine −

 They use recurrent structure.

 They consist of stochastic neurons, which have one of the two possible states, either 1 or 0.

 Some of the neurons are adaptive (free state) and some are clamped (frozen state).
 If we apply simulated annealing on discrete Hopfield network, then it would become
Boltzmann Machine.

Objective of Boltzmann Machine

The main purpose of Boltzmann Machine is to optimize the solution of a problem. It is the work of
Boltzmann Machine to optimize the weights and quantity related to that particular problem.

Architecture

The following diagram shows the architecture of a Boltzmann machine. It is clear from the diagram that it is a two-dimensional array of units. Here, the weights on interconnections between units are −p, where p > 0. The weights of self-connections are given by b, where b > 0.

Training Algorithm
As we know, Boltzmann machines have fixed weights, hence there is no training algorithm, as we do not need to update the weights in the network. However, to test the network we have to set the weights as well as find the consensus function (CF).

Boltzmann machine has a set of units Ui and Uj and has bi-directional connections on them.

 We are considering the fixed weight say wij.

 wij ≠ 0 if Ui and Uj are connected.

 There also exists a symmetry in weighted interconnection, i.e. wij = wji.

 wii also exists, i.e. there would be the self-connection between units.

 For any unit Ui, its state ui would be either 1 or 0.

The main objective of the Boltzmann Machine is to maximize the consensus function (CF), which can be given by the following relation:

CF = Σi Σ(j ≤ i) wij ui uj

Now, when the state of a unit changes from 1 to 0 or from 0 to 1, the resulting change in consensus is given by the following relation:

ΔCF(i) = (1 − 2ui) (wii + Σ(j ≠ i) wij uj)

Generally, unit Ui does not change its state, but if it does then the information would be residing local
to the unit. With that change, there would also be an increase in the consensus of the network.

The probability of the network accepting the change in the state of the unit is given by the following relation:

AF(i, T) = 1 / (1 + exp(−ΔCF(i) / T))

Here, T is the controlling parameter (temperature). It is decreased as CF approaches its maximum value.

Testing Algorithm
Step 1 − Initialize the following to start the training −

 Weights representing the constraint of the problem

 Control Parameter T

Step 2 − Continue steps 3-8, when the stopping condition is not true.

Step 3 − Perform steps 4-7.

Step 4 − Assume that one of the units has changed its state, and choose integers I, J as random values between 1 and n.

Step 5 − Calculate the change in consensus, ΔCF, using the relation given above.

Step 6 − Calculate the probability AF that the network will accept the change in state, using the relation given above.

Step 7 − Accept or reject this change as follows −

Case I − if R < AF, accept the change.

Case II − if R ≥ AF, reject the change.

Here, R is a random number between 0 and 1.


Step 8 − Reduce the control parameter (temperature), e.g., T(new) = 0.95 × T(old).

Step 9 − Test for the stopping conditions which may be as follows −

 Temperature reaches a specified value

 There is no change in state for a specified number of iterations
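Putting the testing steps together, the following is a minimal Python/NumPy sketch of the acceptance loop; the weight-matrix encoding, the cooling factor, and the stopping rule are illustrative assumptions.

import numpy as np

def boltzmann_test(W, u, T=10.0, cooling=0.95, T_min=0.1, seed=0):
    """Simulated-annealing style state search for a Boltzmann machine.

    W : (n, n) symmetric weight matrix (fixed, encodes the problem)
    u : (n,)   initial binary state vector (0/1 entries)
    """
    rng = np.random.default_rng(seed)
    u = np.array(u, dtype=float)
    n = len(u)
    while T > T_min:                               # Step 9: stop when T is small enough
        for _ in range(n):
            i = rng.integers(n)                    # Step 4: pick a random unit
            # Step 5: change in consensus if unit i flips its state
            delta_cf = (1 - 2 * u[i]) * (W[i, i] + W[i] @ u - W[i, i] * u[i])
            # Step 6: acceptance probability AF = 1 / (1 + exp(-delta_cf / T))
            af = 1.0 / (1.0 + np.exp(-delta_cf / T))
            # Step 7: accept the flip with probability AF
            if rng.random() < af:
                u[i] = 1.0 - u[i]
        T *= cooling                               # Step 8: reduce the temperature
    return u

Each candidate flip is accepted with probability AF = 1/(1 + exp(−ΔCF/T)); as T is lowered, the network becomes increasingly reluctant to accept changes that reduce the consensus.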

Artificial Neural Network Applications

Following are some important ANN Applications –

1. Speech Recognition: Speech recognition relies heavily on artificial neural networks (ANNs). Earlier speech recognition systems used statistical models such as Hidden Markov Models. With the introduction of deep learning, various forms of neural networks have become the primary way to obtain precise classification.
2. Handwritten Character Recognition: ANNs are used to recognize
handwritten characters. Handwritten characters can be in the form of letters
or digits, and neural networks have been trained to recognize them.
3. Signature Classification: We employ artificial neural networks to recognize
signatures and categorize them according to the person’s class when
developing these authentication systems. Furthermore, neural networks can
determine whether or not a signature is genuine.
4. Medical: It can be used to detect cancer cells and analyze MRI pictures in
order to provide detailed results.
5. Human Face Recognition: This is one of the biometric methods used to identify a given face. It is a challenging task because of the difficulty of characterizing "non-face" images. However, if a neural network is well trained, then it can classify images into two classes, namely images that contain faces and images that do not. First, all the input images must be pre-processed. Then, the dimensionality of each image must be reduced. Finally, the image must be classified using a neural network training algorithm. The following neural networks are used for training with the pre-processed images −

 Fully connected multilayer feed-forward neural network trained with the help of the back-propagation algorithm.
 For dimensionality reduction, Principal Component Analysis (PCA) is used.
