Artificial Neural Networks
Part One
Artificial Intelligence for Control and Identification
Dr. Wilbert G. Aguilar
Ph.D. in Automatic Control, Robotics and Computer Vision
Máster AR
© 2007 Dr. X. Parra & Dr. C. Angulo
Outline
1. Why Artificial Neural Networks?
2. Model of a neuron
1. Why Artificial Neural Networks?
Von Neumann's Computer      |  Human Brain
determinism                 |  fuzzy behaviour
sequence of instructions    |  parallelism
high speed                  |  slow speed
repetitive tasks            |  adaptation to situations
programming                 |  learning
uniqueness of solutions     |  different solutions
(ex. matrix product)        |  (ex. face recognition)
1. Why Artificial Neural Networks?
Human Brain Operation
When humans recognize a face or grasp an object, they do not solve equations.
The brain works in an associative way.
Each sensory state evokes a brain state (an electro-chemical activity) which is memorized as needed.
1. Why Artificial Neural Networks?
Playing tennis
The ball trajectory depends on many different factors:
  shot strength, initial angle, racket trajectory, ball spin, wind speed, ...
Achieving a desired trajectory requires:
  an accurate measurement of all the variables
  the simultaneous solution of many complex equations, which must be solved for each data acquisition (fast dynamics)
How does a player manage all that?
1. Why Artificial Neural Networks?
Playing tennis
In a learning phase, a human player tries and experiments with different actions and memorizes the good ones:
  if the racket is on the right side and the ball comes from right to left, then move a step backward and cross the racket to the left side
  if the racket is on the right side and the ball comes from left to right, then move a step forward
  if the racket is ...
1. Why Artificial Neural Networks?
Playing tennis
In an operative phase, the brain controls the actions without thinking, on the basis of the learned associations.
A similar mechanism is used for speech recognition or motion control.
2. Model of a neuron
Biological inspiration
The basic scheme of a biological neuron:
[Figure: a biological neuron, with the soma, nucleus, axon, synapses and dendrites labelled]
2. Model of a neuron
Biological inspiration
Incoming signals from other neurons determine whether the neuron fires.
The contribution of each signal to the output depends on the strength of the synaptic connection, i.e. on the attenuation/amplification in the synapse.
Dendrites receive the amplified signals through the synapses and send them to the cell body, where they are added.
When the sum exceeds a threshold, the neuron output is active (firing).
2. Model of a neuron
Biological inspiration
Some of the brain's properties:

Neuron speed            |  milliseconds
Number of neurons       |  10^11 - 10^12
Number of connections   |  10^3 - 10^4 per neuron
Number of synapses      |  10^14 - 10^16 in the nervous system
Distributed control     |  no central CPU
Fault tolerance         |  graceful degradation
Low power consumption   |  no batteries (but food)
2. Model of a neuron
McCulloch & Pitts Model (1943)
The McCulloch-Pitts neuron is a binary neuron with only two states: active (fired/excited) and inactive (not fired/excited).
The neuron works by taking either a 1 or a 0 on each of its inputs (binary inputs), where 1 represents true and 0 false. Likewise, the threshold is given a real value (θ), say 1: the output is 1 if the threshold is met or exceeded, and 0 otherwise (binary output).
2. Model of a neuron
McCulloch & Pitts Model (1943)
W. S. McCulloch and W. Pitts (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115-133.
[Figure: the McCulloch-Pitts neuron. Inputs x_1, x_2, ..., x_m with synaptic weights w_1, w_2, ..., w_m feed a summing junction followed by the sign activation function, producing the output y; x_i ∈ {-1, +1}, w_i ∈ {-1, +1}, y ∈ {-1, +1}]

y = \begin{cases} +1 & \text{if } \sum_{i=1}^{m} w_i x_i \ge \theta \\ -1 & \text{otherwise} \end{cases}
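A minimal sketch of this neuron in Python (the function name and example values are ours, not from the 1943 paper): bipolar inputs and weights, a weighted sum, and the sign activation against a threshold θ.

```python
def mcculloch_pitts(inputs, weights, theta):
    """McCulloch-Pitts neuron: output +1 when the weighted sum of
    the inputs reaches the threshold theta, and -1 otherwise."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s >= theta else -1

# Example with bipolar inputs/weights as in the figure above
print(mcculloch_pitts([1, -1], [1, 1], theta=0))   # +1 (sum = 0 meets theta)
print(mcculloch_pitts([-1, -1], [1, 1], theta=0))  # -1 (sum = -2 below theta)
```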
2. Model of a neuron
McCulloch & Pitts Model (1943)

[Figure: a two-input McCulloch-Pitts neuron: inputs x_1 and x_2 with synaptic weights w_1 and w_2, a summing junction, an activation function and the output]

Determine the parameters (w_1, w_2 and θ) so that the neuron represents a logical OR function.
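One possible answer, sketched below as a check rather than the official solution (the slide leaves the exercise open, and the exact values depend on the sign convention used for θ): with bipolar inputs, w_1 = w_2 = 1 and θ = 0 realize OR, since the weighted sum drops below the threshold only when both inputs are -1.

```python
def neuron(x1, x2, w1=1, w2=1, theta=0):
    # Fires (+1) when w1*x1 + w2*x2 >= theta, otherwise -1
    return 1 if w1 * x1 + w2 * x2 >= theta else -1

# Bipolar OR truth table: output is -1 only for the input pair (-1, -1)
for x1 in (-1, 1):
    for x2 in (-1, 1):
        print((x1, x2), '->', neuron(x1, x2))
```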
2. Model of a neuron
McCulloch & Pitts Model (1943)
Some interesting points:
- It is the relative magnitudes of the weights and threshold that matter, not their absolute magnitudes: in the previous example we could scale the weights and threshold by a factor of 2 and the output would be the same.
- A realization of the AND function may be achieved by simply changing the threshold to -1.
- Here we specify weights and then determine the binary logic output. The next important step is the ability to specify the desired output and adjust the weights to achieve that output.
2. Model of a neuron
Hebb's Rule (1949)
Hebb, D. O. (1949). The Organization of Behavior: A Neuropsychological Theory. New York: Wiley-Interscience.

Biological rule: the synapse's resistance to the incoming signal can be changed during a learning process.
If an input of a neuron repeatedly and persistently causes the neuron to fire, a metabolic change happens in the synapse of that particular input, reducing its resistance.
Learning is not an inherent property of the neuron but is due to the modification of synapses.
2. Model of a neuron
Hebb's Rule (1949)
Hebbian learning rule: a change in the strength of a connection is a function of the pre- and postsynaptic neural activities.
It is a method of determining how to alter the weights between model neurons.
If x_i is the output of the presynaptic neuron, y_j the output of the postsynaptic neuron, w_ji the strength of the connection between them, and η the learning rate, the form of the learning rule is:

\Delta w_{ji} = \eta \, x_i \, y_j
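A minimal sketch of the rule as a weight update (variable names and the toy activity sequence are our own): each presentation of a pre-/postsynaptic activity pair moves w_ji by η x_i y_j.

```python
def hebbian_update(w, x_i, y_j, eta=0.1):
    """Hebb's rule: change the weight in proportion to the product
    of pre- and postsynaptic activity."""
    return w + eta * x_i * y_j

w = 0.0
for x_i, y_j in [(1, 1), (1, 1), (-1, 1)]:  # repeated co-activity strengthens w
    w = hebbian_update(w, x_i, y_j)
print(w)  # 0.1 + 0.1 - 0.1 ≈ 0.1
```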
2. Model of a neuron
The Perceptron (1958)
Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386-408.
Combines the McCulloch-Pitts model of an artificial neuron and
the Hebbian learning rule of adjusting weights
In addition to the variable weight values, the perceptron model
added an extra input that represents bias
[Figure: the perceptron: inputs x_1, ..., x_m with synaptic weights w_1, ..., w_m plus an extra bias input, a summing junction and an activation function producing the output y]
2. Model of a neuron
The Perceptron (1958)
It produces associations between inputs and outputs, ℝ^m → {0,1}:

P_k → T_k,   k = 1...p

where the P_k are the inputs or patterns and the T_k are the outputs or targets.
The set of patterns and targets is the learning set or training set {P_k, T_k}, k = 1...p.
2. Model of a neuron
The Perceptron (1958)
Perceptron learning rule: change each weight by an amount proportional to the difference between the desired output and the actual output:

\Delta w_{ji} = \eta \, (T_{kj} - y_j) \, P_{ki}

where η is the learning rate, T_{kj} the desired output and y_j the actual output. As an update over time,

w_{ji}(t+1) = w_{ji}(t) + \eta \, (T_{kj} - y_j(t)) \, P_{ki},   i = 1...m,  j = 1...n,  k = 1...p
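A sketch of this update loop in numpy (the OR dataset, learning rate and epoch count are our own illustration, not from the slides): a single neuron with a bias weight and targets in {0, 1}.

```python
import numpy as np

def hardlim(s):
    return (s >= 0).astype(float)  # 1 if s >= 0, else 0

P = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]])  # bias + x1, x2
T = np.array([0, 1, 1, 1])                                  # OR targets
w = np.zeros(3)
eta = 0.5

for epoch in range(10):             # present the patterns cyclically
    for p_k, t_k in zip(P, T):
        y = hardlim(w @ p_k)
        w += eta * (t_k - y) * p_k  # w(t+1) = w(t) + eta*(T_kj - y_j)*P_ki

print(w, hardlim(P @ w))            # weights that reproduce the OR targets
```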
2. Model of a neuron
The Perceptron (1958)
Novikoff, A. B. (1962). On convergence proofs on perceptrons. Symposium on the
Mathematical Theory of Automata, 12, 615-622. Polytechnic Institute of Brooklyn.
Perceptron convergence theorem: if the patterns P_k are linearly separable and this process is repeated cyclically over the inputs, it converges to the sought weights in finite time:

\exists \, t < \infty \;\; \text{such that} \;\; y_j(t) = T_{kj} \;\; \text{for all } k, j
2. Model of a neuron
The Perceptron (1958)
Vector notation:

[Figure: the perceptron with inputs x_0, x_1, ..., x_m and weights w_0, w_1, ..., w_m, where x_0 and w_0 carry the bias]

x = [x_0 x_1 x_2 ... x_m]^T
w = [w_0 w_1 w_2 ... w_m]^T
y = F(w^T x) = F(x^T w)
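In this notation the whole neuron is one dot product; a quick numpy sketch (the numeric values are arbitrary):

```python
import numpy as np

x = np.array([1.0, 1.0, 0.5])   # x0 = 1 carries the bias into the sum
w = np.array([-0.5, 1.0, 1.0])  # w0 is the bias weight
y = float(w.T @ x >= 0)         # y = F(w^T x) with F = hardlim
print(y)                        # 1.0, since w^T x = -0.5 + 1.0 + 0.5 = 1.0
```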
2. Model of a neuron
The Perceptron (1958)
The equation w^T x = 0 can be viewed as the equation of a line. Depending on the values of the weights, this line separates the four possible inputs into two categories.
Writing it out as w_0 x_0 + w_1 x_1 + w_2 x_2 = 0 with x_0 = 1 and solving for x_2, the equation of the line becomes

x_2 = -\frac{w_1}{w_2} x_1 - \frac{w_0}{w_2}

[Figure: decision lines in the (x_1, x_2) plane for several weight sets, e.g. (w_0, w_1, w_2) = (-1, -1, 1), (1, -1, 1), (1, 1, 1) and (-1, 1, 1)]
2. Model of a neuron
The Perceptron (1958)
Example

[Figure: the points (0,1), (1,1), (-1,1), (1,0), (-1,0) and (0,-1) in the plane, and the association ℝ² → {0,1} that the perceptron must learn, with a decision line separating the two classes]
2. Model of a neuron
The Perceptron (1958)
Minsky, M. L., & Papert, S. A. (1969). Perceptrons. Cambridge, MA: MIT Press.
The XOR problem (Minsky & Papert, 1969)

[Figure: the four points (0,0), (0,1), (1,0) and (1,1) in the plane with the XOR labelling ℝ² → {0,1}; no single line separates the two classes]
2. Model of a neuron
The Perceptron (1958)
The XOR problem (Minsky & Papert, 1969)

Functions such as XOR require two lines to separate the points (0,0), (0,1), (1,0) and (1,1) into the appropriate classes. Hence we would require multiple layers of neural units to represent this function.
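A sketch of that multi-layer idea (the wiring is one common choice, ours rather than the slides'): two threshold units draw the two lines, behaving like OR and NAND, and a third unit ANDs them together, which is exactly XOR.

```python
def step(s):
    return 1 if s >= 0 else 0  # hard threshold unit

def xor(x1, x2):
    h1 = step(x1 + x2 - 0.5)    # first line:  behaves like OR(x1, x2)
    h2 = step(-x1 - x2 + 1.5)   # second line: behaves like NAND(x1, x2)
    return step(h1 + h2 - 1.5)  # output unit: AND(h1, h2) = XOR(x1, x2)

for x1 in (0, 1):
    for x2 in (0, 1):
        print((x1, x2), '->', xor(x1, x2))  # 0, 1, 1, 0
```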
2. Model of a neuron
The Perceptron (1958)
Synthesis:

Architecture       |  single-layer feedforward
Transfer function  |  hardlim (hard limit)
Associations       |  ℝ^m → {0,1}
Learning rule      |  w_{ji}(t+1) = w_{ji}(t) + \eta (T_{kj} - y_j(t)) P_{ki},
                   |  i = 1...m, j = 1...n, k = 1...p,
                   |  where (T_{kj} - y_j(t)) is the ERROR
2. Model of a neuron
Learning
The aim of learning is to find an association between the patterns and the targets of the training set.

P = \begin{bmatrix} p(1,1) & \cdots & p(1,m) \\ \vdots & & \vdots \\ p(p,1) & \cdots & p(p,m) \end{bmatrix}   (patterns)

Y = \begin{bmatrix} y(1,1) & \cdots & y(1,n) \\ \vdots & & \vdots \\ y(p,1) & \cdots & y(p,n) \end{bmatrix}   (actual output)

T = \begin{bmatrix} t(1,1) & \cdots & t(1,n) \\ \vdots & & \vdots \\ t(p,1) & \cdots & t(p,n) \end{bmatrix}   (desired output)

E = T - Y = \begin{bmatrix} t(1,1)-y(1,1) & \cdots & t(1,n)-y(1,n) \\ \vdots & & \vdots \\ t(p,1)-y(p,1) & \cdots & t(p,n)-y(p,n) \end{bmatrix} = [0]
2. Model of a neuron
Learning
So the aim of learning is to find the weights that minimize the error. But how can we measure how far we are from E = [0]?

Objective function:

e = e(w) = \frac{1}{2} \sum_{k=1}^{p} \sum_{j=1}^{n} E(k,j)^2 = \frac{1}{2} \sum \operatorname{diag}\!\left( E^T E \right)

(the LEAST-SQUARE ERROR)
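A quick numerical check of the two forms of the objective (T and Y are made-up numbers): the double sum of squared errors and the sum of the diagonal of EᵀE agree.

```python
import numpy as np

T = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # desired outputs (p=3, n=2)
Y = np.array([[0.8, 0.1], [0.2, 0.7], [0.9, 0.6]])  # actual outputs
E = T - Y

e_sum  = 0.5 * np.sum(E**2)                # (1/2) sum_k sum_j E(k,j)^2
e_diag = 0.5 * np.sum(np.diag(E.T @ E))    # (1/2) sum diag(E^T E)
print(e_sum, np.isclose(e_sum, e_diag))    # same value either way
```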
2. Model of a neuron
Learning
It is possible to define the problem as a function approximation:

\min_{w} \; e = e(w) = \frac{1}{2} \sum_{k=1}^{p} \sum_{j=1}^{n} E(k,j)^2 = \frac{1}{2} \sum_{k=1}^{p} \sum_{j=1}^{n} \left( T(k,j) - Y(k,j) \right)^2

where

Y(k,j) = F\!\left( \sum_{i=0}^{m} w(j,i) \, P(k,i) \right)

Classification: classifier function
Regression: approximation function

[Figure: the error surface e(w) over the weight space]
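Putting the pieces together, a sketch of the objective as a function of the weights (dataset, activation F and weight values are our own illustration): a forward pass Y = F(PWᵀ) followed by the squared-error objective, so e(w) can be evaluated at any point of the weight space, tracing out the error surface.

```python
import numpy as np

def e_of_w(W, P, T, F=lambda s: (s >= 0).astype(float)):
    """e(w) = 1/2 * sum_k sum_j (T(k,j) - Y(k,j))^2,
    with Y(k,j) = F(sum_i w(j,i) P(k,i)); P's first column is the bias input."""
    Y = F(P @ W.T)
    return 0.5 * np.sum((T - Y) ** 2)

P = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]])  # p=4 patterns
T = np.array([[0.0], [1.0], [1.0], [1.0]])                  # n=1 target each
print(e_of_w(np.array([[-0.5, 1.0, 1.0]]), P, T))  # 0.0: these weights fit
print(e_of_w(np.array([[0.5, -1.0, -1.0]]), P, T)) # 2.0: a worse point
```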