
Back Propagation Neural Networks

Abstract

I used to think that the BP (Back Propagation) neural network algorithm belonged to supervised learning; however, after learning about the Sparse Autoencoder algorithm, I realized it can also be used for unsupervised learning (using the unlabeled data itself as both input and output). The BP neural network is the basis of many more advanced neural network algorithms; it is simple, but powerful.

NEURON AND NEURAL NETWORKS

A neural network is built from a number of computational units, each of which is called a 'Neuron'. The reason for borrowing these biological terms is that neuroscientists have told us how our brains work, and we are trying to mimic part of a brain to make machines more powerful. As usual, we first establish an idealized model that describes the neural system. Our definition of a neuron is something like this: a neuron is a unit with inputs and outputs; attached to each input there is a weight that decides how much of that input value the neuron takes in.

For the neuron above, the input is W1 * X1 + W2 * X2 + W3 * X3 + 1 * b. Sometimes we also treat this bias b as W0 and fix X0 at the value +1, so the input can also be written as WᵀX. What about the output? The output is output = f(input), where f() is called the activation function. Here I'll use the sigmoid function as the activation function (you can use tanh() as well, as I said in the Logistic Regression post), so O1 = O2 = O3 = sigmoid(WᵀX).

If we connect multiple neurons, we get a neural network; a simple example looks like this: in this particular case, we say the network has 3 layers: 1 input layer (L1), 1 hidden layer (L2), and 1 output layer (L3). In both the input layer and the hidden layer we have a "+1" unit; these are called bias units. We can either count a bias unit as one of the neurons in its layer or not; this depends on how you implement the neural network algorithm.
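The single-neuron computation described above (weighted sum of inputs plus bias, then the sigmoid activation) can be sketched as follows; the weight and input values here are illustrative assumptions, not taken from the text:

```python
import numpy as np

def sigmoid(z):
    """Logistic activation function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and inputs for one neuron with 3 inputs.
W = np.array([0.5, -0.3, 0.8])   # W1, W2, W3
b = 0.1                          # bias term b
X = np.array([1.0, 2.0, 3.0])    # X1, X2, X3

# Neuron input: W1*X1 + W2*X2 + W3*X3 + 1*b
z = W @ X + b
output = sigmoid(z)

# Equivalent form: treat the bias b as W0 with X0 fixed at +1,
# so the input is simply the inner product WᵀX.
W_aug = np.concatenate(([b], W))    # W0 = b
X_aug = np.concatenate(([1.0], X))  # X0 = +1
assert np.isclose(W_aug @ X_aug, z)
```

Folding the bias into the weight vector this way is just a notational convenience; it lets the whole input be written as one inner product, WᵀX.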
See the rightmost part of this network: the output of the output layer is h_{W,b}(x), meaning the output value computed using the current weights W and b. This is exactly the same idea as in Linear Regression and Logistic Regression.
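Putting the layers together, computing h_{W,b}(x) is just the neuron computation applied layer by layer (forward propagation). A minimal sketch for a 3-layer network like the one described; the layer sizes and random weights are my own illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """Forward propagation through one hidden layer: returns h_{W,b}(x)."""
    a2 = sigmoid(W1 @ X + b1)   # hidden layer activations (L2)
    a3 = sigmoid(W2 @ a2 + b2)  # output layer activations (L3)
    return a3

rng = np.random.default_rng(0)
# Hypothetical sizes: 3 input units, 4 hidden units, 2 output units.
W1 = rng.normal(size=(4, 3)); b1 = rng.normal(size=4)
W2 = rng.normal(size=(2, 4)); b2 = rng.normal(size=2)

X = np.array([0.5, -1.0, 2.0])
h = forward(X, W1, b1, W2, b2)  # the network's prediction h_{W,b}(x)
```

With sigmoid activations, every component of h lies in (0, 1), which is why the output plays the same role as the hypothesis in Logistic Regression.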