The most advanced and complex models are based on deep learning. Their foundation is
neural networks.
An artificial neural network is a mathematical model that tries to mimic what
happens in our brains. Our neurons are connected; this is how we learn. Learning means
that we create neural maps in our brain (some neurons are activated and some are
not). Once we have created this neural map, we recall it when we see the
same image again.
In a neural network we have inputs and outputs.
We have an activation function: max(x1·w1 + x2·w2 + bias, 0)
w1 and w2 are weights (numbers)
bias is another parameter (a number)
Using the max function we calculate the output (the maximum between the weighted sum and 0).
Neural network parameters: w1 = 0.2; w2 = 2; bias = 1. Inputs: x1 = 1, x2 = 0.5
Output: max(0.2×1 + 2×0.5 + 1, 0) = 2.2 (if the weighted sum were negative, the maximum would be 0)
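As a quick sketch, here is the same calculation in Python (the function name neuron_output is just illustrative):

```python
# Single neuron with a ReLU activation: max(weighted sum, 0)
def neuron_output(x1, x2, w1, w2, bias):
    return max(x1 * w1 + x2 * w2 + bias, 0)

# Parameters and inputs from the example above
print(neuron_output(x1=1, x2=0.5, w1=0.2, w2=2, bias=1))  # 2.2
```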
In this case we have to try to get close to the target. The objective is to minimize the error to obtain
a good model.
The error = target − output
Here we use the absolute value because we have multiple targets and we don't want positive and negative errors to cancel out when we sum them:
Error = |target − max(x1·w1 + x2·w2 + bias, 0)|
We need to calculate the weights and the bias so that the error is as small as possible.
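A minimal sketch of summing the absolute errors over several examples, reusing the illustrative neuron_output function from above (the data pairs are invented):

```python
# (inputs, target) pairs -- invented values for illustration only
data = [((1, 0.5), 2.0), ((0.5, 1), 3.0)]

# Sum of absolute errors over all examples
total_error = sum(abs(target - neuron_output(x1, x2, w1=0.2, w2=2, bias=1))
                  for (x1, x2), target in data)
print(total_error)
```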
These weights are calculated using the backpropagation algorithm.
When we want to train a neural network (fit a model), we need to find which
weights and which bias are optimal, i.e., minimize the sum of the errors.
We use all of the data and calculate the weights over all of it, minimizing the average error.
We calculate the weights with a trial-and-error process. We give the weights random
values between 0 and 1, then we calculate the outputs, and then we calculate the
errors. Using the backpropagation algorithm and derivatives, we adjust the weights to
minimize the whole error.
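A minimal sketch of this training loop for the single neuron above: random initial weights in [0, 1], then gradient descent, which is the idea behind backpropagation (squared error is used instead of absolute error only to keep the derivatives simple; the data pairs are invented):

```python
import random

# Invented (inputs, target) pairs for illustration
data = [((1.0, 0.5), 2.2), ((0.5, 1.0), 3.0), ((2.0, 0.0), 1.4)]

# Random initial weights and bias between 0 and 1
w1, w2, bias = random.random(), random.random(), random.random()
lr = 0.01  # learning rate

for epoch in range(1000):
    for (x1, x2), target in data:
        z = x1 * w1 + x2 * w2 + bias
        output = max(z, 0)          # ReLU activation
        error = target - output
        if z > 0:                   # ReLU only passes gradients when active
            # Derivatives of (target - output)^2 with respect to each parameter
            w1 += lr * 2 * error * x1
            w2 += lr * 2 * error * x2
            bias += lr * 2 * error

print(w1, w2, bias)  # weights and bias that (approximately) minimize the error
```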
Neural networks can only deal with numbers. So whatever we do with a neural network: if we
have pictures, these pictures must be converted into numbers; if the outputs are letters, these
letters must be converted into numbers (very important).
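A small sketch of both conversions (the encodings shown are common choices, not the only ones):

```python
# Letters as one-hot vectors over an illustrative 3-letter alphabet
letters = ["a", "b", "c"]
one_hot = {ch: [1 if i == j else 0 for j in range(len(letters))]
           for i, ch in enumerate(letters)}
print(one_hot["b"])  # [0, 1, 0]

# A picture is already numbers once digitized: each pixel is an intensity,
# e.g. a 2x2 grayscale image with values in [0, 255]
image = [[0, 255],
         [128, 64]]
```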
https://playground.tensorflow.org/#activation=tanh&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.16397&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false
Practise with it
(identify the blue dots and the yellow dots)
More neurons mean more weights to calculate. Adding neurons or layers of neurons makes the
models better and more accurate, but it also means more money to spend on computers. You
have to see if you can simplify the neural network.
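To see how fast the weights grow, here is a sketch that counts the parameters of a fully connected network from its layer sizes (the layer sizes are made-up examples):

```python
# Each layer contributes (inputs x outputs) weights plus one bias per output
def count_parameters(layer_sizes):
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(count_parameters([2, 4, 2, 1]))  # 25 parameters
print(count_parameters([2, 8, 8, 1]))  # 105 parameters: more neurons, many more weights
```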
Deep learning is a technique within machine learning. Deep learning is based on neural
networks with many layers and many neurons. We need them to make a lot of
calculations. Driverless cars, for example, use deep learning.
It has been discovered that when we use deep learning, every layer specializes in doing
something. When we analyze the layers, one layer is able to detect things or grey areas that run vertically,
another layer identifies things that are rounded, and so on. Then we are finally able to recognize content
in photos. The more layers and the more neurons, the more information we'll be able to
understand.
In this case some layers are able to identify edges, some textures, some object parts, and finally
some will identify whole objects (a specialized task for each layer).
The performance of neural networks is much better than that of traditional and classical
techniques.
To calculate deep learning models we need millions of data points because they are very complex (and a lot
of money).
There are many deep learning techniques (a curiosity; we don't have to learn slide 13).
AlexNet has 8 layers of neurons, but each layer has a lot of neurons inside. This was one of
the initial deep learning models, getting an accuracy of 84.7%. To calculate this model we
needed to calculate almost 63 million parameters, i.e., 63 million weights (you
need a lot of computers). That's why we cannot calculate deep learning models with our normal
computers.
Scientists started to try other models, changing the number of layers and neurons and
comparing them with the previous models. Another model was VGGNet: it has many more
layers, and we needed to calculate 138 million parameters/weights.
The VGG16 and the VGG19 also appeared.
There's also ResNet. They changed the architecture of the model, and only 12 million
parameters needed to be calculated.
Scientists tried to change the architecture and how the information flows to try to decrease
the number of calculations.
Inception was the first model that Google calculated, always keeping in mind improving the
accuracy while decreasing the number of weights at the same time. By changing the
architecture you can decrease the number of parameters.
The best one in accuracy was ResNet. Although Inception is not as good as ResNet, it
needs fewer parameters. This means that if you need to retrain the models, the bigger
ones will take a lot of time and money. From a practical standpoint
we will probably use Inception: it is not worth gaining only 2% in accuracy while
multiplying the number of parameters by 10.
A FLOP (floating-point operation) is a measure of the number of calculations/operations in a process. More
calculations mean more data, time and money (memory, space, processing...).
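As a rough sketch, a common back-of-the-envelope convention counts about 2 × n_in × n_out FLOPs for one fully connected layer (one multiplication and one addition per weight); the layer size below is just an example:

```python
# Rough FLOP estimate for a fully connected layer
def dense_layer_flops(n_in, n_out):
    return 2 * n_in * n_out  # one multiply + one add per weight

print(dense_layer_flops(4096, 4096))  # 33,554,432 ~ 33.6 million FLOPs
```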
We have other architectures: DenseNet.