
Artificial Intelligence:

Deep learning-based text processing


2. Multi-Layer Perceptron

Wonsu Kim
KISTI School
Contents

• Artificial Neural Network


• Multi-layer Perceptron



Artificial Neural Network
• The recent surge in popularity of deep learning traces back to artificial neural
networks (ANN), which have been studied since the 1950s.
• Artificial neural networks are computing structures inspired by biological neural
networks.

source : ‘From Vision to Actions: Towards Adaptive and Autonomous Humanoid Robots’, Jürgen Schmidhuber



Artificial Neural Network
An artificial neural network is composed of units or nodes, which are loosely modeled
after biological neurons.

[Figure: an artificial neural network with an input layer, a hidden layer, and an output layer]



Advantage of Artificial Neural Network
• The first advantage is that it can learn. Given data, the neural network can learn
from examples.

• The second advantage is that even if a few units malfunction, the overall system
still functions without significant issues.



Mathematical Model of a Perceptron (Neuron)

[Diagram: a perceptron; inputs multiplied by weights are summed, compared against a threshold, and passed through an activation function f to produce the output y]
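Written out, the diagram computes a weighted sum of the inputs, compares it against the threshold, and applies the activation function (a standard formulation of the perceptron; the symbols follow the diagram labels, with $\theta$ denoting the threshold):

$$y = f\!\left(\sum_{i=1}^{n} w_i x_i - \theta\right)$$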


Can Perceptron learn logical gate?

[Diagram: a perceptron with inputs, weights, a bias, and activation function f, set up for the "AND" problem]

The AND truth table the perceptron must reproduce:

x1 x2 | y
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1




Perceptron Learning Algorithm
• To be called learning, a neural network requires an algorithm that automatically adjusts the weights by itself. Even in a perceptron, a learning algorithm exists.

Intuition: the error $d - y$ needs to be adjusted toward 0!
• $d - y > 0$: the weight vector should align more with the input vector $x$ ⇒ increased $w \cdot x$, increased $y$
• $d - y < 0$: the weight vector should align less with the input vector $x$ ⇒ decreased $w \cdot x$, decreased $y$

Algorithm
Input: training data $(x_1, d_1) \sim (x_m, d_m)$: $x_k$ → input, $d_k$ → target
1. Initialize all weights $w$ and the bias $b$ to 0 or to small random values.
2. Repeat until the weights no longer change:
   For each training example $x_k$ with target $d_k$, compute the output $y$, then update each weight:
   $$w_i \leftarrow w_i + \eta\,(d_k - y)\,x_{k,i}$$
   where $\eta$ is the learning rate and the right-hand side is the updated weight.
Perceptron Programming for AND Gate
import numpy as np

epsilon = 0.0000001          # tolerance for rounding of floating-point numbers
eta = 0.2                    # learning rate

X = np.array([               # training dataset; the last 1 in each row feeds the bias weight
    [0, 0, 1],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 1]
])
y = np.array([0, 0, 0, 1])   # answers (AND gate)
W = np.zeros(len(X[0]))      # weights, including the bias weight

def step_func(t):            # activation function
    if t > epsilon:
        return 1
    else:
        return 0

def perceptron_fit(X, Y, epochs=10):     # training for perceptron
    global W
    for t in range(epochs):
        print("epoch=", t, "======================")
        for i in range(len(X)):
            predict = step_func(np.dot(X[i], W))
            error = Y[i] - predict       # calculating error
            W += eta * error * X[i]      # weight update
            print("Current Input=", X[i], "Answer=", Y[i], "Output=",
                  predict, "Changed Weights=", W)
        print("================================")

def perceptron_predict(X, Y):            # prediction
    global W
    for x in X:
        print(x[0], x[1], "->", step_func(np.dot(x, W)))

perceptron_fit(X, y, 6)
perceptron_predict(X, y)
Output
epoch= 0 ======================
Current Input= [0 0 1] Answer= 0 Output= 0 Changed Weights= [0. 0. 0.]
Current Input= [0 1 1] Answer= 0 Output= 0 Changed Weights= [0. 0. 0.]
Current Input= [1 0 1] Answer= 0 Output= 0 Changed Weights= [0. 0. 0.]
Current Input= [1 1 1] Answer= 1 Output= 0 Changed Weights= [0.2 0.2 0.2]
================================
epoch= 1 ======================
Current Input= [0 0 1] Answer= 0 Output= 1 Changed Weights= [0.2 0.2 0. ]
Current Input= [0 1 1] Answer= 0 Output= 1 Changed Weights= [ 0.2 0. -0.2]
Current Input= [1 0 1] Answer= 0 Output= 0 Changed Weights= [ 0.2 0. -0.2]
Current Input= [1 1 1] Answer= 1 Output= 0 Changed Weights= [0.4 0.2 0. ]
================================
epoch= 2 ======================
Current Input= [0 0 1] Answer= 0 Output= 0 Changed Weights= [0.4 0.2 0. ]
Current Input= [0 1 1] Answer= 0 Output= 1 Changed Weights= [ 0.4 0. -0.2]
Current Input= [1 0 1] Answer= 0 Output= 1 Changed Weights= [ 0.2 0. -0.4]
Current Input= [1 1 1] Answer= 1 Output= 0 Changed Weights= [ 0.4 0.2 -0.2]
================================
epoch= 3 ======================
Current Input= [0 0 1] Answer= 0 Output= 0 Changed Weights= [ 0.4 0.2 -0.2]
Current Input= [0 1 1] Answer= 0 Output= 0 Changed Weights= [ 0.4 0.2 -0.2]
Current Input= [1 0 1] Answer= 0 Output= 1 Changed Weights= [ 0.2 0.2 -0.4]
Current Input= [1 1 1] Answer= 1 Output= 0 Changed Weights= [ 0.4 0.4 -0.2]
================================
Output

epoch= 4 ======================
Current Input= [0 0 1] Answer= 0 Output= 0 Changed Weights= [ 0.4 0.4 -0.2]
Current Input= [0 1 1] Answer= 0 Output= 1 Changed Weights= [ 0.4 0.2 -0.4]
Current Input= [1 0 1] Answer= 0 Output= 0 Changed Weights= [ 0.4 0.2 -0.4]
Current Input= [1 1 1] Answer= 1 Output= 1 Changed Weights= [ 0.4 0.2 -0.4]
================================
epoch= 5 ======================
Current Input= [0 0 1] Answer= 0 Output= 0 Changed Weights= [ 0.4 0.2 -0.4]
Current Input= [0 1 1] Answer= 0 Output= 0 Changed Weights= [ 0.4 0.2 -0.4]
Current Input= [1 0 1] Answer= 0 Output= 0 Changed Weights= [ 0.4 0.2 -0.4]
Current Input= [1 1 1] Answer= 1 Output= 1 Changed Weights= [ 0.4 0.2 -0.4]
================================
0 0 -> 0
0 1 -> 0
1 0 -> 0
1 1 -> 1



Training Perceptron with sklearn

from sklearn.linear_model import Perceptron

# Samples and Label


X = [[0,0],[0,1],[1,0],[1,1]]
y = [0, 0, 0, 1]

# Creating Perceptron. tol: stop condition. random_state: seed for random number
clf = Perceptron(tol=1e-3, random_state=0)

# Training
clf.fit(X, y)

# Testing
print(clf.predict(X))

[0 0 0 1]



Limitation of Perceptron
• XOR Operation
x1 x2 | y
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0

[Figure: the four XOR points plotted in the (x1, x2) plane; the two classes cannot be separated by a single straight line]

How can you separate them with one line?
Training Perceptron for XOR
from sklearn.linear_model import Perceptron

# Samples and Label


X = [[0,0],[0,1],[1,0],[1,1]]
y = [0, 1, 1, 0]

# Creating Perceptron. tol: stop condition. random_state: seed for random number
clf = Perceptron(tol=1e-3, random_state=0)

# Training
clf.fit(X, y)

# Testing
print(clf.predict(X))

[0 0 0 1]

Learning does not occur at all. The perceptron can learn AND or
OR operations, but why can't it learn the XOR operation?
Linearly separable problems

• From a pattern recognition perspective, a perceptron can be described as a type of linear classifier that classifies input patterns using a straight line.



Linearly separable problems

• In their 1969 book "Perceptrons," Minsky and Papert mathematically proved that
a single-layer perceptron cannot learn the XOR problem.

How can I separate them with ONLY one line?

[Figure: the XOR outputs plotted in the input plane; no single straight line separates the two classes]


Solving XOR Problem with Multi-layer Perceptron

• The XOR operation cannot be correctly classified using a single straight line.
• However, if you use two straight lines, you can classify XOR inputs correctly.



Solving XOR Problem with Multi-layer Perceptron

• If you use two straight lines, you can classify XOR inputs correctly.

Hidden unit y1: weights w = 1, w = 1, bias = −0.5 ⇒ $y_1 = H(x_1 + x_2 - 0.5)$
Hidden unit y2: weights w = 1, w = 1, bias = −1.5 ⇒ $y_2 = H(x_1 + x_2 - 1.5)$
Output unit: weights w′ = 1, w′ = −2, bias = 0 ⇒ $y = H(y_1 - 2\,y_2 + 0)$
($H$: step function)

x1 x2 | y1          | y2          | y          | Target
 0  0 | H(−0.5) = 0 | H(−1.5) = 0 | H(0) = 0   | 0
 0  1 | H(0.5) = 1  | H(−0.5) = 0 | H(1) = 1   | 1
 1  0 | H(0.5) = 1  | H(−0.5) = 0 | H(1) = 1   | 1
 1  1 | H(1.5) = 1  | H(0.5) = 1  | H(−1) = 0  | 0

In general: $y_1 = H(w_1 x_1 + w_2 x_2 + b_1)$, $y_2 = H(w_1 x_1 + w_2 x_2 + b_2)$, $y = H(w'_1 y_1 + w'_2 y_2 + b)$
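As a quick sanity check, this two-line network can be coded directly (a minimal sketch; the weights and biases are the ones from the table above):

def H(t):                           # step function
    return 1 if t > 0 else 0

def xor_mlp(x1, x2):
    y1 = H(x1 + x2 - 0.5)           # first line: OR-like boundary
    y2 = H(x1 + x2 - 1.5)           # second line: AND-like boundary
    return H(1 * y1 - 2 * y2 + 0)   # output unit combines the two lines

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, "->", xor_mlp(x1, x2))   # prints 0, 1, 1, 0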


Training Algorithm for Multi-layer Perceptron

• Minsky and Papert predicted that finding an algorithm to train multi-layer perceptrons would be very difficult. Did this prediction come true? No, it did not. In the mid-1980s, Rumelhart, Hinton, and others rediscovered a learning algorithm for multi-layer perceptrons: Backpropagation!


Contents

• Artificial Neural Network


• Multi-layer Perceptron



Problem of Perceptron

• The perceptron was unable to correctly learn the XOR operation.


• This is because the structure of the perceptron was too simple. The human brain
has about 100 billion neurons working together to solve complex cognitive
problems, but the perceptron uses only a single neuron.

Humans use about 100 billion


neurons.

AND gate is OK, BUT,


Not possible for XOR gate..
Only ONE perceptron..

UST AI Class - 2025 22


MLP

• Multilayer Perceptron (MLP): a neural network that contains a hidden layer between the input layer and the output layer.

How do we train an MLP, which has multiple nodes and layers?
• Backpropagation
• Activation functions

[Diagram: forward pass from the input layer through the hidden layer to the output layer, error calculation at the output, then a backward pass] ⇒ The error should decrease through repeated forward and backward passes.
Activation Function

• Role: Transforms the output of the previous layer and passes the signal to the
neurons in the next layer
• In the perceptron, a step function was used as the activation function, but in
MLPs, various “nonlinear” functions are used as activation functions.

[Diagram: a neuron's inputs, multiplied by weights and combined with a bias, are summed and passed through the activation function to produce the output]

• step function: outputs 0 or 1
Using a linear function repeatedly is not effective
• This is because it has been mathematically proven that combining multiple linear functions is ultimately equivalent to a single linear function.

[Figure: two networks perform the same function: a stack of linear layers, and the single linear layer it collapses to]

Any number of linear layers can be replaced by one layer.
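To see why, compose two linear layers (a one-line derivation; $W_1, W_2$ are the layers' weight matrices and $b_1, b_2$ their biases):

$$W_2 (W_1 x + b_1) + b_2 = (W_2 W_1)\,x + (W_2 b_1 + b_2) = W x + b$$

The two layers act as a single linear layer with $W = W_2 W_1$ and $b = W_2 b_1 + b_2$; only a nonlinear activation between layers prevents this collapse.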


Activation Function

[Figure: common activation functions]
Step function

• The step function is a function that outputs 1 if the sum of the input signals
exceeds 0, and 0 otherwise.

[Plot: the step function]


Sigmoid function

• The sigmoid function is a traditional activation function used since the 1980s. It
has an S-shaped curve and is defined as:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

[Plot: the sigmoid function]


ReLU (Rectified Linear Unit) function

• The ReLU (Rectified Linear Unit) function outputs the input directly if it is greater
than 0, and outputs 0 if the input is less than or equal to 0.

$$f(x) = \max(0, x)$$

[Plot: the ReLU function]


Hyperbolic Tangent(tanh) function

• The `tanh()` function is provided by NumPy. It is notable for its output range,
which is from -1 to +1.

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

[Plot: the tanh function]
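The four activation functions above are easy to compare side by side in NumPy (a minimal sketch; step here uses 0 as the threshold, matching the definition on the step-function slide):

import numpy as np

def step(x):                    # 1 if the input exceeds 0, else 0
    return (x > 0).astype(float)

def sigmoid(x):                 # S-shaped curve, outputs in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):                    # passes positive inputs, zeroes the rest
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(step(x))      # [0. 0. 0. 1. 1.]
print(sigmoid(x))   # [0.119 0.378 0.5 0.622 0.881] (rounded)
print(relu(x))      # [0. 0. 0. 0.5 2.]
print(np.tanh(x))   # [-0.964 -0.462 0. 0.462 0.964] (rounded)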


Feed-forward Pass

• The feed-forward pass refers to the process in which input signals are fed into
the input layer units and then propagate through the hidden layers to reach the
output layer.
[Diagram: forward pass; input signals enter the input layer, propagate through the hidden layer to the output layer, and the error is calculated at the output]
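A forward pass through one hidden layer is just two rounds of "weighted sum, then activation" (a minimal sketch with made-up weights; sigmoid is used as the activation purely for illustration):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([1.0, 0.0])        # input vector

W1 = np.array([[0.1, 0.4],      # input -> hidden weights (2 hidden units)
               [0.2, 0.3]])
b1 = np.array([0.1, 0.2])       # hidden biases

W2 = np.array([[0.5, 0.6]])     # hidden -> output weights (1 output unit)
b2 = np.array([0.1])            # output bias

h = sigmoid(W1 @ x + b1)        # hidden layer activations
y = sigmoid(W2 @ h + b2)        # network output
print(h, y)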


Feed-forward Pass

[Figure: a worked forward-pass computation]


Use of Matrix

It's a matrix multiplication!

[Figure: a layer's weighted sums written as a matrix-vector product]


Use of Matrix

It's a famous equation in NN!

$$\begin{bmatrix} h_1 \\ h_2 \\ h_3 \end{bmatrix} = f\!\left( \begin{bmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \right)$$

that is, $h = f(Wx + b)$: hidden activations $h$ from the input $x$, the weight matrix $W$, and the bias vector $b$.
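In NumPy this is one line, and the unit-by-unit loop and the matrix product give the same result (a minimal sketch with arbitrary numbers):

import numpy as np

W = np.array([[0.1, 0.2],
              [0.3, 0.4],
              [0.5, 0.6]])      # 3 hidden units, 2 inputs
x = np.array([1.0, 2.0])
b = np.array([0.1, 0.1, 0.1])

# unit-by-unit weighted sums
loop = np.array([sum(W[i, j] * x[j] for j in range(2)) + b[i] for i in range(3)])

# the same thing as a matrix-vector product
mat = W @ x + b

print(np.allclose(loop, mat))   # True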


Error Calculation

• If the output value is incorrect, how should the weights be adjusted? When
training a neural network, the error between the output and target values (i.e.,
desired output) is used. Weights are adjusted in a direction that minimizes this
error.

[Diagram: the forward pass produces the output, the error against the target is calculated, and the backward pass adjusts the weights]


Loss Function

• In neural networks, there needs to be a metric to indicate the degree of learning.


This metric is called the loss function. The value of the loss function is the error
between the target and the network's output.

[Diagram: the loss is the difference between the label and the prediction]


Loss Function

• The total error is the sum of the squared differences between the target output
values and the actual output values for all output nodes.

MSE loss for 1 sample:

$$E = \frac{1}{2} \sum_{i} (d_i - y_i)^2$$

where $d_i$ is the target value and $y_i$ the actual output of output node $i$; the factor $\frac{1}{2}$ matches the MSE code two slides ahead.

[Figure: one-hot targets for a classifier: Cat = 1.0 and all others 0.0, or Dog = 1.0 and all others 0.0]


Example of Calculating Loss Function

• For example, suppose you have configured an MLP to recognize handwritten


digits. Since the digits range from 0 to 9, the output layer would have 10 units.

[Figure: a 28 × 28-pixel handwritten digit image fed into the MLP]


Example of Calculating Loss Function

• For example, suppose you have configured an MLP to recognize handwritten


digits. Since the digits can range from 0 to 9, the output layer would have 10
units.
def MSE(target, y):
return 0.5 * np.sum((y-target)**2)

0 1 2 3 4 5 6 7 8 9 ← corresponding digits
target = np.array([ 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ]) ← our target values

y = np.array([ 0.9, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ] ) ← outputs from the model
>>> MSE(target, y)
0.81 ← calculated loss (seems large)
y = np.array([ 0.0, 0.0, 0.8, 0.1, 0.0, 0.0, 0.0, 0.1, 0.0, 0.0 ]) ← revised outputs from the model
>>> MSE(target, y)
0.029999999999999992 ← calculated loss (improved!!)



Backpropagation
• The backpropagation algorithm is applied after a forward pass is performed to
calculate the output for a given input.
• Before the backpropagation works, the error between the calculated output and
ground truth needs to be calculated.
• The error is propagated backward through the network, and the weights are
adjusted in the direction that reduces the error.
[Diagram: forward pass → error calculation → backward pass]
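The direction "that reduces the error" is the negative gradient of the loss with respect to each weight; in its standard gradient-descent form (with learning rate $\eta$):

$$w \leftarrow w - \eta\,\frac{\partial E}{\partial w}$$

Backpropagation is the efficient procedure for computing all the $\partial E / \partial w$ terms, applying the chain rule layer by layer from the output back toward the input.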


Backward Pass

• In the backward pass, the process of adjusting weights uses the errors computed
in the previous step. The backpropagation algorithm is employed here.
Backpropagation approaches the problem of minimizing the loss function as an
optimization problem.
[Diagram: the backward pass runs from the output layer back through the hidden layer, using the calculated error]
The process of adjusting weights

[Figure: a network next to a radio dial; you simply need to turn the dial until the correct frequency is tuned]

The process of adjusting weights is akin to tuning a radio by adjusting the dial to find the correct frequency.


The process of adjusting weights

• When increasing a weight results in a decrease in error → increase the weight.
• When increasing a weight results in an increase in error → decrease the weight.

• In practice, you can compute the output of the neural network after increasing or decreasing a weight, compare this output with the target, and recalculate the error. However, this method is time-consuming because it requires recalculating the network's output every time a weight is changed. Is there a better method? (A sketch of this naive probing approach follows.)
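For intuition, the naive approach amounts to a finite-difference probe of every weight (a minimal sketch; loss_for_weights is a hypothetical helper standing in for a full forward pass plus loss computation):

import numpy as np

def numerical_gradient(loss_for_weights, W, h=1e-4):
    # loss_for_weights: hypothetical helper (full forward pass + loss for weights W)
    # Probe each weight up and down and measure how the loss changes.
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        orig = W[idx]
        W[idx] = orig + h
        loss_plus = loss_for_weights(W)     # full forward pass, weight nudged up
        W[idx] = orig - h
        loss_minus = loss_for_weights(W)    # full forward pass, weight nudged down
        W[idx] = orig                       # restore the weight
        grad[idx] = (loss_plus - loss_minus) / (2 * h)
    return grad

# Every weight costs two full forward passes here; backpropagation obtains the
# same gradients with essentially one forward and one backward pass.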


Practice – Set up a Development Environment
AI program development environment settings

Programming Language

• Computers can understand only machine language
– binary code made up of 0s and 1s
• A programming language acts as a bridge between human language and machine language
– Low-level language: a language close to machine language (e.g., assembly)
– High-level language: a language that humans can easily understand (e.g., Java, C++, Python)
• Computer program
– a sequence or set of instructions in a programming language for a computer to execute


Programming Language
 Compiler language
• Translates source code into machine code
• e.g., C, C++, Java, Fortran

 Interpreter language
• Executes instructions directly without compilation
• e.g., Python, JavaScript, Ruby


AI program development environment settings

Python

• A high-level, general-purpose programming language
• Its design philosophy emphasizes code readability
• One of the most popular programming languages
• Primarily an interpreted language, but can be compiled
• Various and rich libraries
• Free
• Guido van Rossum released it in 1991


Python development environment and libraries for
developing artificial intelligence models

[Figure: Python IDEs, and libraries for machine learning and deep learning]

source: Software Carpentry-xwMOOC (https://statkclee.github.io/ml/index.html)


Building a Python development environment

• Using Cloud (Google Colab)


– Cloud-based Python development environment provided by Google
– Free to use with just a Google account (sessions last about 12 hours)
– Use via web browser (Google Chrome recommended)
– Various preinstalled Python libraries and additional installations are possible
– Interactive interface similar to Jupyter Notebook
– Can use Google Drive as a storage device
– Not only CPU, but also GPU and TPU can be used
– http://colab.research.google.com



Practice environment – Google Colab
• Notebook title
• Menu
• Server connection and resource usage

• Navigator
– Move/add a table of contents in the notebook
– Code snippets
– File explorer

• Notebook
– Code cells and text cells
– Displays the result of a code cell


Practice environment – Google Colab
• Notebook sharing
• Cell type selection

[Screenshot: a text cell, a code cell, and its execution result]

• Notebook
– Code cells and text cells
– Displays the result of a code cell


Jupyter Notebook
• "Jupyter Notebook" refers both to the user-facing application for editing code and text, and to the underlying file format, which is interoperable across many implementations
• Application
– a web-based interactive computational environment for creating notebook documents
• Document
– cells for Markdown (display), code (to execute), and the output of code cells

[Figure: architecture of Jupyter Notebook (as an application)]
Colab execution/save environment

[Diagram: Colab connects to Google Drive and GitHub for executing and saving notebooks]

Source: https://zerowithdot.com/colab-github-workflow/


Google Drive

• A cloud-based collaboration tool and file storage/sharing service provided by Google.


• Use via web browser (Google Chrome recommended)
• By default, 15 GB for free or you can purchase more space
• http://drive.google.com
• How to mount Google Drive in Colab: use the file-browser interface, or execute the mount command (see below)
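The command-line route is two lines in a Colab code cell (this is Colab's standard mount API; /content/drive is the conventional mount point):

from google.colab import drive
drive.mount('/content/drive')   # prompts for Google account authorization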


GitHub

• internet hosting service for software development and version control


• commonly used to host open source software development projects
• As of Jan 2023, it had over 100M developers and more than 420M repositories
• https://github.com/



Building a Python development environment

• local installation (Official Python distribution)


– https://www.python.org/
– Complex installation
– Difficult to change version
– Libraries must be installed separately



Building a Python development environment

• local installation (Anaconda distribution)


– https://www.anaconda.com/
– a distribution of Python for scientific computing (data science, machine learning applications, large-scale data processing, predictive analytics, etc.)
• also a distribution of the R language
– simplifies package management and deployment


Anaconda distribution

[Slides 58–63: Anaconda distribution screenshots]
