Introduction to Extreme Learning Machines

Guang-Bin HUANG
Assistant Professor, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore

Hands-on Workshop on Machine Learning for BioMedical Informatics 2006, National University of Singapore, 21 Nov 2006

ELM Web Portal: [Link]/home/egbhuang
Outline

Neural Networks
    Single-Hidden-Layer Feedforward Networks (SLFNs)
    Function Approximation of SLFNs
    Conventional Learning Algorithms of SLFNs
Extreme Learning Machine
    Unified Learning Platform
    ELM Algorithm
Performance Evaluations
Feedforward Neural Networks with Additive Nodes

Output of the hidden nodes:

    $G(a_i, b_i, x) = g(a_i \cdot x + b_i)$    (1)

$a_i$: the weight vector connecting the $i$-th hidden node and the input nodes.
$b_i$: the threshold of the $i$-th hidden node.

Output of the SLFN:

    $f_L(x) = \sum_{i=1}^{L} \beta_i G(a_i, b_i, x)$    (2)

$\beta_i$: the weight vector connecting the $i$-th hidden node and the output nodes.

Figure 1: Feedforward network architecture with additive hidden nodes.
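As a concrete illustration of Eqs. (1)-(2), the sketch below computes the output of an SLFN with additive hidden nodes. It is a minimal sketch, assuming NumPy, a sigmoid choice of $g$, and the shape conventions noted in the docstring.

```python
import numpy as np

def slfn_output(X, A, b, beta):
    """SLFN forward pass with L additive hidden nodes.
    X: (N, n) inputs; A: (L, n) input weights a_i; b: (L,) thresholds b_i;
    beta: (L, m) output weights beta_i. Returns (N, m) outputs f_L(x)."""
    H = 1.0 / (1.0 + np.exp(-(X @ A.T + b)))  # Eq. (1): G(a_i, b_i, x) = g(a_i . x + b_i)
    return H @ beta                           # Eq. (2): f_L(x) = sum_i beta_i G(a_i, b_i, x)
```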
Feedforward Neural Networks with Additive Nodes

Figure 2: Activation functions g(x) (panels (a), (b), (c)).
Feedforward Neural Networks with RBF Nodes

Output of the hidden nodes:

    $G(a_i, b_i, x) = g(b_i \| x - a_i \|)$    (3)

$a_i$: the center of the $i$-th hidden node.
$b_i$: the impact factor of the $i$-th hidden node.

Output of the SLFN:

    $f_L(x) = \sum_{i=1}^{L} \beta_i G(a_i, b_i, x)$    (4)

$\beta_i$: the weight vector connecting the $i$-th hidden node and the output nodes.

Figure 3: Feedforward network architecture with RBF hidden nodes.
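The RBF hidden layer of Eq. (3) can be sketched in the same style. This is a minimal sketch assuming the common Gaussian form $\exp(-b_i \|x - a_i\|^2)$, which is one standard instance of Eq. (3), not the only admissible choice of $g$.

```python
def rbf_hidden_layer(X, centers, impact):
    """RBF hidden layer outputs, shape (N, L).
    X: (N, n) inputs; centers: (L, n) centers a_i; impact: (L,) impact factors b_i."""
    sq_dists = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)  # ||x - a_i||^2
    return np.exp(-impact * sq_dists)  # assumed Gaussian instance of Eq. (3)
```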
Function Approximation of Neural Networks

Mathematical Model
Any continuous target function $f(x)$ can be approximated by SLFNs: given any small positive value $\epsilon$, for an SLFN with a sufficiently large number of hidden nodes $L$ we have

    $\| f_L(x) - f(x) \| < \epsilon$    (5)

Learning Issue
In real applications the target function $f$ is usually unknown; one wishes the unknown $f$ to be approximated appropriately by an SLFN $f_L$.

Figure 4: Feedforward network architecture.
Function Approximation of Neural Networks

Learning Model
For $N$ arbitrary distinct samples $(x_i, t_i) \in R^n \times R^m$, an SLFN with $L$ hidden nodes and activation function $g(x)$ is mathematically modeled as

    $f_L(x_j) = o_j, \quad j = 1, \dots, N$    (6)

Cost function:

    $E = \sum_{j=1}^{N} \| o_j - t_j \|^2$

The target is to minimize the cost function $E$ by adjusting the network parameters $\beta_i$, $a_i$, $b_i$.

Figure 5: Feedforward network architecture.
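The cost above is straightforward to evaluate once the forward pass is in place; the sketch below reuses slfn_output from the earlier sketch (a convention of this note, not code from the talk).

```python
def slfn_cost(X, T, A, b, beta):
    """Squared-error cost E = sum_j ||o_j - t_j||^2 over all N samples."""
    O = slfn_output(X, A, b, beta)      # network outputs o_j
    return float(np.sum((O - T) ** 2))
```

Conventional training adjusts all of A, b, and beta to minimize this E; ELM, introduced below, fixes A and b at random and solves for beta only.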
Learning Algorithms of Neural Networks

Learning Methods
Many learning methods, mainly based on gradient-descent/iterative approaches, have been developed over the past two decades. Back-Propagation (BP) and its variants are the most popular.

Figure 6: Feedforward network architecture.
Advantages and Disadvantages

Popularity
Widely used in various applications: regression, classification, etc.

Limitations
    Different learning algorithms are usually needed for different SLFN architectures.
    Some parameters have to be tuned manually.
    Overfitting.
    Local minima.
    Time-consuming.

(Animation: [Link]/home/egbhuang/NUS-Workshop/[Link])
Generalization Capability of SLFNs

Figure 7: SLFN1 has poor generalization while SLFN2 has good generalization.
Local Minima Issue of Conventional Learning Methods

Figure 8: Conventional SLFN learning methods usually get stuck in local minima.
Extreme Learning Machine (ELM)

New Learning Theory
Given any bounded nonconstant piecewise continuous function $g$ (integrable for RBF nodes), for any continuous target function $f$ and any randomly generated sequence $\{(a_L, b_L)\}$,

    $\lim_{L \to \infty} \| f(x) - f_L(x) \| = 0$

holds with probability one if each $\beta_i$ is chosen to minimize $\| f(x) - f_i(x) \|$, $i = 1, \dots, L$.

Figure 9: Feedforward network architecture.

G.-B. Huang, et al., "Universal Approximation Using Incremental Networks with Random Hidden Nodes," IEEE Transactions on Neural Networks, vol. 17, no. 4, pp. 879-892, 2006.
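The theorem suggests an incremental construction: add randomly generated hidden nodes one at a time and, at each step, pick $\beta_i$ by least squares against the current residual error. The sketch below illustrates this for a scalar output; the sigmoid node type, the uniform random ranges, and the name ielm_fit are assumptions of this note, not the cited paper's exact algorithm.

```python
rng = np.random.default_rng(0)

def ielm_fit(X, y, L):
    """Incrementally add L random sigmoid nodes, fitting each beta_i to the residual.
    X: (N, n) inputs; y: (N,) scalar targets. Returns a list of (a_i, b_i, beta_i)."""
    residual, nodes = y.astype(float).copy(), []
    for _ in range(L):
        a = rng.uniform(-1.0, 1.0, size=X.shape[1])   # random hidden node parameters
        b = rng.uniform(-1.0, 1.0)
        g = 1.0 / (1.0 + np.exp(-(X @ a + b)))        # node output on all samples
        beta = (residual @ g) / (g @ g)               # least-squares beta_i on the residual
        residual -= beta * g                          # residual shrinks as L grows
        nodes.append((a, b, beta))
    return nodes
```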
Unified Learning Platform

Mathematical Model
For $N$ arbitrary distinct samples $(x_i, t_i) \in R^n \times R^m$, standard SLFNs with $L$ hidden nodes and activation function $g(x)$ are mathematically modeled as

    $\sum_{i=1}^{L} \beta_i G(a_i, b_i, x_j) = t_j, \quad j = 1, \dots, N$    (7)

$a_i$: the input weight vector connecting the $i$-th hidden node and the input nodes, or the center of the $i$-th hidden node.
$b_i$: the threshold or impact factor of the $i$-th hidden node.
$\beta_i$: the weight vector connecting the $i$-th hidden node and the output nodes.

Figure 10: Feedforward network architecture.
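Equation (7) deliberately uses one interface $G(a_i, b_i, x)$ for both node types, which is what makes the platform unified. A minimal sketch of that interface, with the sigmoid and Gaussian instances as assumed concrete choices:

```python
def G(a, b, x, node_type="additive"):
    """Unified hidden node output of Eq. (7) for one node and one input x."""
    if node_type == "additive":
        return 1.0 / (1.0 + np.exp(-(np.dot(a, x) + b)))  # Eq. (1): g(a . x + b)
    return np.exp(-b * np.sum((x - a) ** 2))              # Eq. (3), assumed Gaussian instance
```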
Extreme Learning Machine (ELM)

Mathematical Model
$\sum_{i=1}^{L} \beta_i G(a_i, b_i, x_j) = t_j$, $j = 1, \dots, N$, is equivalent to $H \beta = T$, where

    $H(a_1, \dots, a_L, b_1, \dots, b_L, x_1, \dots, x_N) =
    \begin{bmatrix} G(a_1, b_1, x_1) & \cdots & G(a_L, b_L, x_1) \\ \vdots & & \vdots \\ G(a_1, b_1, x_N) & \cdots & G(a_L, b_L, x_N) \end{bmatrix}_{N \times L}$    (8)

    $\beta = \begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_L^T \end{bmatrix}_{L \times m}
    \quad \text{and} \quad
    T = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m}$    (9)

$H$ is called the hidden layer output matrix of the neural network; the $i$-th column of $H$ is the output of the $i$-th hidden node with respect to inputs $x_1, x_2, \dots, x_N$.
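To make the stacking in Eq. (8) concrete, the sketch below builds H entry by entry from the unified node function G above; the loop is deliberately naive for readability.

```python
def hidden_layer_matrix(X, A, b):
    """H of Eq. (8): entry (j, i) is G(a_i, b_i, x_j); shape (N, L)."""
    N, L = X.shape[0], A.shape[0]
    H = np.empty((N, L))
    for i in range(L):          # column i = output of hidden node i on all x_j
        for j in range(N):
            H[j, i] = G(A[i], b[i], X[j])
    return H
```

For additive sigmoid nodes this agrees with the vectorized expression 1 / (1 + exp(-(X @ A.T + b))) used in the earlier sketches.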
Extreme Learning Machine (ELM)

Three-Step Learning Model
Given a training set $\aleph = \{(x_i, t_i) \mid x_i \in R^n, t_i \in R^m, i = 1, \dots, N\}$, an activation function $g$, and the number of hidden nodes $L$:

1. Randomly assign the input weight vectors or centers $a_i$ and the hidden node biases or impact factors $b_i$, $i = 1, \dots, L$.
2. Calculate the hidden layer output matrix $H$.
3. Calculate the output weight $\beta$: $\beta = H^{\dagger} T$, where $H^{\dagger}$ is the Moore-Penrose generalized inverse of the hidden layer output matrix $H$.

Source codes of ELM: [Link]
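A minimal end-to-end sketch of the three steps, assuming additive sigmoid nodes and NumPy (np.linalg.pinv computes the Moore-Penrose inverse); this is an illustrative reconstruction, not the source code distributed on the portal.

```python
rng = np.random.default_rng(0)

def elm_train(X, T, L):
    """Train an ELM with L sigmoid hidden nodes. X: (N, n); T: (N, m)."""
    n = X.shape[1]
    A = rng.uniform(-1.0, 1.0, size=(L, n))    # step 1: random input weights a_i
    b = rng.uniform(-1.0, 1.0, size=L)         # step 1: random biases b_i
    H = 1.0 / (1.0 + np.exp(-(X @ A.T + b)))   # step 2: hidden layer output matrix H
    beta = np.linalg.pinv(H) @ T               # step 3: beta = H^dagger T
    return A, b, beta

def elm_predict(X, A, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ A.T + b)))
    return H @ beta
```

There is no iteration and nothing to tune except L, which is what makes the procedure a single pass.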
ELM Learning Algorithm

Salient Features
    Simple math is enough: ELM is a simple, tuning-free, three-step algorithm.
    The learning speed of ELM is extremely fast.
    Unlike traditional gradient-based learning algorithms, which work only for differentiable activation functions, ELM does not require differentiability.
    Unlike traditional gradient-based learning algorithms, which face issues such as local minima, improper learning rates, and overfitting, ELM tends to reach the solution directly without such issues.
    The ELM learning algorithm is much simpler than most learning algorithms for neural networks and support vector machines.
Performance Evaluations

    Artificial Case
    Real-World Regression Problems
    Real-World Very Large Complex Applications
    Real Medical Diagnosis Application: Diabetes
    Protein Sequence Classification
Artificial Case: Approximation of the SinC Function

    $f(x) = \begin{cases} \sin(x)/x, & x \neq 0 \\ 1, & x = 0 \end{cases}$    (10)

Table 1: Performance comparison for learning the function SinC (5000 noisy training data and 5000 noise-free testing data).

Algorithms | Training Time (s) | Training RMS | Training Dev | Testing RMS | Testing Dev | # SVs/nodes
ELM        | 0.125             | 0.1148       | 0.0037       | 0.0097      | 0.0028      | 20
BP         | 21.26             | 0.1196       | 0.0042       | 0.0159      | 0.0041      | 20
SVR        | 1273.4            | 0.1149       | 0.0007       | 0.0130      | 0.0012      | 2499.9

G.-B. Huang, et al., "Extreme Learning Machine: Theory and Applications," Neurocomputing, vol. 70, pp. 489-501, 2006.
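The setup in Table 1 is easy to reproduce in sketch form with elm_train from above; the input range and the uniform noise level below are assumptions, since the slide states only the sample counts.

```python
# SinC targets: np.sinc(z) = sin(pi z)/(pi z), so sin(x)/x = np.sinc(x / np.pi).
x_train = rng.uniform(-10.0, 10.0, size=(5000, 1))
y_train = np.sinc(x_train / np.pi) + rng.uniform(-0.2, 0.2, size=(5000, 1))  # noisy training data
x_test = rng.uniform(-10.0, 10.0, size=(5000, 1))
y_test = np.sinc(x_test / np.pi)                                             # noise-free testing data

A, b, beta = elm_train(x_train, y_train, L=20)  # 20 hidden nodes, as in Table 1
rmse = np.sqrt(np.mean((elm_predict(x_test, A, b, beta) - y_test) ** 2))
```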
Artificial Case: Approximation of the SinC Function

Figure 11: Output of ELM (expected vs. actual).
Figure 12: Output of BP (expected vs. actual).
Artificial Case: Approximation of the SinC Function

Figure 13: Output of ELM (expected vs. actual).
Figure 14: Output of SVM (expected vs. actual).
Real-World Regression Problems

Table 2: Comparison of training and testing RMSE of BP and ELM.

Datasets           | BP training | BP testing | ELM training | ELM testing
Abalone            | 0.0785      | 0.0874     | 0.0803       | 0.0824
Delta Ailerons     | 0.0409      | 0.0481     | 0.0423       | 0.0431
Delta Elevators    | 0.0544      | 0.0592     | 0.0550       | 0.0568
Computer Activity  | 0.0273      | 0.0409     | 0.0316       | 0.0382
Census (House8L)   | 0.0596      | 0.0685     | 0.0624       | 0.0660
Auto Price         | 0.0443      | 0.1157     | 0.0754       | 0.0994
Triazines          | 0.1438      | 0.2197     | 0.1897       | 0.2002
Machine CPU        | 0.0352      | 0.0826     | 0.0332       | 0.0539
Servo              | 0.0794      | 0.1276     | 0.0707       | 0.1196
Breast Cancer      | 0.2788      | 0.3155     | 0.2470       | 0.2679
Bank domains       | 0.0342      | 0.0379     | 0.0406       | 0.0366
California Housing | 0.1046      | 0.1285     | 0.1217       | 0.1267
Stocks domain      | 0.0179      | 0.0358     | 0.0251       | 0.0348
Real-World Regression Problems

Table 3: Comparison of training and testing RMSE of SVR and ELM.

Datasets           | SVR training | SVR testing | ELM training | ELM testing
Abalone            | 0.0759       | 0.0784      | 0.0803       | 0.0824
Delta Ailerons     | 0.0418       | 0.0429      | 0.0423       | 0.0431
Delta Elevators    | 0.0534       | 0.0540      | 0.0545       | 0.0568
Computer Activity  | 0.0464       | 0.0470      | 0.0316       | 0.0382
Census (House8L)   | 0.0718       | 0.0746      | 0.0624       | 0.0660
Auto Price         | 0.0652       | 0.0937      | 0.0754       | 0.0994
Triazines          | 0.1432       | 0.1829      | 0.1897       | 0.2002
Machine CPU        | 0.0574       | 0.0811      | 0.0332       | 0.0539
Servo              | 0.0840       | 0.1177      | 0.0707       | 0.1196
Breast Cancer      | 0.2278       | 0.2643      | 0.2470       | 0.2679
Bank domains       | 0.0454       | 0.0467      | 0.0406       | 0.0366
California Housing | 0.1089       | 0.1180      | 0.1217       | 0.1267
Stocks domain      | 0.0503       | 0.0518      | 0.0251       | 0.0348
Real-World Regression Problems

Table 4: Comparison of network complexity of BP, SVR and ELM.

Datasets           | BP # nodes | SVR (C, γ)  | SVR # SVs | ELM # nodes
Abalone            | 10         | (2^4, 2^6)  | 309.84    | 25
Delta Ailerons     | 10         | (2^3, 2^3)  | 82.44     | 45
Delta Elevators    | 5          | (2^0, 2^2)  | 260.38    | 125
Computer Activity  | 45         | (2^5, 2^5)  | 64.2      | 125
Census (House8L)   | 10         | (2^1, 2^1)  | 810.24    | 160
Auto Price         | 5          | (2^8, 2^5)  | 21.25     | 15
Triazines          | 5          | (2^1, 2^9)  | 48.42     | 10
Machine CPU        | 10         | (2^6, 2^4)  | 7.8       | 10
Servo              | 10         | (2^2, 2^2)  | 22.375    | 30
Breast Cancer      | 5          | (2^1, 2^4)  | 74.3      | 10
Bank domains       | 20         | (2^10, 2^2) | 129.22    | 190
California Housing | 10         | (2^3, 2^1)  | 2189.2    | 80
Stocks domain      | 20         | (2^3, 2^9)  | 19.94     | 110
Real-World Regression Problems

Table 5: Comparison of training time (seconds) of BP, SVR and ELM.

Datasets           | BP (a) | SVR (b) | ELM (a)
Abalone            | 1.7562 | 1.6123  | 0.0125
Delta Ailerons     | 2.7525 | 0.6726  | 0.0591
Delta Elevators    | 1.1938 | 1.121   | 0.2812
Computer Activity  | 67.44  | 1.0149  | 0.2951
Census (House8L)   | 8.0647 | 11.251  | 1.0795
Auto Price         | 0.2456 | 0.0042  | 0.0016
Triazines          | 0.5484 | 0.0086  | < 10^-4
Machine CPU        | 0.2354 | 0.0018  | 0.0015
Servo              | 0.2447 | 0.0045  | < 10^-4
Breast Cancer      | 0.3856 | 0.0064  | < 10^-4
Bank domains       | 7.506  | 1.6084  | 0.6434
California Housing | 6.532  | 74.184  | 1.1177
Stocks domain      | 1.0487 | 0.0690  | 0.0172

(a) Run in the MATLAB environment. (b) Run in the C executable environment.
Real-World Very Large Complex Applications

Table 6: Performance comparison of the ELM, BP (SLFN) and SVM learning algorithms on the Forest Type Prediction application (100,000 training data and 480,000+ testing data; each sample has 53 attributes).

Algorithms | Training Time (min) | Testing Time (min) | Training Rate (%) | Training Dev | Testing Rate (%) | Testing Dev | # SVs/nodes
ELM        | 1.6148              | 0.7195             | 92.35             | 0.026        | 90.21            | 0.024       | 200
SLFN (BP)  | 12                  | N/A                | 82.44             | N/A          | 81.85            | N/A         | 100
SVM        | 693.6000            | 347.7833           | 91.70             | N/A          | 89.90            | N/A         | 31,806
Real Medical Diagnosis Application: Diabetes

Table 7: Performance comparison: ELM, BP and SVM.

Algorithms | Training Time (s) | Testing Time (s) | Training Rate (%) | Training Dev | Testing Rate (%) | Testing Dev | # SVs/nodes
ELM        | 0.0118            | 0.0031           | 78.68             | 1.18         | 77.57            | 2.85        | 20
BP         | 3.0116            | 0.0035           | 86.63             | 1.7          | 74.73            | 3.2         | 20
SVM        | 0.1860            | 0.0673           | 78.76             | 0.91         | 77.31            | 2.35        | 317.16

Table 8: Performance comparison: ELM and other popular methods.

Algorithms          | Testing Rate (%)
ELM                 | 77.57
SVM                 | 76.50
SAOCIF              | 77.32
Cascade-Correlation | 76.58
AdaBoost            | 75.60
C4.5                | 71.60
RBF                 | 76.30
Heterogeneous RBF   | 76.30
Protein Sequence Classification

Table 9: Performance comparison of different classifiers: Protein Sequence Classification.

Algorithms       | Training Time (s) | Speedup  | Testing Rate (%) | Testing Dev | # SVs/nodes
ELM (Sigmoid)    | 0.0998            | 17417    | 96.738           | 0.8628      | 160
ELM (RBF Kernel) | 9.4351            | 184.1528 | 96.498           | 0.6768      | 485
SVM (RBF Kernel) | 0.3434            | 5060     | 98.056           | 0.6819      | 306.42
BP               | 1737.5            | 1        | 96.037           | 1.2132      | 35

D. Wang, et al., "Protein Sequence Classification Using Extreme Learning Machine," Proceedings of the International Joint Conference on Neural Networks (IJCNN2005), Montreal, Canada, 31 July - 4 August 2005.
Protein Sequence Classification

Table 10: Performance of the BP classifier: Protein Sequence Classification.

# Nodes | Training Time (s) | Training Rate (%) | Training Dev | Testing Rate (%) | Testing Dev
5       | 66.604            | 88.35             | 1.9389       | 85.685           | 2.8345
10      | 171.02            | 98.729            | 1.276        | 94.524           | 2.2774
15      | 374.12            | 99.45             | 0.8820       | 94.757           | 1.7648
20      | 624.89            | 99.6              | 0.5356       | 95.558           | 1.4828
25      | 843.68            | 99.511            | 1.0176       | 95.551           | 1.6258
30      | 1228.4            | 99.576            | 1.2518       | 95.378           | 1.9001
35      | 1737.5            | 99.739            | 0.5353       | 96.037           | 1.2132
Sensitivity of the Number of Hidden Nodes

Figure 15: Average testing accuracy (%) of BP vs. number of hidden nodes (5-35).
Figure 16: Average testing accuracy (%) of ELM (Sigmoid and RBF Kernel) vs. number of hidden nodes (0-500).
Summary

    ELM needs much less training time than the popular BP and SVM/SVR.
    The prediction accuracy of ELM is usually slightly better than that of BP and close to that of SVM/SVR in many applications.
    Compared with BP and SVR, ELM can be implemented easily since no parameter has to be tuned except the insensitive parameter L.
    Many nonlinear activation functions can be used in ELM.
    ELM needs more hidden nodes than BP but many fewer nodes than SVM/SVR, which implies that ELM and BP respond much faster to unknown data than SVM/SVR.
For Further Reading

G.-B. Huang, et al., "Universal Approximation Using Incremental Networks with Random Hidden Computational Nodes," IEEE Transactions on Neural Networks, vol. 17, no. 4, pp. 879-892, 2006.
G.-B. Huang, et al., "Extreme Learning Machine: Theory and Applications," Neurocomputing, vol. 70, pp. 489-501, 2006.
N.-Y. Liang, et al., "A Fast and Accurate On-line Sequential Learning Algorithm for Feedforward Networks," IEEE Transactions on Neural Networks, vol. 17, no. 6, pp. 1411-1423, 2006.
Q.-Y. Zhu, et al., "Evolutionary Extreme Learning Machine," Pattern Recognition, vol. 38, no. 10, pp. 1759-1763, 2005.
R. Zhang, et al., "Multi-Category Classification Using Extreme Learning Machine for Microarray Gene Expression Cancer Diagnosis," IEEE/ACM Transactions on Computational Biology and Bioinformatics (in press), 2006.
G.-B. Huang, et al., "Can Threshold Networks Be Trained Directly?" IEEE Transactions on Circuits and Systems II, vol. 53, no. 3, pp. 187-191, 2006.

Source codes and references of ELM: [Link]
ELM Web Portal: [Link]/home/egbhuang