Ensemble Learning
Outcome: Compare and Contrast Ensemble approaches for
combining multiple Machine Learning Techniques
Ensemble Learning
Imagine the tale of the blind men and the elephant. Each blind man had his own description
of the elephant, and although each description was true on its own, they would have done better
to come together and discuss their understanding before drawing a conclusion. This story
perfectly describes the ensemble learning method.
Ensemble Learning
• An ensemble is a machine learning model that combines the predictions from two or
more models.
• The models that contribute to the ensemble are referred to as ensemble members.
• The members may or may not be of the same model type, and may or may not be trained on the
same training data.
• Ensemble learning is the process by which multiple models, such as classifiers or
experts, are strategically generated and combined to solve a particular
computational intelligence problem.
• Ensemble learning is primarily used to improve the performance of a model (in classification,
prediction, function approximation, etc.) or to reduce the likelihood of an unfortunate
selection of a poor one, as sketched below.
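The sketch below illustrates this idea with scikit-learn: a few ensemble members of different
model types are trained on the same data and their predictions are combined by majority vote.
The toy dataset and the choice of members are illustrative assumptions, not part of the slides.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Ensemble members: they need not be of the same model type.
members = [LogisticRegression(max_iter=1000),
           DecisionTreeClassifier(max_depth=5),
           KNeighborsClassifier()]

predictions = []
for model in members:
    model.fit(X_train, y_train)
    predictions.append(model.predict(X_test))

# Combine the members' predictions by majority (hard) voting.
votes = np.stack(predictions)                        # shape: (n_members, n_test)
majority = np.round(votes.mean(axis=0)).astype(int)  # works for binary 0/1 labels
print("ensemble accuracy:", (majority == y_test).mean())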
Bagging (Parallel Ensemble Learning)
• Bagging is a machine learning ensemble meta-algorithm intended to improve the stability and
accuracy of machine learning algorithms used for classification and regression. It also helps
reduce over-fitting.
• Bagging belongs to the parallel ensemble methods, where the base learners are generated in parallel.
• Algorithms: Random Forest, Bagged Decision Trees, Extra Trees (a minimal sketch follows below)
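A minimal sketch of the three algorithms listed above, assuming scikit-learn; the toy dataset,
hyperparameters, and evaluation setup are illustrative choices, not prescribed by the slides.

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier, ExtraTreesClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

models = {
    # BaggingClassifier uses a decision tree as its default base estimator.
    "Bagged Decision Trees": BaggingClassifier(n_estimators=100, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Extra Trees": ExtraTreesClassifier(n_estimators=100, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")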
Bagging (Parallel Ensemble Learning)
• A Bagging classifier is an ensemble meta-estimator that fits base classifiers, each on a
random subset of the original dataset, and then aggregates their individual predictions
(either by voting or by averaging) to form a final prediction.
• Such a meta-estimator can typically be used as a way to reduce the variance (e.g., in a
decision tree), by introducing randomization into its construction procedure and then
making an ensemble out of it.
• Each base classifier is trained in parallel on a training set generated by randomly drawing,
with replacement, N examples from the original training dataset, where N is the size of the
original training set (see the sketch after this list).
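The bootstrap step can be sketched directly, assuming NumPy and scikit-learn: each base
classifier is fit on N examples drawn with replacement from the original training set, and the
individual predictions are aggregated by voting. The base learner and dataset are illustrative
assumptions.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

rng = np.random.default_rng(1)
N = len(X_train)                       # size of the original training set
base_classifiers = []

for _ in range(25):
    idx = rng.integers(0, N, size=N)   # draw N indices with replacement (bootstrap sample)
    clf = DecisionTreeClassifier(random_state=0).fit(X_train[idx], y_train[idx])
    base_classifiers.append(clf)

# Aggregate the individual predictions by majority vote (labels are 0/1 here).
all_preds = np.stack([clf.predict(X_test) for clf in base_classifiers])
final = (all_preds.mean(axis=0) >= 0.5).astype(int)
print("bagged accuracy:", (final == y_test).mean())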
Bagging (Parallel Ensemble Learning)
• The training sets for the base classifiers are generated independently of each other.
• Many of the original examples may be repeated in the resulting training set, while others
may be left out.
• Bagging reduces overfitting (variance) by averaging or voting; this leads to a slight
increase in bias, which is compensated for by the larger reduction in variance.
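A small illustrative check of the "repeated or left out" point above: when N examples are drawn
with replacement, roughly 63% of the original examples appear in a given bootstrap sample and
the rest are left out. The sample size below is an arbitrary illustrative choice.

import numpy as np

rng = np.random.default_rng(0)
N = 10_000
bootstrap_idx = rng.integers(0, N, size=N)   # one bootstrap sample of indices
unique_fraction = len(np.unique(bootstrap_idx)) / N
print(f"unique examples in bootstrap: {unique_fraction:.3f}")  # approximately 0.632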
Boosting (Sequential Ensemble learning)
• Boosting is a machine learning ensemble meta-algorithm intended principally to reduce bias,
and also variance, in supervised learning; it refers to a family of machine learning
algorithms that convert weak learners to strong ones.
• Boosting belongs to the sequential ensemble methods, where the base learners are generated sequentially.
• Examples: AdaBoost, Stochastic Gradient Boosting (see the sketch below)
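A minimal sketch of the two examples named above, assuming scikit-learn; the hyperparameters and
toy dataset are illustrative. Setting subsample below 1.0 is what makes gradient boosting
"stochastic": each tree is fit on a random fraction of the training data.

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

boosters = {
    "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=0),
    "Stochastic Gradient Boosting": GradientBoostingClassifier(
        n_estimators=100, subsample=0.8, random_state=0),
}

for name, model in boosters.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")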
Boosting (Sequential Ensemble learning)
• Boosting is an ensemble modelling technique that attempts to build a strong classifier from
a number of weak classifiers.
• It is done by building models using weak learners in series.
• First, a model is built from the training data.
• Then a second model is built, which tries to correct the errors made by the first
model.
• This procedure is continued, and models are added until either the complete training
data set is predicted correctly or the maximum number of models is reached. A minimal sketch
of this sequential error-correction loop follows.
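The sequential idea can be sketched for regression, where each new weak model (a shallow tree)
is fit to the errors (residuals) left by the models before it. This is a simplified
gradient-boosting-style loop, not any particular library's implementation; the learning rate,
tree depth, and data are illustrative assumptions.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

learning_rate = 0.1
models = []
prediction = np.zeros_like(y)           # start from a trivial first model (all zeros)

for _ in range(100):
    residuals = y - prediction          # errors the ensemble still makes
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    models.append(tree)
    prediction += learning_rate * tree.predict(X)  # correct the previous errors

print("training MSE:", np.mean((y - prediction) ** 2))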
Bagging vs Boosting
Different Ways to Combine Classifiers
• Bagging (Bootstrap Aggregating)
• Boosting
• Stacking
• Voting
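Voting and stacking, the two combiners not yet illustrated, can be sketched with scikit-learn
as follows; the member models and toy dataset are illustrative assumptions. Voting combines the
members' predictions directly, while stacking trains a meta-model to learn how to combine them.

from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)

members = [("lr", LogisticRegression(max_iter=1000)),
           ("dt", DecisionTreeClassifier(max_depth=5)),
           ("knn", KNeighborsClassifier())]

# Voting: combine the members' predictions by majority vote.
voting = VotingClassifier(estimators=members, voting="hard")

# Stacking: a meta-model (here logistic regression) learns how to combine the members.
stacking = StackingClassifier(estimators=members,
                              final_estimator=LogisticRegression(max_iter=1000))

for name, model in [("Voting", voting), ("Stacking", stacking)]:
    print(f"{name}: mean accuracy = {cross_val_score(model, X, y, cv=5).mean():.3f}")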