Ensembles

Understanding Ensembles

● Ensemble learning is a machine learning technique that combines predictions from multiple models (learners) to improve performance.

● It follows the idea that many weak models, when combined, can perform better than a single strong model.

● Analogy: like choosing a team of diverse experts for a quiz show. Individually, they may not know everything, but together they cover a wide range of knowledge.

🔸 Types of Ensemble Methods


1. Bagging (Bootstrap Aggregating)

● Stands for Bootstrap AGGregatING.

● Uses homogeneous weak learners (models of the same type, such as decision trees).

● Each model is trained on a random subset of the training data, drawn with replacement.

● Training is done in parallel.

● Final output is based on:

○ Majority vote for classification.

○ Average for regression.

● Helps reduce variance and prevents overfitting (see the sketch below).
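
A minimal sketch of bagging with scikit-learn; the synthetic dataset, `n_estimators=50`, and the other parameter values are illustrative assumptions, not prescriptions.

```python
# Bagging sketch: 50 decision trees, each trained on a bootstrap sample,
# combined by majority vote at prediction time.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

bag = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # named base_estimator in scikit-learn < 1.2
    n_estimators=50,
    bootstrap=True,   # sample the training data with replacement
    n_jobs=-1,        # the trees are independent, so training runs in parallel
    random_state=0,
)
bag.fit(X_train, y_train)
print("bagging accuracy:", bag.score(X_test, y_test))
```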

2. Boosting

● Also uses homogeneous weak learners.

● Models are trained sequentially: each new model tries to fix the errors made by the previous one (illustrated below).

● Gives more weight to misclassified instances at each step.

● Combines the models using a deterministic strategy.

● Helps reduce bias and increases accuracy, but can overfit if not controlled.
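
One concrete way to realize "each new model fixes the previous one's errors" is residual fitting, the idea behind gradient boosting; AdaBoost, covered later in these notes, reweights samples instead. The from-scratch sketch below is illustrative only, assuming scikit-learn and a synthetic regression task.

```python
# Sequential boosting sketch: each tree is fit to the residuals of the
# ensemble built so far, then added with a small learning rate.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

learning_rate, prediction, trees = 0.1, np.zeros_like(y), []
for _ in range(100):
    residual = y - prediction                  # errors of the current ensemble
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

print("training MSE after boosting:", np.mean((y - prediction) ** 2))
```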

3. Stacking (Stacked Generalization)

● Combines heterogeneous weak learners (different types such as SVM, KNN, decision trees, etc.).

● The base models are trained in parallel, and their outputs are passed to a meta-model (such as logistic regression).

● The meta-model learns how to best combine the base model outputs to make the final prediction.

● Offers flexibility and higher performance, especially when the base models have different strengths; see the sketch below.
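
A minimal stacking sketch with scikit-learn; the choice of base models and the synthetic dataset are illustrative assumptions.

```python
# Stacking sketch: heterogeneous base learners feed a logistic-regression
# meta-model, which learns how to combine their outputs.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("svm", SVC(probability=True)),
        ("knn", KNeighborsClassifier()),
        ("tree", DecisionTreeClassifier(max_depth=3)),
    ],
    final_estimator=LogisticRegression(),
    cv=5,  # base-model inputs to the meta-model come from cross-validated predictions
)
stack.fit(X, y)
print("stacking training accuracy:", stack.score(X, y))
```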

🌲 Random Forest – Key Concepts


● Random Forest is an ensemble of decision trees built using bagging.

● Rather than averaging predictions from similar trees, Random Forest:

1. Trains each tree on a random bootstrap sample of the data.

2. At each split, considers only a random subset of the features, not all of them.

● This randomness gives two main benefits:

1. Reduces overfitting by decorrelating the trees.

2. Increases generalization performance on unseen data.

● Each tree contributes to the final prediction through voting (classification) or averaging (regression); see the sketch below.
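
A minimal Random Forest sketch with scikit-learn; the dataset and parameter values are illustrative assumptions.

```python
# Random Forest sketch: bagged trees plus random feature subsets per split.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,      # number of trees in the forest
    max_features="sqrt",   # random subset of features considered at each split
    bootstrap=True,        # each tree sees a bootstrap sample of the data
    random_state=0,
)
forest.fit(X_train, y_train)
print("random forest accuracy:", forest.score(X_test, y_test))
```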

📦 Bootstrap Sampling Method


● A statistical technique for estimating metrics (such as the mean) from a small dataset.

● Steps:

1. Draw many sub-samples with replacement from the dataset (e.g., 1,000 of them).

2. Calculate the mean (or other metric) for each sub-sample.

3. Average all these values to get a more accurate estimate.

● It helps reduce estimation error and is the sampling scheme behind bagging and Random Forest; a NumPy sketch follows.
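
A minimal bootstrap sketch in NumPy; the dataset is a small synthetic stand-in.

```python
# Bootstrap sketch: estimate the mean of a small sample by averaging the
# means of many resamples drawn with replacement.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=30)   # small illustrative dataset

n_resamples = 1000
boot_means = [
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(n_resamples)
]

print("plain sample mean:   ", data.mean())
print("bootstrap mean:      ", np.mean(boot_means))
print("bootstrap std. error:", np.std(boot_means))
```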

🔁 Bagging – In Detail
● In bagging, each model is trained on a different bootstrapped version of the data.

● All models are trained with the same learning algorithm (such as decision trees).

● The final prediction is made by aggregating the individual predictions.

● Bagging works best with unstable learners, i.e. models whose predictions change a lot when the input data changes slightly.

○ Example: decision trees.

● Main idea: the diversity created by random sampling increases model robustness and stability; the from-scratch sketch below makes the loop explicit.
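
A from-scratch bagging sketch, illustrative rather than the library implementation; it assumes scikit-learn trees, a synthetic binary dataset, and 25 models.

```python
# Manual bagging: train each tree on its own bootstrap sample, then
# combine the 25 trees by majority vote.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, random_state=0)

trees = []
for _ in range(25):
    idx = rng.integers(0, len(X), size=len(X))   # bootstrap: sample with replacement
    trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

votes = np.stack([t.predict(X) for t in trees])
majority = (votes.mean(axis=0) >= 0.5).astype(int)   # majority vote per sample
print("bagged training accuracy:", (majority == y).mean())
```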

🧮 Gini Impurity (Used in Decision Trees)


● Gini impurity is a measure of how impure a node is in terms of its class distribution.

● Formula:
$I_G(n) = 1 - \sum_{i=1}^{J} (p_i)^2$
where $p_i$ is the proportion of samples at the node belonging to class $i$.

● Example:

○ A Gini impurity of 0.444 means there is a 44.4% chance of misclassifying a random sample drawn from that node (e.g., a node with class proportions 2/3 and 1/3: $1 - (2/3)^2 - (1/3)^2 = 4/9 \approx 0.444$).

● Lower Gini = better split.

● The decision tree chooses the feature and threshold that minimize the Gini impurity at each node; a small computation is sketched below.
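
A tiny sketch of the Gini impurity formula above, assuming NumPy.

```python
# Gini impurity of a node, given the class labels of its samples.
import numpy as np

def gini(labels):
    """I_G = 1 - sum_i p_i^2, where p_i is each class's proportion."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

print(gini([0, 0, 1]))       # proportions 2/3 and 1/3 -> 0.444...
print(gini([0, 0, 0, 0]))    # pure node -> 0.0
print(gini([0, 1, 0, 1]))    # 50/50 split -> 0.5
```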

🔄 Random Forest Pseudocode


1. Randomly choose k features from the total m features (where k ≪ m).

2. Among the k features, find the best feature and threshold to split the node.

3. Divide the data into daughter nodes using the best split.

4. Repeat steps 1–3 until the maximum depth or another stopping criterion is met.

5. Repeat this process for n trees to build the complete forest. (A compact code sketch follows.)
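
A compact sketch of the pseudocode above, assuming scikit-learn: steps 1–4 (best split among k random features, down to a depth limit) are what a single DecisionTreeClassifier with `max_features=k` performs, and step 5 is the outer loop. The parameter values are illustrative.

```python
# Random Forest from the pseudocode: n bootstrap-trained trees, each
# restricted to k random features per split.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=16, random_state=0)

n_trees, k = 50, 4                    # n trees; k of the m=16 features per split
forest = []
for _ in range(n_trees):
    idx = rng.integers(0, len(X), size=len(X))       # bootstrap sample
    tree = DecisionTreeClassifier(max_features=k,    # steps 1-2: best of k random features
                                  max_depth=10)      # step 4: stopping criterion
    forest.append(tree.fit(X[idx], y[idx]))

votes = np.stack([t.predict(X) for t in forest])     # step 5: aggregate the forest
print("forest training accuracy:", ((votes.mean(axis=0) >= 0.5) == y).mean())
```
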
🔸 Boosting and AdaBoost

● Boosting combines multiple weak classifiers to form a strong classifier.

● It works by:

○ Training a model on the data.

○ Creating the next model to correct the errors made by the previous one.

○ Repeating this until the data is predicted well or a set limit is reached.

● AdaBoost was the first successful boosting algorithm:

○ Originally designed for binary classification, later extended to multi-class problems.

○ A great starting point for understanding boosting.

● Best used with weak learners, i.e. models that perform only slightly better than random guessing; see the sketch below.

● Can enhance the performance of many machine learning algorithms.
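
A minimal AdaBoost sketch with scikit-learn; the depth-1 base learner and parameter values are illustrative assumptions.

```python
# AdaBoost sketch: 100 decision stumps trained sequentially on reweighted data.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Depth-1 trees ("decision stumps") are the classic weak learner for AdaBoost.
ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # base_estimator in scikit-learn < 1.2
    n_estimators=100,
    random_state=0,
)
ada.fit(X_train, y_train)
print("AdaBoost accuracy:", ada.score(X_test, y_test))
```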

🔸 Advantages of Random Forest


● Works for both classification and regression tasks.

● Handles missing values well.

● Avoids overfitting in most classification problems.

● Can be reused for different tasks without changing the core algorithm.

● Useful for feature engineering:

○ Helps identify the most important features in the dataset (see the sketch below).
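
A short sketch of Random Forest feature importance, assuming scikit-learn; the synthetic features stand in for real columns.

```python
# Feature importance sketch: impurity-based importances measure how much
# each feature reduces Gini impurity, averaged over all trees.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=8,
                           n_informative=3, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Print features from most to least important.
for i, imp in sorted(enumerate(forest.feature_importances_),
                     key=lambda t: -t[1]):
    print(f"feature_{i}: {imp:.3f}")
```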

🔸 Boosting Algorithm (AdaBoost Process)


● Hard-to-classify samples get higher weights in each iteration.

● The algorithm therefore focuses more and more on the misclassified samples.

● In each round:

○ Initially, all samples are given equal weight 1/N.

○ A stage weight = ln((1 - error) / error) is calculated for the round's classifier.

● The final result is a weighted ensemble of classifiers; a worked one-round sketch follows.

○ The combined model performs better than the individual classifiers.

○ It shows strong potential for accurate classification.
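
A from-scratch sketch of one AdaBoost-style round, assuming binary labels in {-1, +1}; the predictions are made-up numbers for illustration.

```python
# One AdaBoost round: weighted error, stage weight ln((1-err)/err),
# and up-weighting of the misclassified samples.
import numpy as np

y_true = np.array([+1, +1, -1, -1, +1])
y_pred = np.array([+1, -1, -1, -1, +1])      # the weak learner misses sample 1

w = np.full(len(y_true), 1 / len(y_true))    # initial equal weights 1/N
error = w[y_pred != y_true].sum()            # weighted error of this round
stage = np.log((1 - error) / error)          # stage weight for this classifier

# Misclassified samples are up-weighted for the next round, then renormalized.
w *= np.exp(stage * (y_pred != y_true))
w /= w.sum()

print("error:", error, "stage weight:", round(stage, 3))
print("new sample weights:", np.round(w, 3))   # the missed sample now dominates
```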
