22cse63 Module 5

This document covers key concepts in clustering, artificial neural networks (ANN), and instance-based learning. It details the K-means clustering process, the structure and types of neural networks including perceptrons and multi-layer perceptrons, and introduces ensemble learning with a focus on the Random Forest classifier. The document also outlines the algorithm for training a Random Forest classifier using bootstrap sampling and majority voting for predictions.


MODULE 5

Clustering, Artificial Neural Network And Instance Based Learning

Clustering: k-means, Hierarchical Clustering. Artificial Neural Networks: Neural Network
representation, Perceptron, Multi-Layer Networks and Backpropagation algorithm.
Instance-Based Learning: k-Nearest Neighbor Learning, Ensemble Learning, Random Forest classifier.
K-Means Clustering
Problem:
Use K-means clustering to divide these points into 2 clusters:

Point   Coordinates
a₁      (1, 1)
a₂      (2, 1)
a₃      (2, 3)
a₄      (3, 2)
a₅      (4, 3)
a₆      (5, 5)

Step 1: Initialization. Randomly select 2 initial cluster centers:
• P0 = (2, 1)
• P1 = (2, 3)
Step 2: Calculate the Euclidean distance from each point to each center and assign each point to the nearest cluster.
Step 3: Recalculate the cluster centers (means) and reassign points.
Step 4: Recalculate the cluster centers (means) and reassign points again.
Step 5: Recalculate the cluster centers once more.
Step 6: Final distance calculation and assignment.

The cluster assignments are the same as in the previous iteration, so the algorithm has converged:

Cluster 1 = { (1, 1), (2, 1), (2, 3), (3, 2) }

Cluster 2 = { (4, 3), (5, 5) }
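The steps above can be sketched as a minimal K-means implementation in plain Python (no external libraries). Running it on the six points with the same initial centers reproduces the final clusters:

```python
import math

def kmeans(points, centroids, max_iters=100):
    """Basic K-means: alternate assignment and mean-update until assignments stabilize."""
    assignments = None
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_assignments = []
        for p in points:
            dists = [math.dist(p, c) for c in centroids]
            new_assignments.append(dists.index(min(dists)))
        if new_assignments == assignments:
            break  # same assignments as the previous iteration: converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    clusters = [[p for p, a in zip(points, assignments) if a == k]
                for k in range(len(centroids))]
    return clusters, centroids

points = [(1, 1), (2, 1), (2, 3), (3, 2), (4, 3), (5, 5)]
clusters, centers = kmeans(points, [(2, 1), (2, 3)])
print(clusters[0])  # [(1, 1), (2, 1), (2, 3), (3, 2)]
print(clusters[1])  # [(4, 3), (5, 5)]
```

Note that `math.dist` requires Python 3.8 or later; points that are equidistant from both centers (like a₄ in the first pass) are assigned to the first center by `index(min(...))`.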


Biological Neural Network vs Artificial Neural Network
Dendrites in a Biological Neural Network correspond to inputs in an Artificial Neural Network, the cell nucleus corresponds to nodes, synapses correspond to weights, and the axon corresponds to the output.

BNN             ANN
Dendrites       Inputs
Cell nucleus    Nodes
Synapse         Weights
Axon            Output
Artificial Neurons
An artificial neuron computes a weighted sum of its inputs; this computation is then passed through a transfer (activation) function to produce the neuron's output.
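As a sketch of that computation (the specific weights, bias, and choice of transfer functions below are illustrative assumptions):

```python
import math

def neuron(inputs, weights, bias, transfer):
    """An artificial neuron: weighted sum of inputs passed through a transfer function."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(s)

# Common transfer (activation) functions
step = lambda z: 1 if z >= 0 else 0
sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
relu = lambda z: max(0.0, z)

print(neuron([1, 2], [0.5, -0.25], 0.1, step))  # weighted sum = 0.1, step gives 1
```

The same weighted sum fed through `sigmoid` or `relu` instead of `step` yields a smooth or piecewise-linear output, which is what makes gradient-based training possible in later sections.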


Types of Neural Networks
Perceptron
A Perceptron is a single-layer neural network used for binary classification (like yes/no or
spam/not spam). It classifies inputs based on a weighted sum and a threshold.
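As a quick sketch, a perceptron with hand-picked weights can implement an AND gate (the weights w = (1, 1) and bias b = -1.5 below are illustrative assumptions, not learned values):

```python
def perceptron(x, w, b):
    """Single-layer perceptron: output 1 if the weighted sum crosses the threshold."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if s > 0 else 0

# With w = (1, 1) and b = -1.5, the weighted sum exceeds 0 only for input (1, 1)
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(x, (1, 1), -1.5))  # prints 0, 0, 0, 1 respectively
```

No single perceptron can implement XOR, which is why the multi-layer networks below are needed.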
Multi-Layer Perceptron (MLP)
An MLP has multiple layers – input, hidden, and output. It can model non-linear relationships and solve
problems like XOR, image recognition, etc.
Backpropagation Algorithm
Backpropagation is the training algorithm for MLP. It adjusts weights using gradient
descent to minimize the prediction error.
Backpropagation in MLP

Algorithm:
Initialize weights randomly
Repeat:
For each training example:
- Forward pass: compute prediction
- Compute error: prediction - actual
- Backward pass: compute gradients
- Update weights
Until all examples classified or stopping condition met
Return trained network
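The pseudocode above can be fleshed out into a minimal runnable sketch. The architecture (2 inputs, 2 sigmoid hidden units, 1 sigmoid output), squared-error loss, learning rate, epoch count, and AND-gate training data are all illustrative assumptions; AND is linearly separable, so this tiny network learns it reliably.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_mlp(data, n_hidden=2, lr=0.5, epochs=3000, seed=0):
    """Train a 2-input, 1-output MLP with one hidden layer via backpropagation."""
    rng = random.Random(seed)
    # Initialize weights randomly: each hidden unit has 2 weights + a bias;
    # the output unit has n_hidden weights + a bias.
    wh = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
    wo = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, y in data:
            # Forward pass: compute prediction
            h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in wh]
            o = sigmoid(sum(wo[j] * h[j] for j in range(n_hidden)) + wo[-1])
            # Backward pass: error delta at the output, then at each hidden unit
            d_out = (o - y) * o * (1 - o)
            d_hid = [d_out * wo[j] * h[j] * (1 - h[j]) for j in range(n_hidden)]
            # Update weights by gradient descent
            for j in range(n_hidden):
                wo[j] -= lr * d_out * h[j]
                wh[j][0] -= lr * d_hid[j] * x[0]
                wh[j][1] -= lr * d_hid[j] * x[1]
                wh[j][2] -= lr * d_hid[j]
            wo[-1] -= lr * d_out

    def predict(x):
        h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in wh]
        return sigmoid(sum(wo[j] * h[j] for j in range(n_hidden)) + wo[-1])
    return predict

# AND-gate data (an assumption, chosen for a quick, reliably learnable demo)
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
predict = train_mlp(and_data)
```

Swapping in XOR data exercises the non-linear capacity mentioned above, though with so few hidden units convergence then depends on the random initialization.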
Ensemble Learning
Ensemble Learning is a technique in machine learning where
multiple models (learners) are combined to solve a problem
and improve performance.
Why use Ensemble Learning?
• More accurate than individual models
• Reduces overfitting and variance
• Makes robust predictions
Random Forest
Random Forest is an Ensemble Learning technique that builds multiple
Decision Trees and combines their outputs to make a final prediction.
It is mainly used for:
• Classification (e.g., predicting yes/no)
• Regression (e.g., predicting a number)
Simple Definition
A Random Forest creates many decision trees using random subsets of data and features,
and combines their outputs to make more accurate and stable predictions.
Random Forest Classifier

Random Forest Classifier is a supervised machine learning algorithm that


combines multiple Decision Trees using an ensemble method (specifically,
bagging) to improve prediction accuracy and control overfitting.
Random Forest Classifier Algorithm
1. Input:
   a. Training dataset with features and labels (X, Y)
   b. Number of trees (N) to build
   c. Number of features to consider at each split (optional)
2. For each of the N trees:
   a. Bootstrap sampling: randomly select data points from the training set with replacement to form a new training set.
   b. Train a decision tree: at each node, randomly select a subset of features and choose the best feature for splitting (not all features).
   c. Grow the tree fully (no pruning).
3. Prediction:
   a. For a new data point, get predictions from all decision trees.
   b. Use majority voting to assign the final class label.
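The bootstrap-and-vote skeleton of this algorithm can be sketched in plain Python. For brevity this sketch uses depth-1 "decision stumps" as the base learners instead of fully grown trees, and omits the per-split feature subsampling; the toy dataset (label 1 when the first feature exceeds 2) is an assumption:

```python
import random
from collections import Counter

def train_stump(data):
    """Depth-1 tree: pick the (feature, threshold, direction) with fewest errors."""
    best = None  # (error_count, feature, threshold, sign)
    n_features = len(data[0][0])
    for f in range(n_features):
        for t in sorted({x[f] for x, _ in data}):
            for sign in (1, -1):
                errs = sum((1 if sign * (x[f] - t) > 0 else 0) != y
                           for x, y in data)
                if best is None or errs < best[0]:
                    best = (errs, f, t, sign)
    _, f, t, sign = best
    return lambda x: 1 if sign * (x[f] - t) > 0 else 0

def random_forest(data, n_trees=25, seed=0):
    """Steps 2-3 of the algorithm: bootstrap samples, base learners, majority vote."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        # 2a. Bootstrap sampling: draw len(data) points with replacement
        sample = [rng.choice(data) for _ in data]
        # 2b. Train a base learner on the bootstrap sample
        trees.append(train_stump(sample))
    def predict(x):
        # 3b. Majority voting over all the trees' predictions
        votes = Counter(tree(x) for tree in trees)
        return votes.most_common(1)[0][0]
    return predict

# Toy dataset (an assumption): label 1 when the first feature exceeds 2
data = [((1.0, 0), 0), ((1.5, 1), 0), ((2.5, 0), 1), ((3.0, 1), 1)]
predict = random_forest(data)
print(predict((0.5, 0)), predict((4.0, 1)))  # 0 1
```

Because each tree sees a different bootstrap sample, individual trees can vote wrongly on a given point, but the majority vote stays stable, which is the variance reduction the slides attribute to bagging.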
