Decision Tree

Classification in data mining involves assigning class labels to instances based on their features, with binary and multi-class classification as the two main types. Decision trees are a key classification technique, utilizing a tree structure with decision and leaf nodes, and rely on concepts like information gain and entropy. Various algorithms exist for decision tree induction, including ID3, CART, and C4.5.

Uploaded by

japan302.302

Data Mining

Classification:
Basic Concepts, Decision Trees (ID3 algorithm)
Classification: Definition
• Classification is a task in data mining that involves assigning a class label to each
instance in a dataset based on its features. The goal of classification is to build a
model that accurately predicts the class labels of new instances based on their
features.
• There are two main types of classification: binary classification and multi-class
classification. Binary classification involves classifying instances into two classes,
such as “spam” or “not spam”, while multi-class classification involves classifying
instances into more than two classes.
Classification Techniques
• Decision Tree based Methods
• Rule-based Methods
• Neural Networks
• Naïve Bayes
• Support Vector Machines
Decision Tree
• A decision tree is a supervised machine learning algorithm that is used for both
classification and regression tasks.
• Tree Structure
• Decision nodes
• Leaf nodes
• Splitting
• Information Gain
• Entropy
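To make the tree structure concrete, here is a minimal Python sketch of decision nodes and leaf nodes, with a function that walks from the root to a leaf. The class names (`Decision`, `Leaf`), the `predict` helper, and the weather-style features are all hypothetical and chosen only for illustration; they do not come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    label: str  # class label predicted when an instance reaches this leaf


@dataclass
class Decision:
    feature: str                 # feature tested at this decision node
    children: Dict[str, "Node"]  # one subtree per possible feature value


Node = Union[Leaf, Decision]


def predict(node, instance):
    """Follow the branch matching the instance's feature value until a leaf."""
    while isinstance(node, Decision):
        node = node.children[instance[node.feature]]
    return node.label


# Hand-built example tree (illustrative data, not learned from a dataset).
tree = Decision("outlook", {
    "sunny": Decision("windy", {"yes": Leaf("stay in"), "no": Leaf("play")}),
    "rainy": Leaf("stay in"),
})

print(predict(tree, {"outlook": "sunny", "windy": "no"}))
```

Each decision node corresponds to one split on a feature, and each leaf node holds the final class label.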
Information Gain and Entropy
• Information Gain is a measure of how much information the answer to a
specific question (a split on a feature) provides, i.e. how much it reduces
uncertainty about the class label.
• Entropy is the uncertainty/randomness in the class labels of a set of instances;
the more the randomness, the higher the entropy.
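The two quantities can be sketched in a few lines of Python. This assumes the standard Shannon definition, where entropy is the sum of -p * log2(p) over the class proportions p, and information gain is the entropy before a split minus the size-weighted entropy of the groups after it. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy of the labels minus the weighted entropy after splitting on feature."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data: "outlook" separates the classes perfectly, "windy" not at all.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

print(entropy(labels))
print(information_gain(rows, labels, "outlook"))
print(information_gain(rows, labels, "windy"))
```

A set with an even mix of two classes has the maximum entropy of 1 bit, and a set with a single class has entropy 0, so a split that produces purer groups has a higher information gain.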
Decision Tree Induction
Many Algorithms:
• Hunt’s Algorithm (one of the earliest)
• CART
• ID3, C4.5
• SLIQ, SPRINT
Decision Tree Induction using ID3
• ID3 stands for Iterative Dichotomiser 3. It is so named because the
algorithm iteratively (repeatedly) dichotomizes (divides) the data into two or
more groups at each step, according to the values of a selected feature.
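The repeated dividing step can be sketched as a short recursive function. This is a simplified illustration under stated assumptions, not a full ID3 implementation: it assumes categorical features, picks the feature with the highest information gain at each step, and returns the tree as nested dictionaries. The function names (`id3`, `gain`) and the toy dataset are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())


def gain(rows, labels, feature):
    """Information gain obtained by splitting the data on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def id3(rows, labels, features):
    """Return a class label (leaf) or {feature: {value: subtree}} (decision node)."""
    if len(set(labels)) == 1:   # all instances share one class: make a leaf
        return labels[0]
    if not features:            # nothing left to split on: use the majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: gain(rows, labels, f))
    remaining = [f for f in features if f != best]
    tree = {best: {}}
    for value in {row[best] for row in rows}:
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        tree[best][value] = id3(sub_rows, sub_labels, remaining)
    return tree


# Toy dataset (illustrative only).
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "overcast", "windy": "no"},
    {"outlook": "overcast", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "play", "play", "play", "stay"]

tree = id3(rows, labels, ["windy", "outlook"])
print(tree)
```

On this data the sketch splits on "outlook" first, because it has the higher information gain, and then splits the "rainy" group on "windy". The sketch omits practical concerns such as continuous features, missing values, and pruning.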
