
MACHINE LEARNING LAB - 3 (ID3 Algorithm)
3. Write a program to demonstrate the working of the decision tree based ID3
Algorithm. Use an appropriate data set for building the decision tree and apply
this knowledge to classify a new sample.
In [1]:
import numpy as np
import math
import csv
In [2]:
def read_data(filename):
    # Read the CSV file: the first row is the header (attribute names),
    # the remaining rows are the training examples.
    with open(filename, 'r') as csvfile:
        datareader = csv.reader(csvfile, delimiter=',')
        headers = next(datareader)
        metadata = []
        traindata = []
        for name in headers:
            metadata.append(name)
        for row in datareader:
            traindata.append(row)

    return (metadata, traindata)


In [3]:
class Node:
    # A tree node: either an internal node holding an attribute name,
    # or a leaf holding the class label in `answer`.
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = []
        self.answer = ""

    def __str__(self):
        return self.attribute
In [4]:
def subtables(data, col, delete):
    # Partition `data` on the values of column `col`.
    # Returns the unique values and a dict mapping each value to its rows;
    # if `delete` is True, the split column is dropped from each subtable.
    dict = {}
    items = np.unique(data[:, col])
    count = np.zeros((items.shape[0], 1), dtype=np.int32)

    for x in range(items.shape[0]):
        for y in range(data.shape[0]):
            if data[y, col] == items[x]:
                count[x] += 1

    for x in range(items.shape[0]):
        dict[items[x]] = np.empty((int(count[x]), data.shape[1]), dtype="|S32")
        pos = 0
        for y in range(data.shape[0]):
            if data[y, col] == items[x]:
                dict[items[x]][pos] = data[y]
                pos += 1
        if delete:
            dict[items[x]] = np.delete(dict[items[x]], col, 1)

    return items, dict


In [5]:
def entropy(S):
    # Shannon entropy (in bits) of the label column S.
    items = np.unique(S)

    if items.size == 1:
        return 0

    counts = np.zeros((items.shape[0], 1))
    sums = 0

    for x in range(items.shape[0]):
        counts[x] = sum(S == items[x]) / (S.size * 1.0)

    for count in counts:
        sums += -1 * count * math.log(count, 2)

    return sums
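As a quick check of entropy: on the standard 14-example play-tennis labels (9 'Yes' and 5 'No', assumed here for tennisdata.csv), H = -(9/14)·log2(9/14) - (5/14)·log2(5/14) ≈ 0.940 bits, which the function should reproduce:

labels = np.array(['Yes'] * 9 + ['No'] * 5)   # assumed label counts for tennisdata.csv
print(entropy(labels))                        # should print roughly 0.940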
In [6]:
def gain_ratio(data, col):
    # Gain ratio for splitting on column `col`: information gain of the
    # split divided by its intrinsic value.
    items, dict = subtables(data, col, delete=False)

    total_size = data.shape[0]
    entropies = np.zeros((items.shape[0], 1))
    intrinsic = np.zeros((items.shape[0], 1))

    for x in range(items.shape[0]):
        ratio = dict[items[x]].shape[0] / (total_size * 1.0)
        entropies[x] = ratio * entropy(dict[items[x]][:, -1])
        intrinsic[x] = ratio * math.log(ratio, 2)

    total_entropy = entropy(data[:, -1])
    iv = -1 * sum(intrinsic)

    for x in range(entropies.shape[0]):
        total_entropy -= entropies[x]

    return total_entropy / iv
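For reference, gain_ratio computes the C4.5-style gain ratio rather than plain information gain: GainRatio(S, A) = Gain(S, A) / IV(A), with Gain(S, A) = H(S) - Σ_v (|S_v|/|S|)·H(S_v) and IV(A) = -Σ_v (|S_v|/|S|)·log2(|S_v|/|S|). Textbook ID3 ranks attributes by Gain(S, A) alone; on this dataset both criteria place Outlook at the root, so the tree printed below is rooted at Outlook either way.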
In [7]:
def create_node(data, metadata):
    # If all remaining examples share one label, return a leaf node.
    if (np.unique(data[:, -1])).shape[0] == 1:
        node = Node("")
        node.answer = np.unique(data[:, -1])[0]
        return node

    # Otherwise split on the attribute with the highest gain ratio.
    gains = np.zeros((data.shape[1] - 1, 1))

    for col in range(data.shape[1] - 1):
        gains[col] = gain_ratio(data, col)

    split = np.argmax(gains)

    node = Node(metadata[split])
    metadata = np.delete(metadata, split, 0)

    items, dict = subtables(data, split, delete=True)

    for x in range(items.shape[0]):
        child = create_node(dict[items[x]], metadata)
        node.children.append((items[x], child))

    return node
In [8]:
def empty(size):
    # Indentation helper: a string of `size` spaces.
    s = ""
    for x in range(size):
        s += " "
    return s

def print_tree(node, level):
    # Leaf: print the class label; internal node: print the attribute
    # and recurse into each (value, subtree) pair.
    if node.answer != "":
        print(empty(level), node.answer)
        return
    print(empty(level), node.attribute)
    for value, n in node.children:
        print(empty(level + 1), value)
        print_tree(n, level + 2)
In [9]:
metadata, traindata = read_data("tennisdata.csv")
data = np.array(traindata)
node = create_node(data, metadata)
print_tree(node, 0)

 Outlook
  Overcast
   b'Yes'
  Rainy
   Windy
    b'False'
     b'Yes'
    b'True'
     b'No'
  Sunny
   Humidity
    b'High'
     b'No'
    b'Normal'
     b'Yes'
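The listing above only prints the tree; the exercise also asks to classify a new sample. A minimal sketch of that step is shown below, assuming the node object built in the cell above and header names Outlook, Temperature, Humidity, Windy in tennisdata.csv (classify and new_sample are illustrative names, not part of the original program). Branch values below the root are stored as bytes by subtables, so they are decoded before comparison:

def classify(node, sample):
    # Leaf node: return the stored class label (a bytes object such as b'Yes').
    if node.answer != "":
        return node.answer
    # Internal node: follow the branch whose value matches the sample's
    # value for this node's attribute; branch labels may be str or bytes.
    value = sample[node.attribute]
    for branch_value, child in node.children:
        label = branch_value.decode() if isinstance(branch_value, bytes) else branch_value
        if label == value:
            return classify(child, sample)
    return None  # attribute value not seen during training

new_sample = {'Outlook': 'Sunny', 'Temperature': 'Cool',
              'Humidity': 'High', 'Windy': 'True'}
print(classify(node, new_sample))  # expected b'No' for this sample, per the tree above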
