Programming Assignment Unit-5
Mahboob Hassan
University of People
CS 4407: Data Mining and Machine Learning
Naeem Ahmed
July 23, 2025
For the Unit 5 Programming Assignment, follow the instructions for the lab in Section 8.3 of
our textbook. Once you are comfortable with the lab, build a decision tree using the
following data.
Data Set Information:
This radar data was collected by a system in Goose Bay, Labrador. This system consists of a
phased array of 16 high-frequency antennas with a total transmitted power on the order of 6.4
kilowatts. See the paper for more details. The targets were free electrons in the ionosphere.
"Good" radar returns are those showing evidence of some type of structure in the ionosphere.
"Bad" returns are those that do not; their signals pass through the ionosphere.
Received signals were processed using an autocorrelation function whose arguments are the
time of a pulse and the pulse number. There were 17 pulse numbers for the Goose Bay
system. Instances in this database are described by 2 attributes per pulse number,
corresponding to the complex values returned by the function resulting from the complex
electromagnetic signal.
Attribute Information:
-- All 34 predictor attributes are continuous
-- The 35th attribute is either "good" or "bad" according to the definition summarized above.
This is a binary classification exercise.
Download the data set:
https://my.uopeople.edu/pluginfile.php/295432/mod_workshop/instructauthors/Ionosphere.txt
This assignment closely follows the programming lab in Section 8.3 of the textbook. If you
are unsure how to carry out part of the assignment, use the lab as a reference. It may also
help to consult the manual for the rpart package (see References).
Part 1: Print decision tree
a. We begin by setting the working directory, loading the required packages (rpart and
mlbench) and then loading the Ionosphere dataset.
#set working directory if needed (modify path as needed)
setwd("working directory")
#load required libraries - rpart for classification and regression trees
library(rpart)
#mlbench for Ionosphere dataset
library(mlbench)
#load Ionosphere
data(Ionosphere)
b. Use the rpart() method to create a classification tree for the data (Class is a factor,
so rpart() builds a classification tree rather than a regression tree).
rpart(Class ~ ., data = Ionosphere)
c. Use the plot() and text() methods to plot the decision tree.
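A minimal sketch of this step, assuming the fitted tree from Part 1b is first saved to a variable (the margin and cex values are illustrative choices, not required by the assignment):

```r
library(rpart)
library(mlbench)
data(Ionosphere)

fit <- rpart(Class ~ ., data = Ionosphere)  # classification tree from Part 1b
plot(fit, margin = 0.1)                     # draw the tree; margin leaves room for labels
text(fit, use.n = TRUE, cex = 0.8)          # label splits and per-class counts at each node
```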
Part 2: Estimate accuracy
a. Split the data into test and train subsets using the sample() method.
b. Use the rpart method to create a decision tree using the training data.
rpart(Class ~ ., data = Ionosphere, subset = train)
c. Use the predict method to find the predicted class labels for the testing data.
d. Use the table method to create a table of the predictions versus true labels and then
compute the accuracy. The accuracy is the number of correctly assigned good cases (true
positives) plus the number of correctly assigned bad cases (true negatives) divided by the
total number of testing cases.
Solution to Programming Assignment
Part 1: Building and Plotting the Decision Tree
# Load required libraries
library(rpart)
library(mlbench)
# Load Ionosphere dataset
data(Ionosphere)
# Build decision tree model
ionosphere_tree <- rpart(Class ~ ., data = Ionosphere)
# Plot decision tree
plot(ionosphere_tree, margin = 0.1)
text(ionosphere_tree, use.n = TRUE, cex = 0.8)
Explanation:
The rpart function constructs a decision tree predicting Class (good/bad radar returns) using
all other attributes (Class ~ .).
plot() visualizes the tree structure, while text() adds node labels showing:
The split condition and predicted class at each node
The number of observations of each class at that node (from use.n = TRUE)
Part 2: Estimating Model Accuracy
# Set seed for reproducibility
set.seed(123)
# Split data into 70% training, 30% testing
train_indices <- sample(1:nrow(Ionosphere), size = floor(0.7 * nrow(Ionosphere)))
train_data <- Ionosphere[train_indices, ]
test_data <- Ionosphere[-train_indices, ]
# Build tree using training data
tree_model <- rpart(Class ~ ., data = train_data)
# Predict on test data
predictions <- predict(tree_model, test_data, type = "class")
# Confusion matrix and accuracy
conf_matrix <- table(Predicted = predictions, Actual = test_data$Class)
accuracy <- sum(diag(conf_matrix)) / sum(conf_matrix)
# Print results
print(conf_matrix)
cat("\nAccuracy:", round(accuracy * 100, 2), "%")
Output Interpretation:
Illustrative output after running the code (exact counts depend on the random split):

           Actual
Predicted  bad  good
  bad       24     8
  good       8   104

Accuracy: 88.89 %
Key Steps Explained:
Data Splitting:
70% of data randomly selected for training, 30% for testing
set.seed(123) ensures reproducible random splits
Model Training:
Decision tree built only on training data (train_data)
Prediction & Evaluation:
type = "class" returns explicit "good"/"bad" predictions
Confusion matrix cross-tabulates predictions vs. true labels
Accuracy = (True Positives + True Negatives) / Total Samples
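Plugging the confusion matrix shown above into this formula (treating "good" as the positive class):

```text
Accuracy = (TP + TN) / Total
         = (104 + 24) / (24 + 8 + 8 + 104)
         = 128 / 144
         ≈ 0.8889  (88.89 %)
```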
Important Notes:
Data Characteristics:
351 observations, 34 continuous predictors
Binary outcome: Class = {"good", "bad"}
Model Customization (Optional):
Control tree complexity by passing parameters to rpart():

rpart(Class ~ ., data = train_data,
      control = rpart.control(minsplit = 10, cp = 0.01))
minsplit: Minimum observations required to split a node
cp: Complexity parameter (smaller = larger tree)
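As a sketch of how these parameters interact with rpart's built-in cost-complexity table, a tree can be grown large and then pruned back with prune(); the minsplit, cp, and pruning values below are illustrative, not prescribed by the assignment:

```r
library(rpart)
library(mlbench)
data(Ionosphere)

# Grow a deliberately large tree with a small complexity parameter
big_tree <- rpart(Class ~ ., data = Ionosphere,
                  control = rpart.control(minsplit = 10, cp = 0.001))
printcp(big_tree)                     # cross-validated error for each candidate cp value
pruned  <- prune(big_tree, cp = 0.01) # keep only splits that improve fit by at least cp
```

printcp() reports rpart's internal 10-fold cross-validation error (xerror), which can guide the choice of cp for pruning.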
Performance Improvement:
Accuracy can vary due to random splitting (use set.seed for consistency)
For more robust evaluation, implement k-fold cross-validation (beyond scope of this
assignment)
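Although k-fold cross-validation is beyond the scope of the assignment, the idea can be sketched in base R; the fold count and seed below are arbitrary choices:

```r
library(rpart)
library(mlbench)
data(Ionosphere)

set.seed(123)
k <- 5
# Assign each row a random fold label 1..k
folds <- sample(rep(1:k, length.out = nrow(Ionosphere)))
accuracies <- numeric(k)
for (i in 1:k) {
  test_idx <- which(folds == i)
  fit  <- rpart(Class ~ ., data = Ionosphere[-test_idx, ])           # train on k-1 folds
  pred <- predict(fit, Ionosphere[test_idx, ], type = "class")       # predict held-out fold
  accuracies[i] <- mean(pred == Ionosphere$Class[test_idx])
}
mean(accuracies)  # average accuracy across the k folds
```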
This solution follows the textbook's approach in Section 8.3 while adapting to the Ionosphere
dataset. The decision tree visualization helps interpret classification rules, while the accuracy
calculation quantifies predictive performance.
References:
Therneau, T., & Atkinson, B. rpart: Recursive Partitioning and Regression Trees [R package
reference manual]. https://cran.r-project.org/web/packages/rpart/rpart.pdf