
Faculty of Engineering and Technology

Department of Electrical and Computer Engineering

Artificial Intelligence
ENCS3340
Second Semester 2022/2023

Project # 2
Prepared by: Omar Masalmah ID: 1200060

Instructor: Dr. Ismail Khater

Date: 15-07-2023
1. Introduction

In this report, we discuss the results and experiments conducted on the task of email
spam classification using k-NN and MLP classifiers. The goal was to develop models
that could accurately distinguish between spam and non-spam emails based on a
given dataset. The dataset comprised 4601 examples, each represented by 58
attributes, with the last attribute indicating the class label (spam or non-spam).

2. Methodology

We followed a systematic approach to train and evaluate the k-NN and MLP
classifiers on the provided dataset. The dataset was split into training and testing
sets using a 70:30 ratio. The features of the dataset were preprocessed by
normalizing each feature based on the mean and standard deviation. The k-NN
classifier was implemented with k=3, using the Euclidean distance metric to find the
nearest neighbors. The MLP classifier was trained with two hidden layers (10
neurons in the first layer, 5 neurons in the second layer), and the logistic activation
function.
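The report does not state which library was used to implement the classifiers; the sketch below shows the described pipeline using scikit-learn as an assumed implementation, with the dataset already loaded into a feature matrix X (the 57 attribute columns) and a label vector y (the 58th attribute).

    # Sketch of the described pipeline (scikit-learn assumed, not the original code).
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier

    # 70:30 train/test split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    # Normalize each feature by its mean and standard deviation (statistics fit on the training set)
    scaler = StandardScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    # k-NN classifier with k=3 and the Euclidean distance metric
    knn = KNeighborsClassifier(n_neighbors=3, metric='euclidean').fit(X_train, y_train)

    # MLP classifier with two hidden layers (10 and 5 neurons) and the logistic activation
    mlp = MLPClassifier(hidden_layer_sizes=(10, 5), activation='logistic',
                        max_iter=1000, random_state=0).fit(X_train, y_train)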

3. Results

The models were trained and evaluated on the test set, and the following
performance metrics were calculated:

- Accuracy: Measures the overall correctness of the classifier's predictions.
- Precision: Indicates the proportion of correctly predicted spam emails among the predicted spam emails.
- Recall: Represents the proportion of correctly predicted spam emails among the actual spam emails.
- F1-score: Harmonic mean of precision and recall, providing a balanced measure of classifier performance.
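For reference, these metrics can be computed directly from the test-set predictions; the snippet below is a sketch that assumes the scikit-learn models from the Methodology sketch and that spam is encoded as the positive class (label 1).

    # Sketch: computing the evaluation metrics for one classifier (scikit-learn assumed).
    from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

    y_pred = knn.predict(X_test)   # or mlp.predict(X_test)
    print("Accuracy :", accuracy_score(y_test, y_pred))
    print("Precision:", precision_score(y_test, y_pred, pos_label=1))  # spam assumed to be label 1
    print("Recall   :", recall_score(y_test, y_pred, pos_label=1))
    print("F1-score :", f1_score(y_test, y_pred, pos_label=1))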
The achieved results on the test set are summarized below:

k-NN Classifier:

Figure 1: k-NN Results

MLP Classifier:

Figure 2: MLP Results

4. Confusion Matrix

To gain more insight into the classifiers' performance, we constructed the
confusion matrices for both the k-NN and MLP classifiers. A confusion
matrix provides a detailed breakdown of a classifier's predictions, including
true positive (TP), true negative (TN), false positive (FP), and false negative
(FN) results. The confusion matrices are presented below:
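A confusion matrix can be obtained from the same test-set predictions; the snippet below is a sketch under the same scikit-learn assumption, with spam treated as the positive class (label 1).

    # Sketch: building the confusion matrix for one classifier (scikit-learn assumed).
    from sklearn.metrics import confusion_matrix

    y_pred = knn.predict(X_test)   # or mlp.predict(X_test)
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()  # binary case: TN, FP, FN, TP
    print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)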

Figure 3: Confusion Matrix


k-NN Confusion Matrix:

Figure 4: Confusion Matrix of k-NN

MLP Confusion Matrix:

Figure 5: Confusion Matrix of MLP

5. Discussion

Based on the achieved results, both the k-NN and MLP classifiers showed
reasonably good performance in distinguishing between spam and non-spam
emails. The k-NN classifier achieved an accuracy of 0.9066, with a precision
of 0.9113, a recall of 0.8589, and an F1-score of 0.8843. The MLP classifier
achieved an accuracy of 0.9334, with a precision of 0.9480, a recall of
0.8885, and an F1-score of 0.9173.

Analyzing the confusion matrices with spam treated as the positive class, we
observe that the k-NN classifier produced 493 true positive and 759 true
negative predictions, along with 48 false positive and 81 false negative
predictions. The MLP classifier produced 510 true positive and 779 true
negative predictions, with 28 false positive and 64 false negative
predictions.
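As a consistency check, these counts reproduce the metrics reported above: for the k-NN classifier, precision = 493 / (493 + 48) ≈ 0.911, recall = 493 / (493 + 81) ≈ 0.859, and accuracy = (493 + 759) / 1381 ≈ 0.907, while the MLP counts give precision = 510 / (510 + 28) ≈ 0.948 and recall = 510 / (510 + 64) ≈ 0.889.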

6. Experimental Improvements

The following tactics could further improve the performance of the evaluated models:

1- Feature Engineering: Investigate additional features that could better capture
the characteristics of spam emails. For instance, examining the email's
subject line, the sender's domain, or the presence of particular keywords
might offer useful information.
2- Hyperparameter Tuning: Experiment with various classifier hyperparameter
configurations. Improvements could come from adjusting variables such as k
in the k-NN, the number of hidden layers and neurons in the MLP, or the
activation functions. To determine the best hyperparameters, use methods
such as grid search or random search (an illustrative sketch follows this list).
3- Ensemble Methods: Use ensemble techniques, such as Random Forest or
Gradient Boosting, to aggregate predictions from multiple models. Ensemble
algorithms often improve accuracy and robustness.
4- Handling Class Imbalance: Use approaches such as oversampling the
minority class (spam) or undersampling the majority class (non-spam) to
address any class imbalance in the dataset. This may help the models learn
from both classes more effectively.
5- Error Analysis: Carefully examine misclassified examples to spot any trends
or recurring traits that might be responsible for the errors. This analysis can
guide improvements to the models and the preprocessing procedures.
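As an illustration of point 2, hyperparameter tuning with grid search could be sketched as follows; scikit-learn is assumed, and the parameter grid is only an example rather than a configuration evaluated in this report.

    # Illustrative sketch of grid search over MLP hyperparameters (example grid, scikit-learn assumed).
    from sklearn.model_selection import GridSearchCV
    from sklearn.neural_network import MLPClassifier

    param_grid = {
        'hidden_layer_sizes': [(10,), (10, 5), (20, 10)],
        'activation': ['logistic', 'relu'],
        'alpha': [1e-4, 1e-3, 1e-2],
    }
    search = GridSearchCV(MLPClassifier(max_iter=1000), param_grid, cv=5, scoring='f1')
    search.fit(X_train, y_train)
    print(search.best_params_, search.best_score_)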

7. Conclusion

In this report, we discussed the experiments and findings related to the
classification of spam emails using k-NN and MLP classifiers. The classifiers
successfully distinguished between spam and non-spam emails, achieving an
accuracy of approximately 90.7% for k-NN and 93.3% for MLP. The confusion
matrices shed light on the true positives, true negatives, false positives, and
false negatives in the models' predictions.

We recommend investigating feature engineering, hyperparameter tuning,
ensemble approaches, class imbalance handling, error analysis, and cross-
validation to further enhance the models' performance. Putting these tactics
into practice can improve classification accuracy and yield better outcomes.

Given the dynamic nature of spam emails, ongoing study and
experimentation in email spam classification are essential. By continuing to
develop and improve classification models, we can better safeguard users
from undesired and potentially hazardous email content.
