Phishing Paper 2

The document presents a machine learning-based approach for detecting and preventing phishing attacks, focusing on the use of the XGBoost algorithm due to its high predictive accuracy. It discusses the importance of automated systems to identify phishing attempts effectively and compares various machine learning techniques, highlighting the superiority of XGBoost over traditional methods. The research emphasizes the need for advanced solutions to combat the increasing sophistication of phishing attacks, utilizing URL-based features and a comprehensive dataset for model training.


Machine Learning Based Approach for Phishing Attacks Detection and Prevention

Akshay M J, Pratheeksha N Hampole, Sowparmika K H, Sushmitha H S, T P Keerthi, Pavan M
Department of Information Science and Engineering
Jawaharlal Nehru New College of Engineering, Shivamogga

Abstract—Phishing attacks remain a significant threat in the cybersecurity landscape, targeting individuals and organizations to steal sensitive information such as usernames, passwords, financial data, and confidential business records. These attacks are typically carried out through deceptive emails, websites, and messages that appear legitimate but aim to trick users into disclosing personal information. Given the sophistication and increasing frequency of phishing schemes, there is an urgent need for automated detection and prevention systems that can quickly identify and mitigate such threats. This work focuses on detecting and preventing phishing attacks using machine learning, specifically leveraging the XGBoost (Extreme Gradient Boosting) algorithm. XGBoost is a state-of-the-art supervised learning model known for its high predictive accuracy and ability to handle imbalanced datasets, which is often the case in phishing detection tasks. The proposed work explores the use of URL-based features, such as domain registration details, URL length, the presence of special characters, and other textual elements, as input to the XGBoost classifier.

Keywords— Phishing, Legitimate, Classification, Feature extraction, XGBoost.

I. INTRODUCTION

Phishing attacks represent a significant cybersecurity challenge, leveraging social engineering tactics to manipulate individuals into divulging sensitive information. As these attacks become increasingly sophisticated, traditional detection methods often fall short, highlighting the need for advanced solutions. Machine learning (ML) has emerged as a powerful tool in the fight against phishing, offering innovative techniques for identifying and mitigating these threats. Machine learning (ML) offers a promising solution to enhance phishing detection and prevention efforts. By leveraging algorithms that can learn from data, ML models can identify patterns and anomalies that signify phishing attempts with greater accuracy and speed than conventional methods. Techniques such as supervised learning, unsupervised learning, and deep learning allow for the analysis of vast datasets, including email content, URLs, and user behavior. Thus, incorporating ML into phishing attack prevention strategies not only improves the speed and accuracy of threat identification but also enables adaptive responses to evolving tactics used by attackers. This approach allows organizations to stay one step ahead of cybercriminals.

XGBoost (eXtreme Gradient Boosting) is a powerful and efficient machine learning library designed for gradient boosting, a technique that combines the predictions of weak learners (typically decision trees) to create a strong predictive model. XGBoost implements an optimized gradient boosting algorithm based on decision trees, making it suitable for both classification and regression tasks. It includes advanced regularization, which helps prevent overfitting and enhances model generalization. Leveraging ML techniques allows for the analysis of large datasets, including email content, URLs, and website behavior, to identify and prevent phishing attempts more effectively than rule-based systems.

II. RELATED WORKS

This section discusses the papers and methods related to the detection and prevention of phishing attacks.

The research focuses on using machine learning (ML) models to identify and prevent phishing attacks, which are a significant cybersecurity threat [1]. It presents a comparative analysis of various ML techniques, including Support Vector Machines (SVM) and XGBoost, for detecting phishing websites. The study evaluates these models using performance metrics such as accuracy and precision. The XGBoost algorithm is highlighted for its superior accuracy, precision, and computational efficiency across multiple datasets. Through rigorous experimental analysis, the findings demonstrate that XGBoost outperforms the Support Vector Machines algorithm.

The research presents a novel approach that combines feature selection and hyperparameter tuning for the XGBoost algorithm using a modified genetic algorithm (GA) [2]. It addresses challenges in spam detection by reducing the dimensionality of large datasets while maintaining or improving classification performance. The proposed method achieves significant spam classification accuracy and a high geometric mean using less than 10% of the total feature space. The approach was validated against datasets that include web spam, which exploits social engineering to lure privileged users into logging into deceptive services.

The research explores the increasing prevalence and sophistication of spear-phishing attacks, which are highly targeted and context-specific compared to regular phishing [3]. It provides an extensive review of the literature on phishing and spear-phishing, differentiating between the two types of attacks and analyzing their processes. The study highlights the growing reliance on social engineering (SE) as an attack vector, focusing on targeting individuals rather than systems. It examines a real-world, advanced spear-phishing campaign targeting white-collar workers in 32 countries, illustrating the sophisticated tactics used by attackers, such as creating fake companies and job postings to lure victims into providing sensitive information.

The research presents a novel approach for detecting phishing URLs, particularly those generated by AI systems like DeepPhish [4]. The system is designed to detect both AI-generated and human-crafted phishing URLs. PhishHaven introduces several innovative techniques, including URL HTML Encoding for on-the-fly classification and a URL Hit approach to handle shortened URLs. The ensemble-based machine learning models are executed in parallel using a multi-threading approach, enabling real-time detection.

The paper introduces a novel solution for detecting phishing attacks by analyzing the visual similarity between web pages [5]. The authors argue that traditional phishing detection methods, which rely on URL analysis or blacklists, are insufficient, as attackers can easily modify these features to evade detection.

This work focuses on leveraging machine learning techniques to detect phishing websites, which aim to steal sensitive user information through deceptive means [6]. The proposed approach employs supervised machine learning algorithms (Random Forest and Decision Tree) to classify URLs as either phishing or legitimate. The study utilizes a dataset of 10,000 URLs (5,000 phishing and 5,000 legitimate), with features extracted from three categories: URL-based attributes (e.g., length, depth), domain-based attributes (e.g., DNS records, domain age), and HTML/JavaScript-based attributes (e.g., iframe redirection, right-click disablement).

The research explores innovative approaches to phishing website detection using multimodal learning techniques [7]. It introduces an Adversarial Website Generation (AWG) framework, leveraging Generative Adversarial Networks (GANs) and transfer-based black-box attacks to simulate real-world phishing attacks. The study assesses 15 learning-based models, including machine learning (ML), deep learning (DL), ensemble learning (EL), and multimodal models (MM), focusing on their resistance to adversarial examples (AEs). It also proposes defense strategies such as adversarial training to enhance model robustness against phishing and adversarial websites.

The work presents a novel dataset specifically designed to address the increasing sophistication and prevalence of phishing attacks [8]. The authors emphasize that phishing kits, which are prepackaged software used to create phishing websites, pose a significant threat due to their ease of use and distribution. These kits allow attackers with minimal technical knowledge to launch effective phishing campaigns. The PhiKitA dataset aims to enhance the identification and understanding of phishing websites by providing a comprehensive collection of phishing kits and associated metadata.

The work addresses the challenge of detecting social semantic attacks, a subset of social engineering attacks [9]. These attacks exploit human behavioral and psychological vulnerabilities by creating deceptive elements such as URLs or webpages that mimic legitimate ones. The study focuses on detecting malicious URLs associated with four types of attacks: phishing, spamming, defacement, and malware.

This research investigates the issue of phishing attacks and evaluates the effectiveness of various machine learning algorithms in detecting them [10]. It systematically analyzes five algorithms (Logistic Regression, Support Vector Machine (SVM), K-Nearest Neighbors (K-NN), Naive Bayes, and Extreme Gradient Boosting (XGBoost)) using a large dataset of URLs. The paper likely explores and evaluates various machine learning techniques for detecting phishing attacks. Phishing, a social engineering technique, tricks users into divulging sensitive information by mimicking trusted entities.

This work presents a novel hybrid two-level framework designed to enhance the detection of phishing websites by optimizing feature selection and XGBoost model performance [11]. The framework integrates advanced feature selection techniques to identify the most relevant attributes, reducing noise and computational overhead. Subsequently, it applies an iterative hyperparameter tuning process to fine-tune the XGBoost classifier for improved accuracy and robustness.

This research presents a multi-agent intelligent system for detecting and preventing phishing attacks and malicious scripts using machine learning [12]. It incorporates four agents: a monitoring agent for URL extraction, two decision-making agents utilizing classifiers (SVM and ANN), and an action-performing agent to block malicious pages or scripts. The system employs the Extensible Messaging and Presence Protocol (XMPP) for agent communication and integrates features like URL length, IP addresses, and DNS records for analysis.

This research explores the persistent cybersecurity threat of phishing attacks, detailing their various forms, including spear phishing, vishing, and smishing [13]. It explains how attackers exploit social engineering techniques to manipulate users into divulging personal information or installing malware. Specific attention is given to emerging threats like ransomware, banking trojans, and cryptojacking. The article also outlines the stages of phishing attacks, from preparation to execution.

This work investigates a novel adversarial hiding approach aimed at evading phishing detection mechanisms within the Ethereum blockchain ecosystem [14]. It explores how malicious actors exploit Ethereum’s decentralized infrastructure and smart contract features to conceal phishing activities from detection systems. The proposed approach leverages adversarial techniques, such as obfuscation and manipulation of transaction patterns, to bypass traditional and machine learning-based phishing detection algorithms.

This study provides a comprehensive benchmarking and evaluation of phishing detection techniques, addressing their effectiveness in meeting modern security requirements [15]. It surveys state-of-the-art methodologies, highlighting their strengths, limitations, and practical applicability in real-world scenarios. By analyzing a wide range of phishing detection algorithms and systems, the study identifies critical gaps in existing research and offers insights into improving detection accuracy, scalability, and robustness.

This study explores the vulnerabilities of machine learning-based web phishing classifiers to evasion attacks, where adversaries manipulate inputs to bypass detection systems [16]. It provides an in-depth analysis of various evasion attack strategies, such as feature manipulation, adversarial examples, and mimicry techniques, which undermine the effectiveness of these classifiers. The study also reviews state-of-the-art defense mechanisms, including adversarial training, feature hardening, and robust model architectures, aimed at enhancing the resilience of phishing detection systems.

This work addresses the challenge of detecting new forms of phishing attacks that evade traditional filters [17]. It introduces SAFE-PC, a machine learning system for phishing email detection. SAFE-PC extracts and processes email features, leveraging techniques like Natural Language Processing and Named Entity Recognition. It uses an ensemble classifier for high detection accuracy.

The study investigates the vulnerabilities of machine learning (ML) models in detecting malicious advertisement URLs [18]. The study develops a framework leveraging lexical and web-scraped features for ML-based classification and clustering. It evaluates four ML models (Random Forest, Gradient Boost, XGBoost, and AdaBoost) for their detection accuracy and robustness against adversarial attacks, specifically the Zeroth Order Optimization (ZOO) attack.

III. PROPOSED APPROACH

The proposed approach consists of two modules: feature extraction from the URLs and training of the machine learning algorithm.

Fig. 1. Architecture of Proposed System

Fig. 1 shows the architecture of the proposed system. First, the algorithm is trained on the base data set, which serves as training data. Data taken from web traffic is the input to feature extraction, which computes three types of features: URL-based, domain-based, and HTML/JS-based. The feature-extracted data acts as testing data. The machine learning model is exposed through an API, and the prediction is returned as phishing or legitimate. If the output is phishing, the website should be blocked; if the output is legitimate, it should be allowed.

A. Data Set Description

The data set consists of 88,647 URLs, of which 30,647 are phishing and 58,000 are legitimate. Phishing URLs are collected from the Kaggle web source.
- All the phishing URLs are labeled as "-1".
- All the legitimate URLs are labeled as "1".

B. Feature Extraction

The features of the data set were extracted using the Python programming language. The model is trained on three kinds of features:

1. Domain-Based Features
• DNS Record
• Website Traffic
• Age of Domain
• End Period of Domain

2. HTML and JavaScript Based Features
• IFrame Redirection
• Status Bar Customization
• Disabling Right Click
• Website Forwarding

3. Address Bar Based Features
• Domain
• IP Address
• “@” Symbol
• Length
• Depth
• Redirection “//”
• “HTTP/HTTPS” in Domain name
• Using URL Shortening Services “Tiny URL”
• Prefix or Suffix “-” in Domain

A total of 30 features are used. The data set is shuffled and then used to train the model. Fig. 2 shows a correlation heat map of how the features of the data set are related to each other.

Fig. 2. Correlation Heat Map

C. Methodology

The proposed methodology begins with the collection of data containing information such as the ‘s’ at the end of https, long URLs, short URLs, and other details required for detecting phishing in a website. This is followed by data preprocessing, which prepares the raw data for analysis using techniques such as data completion, data noise reduction, data transformation, and validation. Feature selection is then performed to identify the most relevant attributes for predicting phishing effectively. Finally, the preprocessed data is split into training and testing sets: 80% of the data is used for training and 20% for testing.
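
The shuffle and 80/20 split described in the methodology can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code (the paper lists scikit-learn's train_test_split among its tools). The helper name `shuffle_split`, the fixed seed, and the toy rows are assumptions made for the example.

```python
import random

def shuffle_split(rows, train_fraction=0.8, seed=42):
    """Shuffle rows and split them into (train, test) lists.

    Hypothetical helper for illustration; the seed only makes the
    shuffle reproducible.
    """
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Toy data set: (feature_vector, label) pairs, with -1 = phishing, 1 = legitimate.
rows = [([i, i % 3], -1 if i % 2 else 1) for i in range(10)]
train, test = shuffle_split(rows)
print(len(train), len(test))
```

Every row ends up in exactly one of the two lists, so the test set stays unseen during training.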
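
Training XGBoost on the training split means minimizing a regularized objective. For reference, it is usually written as below. The notation follows the standard XGBoost formulation and is not defined in this paper: $l$ is the loss between the label $y_i$ and the prediction $\hat{y}_i$, $f_k$ is the $k$-th tree, $T$ is its number of leaves, $w$ is its vector of leaf weights, and $\gamma$ and $\lambda$ are regularization strengths.

```latex
\mathrm{Obj} \;=\; \sum_{i=1}^{n} l\!\left(y_i, \hat{y}_i\right) \;+\; \sum_{k=1}^{K} \Omega\!\left(f_k\right),
\qquad
\Omega(f) \;=\; \gamma T \;+\; \tfrac{1}{2}\,\lambda \,\lVert w \rVert^{2}
```

The first sum is the loss function and the second is the regularization term; larger $\gamma$ or $\lambda$ penalizes trees with many leaves or large leaf weights.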

The objective function in XGBoost is like a scorecard that helps the algorithm decide how good a model is. This scorecard has two main parts:
1. Loss Function: Measures how well the model’s predictions match the actual data. This is a crucial part of the objective function, as it helps the algorithm understand how to improve its predictions.
2. Regularization Term: Prevents the model from becoming too complex and overfitting (memorizing the training data rather than learning to generalize). It is like adding a rule that you cannot use too many darts to hit the bullseye; you need to hit it in fewer, well-placed throws.

• Loss Function = Measures prediction error.
• Regularization Term = Keeps the model simple to avoid overfitting.
• Objective Function = The combination of these two to guide the model building process.

IV. IMPLEMENTATION AND RESULTS

Python is a versatile and powerful programming language that offers a wide range of tools and libraries for various purposes, making it a popular choice among developers, data scientists, and researchers. The following libraries and tools are employed in our project: Pandas, NumPy, scikit-learn (including TfidfVectorizer and train_test_split), Flask, re (regular expressions), urllib.parse, socket, and ipaddress. These played a major role in the implementation of the XGBoost model.

With the help of these tools, it became possible to extract features such as long URL, short URL, use of an IP address, presence or absence of the “@” symbol, number of redirecting “//”, subdomain occurrences, domain registration length, presence of anchor URLs, website forwarding, and many more, and to use them to classify URLs as phishing or legitimate during model development.

The feature-extracted data set is divided into training and testing data in an 80-20 ratio. The training data is used to train the XGBoost model, and the testing data is used to test it and measure the accuracy of the algorithm.

The diagram represents the workflow of a system involving user login, data extraction, and prediction phases. The process begins with the user registering or logging in, providing their login ID and password. The system verifies the credentials by looking them up in a database, allowing successfully verified users to proceed. Verified users are redirected to a prediction page, where they can enter a URL for analysis. The system extracts features from the URL and performs the prediction. The result of the prediction is displayed to the user as the last step.

Fig. 3. Implementation of antiphishing model

To develop a robust classification model, we first fine-tune the model’s settings, known as classifier parameters, to achieve the best possible performance.

Implementing a phishing attack detection system using the XGBoost (Extreme Gradient Boosting) algorithm involves several steps, from data collection and preprocessing to model evaluation and deployment:
1. Data Collection: First, collect a dataset that contains information about phishing and legitimate websites. Publicly available datasets can be used, or a new dataset can be created.
2. Data Preprocessing: The data needs to be cleaned and transformed into a format suitable for training the model. Identify and handle missing values by either removing them or imputing them with appropriate strategies. Convert raw website features (URLs, HTML content, etc.) into meaningful numerical features.
3. Data Splitting: Divide the dataset into two sets, training and testing (commonly 80% for training and 20% for testing).
4. Model Training: Install the xgboost library if it is not already installed, then implement the XGBoost classifier.
5. Hyperparameter Tuning: XGBoost has several hyperparameters that can be tuned for better performance. Grid search or random search can be used to find the optimal hyperparameters.
6. Evaluation: After training the model, evaluate its performance on the test set using metrics such as accuracy (the ratio of correctly predicted instances to the total instances) and precision (the ratio of true positives to all predicted positives). Precision is important on imbalanced datasets for evaluating the model's effectiveness in detecting phishing URLs.
7. Deployment: Once the model is trained and evaluated, it can be deployed as part of a larger phishing detection system.

Confusion Matrix:

A confusion matrix (CM) is a summary of the correct and incorrect predictions made by a classifier, and it can be used to determine the classifier's performance. In abstract terms, the CM is as shown in the following figure.

Fig. 4. Confusion matrix
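
As an illustration of how accuracy and precision follow from a confusion matrix, the sketch below counts the four cells from two label lists. It is a minimal standard-library example. Treating phishing (-1) as the positive class is an assumption made here, and the labels are invented for the example rather than taken from the paper's results.

```python
def confusion_counts(y_true, y_pred, positive=-1):
    """Return (tp, fp, fn, tn), treating `positive` as the phishing label."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1      # phishing correctly flagged
            else:
                fp += 1      # legitimate URL wrongly flagged
        else:
            if actual == positive:
                fn += 1      # phishing URL missed
            else:
                tn += 1      # legitimate URL correctly passed
    return tp, fp, fn, tn

def accuracy(tp, fp, fn, tn):
    """Correct predictions divided by all predictions."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0

def precision(tp, fp):
    """True positives divided by all predicted positives."""
    return tp / (tp + fp) if (tp + fp) else 0.0

# Invented labels for illustration: -1 = phishing, 1 = legitimate.
y_true = [-1, -1, -1, 1, 1, 1, 1, 1]
y_pred = [-1, -1, 1, 1, 1, 1, -1, 1]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn)
print(accuracy(tp, fp, fn, tn), precision(tp, fp))
```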

V. MODEL DEPLOYMENT
The classification report and the confusion matrix above show that XGBoost achieves higher accuracy than the other ML algorithms, so the trained model is stored using pickle for deployment. The model is deployed as a webpage with the help of the Flask API. The webpage contains a text box and a submit button, developed using HyperText Markup Language (HTML).
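
Storing a model with pickle and restoring it at application start can be sketched as below. To keep the example self-contained, a small dictionary of made-up parameters stands in for the trained XGBoost classifier. The names `model`, `predict`, and `max_length`, and the length threshold, are hypothetical and are not the authors' model.

```python
import pickle

# Stand-in for the trained classifier (NOT XGBoost): made-up parameters.
model = {"max_length": 75}

def predict(model, rows):
    """Return -1 (phishing) when the first feature, assumed to be the
    URL length, exceeds the stored threshold, otherwise 1 (legitimate)."""
    return [-1 if row[0] > model["max_length"] else 1 for row in rows]

# Serialize the model once, after training. In practice the bytes would
# be written to a file, for example with pickle.dump(model, file_handle).
blob = pickle.dumps(model)

# Restore the model when the web application starts.
restored = pickle.loads(blob)
print(predict(restored, [[120], [30]]))
```

Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.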
Whenever a URL is entered and the submit button is clicked, the URL is processed by the model, which returns a value of 1 or -1. If the returned value is ‘1’, the output is displayed as “Legitimate”; if the returned value is ‘-1’, the output is displayed as “Phishing”.

Fig. 9. Shows detection of legitimate website
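
Before the model can return 1 or -1, the submitted URL has to be turned into numeric features. The sketch below computes a few of the address-bar features from Section III-B with urllib.parse and ipaddress, both listed among the project's tools, and maps the model output to the displayed text. It is an illustrative subset, not the authors' extractor. The function names, the 0/1 encoding, and the short list of shortening services are assumptions.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical, incomplete list of URL shortening services.
SHORTENERS = {"tinyurl.com", "bit.ly", "t.co", "goo.gl"}

def has_ip_host(host):
    """True if the host part of the URL is a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False

def address_bar_features(url):
    """Compute a few address-bar features for one URL (0/1 flags and counts)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    path_segments = [seg for seg in parsed.path.split("/") if seg]
    return {
        "uses_ip": int(has_ip_host(host)),
        "has_at_symbol": int("@" in url),
        "length": len(url),
        "depth": len(path_segments),
        # A "//" beyond the scheme separator suggests an embedded redirect.
        "redirection": int(url.rfind("//") > 7),
        "https_in_domain": int("http" in host),
        "uses_shortener": int(host in SHORTENERS),
        "prefix_suffix_dash": int("-" in host),
    }

def label_for(prediction):
    """Map the model output (1 or -1) to the text shown to the user."""
    return "Legitimate" if prediction == 1 else "Phishing"

print(address_bar_features("http://secure-bank.example.com//redirect"))
print(label_for(-1))
```

The resulting dictionary would be converted to a feature vector, in the same column order used during training, before being passed to the model.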

The Admin Login page allows the user to enter the username and password. The Home page contains a description of the website scanner. The website predictor page, labeled "Web Scanner", allows users to enter relevant data: users input their data and press “check here” to determine whether the entered website is phishing or legitimate. For a legitimate website, after the user has entered the URL in the predictor page, the system predicts a legitimate website based on the features. For a phishing website, after the user has entered the URL in the predictor page, the system predicts a phishing website based on the features, alerts the user that the website is unsafe, and asks whether they want to continue even though it is unsafe. The Contact Us page stores admin details such as the admin name, email, and contact number, through which the user can contact the admin.

Fig. 5. Shows Admin Login page

Fig. 6. Shows About Us page

VI. CONCLUSION
This paper develops a machine learning model that detects phishing URLs and warns the user in advance. The features of the URL entered by the user in the respective field are extracted and act as input data for the machine learning model. The model processes this input and outputs whether the URL is phishing or legitimate. The algorithm used to build this model is XGBoost. After training, the accuracy of the algorithm is 97.0%.

Fig. 7. Shows Web Scanner Page

Fig. 8. Shows detection of phishing website

VII. REFERENCES
[1]. M. Patil, N. Shivsharan, Y. Naik, H. Yeram and A. Gawade, "Enhancing Cybersecurity: A Comprehensive Analysis of Machine Learning Techniques in Detecting and Preventing Phishing Attacks with a Focus on Xgboost Algorithm," 2024 International Conference on Intelligent Systems for Cybersecurity (ISCS), Gurugram, India, 2024.
[2]. N. Ghatasheh, I. Altaharwa and K. Aldebei, "Modified Genetic Algorithm for Feature Selection and Hyper Parameter Optimization: Case of XGBoost in Spam Prediction," in IEEE Access, vol. 10, pp. 84365-84383, 2022.
[3]. L. Allodi, T. Chotza, E. Panina and N. Zannone, "The Need for New Antiphishing Measures Against Spear-Phishing Attacks," in IEEE Security & Privacy, vol. 18, no. 2, pp. 23-34, March-April 2020.
[4]. M. Sameen, K. Han and S. O. Hwang, "PhishHaven-An Efficient Real-Time AI Phishing URLs Detection System," in IEEE Access, vol. 8, pp. 83425-83443, 2020.
[5]. J. Mao, W. Tian, P. Li, T. Wei and Z. Liang, "Phishing-Alarm: Robust and Efficient Phishing Detection via Page Component Similarity," in IEEE Access, vol. 5, pp. 17020-17030, 2017.
[6]. A. Mandadi, S. Boppana, V. Ravella and R. Kavitha, "Phishing Website Detection Using Machine Learning," 2022 IEEE 7th International conference for Convergence in Technology (I2CT), Mumbai, India.
[7]. P. T. Duy, V. Q. Minh, B. T. H. Dang, N. D. H. Son, N. H. Quyen and V.-H. Pham, "A Study on Adversarial Sample Resistance and Defense Mechanism for Multimodal Learning-Based Phishing Website Detection," in IEEE Access, vol. 12, pp. 137805-137824, 2024.
[8]. F. Castaño, E. F. Fernández, R. Alaiz-Rodríguez and E. Alegre, "PhiKitA: Phishing Kit Attacks Dataset for Phishing Websites Identification," in IEEE Access, vol. 11, pp. 40779-40789, 2023.
[9]. M. Almousa and M. Anwar, "A URL-Based Social Semantic Attacks Detection With Character-Aware Language Model," in IEEE Access, vol. 11, pp. 10654-10663, 2023.
[10]. H. Shirazi, S. R. Muramudalige, I. Ray, A. P. Jayasumana and H. Wang, "Adversarial Autoencoder Data Synthesis for Enhancing Machine Learning-Based Phishing Detection Algorithms," in IEEE Transactions on Services Computing, vol. 16, no. 4, pp. 2411-2422, 1 July-Aug. 2023.
[11]. L. Jovanovic et al., "Improving Phishing Website Detection using a Hybrid Two-level Framework for Feature Selection and XGBoost Tuning," in Journal of Web Engineering, vol. 22, no. 3, pp. 543-574, May 2023.
[12]. N. Megha, K. R. Remesh Babu and E. Sherly, "An Intelligent System for Phishing Attack Detection and Prevention," 2019 International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 2019, pp. 1577-1582.
[13]. M. A. Ivanov, B. V. Kliuchnikova, I. V. Chugunkov and A. M. Plaksina, "Phishing Attacks and Protection Against Them," 2021 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (ElConRus), St. Petersburg, Moscow, Russia.
[14]. H. Wen, J. Fang, J. Wu and Z. Zheng, "Hide and Seek: An Adversarial Hiding Approach Against Phishing Detection on Ethereum," in IEEE Transactions on Computational Social Systems, vol. 10, no. 6, pp. 3512-3523, Dec. 2023.
[15]. A. El Aassal, S. Baki, A. Das and R. M. Verma, "An In-Depth Benchmarking and Evaluation of Phishing Detection Research for Security Needs," in IEEE Access, vol. 8, pp. 22170-22192, 2020.
[16]. M. J. Pillai, S. Remya, V. Devika, S. Ramasubbareddy and Y. Cho, "Evasion Attacks and Defense Mechanisms for Machine Learning-Based Web Phishing Classifiers," in IEEE Access, vol. 12, pp. 19375-19387, 2024, doi: 10.1109/ACCESS.2023.
[17]. C. N. Gutierrez et al., "Learning from the Ones that Got Away: Detecting New Forms of Phishing Attacks," in IEEE Transactions on Dependable and Secure Computing, vol. 15, no. 6, pp. 988-1001, 1 Nov.-Dec. 2018.
[18]. E. Nowroozi, Abhishek, M. Mohammadi and M. Conti, "An Adversarial Attack Analysis on Malicious Advertisement URL Detection Framework," in IEEE Transactions on Network and Service Management, vol. 20, no. 2, pp. 1332-1344, June 2023.

7 pages
Phishdect: An Optimised Deep Neural Network Algorithm For Detecting Phishing Attacks in Online Platform
No ratings yet
Phishdect: An Optimised Deep Neural Network Algorithm For Detecting Phishing Attacks in Online Platform
7 pages
1 PB
No ratings yet
1 PB
11 pages
Phishing Detection with Decision Trees
No ratings yet
Phishing Detection with Decision Trees
7 pages
A Machine Learning Based Approach For Phishing Detection Using
No ratings yet
A Machine Learning Based Approach For Phishing Detection Using
14 pages
Batch-5 Journal-6 ECE-D New
No ratings yet
Batch-5 Journal-6 ECE-D New
6 pages
Abedin 2020
No ratings yet
Abedin 2020
6 pages
A Sophisticated Framework For The Accurate Detection of Phishing Websites
No ratings yet
A Sophisticated Framework For The Accurate Detection of Phishing Websites
23 pages
Enhanced Phishing Website Detection: Leveraging Random Forest and XGBoost Algorithms With Hybrid Features
No ratings yet
Enhanced Phishing Website Detection: Leveraging Random Forest and XGBoost Algorithms With Hybrid Features
4 pages
Applsci 13 08756 v2
No ratings yet
Applsci 13 08756 v2
19 pages
Avanti Kumari - A Report
No ratings yet
Avanti Kumari - A Report
39 pages
1229-Article Text-12170-1-10-20250203-2
No ratings yet
1229-Article Text-12170-1-10-20250203-2
13 pages
ML for Phishing Detection
No ratings yet
ML for Phishing Detection
8 pages
Phishing Attacks Detection A Machine Learning-Based Approach
No ratings yet
Phishing Attacks Detection A Machine Learning-Based Approach
6 pages
Final PPT - Phishing Website
100% (2)
Final PPT - Phishing Website
23 pages
Final Research Paper
No ratings yet
Final Research Paper
6 pages
Phishing Detection via ML Stacking
No ratings yet
Phishing Detection via ML Stacking
16 pages
Mandadi 2022
No ratings yet
Mandadi 2022
4 pages
Batch-5 ECE-D
No ratings yet
Batch-5 ECE-D
4 pages
RRSV Torts
100% (1)
RRSV Torts
14 pages
Week One Acct 444 Multiple
No ratings yet
Week One Acct 444 Multiple
12 pages
Corporate Branding Solutions
No ratings yet
Corporate Branding Solutions
32 pages
Manage Meetings Assessment Guide
No ratings yet
Manage Meetings Assessment Guide
11 pages
RMI Water Resources and Quality Monitoring
No ratings yet
RMI Water Resources and Quality Monitoring
31 pages
Mumbai
No ratings yet
Mumbai
7 pages
Additive Manufacturing Essentials - WEB
No ratings yet
Additive Manufacturing Essentials - WEB
179 pages
Employees' Compensation Act, 1973
No ratings yet
Employees' Compensation Act, 1973
15 pages
Constitution-Trade, Commerce and Intercourse
No ratings yet
Constitution-Trade, Commerce and Intercourse
12 pages
Audacity
No ratings yet
Audacity
6 pages
2025 Exams Time Table Updated First Draft - 035601
No ratings yet
2025 Exams Time Table Updated First Draft - 035601
3 pages
SR-2000 Manuale Completo
No ratings yet
SR-2000 Manuale Completo
118 pages
BDP Content
No ratings yet
BDP Content
4 pages
11) Building Code of Pakistan
No ratings yet
11) Building Code of Pakistan
267 pages
Liquidity Management and Profitability: A Case Study of Listed Manufacturing Companies in Sri Lanka
100% (1)
Liquidity Management and Profitability: A Case Study of Listed Manufacturing Companies in Sri Lanka
5 pages
Emaar South: Luxury Living in Dubai
No ratings yet
Emaar South: Luxury Living in Dubai
1 page
ER Model and Database Design
No ratings yet
ER Model and Database Design
40 pages
Report 3 PDF
No ratings yet
Report 3 PDF
55 pages
Project Report Of: Leo Multiple District 306, Sri Lanka
No ratings yet
Project Report Of: Leo Multiple District 306, Sri Lanka
16 pages
SVAN 956 Short Manual
No ratings yet
SVAN 956 Short Manual
3 pages
Cathode Materials
No ratings yet
Cathode Materials
8 pages
UCSD UPS Program Orientation Guide
No ratings yet
UCSD UPS Program Orientation Guide
4 pages
Financial Literacy Impact on ABM Students
No ratings yet
Financial Literacy Impact on ABM Students
62 pages
Dissertation School Uniform
100% (2)
Dissertation School Uniform
5 pages
Quanti
No ratings yet
Quanti
6 pages
India Tractor Demand Forecasting
No ratings yet
India Tractor Demand Forecasting
23 pages
Job Satisfaction
No ratings yet
Job Satisfaction
3 pages
EC6201D-Digital Integrated Circuit Design Dhanaraj K. J. Assistant Professor ECED, NIT Calicut
No ratings yet
EC6201D-Digital Integrated Circuit Design Dhanaraj K. J. Assistant Professor ECED, NIT Calicut
13 pages
Corporate Crime
No ratings yet
Corporate Crime
4 pages
School Client Feedback Survey Form
No ratings yet
School Client Feedback Survey Form
6 pages


