
2018 Sixteenth International Conference on ICT and Knowledge Engineering

Anomaly-Based Network Intrusion Detection System through Feature Selection and Hybrid Machine Learning Technique

Apichit Pattawaro
Master of Science in Information Technology, Department of Computer Science, Faculty of Science, Srinakharinwirot University, Bangkok, Thailand
[email protected]

Chantri Polprasert
Master of Science in Information Technology, Department of Computer Science, Faculty of Science, Srinakharinwirot University, Bangkok, Thailand
[email protected]

Abstract—In this paper, we propose an anomaly-based network intrusion detection system based on a combination of feature selection, K-Means clustering and an XGBoost classification model. We test the performance of our proposed system over the NSL-KDD dataset using the KDDTest+ dataset. A feature selection method based on attribute ratio (AR) [14] is applied to construct a reduced feature subset of the NSL-KDD dataset. After applying K-Means clustering, hyperparameter tuning of the classification model corresponding to each cluster is implemented. Using only 2 clusters, our proposed model obtains an accuracy of 84.41%, with a detection rate of 86.36% and a false alarm rate of 18.20% on the KDDTest+ dataset. The performance of our proposed model outperforms those obtained using the recurrent neural network (RNN)-based deep neural network and other tree-based classifiers. In addition, due to feature selection, our proposed model employs only 75 out of 122 features (61.47%) to achieve a level of performance comparable to models trained on the full set of features.

Keywords—Hybrid Clustering and Classification, NSL-KDD, network security
I. INTRODUCTION

Network Intrusion Detection Systems (NIDS) [1, 2] play a crucial role in protecting computer systems from network-based malicious attacks that could disrupt the services of the system. Providing a powerful and robust NIDS is a very challenging task due to several factors. For instance, with the growth of Internet traffic, which consists of a variety of data types traversing the network, a NIDS must be able to analyze these huge volumes of traffic and differentiate between normal and malicious behavior with acceptable accuracy. Typically, NIDS are broadly categorized into 2 types: 1) misuse-based systems (sometimes called signature-based) and 2) anomaly-based systems.

A misuse-based NIDS relies on an extensive database of attack signatures. Each signature is a set of rules corresponding to intrusion attacks that occurred in the past. Therefore, with an up-to-date signature database, a misuse-based system is very powerful in detecting past attacks. However, such a system is vulnerable to zero-day attacks and suffers from long processing times. An anomaly-based IDS detects attacks on the computer system by observing unusual traffic statistics or patterns. This system can be used to detect zero-day attacks, and results from the discovery of an attack can be added into the database for future detection through the misuse-based approach. However, an anomaly-based system suffers from a high false alarm rate, since normal traffic exhibiting unusual behavior can trigger the alarm of the system. In practical systems, a hybrid of the misuse-based and anomaly-based approaches is usually employed to mitigate the impact of both zero-day attacks and high false alarm rates.

Recently, anomaly-based NIDS employing machine learning techniques have gained widespread attention. A number of classification models, e.g. KNN [3], genetic algorithms [4], Random Forest (RF) [5, 6], Support Vector Machine (SVM) [7] and Extreme Gradient Boosting (XGBoost) [8], have been used to differentiate between normal and suspicious traffic. However, the feasibility of this approach is still limited due to poor detection performance. This could be due to many reasons, such as the diverse nature of traffic, imbalanced traffic classes and ineffective feature selection processes. To overcome this limitation, a number of papers utilize deep neural network (DNN) methods such as the recurrent neural network (RNN) [9], long short-term memory (LSTM) [10] and convolutional neural network (CNN) [11] for anomaly-based NIDS. Even though the DNN approaches exhibit enhanced detection performance, they require a significant amount of training time and a large volume of effective training data.

A hybrid approach for anomaly-based NID [12] is another promising alternative. By combining traditional ML models, significant performance improvements in terms of accuracy, precision and false alarm rate (FAR) are exhibited with acceptable additional computational complexity.

In this paper, we focus on the NID binary classification problem, where the system differentiates between normal and attack activities. We propose an anomaly-based network intrusion detection system based on a combination of feature selection, K-Means clustering and an XGBoost classification model. The reasons behind the selection of the XGBoost classifier are its strong performance, its variety of tunable hyperparameters, its fast implementation and its popularity among machine learning communities. We test the performance of our proposed system over the NSL-KDD dataset [13] using the KDDTest+ dataset. A feature selection method based on Attribute Ratio (AR) [14] is applied to construct a reduced feature subset of the NSL-KDD dataset. After applying K-Means clustering, hyperparameter tuning of the classification model corresponding to each cluster is implemented. Using only 2 clusters, our proposed model obtains a best accuracy of 84.41%, TPR of 86.36%, FPR of 18.20% and AUC of 0.922 for the KDDTest+ dataset. In addition, the performance of our proposed model outperforms those obtained from the RNN-based deep neural network and other tree-based classifiers. Moreover, due to feature selection, our proposed model employs only 75 out of 122 features (61.47%) to achieve a level of performance comparable to models trained on the full set of features.

The remainder of this paper is organized as follows. Section 2 discusses the NSL-KDD dataset. The proposed system is explained in Section 3. Results are evaluated and discussed in Section 4, and our findings are summarized in Section 5.
2. NSL-KDD DATASET

Previously, the KDD-Cup 99 dataset [15] was widely used to test the performance of anomaly-based intrusion detection systems. However, researchers [15] pointed out 2 critical issues, based on a statistical analysis of the dataset, that lead to over-simplistic prediction results. To circumvent this problem, they proposed the NSL-KDD dataset, which has the following advantages over KDD-Cup 99:

1) In NSL-KDD, many redundant and duplicate records encountered in KDD-Cup 99 are removed from the datasets.

2) To achieve a more accurate evaluation of different learning techniques, the number of selected records from each difficulty-level group is inversely proportional to the percentage of records in the original KDD dataset.

The NSL-KDD dataset is divided into 4 separate datasets:

1) KDDTrain+: This is the overall train dataset, consisting of 125,973 records.

2) KDDTrain+_20Percent: This is the train dataset consisting of only 20% of the total train dataset; it has 25,192 records.

3) KDDTest+: This is the test dataset, consisting of 22,544 records.

4) KDDTest-21: This dataset consists of 11,850 records. It is obtained by applying 21 machine learning models to the KDDTest+ dataset to predict the label of each record (Normal/Attack); records that are accurately predicted by all 21 models are discarded.

Table 2 lists the ratio of normal/attack records for each dataset in NSL-KDD. NSL-KDD categorizes attacks into 4 types, consisting of Denial-of-Service, Probe, Remote to Local and User to Root, as presented in Table 1. Each type of attack can be explained as follows:

1) Denial-of-Service (DoS): This type of attack overwhelms the target's resources (network, CPU or memory) so that typical operations cannot be performed as expected. Examples include sending a huge number of packets to the targeted server so that normal users cannot access it.

2) Probe: This type of attack involves port scanning to identify vulnerabilities in computer systems for further attacks.

3) Remote to Local (R2L): The attackers try to access unauthorized computer resources in order to destroy or modify operations of the targeted computer systems.

4) User to Root (U2R): The attackers try to gain access to unauthorized resources using root privileges.

Table 1. Type of Attacks

Category                | Attacks
DoS (Denial of Service) | neptune, pod, smurf, teardrop, processtable, apache2, mailbomb, back
Probe                   | ipsweep, nmap, portsweep, satan, mscan, saint
R2L (Remote to Local)   | warezmaster, multihop, httptunnel, ftp_write, named, snmpgetattack, xlock, sendmail, guess_passwd
U2R (User to Root)      | buffer_overflow, rootkit, ps, xterm

Table 2. Ratio of Normal/Attack records in each NSL-KDD dataset

Dataset    |         | Total  | Normal | Anomaly
KDDTrain+  | Number  | 125973 | 67343  | 58630
           | Percent | 100%   | 53.46% | 46.54%
KDDTest+   | Number  | 22544  | 9711   | 12833
           | Percent | 100%   | 43.08% | 56.92%
KDDTest-21 | Number  | 11850  | 4342   | 7508
           | Percent | 100%   | 36.64% | 63.36%

The NSL-KDD dataset contains 41 features categorized into 3 types: 3 nominal features, 6 binary features and 32 numeric features [13].
3. METHODS

Figure 1 illustrates the block diagram of the proposed NIDS model. We employ the KDDTrain+ dataset to train the proposed ML model and evaluate its performance in terms of accuracy, AUC, precision and recall using the KDDTest+ dataset.
Figure 1. Block diagram of the proposed NIDS model.

3.1) Data Preprocessing

There are three sub-processes within Data Preprocessing: One-Hot Encoding, Scaling and Feature Selection.

3.1.1) One-Hot Encoding and Scaling

We apply One-Hot Encoding to transform the 3 nominal features, protocol_type, service and flag, into 84 binary features (protocol_type yields 3 features, service yields 70 features and flag yields 11 features). In summary, after the One-Hot Encoding process, there are 122 features entering the Normalization process. During Normalization, we scale the dataset so that the mean of every feature is equal to zero and its standard deviation is equal to one.
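To make this step concrete, the following is a minimal sketch of the preprocessing stage using pandas and scikit-learn. The DataFrame names train_df and test_df are hypothetical placeholders; the paper does not publish its implementation.

import pandas as pd
from sklearn.preprocessing import StandardScaler

# The 3 nominal NSL-KDD features expanded by One-Hot Encoding.
nominal = ['protocol_type', 'service', 'flag']

# One-hot encode; align the test columns to the training layout so the
# feature count stays at 122 even if a category is absent from the test set.
X_train = pd.get_dummies(train_df.drop(columns=['label']), columns=nominal)
X_test = pd.get_dummies(test_df.drop(columns=['label']), columns=nominal)
X_test = X_test.reindex(columns=X_train.columns, fill_value=0)

# Scale every feature to zero mean and unit standard deviation.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)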

3.1.2) Feature Selection

To enhance model efficiency, reduce computational complexity and remove irrelevant features, we implement feature selection based on calculating the AR. This value is used to determine the importance of every feature, and its calculation can be explained as follows. In the AR approach, we employ the attribute average for numeric features and the attribute frequency for binary features. The AR of the jth feature, AR(j), is calculated as

    AR(j) = max_{i ∈ [0,4]} CR_i(j)                              (1)

where CR_i(j) is the Class Ratio of the jth feature for the ith label (i ∈ [0,4], where i = 0 for the Normal class, i = 1 for DoS, i = 2 for Probe, i = 3 for R2L and i = 4 for U2R). For the jth numeric feature, CR_i(j) is expressed as

    CR_i(j) = AVG_{i,j} / AVG_j                                  (2)

where AVG_{i,j} = C_{i,j} / N_{i,j}; C_{i,j} is the sum of the jth feature over records with the ith label, and N_{i,j} is the number of records with the ith label. The denominator

    AVG_j = (Σ_i C_{i,j}) / N_j                                  (3)

is the sum of the jth feature over all records divided by the total number of records (N_j). For the jth binary feature, CR_i(j) is written as

    CR_i(j) = Freq(1)_{i,j} / Freq(0)_{i,j}                      (4)

where Freq(1)_{i,j} is the number of records with the ith label whose jth feature is equal to one, and Freq(0)_{i,j} is the number of records with the ith label whose jth feature is equal to zero. Figure 2 displays the top ten highest-importance features obtained from Eq. (1). Features whose AR values are less than 0.01 are removed from the analysis. The threshold 0.01 is judiciously selected to obtain the best performance with acceptable computational complexity. After applying the feature selection method with a threshold of 0.01, only 75 out of 122 features (61.47%) are left to train the model.

Figure 2. A list of the top ten highest-importance features.
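As an illustration of Eqs. (1)-(4), the sketch below computes AR(j) with pandas under our reading of the formulas; X, y and binary_cols are placeholders, and this is not the authors' code.

import numpy as np
import pandas as pd

def attribute_ratio(X: pd.DataFrame, y: pd.Series, binary_cols) -> pd.Series:
    ar = {}
    for j in X.columns:
        col = X[j]
        crs = []
        for i in sorted(y.unique()):
            in_class = col[y == i]
            if j in binary_cols:
                # Eq. (4): ratio of ones to zeros within class i.
                ones = (in_class == 1).sum()
                zeros = (in_class == 0).sum()
                crs.append(ones / zeros if zeros > 0 else np.inf)
            else:
                # Eq. (2): class average divided by the overall average, Eq. (3).
                avg_ij = in_class.mean()
                avg_j = col.mean()
                crs.append(avg_ij / avg_j if avg_j != 0 else 0.0)
        ar[j] = max(crs)  # Eq. (1): AR(j) = max_i CR_i(j)
    return pd.Series(ar)

# Keep only features meeting the 0.01 threshold from the paper:
# selected = ar_values[ar_values >= 0.01].index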
3.1.3) Clustering

The main objective of applying K-Means clustering to the NSL-KDD dataset is to group normal and attack traffic that exhibit similar patterns into the same partitions. Then, the ML model corresponding to each partition is trained to differentiate normal from attack data within that group. To determine the number of clusters (K), we run K-Means clustering on the NSL-KDD dataset with different numbers of clusters and evaluate performance using the sum of squared errors (SSE). Figure 3 shows the SSE of K-Means clustering over a range of K, where random_state is equal to 3425. From the figure, SSE yields its highest drop (23.89%) when K is increased from 1 to 2. SSE gradually decreases for higher values of K (the SSE drop is reduced to 9.30% when K is increased from 2 to 3). In addition, no significant number of records is assigned to a new group when K is greater than 10.

Figure 3. SSE of K-Means clustering vs. the number of clusters (K).
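This elbow analysis can be reproduced with scikit-learn, where KMeans exposes the SSE as inertia_; the sketch below uses random_state=3425 from the paper, while the K range of 1-10 is our assumption based on the discussion above.

from sklearn.cluster import KMeans

sse = {}
for k in range(1, 11):
    km = KMeans(n_clusters=k, random_state=3425)  # random_state from the paper
    km.fit(X_train_scaled)
    sse[k] = km.inertia_  # sum of squared distances to the closest centroid

# The relative drop between successive K values identifies the elbow (K=2).
drops = {k: (sse[k - 1] - sse[k]) / sse[k - 1] for k in range(2, 11)}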
3.1.4) XGBoost Classifier

XGBoost was designed for speed and performance based on gradient-boosted decision tree algorithms. It provides the benefit of algorithmic enhancements and model tuning, and it can be deployed in different computing environments. In addition, it allows the addition or tuning of regularization parameters to mitigate the impact of over-fitting.
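To make the cluster-then-classify flow concrete, below is a minimal sketch (our construction, not the authors' released code) that trains one XGBoost model per K-Means cluster and routes each test record to the model of its assigned cluster; y_train is a placeholder label vector.

import numpy as np
from sklearn.cluster import KMeans
from xgboost import XGBClassifier

# Partition the training data into K=2 clusters, as chosen in Section 3.1.3.
kmeans = KMeans(n_clusters=2, random_state=3425).fit(X_train_scaled)
train_cluster = kmeans.labels_

# Train a separate classifier on each cluster's records.
models = {}
for c in range(2):
    mask = train_cluster == c
    model = XGBClassifier()  # tuned per cluster in the paper (see Table 3)
    model.fit(X_train_scaled[mask], y_train[mask])
    models[c] = model

# Route each test record to the model of its nearest cluster.
test_cluster = kmeans.predict(X_test_scaled)
y_pred = np.empty(len(X_test_scaled), dtype=int)
for c in range(2):
    mask = test_cluster == c
    y_pred[mask] = models[c].predict(X_test_scaled[mask])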
3.2) EVALUATION METRICS

We evaluate the performance of our proposed model using Accuracy, True Positive Rate (TPR, or Recall) and False Positive Rate (FPR). In addition, we use the Area Under the Curve (AUC) as an overall measure of performance across all possible classification thresholds. The accuracy metric can be written as

    Accuracy = (TP + TN) / (TP + FN + TN + FP)                   (5)

where TP, TN, FP and FN represent True Positives, True Negatives, False Positives and False Negatives, respectively. Recall, or TPR, is the ratio of items correctly classified as attack to all items that are actually attacks, and can be written as

    TPR = TP / (TP + FN)                                         (6)

FPR is the ratio of items incorrectly classified as attack to all items that belong to the normal class, and can be written as

    FPR = FP / (FP + TN)                                         (7)
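For reference, Eqs. (5)-(7) map directly onto a confusion matrix; the sketch below (our construction) computes them with scikit-learn, where y_true, y_pred and y_score are placeholders and the attack class is the positive label.

from sklearn.metrics import confusion_matrix, roc_auc_score

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy = (tp + tn) / (tp + fn + tn + fp)   # Eq. (5)
tpr = tp / (tp + fn)                         # Eq. (6), detection rate / recall
fpr = fp / (fp + tn)                         # Eq. (7), false alarm rate
auc = roc_auc_score(y_true, y_score)         # threshold-free overall measure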
4. RESULTS

We compare the performance of our proposed hybrid ML system in terms of accuracy, TPR, FPR and AUC with the RNN approach and other tree-based classifiers. We use the KDDTrain+ dataset to train our model and evaluate its performance over the KDDTest+ dataset. Our hybrid model employs feature selection and K-Means clustering followed by an XGBoost prediction model to differentiate between normal and attack traffic. We perform feature selection based on AR values, where features with an AR value less than 0.01 are discarded from the analysis. This threshold is judiciously selected to obtain maximum performance. For K-Means clustering, as presented in Fig. 3, we use two clusters (K=2) due to the steepest drop in SSE. For hyperparameter tuning of the XGBoost model corresponding to each cluster, we employ the RandomizedSearchCV algorithm to select hyperparameters that yield the best model performance in terms of accuracy. The set of hyperparameters we are interested in is presented in Figure 4. The following are some of the hyperparameters we use in our model:

- n_estimators: The total number of trees used in the model.
- max_depth: The maximum depth of each tree. The higher the value, the more complex the model becomes.
- learning_rate: This parameter mitigates the over-fitting problem. It controls the step-size shrinkage and weighting factors for corrections when adding new trees to the model.
- subsample: The fraction of samples to be randomly selected for each tree.
- colsample_bytree: The fraction of columns to be randomly sampled for each tree.
- colsample_bylevel: The subsample ratio of columns for each split, at each depth level of the tree.
- min_child_weight: The minimum sum of weights of all observations required in a child. It is used to control over-fitting.
- gamma: The minimum loss reduction required to make a split.
- reg_lambda: The L2 regularization term on weights. It is used to manage the regularization part of the XGBoost loss function.
param_grid = {
    'n_estimators': [100, 200],
    'max_depth': [5, 10, 15, 20],
    'learning_rate': [0.001, 0.01, 0.1, 0.2, 0.3],
    'subsample': [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
    'colsample_bytree': [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
    'colsample_bylevel': [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
    'min_child_weight': [0.4, 0.5, 1.0, 3.0, 5.0, 7.0, 10.0],
    'gamma': [0, 0.25, 0.5, 1.0],
    'reg_lambda': [0.1, 1.0, 5.0, 10.0, 50.0, 100.0]
}

Figure 4. List of XGBoost hyperparameters.
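A minimal sketch of how this per-cluster search could be set up with scikit-learn and xgboost is shown below; X_c and y_c stand for one cluster's training records, and n_iter is our assumption, since the paper does not state the number of sampled configurations.

from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier

search = RandomizedSearchCV(
    estimator=XGBClassifier(objective='binary:logistic'),
    param_distributions=param_grid,  # the grid from Figure 4
    n_iter=50,           # number of sampled hyperparameter sets (assumption)
    scoring='accuracy',  # the paper tunes for accuracy
    cv=5,                # 5-fold cross-validation, as in the paper
    random_state=0,
)
search.fit(X_c, y_c)
best_model_for_cluster = search.best_estimator_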
For each set of hyperparameters, we implement 5-fold cross-validation to validate the performance of our model in each cluster. Table 3 lists the selected set of XGBoost hyperparameters for each cluster. From Table 3, after hyperparameter tuning, both clusters yield identical hyperparameters.

Table 3. A list of XGBoost hyperparameters for each cluster.

With 2 clusters employing the hyperparameters listed in Table 3, the performance of our proposed model is exhibited in Table 4. From the table, the proposed model with K=2 yields accuracy equal to 84.41%, TPR = 86.36%, FPR = 18.20% and AUC = 0.84. We compared the performance of our model over a range of cluster counts and found that K=2 yields the highest accuracy and AUC. TPR and FPR are of the same order over the range of cluster counts presented in Table 4. This justifies our selection of K = 2.

Table 4. Performance of the proposed model in terms of accuracy, TPR and FPR over a range of K-Means cluster counts.

Table 5 lists the top ten highest-importance features of both clusters. From the table, both clusters exhibit a similar pattern, where the src_bytes feature yields the highest importance for both clusters.

Table 5. A list of the top ten highest-importance features of cluster 0 and cluster 1 on the KDDTrain+ dataset

Feature Name                 | Cluster 0 | Cluster 1
src_bytes                    | 0.140192  | 0.132889
dst_bytes                    | 0.074614  | 0.062317
dst_host_srv_count           | 0.063247  | 0.070306
dst_host_diff_srv_rate       | 0.056252  | 0.060719
dst_host_count               | 0.052171  | 0.057257
dst_host_same_src_port_rate  | 0.050131  | 0.045806
duration                     | 0.042845  | 0.047137
dst_host_rerror_rate         | 0.041387  | 0.040213
dst_host_same_srv_rate       | 0.041096  | 0.045806
dst_host_srv_diff_host_rate  | 0.037015  | 0.033289

The performance of our proposed model in terms of accuracy is compared with those from the RNN and other tree-based classifiers (Random Forest and Adaboost) in Table 6. From the table, our proposed model obtains the highest accuracy compared to the others for both the KDDTrain+ and KDDTest+ datasets. Due to clustering, each XGBoost classifier is trained using data exhibiting a similar pattern within its cluster, and this improves the detection performance compared to that obtained by training the classifier without clustering [16] (Accuracy = 77%, TPR = 62% and FPR = 3%). This could be one of the main reasons that contribute to its superiority over the RNN model. In comparison with the RF and Adaboost models, with more customizable hyperparameter selection, the proposed model yields superior performance. In addition, by implementing feature selection using AR, our proposed model uses only 75 out of 122 features (61.47%) to achieve strong performance comparable to models trained on the full 122 features.
Table 6. Comparison of the accuracy metric

Model                        | KDDTrain+ | KDDTest+
Baseline                     | 99.81%    | 83.28%
K-Means+XGBoost (Our model)  | 99.85%    | 84.41%
K-Means+Random Forest        | 99.67%    | 75.67%
K-Means+Adaboost             | 99.61%    | 72.64%

5. CONCLUSIONS

We proposed a hybrid machine learning technique for network intrusion detection based on a combination of feature selection, K-Means clustering and XGBoost classification models. We tested the performance of our proposed system over the NSL-KDD (KDDTrain+, KDDTest+) dataset. A feature selection method based on AR is applied to construct a reduced feature subset of the NSL-KDD dataset. After applying K-Means clustering, hyperparameter tuning of the classification model corresponding to each cluster is implemented. Our proposed model obtains a best accuracy of 84.41%, with a detection rate of 86.36%, a false alarm rate of 18.20% and an AUC of 0.84 for the KDDTest+ dataset. In addition, the performance of our proposed model in terms of accuracy outperforms that obtained from the RNN-based deep neural network. Due to feature selection, our proposed model employs only 75 out of 122 features (61.47%) to achieve a level of performance comparable to models trained on the full set of features.

REFERENCES

[1] Buczak, Anna L., and Erhan Guven. "Using Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection." International Journal of Recent Trends in Engineering and Research, vol. 3, no. 4, 2017, pp. 109-111.

[2] Vemuri, V. Rao. "Cyber-Security and Cyber-Trust." Enhancing Computer Security with Smart Technology, 2005, pp. 1-8.

[3] Rao, B. Basaveswara, and K. Swathi. "Fast KNN Classifiers for Network Intrusion Detection System." Indian Journal of Science and Technology, vol. 10, no. 14, Jan. 2017, pp. 1-10.

[4] Li, Fan. "Hybrid Neural Network Intrusion Detection System Using Genetic Algorithm." 2010 International Conference on Multimedia Technology, 2010, pp. 597-602.

[5] Farnaaz, Nabila, and M. A. Jabbar. "Random Forest Modeling for Network Intrusion Detection System." Procedia Computer Science, vol. 89, 2016, pp. 213-217.

[6] Zhang, Jiong, et al. "Random-Forests-Based Network Intrusion Detection Systems." IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 38, no. 5, 2008, pp. 649-659.

[7] Pervez, Muhammad Shakil, and Dewan Md. Farid. "Feature Selection and Intrusion Classification in NSL-KDD Cup 99 Dataset Employing SVMs." The 8th International Conference on Software, Knowledge, Information Management and Applications (SKIMA 2014), 2014.

[8] Chen, Tianqi, and Carlos Guestrin. "XGBoost: A Scalable Tree Boosting System." Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16), 2016, pp. 785-794.

[9] Yin, Chuanlong, et al. "A Deep Learning Approach for Intrusion Detection Using Recurrent Neural Networks." IEEE Access, vol. 5, 2017, pp. 21954-21961.

[10] Staudemeyer, Ralf C. "Applying Long Short-Term Memory Recurrent Neural Networks to Intrusion Detection." South African Computer Journal, vol. 56, no. 1, Nov. 2015, pp. 136-154.

[11] Li, Zhipeng, et al. "Intrusion Detection Using Convolutional Neural Networks for Representation Learning." Neural Information Processing, Lecture Notes in Computer Science, 2017, pp. 858-866.

[12] Kuang, Fangjun, et al. "A Novel Hybrid KPCA and SVM with GA Model for Intrusion Detection." Applied Soft Computing, vol. 18, 2014, pp. 178-184.

[13] "NSL-KDD Dataset." University of New Brunswick, www.unb.ca/cic/datasets/nsl.html.

[14] Sang-Hyun, Choi, and Chae Hee-Su. "Feature Selection Using Attribute Ratio in NSL-KDD Data." International Conference on Data Mining, Civil and Mechanical Engineering (ICDMCME 2014), Bali, Indonesia, 4-5 Feb. 2014, pp. 90-92.

[15] Tavallaee, Mahbod, et al. "A Detailed Analysis of the KDD CUP 99 Data Set." 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, 2009.

[16] "Network Intrusion Detection." NYC Data Science Academy Blog, nycdatascience.com/blog/student-works/network-intrusion-detection-2.
