2019, International Journal of Recent Technology and Engineering (IJRTE)
New developments in Internet architecture have increased internet traffic. The introduction of peer-to-peer (P2P) applications is affecting the performance of traditional internet applications. Network optimization is used to monitor and manage internet traffic and to improve the performance of internet applications. Existing optimization methods are not able to provide better management for networks. Machine learning (ML) is one of the familiar techniques for handling internet traffic; it is used to identify and reduce the traffic. The lack of relevant datasets has reduced the performance of ML techniques in the classification of internet traffic. The aim of the research is to develop a hybrid classifier to classify internet traffic data and mitigate the traffic. The proposed method is deployed in the classification of traffic traces of University Technology Malaysia. The method has produced an accuracy of 98.3% with less computation time.
With the advancement of technology and communication systems, internet use is playing a tremendous role. This causes exponential growth of data and traffic over the internet, so classifying this traffic correctly is a hot research area. Internet traffic classification is a very popular tool for information detection systems. Although many methods have been developed to classify internet traffic efficiently, machine learning techniques are the most popular among them. This paper gives a brief survey of the supervised and unsupervised machine learning techniques that researchers have applied to internet traffic classification. It also presents various issues related to machine learning techniques that may help interested researchers to work further in this direction.
International Scientific Journal of Engineering and Management
Internet traffic classification is a fundamental task for network services and management. There are good machine learning models to identify the class of traffic. However, finding the most discriminating features to build efficient models remains essential. In this paper, we use interpretable machine learning algorithms such as random forest and gradient boosting to find the most discriminating features for internet traffic classification. This paper aims to overcome these challenges by proposing a machine learning classification mechanism.
International Journal of Information and Communication Technology Research, 2022
Almost every industry has been revolutionized by Artificial Intelligence. The telecommunication industry is one of them: it aims to improve customers' Quality of Service and Quality of Experience by enhancing networking infrastructure capabilities, which could lead to much higher rates even in 5G networks. To this end, network traffic classification methods for identifying and classifying user behavior have been used. Traditional analysis with Statistical-Based, Port-Based, Payload-Based, and Flow-Based methods was the key for these systems before the 4th industrial revolution. Combining AI with such methods leads to higher accuracy and better performance. In the last few decades, numerous studies have been conducted on Machine Learning (ML) and Deep Learning (DL), but there are still some doubts about using DL over ML or vice versa. This paper endeavors to investigate challenges in ML/DL use cases by exploring more than 140 related studies. We then analyze the results and visualize a practical way of classifying internet traffic for popular applications.
2007 Second International Conference on Communications and Networking in China, 2007
Internet traffic classification is a popular research area because of its benefits for many applications, such as intrusion detection systems, congestion avoidance, and traffic prediction. Internet traffic is classified on the basis of statistical features because port-based and payload-based techniques have their limitations. Machine learning is used for statistics-based techniques. The statistical feature set is large, so it is a challenge to reduce it to an optimal feature set. Doing so reduces the time complexity of the machine learning algorithm. This paper tries to obtain an optimal feature set by using a hybrid approach: an unsupervised clustering algorithm (K-Means) combined with a supervised feature selection algorithm (Best Feature Selection).
As the world has become a global village, the internet is considered a primary component of every institution, corporation, and business, as well as of every individual. Internet-based applications vary in the functionality they provide to end users, and their bandwidth requirements vary with the Quality of Service (QoS) they provide. Because of new trends in internet applications, such as the use of dynamic port numbers by P2P applications, classical classification techniques (port-based and payload-based classification) are considered inefficient. To minimize the risk of inefficient IP data classification, researchers have adopted Machine Learning (ML) based IP data classification methods. This paper introduces different classifiers in an effort to increase the efficiency of existing Machine Learning techniques for IP data classification.
European Chemical Bulletin, 2023
Network traffic classification is crucial for internet service providers (ISPs) to optimize network performance by identifying various types of applications. Traditional techniques such as Port-Based and Payload-Based classification are available, but Machine Learning (ML) techniques are the most effective. This research presents a real-time internet data set and utilizes feature extraction tools to extract features from captured traffic, then applies four machine learning classifiers: Support Vector Machine, C4.5 decision tree, Naive Bayes, and Bayes Net. Results show that the C4.5 classifier achieves the highest accuracy of the four classifiers.
IEEE Communications Surveys and Tutorials, 2008
The research community has begun looking for IP traffic classification techniques that do not rely on 'well known' TCP or UDP port numbers, or interpreting the contents of packet payloads. New work is emerging on the use of statistical traffic characteristics to assist in the identification and classification process. This survey paper looks at emerging research into the application of Machine Learning (ML) techniques to IP traffic classification, an inter-disciplinary blend of IP networking and data mining techniques. We provide context and motivation for the application of ML techniques to IP traffic classification, and review 18 significant works that cover the dominant period from 2004 to early 2007. These works are categorized and reviewed according to their choice of ML strategies and primary contributions to the literature. We also discuss a number of key requirements for the employment of ML-based traffic classifiers in operational IP networks, and qualitatively critique the extent to which the reviewed works meet these requirements. Open issues and challenges in the field are also discussed.
In this paper, an automated system is built that processes packets captured from the network. Machine learning algorithms are used to build a traffic classifier that classifies the packets as malicious or non-malicious. Previously, many traditional tool-based methods were used to classify network packets; this work instead takes a machine learning approach, which is an open field to explore and has provided outstanding results so far. The main aim is to monitor traffic, analyze it, and control intruders. CTU-13, a dataset of botnet traffic, is used to develop a traffic classification system based on the features of the packets captured on the network. This type of classification will assist IT administrators in identifying the unknown attacks that are spreading in the IT industry.
International Journal of Health Sciences (IJHS), 2022
Within the internet community, it may be essential to recognize which programs are flowing through the networks in order to carry out particular activities. Network traffic categorization is primarily used by Internet service providers (ISPs) to determine the qualities needed to construct a connection, which in turn influences the current effectiveness of the network. Stream, bandwidth, and machine learning methods have all been used to categorise internet protocol traffic, and each has its own benefits and drawbacks. The machine learning method is popular at present because of its widespread use across disciplines and the increasing awareness among many investigators of how its methodology compares with the alternatives. Naive Bayes and K-nearest neighbors results are compared in this research when applied to a network dataset taken from live stream feeds using an Ethernet program. Python's sklearn module, together with the pandas and numpy modules, is used to create the machine learning models. Our findings show that the K-nearest neighbors approach is more efficient than the Naive Bayes, Decision Tree, and Support Vector Machine algorithms. Keywords: K-Nearest Neighbors (KNN), Naive Bayes (NB), decision trees (DT) and support vector machines (SVM).
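The comparison described in this abstract can be illustrated with a minimal sketch. The abstract names sklearn, pandas and numpy; the version below deliberately uses only the Python standard library so it runs anywhere, and it is not the authors' code. The feature pairs (mean packet length, flow duration), the labels `web` and `p2p`, and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def gnb_fit(train):
    """Gaussian Naive Bayes: per-class prior plus per-feature mean and variance."""
    by_class = defaultdict(list)
    for x, label in train:
        by_class[label].append(x)
    model = {}
    for label, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # A small floor keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-9
                     for c, m in zip(cols, means)]
        model[label] = (len(rows) / len(train), means, variances)
    return model

def gnb_predict(model, query):
    """Return the class with the highest log-posterior for query."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, var in zip(query, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

# Hypothetical flows: (mean packet length in bytes, duration in seconds) -> label.
train = [((120, 0.4), "web"), ((150, 0.6), "web"), ((130, 0.5), "web"),
         ((900, 30.0), "p2p"), ((1100, 45.0), "p2p"), ((1000, 38.0), "p2p")]
model = gnb_fit(train)
knn_labels = [knn_predict(train, q) for q in [(140, 0.5), (950, 33.0)]]
gnb_labels = [gnb_predict(model, q) for q in [(140, 0.5), (950, 33.0)]]
```

On a toy set this cleanly separated both classifiers agree; differences such as those the abstract reports would only show up on real, overlapping traffic classes.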
2015 Tenth International Conference on Digital Information Management (ICDIM), 2015
The number of alleged crimes in computer networks had not increased until a few years ago. Real-time analysis has become essential to detect any suspicious activities. Network classification is the first step of network traffic analysis, and it is the core element of network intrusion detection systems (IDS). Although the techniques of classification have improved and their accuracy has been enhanced, the growing trend of encryption and the insistence of application developers to create new ways to avoid applications being filtered and detected are among the reasons that this field remains open for further research. This paper discusses how researchers apply Machine Learning (ML) algorithms in several classification techniques, utilising the statistical properties of the network traffic flow. It also outlines the next stage of our research, which involves investigating different classification techniques (supervised, semi-supervised, and unsupervised) that use ML algorithms to cope with real-world network traffic.
Information Systems Frontiers, 2008
Accurate and timely traffic classification is critical in network security monitoring and traffic engineering. Traditional methods based on port numbers and protocols have proven to be ineffective in the face of dynamic port allocation and packet encapsulation. Signature matching methods, on the other hand, require a known signature set and processing of the packet payload, and can only handle the signatures of a limited number of IP packets in real time. A machine learning method based on SVM (support vector machine) is proposed in this paper for accurate Internet traffic classification. The method classifies Internet traffic into broad application categories according to the network flow parameters obtained from the packet headers. An optimized feature set is obtained via multiple classifier selection methods. Experimental results using traffic from a campus backbone show that an accuracy of 99.42% is achieved with the regular biased training and testing samples. An accuracy of 97.17% is achieved when un-biased training and testing samples are used with the same feature set. Furthermore, as all the feature parameters are computable from the packet headers, the proposed method is also applicable to encrypted network traffic.
2015 7th Conference on Information and Knowledge Technology (IKT), 2015
Heart disease is amongst the most widely recognized diseases in the world. This research aims to improve the precision of heart disease classification/diagnosis by developing a system based on multiple classifiers. The proposed system contains two phases: a preprocessing phase and a classification phase. The preprocessing phase includes data cleaning, normalization, and accounting for missing values. In the classification phase, multiple classifiers are used as an ensemble technique based on the Multilayer Perceptron (MLP), K-Nearest Neighbor (K-NN), and C4.5. A heart disease dataset, which contains four databases gathered from the UCI machine learning repository, was used for experiments. The proposed classification system gives 99.4% classification precision according to the 10-fold cross-validation technique. The outcome obtained from the proposed system shows that its performance is better than that of already reported classification systems.
Supervised statistical approaches for the classification of network traffic are quickly moving from research laboratories to advanced prototypes, which in turn will become actual products in the next few years. While the research on the classification algorithms themselves has made quite significant progress in the recent past, few papers have examined the problem of determining the optimum working parameters for statistical classifiers in a straightforward and foolproof way. Without such optimization, it becomes very difficult to put into practice any classification algorithm for network traffic, no matter how advanced it may be. In this paper we present a simple but effective procedure for the optimization of the working parameters of a statistical network traffic classifier. We put the optimization procedure into practice, and examine its effects when the classifier is run in very different scenarios, ranging from medium and large local area networks to Internet backbone links. Experimental results show not only that an automatic optimization procedure like the one presented in this paper is necessary for the classifier to work at its best, but they also shed some light on some of the properties of the classification algorithm that deserve further study.
International Journal of Engineering Research and, 2015
Traffic classification is a method of categorizing computer network traffic into a number of traffic classes based on various features observed passively in the traffic. The rapid increase in different Internet application behaviors has raised the need to distinguish applications for filtering, accounting, advertising, network design, and so on. Many traditional methods, such as port-based and packet-based methods, and some alternative methods based on machine learning approaches have been used for the classification process. A new traffic classification scheme is proposed that utilizes the information among the correlated traffic flows generated by an application. Discretized statistical features are extracted and used to represent the traffic flows. Irrelevant and redundant features are removed from the feature set by correlation-based feature selection, which keeps features with high class-specific correlation and low inter-correlation. Naïve Bayes with Discretization (NBD) is used for the classification process. The proposed scheme is compared with three other Bayesian models. The experimental evaluation shows that NBD outperforms the other methods even with a small number of supervised training samples.
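The feature selection step mentioned in this abstract (high correlation with the class, low correlation between features) can be sketched as a greedy forward search over the standard correlation-based merit score, k·r_cf / sqrt(k + k(k−1)·r_ff). This is a minimal illustration under stated assumptions, not the paper's implementation: it uses Pearson correlation on raw numeric values rather than the discretized features the abstract describes, and the feature names and data are hypothetical.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb) if sa and sb else 0.0

def merit(subset, columns, target):
    """CFS merit: rewards class correlation, penalises inter-feature correlation."""
    k = len(subset)
    r_cf = sum(abs(pearson(columns[f], target)) for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = (sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward(columns, target, tolerance=1e-9):
    """Greedily add the feature that raises the merit most; stop when none does."""
    selected, best, remaining = [], 0.0, set(columns)
    while remaining:
        score, feature = max((merit(selected + [f], columns, target), f)
                             for f in remaining)
        if score <= best + tolerance:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected

# Hypothetical data: two redundant size features and one uninformative feature.
target = [0, 0, 0, 1, 1, 1]
columns = {
    "mean_len": [1, 2, 3, 7, 8, 9],
    "total_bytes": [2, 4, 6, 14, 16, 18],  # exactly 2 * mean_len, so redundant
    "noise": [3, 1, 2, 2, 3, 1],
}
selected = cfs_forward(columns, target)
```

In this toy case the search keeps a single size feature: adding its redundant twin leaves the merit unchanged, and adding the noise feature lowers it.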
Traffic classification is the major step in identifying traffic by application in a large network, and it is useful for providing quality of service, lawful interception, and intrusion detection. Older methods such as port-based and payload-based classification have exhibited a number of limitations. Hence the research community uses machine learning techniques to analyse flow statistics for detecting network applications. The statistical approach is used here for the traffic identification and classification process. The statistical features are flow size, flow duration, TCP port, packet inter-arrival time statistics, total number of packets, mean packet length, protocol, and number of bytes transferred. There are two types of machine learning algorithms: supervised and unsupervised. Both types are analysed with datasets, using a set of algorithms from each. After evaluating the results of the supervised and unsupervised algorithms, the best algorithm from each type is combined to yield better results.
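Several of the statistical features listed in this abstract can be computed directly from per-packet timestamps and lengths. The following is a minimal sketch, assuming each flow is available as a list of hypothetical `(timestamp_seconds, length_bytes)` tuples; the function and key names are illustrative and not taken from the paper.

```python
from statistics import mean, pstdev

def flow_features(packets):
    """Summarise one flow given (timestamp_seconds, length_bytes) tuples."""
    packets = sorted(packets)  # order packets by timestamp
    times = [t for t, _ in packets]
    sizes = [n for _, n in packets]
    # Inter-arrival times are the gaps between consecutive packets.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "packet_count": len(packets),
        "total_bytes": sum(sizes),
        "duration": times[-1] - times[0],
        "mean_packet_length": mean(sizes),
        "mean_inter_arrival": mean(gaps) if gaps else 0.0,
        "std_inter_arrival": pstdev(gaps) if gaps else 0.0,
    }

# Hypothetical flow of four packets.
example = [(0.0, 100), (0.5, 300), (1.0, 200), (2.0, 400)]
features = flow_features(example)
```

Header fields such as TCP port and protocol are categorical and would be carried alongside these numeric summaries rather than computed from them.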
Journal of Computer Networks and Communications, 2016
Traffic classification utilizing flow measurement enables operators to perform essential network management. Flow accounting methods such as NetFlow are, however, considered inadequate for classification, which requires additional packet-level information, host behaviour analysis, and specialized hardware, limiting their practical adoption. This paper aims to overcome these challenges by proposing a two-phased machine learning classification mechanism with NetFlow as input. The individual flow classes are derived per application through k-means and are further used to train a C5.0 decision tree classifier. As part of validation, the initial unsupervised phase used flow records of fifteen popular Internet applications that were collected and independently subjected to k-means clustering to determine unique flow classes generated per application. The derived flow classes were afterwards used to train and test a supervised C5.0 based decision tree. The resulting classifier reported an average acc...
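The first, unsupervised phase of the mechanism described in this abstract can be sketched as follows: cluster each application's flows separately with k-means, then turn each (application, cluster) pair into a class label for a later supervised learner. This is a minimal illustration only. The two-feature flows, the application names, `k=2`, and the deterministic initialisation are assumptions made for the example, and the C5.0 training step is not shown.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment

def derive_flow_classes(flows_by_app, k=2):
    """Cluster each application separately and label flows as '<app>_<cluster>'."""
    labelled = []
    for app, flows in flows_by_app.items():
        _, assignment = kmeans(flows, k)
        labelled.extend((flow, f"{app}_{cluster}")
                        for flow, cluster in zip(flows, assignment))
    return labelled

# Hypothetical flow records: (mean packet length in bytes, duration in seconds).
flows_by_app = {
    "video": [(1400, 60.0), (1350, 55.0), (200, 1.0), (180, 0.8)],
    "dns": [(80, 0.1), (90, 0.1), (85, 0.2), (75, 0.1)],
}
labelled = derive_flow_classes(flows_by_app)
```

The resulting `labelled` list plays the role of the training set for the second phase, where any supervised decision tree learner could be trained on the derived flow classes.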
Advances in Intelligent Systems, Computer Science and Digital Economics II, 2021
The rapid development of telecommunications creates many new types of traffic. This causes a great problem for real-time traffic classification using machine learning methods because of their inability to add new classes. Supervised learning methods are unable to classify samples that are not represented in the training sequences. Unsupervised learning methods require a significant amount of time to build a model and give relatively low results, and the resulting clusters are difficult to interpret.
IEEE Access
Traffic classification is considered an important research area due to the increasing demand from network users. It not only effectively improves network service identification and the security of the traffic network, but also provides robust accuracy and efficiency across different Internet application behaviors and patterns. Several traffic classification techniques have been proposed and applied successfully in recent years. However, the existing literature lacks a comprehensive survey that provides an overview and analysis of recent developments in network traffic classification. To this end, this survey presents a comprehensive investigation of traffic classification techniques by carefully reviewing existing methods from a new perspective. We comprehensively discuss the procedures and datasets for traffic classification. Additionally, traffic criteria are proposed, which could be beneficial for assessing the effectiveness of a developed classification algorithm. Then, the traffic classification techniques are discussed in detail, followed by a thorough discussion of the machine learning (ML) methods for traffic classification. For researchers' convenience, we present the traffic obfuscation techniques, which could be helpful for designing a better classifier. Finally, key findings and open research challenges for network traffic classification are identified along with recommendations for future research directions. In sum, this survey fills the gap left by existing surveys and summarizes the latest research developments in traffic classification.
INDEX TERMS: Classification criteria, machine learning method, obfuscation, security, traffic classification.
MUHAMMAD SAMEER SHEIKH received the Ph.D. degree in communication and information systems from the Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 2017. He was a Postdoctoral Research Fellow with Jiangsu University from 2018 to 2020. In 2020, he joined Guangdong Ocean University, China, as an Associate Professor. He has authored more than 20 articles published in international journals and international conference proceedings. His research interests include intelligent transportation systems (ITSs), traffic information engineering, traffic classification, traffic control and operation, traffic networks, connected and automated vehicles, and IoT technologies. He is a member of the Technical Committee of International Conferences and a reviewer for various reputed journals.
International Journal of Computer Applications
The classification of Internet traffic has come to the forefront in recent times, as the organization of network traffic is necessitated by the increasing use of the internet and limited bandwidth. Network traffic classification also finds application in network security and in QoS (Quality of Service). In this report, a number of flow features have been used as a basis for classifying network traffic, with the applications that run on the network as the classes. These flow features (13 in all) were extracted using a Perl script after capturing traffic using Wireshark. Seven network applications were chosen as the classes for classification, namely ftp, www, p2p, NetBIOS, dns, mail and telnet. The machine learning algorithms that have been used for classification are Artificial Neural Network (ANN) and Support Vector Machine (SVM). These algorithms were used while designing a classification simulation model in WEKA in which Multilayer Perceptron (MLP) and sequential Mi...
2017
Internet traffic is defined as the density of data or information present on the Internet; in other words, it is the flow of data on the internet. Internet traffic classification can solve many network difficulties and help manage different types of network problems. It provides some basic functions to governments, Internet service providers (ISPs), and network administrators. Machine learning approaches overcome many problems of the traditional approaches to internet traffic classification. Among supervised approaches, we discuss five well-known supervised machine learning approaches: Naïve Bayes, Feed Forward Neural Network, Bayes Net, RBF, and the C4.5 decision tree. Keywords: ISP, ML, DAG, RBF, FCBF, BOF