Papers by Mohammad M. Masud

Classification and Adaptive Novel Class Detection of Feature-Evolving Data Streams
Data stream classification poses many challenges to the data mining community. In this paper, we address four such major challenges, namely, infinite length, concept-drift, concept-evolution, and feature-evolution. Since a data stream is theoretically infinite in length, it is impractical to store and use all the historical data for training. Concept-drift is a common phenomenon in data streams, which occurs as a result of changes in the underlying concepts. Concept-evolution occurs as a result of new classes evolving in the stream. Feature-evolution is a frequently occurring process in many streams, such as text streams, in which new features (i.e., words or phrases) appear as the stream progresses. Most existing data stream classification techniques address only the first two challenges, and ignore the latter two. In this paper, we propose an ensemble classification framework, where each classifier is equipped with a novel class detector, to address concept-drift and concept-evolution. To address feature-evolution, we propose a feature set homogenization technique. We also enhance the novel class detection module by making it more adaptive to the evolving stream, and enabling it to detect more than one novel class at a time. Comparison with state-of-the-art data stream classification techniques establishes the effectiveness of the proposed approach.
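The abstract above does not say how the feature set homogenization works. As a rough illustration only, the sketch below assumes one simple strategy: map two sparse feature vectors onto the union of their feature sets and fill missing features with zero. The function name `homogenize` and the example features are hypothetical, not taken from the paper.

```python
from typing import Dict, List, Tuple


def homogenize(
    a: Dict[str, float], b: Dict[str, float]
) -> Tuple[List[str], List[float], List[float]]:
    """Map two sparse feature vectors onto a shared feature space.

    ASSUMPTION: the shared space is the union of both feature sets and a
    feature missing from one side is treated as 0. The paper may use a
    different rule.
    """
    features = sorted(set(a) | set(b))
    vec_a = [a.get(f, 0.0) for f in features]
    vec_b = [b.get(f, 0.0) for f in features]
    return features, vec_a, vec_b


# Hypothetical example: a model trained on older words, an instance with a new word.
model_centroid = {"spam": 2.0, "offer": 1.0}
new_instance = {"offer": 3.0, "crypto": 1.0}
features, va, vb = homogenize(model_centroid, new_instance)
print(features, va, vb)
```

Once both vectors live in the same space, any distance-based classifier can compare them directly, which is the point of homogenizing before classification.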
Network Packet Filtering and Deep Packet Inspection Hybrid Mechanism for IDS Early Packet Matching
2016 IEEE 30th International Conference on Advanced Information Networking and Applications (AINA), 2016

Examining The Effect of Feature Selection on Improving Patient Deterioration Prediction
International Journal of Data Mining & Knowledge Management Process, 2015
A large amount of heterogeneous medical data is generated every day in various healthcare organizations. These data could yield insights for improving monitoring and care delivery in the Intensive Care Unit. However, they also present a challenge: reducing the amount of data without information loss. Dimension reduction is considered the most popular approach for reducing data size and for reducing noise and redundancies in data. In this paper, we investigate the effect of the average laboratory test value and the total number of laboratory tests in predicting patient deterioration in the Intensive Care Unit, where we consider laboratory tests as features. Choosing a subset of features would mean choosing the most important lab tests to perform. Thus, our approach uses state-of-the-art feature selection to identify the most discriminative attributes, which gives a better understanding of the patient deterioration problem. If the number of tests can be reduced by identifying the most important tests, then we could also identify the redundant tests. By omitting the redundant tests, observation time could be reduced and early treatment could be provided to avoid the risk. Additionally, unnecessary monetary cost would be avoided. We apply our technique on the publicly available MIMIC-II database and show the effectiveness of the feature selection. We also provide a detailed analysis of the best features identified by our approach.

ICU Patient Deterioration Prediction : A Data-Mining Approach
Computer Science & Information Technology ( CS & IT ), 2015
A huge amount of medical data is generated every day, which presents a challenge in analysing these data. The obvious solution to this challenge is to reduce the amount of data without information loss. Dimension reduction is considered the most popular approach for reducing data size and also to reduce noise and redundancies in data. In this paper, we investigate the effect of feature selection in improving the prediction of patient deterioration in ICUs. We consider lab tests as features. Thus, choosing a subset of features would mean choosing the most important lab tests to perform. If the number of tests can be reduced by identifying the most important tests, then we could also identify the redundant tests. By omitting the redundant tests, observation time could be reduced and early treatment could be provided to avoid the risk. Additionally, unnecessary monetary cost would be avoided. Our approach uses state-of-the-art feature selection for predicting ICU patient deterioration using the medical lab results. We apply our technique on the publicly available MIMIC-II database and show the effectiveness of the feature selection. We also provide a detailed analysis of the best features identified by our approach.
A Hybrid Model to Detect Malicious Executables
2007 IEEE International Conference on Communications, 2007
We present a hybrid data mining approach to detect malicious executables. In this approach we identify important features of the malicious and benign executables. These features are used by a classifier to learn a classification model that can distinguish between malicious and benign executables. We construct a novel combination of three different kinds of features: binary n-grams, assembly n-grams, and

We propose a novel stream data classification technique to detect peer-to-peer botnets. Botnet traffic can be considered as stream data having two important properties: infinite length and drifting concept. Thus, a stream data classification technique is more appealing for botnet detection than a simple classification technique. However, no other botnet detection approach so far has applied a stream data classification technique. We propose a multi-chunk, multi-level ensemble classifier based data mining technique to classify concept-drifting stream data. Previous ensemble techniques in classifying concept-drifting stream data use a single data chunk to train a classifier. In our approach, we train an ensemble of v classifiers from r consecutive data chunks. K of these v-classifier ensembles are used to build another level of ensemble. By introducing this multi-chunk, multi-level ensemble, we significantly reduce error compared to the single-chunk, single-level ensemble. We have established the justification of using our algorithm theoretically. We have also tested our technique on both botnet traffic and simulated data, and obtained better detection accuracies compared to other published works.
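The two-level structure described above (K ensembles, each holding v classifiers) can be sketched as nested voting. The abstract does not state how votes are combined, so the majority vote at both levels below is an assumption, and the names `majority` and `predict_two_level` are hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence

# A classifier is modelled as any callable that maps a feature vector to a label.
Classifier = Callable[[Sequence[float]], Hashable]


def majority(labels: Sequence[Hashable]) -> Hashable:
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def predict_two_level(ensembles: List[List[Classifier]], x: Sequence[float]) -> Hashable:
    """Vote inside each v-classifier ensemble, then vote across the K ensembles.

    ASSUMPTION: plain majority voting at both levels; the paper may weight
    or combine the classifiers differently.
    """
    level_one = [majority([clf(x) for clf in ensemble]) for ensemble in ensembles]
    return majority(level_one)


# Hypothetical stand-in classifiers that ignore their input.
def say_bot(x):
    return "bot"


def say_benign(x):
    return "benign"


ensembles = [
    [say_bot, say_bot, say_benign],
    [say_benign, say_benign, say_bot],
    [say_bot, say_bot, say_bot],
]
print(predict_two_level(ensembles, [0.1, 0.2]))
```

In a real pipeline each inner classifier would be trained on r consecutive chunks of the stream; here they are constant functions so that only the voting logic is shown.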
Systems and Methods for Detecting a Novel Data Class
Classifying Evolving Data Streams for Intrusion Detection
Stream data classification is a challenging problem because of two important properties: its infinite length and evolving nature. Traditional learning algorithms that require several passes on the training data are not directly applicable to stream classification problem because of the infinite length of the data stream. Data streams may evolve in several ways: the prior probability distribution p(c) of a
We design and implement DExtor, a Data Mining based Exploit code detector, to protect network services. The main assumption of our work is that normal traffic into the network services contains only data, whereas exploit code contains code. Thus, the "exploit code detection" problem reduces to "code detection" problem. DExtor is an application-layer attack blocker, which is deployed between

Network firewalls are considered to be one of the most important security components in today's IP network architectures. Performance of firewalls has significant impact on the overall network performance. Firewalls should be able to sustain a very high throughput and ensure network services availability. In this paper, we propose an analytical dynamic multilevel early packet filtering mechanism to enhance firewall performance. The proposed mechanism uses statistical splay tree filters that utilize traffic characteristics to minimize packet filtering time. The statistical splay tree filters are reordered according to the network traffic divergence upon certain threshold qualification (Chi-Square Test). That is, the proposed mechanism is able to decide whether or not there is a need to update the dynamic splay tree filters’ order for filtering the next network traffic window and predict the best order pattern. Furthermore, packet rejection and acceptance are optimized through the multilevel packet filtering process, where at each level unwanted packets are rejected as early as possible. The proposed mechanism can also be considered as a device protection mechanism against denial of service (DoS) attacks targeting the default filtering rule. Early packet acceptance is done using the splay tree data structure, which adapts dynamically according to network traffic flows. Consequently, repeated packets will have fewer memory accesses and therefore reduce the overall packet filtering time, as demonstrated in the evaluation section.
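The Chi-Square test mentioned above can be illustrated with a small sketch that compares per-filter match counts in two consecutive traffic windows and signals a reorder when the divergence exceeds a critical value. The exact statistic, the windowing, and the threshold used in the paper are not given in the abstract; the function names and counts below are hypothetical.

```python
from typing import Sequence


def chi_square_statistic(observed: Sequence[float], expected: Sequence[float]) -> float:
    """Pearson chi-square statistic of `observed` against `expected`.

    `expected` is rescaled to the same total as `observed`, so windows of
    different sizes can be compared. Categories with zero expected count
    are skipped (an ASSUMPTION made to keep the sketch simple).
    """
    scale = sum(observed) / sum(expected)
    stat = 0.0
    for o, e in zip(observed, expected):
        e_scaled = e * scale
        if e_scaled > 0:
            stat += (o - e_scaled) ** 2 / e_scaled
    return stat


def should_reorder(prev_counts: Sequence[float], curr_counts: Sequence[float],
                   critical_value: float) -> bool:
    """Signal a filter reorder when the traffic mix has diverged enough."""
    return chi_square_statistic(curr_counts, prev_counts) > critical_value


# Hypothetical match counts for three filters in two traffic windows.
previous_window = [50, 30, 20]
current_window = [20, 30, 50]
# 5.991 is the chi-square critical value for 2 degrees of freedom at alpha = 0.05.
print(should_reorder(previous_window, current_window, critical_value=5.991))
```

With three filters there are two degrees of freedom, which is why the example uses the 5.991 cutoff; a deployment with more filters would need the critical value for its own degrees of freedom.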
Feature Based Techniques for Auto-Detection of Novel Email Worms
Lecture Notes in Computer Science, 2007
Z.-H. Zhou, H. Li, and Q. Yang (Eds.): PAKDD 2007, LNAI 4426, pp. 205-216, 2007. © Springer-Verlag Berlin Heidelberg 2007 ... Feature Based Techniques for Auto-Detection of Novel ... Mohammad M Masud, Latifur Khan, and Bhavani Thuraisingham
Journal of Intelligent Systems, 2014
Biometrics readers are deployed in many public sites and are used for user identification and verification. Nowadays, most biometrics readers can be connected to local area networks, and consequently, they are potential targets for network attacks. This article investigates the robustness of several fingerprint and iris readers against common denial of service (DoS) attacks. This investigation has been conducted using a set of laboratory experiments and DoS attack generator tools. The experiments show clearly that the tested biometric readers are very vulnerable to common DoS attacks, and their recognition performance deteriorates significantly once they are under DoS attacks. Finally, the article lists some security considerations that should be taken into account when designing secure biometrics readers.
Resilience of fingerprint and iris readers against common denial of service attacks
2014 World Congress on Computer Applications and Information Systems (WCCAIS), 2014
Email worm detection using naïve bayes and support vector machine
An email worm, as the name implies, spreads through infected email messages. The worm may be carried by an attachment, or the email may contain links to an infected website. When the user opens the attachment, or clicks the link, the host is immediately infected. Email worms use the vulnerability of the email software of the host machine and send infected emails to the addresses stored in the address book. In this way, new machines get infected. Examples of email worms are “W32.mydoom.M@mm”, “W32.Zafi.d”, “W32.LoveGate.w”, “W32.Mytob.c”, and so on. Worms do a lot of harm to computers and people. They can clog the network traffic, cause damage to the system and make the system unstable or even unusable.

We present a hybrid data mining approach to detect malicious executables. In this approach we identify important features of the malicious and benign executables. These features are used by a classifier to learn a classification model that can distinguish between malicious and benign executables. We construct a novel combination of three different kinds of features: binary n-grams, assembly n-grams, and library function calls. Binary features are extracted from the binary executables, whereas assembly features are extracted from the disassembled executables. The function call features are extracted from the program headers. We also propose an efficient and scalable feature extraction technique. We apply our model on a large corpus of real benign and malicious executables. We extract the abovementioned features from the data and train a classifier using Support Vector Machine. This classifier achieves a very high accuracy and low false positive rate in detecting malicious executables. Our model is compared with other feature-based approaches, and found to be more efficient in terms of detection accuracy and false alarm rate.
This paper describes the design and implementation of DExtor, a data-mining-based exploit code detector that protects network services. DExtor operates under the assumption that normal traffic to network services contains only data whereas exploits contain code. The system is first trained with real data containing exploit code and normal traffic. Once it is trained, DExtor is deployed between a web service and its gateway or firewall, where it operates at the application layer to detect and block exploit code in real time. Tests using large volumes of normal and attack traffic demonstrate that DExtor can detect almost all the exploit code with negligible false alarm rates.
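Of the three feature kinds named above, binary n-grams are the easiest to show in a few lines. The sketch below counts overlapping byte n-grams in a byte string; it is a naive in-memory version for illustration, not the efficient and scalable extraction technique the abstract refers to. The function name `byte_ngrams` and the sample bytes are hypothetical.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int) -> Counter:
    """Count every overlapping n-byte window in `data`.

    Returns an empty Counter when `data` is shorter than `n`.
    """
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


# Hypothetical fragment of an executable.
sample = b"\x90\x90\x90\xcc"
for gram, count in byte_ngrams(sample, 2).items():
    print(gram.hex(), count)
```

In a full pipeline, the counts (or just presence flags) of a selected subset of n-grams would form the feature vector handed to the classifier.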

Recent approaches in classifying evolving data streams are based on supervised learning algorithms, which can be trained with labeled data only. Manual labeling of data is both costly and time consuming. Therefore, in a real streaming environment, where huge volumes of data appear at a high speed, labeled data may be very scarce. Thus, only a limited amount of training data may be available for building the classification models, leading to poorly trained classifiers. We apply a novel technique to overcome this problem by building a classification model from a training set having both unlabeled and a small amount of labeled instances. This model is built as micro-clusters using a semi-supervised clustering technique, and classification is performed with the κ-nearest neighbor algorithm. An ensemble of these models is used to classify the unlabeled data. Empirical evaluation on both synthetic data and real botnet traffic reveals that our approach, using only a small amount of labeled data for training, outperforms state-of-the-art stream classification algorithms that use twenty times more labeled data than our approach.
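To make the classification step above concrete, the sketch below treats each micro-cluster as a centroid with a single label and classifies a point by majority vote among its κ nearest centroids. How the paper summarizes micro-clusters and assigns labels during semi-supervised clustering is not stated in the abstract, so the `MicroCluster` structure and the voting rule are assumptions.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MicroCluster:
    # ASSUMPTION: a micro-cluster is reduced to its centroid and one label
    # (for example, the majority label of its few labeled members).
    centroid: Sequence[float]
    label: str


def classify(x: Sequence[float], clusters: List[MicroCluster], k: int = 3) -> str:
    """Label `x` by majority vote among the k nearest micro-cluster centroids."""
    nearest = sorted(clusters, key=lambda c: math.dist(x, c.centroid))[:k]
    return Counter(c.label for c in nearest).most_common(1)[0][0]


# Hypothetical micro-clusters built from a mostly unlabeled training chunk.
clusters = [
    MicroCluster((0.0, 0.0), "benign"),
    MicroCluster((0.0, 1.0), "benign"),
    MicroCluster((5.0, 5.0), "bot"),
    MicroCluster((5.0, 6.0), "bot"),
    MicroCluster((6.0, 5.0), "bot"),
]
print(classify((5.2, 5.1), clusters))
```

Because only centroids are stored, the model stays small regardless of how many raw instances each chunk contains, which suits a stream setting.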

This paper outlines a data stream classification technique that addresses the problem of insufficient and biased labeled data. It is practical to assume that only a small fraction of instances in the stream are labeled. A more practical assumption would be that the labeled data may not be independently distributed among all training documents. How can we ensure that a good classification model would be built in these scenarios, considering that the data stream also has an evolving nature? In our previous work we applied semi-supervised clustering to build classification models using a limited amount of labeled training data. However, it assumed that the data to be labeled should be chosen randomly. In our current work, we relax this assumption, and propose a label propagation framework for data streams that can build good classification models even if the data are not labeled randomly. Comparison with state-of-the-art stream classification techniques on synthetic and benchmark real data proves the effectiveness of our approach.

In a typical data stream classification task, it is assumed that the total number of classes is fixed. This assumption may not be valid in a real streaming environment, where new classes may evolve at any time. Traditional data stream classification techniques are not capable of recognizing novel class instances until the appearance of the novel class is manually identified, and labeled instances of that class are presented to the learning algorithm for training. The problem becomes more challenging in the presence of concept-drift, when the underlying data distributions evolve in streams. We propose a novel and efficient technique that can automatically detect the emergence of a novel class in the presence of concept-drift by quantifying cohesion among unlabeled test instances, and separation of the test instances from training instances. Our approach is non-parametric, meaning that it does not assume any underlying distributions of data. Comparison with the state-of-the-art stream classification techniques proves the superiority of our approach.
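The idea of weighing cohesion among unlabeled test instances against their separation from training instances can be sketched with a silhouette-style score. The formula below is an assumption chosen for illustration, not the measure defined in the paper: it compares the mean distance to the q nearest other test instances with the mean distance to the q nearest training instances. The function names are hypothetical, and both point lists are assumed to be non-empty.

```python
import math
from typing import Sequence

Point = Sequence[float]


def mean_knn_distance(x: Point, points: Sequence[Point], q: int) -> float:
    """Mean Euclidean distance from `x` to its q nearest neighbours in `points`."""
    nearest = sorted(math.dist(x, p) for p in points)[:q]
    return sum(nearest) / len(nearest)


def novelty_score(x: Point, other_test: Sequence[Point],
                  training: Sequence[Point], q: int = 3) -> float:
    """Score in [-1, 1]; positive means `x` sits closer to other unlabeled
    test instances than to known training data (ASSUMED silhouette-style form).
    """
    cohesion = mean_knn_distance(x, other_test, q)    # small when test points cluster
    separation = mean_knn_distance(x, training, q)    # large when far from known classes
    denom = max(cohesion, separation)
    return 0.0 if denom == 0 else (separation - cohesion) / denom


# Hypothetical data: training points near the origin, a tight test group far away.
training = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
other_test = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
print(novelty_score((10.5, 10.5), other_test, training) > 0)
```

The score needs no distributional assumptions, only distances, which matches the non-parametric framing of the abstract; a detector would still need a rule for how many positive-scoring instances justify declaring a novel class.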