Papers by Amine Boukhtouta
The critical node game
Journal of combinatorial optimization, May 18, 2024
Forensic Data Analytics for Anomaly Detection in Evolving Networks
World Scientific series in digital forensics and cybersecurity, Jul 1, 2023

arXiv (Cornell University), Mar 10, 2023
Cloud networks are the backbone of the modern distributed internet infrastructure as they provision most of the on-demand resources organizations and individuals use daily. However, any abrupt cyber-attack could disrupt the provisioning of some of the cloud resources fulfilling the needs of customers, industries, and governments. In this work, we introduce a game-theoretic model that assesses the cyber-security risk of cloud networks and informs security experts on the optimal security strategies. Our approach combines game theory, combinatorial optimization, and cyber-security and aims at minimizing the unexpected network disruptions caused by malicious cyber-attacks under uncertainty. Methodologically, our approach consists of a simultaneous and non-cooperative attacker-defender game where each player solves a combinatorial optimization problem parametrized in the variables of the other player. Practically, our approach enables security experts to (i) assess the security posture of the cloud network, and (ii) dynamically adapt the level of cyber-protection deployed on the network. We provide a detailed analysis of a real-world cloud network and demonstrate the efficacy of our approach through extensive computational tests.

2022 IEEE Future Networks World Forum (FNWF)
The convergence of telecommunication and industry operational networks towards cloud-native applications has enabled the integration of protection layers to harden the security posture and management of cloud-native deployments. In this paper, we propose a data-driven approach to support the detection of anomalies in cloud-native applications based on a graph neural network. The essence of the profiling relies on capturing interactions between different perspectives in cloud-native applications through a network dependency graph and transforming it into a computational graph neural network. The latter is used to profile different deployed assets like micro-service types, workloads' namespaces, worker machines, management and orchestration machines, as well as clusters. As a first phase of the profiling, we consider a fine-grained profiling of micro-service types with an emphasis on network traffic indicators. These indicators are collected on distributed Kubernetes (K8S) deployment premises. Experimental results show a good trade-off in terms of accuracy and recall with respect to micro-service type profiling (around 96%). In addition, we used prediction entropy scores to infer anomalies in testing data. These scores make it possible to separate benign from anomalous graphs, where we identified 19 out of 23 anomalies. Moreover, by using entropy scores, we can conduct a root cause analysis to infer problematic micro-services. Index Terms: anomaly detection, profiling, cloud native applications, Kubernetes (K8S), Graph Neural Networks (GNNs).
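The entropy-based anomaly scoring mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that each prediction is a probability distribution over classes and that a fixed entropy threshold separates confident from uncertain predictions; the function names, the sample probabilities, and the threshold are hypothetical and are not taken from the paper.

```python
import math


def prediction_entropy(probs):
    """Shannon entropy (in bits) of one predicted class distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def flag_anomalies(predictions, threshold):
    """Return the indices of predictions whose entropy exceeds the threshold."""
    return [
        i for i, probs in enumerate(predictions)
        if prediction_entropy(probs) > threshold
    ]


# Hypothetical class probabilities for three profiled items.
predictions = [
    [0.97, 0.02, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform prediction, high entropy
    [0.50, 0.50, 0.00],  # split between two classes, entropy of 1 bit
]

# Assumed threshold; in practice it would be tuned on benign data.
flagged = flag_anomalies(predictions, threshold=0.9)
print(flagged)
```

A low-entropy prediction means the model recognizes the profile; a high-entropy one suggests behavior the model has not seen, which is why it is a plausible anomaly signal.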

IEEE Access
The 6LoWPAN (IPv6 over low-power wireless personal area networks) standard enables resource-constrained devices to connect to the IPv6 network, blending an IPv6 header compression protocol. For this network technology, a new routing protocol called Routing Protocol for Low-Power and Lossy Networks (RPL) has been designed. The latter is a lightweight protocol that determines the route across the nodes based on rank values. This protocol is known to be non-resilient against Rank attacks, which aim at creating non-optimized routes for packet forwarding, hence overwhelming the constrained 6LoWPAN. With 5G, Software-Defined Networks (SDNs) have been developed to facilitate a simple programmable control plane, Quality of Service (QoS) provisioning, and route configuration services for 6LoWPAN. However, there is still a lack of optimization mechanisms to protect 6LoWPAN against Rank attacks in SDN-based deployments. To this end, in this paper, a Reinforcement-Learning (RL) agent is leveraged to assist and complement an SDN controller in achieving cost-efficient route optimization and QoS-provisioning packet forwarding to prevent Rank attacks. Experimental results confirm that our approach effectively prevents Rank attacks while providing an adequate delay and radio duty cycle. Meanwhile, it maximizes the packet delivery ratio, facilitating practical implementations in software-defined Low Power Internet of Things (IoT) networks. Index Terms: Reinforcement learning, SDN networks, 6LoWPAN networks, RPL protocol, rank attacks.

Data and Applications Security and Privacy XXXII, 2018
Crowd events, or flash crowds, are high-volume accesses to media or web assets caused by a popular event. Even though crowd event accesses are benign, distinguishing them from Distributed Denial of Service (DDoS) attacks is difficult by nature, as both events look alike. In contrast to the rich literature on how to profile and detect DDoS attacks, the problem of distinguishing benign crowd events from DDoS attacks has not received much interest. In this work, we propose a new approach for profiling crowd events and segregating them from normal accesses. We use a first selection based on a semi-supervised approach to segregate normal events from crowd events using the number of requests. We use density-based clustering, namely DBSCAN, to label patterns obtained from a time series. We then use a second, more refined selection using the resulting clusters to classify the crowd events. To this end, we build an XGBoost classifier to detect crowd events with a high detection rate on the training dataset (99%). We present our initial results of crowd event fingerprinting using eight days of log data collected from a major Content Delivery Network (CDN) as a driving test. We further validate our approach by applying our models to unseen data, where abrupt changes in the number of accesses are detected. We show how our models can detect crowd events with high accuracy. We believe that this approach can further be used in similar CDNs to detect crowd events.

Inferring, Characterizing, and Investigating Internet-Scale Malicious IoT Device Activities: A Network Telescope Perspective
2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2018
Recent attacks have highlighted the insecurity of the Internet of Things (IoT) paradigm by demonstrating the impacts of leveraging Internet-scale compromised IoT devices. In this paper, we address the lack of IoT-specific empirical data by drawing upon more than 5TB of passive measurements. We devise data-driven methodologies to infer compromised IoT devices and those targeted by denial of service attacks. We perform large-scale characterization analysis of their traffic, as well as explore a public threat repository and an in-house malware database, to underlie their malicious activities. The results expose a significant 26 thousand compromised IoT devices "in the wild," with 40% being active in critical infrastructure. More importantly, we uncover new, previously unreported malware variants that specifically target IoT devices. Our empirical results render a first attempt to highlight the large-scale insecurity of the IoT paradigm, while alarming about the rise of new generations of IoT-centric malware-orchestrated botnets.

The Dataflow Pointcut - A Formal and Practical Framework
Some security concerns are sensitive to flow of information in a program execution. The dataflow pointcut has been proposed by Masuhara and Kawauchi in order to easily implement such security concerns in aspect-oriented programming (AOP) languages. The pointcut identifies join points based on the origins of values. This paper presents a formal framework for this pointcut based on the λ calculus. Dataflow tags are propagated statically to track data dependencies between expressions. We introduce a static semantics for tag propagation and prove that it is consistent with respect to the dynamic semantics of the propagation. We instrument the static effect-based type system to propagate tags, match and inject advices. This static approach can be used to minimize the cost of dataflow pointcuts by reducing the runtime overhead since much of the dataflow information would be available statically and at the same time it can be used for verification. The proposed semantics for advice weaving is i...

IEEE Transactions on Dependable and Secure Computing, 2020
We present forward error correction systems based on soft-decision low-density parity check (LDPC) codes for applications in 100-400-Gbps optical transport networks. These systems are based on the low-complexity "adaptive degeneration" decoding algorithm, which we introduce in this paper, along with randomly-structured LDPC codes with block lengths from 30 000 to 60 000 bits and overhead (OH) from 6.7% to 33%. We also construct a 3600-bit prototype LDPC code with 20% overhead, and experimentally show that it has no error floor above a bit error rate (BER) of 10^-15 using a field-programmable gate array (FPGA)-based hardware emulator. The projected net coding gain at a BER of 10^-15 ranges from 9.6 dB at 6.7% OH to 11.2 dB at 33% OH. We also present application-specific integrated circuit synthesis results for these decoders in 28 nm fully depleted silicon on insulator technology, which show that they are capable of 400-Gbps operation with energy consumption of under 3 pJ per information bit. Index Terms: Application-specific integrated circuit (ASIC) synthesis, forward error correction (FEC), low-density parity-check (LDPC) codes, low power.

IEEE Communications Surveys & Tutorials, 2018
Despite the ubiquitous role of the domain name system (DNS) in sustaining the operations of various Internet services (domain name to IP address resolution, email, Web), DNS was abused/misused to perform large-scale attacks that affected millions of Internet users. To detect and prevent threats associated with DNS, researchers introduced passive DNS replication and analysis as an effective alternative approach for analyzing live DNS traffic. In this paper, we survey state-of-the-art systems that utilized passive DNS traffic for the purpose of detecting malicious behaviors on the Internet. We highlight the main strengths and weaknesses of the implemented systems through an in-depth analysis of the detection approach, collected data, and detection outcomes. We highlight an incremental implementation pattern in the studied systems with similarities in terms of the used datasets and detection approach. Furthermore, we show that almost all studied systems implemented supervised machine learning (SML), which has its own limitations. In addition, while all surveyed systems required several hours or even days before detecting threats, we illustrate the ability to enhance performance by implementing a system prototype that utilizes big data analytics frameworks to detect threats in near real-time. We demonstrate the feasibility of our threat detection prototype through real-life examples, and provide further insights for future work toward analyzing DNS traffic in near real-time.
Insights From The Analysis Of The Mariposa Botnet
CRISIS'2010, October, Montreal, Canada

Journal of Computer Virology and Hacking Techniques, 2015
In order to counter cyber-attacks and digital threats, security experts must generate, share, and exploit cyber-threat intelligence generated from malware. In this research, we address the problem of fingerprinting maliciousness of traffic for the purpose of detection and classification. We aim first at fingerprinting maliciousness by using two approaches: Deep Packet Inspection (DPI) and IP packet headers classification. To this end, we consider malicious traffic generated from dynamic malware analysis as traffic maliciousness ground truth. In light of this assumption, we present how these two approaches are used to detect and attribute maliciousness to different threats. In this work, we study the positive and negative aspects for Deep Packet Inspection and IP packet headers classification. We evaluate each approach based on its detection and attribution accu...

In this paper, we investigate cyber-threats and the underlying infrastructures. More precisely, we detect and analyze cyber-threat infrastructures for the purpose of unveiling key players (owners, domains, IPs, organizations, malware families, etc.) and the relationships between these players. To this end, we propose metrics to measure the badness of different infrastructure elements using graph theoretic concepts such as centrality concepts and Google PageRank. In addition, we quantify the sharing of infrastructure elements among different malware samples and families to unveil potential groups that are behind specific attacks. Moreover, we study the evolution of cyber-threat infrastructures over time to infer patterns of cyber-criminal activities. The proposed study provides the capability to derive insights and intelligence about cyber-threat infrastructures. Using one year dataset, we generate notable results regarding emerging threats and campaigns, important p...

ArXiv, 2021
Content delivery networks (CDNs) provide efficient content distribution over the Internet. CDNs improve the connectivity and efficiency of global communications, but their caching mechanisms may be breached by cyber-attackers. Among the security mechanisms, effective anomaly detection forms an important part of CDN security enhancement. In this work, we propose a multi-perspective unsupervised learning framework for anomaly detection in CDNs. In the proposed framework, a multi-perspective feature engineering approach, an optimized unsupervised anomaly detection model that utilizes an isolation forest and a Gaussian mixture model, and a multi-perspective validation method are developed to detect abnormal behaviors in CDNs, mainly from the client Internet Protocol (IP) and node perspectives, and thereby to identify denial of service (DoS) and cache pollution attack (CPA) patterns. Experimental results are presented based on the analytics of eight days of real-world CDN log data prov...

In recent years, malware authors drastically changed their course on the subject of threat design and implementation. Malware authors, namely, hackers or cyber-terrorists, perpetrate new forms of cyber-crimes involving more innovative hacking techniques. Being motivated by financial or political reasons, attackers target computer systems ranging from personal computers to organizations’ networks to collect and steal sensitive data as well as blackmail, scam people, or scupper IT infrastructures. Accordingly, IT security experts face new challenges, as they need to counter cyber-threats proactively. The challenge takes the form of a continuous fight, where cyber-criminals are obsessed with the idea of outsmarting security defenses. As such, security experts have to elaborate an effective strategy to counter cyber-criminals. The generation of cyber-threat intelligence is of paramount importance as stated in the following quote: “the field is owned by who owns the intelligence”. In this...
NFV security survey in 5G networks: A three-dimensional threat taxonomy
Computer Networks
AutoGuard: A Dual Intelligence Proactive Anomaly Detection at Application-Layer in 5G Networks
Computer Security – ESORICS 2021

Graph-theoretic characterization of cyber-threat infrastructures
Digital Investigation, 2015
In this paper, we investigate cyber-threats and the underlying infrastructures. More precisely, we detect and analyze cyber-threat infrastructures for the purpose of unveiling key players (owners, domains, IPs, organizations, malware families, etc.) and the relationships between these players. To this end, we propose metrics to measure the badness of different infrastructure elements using graph theoretic concepts such as centrality concepts and Google PageRank. In addition, we quantify the sharing of infrastructure elements among different malware samples and families to unveil potential groups that are behind specific attacks. Moreover, we study the evolution of cyber-threat infrastructures over time to infer patterns of cyber-criminal activities. The proposed study provides the capability to derive insights and intelligence about cyber-threat infrastructures. Using a one-year dataset, we generate notable results regarding emerging threats and campaigns, important players behind threats, linkages between cyber-threat infrastructure elements, patterns of cybercrimes, etc.
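As a rough illustration of how a PageRank-style score could rank infrastructure elements, the sketch below runs a plain power iteration over a small directed graph. The edge direction (malware sample to domain, domain to IP), the node names, and the damping factor are assumptions made for the example; the badness metrics defined in the paper itself are not reproduced here.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical infrastructure graph: samples contact domains,
# and domains resolve to an IP address.
edges = [
    ("sample_a", "evil.example"),
    ("sample_b", "evil.example"),
    ("sample_c", "other.example"),
    ("evil.example", "203.0.113.5"),
    ("other.example", "203.0.113.5"),
]

ranks = pagerank(edges)
for node, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score:.3f}")
```

In this toy graph the shared IP address collects rank from both domains, so it comes out on top, which matches the intuition that heavily shared elements are the key players.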

2012 7th International Conference on Risks and Security of Internet and Systems (CRiSIS), 2012
An effective approach to gather cyber threat intelligence is to collect and analyze traffic destined to unused Internet addresses known as darknets. In this paper, we elaborate on such capability by profiling darknet data. Such information could generate indicators of cyber threat activity as well as provide an in-depth understanding of the nature of its traffic. Particularly, we analyze darknet packets distribution, its used transport, network and application layer protocols and pinpoint its resolved domain names. Furthermore, we identify its IP classes and destination ports as well as geo-locate its source countries. We further investigate darknet-triggered threats. The aim is to explore darknet embedded threats and categorize their severities. Finally, we contribute by exploring the inter-correlation of such threats, by applying association rule mining techniques, to build threat association rules. Specifically, we generate clusters of threats that co-occur targeting a specific victim. Such work proves that specific darknet threats are correlated. Moreover, it provides insights about threat patterns and allows the interpretation of threat scenarios.
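The idea of threat association rules can be sketched with a simplified, pairwise form of association rule mining: for every pair of threats seen against the same victim, compute support and confidence and keep the rules above chosen thresholds. This is not a full Apriori implementation, and the threat labels, victim sets, and thresholds below are hypothetical rather than drawn from the paper.

```python
from itertools import combinations


def pair_rules(transactions, min_support, min_confidence):
    """Mine rules of the form lhs -> rhs between pairs of items.

    Returns a list of (lhs, rhs, support, confidence) tuples.
    """
    total = len(transactions)
    item_count = {}
    pair_count = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / total
        if support < min_support:
            continue
        # Test the rule in both directions.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Hypothetical threats observed per victim.
victims = [
    {"scan", "ddos"},
    {"scan", "ddos", "worm"},
    {"scan", "worm"},
    {"scan", "ddos"},
]

rules = pair_rules(victims, min_support=0.5, min_confidence=0.8)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

Here the rule "ddos -> scan" holds with full confidence because every victim that saw a DDoS threat also saw scanning, while the reverse rule falls below the confidence threshold.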

Proceedings of the 8th ACM international conference on Aspect-oriented software development - AOSD '09, 2009
Some security concerns are sensitive to flow of information in a program execution. The dataflow pointcut has been proposed by Masuhara and Kawauchi in order to easily implement such security concerns in aspect-oriented programming (AOP) languages. The pointcut identifies join points based on the origins of values. This paper presents a formal framework for this pointcut based on the λ calculus. Dataflow tags are propagated statically to track data dependencies between expressions. We introduce a static semantics for tag propagation and prove that it is consistent with respect to the dynamic semantics of the propagation. We instrument the static effect-based type system to propagate tags, match and inject advices. This static approach can be used to minimize the cost of dataflow pointcuts by reducing the runtime overhead since much of the dataflow information would be available statically and at the same time it can be used for verification. The proposed semantics for advice weaving is in the spirit of AspectJ where advices are injected before, after, or around the join points that are matched by their respective pointcuts. Inspired by the formal framework, the AspectJ compiler ajc is extended with the dataflow pointcut that tracks data dependencies inside methods.