Project Title
GPT Data Tagger and Analyser
Project Description
This project is concerned with the use of Chat GPT API to analyse and tag cyber threat data
[Table 1]. The aim is to produce robust prompts to provide factual responses to cyber-
related queries to identify attack vectors, industry relevant TTPs [Table 1] and other attack
indicators from open source sources. The end goal is to have a cloud-based solution that
can respond factually to queries such as:
a. What is the most common timeline in a ransomware attack?
b. What is the most recent cyber threat identified?
c. What is the attack vector for Volt Typhoon?
The GPT Analyser will be hosted on the AWS cloud. It will be triggered on a pre-defined
schedule with prompts identified beforehand. It should also respond to real-time prompts.
The architecture and description can be found here.
Project Skills
X AI/Machine Learning
X Cyber Security
X Programming (Software, Mobile and Web
Development)
Environment
X Python
C/C#/C++
Oracle (Java & Database)
Frameworks (Angular/React/Django/Flask)
Research Component
Chat GPT API
Prompt engineering
Cyber threat research and data
Table 1: Cyber threat data.
Cyber threat data refers to any information related to malicious activities that target an organization's
information systems or individuals. This data helps identify, prevent, or mitigate cyberattacks. Here’s a
breakdown of key components of cyber threat data:
1. Attack Vectors
Definition: The methods or pathways through which cyberattacks are carried out.
Examples: Phishing, malware, ransomware, denial-of-service (DoS), supply chain attacks, zero-day
exploits, SQL injection, etc.
2. TTPs (Tactics, Techniques, and Procedures)
Tactics: High-level objectives that attackers aim to achieve (e.g., initial access, exfiltration of data).
Techniques: Specific methods used to achieve a tactic (e.g., spear phishing, credential dumping).
Procedures: Detailed descriptions of how specific techniques are implemented by attackers.
Example Frameworks: MITRE ATT&CK Framework, which categorizes TTPs across various
stages of an attack lifecycle.
3. Indicators of Compromise (IoCs)
Definition: Pieces of forensic data used to identify potential malicious activity on a system or network.
Examples: Malicious IP addresses, domain names, file hashes (MD5, SHA-256), registry changes,
unusual outbound traffic, etc.
Purpose: Help in detecting and responding to cyberattacks early.
4. Common Vulnerabilities and Exposures (CVEs)
Definition: A list of publicly disclosed information about security vulnerabilities and software
weaknesses.
Purpose: Helps organizations identify and patch vulnerabilities that attackers could exploit.
Example: CVE-2021-34527 (PrintNightmare vulnerability).
5. Attack Timelines
Definition: The sequence and timing of events in an attack, from initial compromise to execution.
Examples:
o Reconnaissance: Scanning networks for vulnerabilities.
o Initial Compromise: Gaining unauthorized access.
o Lateral Movement: Moving within the network.
o Data Exfiltration: Stealing sensitive data.
o Persistence: Maintaining long-term access.
10. Incident Reports and Case Studies
Definition: Detailed reports and analyses of past cyberattacks or breaches.
Examples: Incident reports from vendors like FireEye, CrowdStrike, or Kaspersky, which offer
insights into attack vectors, TTPs, and the behavior of threat actors.
11. Threat Intelligence Feeds
Definition: Sources of continuous updates about new cyber threats.
Examples: Automated feeds from cybersecurity vendors (e.g., AlienVault, Recorded Future) that
provide up-to-date IoCs, CVEs, and attack patterns.