Proposed Agentic Hybrid CTI Analysis System
The proposed system is a hybrid multi-agent architecture combining Large Language Models
(LLMs) with graph-based reasoning and retrieval components. Raw CTI reports (blogs, PDFs)
are ingested and pre-processed (text extraction, tokenization, normalization) into a common
format. A central orchestrator agent – implemented as an LLM – coordinates specialized
sub-agents in a pipeline:
● Extraction Agent: A fine-tuned or prompt-engineered LLM scans the text to extract entities and concepts for all STIX 2.1 object types (Indicators, Threat Actors, Campaigns, etc.) and TTPs (techniques/sub-techniques). It generates initial STIX triples (e.g. Actor–uses–Attack Pattern) and short annotations. These objects are created with STIX schema compliance in mind. In practice, we run separate LLM prompts for STIX Domain Objects (SDOs), Cyber-observable Objects (SCOs), and Relationship Objects (SROs), then merge them into a STIX bundle (a sketch of this merge step follows the pipeline overview below).
● Graph Construction & Reasoning Agent: Extracted entities are loaded into a
Knowledge Graph where nodes are STIX objects and edges are their relationships. A
graph neural network (GNN, e.g. GraphSAGE) propagates context and infers hidden
links. This helps link related TTPs or actors even if only implicitly mentioned.
● Validation / RAG Agent: To reduce hallucinations and ensure factual grounding, a
retrieval-augmented generation (RAG) step checks and enriches the LLM output. The
agent performs semantic search over a vector database of indexed CTI reports and
threat feeds, retrieving relevant text snippets that support each extraction. It may also
call external APIs (e.g. MITRE ATT&CK) to verify IOC details or technique names. By
“grounding” queries in authoritative data, the system avoids fabrications. Each candidate
fact is iteratively cross-checked by a chain-of-thought prompt; if inconsistencies arise,
the agent refines or rejects the fact.
● Output Generation Agent: The orchestrator then compiles the validated findings into
final outputs: fully-formed STIX 2.1 JSON bundles (complete with all relevant object
properties) and corresponding visualizations. All reasoning steps and retrieved evidence
are logged to ensure explainability. Agent orchestration frameworks like LangGraph can manage this pipeline, ensuring modularity and traceability (see the orchestration sketch below).
Overall, data flows from raw CTI → LLM-based extraction → KG enrichment → RAG validation
→ final STIX output. The use of RAG (retrieval from a CTI corpus) and graph queries is built into
the toolset, while multi-agent orchestration organizes the steps.
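To make the bundle-assembly step concrete, below is a minimal sketch using the stix2 Python library. The object values are illustrative stand-ins for Extraction Agent output, not results from a real report; the library enforces required STIX 2.1 properties at construction time, which gives us schema validation for free.

```python
# Minimal sketch of the SDO/SCO/SRO merge step using the stix2 library.
# The object values below are illustrative stand-ins for LLM output.
from stix2 import AttackPattern, Bundle, IPv4Address, Relationship, ThreatActor

# SDOs from the domain-object prompt
actor = ThreatActor(name="ExampleActor", threat_actor_types=["crime-syndicate"])
pattern = AttackPattern(name="Spearphishing Attachment")

# SCO from the observable prompt
ip = IPv4Address(value="203.0.113.7")

# SRO linking the two SDOs (Actor -- uses --> Attack Pattern)
uses = Relationship(relationship_type="uses", source_ref=actor.id, target_ref=pattern.id)

# Merge into one STIX 2.1 bundle; malformed objects raise at construction,
# so anything that reaches the Bundle is schema-compliant.
bundle = Bundle(objects=[actor, pattern, ip, uses])
print(bundle.serialize(pretty=True))
```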
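The orchestration layer can be sketched as a LangGraph state machine, assuming a simple shared-state pipeline; the node bodies below are placeholders for the actual agents, and the conditional edge implements the refinement loop described above.

```python
# Skeleton of the extract -> enrich -> validate pipeline in LangGraph.
# Node bodies are placeholders for the LLM, graph, and RAG agents.
from typing import TypedDict
from langgraph.graph import END, StateGraph

class CTIState(TypedDict):
    report_text: str
    stix_objects: list
    validated: bool

def extract(state: CTIState) -> dict:
    # Extraction Agent: LLM prompting would happen here.
    return {"stix_objects": [{"type": "threat-actor", "name": "stub"}]}

def enrich(state: CTIState) -> dict:
    # Graph Agent: KG loading and GNN inference would happen here.
    return {}

def validate(state: CTIState) -> dict:
    # RAG Agent: retrieval-backed fact checking would happen here.
    return {"validated": True}

builder = StateGraph(CTIState)
builder.add_node("extract", extract)
builder.add_node("enrich", enrich)
builder.add_node("validate", validate)
builder.set_entry_point("extract")
builder.add_edge("extract", "enrich")
builder.add_edge("enrich", "validate")
# Loop back for refinement when validation fails; otherwise finish.
builder.add_conditional_edges("validate", lambda s: END if s["validated"] else "extract")

graph = builder.compile()
result = graph.invoke({"report_text": "raw CTI...", "stix_objects": [], "validated": False})
```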
Addressing Existing Limitations
● Complete STIX 2.1 Coverage: Our extraction agent is explicitly trained and prompted to
recognize all 18 STIX Domain Objects (SDOs) and 2 Relationship Objects, not just the
commonly used few. Our system’s prompts reference the full STIX ontology. We can fine-tune on annotated CTI corpora that label each STIX type, or use rules to post-process LLM
output into any missing object types. This ensures niche objects (e.g. Course-of-Action,
Infrastructure, Grouping) are captured, yielding full STIX compliance. The STIX bundle
assembly step then validates each object using the STIX 2.1 schema libraries.
● Enhanced TTP Recognition and Linking: We employ a multi-label classification
strategy for TTPs that mirrors the MITRE ATT&CK hierarchy. For instance, a
DistilBERT-based system covering 560 ATT&CK classes (techniques and
sub-techniques) achieved 0.933 F1 on fine-grained TTP extraction. Similarly, our LLM is
guided (via prompts or fine-tuning) to tag TTPs at the sub-technique level. The
constructed graph then encodes TTP–tactic relationships, and GNN link prediction
identifies chains of procedures. This multi-hop reasoning uncovers, for example, how reconnaissance techniques connect to later exploitation steps, strengthening our TTP linkage beyond what a single-pass LLM could do (a multi-label tagging sketch follows this list).
● Reduced Hallucinations: Hallucination risk is mitigated through our RAG and
multi-agent checks. In practice, the RAG agent fetches relevant report snippets before
generation, so the model cites real sentences. Furthermore, multiple agents
cross-validate. For example, one agent’s extraction is verified by another agent or by
querying provenance logs, akin to the dual-evaluation paradigm where an auxiliary LLM
cross-checks outputs. ProvSEEK, a similar agentic system, enforces that “every LLM-generated claim is tied to verifiable ground truth.” In our design, any fact not confirmed by retrieval or external databases triggers a refinement loop, keeping unverified claims out of the final bundle. Empirically, such RAG+validation workflows can cut hallucination rates dramatically, as each output is effectively “vetted” by multiple sources.
● Novel Entity & Pattern Detection: To flag new IoCs or TTP variants, we leverage
semantic embeddings. Extracted entities are compared via cosine similarity to an
embedding index of known threats. If the similarity score is below a threshold (e.g. <0.8),
the entity is marked as novel. The system then automatically queries live sources (e.g. VirusTotal for IPs/domains, malware sandboxes for hashes, or the ATT&CK TAXII API) to
seek context. This lets the system adapt to emerging threats. New threat patterns can
also emerge via link prediction in the KG or by agentic exploratory prompts. The architecture thus supports zero-shot recognition: even unseen threats trigger either evidence-gathering or human-in-the-loop alerts, rather than silent failure (a novelty-scoring sketch follows this list).
● Reduced Dependency on Static KBs: Instead of a fixed internal knowledge base, the
system dynamically retrieves up-to-date CTI and threat data. Agents can call online APIs
(MITRE ATT&CK, AlienVault OTX, etc.) and use real-time threat feeds as additional
knowledge sources. This retrieval-augmented design means the system stays current:
when a new vulnerability or campaign appears online, the next RAG query will
incorporate it automatically.
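As a sketch of the multi-label TTP tagging strategy, the snippet below attaches a 560-way sigmoid head to a DistilBERT encoder. The base checkpoint is not fine-tuned for this task, so outputs are meaningless until trained on CTI sentences labeled with technique IDs; it only illustrates the classification pattern.

```python
# Multi-label ATT&CK tagging sketch: one sigmoid per technique class, since
# a single sentence can evidence several techniques at once.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

NUM_TTPS = 560  # techniques + sub-techniques, matching the cited setup
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=NUM_TTPS,
    problem_type="multi_label_classification",  # BCE loss during fine-tuning
)

text = "The actor used PowerShell to download a second-stage payload."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.sigmoid(logits)[0]
predicted = (probs > 0.5).nonzero(as_tuple=True)[0].tolist()
print(predicted)  # indices map to ATT&CK IDs via a label vocabulary
```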
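The novelty check itself can be prototyped in a few lines. The embedding model and the known-threat descriptions below are assumptions for illustration; the 0.8 threshold is the design value mentioned above and would be tuned on held-out data.

```python
# Embedding-based novelty flagging: compare a candidate entity against an
# index of known-threat descriptions by cosine similarity.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # model choice is illustrative
known = [
    "Emotet delivered via malicious macro documents",
    "Cobalt Strike beacon over HTTPS",
]
known_vecs = model.encode(known, normalize_embeddings=True)

candidate = "Previously unseen loader abusing a signed driver"
cand_vec = model.encode([candidate], normalize_embeddings=True)[0]

# With unit-normalized embeddings, the dot product is cosine similarity.
sims = known_vecs @ cand_vec
NOVELTY_THRESHOLD = 0.8
if sims.max() < NOVELTY_THRESHOLD:
    print("novel entity -> trigger enrichment (VirusTotal, sandbox, ATT&CK)")
else:
    print(f"matches known threat: {known[int(sims.argmax())]}")
```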
Agentic AI Multi-Agent Approach
Our framework treats each component as an autonomous agent empowered by an LLM. In AI
terms, an agent is “a system that can use an LLM to reason through a problem, create a plan to solve it, and execute the plan with tools.” We adopt a multi-agent setup where each agent has a
role (e.g. “Extraction Specialist”, “Graph Reasoner”, “Validation Analyst”). They communicate via
the orchestrator.
● Role Specialization: Each agent has a distinct persona: for example, one prompt might
be “You are a Threat Analyst focusing on extracting Indicators of Compromise,” while
another is “You are a Forensics Expert mapping attack chains.” This follows ProvSEEK’s strategy of role-based prompts to align context. Role specialization minimizes
drift: each agent applies a professional “lens” to its subtask, improving consistency and
interpretability.
● Tool Integration: Agents access external tools to perform actions. For example, the
Graph Agent can run Cypher queries against a Neo4j KG, the RAG Agent can call a
FAISS vector store for nearest-neighbor search, and any agent can invoke REST APIs.
A question about a technique might trigger an ATT&CK API query; an IOC might trigger
a VirusTotal lookup. These tools extend the LLM’s capabilities. Frameworks like
LangChain (which LangGraph builds upon) make it easy to integrate such tools into
prompts. For instance, the agent uses a “Plan Generator” tool to decompose tasks and a “Data Retriever” tool to fetch database results (see the tool-binding sketch after this list).
● Orchestration Frameworks: We will leverage open-source agentic frameworks for
coordination. LangGraph (built on LangChain) models workflows as directed graphs of
components, ideal for visualizing dependencies. Other options include HuggingFace’s
SmolAgents for lightweight tasks. We can also use tools like LangSmith/LangFuse to
monitor agent interactions and logs.
● Knowledge Graph Interaction: The Graph Agent interacts with the KG using queries
and embedding lookups. For example, it might ask “find all attack patterns linked to this
threat actor” and use the graph query result in subsequent reasoning. The agent can
also perform GNN inference in batches (using libraries like PyTorch Geometric) to
update node embeddings and predict edges (a link-prediction sketch follows this list). Each graph traversal or GNN output is logged as an explanation: e.g. “Threat Actor A → uses → Technique T1059 (Scripting) [confidence 0.92]” provides a traceable link.
● Iterative Reasoning Loop: Agents operate in an LLM-based reasoning loop (often
called ReAct). The orchestrator issues a high-level query (e.g. “Extract TTPs from
Report X”), the Extraction Agent responds, the Validation Agent critiques or refines, and
so on. Chain-of-Thought (CoT) prompting within each agent decomposes complex tasks.
If one agent’s result is uncertain, the orchestrator may route it to another agent (or the
same agent with a refinement instruction) for additional evidence. This dynamic planning
mirrors human analysis.
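To illustrate tool integration, the sketch below exposes a Cypher query as a LangChain tool. The Neo4j connection details and the KG schema names (ThreatActor, USES, AttackPattern) are assumptions about how the STIX graph is materialized.

```python
# Sketch: a Cypher query wrapped as an agent tool. Schema labels and
# connection details are assumptions about the deployed KG.
from langchain_core.tools import tool
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

@tool
def attack_patterns_for_actor(actor_name: str) -> list[str]:
    """Return attack patterns linked to a threat actor in the STIX KG."""
    query = (
        "MATCH (a:ThreatActor {name: $name})-[:USES]->(p:AttackPattern) "
        "RETURN p.name AS name"
    )
    with driver.session() as session:
        return [record["name"] for record in session.run(query, name=actor_name)]

# The orchestrator would bind such tools to an LLM, e.g.:
#   llm_with_tools = llm.bind_tools([attack_patterns_for_actor])
```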
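And a minimal PyTorch Geometric sketch of the link-prediction step; random tensors stand in for STIX node features and KG edges, and the encoder is untrained, so the score is only meaningful after training on the real graph.

```python
# GraphSAGE encoder + dot-product decoder for KG link prediction.
import torch
from torch_geometric.nn import SAGEConv

class SAGEEncoder(torch.nn.Module):
    def __init__(self, in_dim: int, hid_dim: int):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hid_dim)
        self.conv2 = SAGEConv(hid_dim, hid_dim)

    def forward(self, x, edge_index):
        h = self.conv1(x, edge_index).relu()
        return self.conv2(h, edge_index)

num_nodes, in_dim = 100, 32
x = torch.randn(num_nodes, in_dim)                 # stand-in node features
edge_index = torch.randint(0, num_nodes, (2, 400)) # stand-in KG edges

encoder = SAGEEncoder(in_dim, 64)
z = encoder(x, edge_index)

# Score a candidate (actor, technique) edge with a dot-product decoder.
src, dst = 3, 42
score = torch.sigmoid((z[src] * z[dst]).sum())
print(f"predicted link probability: {score.item():.2f}")
```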
Alternative Approaches and Comparison
We considered several architectures:
● LLM + Static Graph: A baseline is to use an LLM to extract entities and a static
knowledge graph for relations. This can yield a structured KG, but it lacks dynamic
validation. Without RAG or multi-agent checks, the LLM may hallucinate and the graph
cannot update with new data. For example, GraphRAG systems combine a KG with LLMs, but our agentic design adds autonomy and refreshability.
● Transformer + Knowledge Graph: Here a fine-tuned transformer (e.g. a CTI-focused
BERT) identifies entities/TTPs and updates a graph. This achieves high precision on
known terms, but struggles with novel phrasing and cannot reason beyond its training
data. It also usually assumes a fixed schema. By contrast, our agents can query outside
data and re-run analysis on demand, giving higher recall on unseen variants.
● Dual-LLM: One could use two LLMs: one for extraction, another as a “critic” to validate
(a mini RAG approach). This reduces simple errors, but still operates purely on model
knowledge. Our multi-agent method generalizes this by allowing many specialized
“critics” and tool calls, not just a single validator.
● Fully Agentic: Our agentic architecture adds tool use and real-time feedback. It is more complex to orchestrate but yields the best coverage and adaptability. In practice, systems like ProvSEEK that use an agentic approach reportedly outperform retrieval-based methods in precision, recall, and threat-detection accuracy. We expect similar
gains: by actively querying evidence at each step, our solution maintains higher factuality
and broader coverage than any single-model pipeline.
Model and Tool Recommendations
We propose using a mix of open-source and state-of-the-art models:
● LLMs: For initial extraction and orchestration, capable LLMs include OpenAI’s GPT-4 or
newer (e.g. GPT-4o/5 if available) for their strong reasoning. However, cost and privacy
may favor open models: LLaMA-3 (via Meta) or Falcon can be fine-tuned on CTI texts.
Anthropic’s Claude (if accessible) also offers strong reasoning ability. Domain-adapted models like WizardLM or specialized security LLMs can boost performance. For initial tests, even Llama 2 (7B to 70B) or Mistral 7B, which have shown good CTI extraction ability, would suffice.
● Graph Models: For graph embeddings and link prediction, we recommend Graph
Neural Network libraries (e.g. PyTorch Geometric or DGL) with models like GraphSAGE
or GCN. These handle sparse CTI graphs effectively.
● Transformers for NLP: In addition to the LLMs, transformer encoders fine-tuned on
cybersecurity corpora can help. For example, SecBERT or CyberBERT
(RoBERTa/BERT models trained on security data) can improve the entity/TTP tagging
step. SciBERT might also help with technical jargon in CTI reports. We can also leverage models like DeBERTa or RoBERTa for NER tasks on IOCs and TTP names (an NER sketch appears after this list).
● Agent Frameworks: For orchestration, LangGraph’s graph-based workflows align with our needs. HuggingFace’s SmolAgents can serve lightweight tasks. We will
evaluate these frameworks for ease of integration and monitoring (e.g. using LangSmith
for logs).
● Retrieval and Vector Stores: We will use vector databases such as FAISS or Milvus to store embeddings of CTI documents, enabling fast RAG retrieval when validating IOCs (a retrieval sketch appears after this list). We will also use the stix2 Python library to construct and validate bundles, and common HTTP libraries (requests, httpx) to call CTI APIs.
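As a sketch of the NER step, the snippet below runs a general-purpose token-classification checkpoint (dslim/bert-base-NER); it stands in for a security-tuned model such as a SecBERT fine-tuned on IOC/TTP annotations, which would replace it in practice.

```python
# Transformer NER sketch; the checkpoint is a general-purpose stand-in
# for a security-domain token classifier.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="dslim/bert-base-NER",
    aggregation_strategy="simple",  # merge word pieces into entity spans
)
text = "FIN7 deployed Carbanak through phishing emails targeting PoS systems."
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 2))
```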
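Finally, a minimal sketch of the RAG retrieval path: embed CTI snippets, index them with FAISS, and fetch the best-supported evidence for a validation query. The embedding model and snippets are illustrative.

```python
# FAISS retrieval sketch for the validation step. Inner product on
# unit-normalized vectors is equivalent to cosine similarity.
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # model choice is illustrative
snippets = [
    "APT29 used spearphishing links in the 2020 campaign.",
    "The loader establishes persistence via a scheduled task.",
]
vecs = model.encode(snippets, normalize_embeddings=True)

index = faiss.IndexFlatIP(vecs.shape[1])
index.add(vecs)

query = model.encode(["How does the malware persist?"], normalize_embeddings=True)
scores, ids = index.search(query, 1)  # top-1 supporting snippet
print(snippets[ids[0][0]], float(scores[0][0]))
```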