Example 2

The document outlines a chatbot pipeline that processes user queries and optional PDF attachments, using a Graph RAG & Structuring Agent to gather relevant financial data. It emphasizes a single-flow design for simplicity, modular extensibility for future enhancements, and optimization strategies for model selection and resource efficiency. Hardware requirements include a powerful CPU, ample memory, a high-performance GPU, and fast storage.

Uploaded by Skander Dinari

Figure 1: Chatbot Pipeline

1. User Input
The user can supply only a query or a query plus a PDF.

2. PDF Attached? (Decision Node)


Yes:
• The document is classified as scanned or digital.
• The appropriate text extraction (OCR or digital parsing) is performed.
• The text is then normalized and passed to the Graph RAG & Structuring Agent.
No (Query Only) → Use Graph RAG:
• The query goes directly to the Graph RAG & Structuring Agent.
• Relevant financial data is pulled from its knowledge base.
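The decision node above can be sketched as a small routing function. The classifier heuristic and the extraction/normalization stubs below are hypothetical placeholders, not a real API:

```python
from typing import Optional

def classify_pdf(pdf_bytes: bytes) -> str:
    # Toy heuristic: digital PDFs carry text-drawing operators ("BT" blocks);
    # a real classifier would inspect the page content streams properly.
    return "digital" if b"BT" in pdf_bytes else "scanned"

def ocr_extract(pdf_bytes: bytes) -> str:       # stand-in for an OCR engine
    return "<ocr text>"

def digital_extract(pdf_bytes: bytes) -> str:   # stand-in for a PDF text parser
    return "<digital text>"

def normalize(text: str) -> str:
    # Collapse whitespace; real normalization would also fix encodings, etc.
    return " ".join(text.split())

def handle_input(query: str, pdf_bytes: Optional[bytes] = None) -> dict:
    """Route the request: extract and normalize text when a PDF is attached,
    otherwise pass the query through with empty document text."""
    if pdf_bytes is None:
        return {"query": query, "doc_text": ""}
    extractor = digital_extract if classify_pdf(pdf_bytes) == "digital" else ocr_extract
    return {"query": query, "doc_text": normalize(extractor(pdf_bytes))}
```

Both branches produce the same shape of output, which is what lets the rest of the pipeline stay a single flow.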

3. Graph RAG & Structuring Agent


Gathers and structures relevant data (tables, market info, regulatory filings, etc.) based on:
• The user’s query alone, or
• The combination of the query + extracted document text.
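As a rough illustration of the retrieval half of this agent, a graph walk can start from entities mentioned in the query and collect attached facts. The entities and facts below are invented toy data; a production system would query a graph database plus a vector index:

```python
# Toy knowledge graph: node -> {facts, outgoing edges}. All content is invented.
GRAPH = {
    "acme": {"facts": ["Acme Corp files annual 10-K reports."], "edges": ["10-k"]},
    "10-k": {"facts": ["A 10-K is an annual regulatory filing."], "edges": []},
}

def retrieve(query: str, graph: dict, hops: int = 1) -> list:
    """Collect facts from query-matched nodes and their neighbors up to `hops` away."""
    frontier = [t for t in query.lower().split() if t in graph]
    seen, facts = set(), []
    for _ in range(hops + 1):
        nxt = []
        for node in frontier:
            if node in seen:
                continue
            seen.add(node)
            facts += graph[node]["facts"]
            nxt += graph[node]["edges"]
        frontier = nxt
    return facts
```

The structuring half (tables, filings) would then format these facts into the context string handed to the merge step.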

4. Merge Processed Doc & Query
Combines all text and context into a single prompt for the next step.
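The merge step amounts to a small prompt builder; the section labels below are illustrative, not prescribed by the pipeline:

```python
def merge_prompt(query: str, rag_context: str, doc_text: str = "") -> str:
    """Assemble retrieved context, optional document text, and the query
    into one prompt for the Finance LLM Agent."""
    sections = [("Retrieved context", rag_context),
                ("Attached document", doc_text),
                ("User question", query)]
    # Empty doc_text is skipped, so the query-only path needs no special case.
    return "\n\n".join(f"{label}:\n{body}" for label, body in sections if body)
```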
5. Finance LLM Agent
Generates the response using the merged prompt, leveraging domain-specific financial training.
6. Feedback Loop
Collects user input on the quality or accuracy of the response, used for iterative improvements.
7. Enhanced Output
Returns the final answer to the user, incorporating relevant context and data.

Design Considerations
Single, Consistent Flow
• Maintaining one path through the pipeline (Graph RAG → Merge → LLM) avoids branching logic
that can complicate maintenance.
• If there’s no PDF, the “Merge” step simply merges the user’s query with the Graph RAG–retrieved
data, acting as a pass-through.

Modular Extensibility
• Later, you may add other optional data sources (e.g., user profile, previously uploaded documents, or
real-time market data). The “Merge” block is a natural place to combine them.
• Having a single merge node means no separate path is needed for the “no document” case.

Simplified Code and Orchestration


• Splitting the pipeline into separate routes (one bypassing “Merge” and one that doesn’t) introduces
extra branching or code paths.
• By treating “no PDF data” as an empty or null input, the “Merge” step still processes the user query
plus whatever Graph RAG context is available.
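The single-flow argument above reduces the orchestrator to one linear function; the component functions here are trivial stand-ins for the real agents:

```python
def extract_text(pdf_bytes: bytes) -> str:
    return "<extracted document text>"        # stand-in for the extraction stage

def graph_rag_retrieve(query: str) -> str:
    return "<retrieved financial context>"    # stand-in for the Graph RAG agent

def merge(query: str, context: str, doc_text: str) -> str:
    parts = [context, doc_text, query]
    return "\n\n".join(p for p in parts if p)  # empty doc_text simply drops out

def finance_llm(prompt: str) -> str:
    return f"<answer based on {len(prompt)} prompt chars>"  # stand-in for the LLM

def run_pipeline(query: str, pdf_bytes: bytes = None) -> str:
    """One route for both cases: 'no PDF' is just empty document text."""
    doc_text = extract_text(pdf_bytes) if pdf_bytes else ""
    context = graph_rag_retrieve(query)
    return finance_llm(merge(query, context, doc_text))
```

There is no second code path to test or maintain: attaching a PDF only changes the contents of `doc_text`, not the shape of the flow.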

Overview of the Model Selection & Optimization Approach


1. Multimodal OCR
• Two Pre-Trained Models: Select two leading OCR solutions (e.g., from research papers or industry
benchmarks).
• Evaluation: Use a financially annotated dataset (tables, financial terms) to measure key metrics such
as Character/Word Error Rate and table-structure accuracy.
• Model Selection: Choose the model with the best overall performance (lowest errors, highest quality
output) based on standardized evaluation methods.
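Character Error Rate, one of the metrics named above, is the edit distance between OCR output and the reference text, normalized by reference length. A minimal implementation:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming recurrence,
    keeping only the previous row to stay O(len(b)) in memory."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edits needed per reference character."""
    return edit_distance(reference, hypothesis) / max(len(reference), 1)
```

Word Error Rate is the same computation over token lists instead of characters; table-structure accuracy needs a separate, structure-aware metric.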

2. Finance LLM
• Two Pre-Trained Finance Models: Identify two specialized large language models tuned for financial text.
• Testing Methods: Compare their domain-specific accuracy (factual consistency, clarity) using recognized finance NLP benchmarks.
• Best Model Choice: Select the LLM with superior performance on a set of financial queries or tasks.
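The head-to-head comparison can be sketched as scoring each candidate on an evaluation set and keeping the higher mean; the models and scoring rule here are toy stand-ins, not real finance benchmarks:

```python
def pick_best_model(candidates, eval_items, score):
    """candidates: {name: answer_fn}; eval_items: [(query, gold)];
    score(prediction, gold) -> float in [0, 1]. Returns (best name, all means)."""
    means = {
        name: sum(score(fn(q), gold) for q, gold in eval_items) / len(eval_items)
        for name, fn in candidates.items()
    }
    return max(means, key=means.get), means
```

In practice the scoring function would be a benchmark-specific metric (exact match, factual-consistency score) rather than the toy exact-match rule used in the example.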

3. Pipeline Optimization
• Efficiency Focus: Use architectures and techniques that minimize GPU/CPU consumption (e.g.,
quantization, pruning).
• Goal: Maintain strong performance for both OCR and LLM while reducing inference costs and resource
usage.
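As a rough sanity check on the memory side of these optimizations: weight storage scales with parameter count times bits per weight (ignoring activations and KV cache). For a hypothetical 7B-parameter model:

```python
def model_weight_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GiB: params x bits / 8 bytes per byte, / 2**30."""
    return n_params * bits_per_weight / 8 / 2**30

for bits in (16, 8, 4):
    print(f"7B parameters @ {bits}-bit: {model_weight_gib(7e9, bits):.1f} GiB")
    # prints roughly 13.0, 6.5, and 3.3 GiB respectively
```

This back-of-envelope figure is why 8-bit or 4-bit quantization can fit models on the 16–24 GB GPUs listed below when their fp16 weights alone would be tight.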

Hardware
– CPU: 8–12 cores for efficient preprocessing and orchestration.

– Memory: 32 GB of system RAM to manage resources effectively.


– GPU: High-memory GPU with at least 16–24 GB of VRAM (e.g., NVIDIA RTX 3090, RTX
A5000, or A6000) for fast inference and handling large model parameters.
– Storage: Fast NVMe SSD (500 GB or larger) for quick model loading and data caching.
