RAG:
RAG stands for Retrieval Augmented Generation. It is the process of optimizing the
output of a large language model so that it references an external knowledge base
outside of its training data sources before generating a response. Large Language
Models (LLMs) are trained on vast volumes of data and use billions of parameters to
generate original output for tasks like answering questions, translating languages,
and completing sentences. RAG extends the already powerful capabilities of LLMs to
specific domains or an organization's internal knowledge base, all without the need
to retrain the model. It is a cost-effective approach to improving LLM output so that
it remains relevant, accurate, and useful in various contexts.
In other words, RAG is an AI framework that combines the strengths of traditional
information retrieval systems (such as search engines and databases) with the
capabilities of generative large language models (LLMs).
Steps involved in a RAG system:
1. Data preparation:
a. Raw data sources:
This is the initial stage where raw, unstructured data is collected from
diverse sources such as PDF documents, web pages, internal
databases, etc. These sources contain valuable domain-specific
knowledge that is not necessarily part of a pre-trained language
model’s internal parameters. This raw information is essential because
a RAG system depends on external knowledge to provide accurate and
grounded responses.
However, in its current state, this raw data is not directly usable by AI
models due to inconsistencies in formatting, structure, and content
types.
Note: In generative AI, grounding refers to the process of connecting a
large language model's (LLM) output to verifiable sources of
information, ensuring that the AI's responses are accurate, reliable, and
grounded in reality, rather than relying solely on its internal knowledge.
This is crucial for reducing "hallucinations", i.e., instances where the AI
makes up information.
b. Information Extraction:
Once the raw data is gathered, the next task is to extract useful
information from it. This involves using tools like OCR (Optical
Character Recognition) to digitize scanned documents, PDF parsers to
read and convert PDFs, web crawlers to scrape HTML content, and
other extraction tools for CSVs, images, or audio. The goal is to
standardize the content into a structured or semi-structured plain text
format. For instance, metadata, headings, paragraphs, tables, and
images from a PDF report might be cleaned and rearranged into plain
text. This step ensures the information is clean, readable, and ready for
processing. Without this, the content might contain noise, repeated
patterns, or irrelevant characters that could reduce the quality of
embeddings and retrievals.
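As a concrete illustration, below is a minimal sketch of the extraction step for a
single PDF using the pypdf library (one possible parser among many; the file name
is hypothetical):

from pypdf import PdfReader

def extract_pdf_text(path: str) -> str:
    # Read every page of the PDF and return its text as one cleaned string.
    reader = PdfReader(path)
    pages = [page.extract_text() or "" for page in reader.pages]
    # Collapse repeated whitespace so downstream steps see clean text.
    return " ".join(" ".join(pages).split())

# Hypothetical usage:
# text = extract_pdf_text("company_policies.pdf")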
c. Chunking:
Chunking refers to separating text into manageable units.
After the data has been extracted and cleaned, it is split into smaller,
manageable units called chunks. This process is important because
modern AI models, including those used for embeddings and
generation (like BERT, OpenAI Embeddings, etc.), have token limits.
Chunking ensures that the data is broken down into semantically
meaningful segments such as paragraphs, sentences, or even
sections, depending on the context and the desired granularity
(the level of detail at which data is stored and analyzed). These
chunks act as the atomic units for
storage and retrieval later in the pipeline. Importantly, good chunking
also considers the context window to maintain continuity across
segments. For example, blindly cutting every 100 words might break
sentences, so smarter approaches use semantic or sentence-aware
chunking strategies.
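One possible sentence-aware chunking strategy is sketched below; the chunk size
and overlap values are illustrative assumptions, not recommendations:

import re

def chunk_text(text: str, max_chars: int = 500, overlap_sentences: int = 1) -> list[str]:
    # Split on sentence boundaries, then pack whole sentences into chunks,
    # carrying the last sentence(s) of each chunk into the next one so that
    # context is not lost at chunk boundaries.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current = [], []
    for sentence in sentences:
        if current and sum(len(s) for s in current) + len(sentence) > max_chars:
            chunks.append(" ".join(current))
            current = current[-overlap_sentences:]
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks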
d. Embedding:
Each chunk is then passed through an embedding model, which
transforms it into a high-dimensional vector representation. These
embeddings capture the semantic meaning of the text rather than just
its surface form. That means two chunks with different words but
similar meanings will have embeddings close to each other in the
vector space. For example, “How to reset my password?” and “Steps to
change my login credentials” would produce similar vectors.
These vectors are then stored in a vector database (e.g., FAISS,
Pinecone, Chroma), which allows efficient similarity search. This step is
crucial because it enables fast and accurate retrieval based on
meaning rather than keyword matching.
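The following sketch shows how chunks might be embedded and indexed, assuming
the sentence-transformers library and FAISS (the model name and sample chunks
are illustrative assumptions):

import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

# Model name is an assumed example; any sentence-embedding model works similarly.
model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "How to reset my password?",
    "Steps to change my login credentials",
    "Electronics may be returned within 15 days of purchase with the original receipt.",
]

# Encode chunks into vectors; normalizing them lets the inner product
# behave like cosine similarity.
embeddings = np.asarray(
    model.encode(chunks, normalize_embeddings=True), dtype="float32"
)

index = faiss.IndexFlatIP(embeddings.shape[1])  # exact inner-product index
index.add(embeddings)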
2. RAG Workflow:
a. Query:
This begins the retrieval phase. A user sends a natural language
question or request called the “query.” For example, someone might
ask, “What is the return policy on electronics?” This query, though it
looks simple, can vary significantly in wording and requires contextual
understanding to be answered properly.
The raw query text is not useful on its own. It must be transformed to
match the vector space of the preprocessed knowledge base (the
vector database we created during data preparation). That leads us to
the next step.
b. Embedding the query:
Just like the data chunks processed earlier, the user’s query is
passed through the same embedding model. This generates a query
vector that lives in the same high-dimensional semantic space as the
stored data. This transformation ensures that
instead of relying on exact keywords, the system can retrieve results
that are semantically aligned with the user’s intent. This vector is now
ready to be used in a similarity search within the vector database to
find the most relevant data.
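Continuing the earlier sketch, embedding the query is simply a reuse of the same
model:

query = "What is the return policy on electronics?"
query_vector = np.asarray(
    model.encode([query], normalize_embeddings=True), dtype="float32"
)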
c. Vector Search:
The query vector is compared against the stored vectors in the vector
database using similarity metrics like cosine similarity or inner product.
The database returns the top n most relevant chunks that are closest to
the query vector. These retrieved pieces of information (e.g., top 5 or
top 10 chunks) are considered the most contextually relevant and are
assumed to contain the answer or background knowledge needed to
fulfill the user’s request. This is the “retrieval” part of Retrieval
Augmented Generation. These results are then sent forward for
synthesis.
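Continuing the same sketch, the similarity search against the FAISS index might look
like this (the value of top_k is an illustrative choice):

top_k = 3  # illustrative value; real systems often retrieve 5-10 chunks
scores, indices = index.search(query_vector, top_k)
retrieved_chunks = [chunks[i] for i in indices[0]]
# With the toy data above, the returns-policy chunk should rank highest.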
d. Augmentation:
Now, the retrieved relevant chunks are combined with the original user
query to form an augmented prompt. This prompt typically includes
both the user’s original question and the retrieved context (in a format
like: “Context: [retrieved text] \n\n Question: [user query]”). This
augmented input is what is fed into the Large Language Model (LLM).
Because the LLM now has access to real-world, specific information
(that may not have been part of its training set), it can generate more
accurate, up-to-date, and grounded answers. This step is critical in
preventing hallucinations: the model no longer needs to guess or make
up answers, because the relevant context is provided directly.
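A minimal sketch of assembling the augmented prompt in the format described
above:

context = "\n".join(retrieved_chunks)
augmented_prompt = f"Context: {context}\n\nQuestion: {query}"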
e. Generate Response:
Finally, the LLM processes the combined context and query and
generates a natural language response. Since it is equipped with
relevant, up-to-date knowledge from the retrieval step, the output tends
to be far more accurate and specific than responses generated by a
traditional, standalone LLM. For instance, the system can respond
with: “According to the company’s electronics return policy, items must
be returned within 15 days of purchase with the original receipt.” This
response is tailored, context-aware, and reliable because it is grounded
in actual documentation retrieved earlier.
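To round out the sketch, the augmented prompt can be sent to an LLM. The example
below assumes the OpenAI Python client and an example model name; any
chat-capable LLM could stand in here:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed example model name
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": augmented_prompt},
    ],
)
print(response.choices[0].message.content)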