RAG System
From data ingestion to evaluation
Design Notes
© 2025. Concise class notes for engineers building retrieval-augmented generation.
1. End-to-end pipeline
Overview
Ingest → Chunk → Embed → Index → Retrieve → Rerank → Generate → Post-process → Evaluate
Design for observability: store query, retrieved IDs, scores, and final output together with versioned embeddings (trace-record sketch below).
Keep artifacts versioned: docs, chunks, embedding model, index params, prompts.
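A minimal sketch of the per-query trace record these two points imply, written as JSON lines; the field names and the shape of versions are illustrative assumptions, not a fixed schema.

# Sketch: one JSON-lines trace record per query, with versioned artifacts attached.
import json, time

def log_trace(path, query, retrieved, answer, versions):
    record = {
        "ts": time.time(),
        "query": query,
        "retrieved_ids": [c["doc_id"] for c in retrieved],
        "scores": [c["score"] for c in retrieved],
        "answer": answer,
        "versions": versions,   # e.g. {"embedding_model": ..., "index_params": ..., "prompt": ...}
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")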
2. Chunking & embeddings
2.1 Chunking
Semantic + structural chunking (headings, code blocks).
Overlap only as needed (e.g., 10–15%); store metadata: source, section, timestamp (chunking sketch below).
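A minimal sketch of structure-aware chunking with overlap, assuming markdown-style headings mark section boundaries; the sizes, regex, and metadata fields are illustrative.

# Sketch: split on headings, then window each section with ~15% overlap.
import hashlib, re, time

def chunk_document(text, source, max_chars=1200, overlap=0.15):
    sections = re.split(r"\n(?=#{1,3} )", text)          # keep headings with their section
    step = int(max_chars * (1 - overlap))
    chunks = []
    for section in sections:
        if not section.strip():
            continue
        heading = section.strip().splitlines()[0]
        for start in range(0, len(section), step):
            body = section[start:start + max_chars]
            if body.strip():
                chunks.append({
                    "text": body,
                    "source": source,
                    "section": heading,
                    "timestamp": time.time(),
                    "doc_id": hashlib.sha1(f"{source}:{heading}:{start}".encode()).hexdigest()[:12],
                })
    return chunks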
2.2 Embeddings
Use domain-fit models where possible; normalize vectors; monitor drift after model upgrades.
Store text, vector, and a hash of the pre-processing pipeline for reproducibility (record sketch below).
Index hygiene beats parameter tweaking: garbage in, garbage out.
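A sketch of the embedding record implied above, with unit-normalization and a pre-processing hash; embed() stands in for whatever model or API you call, and the pipeline description string is an assumption.

# Sketch: normalized embedding record with a reproducibility hash.
import hashlib
import numpy as np

PREPROC_DESC = "lowercase + strip-boilerplate + chunker-v3"   # describe your actual pipeline

def embedding_record(chunk, embed, model_name):
    vec = np.asarray(embed(chunk["text"]), dtype=np.float32)
    vec /= np.linalg.norm(vec) + 1e-12          # unit norm: cosine similarity == dot product
    return {
        "text": chunk["text"],
        "vector": vec.tolist(),
        "embedding_model": model_name,
        "preproc_hash": hashlib.sha256(PREPROC_DESC.encode()).hexdigest()[:16],
    }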
3. Retrieval & reranking
3.1 Retrieval recipes
Hybrid retrieval (BM25 + vector) improves recall; add filters on metadata (time, author).
Multi-query expansion: rewrite the user query into N paraphrases and merge the top-k results (sketch after the hybrid pseudocode below).
# Pseudocode: hybrid search (dense vector + sparse BM25 hits, merged, then reranked).
# dense_index / bm25_index are placeholder names for your vector and keyword stores.
dense = dense_index.search(query, k=20)
sparse = bm25_index.search(query, k=20)
results = rerank(query, dense + sparse)[:k]
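A sketch of the multi-query expansion recipe from the list above; paraphrase() stands in for an LLM rewrite call and hybrid_search() wraps the hybrid recipe just above, so both names are assumptions.

# Sketch: multi-query expansion (paraphrase, search each variant, keep each doc's best score).
def multi_query_search(query, paraphrase, hybrid_search, n_variants=3, k=10):
    variants = [query] + paraphrase(query, n=n_variants)   # include the original query
    best = {}                                               # doc_id -> (score, hit)
    for q in variants:
        for hit in hybrid_search(q, k=k):
            if hit["doc_id"] not in best or hit["score"] > best[hit["doc_id"]][0]:
                best[hit["doc_id"]] = (hit["score"], hit)
    merged = sorted(best.values(), key=lambda t: t[0], reverse=True)
    return [hit for _, hit in merged[:k]]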
3.2 Reranking
Cross-encoder rerankers can boost precision@k; cache per-(query, passage) scores aggressively (sketch below).
Aim for high recall first, then increase precision with rerankers and filters.
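A sketch of cross-encoder reranking with an in-memory score cache, using sentence-transformers' CrossEncoder as one possible backend; the model choice and cache size are assumptions, and rerank() matches the call in the pseudocode above.

# Sketch: cross-encoder reranking with cached per-(query, passage) scores.
from functools import lru_cache
from sentence_transformers import CrossEncoder   # pip install sentence-transformers

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")   # illustrative model

@lru_cache(maxsize=100_000)
def pair_score(query, passage):
    return float(model.predict([(query, passage)])[0])

def rerank(query, hits):
    # Highest cross-encoder score first; the caller slices to the final k.
    return sorted(hits, key=lambda h: pair_score(query, h["text"]), reverse=True)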
4. Generation & guardrails
Prompt shape
Instructions + citations requirement + JSON schema for answers.
Insert retrieved chunks with clear separators; limit max tokens (assembly sketch after the template).
SYSTEM: Answer using ONLY supplied context. If missing, say you don't know.
CONTEXT:
<<<chunk 1>>>
<<<chunk 2>>>
OUTPUT: JSON {"answer": string, "citations": [doc_id], "confidence": 0..1}
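One way to assemble that prompt under a token budget; the separators mirror the template, while count_tokens() and the budget are assumptions standing in for your tokenizer and context limits.

# Sketch: build the prompt from retrieved chunks with clear separators and a token budget.
SYSTEM = "Answer using ONLY the supplied context. If the answer is missing, say you don't know."

def build_prompt(question, chunks, count_tokens, max_context_tokens=3000):
    parts, used = [], 0
    for i, ch in enumerate(chunks, start=1):
        block = f"<<<chunk {i} | {ch['doc_id']}>>>\n{ch['text']}\n"
        cost = count_tokens(block)
        if used + cost > max_context_tokens:
            break                                   # stop before exceeding the context budget
        parts.append(block)
        used += cost
    return (SYSTEM + "\nCONTEXT:\n" + "\n".join(parts)
            + f"\nQUESTION: {question}\n"
            + 'OUTPUT: JSON {"answer": string, "citations": [doc_id], "confidence": 0..1}')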
Guardrails
Grounding check: ask the model to quote exact spans before answering.
Toxicity/redaction passes on output; domain allow-lists for sources.
Schema-bound outputs reduce hallucinations and simplify UI rendering (validation sketch below).
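A sketch of the post-generation checks above: parse the JSON, require the template's fields, confirm citations stay inside the retrieved set, and verify any quoted spans actually appear in the context. The quoted_spans field is an assumption tied to the grounding-check bullet, not part of the template's schema.

# Sketch: validate the model's JSON output and check grounding against retrieved chunks.
import json

def validate_answer(raw_output, retrieved):
    answer = json.loads(raw_output)                        # raises on malformed JSON
    assert {"answer", "citations", "confidence"} <= set(answer), "missing required fields"
    allowed_ids = {c["doc_id"] for c in retrieved}
    assert set(answer["citations"]) <= allowed_ids, "citation outside the retrieved set"
    context = " ".join(c["text"] for c in retrieved)
    for span in answer.get("quoted_spans", []):            # grounding quotes, if requested
        assert span in context, f"ungrounded quote: {span[:60]}"
    return answer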
5. Evaluation & observability
Offline eval
IR metrics: recall@k, nDCG; QA metrics: answerable/unanswerable handling, exact match, citation accuracy.
Use a labeled set of question/gold-span pairs; refresh it monthly (metric sketch below).
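A sketch of the IR metrics named above, computed per query with binary relevance labels and averaged over the labeled set.

# Sketch: recall@k and nDCG@k for one query, given binary relevance labels.
import math

def recall_at_k(retrieved_ids, relevant_ids, k):
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / max(len(relevant_ids), 1)

def ndcg_at_k(retrieved_ids, relevant_ids, k):
    dcg = sum(1.0 / math.log2(i + 2)
              for i, doc in enumerate(retrieved_ids[:k]) if doc in relevant_ids)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant_ids), k)))
    return dcg / ideal if ideal else 0.0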
Online eval
Collect user feedback; detect answer changes vs. the baseline; monitor latency and cost per query.
Shadow-deploy new indexes/models; A/B test prompt variants (shadow-comparison sketch below).
Ship dashboards: retrieval quality, latency, costs, and safety incidents.
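A sketch of shadow-mode comparison for the online checks above: serve the baseline answer, run the candidate pipeline on the same query, and log latency plus answer divergence for review. The pipeline callables, similarity measure, and 0.9 threshold are all assumptions.

# Sketch: shadow-deploy comparison (log candidate vs. baseline answers and latencies).
import difflib, json, time

def shadow_compare(query, baseline_pipeline, candidate_pipeline, log_path="shadow_log.jsonl"):
    t0 = time.time(); base = baseline_pipeline(query); base_ms = (time.time() - t0) * 1000
    t1 = time.time(); cand = candidate_pipeline(query); cand_ms = (time.time() - t1) * 1000
    similarity = difflib.SequenceMatcher(None, base["answer"], cand["answer"]).ratio()
    with open(log_path, "a") as f:
        f.write(json.dumps({
            "query": query,
            "baseline_answer": base["answer"],
            "candidate_answer": cand["answer"],
            "answer_similarity": round(similarity, 3),
            "changed": similarity < 0.9,                  # flag for human review
            "baseline_latency_ms": round(base_ms),
            "candidate_latency_ms": round(cand_ms),
        }) + "\n")
    return base                                           # users still see the baseline answer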