Understanding Retrieval Augmented Generation

Retrieval-Augmented Generation (RAG) is a technique that enhances large language models (LLMs) by integrating external authoritative knowledge bases to provide up-to-date and accurate responses. The process involves data collection, chunking, embedding, handling user queries, and generating responses, with key components including a retriever and generator. RAG applications span various fields, such as customer support, content creation, and education, while also facing challenges like integration complexity, scalability, and data quality.

Uploaded by

toumi.mohameddhia

Chapter 8: Retrieval-Augmented Generation (RAG)
Unit: Advanced DL

What is it?
• LLMs don't have any up-to-date information past
their training cut-off, and they don’t know private
and proprietary information. Retrieval-augmented
generation (RAG) is the technique that helps
address these limitations.
• Retrieval-Augmented Generation (RAG) is the
process of optimizing the output of a large
language model (LLM), so it references an
authoritative knowledge base outside of its
training data sources before generating a response.
• RAG involves several steps: data collection, data
chunking, document embeddings, handling user
queries, and generating responses using an LLM.
RAG
• Retrieval-Augmented Generation (RAG) is an
advanced artificial intelligence (AI) technique.
• It combines the capabilities of a pre-trained large
language model with an external data source.
• This approach merges the generative power of
LLMs with the precision of specialized data search
mechanisms.
• The result is a system that can provide more
nuanced and accurate responses.
Architecture & Key Components
• At a high level, there are two components: the retriever and the generator. As the name suggests, the
retriever is responsible for retrieving the information, and the generator is the LLM, used to generate the
text.
• Retrieval and pre-processing:
• RAG systems use search algorithms to query external data, such as web pages, knowledge bases, and databases.
• Once retrieved, the relevant information undergoes pre-processing, including tokenization, stemming, and removal of stop
words.
• Grounded generation:
• The pre-processed retrieved information is then seamlessly incorporated into the input (context) of the pre-trained LLM.
• This integration enhances the LLM's context, providing it with a more comprehensive understanding of the topic.
• This augmented context enables the LLM to generate more precise, informative, and engaging responses.
• Key Components:
• The knowledge base
• The retriever
• The integration layer
• The generator
• The ranker
• The output handler
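The pre-processing steps named above (tokenization, stop-word removal, stemming) can be sketched in a few lines of Python. This is a minimal illustration only: the stop-word list and the suffix-stripping rule are assumptions made for the example, not part of the slides, and a real system would more likely rely on an established NLP library.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much larger).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "for", "to", "and", "in", "on"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def naive_stem(token: str) -> str:
    """Strip one common English suffix (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> list[str]:
    """Tokenize, drop stop words, then stem the remaining tokens."""
    return [naive_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


print(preprocess("The return policy for damaged items"))
```

Note that many embedding-based retrievers skip stemming and stop-word removal entirely; these steps matter mostly for keyword-style search.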
Vector Database
• A vector database stores, manages and
indexes high-dimensional vector data.
• Data points are stored as arrays of numbers
called "vectors," which are clustered based on
similarity.
• The best way of chunking a long text will
depend on the types of texts and queries your
system anticipates:
• Each sentence is a chunk
• Each paragraph is a chunk
• Overlapping window of paragraphs
• Vector database examples:
• ChromaDB, Pinecone, and FAISS (FAISS is a similarity-search library rather than a full database)
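The three chunking strategies listed on this slide can be sketched as follows. The function names, the sentence-splitting regular expression, and the default window size are assumptions chosen for illustration; which strategy works best still depends on the texts and queries the system expects.

```python
import re


def sentence_chunks(text: str) -> list[str]:
    """Each sentence is a chunk (split after '.', '!' or '?' plus whitespace)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def paragraph_chunks(text: str) -> list[str]:
    """Each paragraph is a chunk (paragraphs are separated by blank lines)."""
    parts = re.split(r"\n\s*\n", text.strip())
    return [p.strip() for p in parts if p.strip()]


def overlapping_paragraph_chunks(text: str, window: int = 2) -> list[str]:
    """Sliding window of `window` paragraphs that advances one paragraph at a time,
    so consecutive chunks share window - 1 paragraphs."""
    paragraphs = paragraph_chunks(text)
    if len(paragraphs) <= window:
        return ["\n\n".join(paragraphs)] if paragraphs else []
    return [
        "\n\n".join(paragraphs[i : i + window])
        for i in range(len(paragraphs) - window + 1)
    ]


sample = "A one. A two.\n\nB one.\n\nC one."
print(sentence_chunks(sample))
print(paragraph_chunks(sample))
print(overlapping_paragraph_chunks(sample, window=2))
```

Overlapping windows cost extra storage, but they reduce the chance that an answer is split across a chunk boundary.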
RAG Applications with Examples
• Example: Advanced Question-Answering System
Scenario: Imagine a customer support chatbot for an online store. A customer asks, "What is the return policy for a damaged item?"
RAG in action: The chatbot retrieves the store's return policy document from its knowledge base. RAG then uses this information to generate a clear and concise answer like, "If your item is damaged upon arrival, you can return it free of charge within 30 days of purchase. Please visit our returns page for detailed instructions."
• Example: Content Creation and Summarization
Scenario: You're building a travel website and want to create a summary of the Great Barrier Reef.
RAG in action: RAG can access and process vast amounts of information about the Great Barrier Reef from various sources. It can then provide a concise summary highlighting key points like its location, size, biodiversity, and conservation efforts.
• Example: Educational Tools and Resources
Scenario: An online learning platform for science courses. A student is studying the human body and has a question about the function of the heart.
RAG in action: The platform uses RAG to access relevant information about the heart's anatomy and function from the course materials. It then presents the student with an explanation, diagrams, and perhaps even links to video resources, all tailored to their specific learning needs.
How Does RAG Work? (1/5)
Step 1: Data collection
• You must first gather all the
data that is needed for your
application.
• Example: in the case of a
customer support chatbot
for an electronics company,
this can include user
manuals, a product
database, and a list of FAQs.
How Does RAG Work? (2/5)
Step 2: Data chunking
• Data chunking involves breaking down large
datasets into smaller, more
focused pieces.
• This makes it easier to retrieve
relevant information and
improves efficiency by avoiding
unnecessary processing.
• By organizing data into specific
topics, you can ensure that
search results are directly
applicable to the user's query.
How Does RAG Work? (3/5)
Step 3: Document embeddings
• Now that the source data has
been broken down into smaller
parts, it needs to be converted
into a vector representation.
• This involves transforming text
data into embeddings, which are
numeric representations that
capture the semantic meaning
behind text.
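To make the idea of turning text into a fixed-length vector concrete, here is a deliberately simple stand-in. The function `toy_embed` is hypothetical and uses hashed word counts, so it only reflects word overlap and does not capture semantic meaning the way a learned embedding model does; in practice it would be replaced by a call to a real embedding model.

```python
import hashlib
import math
import re


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Map text to a unit-length vector of hashed word counts.

    Illustration only: a real RAG system would call a trained embedding model here.
    """
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # Use a stable hash so the same token always lands in the same slot.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim
        vec[index] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    # Normalize to unit length; an empty text stays the zero vector.
    return [x / norm for x in vec] if norm > 0 else vec


vector = toy_embed("How do I reset my router?")
print(len(vector))
```

Every chunk produced in Step 2 would be passed through the same function (or model), and the resulting vectors stored alongside the chunk text.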
How Does RAG Work? (4/5)
Step 4: Handling user queries
• User queries are converted into
embedding or vector representations.
• The same model is used for both
document and query embeddings to
maintain consistency.
• The system compares the query
embedding with document embeddings.
• It retrieves data chunks whose
embeddings are most similar to the
query, using cosine similarity or
Euclidean distance.
• The retrieved chunks are considered the
most relevant to the user’s query.
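The comparison step can be sketched with plain Python lists. Both measures mentioned above are shown; the helper `top_k` and its default of two results are assumptions for the example, and it ranks by cosine similarity (higher is more similar), whereas Euclidean distance ranks the other way (lower is more similar).

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Straight-line distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


doc_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(top_k([1.0, 0.1], doc_vectors, k=2))
```

This brute-force scan compares the query with every stored vector; vector databases exist largely to avoid that cost on large collections.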
How Does RAG Work? (5/5)
Step 5: Generating responses with
an LLM
• The retrieved text chunks, along
with the initial user query, are fed
into a language model.
• The model uses this
information to generate a
coherent response to the user’s
question through a chat
interface.
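One common way to feed the retrieved chunks and the query to a model is to assemble them into a single prompt. The sketch below assumes a prompt wording of our own choosing and treats the model as any callable that maps a prompt string to a reply; `build_prompt`, `answer`, and the `llm` parameter are hypothetical names, not the API of any particular library.

```python
from typing import Callable


def build_prompt(query: str, chunks: list[str]) -> str:
    """Combine numbered context chunks and the user question into one prompt."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


def answer(query: str, chunks: list[str], llm: Callable[[str], str]) -> str:
    """Send the assembled prompt to the model (any prompt -> text callable)."""
    return llm(build_prompt(query, chunks))


retrieved = ["Damaged items can be returned free of charge within 30 days."]
print(build_prompt("What is the return policy for a damaged item?", retrieved))
```

Numbering the chunks makes it easier to ask the model to cite which piece of context supports its answer.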
Indexing stage
To seamlessly accomplish the steps required to generate responses with LLMs, you can use a data
framework like:
• LlamaIndex: enables LLMs to access and interact with data from various sources by indexing it for
efficient querying, without retraining the model.
• LangChain: Designed to help developers create applications that combine LLMs with external
tools and data sources, like databases, APIs, and search systems.
• FAISS (Facebook AI Similarity Search): A library for efficient similarity search and clustering of
dense vectors, used for fast nearest neighbor searches in vector embeddings.
• Pinecone: A managed vector database that enables efficient retrieval of relevant data chunks for
RAG by storing and searching through large sets of embeddings.
• Elasticsearch: Though primarily a text search engine, it supports vector-based search and can be
integrated with LLMs for RAG, combining traditional keyword search with dense retrieval.
Benefits
• Cost-efficient AI implementation and AI scaling
• Access to current domain-specific data
• Lower risk of AI hallucinations
• Increased user trust
• Expanded use cases
• Enhanced developer control and model maintenance
• Greater data security
RAG: challenges (1/3)

• Integration complexity:
It can be difficult to integrate a retrieval system with an LLM. This
complexity increases when there are multiple sources of external data
in varying formats.
• To overcome this challenge, separate modules can be designed to
handle different data sources independently.
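The idea of one module per data source can be sketched as a small registry of loader functions, each turning one format into plain-text documents. The formats, the JSON schema (a list of objects with "question" and "answer" keys), and all function names are assumptions made for this example.

```python
import csv
import io
import json
from typing import Callable

# A loader takes the raw content of one source and returns plain-text documents.
Loader = Callable[[str], list[str]]


def load_plain_text(raw: str) -> list[str]:
    """Treat the whole text (for example, a user manual) as one document."""
    return [raw.strip()] if raw.strip() else []


def load_json_faq(raw: str) -> list[str]:
    """Assumes a JSON list of {"question": ..., "answer": ...} objects."""
    return [f"Q: {item['question']}\nA: {item['answer']}" for item in json.loads(raw)]


def load_csv_products(raw: str) -> list[str]:
    """Assumes a CSV with a header row; each row becomes 'column: value' text."""
    reader = csv.DictReader(io.StringIO(raw))
    return ["; ".join(f"{key}: {value}" for key, value in row.items()) for row in reader]


# Supporting a new format means registering one more loader here.
LOADERS: dict[str, Loader] = {
    "txt": load_plain_text,
    "json": load_json_faq,
    "csv": load_csv_products,
}


def load_documents(sources: list[tuple[str, str]]) -> list[str]:
    """sources is a list of (format, raw_content) pairs."""
    documents: list[str] = []
    for fmt, raw in sources:
        documents.extend(LOADERS[fmt](raw))
    return documents


print(load_documents([("txt", "Manual text."), ("csv", "name,price\nTV,499\n")]))
```

Because every loader returns the same shape, the chunking and embedding steps downstream do not need to know where a document came from.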
RAG: challenges (2/3)
• Scalability:
As the amount of data increases, it gets more challenging to maintain
the efficiency of the RAG system. Many complex operations need to be
performed. These tasks are computationally intensive and can slow
down the system as the size of the source data increases.
• To address this challenge, you can distribute computational load
across different servers and invest in robust hardware infrastructure.
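One way to picture distributing the load is to split the stored vectors into shards, search each shard separately, and merge the partial results. In the sketch below, threads in a single process stand in for separate servers, the score is a plain dot product, and all names are hypothetical; it illustrates the pattern rather than a production design.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# A shard is a list of (doc_id, vector) pairs held by one worker.
Shard = list[tuple[str, list[float]]]


def dot(a: list[float], b: list[float]) -> float:
    """Dot product, used here as the similarity score."""
    return sum(x * y for x, y in zip(a, b))


def search_shard(shard: Shard, query: list[float], k: int) -> list[tuple[float, str]]:
    """Return the k best (score, doc_id) pairs within a single shard."""
    return heapq.nlargest(k, ((dot(query, vec), doc_id) for doc_id, vec in shard))


def sharded_search(shards: list[Shard], query: list[float], k: int = 2) -> list[str]:
    """Search all shards in parallel, then merge the partial top-k lists."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(lambda shard: search_shard(shard, query, k), shards))
    merged = [hit for partial in partials for hit in partial]
    return [doc_id for _, doc_id in heapq.nlargest(k, merged)]


shards = [
    [("a", [1.0, 0.0]), ("b", [0.0, 1.0])],
    [("c", [0.9, 0.1]), ("d", [0.5, 0.5])],
]
print(sharded_search(shards, [1.0, 0.0], k=2))
```

Each shard only has to return its own top k results, so the merge step stays small no matter how large the full collection grows.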
RAG: challenges (3/3)
• Data quality:
The effectiveness of a RAG system depends heavily on the quality of data being
fed into it. If the source content accessed by the application is poor, the responses
generated will be inaccurate.
• To address this challenge, organizations must invest in a diligent content
curation and fine-tuning process. It is necessary to refine data sources to
enhance their quality.
For commercial applications, it can be beneficial to involve a subject matter expert
to review and fill in any information gaps before using the dataset in a RAG
system.
