
Generative AI

Course Code - 21CS4807


Course Instructors - Dr. Natarajan Venkateswaran,
Prof. Sasikala Nagarajan,
Prof. Arjun Krishnamurthy.

Academic Year - 2024–2025


Year / Semester - IV / VIII
Syllabus
Language Model

• What is a Language Model?


• A language model (LM) is a statistical or neural network-based model that
predicts the probability of a sequence of words.

• Example - Given a sentence


• "I love to drink __."

• A language model should predict "coffee", "tea", or "juice" based on previous words.
Language Model

• Why is Language Modeling Important?


• Machine Translation - Predicts the best translation for a given sentence.

• Speech Recognition - Helps convert speech into text accurately.

• Chatbots - Generates human-like responses.

• Search Engines - Suggests relevant queries.


Language Model

• Challenges in Language Modeling


• Data sparsity - Handling rare words.

• Context understanding - Capturing long-range dependencies.


• "The boy who was playing with his friends near the park, despite being tired, decided
to finish his homework before dinner."

• Computational efficiency - Training large models is expensive.


Language Model

Types of Language Models

• Statistical Language Models (SLMs)

• Neural Language Models (NLMs)

• Transformer-Based Language Models
Language Model

• Statistical Language Models (SLMs)

• These models use probability distributions, typically estimated from n-gram counts, to predict the next word (a minimal bigram sketch follows below).
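• Illustrative sketch (added, not from the slides): a minimal bigram (2-gram) model estimated from an invented toy corpus shows how such probabilities are computed.

```python
from collections import Counter, defaultdict

# Invented toy corpus.
corpus = [
    "i love to drink coffee",
    "i love to drink tea",
    "i love to read books",
]

# Count bigram and history (previous-word) frequencies.
bigram_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        bigram_counts[prev][nxt] += 1

def bigram_prob(prev, nxt):
    """Maximum-likelihood estimate: P(nxt | prev) = count(prev, nxt) / count(prev)."""
    total = sum(bigram_counts[prev].values())
    return bigram_counts[prev][nxt] / total if total else 0.0

print(bigram_prob("drink", "coffee"))  # 0.5
print(bigram_prob("drink", "juice"))   # 0.0 - the unseen bigram gets zero probability
```

• The zero probability for the unseen bigram "drink juice" is exactly the data-sparsity problem noted below, which smoothing techniques address.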


Language Model

• Problems with N-Gram Models


• Data sparsity - Rare words have low probabilities.

• Fixed-length context - Cannot capture long-range dependencies.

• Smoothing Techniques in N-Gram Models


• Laplace Smoothing - Adds a small constant count to every n-gram so that unseen n-grams receive a non-zero probability (see the sketch after this list).

• Good-Turing Smoothing - Adjusts probability based on unseen words.

• Kneser-Ney Smoothing - Improves probability distribution for rare words.
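• Illustrative sketch (added): continuing the bigram example above, add-one (Laplace) smoothing changes the estimate to (count + 1) / (count of history + V), where V is the vocabulary size; the corpus and counts are the invented ones from that example.

```python
def laplace_bigram_prob(prev, nxt, vocab_size):
    """Add-one smoothing: P(nxt | prev) = (count(prev, nxt) + 1) / (count(prev) + V)."""
    total = sum(bigram_counts[prev].values())
    return (bigram_counts[prev][nxt] + 1) / (total + vocab_size)

vocab = {word for sentence in corpus for word in sentence.split()}
print(laplace_bigram_prob("drink", "juice", len(vocab)))  # non-zero, unlike the unsmoothed estimate
```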


Language Model
• Neural Language Models (NLMs)

• Deep learning models learn word relationships using embeddings and

neural networks.

• Feedforward Neural Networks for LM


• Uses a fixed window of words as input.

• Predicts the next word using dense layers.

• Still limited by a fixed context size.


Language Model
• Neural Language Models (NLMs)
• Recurrent Neural Networks (RNNs)
• Captures sequential dependencies in text.

• Can process varying-length input.

• Struggles with vanishing gradients in long texts.

• Long Short-Term Memory (LSTM) & Gated Recurrent Units (GRU)


• LSTMs and GRUs solve long-range dependency issues by using gates.

• LSTMs maintain a memory cell, deciding what to keep or forget.


Language Model
• Advanced Language Models (Transformers & Pretrained Models)
• The Transformer architecture (Vaswani et al., 2017) introduced the self-attention

mechanism, which helps capture long-range dependencies more effectively than

RNNs.

• Self-Attention Mechanism

• Assigns different importance (weights) to words in a sequence.

• Example - In "The cat sat on the mat", "cat" should be closely related to "sat", not "mat".
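• Illustrative sketch (added): scaled dot-product self-attention in NumPy; the random matrices stand in for learned query, key, and value projections, so the weights are not meaningful, only the mechanics.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # similarity of each word to every other word
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax -> attention weights
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_model = 6, 8                                   # e.g., the 6 tokens of "The cat sat on the mat"
X = rng.normal(size=(seq_len, d_model))                   # token embeddings (random stand-ins)
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
output, attn = scaled_dot_product_attention(X @ Wq, X @ Wk, X @ Wv)
print(attn.shape)                                         # (6, 6): one weight for every word pair
```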
Language Model
• Advanced Language Models (Transformers & Pretrained Models)
• BERT (Bidirectional Encoder Representations from Transformers)
• Uses bidirectional context, unlike traditional left-to-right models.

• Pretrained on Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) tasks.

• Fine-tuned for specific NLP tasks (e.g., sentiment analysis, Q&A).

• GPT (Generative Pretrained Transformer)


• Autoregressive model - Generates text sequentially.

• Example - GPT-4 is trained on massive datasets for text generation.

BERT: https://www.youtube.com/watch?v=t45S_MwAcOw&list=WL&index=24
Language Model
• Applications of Language Models
• Machine Translation (Google Translate)
• Uses Transformer models like BERT and T5.

• Speech Recognition (Siri, Alexa, Google Assistant)


• Converts speech into text using deep learning-based acoustic models and language models.

• Chatbots (ChatGPT, Bard, LLaMA)


• Uses Transformer-based generative models for realistic conversations.

• Text Summarization (GPT, BART, Pegasus)


• Summarizes large texts while preserving meaning.
Language Model
• Current Challenges
• Bias in training data (e.g., gender, racial biases).
• Ethical concerns (e.g., misinformation, deepfakes).
• Computational cost (large-scale training needs massive GPUs).

• Future Directions
• More efficient models (quantization, pruning, distillation).
• Better multilingual support (handling low-resource languages).
• Ethical AI (reducing bias in NLP models).
Language Model
• Why Are Language Models Important in AI?
• Improve Human-Computer Interaction - AI assistants (e.g., Siri, Alexa, ChatGPT).
• Automate Text-Based Tasks - AI-driven writing, coding, and summarization.
• Enhance Search & Information Retrieval - Google Search uses BERT for better query understanding.
• Multimodal AI - Some models (GPT-4, Gemini) handle text, images, and audio together.

• Real-World Example
• Google uses BERT (Bidirectional Encoder Representations from Transformers) to understand search
queries better.
• OpenAI’s GPT models power ChatGPT, which generates human-like conversations.
LLM

• A Large Language Model (LLM) is an advanced AI system trained on


massive text datasets to understand, process, and generate
human-like text.

• It is built using deep learning techniques, especially Transformers,


to handle a wide range of Natural Language Processing (NLP) tasks.
LLM

• Trained on billions of words from books, websites, research papers,


etc.

• Uses Transformer architecture (e.g., GPT, BERT, T5).

• Capable of text generation, translation, summarization, and


reasoning.

• Can perform zero-shot, few-shot, and fine-tuned learning.


https://youtu.be/N1fbskTpwZ0?si=J_Z3cRIFr7xbmzEX (from 45.2 mins to end)
LLM
• How Do LLMs Work?

• LLMs learn to predict and generate text by modeling probabilities of word sequences. They use the following steps (a minimal sketch follows this list):
• Tokenization – Converts words into numerical tokens.
• Embeddings – Transforms tokens into vector representations.
• Transformer Model – Processes text using self-attention and feedforward layers.
• Text Generation – Uses probability-based word prediction (e.g., autoregressive
decoding).
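• Illustrative sketch (added): the four steps above using the Hugging Face transformers library (assumed installed) with the small public GPT-2 checkpoint; this is a minimal example, not the pipeline of any particular production LLM.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")           # tokenization
model = AutoModelForCausalLM.from_pretrained("gpt2")        # Transformer with embeddings + self-attention

inputs = tokenizer("I love to drink", return_tensors="pt")  # words -> numerical tokens
outputs = model.generate(**inputs, max_new_tokens=5)        # autoregressive, probability-based decoding
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```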
LLM

• LLMs predict words dynamically to generate meaningful responses.


LLM
• Capabilities of LLMs
• Text Generation - Writes articles, summaries, poems, and stories.

• Conversational AI - Powers chatbots like ChatGPT, Google Bard, and Claude.

• Code Generation - Helps developers with programming (e.g., GitHub Copilot).

• Language Translation - Converts text between languages.

• Question Answering - Answers factual and reasoning-based questions.

• Sentiment Analysis - Determines the tone of text (positive, neutral, negative).


LLM Vs GenAI
Feature | LLM (Large Language Model) | Generative AI (GenAI)
Definition | AI models specialized in text processing and generation. | AI models that generate new data (text, images, videos, music, etc.).
Scope | Focused only on natural language processing (NLP). | Covers multiple modalities (text, images, audio, video, 3D models, etc.).
Examples | GPT-4, BERT, PaLM, LLaMA, T5 | ChatGPT, DALL·E, Stable Diffusion, Runway ML
Underlying Technology | Transformer-based models trained on massive text data. | Can use Transformers, GANs, VAEs, and Diffusion Models.
Inputs & Outputs | Input: Text; Output: Text | Input: Text, images, video, audio; Output: Text, images, video, audio
Training Data | Books, articles, web data (text-based). | Text, images, audio, video, and structured data.
LLM Vs GenAI

Category | Examples of LLMs | Examples of GenAI Models
Text Generation | GPT-4, LLaMA, Claude | ChatGPT, Jasper AI
Image Generation | N/A (LLMs don't generate images) | DALL·E, Stable Diffusion, MidJourney
Music & Audio Generation | N/A | Jukebox AI, Google MusicLM
Video Generation | N/A | Runway ML, Pika Labs
3D Model Generation | N/A | Nvidia Get3D, DreamFusion
LLM Vs Foundational Models

• A Foundational Model is a general-purpose AI model trained on


massive, diverse datasets that can be adapted for multiple tasks
across different domains (text, images, audio, video, and
multimodal learning).
LLM Vs Foundational Models
Feature | LLM (Large Language Model) | Foundational Model
Definition | AI models specialized in text-based tasks. | Broad AI models capable of handling multiple modalities (text, images, audio, video, etc.).
Scope | Focuses on Natural Language Processing (NLP). | Covers multiple AI applications beyond text (AI art, speech, 3D models, etc.).
Examples | GPT-4, BERT, LLaMA, PaLM | GPT-4, PaLM-2, DALL·E, CLIP, Stable Diffusion
Training Data | Text-only (books, articles, web data). | Text, images, audio, videos, 3D models.
Use Cases | Chatbots, summarization, translation, code generation. | AI image generation, video generation, multimodal AI applications.
Architectural Foundation | Transformer-based (decoder-only or encoder-decoder). | Uses Transformers, GANs, VAEs, Diffusion Models, and hybrid architectures.
Multimodal Capabilities? | No (text-only). | Yes (text, image, speech, video).
LLM Vs Foundational Models

Feature | LLMs | Foundational Models
Primary Focus | Text processing & generation. | General-purpose AI for multiple domains.
Scope | Limited to NLP tasks. | Multimodal (text, vision, audio, robotics, etc.).
Examples | GPT-4 (as an LLM), BERT, T5, Claude | GPT-4 (as a Foundational Model), Gemini, PaLM-2, Stable Diffusion
Can Generate Images or Videos? | No | Yes
Can Process Audio and Speech? | No | Yes
Examples of LLM

LLM Model | Developer | Key Features | Use Cases
GPT-4 | OpenAI | Most advanced, multimodal (text + images), human-like responses | Chatbots, creative writing, coding, Q&A
GPT-3.5 | OpenAI | Predecessor to GPT-4, powerful text generation | ChatGPT, AI content creation
BERT (Bidirectional Encoder Representations from Transformers) | Google AI | Bidirectional understanding, great for search engines | Google Search, NLP tasks
PaLM-2 (Pathways Language Model-2) | Google AI | Multimodal, multilingual, advanced reasoning | Google Bard, AI research
LLaMA (Large Language Model Meta AI) | Meta (Facebook) | Open-source LLM for research and development | AI research, text analysis
Examples of LLM

LLM Model | Developer | Key Features | Use Cases
Claude (Anthropic AI) | Anthropic | Focuses on safety, ethical AI responses | AI chatbots, safe language modeling
T5 (Text-to-Text Transfer Transformer) | Google AI | Converts all NLP tasks into a text-to-text format | Summarization, translation, Q&A
BLOOM (BigScience) | Open-source | Multilingual, supports 46 languages | Research, open-source NLP
Gopher | DeepMind | Trained for factual accuracy and reasoning | AI research, academic writing
Mistral-7B | Mistral AI | Efficient, optimized open-source model | AI applications, code generation
GPT

• GPT (Generative Pre-trained Transformer) is a powerful Large


Language Model (LLM) developed by OpenAI. It is based on the
Transformer architecture and is designed to generate, complete,
and understand human-like text.
GPT

GPT Version | Release Year | Key Features
GPT-1 | 2018 | First model in the GPT series, with 117M parameters.
GPT-2 | 2019 | 1.5B parameters, coherent text generation, no fine-tuning needed.
GPT-3 | 2020 | 175B parameters, advanced reasoning, few-shot learning.
GPT-3.5 | 2022 | Faster, more optimized, better accuracy.
GPT-4 | 2023 | Multimodal (text + images), improved reasoning, better factual accuracy.
GPT 3

• GPT-3 (Generative Pre-trained Transformer 3) is a large language


model (LLM) developed by OpenAI in 2020. It is the third
generation of the GPT series and is designed to understand and
generate human-like text using deep learning.
GPT 3

• Key Features
• 175 billion parameters (compared to 1.5B in GPT-2).

• Pretrained on 570GB of text data (books, articles, Wikipedia, web content).

• Performs multiple NLP tasks (text generation, summarization, translation,


question answering).

• Zero-shot, Few-shot, and Fine-tuned learning capabilities.
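• Illustrative sketch (added): invented prompts showing the difference between zero-shot and few-shot use; the model completes the pattern in-context, with no gradient updates.

```python
# Zero-shot: the task is described, but no examples are given.
zero_shot_prompt = "Translate English to French: cheese ->"

# Few-shot: a handful of in-context examples precede the query.
few_shot_prompt = (
    "Translate English to French:\n"
    "sea otter -> loutre de mer\n"
    "peppermint -> menthe poivrée\n"
    "cheese ->"
)

# Either prompt is sent to the model as plain text; the model completes
# the pattern without any parameter updates (in-context learning).
print(few_shot_prompt)
```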


BERT
• BERT (Bidirectional Encoder Representations from Transformers) is a Transformer-based language model developed by Google
AI in 2018. It is designed to understand the context of words by considering both left and right context (bidirectional training).

• Key Features
• Bidirectional understanding – Reads text both forward and backward.

• Pretrained on massive datasets (Wikipedia + BooksCorpus).

• Excels in tasks like Q&A, search engines, and text classification.

• Fine-tunable for multiple NLP applications.

• Example
• Sentence: "The bank approved my loan because I had a good credit score."

• BERT understands: "bank" refers to a financial institution, not a riverbank.

• BERT improves AI’s ability to understand language contextually!


BERT

• It is designed to pre-train deep bidirectional


representations from unlabeled text by jointly
conditioning on both left and right context.
• As a result, the pre-trained BERT model can
be fine-tuned with just one additional output
layer to create state-of-the-art models for a
wide range of NLP tasks.

https://doi.org/10.48550/arXiv.1810.04805
Transformers

• The transformer architecture efficiently parallelizes


machine learning training.

• This massive parallelization makes it feasible to train BERT on extensive data quickly.

• Transformers work by leveraging attention.

Example: "The quick brown fox jumps over the lazy dog."

https://arxiv.org/abs/1706.03762v7
BERT

Component | Function
Token Embeddings | Converts words into numerical vectors.
Positional Encoding | Adds word order information.
Bidirectional Self-Attention | Understands relationships between words.
Feedforward Neural Network | Refines token representations.
Output Layer (Softmax) | Predicts missing words or the next sentence.
Large amounts of training data
• BERT is designed to work with very large amounts of text.

• These large, information-rich datasets have contributed to BERT's deep knowledge of English and many other languages.

• Training BERT on larger datasets naturally takes more time.

• Training at this scale is feasible because of the Transformer architecture, and it is further sped up using Tensor Processing Units (TPUs).
https://doi.org/10.48550/arXiv.1810.04805
BERT – Input Embeddings

• BERT's input representation is the sum of three embeddings: token embeddings, segment (sentence) embeddings, and position embeddings.

https://doi.org/10.48550/arXiv.1810.04805
Masked Language Model

• Masked Language Model (MLM) enables bidirectional learning from


text.

• We can do it by hiding a word in a sentence and forcing BERT to


bidirectionally use the words on both sides.

• That is, BERT uses both the preceding and the following words around the hidden word to predict it.
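• Illustrative sketch (added): masked word prediction with a pretrained BERT checkpoint via the Hugging Face transformers library (assumed installed); this demonstrates the MLM objective, not the pretraining procedure itself.

```python
from transformers import pipeline

# Fill-mask pipeline with a pretrained BERT model.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT looks at the words on both sides of [MASK] to predict it.
for prediction in fill_mask("I love to drink [MASK] in the morning."):
    print(prediction["token_str"], round(prediction["score"], 3))
```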
Next Sentence Prediction

• Next Sentence Prediction (NSP) helps BERT learn about


relationships between sentences by predicting if a given sentence
follows the previous one.

• In training, 50% of the sentence pairs are genuinely consecutive sentences and the other 50% use a randomly chosen second sentence, which teaches BERT to distinguish the two cases.
Transformers
• The Transformer architecture efficiently parallelizes machine learning training.

• This massive parallelization makes it feasible to train BERT on extensive data quickly.

• Transformers work by leveraging attention.

• Human brains have limited memory, and machine learning models must similarly learn to pay attention to what matters most.

• When a model does that, it avoids wasting computational resources on processing irrelevant information.

• Transformers assign differential weights that signal which words in a sentence are most critical for further processing.
BERT

• Types of BERT Models

Model | Parameters | Use Case
BERT-Base | 110M | General NLP tasks (search, Q&A, sentiment analysis).
BERT-Large | 340M | More powerful, better understanding.
DistilBERT | 66M | Lighter and faster version of BERT.
RoBERTa (Robustly Optimized BERT) | 355M | Improved training methods for better performance.
ALBERT (A Lite BERT) | 12M-223M | Compressed BERT with fewer parameters.
BERT
• Real-World Applications
• Google Search – Helps understand search queries contextually.
• Chatbots & Virtual Assistants – Improves understanding of user intent.
• Question Answering – Powers models like SQuAD (Stanford Question Answering Dataset).
• Sentiment Analysis – Determines whether a text is positive, neutral, or negative.
• Machine Translation – Assists in multilingual NLP tasks.
• Example (Google Search Before & After BERT)
• Query: "2019 brazil traveler to usa need visa"
• Before BERT: Google ignored "to" and returned wrong visa info.
• After BERT: Google understands travel direction and shows correct visa info.

• BERT significantly improved search engine accuracy


BERT

• BERT vs. GPT

Feature | BERT | GPT (Generative Pretrained Transformer)
Training Direction | Bidirectional | Autoregressive (left-to-right)
Main Task | Understanding text | Generating text
Example Use Case | Search engines, Q&A | Chatbots, text generation
Pretraining Task | Masked Language Modeling (MLM) | Predicting the next word (autoregressive learning)
T5
• T5 (Text-to-Text Transfer Transformer) is an advanced NLP model developed
by Google Research. It treats all NLP tasks as text-to-text problems, meaning
both input and output are represented as text.

• Key Features
• All NLP tasks are converted into text generation problems.
• Pretrained on a massive text corpus (Colossal Clean Crawled Corpus - C4).
• Works on multiple NLP tasks: summarization, translation, Q&A, and classification.
• Encoder-Decoder Transformer architecture (like seq2seq models).
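• Illustrative sketch (added): the text-to-text convention with the public t5-small checkpoint via the Hugging Face transformers library (assumed installed); the task is selected purely by the text prefix.

```python
from transformers import pipeline

# T5 treats every task as text in -> text out; the task is named in the prompt prefix.
t5 = pipeline("text2text-generation", model="t5-small")

print(t5("translate English to German: The house is wonderful.")[0]["generated_text"])
print(t5("summarize: The Transformer architecture relies entirely on attention "
         "mechanisms, dispensing with recurrence and convolutions entirely.")[0]["generated_text"])
```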
T5
• Tasks
• Translation
• Summarization
• Sentiment Analysis
• QA

• Unlike BERT (which understands text), T5 can both understand and


generate text
T5
• T5 is an encoder-decoder Transformer model

• Components
• Token Embeddings - Converts words into numerical vectors.
• Positional Encoding - Adds word order information.
• Encoder (Processes Input Text) - Converts input into hidden representations.
• Decoder (Generates Output Text) - Takes encoder output and generates text
token by token.
• Attention Mechanism - Helps model focus on important parts of input text.
T5

• T5 vs. BERT vs. GPT

• BERT → Only understands text (encoder-only).

• GPT → Only generates text (decoder-only).

• T5 → Both understands & generates text (encoder-decoder).

• T5 is like a "universal NLP model" that handles multiple tasks in a


unified way
T5

• T5 Variants and Model Sizes

Model | Parameters | Use Case
T5-Small | 60M | Lightweight NLP tasks.
T5-Base | 220M | Standard NLP tasks.
T5-Large | 770M | Advanced text generation.
T5-3B | 3 Billion | High-quality summarization, Q&A.
T5-11B | 11 Billion | State-of-the-art text generation.
T5

Feature | T5 | BERT | GPT | BART
Architecture | Encoder-Decoder | Encoder-only | Decoder-only | Encoder-Decoder
Main Task | Text-to-text | Text understanding | Text generation | Text generation & understanding
Training Task | Masked span prediction | Masked word prediction | Predict next word | Denoising autoencoder
Use Case | Summarization, Q&A, translation | Search engines, chatbots | Chatbots, creative writing | Summarization, paraphrasing
T5
• https://arxiv.org/abs/1910.10683

• Summary
• The paper introduces a unified "text-to-text" framework for natural language processing tasks, explores various techniques for transfer
learning in this framework, and then combines these insights to achieve state-of-the-art results on a wide range of NLP benchmarks.

• Main Findings
• The key finding is the development of a unified text-to-text framework that can be applied to a diverse set of NLP tasks.

• The encoder-decoder Transformer architecture performed best in their text-to-text framework, despite using more parameters than other
architectural variants.

• The C4 dataset they introduced, which is a cleaned version of the Common Crawl web data, can provide benefits over more curated
unlabeled datasets for some downstream tasks.
RoBERTa
• RoBERTa (Robustly Optimized BERT Pretraining Approach) is an improved version of BERT
developed by Facebook AI (Meta AI) in 2019. It enhances BERT’s pretraining process to
improve performance on NLP tasks.

• Key Features
• Same architecture as BERT (Transformer-based encoder).

• Optimized pretraining for better accuracy (longer training, more data).

• No Next Sentence Prediction (NSP) – unlike BERT.

• Uses dynamic masking for better generalization.

• RoBERTa is like "BERT but stronger" with better fine-tuning results on NLP benchmarks!
RoBERTa
• How is RoBERTa Different from BERT?
Feature | BERT | RoBERTa
Architecture | Transformer Encoder | Transformer Encoder
Pretraining Task | Masked Language Model (MLM) + Next Sentence Prediction (NSP) | Masked Language Model (MLM) only (NSP removed)
Training Data | 16GB of text | 160GB of text (10x more data)
Masking Strategy | Static masking | Dynamic masking (better generalization)
Batch Size | 256 | 8,000 (larger batches improve stability)
Fine-Tuning Performance | Good | Better than BERT on most NLP tasks

• RoBERTa removes Next Sentence Prediction (NSP) and dynamically


changes masked words during training, leading to better performance!
RoBERTa
• Key Improvements in RoBERTa Over BERT

• Removed Next Sentence Prediction (NSP) Task


• BERT was trained to predict if two sentences were related, but
researchers found that NSP does not significantly help NLP tasks.
• RoBERTa removes NSP, allowing the model to focus only on Masked
Language Modeling (MLM).

• This leads to better language understanding and generalization


RoBERTa
• Key Improvements in RoBERTa Over BERT

• Dynamic Masking Instead of Static Masking


• BERT: Uses static masking, meaning the same words are masked in every training epoch.

• RoBERTa: Uses dynamic masking, meaning different words are masked each time the model sees the same sentence.

• Example
• Sentence: "The [MASK] jumped over the fence."

• BERT: Always masks the same word ("dog").

• RoBERTa: Randomly masks different words in different epochs ("dog", "jumped", "fence"), forcing the model to learn
better representations.

• This improves the model’s ability to understand various contexts.
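• Illustrative sketch (added, toy code rather than the actual RoBERTa implementation): static masking fixes the masked positions once, while dynamic masking draws new positions every epoch.

```python
import random

sentence = "the dog jumped over the fence".split()

def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Replace roughly mask_prob of the tokens with [MASK] (BERT/RoBERTa use 15%)."""
    rng = random.Random(seed)
    return [tok if rng.random() > mask_prob else "[MASK]" for tok in tokens]

# Static masking: the same seed, so the same positions are masked every epoch.
static_view = mask_tokens(sentence, seed=42)
print("every epoch :", static_view)

# Dynamic masking: a fresh draw per epoch, so different positions get masked.
for epoch in range(3):
    print(f"epoch {epoch}  :", mask_tokens(sentence))
```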


RoBERTa
• Key Improvements in RoBERTa Over BERT

• More Training Data and Longer Training Time


• RoBERTa is trained on 160GB of text (10x more than BERT).

• Uses bigger batch sizes (8,000 vs. 256 in BERT) for better learning stability.

• Trains on more diverse text sources (Common Crawl, BooksCorpus, Wikipedia,


etc.).

• Result: RoBERTa achieves higher accuracy than BERT on NLP benchmarks


RoBERTa

• https://arxiv.org/pdf/1907.11692
Leading Language Models and their Real Life Applications
• Leading language models like GPT-3, BERT, and LaMDA are
revolutionizing various fields, with applications ranging from
customer service and content creation to data analysis and
language translation, demonstrating their ability to understand and
generate human-like text.
Leading Language Models and their Real Life Applications

Model | Developer | Key Features | Real-Life Applications
GPT-4 | OpenAI | Multimodal (text + images), advanced reasoning | ChatGPT, AI assistants, content creation
GPT-3.5 | OpenAI | High-quality text generation | Chatbots, code generation, research tools
BERT (Bidirectional Encoder Representations from Transformers) | Google AI | Bidirectional text understanding | Google Search, NLP research, chatbots
RoBERTa (Robust BERT) | Meta AI | Faster, better NLP accuracy | Sentiment analysis, customer service AI
PaLM-2 (Pathways Language Model-2) | Google AI | Multilingual, better reasoning & logic | Google Bard, AI-powered search
Leading Language Models and their Real Life Applications

Model | Developer | Key Features | Real-Life Applications
Claude (Anthropic AI) | Anthropic | AI safety, ethical AI responses | AI assistants, safe text generation
LLaMA (Large Language Model Meta AI) | Meta AI | Open-source, efficient model | AI research, enterprise applications
T5 (Text-to-Text Transfer Transformer) | Google AI | Text-to-text format for NLP tasks | Summarization, translation, Q&A
BLOOM | BigScience | Open-source multilingual LLM | Scientific research, global NLP tasks
Mistral-7B | Mistral AI | Optimized, efficient LLM | AI-powered search, chatbots


Retrieval Augmented Generation (RAG)
• An AI framework that augments the potential of language models by integrating
them with external data sources.

• Traditional language models produce responses based only on data that they have
been trained with, which may be inaccurate, especially when dealing with recent
events or specialised knowledge domains.

• RAG addresses this limitation by allowing the model to obtain the relevant
information in real time before generating the reply.

• The key components of a RAG system are Retrieval, Generation, and Augmentation.
Retrieval Augmented Generation (RAG)
Encoder

• Role of the Encoder - The encoder processes input queries and transforms them for effective retrieval. It is crucial for understanding user intent.

• Input Conversion - The primary function of the encoder is to convert raw input data into a structured format for better processing.

• Semantic Meaning Capture - The encoder captures the semantic meaning of the input, ensuring accurate and relevant responses from the retriever.

• Effective Retrieval - By structuring data, the encoder enables more effective retrieval and access to relevant information.

• Transformers Usage - Techniques like Transformers are utilized to generate high-quality embeddings for improved data representation.
Retrieval Augmented Generation (RAG)
Retriever

• Information Retrieval - The retriever is essential for fetching relevant information based on input from a knowledge source.

• Linking Queries to Knowledge - The retriever connects user queries to relevant knowledge resources, facilitating effective information access.

• Role in Content Generation - By providing well-informed data, the retriever enhances the quality and relevance of generated content.

• Algorithms for Information Retrieval - Various algorithms are employed by the retriever to fetch the most pertinent information for the generation process.

• Encoded Input Processing - The retriever processes encoded input to accurately fetch information that meets user needs.

• Efficient Information Fetching - The retriever ensures efficient information fetching by prioritizing relevance and accuracy in the returned data.
Retrieval Augmented Generation (RAG)
Utilization of Vector Database

• Quick Access to Information - Vector databases provide rapid access to vast amounts of data, optimizing the information retrieval process.

• Document Embeddings - Storing document embeddings allows for efficient organization and searching of complex information.

• Nearest Neighbor Searches - The vector database facilitates quick nearest neighbor searches, enhancing the accuracy and speed of data retrieval (a small sketch follows below).
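• Illustrative sketch (added): the nearest-neighbor idea with toy NumPy vectors rather than a real vector database; documents are stored as embeddings and the ones closest to the query embedding are returned.

```python
import numpy as np

# Toy document embeddings (in practice these come from an encoder model).
doc_embeddings = np.array([
    [0.9, 0.1, 0.0],   # doc 0
    [0.1, 0.9, 0.0],   # doc 1
    [0.0, 0.2, 0.9],   # doc 2
])
query_embedding = np.array([0.8, 0.2, 0.1])

def top_k_nearest(query, docs, k=2):
    """Rank documents by cosine similarity to the query embedding."""
    sims = docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))
    return np.argsort(-sims)[:k], np.sort(sims)[::-1][:k]

indices, scores = top_k_nearest(query_embedding, doc_embeddings)
print(indices, scores)   # doc 0 ranks first for this query
```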
Retrieval Augmented Generation (RAG)
Generator

• Synthesis of Output - The generator combines the original input and the retrieved information to create the final output.

• Advanced Generative Techniques - Advanced generative techniques ensure that the text produced is coherent and relevant.

• Contextual Relevance - The generator focuses on maintaining contextual relevance to enhance the quality of the generated text.
RAG – Simplified Sketch
• The sketch shows the RAG flow: a user question becomes a query to the knowledge base; the retrieval phase returns relevant documents that serve as context; the query plus context form the prompt to the LLM; in the generation phase the LLM produces the output answer.
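• Illustrative sketch (added): assembling the prompt from retrieved context and handing it to the generator; the document snippets are invented and call_llm is a placeholder for whatever LLM API or local model is actually used.

```python
# Invented snippets standing in for documents returned by the retrieval step above.
retrieved_docs = [
    "GPT-4 was released in 2023 and is multimodal.",
    "GPT-4 powers ChatGPT for many users.",
]
question = "When was GPT-4 released?"

# Augmentation: the retrieved context is placed into the prompt.
prompt = (
    "Answer the question using only the context below.\n"
    "Context:\n- " + "\n- ".join(retrieved_docs) + "\n"
    f"Question: {question}\nAnswer:"
)

def call_llm(prompt_text):
    # Placeholder for the generation step (a hosted API or local model would go here).
    return "[generated answer grounded in the retrieved context]"

print(call_llm(prompt))
```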
Advantages and Limitations of RAG

Enhanced Accuracy
RAG improves the accuracy of information retrieval, ensuring relevant results that meet user needs
effectively.

Contextual Relevance
The contextual relevance of RAG enhances the user experience by providing tailored information
based on context.

Knowledge Base Quality


The effectiveness of RAG is dependent on the quality of the underlying knowledge base, which can
affect outcomes.

Retrieval Noise
Retrieval noise may arise in RAG systems, leading to irrelevant or misleading information being
presented.
