
McKinsey Explainers

What is retrieval-augmented generation (RAG)?

Retrieval-augmented generation, or RAG, is a process applied to large language models to make their outputs more relevant for the end user.

October 2024
In recent years, large language models (LLMs) have made tremendous progress in their ability to generate content. But some leaders who hoped these models would increase business efficiency and productivity have been disappointed. Off-the-shelf generative AI (gen AI) tools have yet to live up to the considerable hype surrounding them. Why is that? For one thing, LLMs are trained on only the information that's available to the providers that build them. This can limit their utility in environments where a wider range of more nuanced, enterprise-specific knowledge is needed.

Retrieval-augmented generation, or RAG, is a process applied to LLMs to make their outputs more relevant in specific contexts. RAG allows LLMs to access and reference information outside the LLM's own training data, such as an organization's specific knowledge base, before generating a response—and, crucially, with citations included. This capability enables LLMs to produce highly specific outputs without extensive fine-tuning or training, delivering some of the benefits of a custom LLM at considerably less expense.

Consider a typical gen AI chatbot that's deployed in a customer service context. While it may offer some general guidance, the chatbot is working from an LLM that was trained on only a specific amount of information, so it isn't accessing the enterprise's unique policies, procedures, data, or knowledge base. As a result, its answers will lack specificity and relevance to a user's inquiry. For example, when a customer asks about the status of their account or payment options, the chatbot might respond with only generic information; because the chatbot isn't accessing the company's specific data, the response it gives doesn't consider that customer's specific situation.

Because RAG deployments have access to vast amounts of information that is more up to date and enterprise-specific, they can provide much more accurate, relevant, and coherent outputs. This is particularly helpful in applications and use cases that require highly accurate outputs, such as enterprise knowledge management and copilots that are specific to a given domain (for example, a workflow or process, journey, or function within the company).

Learn more about QuantumBlack, AI by McKinsey.

How does RAG work?

RAG involves two phases: ingestion and retrieval. To understand these concepts, it helps to imagine a large library with millions of books.

The initial "ingestion" phase is akin to stocking the shelves and creating an index of their contents, which allows a librarian to quickly locate any book in the library's collection. As part of this process, a set of dense vector representations—numerical representations of data, also known as "embeddings" (for more, see sidebar, "What are embeddings?")—is generated for each book, chapter, or even selected paragraphs.

Once the library is stocked and indexed, the "retrieval" phase begins. Whenever a user asks a question on a specific topic, the librarian uses the index to locate the most relevant books. The selected books are then scanned for relevant content, which is carefully extracted and synthesized into a concise output. The original question informs the initial research and selection process, guiding the librarian to present only the most pertinent and accurate information in response. This process might involve summarizing key points from multiple sources, quoting



What are embeddings?

Embeddings are numerical representations of words or phrases that are unique points in a multidimensional digital space, where similar ideas and concepts are clustered together. Each embedding is defined by a vector—that is, a set of numbers that describes a particular characteristic or trait of the word or phrase, such as color, shape, or meaning. A vector is a coordinate on a map: it pinpoints the exact location of something in relation to its other features. Embeddings allow LLMs to retrieve only the most relevant data. Just as a catalog system in a library allows a librarian to quickly locate related text, embeddings help users organize and retrieve relevant information. Here's an example of how they work:

— The word "king" is represented as a vector in the multidimensional space.

— The word "man" is also represented as a vector in that space. Because "king" and "man" share a semantic meaning, their vectors are similar as well.

— The word "woman," on the other hand, is represented as a vector that is different from both "king" and "man."

— When we subtract "man" from "king" and add "woman" to the space, the vectors are manipulated accordingly. This results in a new vector that represents the concept of "queen."
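The vector arithmetic in the sidebar can be sketched in a few lines of Python. The three-dimensional vectors below are invented for illustration only (real embedding models produce vectors with hundreds or thousands of dimensions), and cosine similarity is one common way to measure how close two vectors are:

```python
import math

# Toy 3-dimensional "embeddings", hand-picked so the arithmetic works.
# Dimensions loosely encode: royalty, maleness, femaleness.
embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "man":   [0.1, 0.8, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "queen": [0.9, 0.1, 0.9],
}

def cosine_similarity(a, b):
    """How closely two vectors point in the same direction (1.0 = identical)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# king - man + woman, computed component by component
result = [k - m + w for k, m, w in zip(embeddings["king"],
                                       embeddings["man"],
                                       embeddings["woman"])]

# Find the vocabulary word whose vector is closest to the result.
closest = max(embeddings, key=lambda word: cosine_similarity(result, embeddings[word]))
print(closest)  # prints: queen
```

In a real system the vectors come from a trained embedding model, but the retrieval logic is the same: nearest vector wins.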

authoritative texts, or even generating new content based on the insights that can be gleaned from the library's resources.

Through these ingestion and retrieval phases, RAG can generate highly specific outputs that would be impossible for traditional LLMs to produce on their own. The stocked library and index provide a foundation for the librarian to select and synthesize information in response to a query, leading to a more relevant and thus more helpful answer.

In addition to accessing a company's internal "library," many RAG implementations can query external systems and sources in real time. Examples of such searches include the following:

— Database queries. RAG can retrieve relevant data that are stored in structured formats, such as databases or tables, making it easy to search and analyze this information.

— Application programming interface (API) calls. RAG can use APIs to access specific information from other services or platforms.

— Web search/scraping. In some cases, RAG implementations can scrape web pages for relevant information, although this method is more prone to errors than others, due to the underlying data quality.
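The two phases can be sketched end to end in a short program. Everything here is a simplified stand-in: `embed` is a toy bag-of-words embedder rather than a real embedding model, the "library" is an in-memory list, and in a real system the final prompt would be sent to an LLM:

```python
import math

def embed(text, vocab):
    """Toy embedding: a word-count vector over a fixed vocabulary.
    A production system would call a trained embedding model instead."""
    words = text.lower().replace("?", "").replace(".", "").split()
    return [float(words.count(term)) for term in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# --- Ingestion phase: stock the "library" and index every chunk. ---
documents = [
    "Refunds are processed within 5 business days of approval.",
    "Premium accounts include unlimited phone support.",
    "Passwords must be reset every 90 days.",
]
vocab = sorted({w for doc in documents for w in doc.lower().replace(".", "").split()})
index = [(doc, embed(doc, vocab)) for doc in documents]

# --- Retrieval phase: embed the question, rank chunks by similarity. ---
question = "How many days do refunds take?"
q_vec = embed(question, vocab)
top_chunk = max(index, key=lambda item: cosine(q_vec, item[1]))[0]

# --- Generation: ground the LLM's prompt in the retrieved chunk. ---
prompt = (
    "Answer using only the context below, and cite it.\n"
    f"Context: {top_chunk}\n"
    f"Question: {question}"
)
```

Here the refund-policy chunk ranks highest for the refund question, so the model answers from the company's own policy rather than from its general training data.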



Which areas of the business stand to benefit from RAG systems?

RAG has far-reaching applications in various domains, including customer service, marketing, finance, and knowledge management. By integrating RAG into existing systems, businesses can generate outputs that are more accurate than they would be using an off-the-shelf LLM, which can improve customer satisfaction, reduce costs, and enhance overall performance. Here are some specific examples of where and how RAG can be applied:

— Enterprise-knowledge-management chatbots. When an employee searches for information within their organization's intranet or other internal knowledge sources, the RAG system can retrieve relevant information from across the organization, synthesize it, and provide the employee with actionable insights.

— Customer service chatbots. When a customer interacts with a company's website or mobile app to inquire about a product or service, the RAG system can retrieve relevant information based on corporate policies, customer account data, and other sources, then provide the customer with more accurate and helpful responses.

— Drafting assistants. When an employee starts drafting a report or document that requires company-specific data or information, the RAG system retrieves the relevant information from enterprise data sources, such as databases, spreadsheets, and other systems, then provides the employee with prepopulated sections of the document. This output can help the employee develop the document more efficiently and more accurately.

Learn more about QuantumBlack, AI by McKinsey.

What are some challenges associated with RAG?

While RAG is a powerful tool for enhancing an LLM's capabilities, it is not without its limitations. Like LLMs, RAG is only as good as the data it can access. Here are some of its specific challenges:

— Data quality issues. If the knowledge that RAG is sourcing is not accurate or up to date, the resulting output may be incorrect.

— Multimodal data. RAG may not be able to read certain graphs, images, or complex slides, which can lead to issues in the generated output. New multimodal LLMs, which can parse complex data formats, can help mitigate this.

— Bias. If the underlying data contains biases, the generated output is likely to be biased as well.

— Data access and licensing concerns. Intellectual property, licensing, and privacy and security issues related to data access need to be considered throughout the design of a RAG system.

To help address these challenges, enterprises can establish data governance frameworks—or, if they already have them, ramp up those frameworks to help ensure the quality, accessibility, and timeliness of the underlying data used in RAG. Organizations that are implementing RAG systems should also carefully consider any copyright issues with respect to RAG-derived content, biases in the overall data set, and the level of interoperability between data sets that were not previously centrally accessible.
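In practice, governance rules like these often translate into simple guards in the retrieval pipeline. The sketch below is a hypothetical example: it drops retrieved chunks that are stale or that the requesting user is not allowed to see, before any chunk reaches the LLM. The metadata fields (`updated`, `acl`), the 365-day cutoff, and the fixed reference date are all invented for illustration:

```python
from datetime import date, timedelta

# Hypothetical retrieved chunks, with governance metadata attached at ingestion.
chunks = [
    {"text": "2024 refund policy: 5 business days.", "updated": date(2024, 6, 1),  "acl": "public"},
    {"text": "2019 refund policy: 30 days.",         "updated": date(2019, 1, 15), "acl": "public"},
    {"text": "Internal escalation playbook.",        "updated": date(2024, 5, 20), "acl": "internal"},
]

def governed(chunks, user_groups, max_age_days=365, today=date(2024, 7, 1)):
    """Filter chunks before they reach the LLM prompt."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        c for c in chunks
        if c["updated"] >= cutoff     # freshness: mitigates out-of-date answers
        and c["acl"] in user_groups   # access control: mitigates data-leak risk
    ]

# A public-facing chatbot only sees fresh, public chunks.
allowed = governed(chunks, user_groups={"public"})
```

Here only the current public refund policy survives: the 2019 policy is filtered as stale, and the internal playbook is filtered by access level.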



How is RAG evolving?

As RAG's capabilities and potential applications continue to evolve, we expect several emerging trends to shape its future:

— Standardization. The increasing standardization of underlying software patterns means that there will be more off-the-shelf solutions and libraries available for RAG implementations, making them progressively easier to build and deploy.

— Agent-based RAG. Agents are systems that can reason and interact with each other and require less human intervention than earlier AI systems. These tools can enable RAG systems to flexibly and efficiently adapt to changing contexts and user needs so they can better respond to more complex and more nuanced prompts.

— LLMs that are optimized for RAG. Some LLMs are now being trained specifically for use with RAG. These models are tailored to meet the unique needs of RAG tasks, such as quickly retrieving data from a vast corpus of information, rather than relying solely on the LLM's own parametric knowledge. One example of these optimized LLMs is the AI-powered answer engine Perplexity AI, which has been fine-tuned to perform in various RAG applications (for example, answering complex questions and summarizing text).

LLMs enhanced with retrieval-augmented generation can harness the strengths of both humans and machines, enabling users to tap into vast knowledge sources and generate more accurate and relevant responses. As this technology continues to evolve, we expect significant improvements in its scalability, adaptability, and impact on enterprise applications, with the potential to spur innovation and create value.

Learn more about QuantumBlack, AI by McKinsey. And check out AI-related job opportunities if you're interested in working with McKinsey.

Articles referenced:

— "Why agents are the next frontier of generative AI," McKinsey Quarterly, July 24, 2024, Lareina Yee, Michael Chui, and Roger Roberts, with Stephen Xu

— "A data leader's technical guide to scaling gen AI," July 8, 2024, Asin Tavakoli, Carlo Giovine, Joe Caserta, Jorge Machado, and Kayvaun Rowshankish, with Jon Boorstein and Nathan Westby

— "Choose the right transformation 'bite size'," March 27, 2024, Eric Lamarre, Kate Smaje, and Rodney Zemmel

Get to know and directly engage with senior McKinsey experts on RAG.
Lareina Yee is a senior partner in McKinsey’s Bay Area office, where Michael Chui is a senior fellow
and Roger Roberts is a partner; Mara Pometti is a consultant in the London office; Patrick Wollner is
a consultant in the Vienna office; and Stephen Xu is a senior director of product management in
the Toronto office.

Copyright © 2024 McKinsey & Company. All rights reserved.

