What Is Retrieval Augmented Generation Rag Final v2 Cs
What is retrieval-augmented generation (RAG)?
Retrieval-augmented generation, or RAG, is a process applied to
large language models to make their outputs more relevant for the
end user.
October 2024
In recent years, large language models (LLMs) have made tremendous progress in their ability to generate content. But some leaders who hoped these models would increase business efficiency and productivity have been disappointed. Off-the-shelf generative AI (gen AI) tools have yet to live up to the considerable hype surrounding them. Why is that? For one thing, LLMs are trained on only the information that’s available to the providers that build them. This can limit their utility in environments where a wider range of more nuanced, enterprise-specific knowledge is needed.

Because RAG deployments have access to vast amounts of information that is more up to date and enterprise-specific, they can provide much more accurate, relevant, and coherent outputs. This is particularly helpful in applications and use cases that require highly accurate outputs, such as enterprise knowledge management and copilots that are specific to a given domain (for example, a workflow or process, journey, or function within the company).
Embeddings are numerical representations of words or phrases that are unique points in a multidimensional digital space, where similar ideas and concepts are clustered together. Each embedding is defined by a vector: a set of numbers that describes a particular characteristic or trait of the word or phrase, such as color, shape, or meaning. A vector is a coordinate on a map: it pinpoints the exact location of something in relation to its other features. Embeddings allow LLMs to retrieve only the most relevant data. Just as a catalog system in a library allows a librarian to quickly locate related text, embeddings help users organize and retrieve relevant information. Here’s an example of how they work:

— The word “king” is represented as a vector in the multidimensional space.
— The word “man” is also represented as a vector in that space. Because “king” and “man” share a semantic meaning, their vectors are similar as well.
— The word “woman,” on the other hand, is represented as a vector that is different from both “king” and “man.”
— When we subtract “man” from “king” and add “woman” to the space, the vectors are manipulated accordingly. This results in a new vector that represents the concept of “queen.”
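The “king”, “man”, “woman” example can be made concrete with a small sketch. The three-dimensional vectors below are hand-picked for illustration and are an assumption of this example, as are the dimension labels and the extra word “apple”; real embedding models learn vectors with hundreds or thousands of dimensions. The sketch computes king minus man plus woman and looks for the nearest remaining word by cosine similarity.

```python
import math

# Toy embeddings, invented for illustration only.
# Hypothetical dimensions: [royalty, maleness, femaleness]
EMBEDDINGS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.8, 0.1],
    "woman": [0.1, 0.1, 0.8],
    "apple": [0.0, 0.1, 0.1],
}

def cosine_similarity(a, b):
    """Similarity of two vectors by the angle between them (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c, embeddings=EMBEDDINGS):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in
              zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = {w: v for w, v in embeddings.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine_similarity(target, candidates[w]))

# "king" - "man" + "woman" lands nearest to "queen" in this toy space.
print(analogy("king", "man", "woman"))
```

With these toy values, “king” is also closer to “man” than to “woman” under cosine similarity, which mirrors the point that words sharing a semantic meaning have similar vectors.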
authoritative texts, or even generating new content based on the insights that can be gleaned from the library’s resources.

Through these ingestion and retrieval phases, RAG can generate highly specific outputs that would be impossible for traditional LLMs to produce on their own. The stocked library and index provide a foundation for the librarian to select and synthesize information in response to a query, leading to a more relevant and thus more helpful answer.

In addition to accessing a company’s internal “library,” many RAG implementations can query external systems and sources in real time. Examples of such searches include the following:

— Database queries. RAG can retrieve relevant data that are stored in structured formats, such as databases or tables, making it easy to search and analyze this information.
— Application programming interface (API) calls. RAG can use APIs to access specific information from other services or platforms.
— Web search/scraping. In some cases, RAG implementations can scrape web pages for relevant information, although this method is more prone to errors than others, due to the underlying data quality.
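The ingestion and retrieval phases, along with one kind of external lookup (a database query), can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the bag-of-words `embed` function stands in for a learned embedding model, the documents and the `vacation_balance` table are invented for the example, and every function name is hypothetical. The final prompt would be passed to an LLM, a step that is omitted here.

```python
import math
import re
import sqlite3
from collections import Counter

def embed(text):
    """Stand-in embedding: word counts (real systems use a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def ingest(documents):
    """Ingestion phase: embed each document and store it in an in-memory index."""
    return [(doc, embed(doc)) for doc in documents]

def retrieve(index, query, top_k=1):
    """Retrieval phase: return the top_k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

def query_vacation_balance(conn, employee):
    """External source: a structured lookup in a (hypothetical) database table."""
    row = conn.execute(
        "SELECT days_left FROM vacation_balance WHERE employee = ?", (employee,)
    ).fetchone()
    return f"{employee} has {row[0]} vacation days left." if row else None

def build_prompt(query, passages):
    """Augmentation: place the retrieved passages ahead of the user's question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

# Hypothetical enterprise documents and database contents.
DOCUMENTS = [
    "Employees accrue 20 days of paid vacation per year.",
    "Expense reports must be submitted within 30 days of purchase.",
    "The cafeteria is open from 8 am to 3 pm on weekdays.",
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vacation_balance (employee TEXT PRIMARY KEY, days_left INTEGER)")
conn.execute("INSERT INTO vacation_balance VALUES ('alice', 12)")

index = ingest(DOCUMENTS)
query = "How many vacation days does alice have left?"
passages = retrieve(index, query, top_k=1)
external_fact = query_vacation_balance(conn, "alice")
if external_fact:
    passages.append(external_fact)
prompt = build_prompt(query, passages)
print(prompt)
```

Keeping retrieval separate from prompt construction means other sources, such as an API call or a web search, could be added as further functions that append passages in the same way.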
Lareina Yee is a senior partner in McKinsey’s Bay Area office, where Michael Chui is a senior fellow
and Roger Roberts is a partner; Mara Pometti is a consultant in the London office; Patrick Wollner is
a consultant in the Vienna office; and Stephen Xu is a senior director of product management in
the Toronto office.