Retrieval Augmented Generation



What Is Retrieval Augmented Generation, or RAG?


Retrieval augmented generation, or RAG, is an architectural approach that can improve the efficacy of large language model (LLM) applications by leveraging custom data. It works by retrieving data or documents relevant to a question or task and providing them to the LLM as context. RAG has shown success in support chatbots and Q&A systems that need to maintain up-to-date information or access domain-specific knowledge.



What challenges does the retrieval augmented generation approach solve?


Problem 1: LLMs do not know your data

LLMs are deep learning models trained on massive datasets to understand, summarize and generate novel content. Most LLMs are trained on a wide range of public data, so a single model can respond to many types of tasks or questions. Once trained, however, most LLMs have no ability to access data beyond their training data cutoff point. This makes LLMs static and may cause them to respond incorrectly, give out-of-date answers or hallucinate when asked questions about data they have not been trained on.

Problem 2: AI applications must leverage custom data to be effective


For LLMs to give relevant and specific responses, organizations need the model to understand their domain and answer from their data, rather than giving broad, generalized responses. For example, organizations build customer support bots with LLMs, and those solutions must give company-specific answers to customer questions. Others are building internal Q&A bots that should answer employees' questions about internal HR data. How do companies build such solutions without retraining those models?

Solution: Retrieval augmentation is now an industry standard


An easy and popular way to use your own data is to provide it as part of the prompt with which you query the LLM. This is called retrieval augmented generation (RAG): you retrieve the relevant data and use it as augmented context for the LLM. Instead of relying solely on knowledge derived from its training data, a RAG workflow pulls in relevant information, connecting static LLMs with real-time data retrieval.

With a RAG architecture, organizations can deploy any LLM and augment it to return relevant results for their organization by giving it a small amount of their data, without the cost and time of fine-tuning or pretraining the model.
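
To make the pattern concrete, here is a minimal sketch of retrieve-then-augment in Python. The toy keyword retriever, the sample documents and the OpenAI model name are illustrative assumptions, not part of any particular product; a production system would retrieve from a vector index instead.

```python
# Minimal sketch of the retrieve-then-augment pattern described above.
# The toy retriever and the model name are illustrative assumptions; any
# document store and any chat-completion LLM fit the same shape.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

DOCS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am-5pm ET, Monday through Friday.",
    "Premium plans include priority phone support.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    # Toy retriever: rank documents by word overlap with the query.
    # A real system would use embeddings and a vector index instead.
    words = set(query.lower().split())
    return sorted(DOCS, key=lambda d: -len(words & set(d.lower().split())))[:k]

def answer_with_rag(question: str) -> str:
    context = "\n\n".join(retrieve(question))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(answer_with_rag("Can I return a product after two weeks?"))
```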

What are the use cases for RAG?


There are many different use cases for RAG. The most common ones are:

1. Question and answer chatbots: Incorporating LLMs into chatbots allows them to automatically derive more accurate answers from company documents and knowledge bases. Chatbots are used to automate customer support and website lead follow-up, answering questions and resolving issues quickly.

2. Search augmentation: Augmenting search engine results with LLM-generated answers can better address informational queries and make it easier for users to find the information they need to do their jobs.

3. Knowledge engine — ask questions of your data (e.g., HR, compliance documents): Company data can be used as context for LLMs, allowing employees to get easy answers to their questions, including HR questions about benefits and policies and security and compliance questions.

What are the benefits of RAG?
The RAG approach has a number of key benefits, including:

1. Providing up-to-date and accurate responses: RAG ensures that the response of an LLM is not based solely on static, stale
training data. Rather, the model uses up-to-date external data sources to provide responses.

2. Reducing inaccurate responses, or hallucinations: By grounding the LLM's output in relevant, external knowledge, RAG attempts to mitigate the risk of responding with incorrect or fabricated information (also known as hallucinations). Outputs can include citations of original sources, allowing human verification.

3. Providing domain-specific, relevant responses: Using RAG, the LLM will be able to provide contextually relevant responses
tailored to an organization's proprietary or domain-specific data.

4. Being efficient and cost-effective: Compared to other approaches to customizing LLMs with domain-specific data, RAG is simple
and cost-effective. Organizations can deploy RAG without needing to customize the model. This is especially beneficial when
models need to be updated frequently with new data.

When should I use RAG and when should I fine-tune the model?
RAG is the right place to start: it is easy and may be entirely sufficient for some use cases. Fine-tuning is most appropriate when you want the LLM's behavior to change or the model to learn a different "language." The two are not mutually exclusive. As a future step, you can fine-tune a model to better understand domain language and the desired output form, and also use RAG to improve the quality and relevance of the response.

When I want to customize my LLM with data, what are all the options and which method is the best
(prompt engineering vs. RAG vs. fine-tune vs. pretrain)?
There are four architectural patterns to consider when customizing an LLM application with your organization's data. These techniques
are outlined below and are not mutually exclusive. Rather, they can (and should) be combined to take advantage of the strengths of
each.

Prompt engineering
Definition: Crafting specialized prompts to guide LLM behavior
Primary use case: Quick, on-the-fly model guidance
Data requirements: None
Advantages: Fast, cost-effective, no training required
Considerations: Less control than fine-tuning

Retrieval augmented generation (RAG)
Definition: Combining an LLM with external knowledge retrieval
Primary use case: Dynamic datasets and external knowledge
Data requirements: External knowledge base or database (e.g., vector database)
Advantages: Dynamically updated context, enhanced accuracy
Considerations: Increases prompt length and inference computation

Fine-tuning
Definition: Adapting a pretrained LLM to specific datasets or domains
Primary use case: Domain or task specialization
Data requirements: Thousands of domain-specific or instruction examples
Advantages: Granular control, high specialization
Considerations: Requires labeled data, computational cost

Pretraining
Definition: Training an LLM from scratch
Primary use case: Unique tasks or domain-specific corpora
Data requirements: Large datasets (billions to trillions of tokens)
Advantages: Maximum control, tailored for specific needs
Considerations: Extremely resource-intensive
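
For contrast with the RAG examples elsewhere on this page, here is what the lightest-weight pattern, prompt engineering, looks like in practice: behavior is shaped entirely by instructions, with no external data pipeline. The client, model name and prompt wording are illustrative assumptions.

```python
# Prompt engineering alone: steer the model with instructions, no external data.
# The client and model name are illustrative; any chat-completion API works the same way.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

system_prompt = (
    "You are a support assistant for an airline. Answer in two sentences or "
    "fewer, and if you are not sure of an answer, say so rather than guessing."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Can I change my flight online?"},
    ],
)
print(response.choices[0].message.content)
```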

Regardless of the technique selected, building a solution in a well-structured, modularized manner ensures organizations will be
prepared to iterate and adapt. Learn more about this approach and more in The Big Book of MLOps.

What is a reference architecture for RAG applications?


There are many ways to implement a retrieval augmented generation system, depending on specific needs and data nuances. Below is
one commonly adopted workflow to provide a foundational understanding of the process.

1. Prepare data: Document data is gathered alongside metadata and subjected to initial preprocessing — for example, PII handling
(detection, filtering, redaction, substitution). To be used in RAG applications, documents need to be chunked into appropriate
lengths based on the choice of embedding model and the downstream LLM application that uses these documents as context.

2. Index relevant data: Produce document embeddings and hydrate a Vector Search index with this data.

3. Retrieve relevant data: Retrieve the parts of your data that are relevant to a user's query. That text is then provided as part of the prompt sent to the LLM.

4. Build LLM applications: Wrap the prompt-augmentation and LLM-query components into an endpoint, which can then be exposed to applications such as Q&A chatbots via a simple REST API. A minimal sketch of steps 1-3 is shown below.
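
The sketch referenced above implements steps 1 through 3 against an in-memory index. The sentence-transformers embedding model and fixed-size character chunking are illustrative assumptions; a production deployment would use token-aware chunking and a managed index such as Databricks Vector Search.

```python
# Minimal sketch of the prepare / index / retrieve steps with an in-memory index.
# The embedding model and naive chunking are illustrative assumptions; production
# systems would use token-aware chunking and a managed vector index.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative embedding model

# 1. Prepare data: naive fixed-size chunking (real systems chunk by tokens/structure).
def chunk(text: str, size: int = 500) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

documents = ["...full text of a policy document...", "...full text of an HR handbook..."]
chunks = [c for doc in documents for c in chunk(doc)]

# 2. Index relevant data: embed every chunk once, up front.
index = model.encode(chunks, normalize_embeddings=True)

# 3. Retrieve relevant data: embed the query, take the top-k by cosine similarity.
def retrieve(query: str, k: int = 3) -> list[str]:
    q = model.encode([query], normalize_embeddings=True)[0]
    top = np.argsort(index @ q)[::-1][:k]
    return [chunks[i] for i in top]

# Step 4 would wrap retrieve() plus an LLM call behind a REST endpoint.
print(retrieve("What is the refund policy?"))
```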

Databricks also recommends some key architectural elements of a RAG architecture:

Vector database: Some (but not all) LLM applications use vector databases for fast similarity searches, most often to provide
context or domain knowledge in LLM queries. To ensure that the deployed language model has access to up-to-date information,
regular vector database updates can be scheduled as a job. Note that the logic to retrieve from the vector database and inject
information into the LLM context can be packaged in the model artifact logged to MLflow using MLflow LangChain or PyFunc
model flavors.
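
As a sketch of that packaging idea, the snippet below wraps retrieval-plus-injection logic in an MLflow PyFunc model so the whole chain is logged as one artifact. The `retrieve` and `generate` helpers are hypothetical placeholders for a vector database lookup and an LLM call; only the PyFunc wrapper reflects standard MLflow usage.

```python
# Sketch: packaging vector-retrieval + prompt-injection logic as an MLflow
# PyFunc model, so the whole chain ships as one logged artifact.
import mlflow
import mlflow.pyfunc

def retrieve(question: str) -> list[str]:
    return ["(chunks returned by a vector database lookup)"]  # hypothetical placeholder

def generate(prompt: str) -> str:
    return "(answer returned by an LLM)"  # hypothetical placeholder

class RAGModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        answers = []
        for question in model_input["question"]:
            docs = "\n\n".join(retrieve(question))
            prompt = f"Context:\n{docs}\n\nQuestion: {question}"
            answers.append(generate(prompt))
        return answers

with mlflow.start_run():
    mlflow.pyfunc.log_model(artifact_path="rag_chain", python_model=RAGModel())
```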

MLflow LLM Deployments or Model Serving: In LLM-based applications where a third-party LLM API is used, MLflow LLM Deployments or Model Serving support for external models can be used as a standardized interface to route requests to vendors such as OpenAI and Anthropic. In addition to providing an enterprise-grade API gateway, MLflow LLM Deployments or Model Serving centralizes API key management and provides the ability to enforce cost controls.
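
As a sketch of that routing pattern, the snippet below queries an external model through the MLflow Deployments client, so API keys and cost controls live in one governed gateway. The endpoint name is a hypothetical example of an external-model endpoint configured by an administrator.

```python
# Sketch: routing a request to a vendor LLM through the MLflow Deployments client.
# "chat-gpt-4o" is a hypothetical endpoint name configured to proxy a vendor model.
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")

response = client.predict(
    endpoint="chat-gpt-4o",  # hypothetical external-model endpoint
    inputs={"messages": [{"role": "user", "content": "What is RAG?"}]},
)
print(response)
```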

Model Serving: In the case of RAG using a third-party API, one key architectural change is that the LLM pipeline makes external API calls from the Model Serving endpoint to internal or third-party LLM APIs. This adds complexity, potential latency and another layer of credential management. By contrast, in the fine-tuned model example, the model and its environment are deployed together.

Resources
Databricks blog posts
Using MLflow AI Gateway and Llama 2 to Build Generative AI Apps

Best Practices for LLM Evaluation of RAG Applications

Databricks Demo

Databricks eBook — The Big Book of MLOps

Databricks customers using RAG


JetBlue

JetBlue has deployed "BlueBot," a chatbot that uses open source generative AI models complemented by corporate data, powered by
Databricks. This chatbot can be used by all teams at JetBlue to get access to data that is governed by role. For example, the finance
team can see data from SAP and regulatory filings, but the operations team will only see maintenance information.

Also read this article.

Chevron Phillips

Chevron Phillips Chemical uses Databricks to support their generative AI initiatives, including document process automation.

Thrivent Financial
Thrivent Financial is looking at generative AI to improve search, produce better-summarized and more accessible insights, and boost engineering productivity.

Where can I find more information about retrieval augmented generation?
There are many resources available to find more information on RAG, including:

Blogs
Creating High-Quality RAG Applications With Databricks

Databricks Vector Search Public Preview

Improve RAG Application Response Quality With Real-Time Structured Data


Build Gen AI Apps Faster With New Foundation Model Capabilities

Best Practices for LLM Evaluation of RAG Applications

Using MLflow AI Gateway and Llama 2 to Build Generative AI Apps (Achieve greater accuracy using retrieval augmented generation
(RAG) with your own data)

E-books
The Big Book of GenAI

The Compact Guide to RAG

The Big Book of MLOps

Demos
Deploy Your LLM Chatbot With Retrieval Augmented Generation (RAG), llama2-70B (MosaicML Inferences) and Vector Search

Contact Databricks to schedule a demo and talk to someone about your LLM and retrieval augmented generation (RAG) projects
