Beyond RAG: Building Advanced Context-Augmented LLM Applications
Jerry Liu, LlamaIndex co-founder/CEO
LlamaIndex: Context Augmentation for your LLM app
RAG
[Diagram: the RAG pipeline in two stages. Data Parsing & Ingestion: Data Ingestion → Data Parsing → Index. Data Querying: Retrieval → LLM + Prompts → Response.]
Naive RAG
[Diagram: a naive RAG pipeline. Data Ingestion → Data Parsing (PyPDF) → Index (sentence splitting, chunk size 256) → Dense Retrieval (top-k = 5) → LLM + Simple QA Prompt → Response.]
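For reference, the naive pipeline above maps to a few lines of LlamaIndex. A minimal sketch, assuming a recent llama-index release, an OpenAI key, and a hypothetical ./data directory of documents:

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

# Parse + ingest: load the documents (PyPDF under the hood for .pdf files)
# and split them into sentence-aware chunks of 256 tokens.
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(
    documents, transformations=[SentenceSplitter(chunk_size=256)]
)

# Query: dense retrieval with top-k = 5 and a simple QA prompt.
query_engine = index.as_query_engine(similarity_top_k=5)
print(query_engine.query("What are the main risk factors for Tesla?"))
```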
Naive RAG is Limited
RAG Prototypes are Limited
Naive RAG approaches tend to work well for simple questions over a small set of documents.
● “What are the main risk factors for Tesla?” (over Tesla 2021 10K)
● “What did the author do during his time at YC?” (Paul Graham essay)
Pain Points

There are certain questions we want to ask where naive RAG will fail. Examples:

● Summarization Questions: “Give me a summary of the entire <company> 10K annual report”
● Comparison Questions: “Compare the open-source contributions of candidate A and candidate B”
● Structured Analytics + Semantic Search: “Tell me about the risk factors of the highest-performing rideshare company in the US”
● General Multi-part Questions: “Tell me about the pro-X arguments in article A, and tell me about the pro-Y arguments in article B, make a table based on our internal style guide, then generate your own conclusion based on these facts.”
Can we do more?
In the naive setting, RAG is boring.
🚫 It’s just a glorified search system
🚫 There are many questions/tasks that naive RAG can’t answer.
💡 Can we go beyond simple search/QA to building a general context-augmented research assistant?
Beyond RAG: Adding Layers of Agentic Reasoning
From RAG to Agents
[Diagram: Query → RAG → Response]
Single-shot
No query understanding/planning
No tool use
No reflection, error correction
No memory (stateless)
From RAG to Agents
✅ Multi-turn
✅ Query / task planning layer
✅ Tool interface for external environment
✅ Reflection
✅ Memory for personalization

[Diagram: Query → Agent (calling Tools) → RAG → Response]
From Simple to Advanced Agents
[Diagram: a spectrum from simple to advanced. Agent ingredients (simple; lower cost, lower latency): Routing, One-Shot Query Planning, Conversation Memory, Tool Use. Full agents (advanced; higher cost, higher latency): ReAct, Dynamic Planning + Execution.]
Routing
Simplest form of agentic reasoning: given a user query and a set of choices, output the subset of choices to route the query to.
Routing
Use Case: Joint QA and
Summarization
Guide
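As a sketch of what joint QA and summarization routing looks like in LlamaIndex (assuming a recent llama-index release; module paths may differ across versions, and ./data is a hypothetical document folder):

```python
from llama_index.core import SimpleDirectoryReader, SummaryIndex, VectorStoreIndex
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
from llama_index.core.tools import QueryEngineTool

# Build two indices over the same documents: one for QA, one for summaries.
documents = SimpleDirectoryReader("./data").load_data()
vector_index = VectorStoreIndex.from_documents(documents)
summary_index = SummaryIndex.from_documents(documents)

# Wrap each query engine as a tool; the descriptions drive the routing choice.
qa_tool = QueryEngineTool.from_defaults(
    query_engine=vector_index.as_query_engine(),
    description="Useful for answering specific questions over the documents.",
)
summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_index.as_query_engine(),
    description="Useful for full-document summarization requests.",
)

# The selector asks the LLM to pick a tool given the query + descriptions.
router = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[qa_tool, summary_tool],
)
print(router.query("Give me a summary of the entire report"))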
Conversation Memory
In addition to the current query, take conversation history into account as input to your RAG pipeline.
Conversation Memory
How to account for conversation history in a RAG pipeline?

● Condense question
● Condense question + context
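Both strategies are available as chat modes in LlamaIndex. A minimal sketch, assuming a recent llama-index release where as_chat_engine accepts these mode strings:

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("./data").load_data()
)

# "condense_question": rewrite (query + chat history) into a standalone
# question, then run it through the RAG pipeline as usual.
chat_engine = index.as_chat_engine(chat_mode="condense_question")

# "condense_plus_context": additionally prepend retrieved context each turn.
# chat_engine = index.as_chat_engine(chat_mode="condense_plus_context")

print(chat_engine.chat("Compare that to the previous year."))
```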
Query Planning

Break down a query into parallelizable sub-queries. Each sub-query can be executed against any set of RAG pipelines.

[Diagram: the query “Compare revenue growth of Uber and Lyft in 2021” is split into “Describe revenue growth of Uber in 2021” and “Describe revenue growth of Lyft in 2021”; each sub-query runs top-2 retrieval against the Uber 10-K and Lyft 10-K indices respectively, e.g. retrieving chunks 4 and 8 of each filing.]
Query Planning

Example: “Compare revenue growth of Uber and Lyft in 2021”

Query Planning Guide
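A sketch of this pattern with LlamaIndex's SubQuestionQueryEngine (assuming a recent llama-index release; the ./uber_10k and ./lyft_10k data directories are hypothetical):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.tools import QueryEngineTool, ToolMetadata

# One index per 10-K filing, each with top-2 retrieval.
uber_engine = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("./uber_10k").load_data()
).as_query_engine(similarity_top_k=2)
lyft_engine = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("./lyft_10k").load_data()
).as_query_engine(similarity_top_k=2)

tools = [
    QueryEngineTool(
        query_engine=uber_engine,
        metadata=ToolMetadata(name="uber_10k", description="Uber 2021 10-K filing"),
    ),
    QueryEngineTool(
        query_engine=lyft_engine,
        metadata=ToolMetadata(name="lyft_10k", description="Lyft 2021 10-K filing"),
    ),
]

# The engine prompts the LLM to generate sub-questions, routes each to the
# matching tool, then synthesizes a final comparison from the sub-answers.
engine = SubQuestionQueryEngine.from_defaults(query_engine_tools=tools)
print(engine.query("Compare revenue growth of Uber and Lyft in 2021"))
```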
Tool Use
Use an LLM to call an API, and infer the parameters of that API.
Tool Use
In normal RAG you just pass the query through. But what if you used the LLM to infer all the parameters for the API interface?

A key capability in many QA use cases (auto-retrieval, text-to-SQL, and more).
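A minimal sketch of tool use with LlamaIndex (assuming a recent llama-index release and an OpenAI key; the search_flights function is hypothetical). The LLM reads the tool's signature and docstring and fills in the arguments itself:

```python
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

# Hypothetical API: the LLM infers origin/destination/date from the query.
def search_flights(origin: str, destination: str, date: str) -> str:
    """Search for flights between two cities on a given date (YYYY-MM-DD)."""
    return f"Flights from {origin} to {destination} on {date}: ..."

tool = FunctionTool.from_defaults(fn=search_flights)

llm = OpenAI(model="gpt-4o")
# predict_and_call lets the LLM pick the tool and infer its parameters.
response = llm.predict_and_call(
    [tool], "Find me a flight from SFO to JFK on 2024-06-07"
)
print(response)
```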
Let’s put them together
● All of these are agent ingredients
● Let’s put them together for a full agent system
○ Query planning
○ Memory
○ Tool Use
● Let’s add additional components
○ Reflection
○ Controllability
○ Observability
Core Components of a Full Agent
Minimum necessary ingredients:
● Query planning
● Memory
● Tool Use
ReAct: Reasoning + Acting with LLMs
Source: https://react-lm.github.io/
ReAct: Reasoning + Acting with LLMs
Query Planning: Generate the next step given previous steps (chain-of-thought prompt).

Tool Use: Sequential tool calling.

Memory: Maintain a simple buffer.
ReAct: Reasoning + Acting with LLMs
ReAct + RAG Guide
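A minimal ReAct + RAG sketch with LlamaIndex (assuming a recent llama-index release; qa_tool and summary_tool are the query-engine tools from the routing example above):

```python
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

# qa_tool / summary_tool: QueryEngineTools over your indices (see above).
agent = ReActAgent.from_tools(
    [qa_tool, summary_tool],
    llm=OpenAI(model="gpt-4o"),
    verbose=True,  # print the Thought / Action / Observation trace
)

# The agent interleaves reasoning steps with sequential tool calls and
# keeps a simple chat-history buffer across turns.
print(agent.chat("Summarize the report, then list its main risk factors"))
```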
Can we make this even better?
● Stop being so short-sighted: plan ahead at each step
● Parallelize execution where we can
LLMCompiler (Kim et al. 2023)

An agent compiler for parallel multi-function planning + execution.
LLMCompiler
Query Planning: Generate a DAG of steps. Replan if steps don’t reach the desired state.

Tool Use: Parallel function calling.

Memory: Maintain a simple buffer.
LLMCompiler Agent
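To make the planner's core idea concrete, here is a toy sketch in plain Python (not the LLMCompiler API; the plan and its string-returning tasks are hypothetical stand-ins for an LLM-generated DAG of tool calls). Each node runs as soon as its dependencies finish, so independent branches execute in parallel:

```python
import asyncio

# A toy plan: each node names the nodes it depends on.
plan = {
    "uber_growth": {"deps": [], "fn": lambda: "Uber revenue growth in 2021: ..."},
    "lyft_growth": {"deps": [], "fn": lambda: "Lyft revenue growth in 2021: ..."},
    "compare": {
        "deps": ["uber_growth", "lyft_growth"],
        "fn": lambda: "Comparison of Uber vs. Lyft revenue growth: ...",
    },
}

async def run_node(name: str, done: dict) -> str:
    # Wait for all dependencies to finish before executing this node.
    await asyncio.gather(*(done[d] for d in plan[name]["deps"]))
    result = plan[name]["fn"]()  # stand-in for a real tool/function call
    print(f"{name}: {result}")
    return result

async def main():
    done = {}
    # Schedule every node up front; independent branches run in parallel.
    for name in plan:
        done[name] = asyncio.ensure_future(run_node(name, done))
    await asyncio.gather(*done.values())

asyncio.run(main())
```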
Tree-based Planning
● Tree of Thoughts (Yao et al. 2023)
● Reasoning via Planning (Hao et al. 2023)
● Language Agent Tree Search (Zhou et al. 2023)
Tree-based Planning
Query planning in the face of uncertainty: instead of planning out a fixed sequence of steps, sample a few different states.

Run Monte-Carlo Tree Search (MCTS) to balance exploration vs. exploitation.
Self-Reflection
Use feedback to improve agent execution and reduce errors:

● Human feedback
● 🤖 LLM feedback

Use few-shot examples instead of retraining the model!
Reflexion: Language Agents with Verbal Reinforcement Learning, by Shinn et al. (2023)
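A toy sketch of the LLM-feedback variant (plain Python around a LlamaIndex LLM client; not the Reflexion implementation): generate an answer, have the LLM critique it, and retry with the accumulated critique fed back into the prompt instead of retraining the model.

```python
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o")

def reflect_and_retry(task: str, max_tries: int = 3) -> str:
    feedback = ""
    for _ in range(max_tries):
        # Generate an attempt, including any prior critique as guidance.
        answer = llm.complete(f"{task}\n\nPrior feedback:\n{feedback}").text
        # Ask the LLM to verify its own work.
        critique = llm.complete(
            f"Task: {task}\nAnswer: {answer}\n"
            "Reply DONE if the answer fully solves the task, "
            "otherwise explain what is wrong."
        ).text
        if critique.strip().startswith("DONE"):
            return answer
        feedback += critique + "\n"
    return answer

print(reflect_and_retry("List the three largest US rideshare companies."))
```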
Additional Requirements
● Observability: see the full trace of the agent
○ Observability Guide
● Control: Be able to guide the intermediate steps of an agent step-by-step
○ Lower-Level Agent API
● Customizability: Define your own agentic logic around any set of tools.
○ Custom Agent Guide
○ Custom Agent with Query Pipeline Guide
● Multi-agents: Define multi-agent interactions!
○ Synchronously: Define an explicit flow between agents
○ Asynchronously: Treat each agent as a microservice; agents communicate with each other.
■ Upcoming in LlamaIndex!
○ Current Frameworks: Autogen, CrewAI
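For the control point above, LlamaIndex exposes a lower-level, step-wise agent API. A sketch, assuming a recent llama-index release and reusing the agent from the ReAct example earlier:

```python
# Step-wise execution: inspect and guide each intermediate step of the
# agent instead of running a query end to end.
task = agent.create_task("Compare revenue growth of Uber and Lyft in 2021")

# Run one reasoning/tool step at a time; examine the output between steps.
step_output = agent.run_step(task.task_id)
while not step_output.is_last:
    print(step_output)  # observability: the intermediate trace
    step_output = agent.run_step(task.task_id)

print(agent.finalize_response(task.task_id))
```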
LlamaIndex + W&B
Tracing and observability are essential developer tools for RAG/agent development.

We have first-class integrations with Weights & Biases.

Guide