TABLE OF CONTENTS
Preface
Working Through This Book
Requirements
Source Code
Contact
Chapter 1: Introduction to Generative AI Applications
What is a Large Language Model?
What is LangChain?
The Architecture of a Generative AI Application
Development Environment Set Up
Summary
Chapter 2: Your First LangChain Application
Installing LangChain Packages
Creating the Question & Answer Application
Getting Google Gemini API Key
Running the Application
Resource Exhausted Error
Summary
Chapter 3: Using OpenAI LLM in LangChain
Getting Started With OpenAI API
Integrating OpenAI With LangChain
ChatGPT vs Gemini: Which One To Use?
Summary
Chapter 4: Using Open-Source LLMs in LangChain
Ollama Introduction
Using Ollama in LangChain
Again, Which One To Use?
Summary
Chapter 5: Enabling User Input With Prompts
Summary
Chapter 6: LangChain Prompt Templates
Creating a Prompt Template
Prompt Template With Multiple Inputs
Restricting LLM From Answering Unwanted Prompts
Summary
Chapter 7: The LangChain Expression Language (LCEL)
Sequential Chains
Simple Sequential Chain
Using Multiple LLMs in Sequential Chain
Debugging the Sequential Chains
Summary
Chapter 8: Regular Sequential Chains
Format the Output Variables
Summary
Chapter 9: Implementing Chat History in LangChain
Creating a Chat Prompt Template
Saving Messages in LangChain
Summary
Chapter 10: AI Agents and Tools
Creating an AI Agent With LangChain
Asking Different Questions to the Agent
List of Available AI Tools
Types of AI Agents
Summary
Chapter 11: Interacting With Documents in LangChain
Getting the Document
Building the Chat With Document Application
Adding Chat Memory for Context
About The Vector Database
Switching the LLM
Summary
Chapter 12: Uploading Different Document Types
Summary
Chapter 13: Chat With YouTube Videos
Adding The YouTube Loader
Handling Transcript Doesn’t Exist Error
Summary
Chapter 14: Interacting With Images Using Multimodal Messages
Understanding Multimodal Messages
Sending Multimodal Messages in LangChain
Adding Chat History
Ollama Multimodal Message
Summary
Chapter 15: Developing AI-powered Next.js Application
Creating the Next.js Application
Installing Required Packages
Adding the Server Action
Adding Profile Pictures for User and Assistant
Developing React Chat Components
Adding Chat History
Summary
Chapter 16: Deploying Next.js AI Application to Production
Streaming the Response
Creating API Key Input
Adding the setApi() Function
Adding a Chat Sidebar
Running Build Locally
Pushing Code to GitHub
Vercel Deployment
Summary
Wrapping Up
About the author
LangChainJS For Beginners
A Step-By-Step Guide to AI Application Development With
LangChain, JavaScript/NextJS, OpenAI/ChatGPT and Other LLMs
By Nathan Sebhastian
PREFACE
The goal of this book is to provide gentle, step-by-step
instructions that help you learn LangChain.js gradually, from
basic to advanced.
You will see why LangChain is a great tool for building AI
applications and how it simplifies the integration of language
models into your web applications.
We’ll see how essential LangChain features such as prompt
templates, chains, agents, document loaders, output parsers,
and model classes are used to create a generative AI application
that’s smart and flexible.
After that, we will integrate LangChain into Next.js so you know
how to create AI-powered web applications.
Working Through This Book
This book is broken down into 16 concise chapters, each
focusing on a specific topic in LangChain programming.
I encourage you to write the code you see in this book and run
it so that you have a sense of what LangChain development
looks like. You learn best when you code along with the
examples in this book.
A tip to make the most of this book: Take at least a 10-minute
break after finishing a chapter, so that you can regain your
energy and focus.
Also, don’t despair if some concepts are hard to understand.
Learning anything new is hard the first time, especially
something technical like programming. The most important
thing is to keep going.
Requirements
To experience the full benefit of this book, you need to have
knowledge of basic JavaScript and NextJS.
If you need some help in learning JavaScript or NextJS, you can
get one of my books at https://codewithnathan.com
Source Code
You can download the source code from GitHub at the following
link:
https://github.com/nathansebhastian/langchain-js
Click on the 'Code' button, then click on the 'Download ZIP' link
as shown below:
Figure 1. Download the Source Code at GitHub
You need to extract the archive to access the code. Usually, you
can just double-click the archive to extract the content.
The number in the folder name indicates the chapter number
in this book.
Contact
If you need help, you can contact me at
[email protected].
You can also connect or follow me on LinkedIn at
https://linkedin.com/in/nathansebhastian
CHAPTER 1: INTRODUCTION TO
GENERATIVE AI APPLICATIONS
A Generative AI application is a computer application that can
generate contextually relevant output based on a given input
(or prompt).
Generative AI applications came to the attention of the general
public in 2022, when OpenAI released ChatGPT and quickly
gained 1 million users in just 5 days:
Figure 2. ChatGPT Reached 1 Million Users in 5 Days
Another example of a generative AI application is chatpdf.com,
which enables users to upload a PDF and perform various tasks,
such as extracting insights from the PDF.
The answers provided by chatpdf.com contain references to
their sources in the original PDF document, so there’s no more
flipping through pages to find the source.
Behind the scenes, these generative AI applications use the
power of Large Language Models to generate the answers.
What is a Large Language Model?
A Large Language Model (LLM for short) is a machine learning
model that can understand human language and generate
output that humans can understand.
LLMs are usually trained on a vast amount of text data
available on the internet so that they can perform a wide range
of language-related tasks such as translation, summarization,
question answering, and creative writing.
Examples of LLMs include GPT-4 by OpenAI, Gemini by Google,
Llama by Meta, and Mistral by Mistral.
Some LLMs are closed-source, like GPT and Gemini, while some
are open-source such as Llama and Mistral.
What is LangChain?
LangChain is an open-source framework designed to simplify
the process of developing an LLM-powered application.
LangChain enables you to integrate and call the LLMs that power
generative AI applications by simply calling the class that
represents the model.
Under the hood, LangChain will perform the steps required to
interact with the language model API and manage the
processing of input and output so that you can access different
LLMs with minimal code change.
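To give you an early feel for how little code this involves, here is a minimal sketch of the pattern used throughout this book (the exact classes and options are introduced in the following chapters, so treat it as a preview rather than something to run right now):

import { ChatOpenAI } from '@langchain/openai';

// The class represents the model; swapping providers mostly means swapping this class
const llm = new ChatOpenAI({ model: 'gpt-4o', apiKey: process.env.OPENAI_KEY });

// Send a prompt and read the generated answer
const response = await llm.invoke('What is the currency of Thailand?');
console.log(response.content);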
What’s more, you can also use external data sources such as a
PDF, a Wikipedia article, or a search engine result with
LangChain to produce a contextually relevant response.
By using LangChain, you can develop specialized generative AI
applications that are optimized for certain use cases, such as
summarizing a YouTube video, extracting insights from a PDF,
or writing an essay.
LangChain supports both Python and JavaScript. This book
focuses on the JavaScript version of LangChain.
The Architecture of a Generative AI
Application
A traditional application commonly uses the client-server
architecture as follows:
Figure 3. Client-Server Architecture
The client and server communicate using HTTP requests. When
needed, a server might interact with the database to fulfill the
request sent by the client.
On the other hand, a Generative AI application utilizes the
power of LLM to understand human language prompts and
generate relevant outputs:
Figure 4. AI-Powered Application Architecture
While the architecture is similar to a traditional application,
there’s an added layer to connect to LLMs.
This added layer is where LangChain comes in, as it performs
and manages tasks related to the LLM, such as processing our input
into a prompt that LLMs can understand. It also processes the
response from the LLM into a format that traditional applications
can work with.
You’ll understand more as you practice building generative AI
applications in the following chapters.
For now, just think of LangChain as a management layer
between your application server and the LLM.
Development Environment Set Up
To start developing AI applications with LangChain.js, you need
to have three things on your computer:
1. A web browser
2. A code editor
3. The Node.js program
Let’s install them in the next section.
Installing Chrome Browser
Any web browser can be used to browse the Internet, but for
development purposes, you need to have a browser with
sufficient development tools.
The Chrome browser developed by Google is a great browser
for web development, and if you don’t have the browser
installed, you can download it here:
https://www.google.com/chrome/
The browser is available for all major operating systems. Once
the download is complete, follow the installation steps
presented by the installer to have the browser on your
computer.
Next, we need to install a code editor. There are several free
code editors available on the Internet, such as Sublime Text,
Visual Studio Code, and Notepad++.
Out of these editors, my favorite is Visual Studio Code because
it’s fast and easy to use.
Installing Visual Studio Code
Visual Studio Code or VSCode for short is a code editor
application created for the purpose of writing code. Aside from
being free, VSCode is fast and available on all major operating
systems.
You can download Visual Studio Code here:
https://code.visualstudio.com/
When you open the link above, there should be a button
showing the version compatible with your operating system as
shown below:
Figure 5. Downloading VSCode
Click the button to download VSCode, and install it on your
computer.
Now that you have a code editor installed, the next step is to
install Node.js
Installing Node.js
Node.js is a JavaScript runtime application that enables you to
run JavaScript outside of the browser. We need this program to
run our LangChain code and install the required packages.
You can download and install Node.js from https://nodejs.org.
Pick the recommended LTS version because it has long-term
support. The installation process is pretty straightforward.
To check if Node has been properly installed, type the command
below on your command line (Command Prompt on Windows
or Terminal on Mac):
node -v
The command line should respond with the version number of
the Node.js you have on your computer.
Node.js also includes a program called npm (Node Package
Manager) which you can use to install and manage Node
packages:
npm -v
Node packages are JavaScript libraries and frameworks that
you can use for free in your project. We’re going to use npm to
install some packages later.
Now you have all the software needed to start developing
LangChain applications. Let’s do that in the next chapter.
Summary
In this chapter, you’ve learned the architecture of a generative
AI application, and how LangChain takes the role of an
integration layer between the server and the LLM API endpoint.
You’ve also installed the tools required to write and run a
LangChain application on your computer.
If you encounter any issues, you can email me at
[email protected] and I will do my best to help you.
CHAPTER 2: YOUR FIRST
LANGCHAIN APPLICATION
It’s time to create our first LangChain application.
We will create a simple question and answer application where
we can ask a Large Language Model any kind of question.
First, create a folder on your computer that will be used to store
all files related to this project. You can name the folder
'beginning_langchain_js'.
Next, open Visual Studio Code and select File > Open
Folder… from the menu bar. Select the folder you just created.
VSCode will load the folder and display the content in the
Explorer sidebar. It should be empty, as we haven’t created any
files yet.
To create a file, right-click anywhere inside the VSCode window
and select New Text File or New File… from the menu.
Once the file is created, press Control + S or Command + S to save
it. Name the file app.js.
Installing LangChain Packages
Now you need to install the packages required to create a
LangChain application.
In VSCode, right-click on the folder you’ve just created, then
select Open in Integrated Terminal to show the command line
inside VSCode.
On the terminal, run the following command:
npm install langchain @langchain/google-genai dotenv
The above command will install three packages:
▪ langchain contains all the core modules of LangChain
▪ @langchain/google-genai is the Google Generative AI
integration module
▪ dotenv is used to load LLM API keys from environment
variables
Once the packages are installed, npm will generate a
package.json file containing the versions of these installed
packages.
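For reference, the generated package.json will look roughly like the sketch below; the actual version numbers depend on when you run the install, so the "^x.x.x" values here are only placeholders:

{
  "dependencies": {
    "@langchain/google-genai": "^x.x.x",
    "dotenv": "^x.x.x",
    "langchain": "^x.x.x"
  }
}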
Creating the Question & Answer Application
It’s time to write the code for the question and answer
application.
On the app.js file, import the Google Generative AI class and
load the environment variables:
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
// load environment variables
import 'dotenv/config';
To interact with LLMs in LangChain, you need to create an
object that represents the API for that LLM.
Because we want to interact with Google’s LLM, we need to
create an object from the ChatGoogleGenerativeAI class as
follows:
const llm = new ChatGoogleGenerativeAI({
  model: 'gemini-1.5-pro-latest',
  apiKey: process.env.GOOGLE_GEMINI_KEY,
});
The GOOGLE_GEMINI_KEY contains the API key which you’re going
to get in the next section.
For now, you just need to understand that the
ChatGoogleGenerativeAI object represents the Google LLM you
want to use.
When instantiating a new llm object, you need to pass an object
specifying the options for that instance.
The model option is required so that Google knows which model
you want to use, and the apiKey option is used to verify you
have permission to use that model.
Next, write the code for a simple question and answer
application as follows:
console.log('Q & A With AI');
console.log('=============');
const question = "What's the currency of Thailand?";
console.log(`Question: ${question}`);
const response = await llm.invoke(question);
console.log(`Answer: ${response.content}`);
In the code above, we simply print some text showing the
question we want to ask the model.
The llm.invoke() method will send the input question to the
LLM and return a response object.
The answer is stored under the content property, so we print
the response.content value.
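If you’re curious about what else is in the response, you can log the whole object; the exact fields vary by provider, but the generated text always lives in the content property:

console.log(response); // an AIMessage object; the generated text is in response.content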
Because we use the await syntax, we need to add the "type":
"module" option in the package.json file as follows:
{
  "type": "module",
  "dependencies": {
    // ...
  }
}
This type option is also needed when we use the import syntax
for the packages instead of require().
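For comparison, this is roughly what the CommonJS require() version of the imports would look like; keep in mind that the top-level await used in this chapter only works with ES modules, which is another reason to keep the "type": "module" setting:

// CommonJS equivalent of the imports (not used in this book)
const { ChatGoogleGenerativeAI } = require('@langchain/google-genai');
require('dotenv').config();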
Now the application is ready, but we still need to get the Google
Gemini API key to access the LLM.
Getting Google Gemini API Key
To get the API key, you need to visit the Gemini API page at
https://ai.google.dev/gemini-api
On the page, you need to click the 'Get API Key in Google AI
Studio' button as shown below:
Figure 6. Get Gemini API Key
From there, you’ll be taken to Google AI Studio.
Note that you might be shown the page below when clicking
the button:
Figure 7. Google AI Studio Available Regions Page
This page usually appears when you are located in a region
that’s not served by Google AI Studio.
One way to handle this is to use a VPN service, but I would
recommend you use another LLM instead, such as OpenAI or
Ollama which I will show in the next chapters.
If this is your first time accessing the studio, it will show you the
terms of service like this:
Figure 8. Google AI Studio Terms of Service
Just check on the 'I consent' option, then click 'Continue'.
Now click the 'Create API Key' button to create the key:
Figure 9. Gemini Create API Key
If you’re asked where to create the API Key, select create in new
project:
Figure 10. Create API Key in New Project
Google will create a Cloud project and generate the key for you.
After a while, you should see the key shown in a pop-up box as
follows:
Figure 11. Gemini API Key Generated
Copy the API key string, then create a .env file in your
application folder with the following content:
GOOGLE_GEMINI_KEY='Your Key Here'
Replace the Your Key Here string above with your actual API
key.
Running the Application
With the API key obtained, you are ready to run the LangChain
application.
From the terminal, run the app.js file using Node.js as follows:
node app.js
You should see the following output in your terminal:
Q & A With AI
=============
Question: What's the currency of Thailand?
Answer: Thai baht
This means you have successfully created your first LangChain
application and interacted with Google’s Gemini LLM using the
API key.
Each LLM model has its own characteristics. The 'gemini-1.5-
pro-latest' model usually answers a question directly with no
extra information.
You can try changing the model to 'gemini-1.5-flash-latest' as
shown below:
const llm = new ChatGoogleGenerativeAI({
  model: 'gemini-1.5-flash-latest',
  apiKey: process.env.GOOGLE_GEMINI_KEY,
});
Now run the app.js file again, and the answer is a bit different
this time:
Q & A With AI
=============
Question: What's the currency of Thailand?
Answer: The currency of Thailand is the **Thai baht**, which is abbreviated as
**THB**.
Here, the 'gemini-1.5-flash' model restates the question first, then
gives more information, such as the currency abbreviation.
The asterisk ** symbols around THB are meant to make the text
appear in bold, but it’s rendered as-is in the terminal.
Now try replacing the question variable with any question you
want to ask the LLM.
Resource Exhausted Error
When using Google Gemini, you might see an error like this
when running the application:
ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..
This error occurs because the free tier resource has been
exhausted. You need to try again at a later time.
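If you don’t want the application to crash when this happens, you can wrap the call in a try/catch block. A minimal sketch, assuming the llm and question variables from earlier in this chapter:

try {
  const response = await llm.invoke(question);
  console.log(`Answer: ${response.content}`);
} catch (error) {
  // Likely the free tier quota was hit; wait a while and run the app again
  console.error('Request failed:', error.message);
}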
Summary
The code for this chapter is available in the
02_Simple_Q&A_Gemini folder.
In this chapter, you’ve created and run your first LangChain
application. Congratulations!
The application can connect to Google’s Gemini LLM to ask
questions and get answers.
In the next chapter, we’re going to learn how to use OpenAI’s
GPT model in LangChain.
CHAPTER 3: USING OPENAI LLM
IN LANGCHAIN
In the previous chapter, you’ve seen how to communicate with
Google’s Gemini model using LangChain.
In this chapter, I will show you how to use OpenAI in
LangChain as an alternative.
But keep in mind that the OpenAI API has no free tier. It used to
give away $5 worth of API usage, but it seems to have been
quietly stopped.
So if you want to use OpenAI API, you need to buy the
minimum amount of credit, which is $5 USD.
Getting Started With OpenAI API
OpenAI is an AI research company that aims to develop and
promote capable AI software for the benefit of humanity. The
famous ChatGPT is one of the products developed by OpenAI.
Besides the ChatGPT application, OpenAI also offers GPT
models, the LLM that powers ChatGPT, in the form of HTTP API
endpoints.
To use OpenAI’s API, you need to register an account on their
website at https://platform.openai.com.
After you sign up, you can go to
https://platform.openai.com/api-keys to create a new secret key.
When you try to create a key for the first time, you’ll be asked to
verify by adding a phone number:
Figure 12. OpenAI Phone Verification
OpenAI only uses your phone number for verification purposes.
You will receive the verification code through SMS.
Once you’re verified, you’ll be asked to add a credit
balance for API usage. If you’re not prompted, go to
https://platform.openai.com/account/billing to add some credits.
Figure 13. OpenAI Adding Credits
OpenAI receives payment using credit cards, so you need to
have one. The lowest amount you can buy is $5 USD, and it will
be more than enough to run all the examples in this book using
OpenAI.
Alternatively, if you somehow get the $5 free trial credits, then
you don’t need to set up your billing information.
Next, input a name for the key and select the project it will belong
to:
Figure 14. OpenAI Create API Key
Click the 'Create secret key' button, and OpenAI will show you
the generated key:
Figure 15. OpenAI Copy API Key
You need to copy and paste this API key into the .env file of your
project:
OPENAI_KEY='Your Key Here'
Now that you have the OpenAI key ready, it’s time to use it in
the LangChain application.
Integrating OpenAI With LangChain
To use OpenAI in LangChain, you need to install the
@langchain/openai package using npm:
npm install @langchain/openai
Once the package is installed, create a new file named
app_gpt.js and import the ChatOpenAI class from the package.
When instantiating the llm object, specify the model and
openAIApiKey options as shown below:
import { ChatOpenAI } from '@langchain/openai';
import 'dotenv/config';
const llm = new ChatOpenAI({
  model: 'gpt-4o',
  openAIApiKey: process.env.OPENAI_KEY,
});
You can change the model parameter with the model you want
to use. As of this writing, GPT-4o is the latest LLM released by
OpenAI.
Let’s ask GPT the same question we asked to Gemini:
console.log('Q & A With AI');
console.log('=============');
const question = "What's the currency of Thailand?";
console.log(`Question: ${question}`);
const response = await llm.invoke(question);
console.log(`Answer: ${response.content}`);
Save the changes, then run the file using Node.js:
node app_gpt.js
You should get a response similar to this:
Q & A With AI
=============
Question: What's the currency of Thailand?
Answer: The currency of Thailand is the Thai Baht. Its ISO code is THB.
This means the LangChain application successfully
communicated with the GPT chat model from OpenAI.
Awesome!
You can try asking another question by changing the question
variable value.
ChatGPT vs Gemini: Which One To Use?
Both ChatGPT and Gemini are very capable of performing the
tasks we want them to do in this book, so it’s really up to you.
In the past, OpenAI used to give away $5 in credits for free, but that
no longer seems to be the case, as many people on the OpenAI forum
report that they don’t receive the credits after registering.
On the other hand, Google is offering a free tier for the Gemini
model in exchange for using our data to train the model, so it’s
okay to use it for learning and exploring LangChain.
Still, Google has the right to stop the free tier at any time, so let
me introduce you to one more way to use LLMs in LangChain.
This time, we’re going to use open-source models.
Summary
The code for this chapter is available in the folder
03_Using_OpenAI from the book source code.
In this chapter, you’ve learned how to create an OpenAI API key
and use it in a LangChain application.
Here, we start to see one of the benefits of using LangChain,
which is easy integration with LLMs of any kind.
LangChain represents the LLMs as packages that you can install
and import into your project.
You only need to create an instance of the model class and run
the invoke() method to access the LLM.
Whenever you need to use another LLM, you only need to
change the class used and pass the right model and API key.
CHAPTER 4: USING OPEN-
SOURCE LLMS IN LANGCHAIN
The LangChain library enables you to communicate with LLMs
of any kind, from proprietary LLMs such as Google’s Gemini
and OpenAI’s GPT to open-source LLMs like Meta’s Llama and
Mistral.
This chapter will show you how to use open-source models in
LangChain. Let’s jump in.
Ollama Introduction
Ollama is a tool used to run LLMs locally. It handles
downloading and managing models and exposes an HTTP API
endpoint for the models you want to use on your computer.
To get started, head over to https://ollama.com and click the
'Download' button.
From there, you can select the version for your Operating
System:
Figure 16. Downloading Ollama
Once downloaded, open the package and follow the instructions
until you are asked to install the command line tool as follows:
Figure 17. Installing Ollama Terminal Command
Go ahead and click the 'Install' button.
Once the installation is finished, Ollama will show you how to
run a model:
Figure 18. Ollama Running Your First Model
But since Llama 3 is an 8 billion parameter model, the model is
quite large at 4.7 GB.
I recommend you run the Gemma model instead, which has 2
billion parameters:
ollama run gemma:2b
The Gemma model is a lightweight model from Google, so you
can think of it as an open-source version of Google Gemini.
The Gemma 2B model is only 1.7 GB in size, so it comes in
handy when you want to try out Ollama.
Once the download is finished, you can immediately use the
model from the terminal. Ask it a question as shown below:
Figure 19. Example of Asking Gemma in Ollama
To exit the running model, type /bye and press Enter.
As long as Ollama is running on your computer, the Ollama API
endpoint is accessible at localhost:11434 as shown below:
Figure 20. Ollama Localhost API Endpoint
LangChain will use this API endpoint to communicate with
Ollama models, which we’re going to do next.
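If you want to confirm from Node.js that the endpoint is reachable, a quick check looks like the sketch below (Node 18+ ships with fetch built in; the exact response text may differ between Ollama versions):

// Quick sanity check that the local Ollama server is up
const res = await fetch('http://localhost:11434');
console.log(await res.text()); // usually prints "Ollama is running"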
Using Ollama in LangChain
To use the models downloaded by Ollama, you need to import
the ChatOllama class which was developed by the LangChain
community.
From the terminal, install the community package using npm as
follows:
npm install @langchain/community
Next, create a new file named app_ollama.js and import the
Ollama chat model as shown below:
import { ChatOllama } from '@langchain/community/chat_models/ollama';
const llm = new ChatOllama({
  model: 'gemma:2b',
});
console.log('Q & A With AI');
console.log('=============');
const question = "What's the currency of Thailand?";
console.log(`Question: ${question}`);
const response = await llm.invoke(question);
console.log(`Answer: ${response.content}`);
Because Ollama is open-source and local, you don’t need to
import the dotenv module and use API keys.
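By default, ChatOllama talks to the localhost:11434 endpoint you saw earlier. If your Ollama instance runs on another machine or port, you can point LangChain at it with the baseUrl option, as in this sketch (the address below is just an example):

// Only needed when Ollama is not on the default http://localhost:11434
const remoteLlm = new ChatOllama({
  model: 'gemma:2b',
  baseUrl: 'http://192.168.1.10:11434',
});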
Now you can run the file using Node.js to communicate with
the LLM. You should have a response similar to this:
Q & A With AI
=============
Question: What's the currency of Thailand?
Answer: The currency of Thailand is the Thai baht (THB). It is subdivided into
100 sen. The baht is denoted by the symbol THB.
Note that because the LLM model is running on your computer,
the answer may take longer when compared to the Gemini and
GPT models.
And that’s how you use Ollama in LangChain. If you want to use
other open-source models, you need to download the model
with Ollama first:
ollama pull mistral
The pull command downloads the model without running it on
the command line.
After that, you can switch the model parameter when creating
the ChatOllama object:
// Switch the model
const llm = new ChatOllama({
  model: 'mistral',
});
Remember that the larger the model, the longer it takes to run.
The common guideline is that you should have at least 8 GB of
RAM available to run the 7B models, 16 GB to run the 13B
models, and 32 GB to run the 33B models.
There are many open-source models that you can run using
Ollama, such as Google’s Gemma and Microsoft’s Phi-3.
You can explore https://ollama.com/library to see all available
models.
Again, Which One To Use?
So far, you have explored how to use Google Gemini, OpenAI
GPT, and Ollama open-source models. Which one should you use in
your application?
I recommend you use OpenAI GPT if you can afford it, because
the API isn’t as strictly rate-limited as the free tiers and the results
are great.
If you can’t use OpenAI GPT for any reason, then you can use
Google Gemini free tier if it’s available in your country.
If not, you can use Ollama and download the Gemma 2B model
or Llama 3, based on your computer’s RAM capacity.
Unless specifically noted, I’m going to use OpenAI GPT for all
the example code shown in this book.
But don’t worry, because replacing the LLM part in LangChain is
very easy. You only need to change the llm variable itself, as
shown in this chapter.
You can get the code examples that use Gemini and Ollama in
the repository.
Summary
The code for this chapter is available in the folder
04_Using_Ollama from the book source code.
In this chapter, you’ve learned how to use open-source LLMs
using Ollama and LangChain.
Using Ollama, you can download and run any Large Language
Models that are open-source and free to use.
If you look at the Ollama website, you will find many models
that are very capable and can even match proprietary models
such as Gemini and ChatGPT in performance.
If you are worried about the privacy of your data and want to
make sure that no one uses it for training their LLMs, then
using open-source LLMs like Llama 3, Mistral, or Gemma can be
a great choice.
CHAPTER 5: ENABLING USER
INPUT WITH PROMPTS
So far, you have asked the LLM questions by writing them
directly in the question constant.
Instead of hard-coding the question, let’s enable
user input so that you can type the question in the terminal.
This can be done by installing the prompts package from npm:
npm install prompts
The prompts package is a lightweight package used to add
interactivity to the terminal.
This means you can ask for user input when you run the
question and answer application.
Back to the application, import the prompts package and ask for
a question like this:
import prompts from 'prompts';
// ...
console.log('Q & A With AI');
console.log('=============');
const { question } = await prompts({
  type: 'text',
  name: 'question',
  message: 'Your question: ',
  validate: value => (value ? true : 'Question cannot be empty'),
});
const response = await llm.invoke(question);
console.log(response.content);
The prompts() function takes an object or an array of objects,
then uses each object to form a question to the user.
You can specify the type of the response, the name of the variable
that stores the user input, the message to show when asking for
input, and the validate function.
The prompts() function will keep asking the same question until
the validate function returns true.
If you run the application now, you can type the question in the
terminal as shown below:
Figure 21. User Input In The Terminal
With this, you can ask any kind of question without needing to
replace the question variable each time.
You can also allow the user to chat with the LLM until the user
types '/bye' in the terminal.
Wrap the user input prompt in a while loop as shown below:
console.log('Q & A With AI');
console.log('=============');
console.log('Type /bye to stop the program');
let exit = false;
while (!exit) {
  const { question } = await prompts({
    type: 'text',
    name: 'question',
    message: 'Your question: ',
    validate: value => (value ? true : 'Question cannot be empty'),
  });
  if (question == '/bye') {
    console.log('See you later!');
    exit = true;
  } else {
    const response = await llm.invoke(question);
    console.log(response.content);
  }
}
This way, JavaScript will keep asking for input until the user
types '/bye'.
We’ll ask for more input in the coming chapters, so having the
prompts module will come in handy.
Summary
The code for this chapter is available in the folder
05_Enabling_User_Input from the book source code.
In this chapter, you’ve added the prompts package to capture
user input and make the application more interactive.
The prompts package will only be used as long as we’re learning
LangChain. It won’t be used when we develop a web
application later.
In the next chapter, we’re going to learn about prompt
templates.
CHAPTER 6: LANGCHAIN
PROMPT TEMPLATES
The LangChain prompt template is a JavaScript class used to
create a specific prompt (or instruction) to send to a Large
Language Model.
By using prompt templates, we can reproduce the same
instruction while requiring minimal input from the user.
To show you an example, suppose you are creating a simple AI
application that only gives currency information of a specific
country.
Based on what we already know, the only way to do this is to
keep repeating the question while changing the country as
shown below:
Your question: … What is the currency of Malaysia?
...
Your question: … What is the currency of India?
...
Your question: … What is the currency of Cambodia?
Instead of repeating the question, you can create a template for
that question as follows:
console.log('Currency Info');
console.log('=============');
console.log('You can ask for the currency of any country in the world');
const { country } = await prompts({
  type: 'text',
  name: 'country',
  message: 'Input Country: ',
  validate: value => (value ? true : 'Country cannot be empty'),
});
const response = await llm.invoke(`What is the currency of ${country}`);
console.log(response.content);
Now you only need to give the country name to get the
currency information.
While you can use a template string as shown above,
LangChain recommends you use the prompt template class
instead for effective reuse. Let me show you how.
Creating a Prompt Template
To create a prompt template, you need to import the
PromptTemplate class from @langchain/core/prompts as shown
below:
import { PromptTemplate } from '@langchain/core/prompts';
The next step is to create the prompt itself.
You can create a variable named prompt, then assign a new
PromptTemplate() instance to that variable:
const prompt = new PromptTemplate({});
When calling the PromptTemplate() constructor, you need to pass
an object with two properties:
1. inputVariables - An array of the names of the variables
used in the template
2. template - The string for the prompt template itself
Here’s an example of the complete PromptTemplate() call:
const prompt = new PromptTemplate({
  inputVariables: ['country'],
  template: `What is the currency of {country}? Answer in one short paragraph`,
});
Now you can use the prompt object when calling the
llm.invoke() method.
You need to call the prompt.format() method and pass the
variable specified in the inputVariables parameter as shown
below:
const { country } = await prompts({
  type: 'text',
  name: 'country',
  message: 'Input Country: ',
  validate: value => (value ? true : 'Country cannot be empty'),
});
const response = await llm.invoke(await prompt.format({ country: country }));
console.log(response.content);
Now run the application and ask for the currency of a specific
country.
Here’s an example of asking for the currency of Spain:
Figure 22. LLM Response
The PromptTemplate class provides a structure from which you
can construct a specific prompt.
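As a shorthand, LangChain also provides PromptTemplate.fromTemplate(), which infers the input variables from the placeholders in the string. The sketch below is equivalent to the template above:

// The {country} placeholder becomes the input variable automatically
const shortPrompt = PromptTemplate.fromTemplate(
  'What is the currency of {country}? Answer in one short paragraph'
);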
Prompt Template With Multiple Inputs
The prompt template can accept as many inputs as you need in
your template string.
For example, suppose you want to control the number of
paragraphs and the language of the answer. You can add two
more variables to the prompt template like this:
const prompt = new PromptTemplate({
  inputVariables: ['country', 'paragraph', 'language'],
  template: `
    You are a currency expert.
    You give information about a specific currency used in a specific country.
    Answer the question: What is the currency of {country}?
    Answer in {paragraph} short paragraph in {language}
  `,
});
Because we have three input variables, we need to ask the
user for the inputs.
Just above the prompts() call, create an array of objects
containing the input details as shown below:
const questions = [
  {
    type: 'text',
    name: 'country',
    message: 'What country?',
    validate: value => value ? true : 'Country cannot be empty',
  },
  {
    type: 'number',
    name: 'paragraph',
    message: 'How many paragraphs (1 to 5)?',
    validate: value =>
      value >= 1 && value <= 5 ? true : 'Paragraphs must be between 1 and 5',
  },
  {
    type: 'text',
    name: 'language',
    message: 'What Language?',
    validate: value => value ? true : 'Language cannot be empty',
  },
];
Notice that the paragraph input is limited between 1 and 5 to
avoid generating a long article.
Now pass the questions variable to the prompts() function, and
then extract the inputs using the destructuring assignment
syntax:
const { country, paragraph, language } = await prompts(questions);
Now you can pass the inputs to the prompt.format() method:
const response = await llm.invoke(
  await prompt.format({ country, paragraph, language })
);
console.log(response.content);
And that’s it. Now you can try running the application as shown
below:
Figure 23. Multiple Inputs Result
Combining the prompt template with prompts, you can create a
more sophisticated currency information application that can
generate an answer exactly N paragraphs long and in your
preferred language.
Restricting LLM From Answering Unwanted
Prompts
Prompt templates can also prevent your model from answering
irrelevant or unwanted questions.
For example, you can ask the LLM about the currency of
Narnia, which is a fictional country created by the British
author C.S. Lewis:
Figure 24. LLM Answering All Kinds of Questions
While the answer is appropriate, you might not want to give
information about fictional or non-existent countries in the first
place.
Modify the prompt template parameter as shown below:
const prompt = new PromptTemplate({
  inputVariables: ['country', 'paragraph', 'language'],
  template: `
    You are a currency expert.
    You give information about a specific currency used in a specific country.
    Avoid giving information about fictional places.
    If the country is fictional or non-existent, answer: I don't know.
    Answer the question: What is the currency of {country}?
    Answer in {paragraph} short paragraph in {language}
  `,
});
The prompt above instructs the LLM to not answer when asked
about fictional places.
Now if you ask again, the LLM will respond as follows:
Figure 25. LLM Not Answering
As you can see, the LLM refused to answer when asked about
the currency of a fictional country.
With a prompt template, the code is more maintainable and
cleaner compared to using template strings repeatedly.
Summary
The code for this chapter is available in the folder
06_Prompt_Template from the book source code.
The use of prompt templates enables you to craft a
sophisticated instruction for LLMs while requiring only
minimal inputs from the user.
The more specific your instruction, the more accurate the
response will be.
You can even instruct the LLM to avoid answering unwanted
prompts, as shown in the last section.
CHAPTER 7: THE LANGCHAIN
EXPRESSION LANGUAGE (LCEL)
In the previous chapter, we called the prompt.format() method
inside the llm.invoke() method as shown below:
const response = await llm.invoke(
  await prompt.format({ country, paragraph, language })
);
console.log(response.content);
While this technique works, LangChain actually provides a
declarative way to sequentially execute the prompt and llm
objects.
This declarative way is called the LangChain Expression
Language (LCEL for short).
Using LCEL, you can wrap the prompt and the llm object in a
chain as follows:
const chain = prompt.pipe(llm);
LCEL is marked by the pipe() method, which can be used to
wrap LangChain components together.
Components in LangChain include the prompt, the LLM, the
chain itself, and parsers. We’ll learn more about parsers in the
next section.
You can call the invoke() method from the chain object, and
pass the inputs required by the prompt as an object like this:
const response = await chain.invoke({ country, paragraph, language });
console.log(response.content);
The chain object will format the prompt and then pass it
automatically to the llm object.
The response object is the same as when you call the
llm.invoke() method: it’s a message object with the answer
stored under the content property.
Sequential Chains
By using LCEL, you can create many chains and run the next
prompt once the LLM responds to the previous prompt.
This method of running the next prompt after the previous
prompt has been answered is called the sequential chain.
Based on the inputs and outputs involved, sequential chains are
divided into two categories:
▪ Simple Sequential Chain
▪ Regular Sequential Chain
We will explore the regular sequential chain in the next
chapter. For now, let’s explore the simple sequential chain.
Simple Sequential Chain
The simple sequential chain is where each step in the chain has
a single input and output. The output of one step becomes the
input of the next prompt:
Figure 26. Simple Sequential Chain Illustration
For example, suppose you want to create an application that
can write a short essay.
You will provide the topic, and the LLM will first decide on the
title, and then continue by writing the essay for that topic.
To create the application, you need to create a prompt for the
title first:
const titlePrompt = new PromptTemplate({
  inputVariables: ['topic'],
  template: `
    You are an expert journalist.
    You need to come up with an interesting title for the following topic:
    {topic}
    Answer exactly with one title
  `,
});
The titlePrompt above receives a single input variable: the
topic for the title it will generate.
Next, you need to create a prompt for the essay as follows:
const essayPrompt = new PromptTemplate({
  inputVariables: ['title'],
  template: `
    You are an expert nonfiction writer.
    You need to write a short essay of 350 words for the following title:
    {title}
    Make sure that the essay is engaging and makes the reader feel excited.
  `,
});
This essayPrompt also takes a single input: the title generated
by the titlePrompt which we created before.
Now you need to create two chains, one for each prompt:
const firstChain = titlePrompt.pipe(llm).pipe(new StringOutputParser());
const secondChain = essayPrompt.pipe(llm);
The firstChain uses the StringOutputParser class to parse the
LLM response as a string, so you need to import the parser from
LangChain:
import { StringOutputParser } from '@langchain/core/output_parsers';
Using the string parser, the LLM response will be converted
from an object into a single string value, removing the
metadata included in the response.
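To make the difference concrete, here is a small sketch comparing the raw message object with the parsed string (assuming the llm object from the earlier chapters):

const message = await llm.invoke('Name one planet');
console.log(message.content); // the text is wrapped inside a message object with metadata

const parsedChain = llm.pipe(new StringOutputParser());
const text = await parsedChain.invoke('Name one planet');
console.log(text); // just the text, with no surrounding message object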
Now you can combine the firstChain and secondChain to create
an overallChain as follows:
const overallChain = firstChain
  .pipe(firstChainResponse => ({ title: firstChainResponse }))
  .pipe(secondChain);
Note that an arrow function is passed in the first pipe() method.
This function formats the value returned by the firstChain into
an object that can be passed into the secondChain.
Now that you have an overallChain, let’s update the prompts
questions to ask just one question:
console.log('Essay Writer');
const { topic } = await prompts([
  {
    type: 'text',
    name: 'topic',
    message: 'What topic to write?',
    validate: value => (value ? true : 'Topic cannot be empty'),
  },
]);
const response = await overallChain.invoke({ topic });
console.log(response.content);
And you’re finished. If you run the application and ask for a
topic, you’ll get a response similar to this:
Figure 27. Simple Sequential Chain Result
There are a few paragraphs cut from the result above, but you
can already see that the firstChain prompt generates the title
variable used by the secondChain prompt.
Using simple sequential chains allows you to break down a
complex task into a sequence of smaller tasks, improving the
accuracy of the LLM results.
Using Multiple LLMs in Sequential Chain
You can also assign a different LLM for each chain you create
using LCEL.
The following sample code runs the first chain using OpenAI
GPT, while the second chain uses Google Gemini:
const llm = new ChatOpenAI({
  model: 'gpt-4o',
  apiKey: process.env.OPENAI_KEY,
});
const llm2 = new ChatGoogleGenerativeAI({
  model: 'gemini-1.5-pro-latest',
  apiKey: process.env.GOOGLE_GEMINI_KEY,
});
// Use a different LLM for each chain:
const firstChain = titlePrompt.pipe(llm).pipe(new StringOutputParser());
const secondChain = essayPrompt.pipe(llm2);
Because LCEL is declarative, you can swap the components in
the chain easily.
Debugging the Sequential Chains
If you want to see the process of the sequential chains in more
detail, you can enable verbose logging when creating the llm
object:
const llm = new ChatOpenAI({
  model: 'gpt-4o',
  apiKey: process.env.OPENAI_KEY,
  verbose: true
});
When you rerun the application, you’ll see the debug output on
the terminal.
You can see the prompt sent by LangChain to LLM by searching
for the [llm/start] log as follows:
[llm/start] [1:llm:ChatOpenAI] Entering LLM run with input: {
// ...
}
To see the output, you need to look for the [llm/end] log.
If you search for the second chain input, you’ll see the prompt
defined as follows:
[llm/start] [1:llm:ChatOpenAI] Entering LLM run with input: {
"messages": [
[
{
"lc": 1,
"type": "constructor",
"id": [
"langchain_core",
"messages",
"HumanMessage"
],
"kwargs": {
"content": "\n You are an expert nonfiction writer.\n\n You need
to write a short essay of 350 words for the following title:\n\n \"Living
with Giants: Unraveling the Mysteries of Bears\"\n\n Make sure that the
essay is engaging and makes the reader feel excited.\n ",
"additional_kwargs": {},
"response_metadata": {}
}
}
]
]
}
The input for the second prompt is formatted as a string
because we use the StringOutputParser() for the first chain.
If you don’t parse the output of the first chain, then the second
chain prompt will look like this:
"kwargs": {
"content": "\n You are an expert nonfiction writer.\n\n You need to write
a short essay of 350 words for the following title:\n\n
[object Object]
\n\n Make sure that the essay is engaging and makes the reader feel
excited.\n ",
}
Notice that the response is embedded into the string as [object
Object], so the LLM might misunderstand the request.
In the case of GPT, it tells you in the response:
I'm sorry, it seems like there was an error in the title provided.
Let's assume a compelling topic to proceed with.
How about this title: "The Marvel of Quantum Computing"?
In my case, I asked GPT to write an essay about bears, and it
randomly suggested a title based on its training data.
To minimize this kind of undesired response, you need to parse
the output of the first chain using LangChain parsers.
We’ll use another parser in the next chapter.
Summary
The code for this chapter is available in the folder 07_LCEL
from the book source code.
In this chapter, you’ve learned about the LangChain Expression
Language, which can be used to compose LangChain
components in a declarative way.
A chain is simply a wrapper for these LangChain components:
1. The prompt template
2. The LLM to use
3. The parser to process the output from LLM
The components of a chain are interchangeable, meaning you
can use the GPT model for the first prompt, and then use the
Gemini model for the second prompt, as shown above.
By using LCEL, you can create advanced workflows and interact
with Large Language Models to solve a complex task.
In the next chapter, I will show you how to create a regular
sequential chain.
CHAPTER 8: REGULAR
SEQUENTIAL CHAINS
A regular sequential chain is a more general form of sequential
chain that allows multiple inputs and outputs.
The input for the next chain is usually a mix of the output from
the previous chain and another source like this:
Figure 28. Sequential Chain Illustration
This chain is a little more complicated than a simple sequential
chain because we need to track multiple inputs and outputs.
For example, suppose you change the essayPrompt from the
previous chapter to have two inputVariables as follows:
const essayPrompt = new PromptTemplate({
  inputVariables: ['title', 'emotion'],
  template: `
    You are an expert nonfiction writer.
    You need to write a short essay of 350 words for the following title:
    {title}
    Make sure that the essay is engaging and makes the reader feel {emotion}.
  `,
});
The emotion input required by the essay prompt doesn’t come
from the first chain, which only returns the title variable.
You need to inject the emotion input when creating the
overallChain as follows:
const overallChain = firstChain
  .pipe(result => ({
    title: result,
    emotion,
  }))
  .pipe(secondChain);
This way, the title input is obtained from the output of the
firstChain, while the emotion input is from another source.
Now you need to ask the user for the emotion input:
console.log('Essay Writer');
const questions = [
  {
    type: 'text',
    name: 'topic',
    message: 'What topic to write?',
    validate: value => (value ? true : 'Topic cannot be empty'),
  },
  {
    type: 'text',
    name: 'emotion',
    message: 'What emotion to convey?',
    validate: value => (value ? true : 'Emotion cannot be empty'),
  },
];
const { topic, emotion } = await prompts(questions);
And now you have two inputs for the second chain: title and
emotion.
Make sure that the overallChain is declared below the await
prompts() line.
Format the Output Variables
A sequential chain usually also tracks multiple output variables.
To keep track of multiple output variables, you can use the
StructuredOutputParser from LangChain to format the output as
a JSON object.
First, import the parser from LangChain:
import {
  StringOutputParser,
  StructuredOutputParser,
} from '@langchain/core/output_parsers';
Next, create a parser that contains the schema of the output as
an object.
Suppose you want to show the title, emotion, and essay values
in the response.
Call the StructuredOutputParser.fromNamesAndDescriptions()
method and pass the schema as follows:
const firstChain = titlePrompt.pipe(llm).pipe(new StringOutputParser());
const structuredParser = StructuredOutputParser.fromNamesAndDescriptions({
  title: 'the essay title',
  emotion: 'the emotion conveyed by the essay',
  essay: 'the essay content',
});
After that, pass the parser into the secondChain as follows:
const secondChain = essayPrompt.pipe(llm).pipe(structuredParser);
Now the output of the second chain will be parsed by the
structuredParser.
In the essayPrompt, update both inputVariables and template
parameters to include the format_instructions input:
const essayPrompt = new PromptTemplate({
  inputVariables: ['title', 'emotion', 'format_instructions'],
  template: `
    You are an expert nonfiction writer.
    You need to write a short essay of 350 words for the following title:
    {title}
    Make sure that the essay is engaging and makes the reader feel {emotion}.
    {format_instructions}
  `,
});
For the last step, pass the format_instructions input by calling
the structuredParser.getFormatInstructions() method when
creating the overallChain object:
const overallChain = firstChain
  .pipe(result => ({
    title: result,
    emotion,
    format_instructions: structuredParser.getFormatInstructions(),
  }))
  .pipe(secondChain);
The getFormatInstructions() method returns a string that
contains the instruction and the JSON schema. You can see the
string by calling console.log() if you want to:
console.log(structuredParser.getFormatInstructions());
Now you can run the application and give the topic and emotion
inputs for the essay. The LLM will return a JavaScript object:
If you want to print each output variable, you can access the
properties directly as follows:
const response = await overallChain.invoke({
  topic,
});
console.log(response.title);
console.log(response.emotion);
console.log(response.essay);
Now you have a sequential chain that tracks multiple inputs
and outputs. Very nice!
Summary
The code for this chapter is available in the folder
08_Sequential_Chain from the book source code.
When you create a sequential chain, you can add extra inputs
to the next chain that are not sourced from the previous chain.
You can also format the output as a JSON object to make the
response better organized and easier to process.
CHAPTER 9: IMPLEMENTING
CHAT HISTORY IN LANGCHAIN
So far, the LLM took the question we asked it and gave an
answer retrieved from the training data.
Going back to the simple question and answer application in
Chapter 5, you can try to ask the LLM a question such as:
1. When was the last FIFA World Cup held?
2. Multiply the year by 2
At the time of this writing, the last FIFA World Cup was held in
2022. Reading the second prompt above, we can understand
that the 'year' refers to '2022'.
However, because the LLM has no awareness of the previous
interaction, the answer won’t be related.
With GPT, the LLM refers to the current year instead of the last
FIFA World Cup year:
Figure 30. LLM Out of Context Example
The LLM can’t understand that we are making a follow-up
instruction to the previous question.
To address this issue, you need to save the previous messages
and use them when sending a new prompt.
To follow along with this chapter, you can copy the code from
Chapter 5 and use it as a starter.
Creating a Chat Prompt Template
First, you need to create a chat prompt template that has the
chat history injected into it.
A chat prompt template is different from the usual prompt
template. It accepts an array of messages, and each message can
be associated with a specific role.
Here’s an example of a chat prompt template:
import { ChatPromptTemplate } from '@langchain/core/prompts';
const chatTemplate = ChatPromptTemplate.fromMessages([
  ['system', 'You are a helpful AI bot. Your name is {name}'],
  ['human', 'Hello, how are you doing?'],
  ['ai', "I'm doing well, thanks!"],
  ['human', '{input}'],
]);
In the example above, the messages are associated with the
'system', 'human', and 'ai' roles. The 'system' message is used to
influence AI behavior.
You can use the ChatPromptTemplate and MessagesPlaceholder
classes to create a prompt that accepts a chat history as follows:
import {
  ChatPromptTemplate,
  MessagesPlaceholder,
} from '@langchain/core/prompts';
const prompt = ChatPromptTemplate.fromMessages([
  [
    'system',
    `You are an AI chatbot having a conversation with a human.
    Use the following context to understand the human question.
    Do not include emojis in your answer`,
  ],
  new MessagesPlaceholder('chatHistory'),
  ['human', '{input}'],
]);
The MessagesPlaceholder class acts as an opening from which
you can inject the chat history. You need to pass a string key
that stores the chat history when instantiating the class.
Saving Messages in LangChain
To save chat messages in LangChain, you can use the provided
ChatMessageHistory class:
import { ChatMessageHistory } from 'langchain/memory';
const history = new ChatMessageHistory();
The ChatMessageHistory class provides methods to get, add, and
clear messages.
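Just to illustrate those methods, here is a small sketch of calling them directly on the history object created above:

import { HumanMessage, AIMessage } from '@langchain/core/messages';

// Manually store one exchange and read it back
await history.addMessage(new HumanMessage('When was the last FIFA World Cup held?'));
await history.addMessage(new AIMessage('It was held in 2022 in Qatar.'));
console.log(await history.getMessages()); // an array of the stored message objects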
But you’re not going to manipulate the history object directly.
Instead, you need to pass this object into the
RunnableWithMessageHistory class.
The RunnableWithMessageHistory class creates a chain that injects
the chat history for you. It will also add new messages
automatically when you invoke the chain:
// RunnableWithMessageHistory comes from the LangChain core runnables module
import { RunnableWithMessageHistory } from '@langchain/core/runnables';

const chain = prompt.pipe(llm);
const chainWithHistory = new RunnableWithMessageHistory({
  runnable: chain,
  getMessageHistory: sessionId => history,
  inputMessagesKey: 'input',
  historyMessagesKey: 'chatHistory',
});
When creating the RunnableWithMessageHistory object, you need
to pass the chain that you want to inject history into (runnable)
and a function that returns the chat history (getMessageHistory).
The inputMessagesKey is the input key that exists in the prompt,
while historyMessagesKey is the variable that accepts the history
(it should be the same as the string key passed to
MessagesPlaceholder in the prompt).
Now that the chainWithHistory object is created, you can test the
AI and see if it’s aware of the previous conversation. Use a while
loop here to show the input again:
console.log('Chat With AI');
console.log('Type /bye to stop the program');
let exit = false;
while (!exit) {
  const { question } = await prompts([
    {
      type: 'text',
      name: 'question',
      message: 'Your question: ',
      validate: value => (value ? true : 'Question cannot be empty'),
    },
  ]);
  if (question == '/bye') {
    console.log('See you later!');
    exit = true;
  } else {
    const response = await chainWithHistory.invoke(
      { input: question },
      {
        configurable: {
          sessionId: 'test',
        },
      }
    );
    console.log(response.content);
  }
}
When invoking the chainWithHistory object, you need to pass
the sessionId key into the config parameter as shown above.
The sessionId can be any string value.
Now you can test the application by giving a follow-up question:
Q: When was the last FIFA World Cup held?
A: The last FIFA World Cup was held in 2022 in Qatar.
Q: Multiply the year by 2
A: Multiplying the year 2022 by 2 gives you 4044.
Because the chat history is injected into the prompt, the LLM
can put the second question in the context of the first.
If you pass the verbose option to the LLM:
const llm = new ChatOpenAI({
model: 'gpt-4o',
apiKey: process.env.OPENAI_KEY,
verbose: true
});
You will see the chat history is passed when you run the second
question inside the messages array:
[llm/start] [1:llm:ChatOpenAI] Entering LLM run with input: {
"messages": [
... objects containing previous chat messages
]
}
These previous messages are injected by the chainWithHistory
chain into the prompt.
Summary
The code for this chapter is available in the folder
09_Chat_History from the book source code.
In this chapter, you’ve learned how to inject chat history into
the prompt using LangChain’s RunnableWithMessageHistory class.
Adding the chat history enables AI to contextualize your
question based on the previous messages.
Combining chat history with prompts, you can chat with AI
continuously while retaining previous interactions.
CHAPTER 10: AI AGENTS AND
TOOLS
An AI agent is a piece of software capable of solving a task
through a sequence of actions. It uses the LLM as a reasoning
engine to plan and execute those actions.
All you need to do is to give the agent a specific task. The agent
will process the task, determine the actions needed to solve it,
and then take those actions.
An agent can also use tools to take actions in the real world,
such as searching for specific information on the internet.
Here’s an illustration to help you understand the concept of
agents:
Figure 31. LLM Agents Illustration
Not all LLMs are capable of powering an agent, so advanced
models like GPT-4, Gemini 1.5 Pro, or Mistral are required.
Let me show you how to create an agent using LangChain next.
Creating an AI Agent With LangChain
Create a new JavaScript file named react_agent.js and import
the following modules:
import { pull } from 'langchain/hub';
import { ChatOpenAI } from '@langchain/openai';
import { AgentExecutor, createReactAgent } from 'langchain/agents';
import { DuckDuckGoSearch } from
'@langchain/community/tools/duckduckgo_search';
import { WikipediaQueryRun } from
'@langchain/community/tools/wikipedia_query_run';
import { Calculator } from '@langchain/community/tools/calculator';
import 'dotenv/config';
import prompts from 'prompts';
The dotenv and ChatOpenAI modules have been used before, but
the rest are new modules used to create an AI agent.
The pull function is used to retrieve a prompt from the
LangChain community hub. You can visit the hub at
https://smith.langchain.com/hub
The LangChain community hub is an open collection of
prompts that you can use for free in your projects.
The createReactAgent module creates an agent that uses ReAct
prompting, while AgentExecutor manages the execution of the
agent, such as processing inputs, generating responses, and
updating the agent’s state.
Next, initialize the llm and get the prompt from the hub as
follows:
const llm = new ChatOpenAI({
model: 'gpt-4o',
apiKey: process.env.OPENAI_KEY,
});
const prompt = await pull('hwchase17/react');
The pull() function retrieves the prompt from the repository
you specified as its argument. Here, we use the "react" prompt
created by the user "hwchase17".
If you want to see the prompt, you can visit
https://smith.langchain.com/hub/hwchase17/react
Next, instantiate the tools we want to provide to the LLM, then
create the agent executor object as follows:
const wikipedia = new WikipediaQueryRun();
const ddgSearch = new DuckDuckGoSearch({ maxResults: 3 });
const calculator = new Calculator();
const tools = [wikipedia, ddgSearch, calculator];
const agent = await createReactAgent({ llm, tools, prompt });
const agentExecutor = new AgentExecutor({
agent,
tools,
verbose: true // show the logs
});
There are three tools we provide to the agent:
1. wikipedia for accessing and summarizing Wikipedia
articles
2. ddgSearch for searching the internet using the DuckDuckGo
search engine
3. calculator for calculating mathematical equations
To run the DuckDuckGo search tool, you need to install the
duck-duck-scrape package using npm:
npm install duck-duck-scrape
The rest of the tools are already available from the
@langchain/community module.
Once the installation is finished, complete the agent by adding a
question prompt and calling the invoke() method:
const { question } = await prompts([
{
type: 'text',
name: 'question',
message: 'Your question: ',
validate: value => (value ? true : 'Question cannot be empty'),
},
]);
const response = await agentExecutor.invoke({ input: question });
console.log(response);
The AI agent is now complete. You can run it using Node.js as
follows:
node react_agent.js
Now give the agent a task to finish, such as 'Who was the
first president of America?'
Because the verbose parameter is set to true in AgentExecutor,
you will see the reasoning and action taken by the LLM:
Figure 32. LLM Agent Reasoning and Taking Actions
The LLM will take actions through the agent we have created to
reach the final answer.
The following log shows the reasoning done by the LLM:
Figure 33. LLM Agent Finished
After the agent finishes running, it returns an object with
two properties, input and output, as shown below:
{
input: 'Who was the first president of America?',
output: 'The first president of the United States was George Washington.'
}
If the LLM you use already has the answer in its training data,
you might see the output immediately without any
[agent/action] logs.
For example, I asked 'When is America Independence Day?'
below:
Your question: … When is America Independence Day?
America's Independence Day is a well-known historical event that does not
require searching for current updates or detailed information from an
encyclopedia.
It is generally known information.\n\nFinal Answer: America's Independence Day
is on July 4th.
{
input: 'When is America Independence Day?',
output: "America's Independence Day is on July 4th."
}
Because the answer is already in its training data, the LLM
decides to answer directly.
Asking Different Questions to the Agent
You can now ask different kinds of questions to see if the LLM is
smart enough to use the available tools.
If you ask 'Who is the Prime Minister of Singapore today?', the
LLM should use DuckDuckGo search to seek the latest
information:
Figure 34. LLM Agent Doing Search
If you ask a math question such as 'Take 5 to the power of 2
then multiply that by the sum of six and three', the agent should
use the calculator tool:
Figure 35. LLM Agent Doing Math
The latest LLMs are smart enough to understand the intent of
the question and pick the right tool for the job.
List of Available AI Tools
An AI agent can only use the tools you add when you create
the agent.
The list of tools provided by LangChain can be found at
https://js.langchain.com/v0.2/docs/integrations/tools.
However, some tools like Calculator and BingSerpAPI are not
listed on the integration page above, so you need to dive into
the source code of the LangChain community package to find
all available tools.
Just open the node_modules/ folder, then go into
@langchain/community/tools and you’ll see all tools there:
Figure 36. LLM Agent Available Tools
You can see other tools like Google search and Bing search here,
but these tools require an API key to run.
Types of AI Agents
There are several types of AI agents identified today, and the
one we created is called a ReAct (Reason + Act) agent.
The ReAct agent is a general-purpose agent, and there are more
specialized agents such as the XML agent and JSON agent.
You can read more about the different agent types at
https://js.langchain.com/v0.1/docs/modules/agents/agent_types/
As LLMs and LangChain improve, new types of agents might be
created, so the definitions above won't always be relevant.
Summary
The code for this chapter is available in the folder
10_Agents_and_Tools from the book source code.
While we don't have autonomous robot helpers in our world
today (yet), we can already see how AI agents might one day
serve as the brains of AI robots.
AI agents are a wonderful innovation that shows how a
machine can come up with a sequence of actions to take to
accomplish a goal.
When you create an agent, the LLM is used as a reasoning
engine that needs to come up with steps of logic to accomplish a
task.
The agent can also use various tools to act, such as browsing the
web or solving a math equation.
More complex tasks that use multiple tools can also be executed
by these agents.
Sometimes, a well-trained LLM can answer directly from the
training data, bypassing the need to use tools.
CHAPTER 11: INTERACTING
WITH DOCUMENTS IN
LANGCHAIN
One of the most interesting AI use cases is the Chat With
Document feature, which enables users to interact with a
document using conversational queries.
By simply asking questions, users can quickly find relevant
information, summarize content, and gain insights without
having to read and sort through pages of text.
Here’s an illustration of the process required to create a Chat
With Document application:
Figure 37. Chat With Document Application Process
You first need to split a single document into chunks so that the
document can be processed and indexed effectively. The chunks
are then converted into vector embeddings.
Vector embeddings are arrays of numbers that represent
information of various types, including text, images, audio, and
more, by capturing their features in a numerical format.
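For example, here is a small sketch (assuming your OPENAI_KEY is
set) that embeds a short piece of text; the result is simply an
array of numbers:
import { OpenAIEmbeddings } from '@langchain/openai';
import 'dotenv/config';
const embeddings = new OpenAIEmbeddings({ apiKey: process.env.OPENAI_KEY });
const vector = await embeddings.embedQuery('What is a vector embedding?');
console.log(vector.length); // e.g. 1536 numbers for OpenAI's default model
console.log(vector.slice(0, 5)); // the first few values, such as [0.01, -0.02, ...]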
When you ask a question, LangChain converts the query into a
vector and then searches the document vectors for the most
relevant matches.
LangChain then sends the user query and the relevant text chunks
to the LLM so that the LLM can generate a response based on the
given input.
The process of finding relevant text information and sending it
to the LLM is also known as Retrieval Augmented Generation,
or RAG for short.
Using LangChain, you can create an application that processes a
document so you can ask the LLM questions relevant to the
content of that document.
Let’s jump in.
Getting the Document
You’re going to upload a document to the application and ask
questions about it. You can get the text file named ai-
discussion.txt from the source code folder.
The text file contains a fictional story that discusses the impact
of AI on humanity.
Because it’s fictional, you can be sure that the LLM obtains the
answer from the document and not from its training data.
Building the Chat With Document Application
Create a new JavaScript file named rag_app.js, then import the
packages required for the Chat With Document application as
follows:
import { ChatOpenAI, OpenAIEmbeddings } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import prompts from 'prompts';
// New packages:
import { TextLoader } from 'langchain/document_loaders/fs/text';
import { RecursiveCharacterTextSplitter } from '@langchain/textsplitters';
import { MemoryVectorStore } from 'langchain/vectorstores/memory';
import { createRetrievalChain } from 'langchain/chains/retrieval';
import { createStuffDocumentsChain } from 'langchain/chains/combine_documents';
import 'dotenv/config';
The OpenAIEmbeddings class is used to access the OpenAI
embedding model.
The TextLoader class is used to load a text file, while
RecursiveCharacterTextSplitter is used to split a text into small
chunks so that it can be processed more efficiently by LLMs.
The MemoryVectorStore class is used to store vector data in
memory. The createRetrievalChain function retrieves the relevant
document chunks and passes them to the chain created by
createStuffDocumentsChain, which stuffs them into the prompt sent
to the LLM.
With the packages imported, you can define the llm next as
shown below:
const llm = new ChatOpenAI({
model: 'gpt-4o',
apiKey: process.env.OPENAI_KEY,
});
After the llm, load the text file by passing its path to
TextLoader as follows:
const loader = new TextLoader('./ai-discussion.txt');
const docs = await loader.load();
The docs variable will be an array of Document objects. Now you
need to create the text splitter and split the document into chunks:
const splitter = new RecursiveCharacterTextSplitter({
chunkSize: 1000,
chunkOverlap: 200,
});
const chunks = await splitter.splitDocuments(docs);
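If you're curious what the splitter produced, each chunk is still
a Document with a pageContent string and a metadata object, so you
can inspect it like this (just a quick check, not part of the
final application):
console.log(chunks.length); // number of chunks created from the file
console.log(chunks[0].pageContent.slice(0, 200)); // start of the first chunk
console.log(chunks[0].metadata); // e.g. the source file path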
To convert the document chunks into vectors, you need to use an
embedding model.
OpenAI provides an API endpoint to turn documents into
vectors, and LangChain provides the OpenAIEmbeddings class so
that you can use this embedding easily.
Add the following code below the chunks variable:
const embeddings = new OpenAIEmbeddings({ apiKey: process.env.OPENAI_KEY });
const vectorStore = await MemoryVectorStore.fromDocuments(chunks, embeddings);
const retriever = vectorStore.asRetriever();
The MemoryVectorStore.fromDocuments() method converts the chunks
into vectors using the embedding model and stores them in the
vector store.
Next, you need to call the asRetriever() method to create a
retriever object, which accepts the user input and returns
relevant chunks of the document.
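If you want to see what the retriever returns before wiring up
the full chain, you can call it directly with a question; it
should return an array of Document objects (a quick sketch):
const relevantDocs = await retriever.invoke('Where does Mr. Thompson work?');
for (const doc of relevantDocs) {
  console.log(doc.pageContent.slice(0, 100)); // preview of each relevant chunk
}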
After that, you need to create the prompt to send to the LLM:
const systemPrompt = `You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer
the question. If you don't know the answer, say that you
don't know. Use three sentences maximum and keep the
answer concise.
\n\n
{context}`;
const prompt = ChatPromptTemplate.fromMessages([
['system', systemPrompt],
['human', '{input}'],
]);
With the prompt created, you need to create a chain by calling
the createStuffDocumentsChain() function and pass llm and
prompt like this:
const questionAnswerChain = await createStuffDocumentsChain({
llm: llm,
prompt: prompt,
});
This questionAnswerChain will handle filling the prompt
template and sending the prompt to the LLM.
Next, pass the questionAnswerChain to the
createRetrievalChain() function:
const ragChain = await createRetrievalChain({
retriever: retriever,
combineDocsChain: questionAnswerChain,
});
The ragChain above will pass the input to the retriever, which
returns the relevant parts of the document to the chain.
The relevant document chunks and the input are then passed to
the questionAnswerChain to get the result.
Now you can ask for user input, then run the invoke() method
with that input:
const { question } = await prompts([
{
type: 'text',
name: 'question',
message: 'Your question: ',
validate: value => (value ? true : 'Question cannot be empty'),
},
]);
const response = await ragChain.invoke(
{ input: question },
{
configurable: {
sessionId: 'test',
},
}
);
console.log(response.answer);
When the LLM returns the answer, you print the answer to the
command line.
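Besides answer, the object returned by the retrieval chain should
also contain the original input and, under context, the document
chunks that were retrieved, which can be handy for debugging:
console.log(response.input); // the question you asked
console.log(response.context.length); // how many chunks were retrieved
console.log(response.context[0].pageContent.slice(0, 100)); // preview of one chunk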
Now the application is complete. You can run it using Node.js
from the command line:
node rag_app.js
And then ask a question relevant to the document context.
Here’s an example:
node rag_app.js
Your question: … Where does Mr. Thompson work?
Mr. Thompson works at VegaTech Inc. as the Chief AI Scientist.
As we can see, the LLM can answer the question by analyzing
the prompt created by LangChain and our input.
Adding Chat Memory for Context
To add chat memory to a RAG chain, you need to upgrade the
retriever object by creating a history-aware retriever.
This history-aware retriever is then used to contextualize your
latest question by analyzing the chat history.
You need to import the createHistoryAwareRetriever function, then
create the history-aware retriever as follows:
import { createHistoryAwareRetriever } from 'langchain/chains/history_aware_retriever';
// Also update the prompts import to include MessagesPlaceholder:
import { ChatPromptTemplate, MessagesPlaceholder } from '@langchain/core/prompts';
// ... other code
const retriever = vectorStore.asRetriever();
const contextualizeSystemPrompt = `
Given a chat history and the latest user question
which might reference context in the chat history, formulate a standalone
question
which can be understood without the chat history. Do NOT answer the question,
just reformulate it if needed and otherwise return it as is.
`;
const contextualizePrompt = ChatPromptTemplate.fromMessages([
['system', contextualizeSystemPrompt],
new MessagesPlaceholder('chat_history'),
['human', '{input}'],
]);
const historyAwareRetriever = await createHistoryAwareRetriever({
llm,
retriever,
rephrasePrompt: contextualizePrompt,
});
The contextualizePrompt is used to make the LLM rephrase the
question in the context of the chat history.
This prompt is passed to createHistoryAwareRetriever as the
rephrasePrompt option.
Next, you need to add MessagesPlaceholder to the prompt object
as well:
const prompt = ChatPromptTemplate.fromMessages([
['system', systemPrompt],
new MessagesPlaceholder('chat_history'),
['human', '{input}'],
]);
After that, update the ragChain to use the historyAwareRetriever
as follows:
const ragChain = await createRetrievalChain({
retriever: historyAwareRetriever,
combineDocsChain: questionAnswerChain,
});
Now that you have an updated RAG chain, pass the chain to the
RunnableWithMessageHistory() class like we did in Chapter 9:
// Add the imports
import { ChatMessageHistory } from 'langchain/memory';
import { RunnableWithMessageHistory } from '@langchain/core/runnables';
// ... other code
const ragChain = await createRetrievalChain({
retriever: historyAwareRetriever,
combineDocsChain: questionAnswerChain,
});
const history = new ChatMessageHistory();
const conversationalRagChain = new RunnableWithMessageHistory({
runnable: ragChain,
getMessageHistory: sessionId => history,
inputMessagesKey: 'input',
historyMessagesKey: 'chat_history',
outputMessagesKey: 'answer',
});
As the last step, change the chain invoked when the application
receives a question to conversationalRagChain, and add the
configurable parameter for the sessionId.
To repeat the prompt, add a while loop as before:
let exit = false;
while (!exit) {
const { question } = await prompts([
{
type: 'text',
name: 'question',
message: 'Your question: ',
validate: value => (value ? true : 'Question cannot be empty'),
},
]);
if (question == '/bye') {
console.log('See you later!');
exit = true;
} else {
const response = await conversationalRagChain.invoke(
{ input: question },
{
configurable: {
sessionId: 'test',
},
}
);
console.log(response.answer);
}
}
Now we can interact with the document, and the AI is aware of
the chat history. Good job!
About The Vector Database
We have used the MemoryVectorStore to store vector data in this
application, but this store is actually not recommended for
production.
When the application stops running, the vector data in memory
will be lost.
Most modern database applications such as PostgreSQL, Redis,
and MongoDB support storing vector data, so you might want to
use them in production.
You can see vector database integration details at
https://js.langchain.com/v0.2/docs/integrations/vectorstores/
The MemoryVectorStore database is recommended only for
prototyping and testing.
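As one example of a store that can persist vectors to disk, the
community package ships an HNSWLib integration. Treat the
following as a rough sketch rather than a production setup; it
assumes you have installed the hnswlib-node package:
// npm install hnswlib-node
import { HNSWLib } from '@langchain/community/vectorstores/hnswlib';
// build the store once from the existing chunks and embeddings, then save it
const vectorStore = await HNSWLib.fromDocuments(chunks, embeddings);
await vectorStore.save('./vector-store');
// later, load it back without re-embedding the document
const loadedStore = await HNSWLib.load('./vector-store', embeddings);
const retriever = loadedStore.asRetriever();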
Switching the LLM
If you want to change the LLM used for this application, you
need to also change the embedding model used for the vector
generation.
For Google Gemini, you can import the
GoogleGenerativeAIEmbeddings model as follows:
import {
ChatGoogleGenerativeAI,
GoogleGenerativeAIEmbeddings,
} from '@langchain/google-genai';
// llm
const llm = new ChatGoogleGenerativeAI({
model: 'gemini-1.5-pro-latest',
apiKey: process.env.GOOGLE_GEMINI_KEY
});
// embeddings
const embeddings = new GoogleGenerativeAIEmbeddings({
apiKey: process.env.GOOGLE_GEMINI_KEY,
modelName: 'embedding-001',
});
For Ollama, you can use the community-developed
OllamaEmbeddings class like this:
import { ChatOllama } from '@langchain/community/chat_models/ollama';
import { OllamaEmbeddings } from '@langchain/community/embeddings/ollama';
// llm
const llm = new ChatOllama({ model: 'mistral' });
// embeddings
const embeddings = new OllamaEmbeddings({ model: 'mistral' });
Make sure that you use the same model when instantiating the
ChatOllama and OllamaEmbeddings classes.
Summary
The code for this chapter is available in the folder
11_Chat_With_Document from the book source code.
In this chapter, you’ve learned how to create a Chat With
Document application using LangChain.
Using the RAG technique, LangChain can be used to retrieve
information from a document, and then pass the information to
the LLM.
The first thing you need to do is to process the document and
turn it into chunks, which can then be converted into vectors
using an embedding model.
The vectors are stored in a vector database, and the retriever
created from the database is used when the user sends a query
or input.
Next, let’s see how we can load documents in different formats,
such as .docx and .pdf.
CHAPTER 12: UPLOADING
DIFFERENT DOCUMENT TYPES
Now that we can load a .txt document in LangChain, let’s
improve the application so we can also load other document
types or formats, such as .docx and .pdf.
To load different file formats, you need to import the loader for
each format as follows:
import { TextLoader } from 'langchain/document_loaders/fs/text';
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { DocxLoader } from "@langchain/community/document_loaders/fs/docx";
Next, you need to check on the document extension and load
the document using the matching loader.
Right below the llm creation, specify the location of the file you
want to use as a filePath variable, then get the extension of the
document using the path.extname() method:
import path from 'path';
// ...
const filePath = './python.pdf';
const extension = path.extname(filePath);
let loader = null;
if (extension === '.txt'){
loader = new TextLoader(filePath);
} else if (extension === '.pdf') {
loader = new PDFLoader(filePath);
} else if (extension === '.docx') {
loader = new DocxLoader(filePath);
} else {
throw new Error('The document format is not supported');
}
const docs = await loader.load();
The code above uses the 'python.pdf' file which you can get
from the source code folder.
After you get the extension string, you need to check the value
to create the right loader.
When the document format is not supported, throw an error to
stop the application.
The LangChain loaders depend on Node packages. You need to
install mammoth to load .docx files and pdf-parse to load .pdf files:
npm install mammoth pdf-parse
Once the loader is created, you can load and split the document
into chunks, convert the chunks into vectors, and create the
chains like in the previous chapter:
const docs = await loader.load();
const splitter = new RecursiveCharacterTextSplitter({
chunkSize: 1000,
chunkOverlap: 200,
});
const chunks = await splitter.splitDocuments(docs);
// the rest is the same...
And that’s it. Now you can upload a .txt, .docx, or .pdf
document and ask questions relevant to the content of the
document.
To change the document you want to upload, you just need to
change the filePath variable to the location of the document.
If you upload a document that's not supported, Node will throw
the 'The document format is not supported' error.
Summary
The code for this chapter is available in the folder
12_Uploading_Different_Document_Types from the book source
code.
In this chapter, you have improved the Chat With Document
application further by checking the extension of the given file
and loading the document with different loaders, depending on
the type of the document.
In the next chapter, I’m going to show you how to chat with
YouTube videos. See you there!
CHAPTER 13: CHAT WITH
YOUTUBE VIDEOS
Now that you’ve learned how to make AI interact with
documents, let’s continue with creating a Chat with YouTube
application.
To create this application, you can copy the finished application
from Chapter 11 and make the changes shown in this chapter.
Let’s get started!
Adding The YouTube Loader
The Chat With YouTube application uses the RAG technique to
augment the LLM with the video’s transcript data.
To get a YouTube video’s transcript, you need to install the
youtube-transcript and youtubei.js packages using npm as
follows:
npm install youtube-transcript youtubei.js
These packages are used by LangChain's YouTube loader to fetch
the transcript of a video.
At the top of the file, import the YoutubeLoader class from
LangChain like this:
import { YoutubeLoader } from
'@langchain/community/document_loaders/web/youtube';
We're going to ask the user for a YouTube URL, then process
that URL to create the embeddings and RAG chain.
Just below the history initialization, create a function named
processUrl() with the following content:
const history = new ChatMessageHistory();
const processUrl = async ytUrl => {
const loader = YoutubeLoader.createFromUrl(ytUrl);
const docs = await loader.load();
const splitter = new RecursiveCharacterTextSplitter({
chunkSize: 1000,
chunkOverlap: 200,
});
const chunks = await splitter.splitDocuments(docs);
const embeddings = new OpenAIEmbeddings({ apiKey: process.env.OPENAI_KEY });
const vectorStore = await MemoryVectorStore.fromDocuments(
chunks,
embeddings
);
const retriever = vectorStore.asRetriever();
const historyAwareRetriever = await createHistoryAwareRetriever({
llm,
retriever,
rephrasePrompt: contextualizePrompt,
});
const ragChain = await createRetrievalChain({
retriever: historyAwareRetriever,
combineDocsChain: questionAnswerChain,
});
const conversationalRagChain = new RunnableWithMessageHistory({
runnable: ragChain,
getMessageHistory: sessionId => history,
inputMessagesKey: 'input',
historyMessagesKey: 'chat_history',
outputMessagesKey: 'answer',
});
return conversationalRagChain;
};
The processUrl() function simply processes the given ytUrl
argument to create the embeddings and RAG chain.
You need to delete duplicate code outside of the function that
does the same process.
Next, create a while loop below the function to ask for the
YouTube URL:
let conversationalRagChain = null;
while (!conversationalRagChain) {
const { ytUrl } = await prompts([
{
type: 'text',
name: 'ytUrl',
message: 'YouTube URL: ',
validate: value => (value ? true : 'YouTube URL cannot be empty'),
},
]);
conversationalRagChain = await processUrl(ytUrl);
}
The loop will run as long as conversationalRagChain is null.
Once the URL is processed, create another while loop to ask
for questions from the user:
for questions from the user:
console.log('Video processed. Ask your questions');
console.log('Type /bye to stop the program');
let exit = false;
while (!exit) {
const { question } = await prompts([
{
type: 'text',
name: 'question',
message: 'Your question: ',
validate: value => (value ? true : 'Question cannot be empty'),
},
]);
if (question == '/bye') {
console.log('See you later!');
exit = true;
} else {
const response = await conversationalRagChain.invoke(
{ input: question },
{
configurable: {
sessionId: 'test',
},
}
);
console.log(response.answer);
}
}
The application is now complete. You can run it and test it.
For example, I passed the URL to my video at
https://youtu.be/Sr4KeW078P4 which explains the JavaScript
Promise syntax:
Figure 38. Ask YouTube Application Result
You can pass a YouTube short link or full link. It will work as
shown above.
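The createFromUrl() method also accepts an optional configuration
object. For example, I believe you can request a specific
transcript language and include video metadata like this (check
the LangChain docs for the exact options supported by your
version); inside processUrl() it would look like:
const loader = YoutubeLoader.createFromUrl(ytUrl, {
  language: 'en', // preferred transcript language
  addVideoInfo: true, // attach title, author, and other metadata
});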
Handling Transcript Doesn’t Exist Error
Because transcripts are disabled on most YouTube music videos,
you'll get an error when you give one to the application.
Below is the error message shown when I put a Taylor Swift
video at https://youtu.be/q3zqJs7JUCQ:
Error: Failed to get YouTube video transcription: [YoutubeTranscript]
🚨 Transcript is disabled on this video (q3zqJs7JUCQ)
This error occurs because the transcript is disabled for music
videos.
You also get an error when you pass a non-YouTube URL like
this:
YouTube URL: https://amazon.com
Error: Failed to get youtube video id from the url
To prevent both errors, you need to wrap the function body in a
try..catch block as follows:
const processUrl = async ytUrl => {
try {
const loader = YoutubeLoader.createFromUrl(ytUrl);
// ...
return conversationalRagChain;
} catch (error) {
console.log('Not a YouTube URL or video has no transcript. Please try
another URL');
return null;
}
};
This way, an error message is shown when the video has no
transcript, and the user can input a different URL:
YouTube URL: https://amazon.com
Not a YouTube URL or video has no transcript. Please try another URL
Without the transcript, we can’t generate vector data.
Summary
The code for this chapter is available in the folder
13_Chat_With_Youtube from the book source code.
In this chapter, you’ve learned how to create a Chat With
YouTube application that fetches a YouTube video transcript
using the URL, and then converts the transcript into vectors,
which can be used to augment the LLM knowledge.
By using AI, you can ask for the key points or summary of a
long video without having to watch the video yourself.
CHAPTER 14: INTERACTING
WITH IMAGES USING
MULTIMODAL MESSAGES
Now that we can interact with different types of documents and
chat with YouTube videos, let’s continue with making AI
understand images.
Understanding Multimodal Messages
A multimodal message is a message that uses more than one
mode of communication.
The communication modes known to humans are:
▪ Text
▪ Video
▪ Audio
▪ Image
Advanced models like GPT-4 and Gemini Pro already have
vision capabilities, meaning the models can "see" images and
answer questions about them.
Using multimodal messages, we can use AI to interact with
images, for example asking how many people are shown in an
image or which color is dominant in it.
Let’s learn how to create such an application using LangChain
next.
Sending Multimodal Messages in LangChain
To send a multimodal message in LangChain, you only need to
adjust the prompt template and pass an array of content parts
for the human message.
First, create a new file named handle_image.js, import the
required packages, and define the LLM to use as follows:
import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { readFile } from 'node:fs/promises';
import prompts from 'prompts';
import 'dotenv/config';
const llm = new ChatOpenAI({
model: 'gpt-4o',
apiKey: process.env.OPENAI_KEY,
});
The readFile function from Node is used to read the image file
so it can be encoded as a Base64 string.
Write the function to encode images below the llm variable:
const encodeImage = async imagePath => {
const imageData = await readFile(imagePath);
return imageData.toString('base64');
}
const image = await encodeImage('./image.jpg');
Replace the './image.jpg' path passed to the encodeImage()
function with the path of an image on your computer. You can use
any image, but make sure it doesn't contain any sensitive data.
The encoded image can then be passed into a chat prompt
template as follows:
const prompt = ChatPromptTemplate.fromMessages([
['system', 'You are a helpful assistant that can describe images in
detail.'],
['human',
[
{ type: 'text', text: '{input}' },
{
type: 'image_url',
image_url: {
url: `data:image/jpeg;base64,${image}`,
detail: 'low',
},
},
],
],
]);
In the template above, you can see that there are two content
parts passed as the human message: the "text" part for the
question and the "image_url" part for the image.
We could pass a public image URL (one starting with https://), but
here we use a data:image URI because we're uploading an image
from our computer.
Behind the scenes, LangChain automatically converts the
message into a multimodal message.
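If the image is hosted online instead, the human message part can
reference the URL directly. Here is a sketch of that variant; the
address below is just a placeholder:
const publicImagePrompt = ChatPromptTemplate.fromMessages([
  ['system', 'You are a helpful assistant that can describe images in detail.'],
  ['human',
    [
      { type: 'text', text: '{input}' },
      {
        type: 'image_url',
        // a plain public URL also works; this address is hypothetical
        image_url: { url: 'https://example.com/some-image.jpg', detail: 'low' },
      },
    ],
  ],
]);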
Create a chain from the prompt and the llm, then invoke it with a
question:
const chain = prompt.pipe(llm);
const response = await chain.invoke({"input": "What do you see on this
image?"})
console.log(response.content);
The LLM will process the image, and then give you the
appropriate answer.
Adding Chat History
Now that you can ask questions about an image using a
language model, let’s improve the application by adding a chat
history.
As always, import the packages needed to add the chat history
first:
import { ChatOpenAI } from '@langchain/openai';
import { ChatMessageHistory } from 'langchain/memory';
import { RunnableWithMessageHistory } from '@langchain/core/runnables';
import {
ChatPromptTemplate,
MessagesPlaceholder,
} from '@langchain/core/prompts';
import { readFile } from 'node:fs/promises';
import prompts from 'prompts';
import 'dotenv/config';
Next, you need to instantiate the ChatMessageHistory class and
upgrade the chain to include chat history, similar to Chapter 9.
Also add a new MessagesPlaceholder('chat_history') entry to the
prompt, right after the system message, so the injected history
has a place to go.
const history = new ChatMessageHistory();
const chain = prompt.pipe(llm);
const chainWithHistory = new RunnableWithMessageHistory({
runnable: chain,
getMessageHistory: sessionId => history,
inputMessagesKey: 'input',
historyMessagesKey: 'chat_history',
});
The next step is to ask for a question from the user. Use prompts
to do so:
console.log('Chat With Image');
console.log('Type /bye to stop the program');
let exit = false;
while (!exit) {
const { question } = await prompts([
{
type: 'text',
name: 'question',
message: 'Your question: ',
validate: value => (value ? true : 'Question cannot be empty'),
},
]);
if (question == '/bye') {
console.log('See you later!');
exit = true;
} else {
const response = await chainWithHistory.invoke(
{ input: question },
{
configurable: {
sessionId: 'test',
},
}
);
console.log(response.content);
}
}
And that’s it. Now you can ask questions about the image you
passed to the encodeImage() function.
I’m using an image from https://g.codewithnathan.com/lc-image
for the example below:
Figure 39. Chat With Image Result
Try asking for a certain detail, such as how many people you can
see in the image, or which color is dominant in the image.
Ollama Multimodal Message
If you want to send a multimodal message to Ollama, you need
to download a model that supports the message format, such as
bakllava or llava.
You can run the command ollama pull bakllava to download
the model to your machine. Note that both models require 8GB
of RAM to run without any issues.
To send a multimodal message to Ollama, you need to make
sure that the image_url key holds a string value as follows:
['human',
[
{ type: 'text', text: '{input}' },
{
type: 'image_url',
image_url: `data:image/jpeg;base64,${image}`,
},
],
],
However, at the time of writing there's a bug on LangChain's side
that converts the image_url key into an object and adds a url
property to that object.
To work around this issue, you need to open the ollama.js source
code located in
node_modules/@langchain/community/dist/chat_models and
change the code at line 468 as follows:
else if (contentPart.type === "image_url") {
console.log(contentPart.image_url);
const imageUrlComponents = contentPart.image_url.url.split(",");
// Support both data:image/jpeg;base64,<image> format as well
images.push(imageUrlComponents[1] ?? imageUrlComponents[0]);
}
The console.log() above will print the content of the image_url
key, which looks like this:
{
url: '...'
}
As you can see, it's converted into an object by LangChain even
though we passed a string.
This causes the original ollama.js code, which checks for a
string type, to always throw an error:
else if (
contentPart.type === "image_url" &&
typeof contentPart.image_url === "string"
) {
const imageUrlComponents = contentPart.image_url.split(",");
// Support both data:image/jpeg;base64,<image> format as well
images.push(imageUrlComponents[1] ?? imageUrlComponents[0]);
} else {
throw new Error(
`Unsupported message content type. Must either have type "text" or type
"image_url" with a string "image_url" field.`
);
}
I will update this section once the issue is resolved by
LangChain maintainers.
Summary
The code for this chapter is available in the folder
14_Handling_Images from the book source code.
In this chapter, you’ve learned how to send a multimodal
message to a language model using LangChain.
Note that not all models can understand a multimodal message.
If you’re using Ollama, you need to use a model like bakllava
and not mistral or gemma.
If the LLM doesn’t understand, it will usually tell you that it
can’t understand the message you’re sending.
CHAPTER 15: DEVELOPING AI-
POWERED NEXT.JS APPLICATION
Now that you've explored the core concepts of LangChain, let's
learn how to integrate it into a Next.js web application.
Next.js is a JavaScript and React framework that enables you to
develop a fully functional web application.
Don't worry if you've never used Next.js before. I will guide you
through creating the application and explain the code as we go.
If you ever want to learn Next.js fundamentals in depth, you
can get my Beginning Next.js Development book at
https://codewithnathan.com/beginning-nextjs
Let’s get started!
Creating the Next.js Application
To create a Next.js application, you need to run the create-next-
app package using npx.
At the time of this writing, the latest Next.js version is 14.2.4,
and it has a build issue with LangChain that’s still unresolved.
To avoid the issue, let's use Next.js version 14.1 instead. Run the
following command from the terminal:
npx create-next-app@14.1.0 nextjs-langchain
The npx command allows you to execute a JavaScript package
directly from the terminal.
You should see npm asking to install a new package as shown
below. Proceed by typing 'y' and pressing Enter:
Need to install the following packages:
create-next-app@14.1.0
Ok to proceed? (y) y
Once create-next-app is installed, the program will ask you for
the details of your Next.js project, such as whether you want to
use TypeScript and Tailwind CSS.
To change the option, press left or right to select Yes or No, then
hit Enter to confirm your choice. You can follow the default
setup from create-next-app as shown below:
Need to install the following packages:
create-next-app@14.1.0
Ok to proceed? (y) y
✔ Would you like to use TypeScript? … Yes
✔ Would you like to use ESLint? … Yes
✔ Would you like to use Tailwind CSS? … Yes
✔ Would you like to use `src/` directory? … No
✔ Would you like to use App Router? (recommended) … Yes
✔ Would you like to customize the default import alias (@/*)? … No
The package then generates a new project named 'nextjs-
langchain' as follows:
Success! Created nextjs-langchain at ...
The next step is to use the cd command to change the working
directory to the application we’ve just created, then run npm run
dev to start the application:
cd nextjs-langchain
npm run dev
> next dev
▲ Next.js 14.1.0
- Local: http://localhost:3000
✓ Ready in 1156ms
Now you can view the running application in the browser at the
designated localhost address, http://localhost:3000.
If you see the Next.js welcome screen, it means you have
successfully created and run your first Next.js application. Good
work!
Installing Required Packages
Stop the running application with CTRL + C, then install the
packages we’re going to use:
npm install langchain @langchain/openai
The langchain and @langchain/openai packages are the same
ones we used in previous chapters.
Next, install the packages required to create the front-end React
components:
npm install @radix-ui/react-icons react-textarea-autosize react-code-blocks
marked
With the packages installed, let’s create a server action that will
instruct LangChain to call the LLM next.
Adding the Server Action
Server Actions are a new feature added in Next.js 14 as the
standard way to fetch and process data in Next.js applications.
A server action is basically an asynchronous function that’s
executed on the server. You can call the function from React
components.
In your project folder, create a new folder named actions/, then
create a file named openai.action.ts as follows:
.
├── actions
│ └── openai.action.ts
└── app
Inside this file, import the langchain modules and create the
chain that will be used for communicating with the LLM:
'use server';
import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
const llm = new ChatOpenAI({
model: 'gpt-4o',
apiKey: process.env.OPENAI_KEY,
});
const prompt = ChatPromptTemplate.fromMessages([
['system',
'You are an AI chatbot having a conversation with a human. Use the
following context to understand the human question. Do not include emojis in
your answer',
],
['human', '{input}'],
]);
const chain = prompt.pipe(llm);
The 'use server' directive is added at the top of the file to let
Next.js know that this file can only run on the server.
You don’t need dotenv to load environment variables in a
Next.js application.
After that, create an asynchronous function named getReply()
that accepts a string parameter and calls the chain.invoke()
method as follows:
export const getReply = async (message: string) => {
const response = await chain.invoke({
input: message
});
return response.content;
};
This way, you can call the getReply() function whenever you
want to send a message to the LLM.
Next, we’re going to create the components for the chat
interface.
Adding Profile Pictures for User and
Assistant
Before creating the interface, let’s add two images that will be
used as the profile pictures.
You can get the 'user.png' and 'robot.png' files from the source
code, and place them in the public/ folder.
Also, delete the next.svg and vercel.svg files as we no longer
need them.
Developing React Chat Components
The chat interface will be created using React. First, create a
folder named components/, then create another folder named
chat/ inside it.
Inside the chat/ folder, create a file named index.tsx and
import the packages required by this file:
'use client';
import { useState } from 'react';
import ChatList from './chat-list';
import ChatBottombar from './chat-bottombar';
import { getReply } from '@/actions/openai.action';
When you use React features such as useState, you need to add
the 'use client' directive at the top of the file to tell Next.js
that this component must be rendered in the browser.
The ChatList and ChatBottombar are parts of the chat interface
that we will create later.
Next, you need to define an interface for the Message object as
follows:
export interface Message {
role: string;
content: string;
}
This interface is used to type the Message object, which will store
the messages in the front-end.
Next, create the Chat() function component as follows:
export default function Chat() {
const [loadingSubmit, setLoadingSubmit] = useState(false);
const [messages, setMessages] = useState<Message[]>([]);
const sendMessage = async (newMessage: Message) => {
setLoadingSubmit(true);
setMessages(prevMessages => [...prevMessages, newMessage]);
const response = await getReply(newMessage.content);
const reply: Message = {
role: 'assistant',
content: response as string,
};
setLoadingSubmit(false);
setMessages(prevMessages => [...prevMessages, reply]);
};
return (
<div className='max-w-2xl flex flex-col justify-between w-full h-full'>
<ChatList messages={messages} loadingSubmit={loadingSubmit} />
<ChatBottombar sendMessage={sendMessage} />
</div>
);
}
The Chat() component contains the sendMessage() function,
which is used to send the user message to the server action.
There are two states defined in this component. The
loadingSubmit state will show a loading animation while the
LLM is generating a reply.
The messages state will hold the conversation messages, which
are displayed by the ChatList component.
Next, you need to create the ChatList component, so create a
new file named chat-list.tsx with the following content:
import { useRef, useEffect } from 'react';
import Image from 'next/image';
import { marked } from 'marked';
import { Message } from './';
interface ChatListProps {
messages: Message[];
loadingSubmit: boolean;
}
export default function ChatList({ messages, loadingSubmit }: ChatListProps) {
const bottomRef = useRef<HTMLDivElement>(null);
const scrollToBottom = () => {
bottomRef.current?.scrollIntoView({ behavior: 'smooth', block: 'end' });
};
useEffect(() => {
scrollToBottom();
}, [messages]);
}
The ChatList component will display the messages on the
browser.
The scrollToBottom() function is used to automatically scroll to
the bottom of the chat list. This function is called whenever the
messages variable value is updated because of the useEffect().
Below the useEffect() call, create an if statement to check for
the length of the messages array:
if (messages.length === 0) {
return (
<div className='w-full h-full flex justify-center items-center'>
<div className='flex flex-col gap-4 items-center'>
<Image
src='/robot.png'
alt='AI'
width={64}
height={64}
className='object-contain'
/>
<p className='text-center text-lg text-muted-foreground'>
How can I help you today?
</p>
</div>
</div>
);
}
When the length of the messages array is zero, an image and a
short text asking how the AI can help are shown to the user.
Just below the if statement, write another return statement to
show existing messages:
return (
<div className='w-full overflow-x-hidden h-full justify-end'>
<div className='w-full flex flex-col overflow-x-hidden overflow-y-hidden
min-h-full justify-end'>
{messages.map((message, index) => (
<div
key={index}
className={`flex flex-col gap-2 p-4 ${
message.role === 'user' ? 'items-end' : 'items-start'
}`}
>
<div className='flex gap-3 items-center'>
{message.role === 'user' && (
<div className='flex items-end gap-3'>
<span
className='bg-blue-100 p-3 rounded-md max-w-xs sm:max-w-2xl
overflow-x-auto'
dangerouslySetInnerHTML={{
__html: marked.parse(message.content),
}}
/>
<span className='relative h-10 w-10 shrink-0 rounded-full flex
justify-start items-center overflow-hidden'>
<Image
className='aspect-square h-full w-full object-contain'
alt='user'
width='32'
height='32'
src='/user.png'
/>
</span>
</div>
)}
{message.role === 'assistant' && (
<div className='flex items-end gap-3'>
<span className='relative h-10 w-10 shrink-0 overflow-hidden
rounded-full flex justify-start items-center'>
<Image
className='aspect-square h-full w-full object-contain'
alt='AI'
width='32'
height='32'
src='/robot.png'
/>
</span>
<span className='bg-blue-100 p-3 rounded-md max-w-xs sm:max-w-
2xl overflow-x-auto'>
<span
key={index}
dangerouslySetInnerHTML={{
__html: marked.parse(message.content),
}}
/>
</span>
</div>
)}
</div>
</div>
))}
{loadingSubmit && (
<div className='flex pl-4 pb-4 gap-2 items-center'>
<span className='relative h-10 w-10 shrink-0 overflow-hidden
rounded-full flex justify-start items-center'>
<Image
className='aspect-square h-full w-full object-contain'
alt='AI'
width='32'
height='32'
src='/robot.png'
/>
</span>
<div className='bg-blue-100 p-3 rounded-md max-w-xs sm:max-w-2xl
overflow-x-auto'>
<div className='flex gap-1'>
<span className='size-1.5 rounded-full bg-slate-700 motion-
safe:animate-[bounce_1s_ease-in-out_infinite]'></span>
<span className='size-1.5 rounded-full bg-slate-700 motion-
safe:animate-[bounce_0.5s_ease-in-out_infinite]'></span>
<span className='size-1.5 rounded-full bg-slate-700 motion-
safe:animate-[bounce_1s_ease-in-out_infinite]'></span>
</div>
</div>
</div>
)}
</div>
<div id='anchor' ref={bottomRef}></div>
</div>
);
When the role property is 'user', the user chat bubble will be
shown on the right end. Otherwise, we show the AI assistant
bubble on the left end.
When the loadingSubmit state is true, we show a thinking
animation made of three bouncing dots.
That will be all for the ChatList component.
Next, create a new file named chat-bottombar.tsx which
contains a single chat input bar:
import { useState } from 'react';
import TextareaAutosize from 'react-textarea-autosize';
import { PaperPlaneIcon } from '@radix-ui/react-icons';
import { Message } from './';
interface ChatBottombarProps {
sendMessage: (newMessage: Message) => void;
}
export default function ChatBottombar({ sendMessage }: ChatBottombarProps) {
const [input, setInput] = useState('');
const handleKeyDown = (e: React.KeyboardEvent<HTMLTextAreaElement>) => {
if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault();
sendMessage({ role: 'user', content: input });
setInput('');
}
};
return (
<div className='p-4 flex justify-between w-full items-center gap-2'>
<form className='w-full items-center flex relative gap-2'>
<TextareaAutosize
autoComplete='off'
value={input}
onChange={e => setInput(e.target.value)}
onKeyDown={handleKeyDown}
placeholder='Type your message...'
className='border-input max-h-20 px-5 py-4 text-sm shadow-sm
placeholder:text-muted-foreground focus-visible:outline-none focus-
visible:ring-1 focus-visible:ring-ring disabled:cursor-not-allowed
disabled:opacity-50 w-full border rounded-full flex items-center h-14 resize-
none overflow-hidden'
/>
<button
className='inline-flex items-center justify-center whitespace-nowrap
rounded-md text-sm font-medium transition-colors focus-visible:outline-none
focus-visible:ring-1 focus-visible:ring-ring disabled:pointer-events-none
disabled:opacity-50 hover:bg-accent hover:text-accent-foreground h-14 w-14'
type='submit'
>
<PaperPlaneIcon />
</button>
</form>
</div>
);
}
This component simply holds a <form> element with an
auto-resizing text area and a <button>.
When you send a message, the sendMessage() function passed by
the Chat component will be executed, and LangChain will send
a request to the LLM.
Alright, now the chat interface components are finished.
You can import the component from app/page.tsx as follows:
import Chat from '@/components/chat';
export default function Home() {
return (
<main className='flex h-[calc(100dvh)] flex-col items-center '>
<Chat />
</main>
);
}
Also, remove the CSS style defined in globals.css except for the
Tailwind directives:
@tailwind base;
@tailwind components;
@tailwind utilities;
/* Remove the rest... */
Now run the application using npm run dev, and try chatting
with the LLM from the browser.
The next task is to add a chat history.
Adding Chat History
The chat history can be added to the server action that we’ve
created before.
Import ChatMessageHistory to save the history on the memory
and RunnableWithMessageHistory to create the chain with history:
// openai.action.ts
import { ChatMessageHistory } from 'langchain/memory';
import { RunnableWithMessageHistory } from '@langchain/core/runnables';
import {
ChatPromptTemplate,
MessagesPlaceholder,
} from '@langchain/core/prompts';
// ... other code
const history = new ChatMessageHistory();
const prompt = ChatPromptTemplate.fromMessages([
['system',
'You are an AI chatbot having a conversation with a human. Use the
following context to understand the human question. Do not include emojis in
your answer',
],
new MessagesPlaceholder('chat_history'),
['human', '{input}'],
]);
const chain = prompt.pipe(llm);
const chainWithHistory = new RunnableWithMessageHistory({
runnable: chain,
getMessageHistory: sessionId => history,
inputMessagesKey: 'input',
historyMessagesKey: 'chat_history',
});
After creating the chainWithHistory object, you need to update
the chain called in the getReply() function:
export const getReply = async (message: string) => {
const response = await chainWithHistory.invoke({
input: message
}, {
configurable: {
sessionId: "test"
}
});
return response.content;
}
Now previous chat messages will be added to the prompt.
You can try asking a question first, then ask the AI about the
last question you asked, as shown below:
Now the application looks like a simpler version of ChatGPT.
Very nice!
Summary
The code for this chapter is available in the 15_nextjs_langchain
folder in the book source code.
In this chapter, you have successfully integrated LangChain into
a Next.js application.
As you can see, developing an AI-powered application is not so
different from developing a regular web application.
When using Next.js, you can create the interface using React
components, then define server actions that use the LangChain
library to call LLMs.
When you receive a response, you can store the response in a
state, then show it to the user from there.
CHAPTER 16: DEPLOYING
NEXT.JS AI APPLICATION TO
PRODUCTION
Now that the Next.js application is working, we're going to
make a few improvements so that the application can be
deployed to production.
We need to improve the user experience so that the application
feels more interactive, and then ask the user for an API key to
use the application.
Streaming the Response
So far, the response from the LLM is displayed all at once after
the LLM has finished generating an answer. This delay makes
the user experience less interactive.
To improve the user experience, we can stream the response as
the LLM generates one. This way, users will see the output
gradually, making the interaction feel more dynamic and
responsive.
From the terminal, run npm install ai as follows:
npm install ai
The ai package is a library used for managing chat streams and
UI updates. It enables you to develop dynamic AI-driven
interfaces more efficiently.
Now open the openai.action.ts file and import the
createStreamableValue function from the ai/rsc package:
import { createStreamableValue } from 'ai/rsc';
export const getReply = async (message: string) => {
const stream = createStreamableValue();
(async () => {
const response = await chainWithHistory.stream(
{
input: message,
},
{
configurable: {
sessionId: 'test',
},
}
);
for await (const chunk of response) {
stream.update(chunk.content);
}
stream.done();
})();
return { streamData: stream.value };
};
The createStreamableValue() function is used to create a
streamable object. This object can be updated in real-time as
the response is coming.
Instead of the usual invoke() method, we call the stream()
method from the chain.
The stream() method returns an iterable, which is updated as
the response is streamed by LangChain.
The for await…of syntax is used to handle asynchronous
iteration over the chunks of data received by the response
object.
When the response is completed, the stream.done() method is
called to signal the stream process is finished, and the
stream.value will be returned.
The stream process is wrapped in an immediately invoked
function expression (IIFE) so that the streaming runs in
parallel.
If we remove the IIFE, the streamData is immediately returned
to the front-end before the stream is finished.
Next, open the chat/index.tsx file and update the sendMessage()
function to read the streamable value as follows:
import { readStreamableValue } from 'ai/rsc';
const sendMessage = async (newMessage: Message) => {
setLoadingSubmit(true);
setMessages(prevMessages => [...prevMessages, newMessage]);
const { streamData } = await getReply(newMessage.content);
const reply: Message = {
role: 'assistant',
content: '',
};
setLoadingSubmit(false);
setMessages(prevMessages => [...prevMessages, reply]);
for await (const stream of readStreamableValue(streamData)) {
reply.content = `${reply.content}${stream}`;
setMessages(prevMessages => {
return [...prevMessages.slice(0, -1), reply];
});
}
};
Here, we destructure the streamData value returned by getReply(),
then we initialize the reply object with an empty content
property.
When we receive a response, we call setLoadingSubmit(false) to
stop the thinking indicator, and then add reply as the latest
message.
After that, we create another for await…of loop to read the
streamable value and append it to reply.content.
The latest message is then repeatedly overwritten with the updated
reply object using the setMessages() function.
Now when you send a message to the LLM, the response will
appear with a typing animation.
Creating API Key Input
This Next.js application is using our API key to access the LLM
provider’s API.
This is not recommended for production because we will be
charged every time a user uses the application.
Instead of supplying our API key, let’s enable users to add their
own API keys to the application.
To do so, we need to create a text input in the sidebar for the
API key, and run a process to instantiate the LLM on the server
only when this API key is added.
We will add the server action to process the API key first.
Adding the setApi() Function
Back in the openai.action.ts file, you need to wrap the llm and
chainWithHistory instantiation in a function.
The setApi() function below accepts a string of apiKey that will
be used to instantiate the llm object:
let chainWithHistory: RunnableWithMessageHistory<any, AIMessageChunk> | null = null;

export const setApi = async (apiKey: string) => {
  const llm = new ChatOpenAI({
    model: 'gpt-4o',
    apiKey: apiKey,
  });

  const history = new ChatMessageHistory();

  const prompt = ChatPromptTemplate.fromMessages([
    [
      'system',
      'You are an AI chatbot having a conversation with a human. Use the following context to understand the human question. Do not include emojis in your answer',
    ],
    new MessagesPlaceholder('chat_history'),
    ['human', '{input}'],
  ]);

  const chain = prompt.pipe(llm);

  chainWithHistory = new RunnableWithMessageHistory({
    runnable: chain,
    getMessageHistory: sessionId => history,
    inputMessagesKey: 'input',
    historyMessagesKey: 'chat_history',
  });
};
Here, we initialize the chainWithHistory variable as null. It will hold the RunnableWithMessageHistory instance once the setApi() function is executed.
Because chainWithHistory can now contain a null value, we need to assert its type in the getReply() function as follows:
import { AIMessageChunk } from '@langchain/core/messages';

// inside getReply():
(async () => {
  const response = await (
    chainWithHistory as RunnableWithMessageHistory<any, AIMessageChunk>
  ).stream(
    // ...
  );
})();
The as type assertion tells TypeScript that we are certain the chainWithHistory variable is not null when this function is executed.
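If you prefer not to use a type assertion, a runtime guard works as well. Here is a brief sketch of that alternative (the error message and the local chain variable are just examples, not part of the book's code). Capturing chainWithHistory into a local const lets TypeScript keep the non-null narrowing inside the IIFE:

// inside getReply(), before the streaming IIFE:
const chain = chainWithHistory;
if (!chain) {
  throw new Error('API key not set. Call setApi() first.');
}

(async () => {
  // `chain` is a const, so the non-null check above still applies here
  const response = await chain.stream(
    // ...
  );
})();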
Adding a Chat Sidebar
Now that we have the setApi() function, we need to create a
sidebar where users can input their API key to use the
application.
Create a new file named chat-sidebar.tsx and write the
following code:
'use client';

import { useState } from 'react';

interface ChatSidebarProps {
  handleSubmitKey: (apiKey: string) => void;
}

export default function ChatSidebar({ handleSubmitKey }: ChatSidebarProps) {
  const [keyInput, setKeyInput] = useState('');

  const submitForm = (e: React.FormEvent<HTMLFormElement>) => {
    e.preventDefault();
    handleSubmitKey(keyInput);
  };

  return (
    <aside className='fixed top-0 left-0 z-40 w-64 h-screen -translate-x-full translate-x-0'>
      <div className='h-full px-3 py-4 overflow-y-auto bg-slate-50'>
        <h1 className='mb-4 text-2xl font-extrabold'>OpenAI API Key</h1>
        <div className='w-full flex py-6 items-center justify-between lg:justify-center'>
          <form className='space-y-4' onSubmit={submitForm}>
            <input
              type='password'
              placeholder='API Key'
              className='border border-gray-300 p-2 rounded focus:outline-none focus:ring-2 focus:ring-blue-500'
              value={keyInput}
              onChange={e => setKeyInput(e.target.value)}
            />
            <button
              type='submit'
              className='bg-blue-500 text-white p-2 rounded hover:bg-blue-600 focus:outline-none focus:ring-2 focus:ring-blue-500'
            >
              Submit
            </button>
          </form>
        </div>
      </div>
    </aside>
  );
}
The ChatSidebar component is similar to the ChatBottombar
component as it also has a form with an input and a button.
Next, open the chat/index.tsx file to import the component and the setApi() function, and use them:
import { setApi, getReply } from '@/actions/openai.action';
import ChatSidebar from './chat-sidebar';

export default function Chat() {
  // ...
  const [apiKey, setApiKey] = useState('');
  // ...

  const handleSubmitKey = async (apiKey: string) => {
    await setApi(apiKey);
    setApiKey(apiKey);
  };

  return (
    <div className='max-w-2xl flex flex-col justify-between w-full h-full '>
      <ChatSidebar handleSubmitKey={handleSubmitKey} />
      <ChatList apiKey={apiKey} messages={messages} loadingSubmit={loadingSubmit} />
      {apiKey && (
        <ChatBottombar sendMessage={sendMessage} />
      )}
    </div>
  );
}
We also add the apiKey prop to the ChatList component so that
we can show an alert as long as the API key is empty.
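If the ChatList props are typed, declare the new prop there as well. A quick sketch, assuming the prop names already used in this chapter (your actual props interface may differ slightly):

interface ChatListProps {
  apiKey: string;
  messages: Message[];
  loadingSubmit: boolean;
}

export default function ChatList({ apiKey, messages, loadingSubmit }: ChatListProps) {
  // ...
}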
Update the ChatList component slightly to read the apiKey
value:
useEffect(() => {
  scrollToBottom();
}, [messages]);

if (!apiKey) {
  return (
    <div className='w-full h-full flex justify-center items-center'>
      <div className='flex flex-col gap-4 items-center'>
        <div
          className='bg-blue-50 text-blue-700 px-4 py-3 rounded'
          role='alert'
        >
          <p className='font-bold'>OpenAI Key</p>
          <p className='text-sm'>
            Input your OpenAI API Key to use this application.
          </p>
        </div>
      </div>
    </div>
  );
}
When the API key has not been added, this alert is shown in place of the chat list.
With this, the application is now ready for deployment.
Running Build Locally
The next step is to run the build command from the command line. If you get an error when running the build locally, the same error will also happen when you deploy to production, so it's best to catch it now.
From the root folder of your project, run the Next.js build
command as follows:
npm run build
Once the build process is finished, you'll see an output summarizing the size of each compiled route, which means the build is successful.
Pushing Code to GitHub
Deploying an application requires granting the deployment platform access to your project files and folders. GitHub is a platform that you can use to host and share your software projects.
At this point, you should already have a GitHub account from signing up for UploadThing, but if you haven't created one yet, it's a good time to do so.
Head over to https://github.com and register for a new account.
From the dashboard, create a new repository by clicking + New
on the left sidebar, or the + sign on the right side of the
navigation bar:
Figure 40. Two Ways To Create a Repository in GitHub
A repository (or repo) is a storage space used to store software
project files.
On the Create a Repository page, fill in the details of your project. The only required field is the repository name.
I named mine 'nextjs-langchain'.
You can make the repository public if you want this project as a
part of your portfolio, or you can make it private.
Once a new repo is created, you will be given instructions on
how to push your files into the repository.
The instructions you need are under 'push an existing repository from the command line':
Figure 41. How to Push Existing Repo to GitHub
Now you need to create a local repository for your project. Open the command line, and at the root folder of your project, run the git init command:
git init
This will turn your project into a local repository. Add all
project files into this local repo by running the git add .
command:
git add .
Changes added to the repo aren’t permanent until you run the
git commit command. Commit the changes as shown below:
git commit -m 'Application ready for deployment'
The -m option is used to add a message for the commit. Usually,
you summarize the changes committed to the repository as the
message.
Now you need to push this existing repository to GitHub. You
can do so by following the GitHub instructions:
git remote add origin <URL>
git branch -M main
git push -u origin main
You might be asked to enter your GitHub username and credentials when running the git push command (GitHub uses a personal access token instead of your account password for Git over HTTPS).
Once the push is complete, refresh the GitHub repo page on the
browser, and you should see your project files and folders
there:
Figure 42. Project Pushed to GitHub
This means our application is already pushed (uploaded) to a
remote repository hosted on GitHub.
Vercel Deployment
The last step is to deploy this application on a development
platform. There are several platforms you can use for deploying
an application, such as Google Cloud Platform, AWS, or
Microsoft Azure.
But the best development platform for deploying a Next.js
application is Vercel.
Vercel is a cloud hosting company that you can use to build and
deploy web applications to the internet. It’s also the same
company that created Next.js, so deploying a Next application
on Vercel is very easy.
You can sign up for a free account at https://vercel.com, then
select Create New Project on the Dashboard page:
Figure 43. Vercel Create New Project
Next, you will be asked to provide the project that you want to
build and deploy.
Since the project is uploaded to GitHub, you can select Continue
With Github as shown below:
Figure 44. Vercel Import Repository Menu
Once you grant access to your GitHub account, select the project
to deploy. You can use the search bar to filter the repositories:
Figure 45. Vercel GitHub Import
Then, you will be taken to the project setup page. Click the
Deploy button and Vercel will build the application for you.
When the build is done, you will be shown the success page as
follows:
Figure 46. Vercel Congratulations! Page
You can click on the image preview to open your application.
The application will be assigned a free .vercel.app domain. You
can add your own domain from Vercel settings.
The deployment is finished. Cheers!
Summary
The code for this chapter is available in the
16_nextjs_langchain_prod folder in the book source code.
You have successfully deployed a Next.js application to the
internet. Well done!
Because the AI models are accessed over the HTTP protocol, you need to ensure that an API key for the models is added to the application.
Instead of providing your own API key and incurring charges every time the model runs, you can ask the users to provide their own keys.
WRAPPING UP
Congratulations on finishing this book! We've gone through many concepts and topics together to help you learn how to develop an AI-powered application using LangChain, Next.js, and LLMs such as GPT, Gemini, and open-source models via Ollama.
You’ve also learned how to deploy the application to Vercel so
that it can be accessed from the internet.
I hope you enjoyed learning and exploring LangChain.js with
this book as much as I enjoyed writing it.
I’d like to ask you for a small favor.
If you enjoyed the book, I'd be very grateful if you would leave an honest review on Amazon (I read every review that comes my way).
Every single review counts, and your support makes a big
difference.
Thanks again for your kind support!
Until next time,
Nathan
ABOUT THE AUTHOR
Nathan Sebhastian is a senior software developer with 8+ years
of experience in developing web and mobile applications.
He is passionate about making technology education accessible
for everyone and has taught online since 2018.
LangChainJS For Beginners
A Step-By-Step Guide to AI Application Development With
LangChain,JavaScript/NextJS, OpenAI/ChatGPT and Other LLMs
By Nathan Sebhastian
https://codewithnathan.com
Copyright © 2024 By Nathan Sebhastian
ALL RIGHTS RESERVED.
No part of this book may be reproduced, or stored in a retrieval
system, or transmitted in any form or by any means, electronic,
mechanical, photocopying, recording, or otherwise, without
express written permission from the author.