Most large language model (LLM) applications are implemented in Python, and the simplest way to integrate them with other applications is to expose a RESTful web API. FastAPI is currently one of the most popular Python web frameworks. This article records an experiment building a simple LLM chat web service with FastAPI.
Interface
Installation
Install the following two main modules (the main program below additionally imports langchain and nest_asyncio, which must also be installed):
pip install fastapi
pip install uvicorn
Directory structure
FastAPI projects follow a conventional directory structure: the templates folder contains index.html, and the static folder contains the js, css, and images subfolders.
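The layout described above can be created with a short script; the folder names below match this project's conventions:

```python
from pathlib import Path

# Create the templates folder and the static subfolders FastAPI will serve
for d in ("templates", "static/js", "static/css", "static/images"):
    Path(d).mkdir(parents=True, exist_ok=True)
```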
Main program (main.py)
import asyncio
import nest_asyncio
import uvicorn
from fastapi import FastAPI, Request
from fastapi.staticfiles import StaticFiles
from fastapi.templating import Jinja2Templates
from langchain_core.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.chat_models import ErnieBotChat
from pydantic import BaseModel

nest_asyncio.apply()

llm = ErnieBotChat(
    model_name='ERNIE-Bot',
    ernie_client_id='xxxxxxxx',
    ernie_client_secret='xxxxxxxxxxx',
    temperature=0.75,
)

template = """You are a nice chatbot having a conversation with a human.

New human question: {question}
Response:"""
prompt = PromptTemplate.from_template(template)

conversation = LLMChain(
    llm=llm,
    prompt=prompt,
    verbose=True,
)

class Prompt(BaseModel):
    Method: str
    Message: str

app = FastAPI()
app.mount("/static", StaticFiles(directory="static"), name="static")
templates = Jinja2Templates(directory="templates")

@app.get("/")
async def root(request: Request):
    # return {"message": "Hello, World!"}
    return templates.TemplateResponse("index.html", {
        "request": request
    })

@app.post("/generate/")
def generate(prompt: Prompt):
    print(prompt)
    AIresponse = conversation.predict(question=prompt.Message)
    response = prompt
    response.Message = AIresponse
    print(response)
    return {"response": response}

async def run_server():
    uvicorn.run(app, host="localhost", port=8000)

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(run_server())