> [!IMPORTANT]
> A free large language model (LLM) plugin that allows you to interact with LLMs in Neovim.
>
> - Supports any LLM, such as GPT, GLM, Kimi, DeepSeek, or local LLMs (such as ollama).
> - Allows you to define your own AI tools, with different tools able to use different models.
> - Most importantly, you can use free models provided by any platform (such as Cloudflare, GitHub Models, SiliconFlow, OpenRouter, or other platforms).
> [!NOTE]
> The configurations of different LLMs (such as ollama or DeepSeek), the UI configuration, and the AI tools (including code completion) are covered in the examples, where you will find most of the information you need. Additionally, before using the plugin, make sure your LLM_KEY is valid and that the environment variable is in effect.
- virtual text
- blink.cmp or nvim-cmp
(Screenshots: streaming output and non-streaming output)
One-time, no history retained.
You can configure inline_assistant to decide whether to display diffs (default: show by pressing 'd').
curl

- Register on the official website and obtain your API key (for Cloudflare, you also need to obtain your account ID).
- Set the `LLM_KEY` environment variable (Cloudflare additionally needs `ACCOUNT`) in your `zshrc` or `bashrc`.
```bash
export LLM_KEY=<Your API_KEY>
export ACCOUNT=<Your ACCOUNT> # just for cloudflare
```
Platform | Link to obtain API key | Note |
---|---|---|
Cloudflare | https://dash.cloudflare.com/ | You can see all of Cloudflare's models here; the ones marked as beta are free models. |
ChatGLM (智谱清言) | https://open.bigmodel.cn/ | |
Kimi (月之暗面) | Moonshot AI Open Platform | |
GitHub Models | GitHub Token | |
SiliconFlow (硅基流动) | siliconflow | You can see all of SiliconFlow's models here; select "Only Free" to see all free models. |
DeepSeek | https://platform.deepseek.com/api_keys | |
OpenRouter | https://openrouter.ai/ | |
ChatAnywhere | https://api.chatanywhere.org/v1/oauth/free/render | 200 free calls to GPT-4o-mini are available every day. |
For local LLMs, set `LLM_KEY` to `NONE` in your `zshrc` or `bashrc`.
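For example:

```bash
export LLM_KEY=NONE
```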
- lazy.nvim
```lua
{
  "Kurama622/llm.nvim",
  dependencies = { "nvim-lua/plenary.nvim", "MunifTanjim/nui.nvim" },
  cmd = { "LLMSessionToggle", "LLMSelectedTextHandler", "LLMAppHandler" },
  config = function()
    require("llm").setup({
      url = "https://models.inference.ai.azure.com/chat/completions",
      model = "gpt-4o-mini",
      api_type = "openai",
    })
  end,
  keys = {
    { "<leader>ac", mode = "n", "<cmd>LLMSessionToggle<cr>" },
  },
}
```
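This minimal setup points `url` at the GitHub Models endpoint; to use another provider from the table above, change `url`, `model`, and `api_type` accordingly.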
Cmd | Description |
---|---|
LLMSessionToggle | Open/hide the Chat UI |
LLMSelectedTextHandler | Handle the selected text; the way it is processed depends on the prompt you input |
LLMAppHandler | Call AI tools |
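As an illustration, the commands can be bound through lazy.nvim's `keys` table, with `LLMSelectedTextHandler` taking the prompt as its argument. The bindings and prompt text below are arbitrary examples, not plugin defaults:

```lua
keys = {
  -- toggle the chat UI
  { "<leader>ac", mode = "n", "<cmd>LLMSessionToggle<cr>" },
  -- process the visual selection with the given prompt
  { "<leader>ae", mode = "v", "<cmd>LLMSelectedTextHandler Explain the following code<cr>" },
}
```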
Parameter | Description | Value |
---|---|---|
url | Model endpoint | String |
model | Model name | String |
api_type | Result parsing format | workers-ai, zhipu, openai, ollama |
timeout | The maximum time to wait for a response (in seconds) | Number |
fetch_key | Function that returns the API key | Function |
max_tokens | Limits the number of tokens generated in a response | Number |
temperature | From 0 to 1. Lower values make the response more deterministic; higher values make it more creative, but also more likely to go off topic if set too high | Number |
top_p | A threshold from 0 to 1. Higher values make the response more diverse and less repetitive, but can also admit less likely tokens, which again means off-topic responses | Number |
enable_thinking | Enable the model's deep-thinking ability (the model itself must support this feature) | Boolean |
thinking_budget | The maximum length of the thinking process; only takes effect when enable_thinking is true | Number |
schema | Description of the function parameters required for function calling | Table |
functions_tbl | Table of functions required for function calling | Table |
keep_alive | Keep the connection alive (usually for ollama) | See keep_alive/OLLAMA_KEEP_ALIVE |
streaming_handler | Customize the parsing of streaming output | Function |
parse_handler | Customize the parsing of non-streaming output | Function |
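A minimal sketch of how several of these options might be combined; the endpoint and model are taken from the example above, and the numeric values are illustrative rather than recommended defaults:

```lua
require("llm").setup({
  url = "https://models.inference.ai.azure.com/chat/completions",
  model = "gpt-4o-mini",
  api_type = "openai",

  -- read the key from the environment instead of hard-coding it
  fetch_key = function()
    return vim.env.LLM_KEY
  end,

  timeout = 30,      -- give up after 30 seconds
  max_tokens = 1024, -- cap the length of each response
  temperature = 0.3, -- lower = more deterministic
  top_p = 0.7,
})
```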
Style | Keyname | Description | Default: [mode] keymap | Window |
---|---|---|---|---|
float | Input:Submit | Submit your question | [i] ctrl+g | Input |
float | Input:Cancel | Cancel dialog response | [i] ctrl+c | Input |
float | Input:Resend | Regenerate the response | [i] ctrl+r | Input |
float | Input:HistoryNext | Select the next session history | [i] ctrl+j | Input |
float | Input:HistoryPrev | Select the previous session history | [i] ctrl+k | Input |
float | Input:ModelsNext | Select the next model | [i] ctrl+shift+j | Input |
float | Input:ModelsPrev | Select the previous model | [i] ctrl+shift+k | Input |
split | Output:Ask | Open the input box (in the normal mode of the input box, press Enter to submit your question) | [n] i | Output |
split | Output:Cancel | Cancel dialog response | [n] ctrl+c | Output |
split | Output:Resend | Regenerate the response | [n] ctrl+r | Output |
float/split | Session:Toggle | Toggle session | [n] `<leader>ac` | Input+Output |
float/split | Session:Close | Close session | [n] `<esc>` | float: Input+Output; split: Output |
float/split | Session:Models | Open the model-list window | [n] ctrl+m | float: App input window; split: Output |
split | Session:History | Open the history window (`j`: next, `k`: previous, `<cr>`: select, `<esc>`: close) | [n] ctrl+h | Output |
float | Focus:Input | Jump from the output window to the input window | - | Output |
float | Focus:Output | Jump from the input window to the output window | - | Input |
float | PageUp | Output window page up | [n/i] ctrl+b | Output |
float | PageDown | Output window page down | [n/i] ctrl+f | Output |
float | HalfPageUp | Output window page up (half) | [n/i] ctrl+u | Output |
float | HalfPageDown | Output window page down (half) | [n/i] ctrl+d | Output |
float | JumpToTop | Jump to the top (output window) | [n] gg | Output |
float | JumpToBottom | Jump to the bottom (output window) | [n] G | Output |
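A sketch of overriding some of these defaults, assuming the plugin accepts a `keys` table in `setup()` keyed by the names in the Keyname column (check your installed version for the exact field names):

```lua
require("llm").setup({
  keys = {
    -- submit from normal mode in the input window instead of [i] ctrl+g
    ["Input:Submit"] = { mode = "n", key = "<cr>" },
    -- cancel the response from normal mode
    ["Input:Cancel"] = { mode = "n", key = "<C-c>" },
  },
})
```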
Handler name | Description |
---|---|
side_by_side_handler | Display results in two windows side by side |
action_handler | Display results in the source file in the form of a diff |
qa_handler | AI for single-round dialogue |
flexi_handler | Results are displayed in a flexible window (the window size is calculated automatically from the amount of output text) |
disposable_ask_handler | Flexible questioning: select a piece of code to ask about, or ask directly (the current buffer is the context) |
attach_to_chat_handler | Attach the selected content to the context and ask a question |
completion_handler | Code completion |
curl_request_handler | The simplest curl interaction with the LLM; generally used to query the account balance, the list of available models, etc. |
Each handler's parameters are documented here.
Examples can be found in AI Tools Configuration.
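As a sketch of the general pattern, an AI tool is registered under `app_handler` in `setup()` and invoked through the `LLMAppHandler` command. The tool name `CodeExplain`, its prompt, and the module path for the built-in handlers are assumptions here, not defaults from this document:

```lua
local tools = require("llm.tools") -- assumed module path for the built-in handlers listed above

require("llm").setup({
  app_handler = {
    -- hypothetical tool: explain the selected code in a flexible window
    CodeExplain = {
      handler = tools.flexi_handler,
      prompt = "Explain the following code, only return the explanation",
      opts = {
        enter_flexible_window = true, -- jump into the result window when it opens
      },
    },
  },
})
```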
See UI Configuration and nui/popup
Local LLMs require custom parsing functions: for streaming output, use a custom `streaming_handler`; for AI tools that return their results in one go, use a custom `parse_handler`.

Below is an example of `ollama` running `llama3.2:1b`.
```lua
local function local_llm_streaming_handler(chunk, ctx, F)
  -- no more data: return what has been accumulated so far
  if not chunk then
    return ctx.assistant_output
  end
  local tail = chunk:sub(-1, -1)
  if tail:sub(1, 1) ~= "}" then
    -- partial JSON object: keep buffering until the closing brace arrives
    ctx.line = ctx.line .. chunk
  else
    ctx.line = ctx.line .. chunk
    -- decode the buffered JSON object and append its content to the output window
    local status, data = pcall(vim.fn.json_decode, ctx.line)
    if not status or not data.message.content then
      return ctx.assistant_output
    end
    ctx.assistant_output = ctx.assistant_output .. data.message.content
    F.WriteContent(ctx.bufnr, ctx.winid, data.message.content)
    ctx.line = ""
  end
  return ctx.assistant_output
end

local function local_llm_parse_handler(chunk)
  -- non-streaming responses arrive as a single decoded table
  local assistant_output = chunk.message.content
  return assistant_output
end

return {
  {
    "Kurama622/llm.nvim",
    dependencies = { "nvim-lua/plenary.nvim", "MunifTanjim/nui.nvim" },
    cmd = { "LLMSessionToggle", "LLMSelectedTextHandler", "LLMAppHandler" },
    config = function()
      local tools = require("llm.tools") -- built-in AI tool handlers
      require("llm").setup({
        url = "http://localhost:11434/api/chat", -- your url
        model = "llama3.2:1b",
        streaming_handler = local_llm_streaming_handler,
        app_handler = {
          WordTranslate = {
            handler = tools.flexi_handler,
            prompt = "Translate the following text to Chinese, please only return the translation",
            opts = {
              parse_handler = local_llm_parse_handler,
              exit_on_move = true,
              enter_flexible_window = false,
            },
          },
        },
      })
    end,
    keys = {
      { "<leader>ac", mode = "n", "<cmd>LLMSessionToggle<cr>" },
    },
  },
}
```
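With this configuration, the `WordTranslate` tool can be called through the `LLMAppHandler` command (for example from a visual-mode keymap); the translation is shown in a flexible window that presumably closes when the cursor moves, per `exit_on_move = true`.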
We would like to express our heartfelt gratitude to the contributors of the following open-source projects, whose code has provided invaluable inspiration and reference for the development of llm.nvim:
- olimorris/codecompanion.nvim: Diff style and prompt.
- SmiteshP/nvim-navbuddy: UI.
- milanglacier/minuet-ai.nvim: Code completions.