
llm.nvim

English | 简体中文


Important

A free large language model (LLM) plugin that allows you to interact with LLMs in Neovim.

  1. Supports any LLM, such as GPT, GLM, Kimi, deepseek or local LLMs (such as ollama).
  2. Allows you to define your own AI tools, with different tools able to use different models.
  3. Most importantly, you can use free models provided by any platform (such as Cloudflare, GitHub models, SiliconFlow, openrouter or other platforms).

Note

Check the examples first for the configuration of different LLMs (such as ollama, deepseek), UI configuration, and AI tools (including code completion); there you will find most of the information you need. Additionally, before using the plugin, make sure your LLM_KEY is valid and that the environment variable is in effect.


Screenshots

Chat

models | UI

[Screenshot: llm-chat]

Completion

  • virtual text

[Screenshot: completion-virtual-text]

  • blink.cmp or nvim-cmp

[Screenshot: completion-blink-cmp]

Word Translate

[Screenshot: llm-translate]

Explain Code

Streaming output | Non-streaming output

[Screenshot: llm-explain-code]

Ask

One-time, no history retained.

You can configure inline_assistant to decide whether to display diffs (default: shown by pressing 'd').

[Screenshot: llm-ask]

Attach To Chat

You can configure inline_assistant to decide whether to display diffs (default: shown by pressing 'd').

[Screenshot: llm-attach]

Optimize Code

[Screenshot: llm-optimize-code]

[Screenshot: llm-optimize-compare-action]

Generate Test Cases

[Screenshot: test-case]

AI Translation

[Screenshot: llm-trans]

Generate Git Commit Message

[Screenshot: llm-git-commit-msg]

Generate Docstring

[Screenshot: llm-docstring]

⬆ back to top

Installation

Dependencies

  • curl

Preconditions

  1. Register on the official website of your chosen platform and obtain your API key (Cloudflare additionally requires your account).

  2. Set the LLM_KEY environment variable in your zshrc or bashrc (for Cloudflare, also set ACCOUNT), and make sure it is in effect (restart your shell or source the file).

export LLM_KEY=<Your API_KEY>
export ACCOUNT=<Your ACCOUNT> # just for cloudflare

Websites of different AI platforms

| Platform | Link to obtain API key | Note |
| --- | --- | --- |
| Cloudflare | https://dash.cloudflare.com/ | You can see all of Cloudflare's models here; the ones marked as beta are free models. |
| ChatGLM (智谱清言) | https://open.bigmodel.cn/ | |
| Kimi (月之暗面) | Moonshot AI Open Platform | |
| Github Models | Github Token | |
| siliconflow (硅基流动) | siliconflow | You can see all of Siliconflow's models here; select "Only Free" to see all free models. |
| Deepseek | https://platform.deepseek.com/api_keys | |
| Openrouter | https://openrouter.ai/ | |
| Chatanywhere | https://api.chatanywhere.org/v1/oauth/free/render | 200 free calls to GPT-4o-mini are available every day. |

For local LLMs, set LLM_KEY to NONE in your zshrc or bashrc.
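For example:

export LLM_KEY=NONE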

⬆ back to top

Minimal installation example

  • lazy.nvim
  {
    "Kurama622/llm.nvim",
    dependencies = { "nvim-lua/plenary.nvim", "MunifTanjim/nui.nvim"},
    cmd = { "LLMSessionToggle", "LLMSelectedTextHandler", "LLMAppHandler" },
    config = function()
      require("llm").setup({
        url = "https://models.inference.ai.azure.com/chat/completions",
        model = "gpt-4o-mini",
        api_type = "openai"
      })
    end,
    keys = {
      { "<leader>ac", mode = "n", "<cmd>LLMSessionToggle<cr>" },
    },
  }
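
The same skeleton works for other OpenAI-compatible platforms; only url and model change. For example, a minimal sketch for Deepseek (endpoint and model name taken from Deepseek's public API documentation; verify them against the platform table above):

  require("llm").setup({
    url = "https://api.deepseek.com/chat/completions",
    model = "deepseek-chat",
    api_type = "openai",
  })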

Configuration template

Configuration

Commands

| Cmd | Description |
| --- | --- |
| LLMSessionToggle | Open/hide the Chat UI |
| LLMSelectedTextHandler | Process the selected text; how it is handled depends on the prompt you provide |
| LLMAppHandler | Call AI tools |
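
These commands are typically bound to keymaps. A sketch using the lazy.nvim keys spec from the installation example (the prompt text passed to LLMSelectedTextHandler is free-form; the one below is illustrative):

  keys = {
    -- Toggle the chat UI.
    { "<leader>ac", mode = "n", "<cmd>LLMSessionToggle<cr>" },
    -- Run a prompt against the visual selection.
    { "<leader>ae", mode = "v", "<cmd>LLMSelectedTextHandler Explain the following code<cr>" },
  }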

Model Parameters

| Parameter | Description | Value |
| --- | --- | --- |
| url | Model endpoint | String |
| model | Model name | String |
| api_type | Result parsing format | workers-ai \| zhipu \| openai \| ollama |
| timeout | The maximum timeout for a response (in seconds) | Number |
| fetch_key | Function that returns the API key | Function |
| max_tokens | Limits the number of tokens generated in a response | Number |
| temperature | From 0 to 1. The lower the number, the more deterministic the response; the higher the number, the more creative the response, but more likely to go off topic if it is too high. | Number |
| top_p | A threshold from 0 to 1. The higher the threshold, the more diverse and less repetitive the response, but it can also admit less likely tokens, which again means off-topic responses. | Number |
| enable_thinking | Activate the model's deep-thinking ability (the model itself must support this feature) | Boolean |
| thinking_budget | The maximum length of the thinking process; only takes effect when enable_thinking is true | Number |
| schema | Description of the function parameters required for function calling | Table |
| functions_tbl | Function dict required for function calling | Table |
| keep_alive | Maintain the connection (usually for ollama) | see keep_alive/OLLAMA_KEEP_ALIVE |
| streaming_handler | Customize the parsing format of streaming output | Function |
| parse_handler | Customize the parsing format of non-streaming output | Function |
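
Put together, a minimal sketch combining several of these parameters (values are illustrative; fetch_key simply reads the LLM_KEY variable this README has you export, and not every platform honors every parameter):

  require("llm").setup({
    url = "https://models.inference.ai.azure.com/chat/completions",
    model = "gpt-4o-mini",
    api_type = "openai",
    timeout = 30,       -- give up on a response after 30 seconds
    max_tokens = 4096,  -- cap the length of each response
    temperature = 0.3,  -- fairly deterministic output
    top_p = 0.7,
    fetch_key = function()
      return vim.env.LLM_KEY
    end,
  })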

keymaps

| Style | Keyname | Description | Default: [mode] keymap | Window |
| --- | --- | --- | --- | --- |
| float | Input:Submit | Submit your question | [i] ctrl+g | Input |
| float | Input:Cancel | Cancel the dialog response | [i] ctrl+c | Input |
| float | Input:Resend | Resend the dialog (regenerate the response) | [i] ctrl+r | Input |
| float | Input:HistoryNext | Select the next session history | [i] ctrl+j | Input |
| float | Input:HistoryPrev | Select the previous session history | [i] ctrl+k | Input |
| float | Input:ModelsNext | Select the next model | [i] ctrl+shift+j | Input |
| float | Input:ModelsPrev | Select the previous model | [i] ctrl+shift+k | Input |
| split | Output:Ask | Open the input box (in the input box's normal mode, press Enter to submit your question) | [n] i | Output |
| split | Output:Cancel | Cancel the dialog response | [n] ctrl+c | Output |
| split | Output:Resend | Resend the dialog (regenerate the response) | [n] ctrl+r | Output |
| float/split | Session:Toggle | Toggle the session | [n] <leader>ac | Input+Output |
| float/split | Session:Close | Close the session | [n] <esc> | float: Input+Output; split: Output |
| float/split | Session:Models | Open the model-list window | [n] ctrl+m | float: App input window; split: Output |
| split | Session:History | Open the history window (j: next, k: previous, <cr>: select, <esc>: close) | [n] ctrl+h | Output |
| float | Focus:Input | Jump from the output window to the input window | - | Output |
| float | Focus:Output | Jump from the input window to the output window | - | Input |
| float | PageUp | Output window page up | [n/i] ctrl+b | Output |
| float | PageDown | Output window page down | [n/i] ctrl+f | Output |
| float | HalfPageUp | Output window half-page up | [n/i] ctrl+u | Output |
| float | HalfPageDown | Output window half-page down | [n/i] ctrl+d | Output |
| float | JumpToTop | Jump to the top (output window) | [n] gg | Output |
| float | JumpToBottom | Jump to the bottom (output window) | [n] G | Output |
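
These defaults can be overridden in setup(). A sketch remapping a few of them is below; the exact schema (a "Keyname" mapped to a { mode, key } table) should be verified against the UI configuration examples:

  require("llm").setup({
    -- ... other options ...
    keys = {
      ["Input:Submit"]   = { mode = "i", key = "<C-g>" },
      ["Input:Cancel"]   = { mode = "i", key = "<C-c>" },
      ["Session:Toggle"] = { mode = "n", key = "<leader>ac" },
      ["Session:Close"]  = { mode = "n", key = "<esc>" },
    },
  })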

Tool

| Handler name | Description |
| --- | --- |
| side_by_side_handler | Display results in two windows, side by side |
| action_handler | Display results in the source file in the form of a diff |
| qa_handler | AI for single-round dialogue |
| flexi_handler | Results are displayed in a flexible window (window size is automatically calculated from the amount of output text) |
| disposable_ask_handler | Flexible questioning: select a piece of code to ask about, or ask directly (the current buffer is the context) |
| attach_to_chat_handler | Attach the selected content to the context and ask a question |
| completion_handler | Code completion |
| curl_request_handler | The simplest curl interaction with the LLM; generally used to query account balance, available model lists, etc. |

The parameters for each handler are documented here.

Examples can be found in AI Tools Configuration.
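
As a sketch, registering an AI tool that uses one of these handlers follows the same app_handler pattern as the local-LLM example below (the tool name OptimizeCode and its prompt are illustrative):

  local tools = require("llm.tools")
  require("llm").setup({
    app_handler = {
      OptimizeCode = {
        handler = tools.action_handler, -- display the result as a diff in the source file
        prompt = "Optimize the following code.",
      },
    },
  })

The tool is then invoked with :LLMAppHandler OptimizeCode.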

UI

See UI Configuration and nui/popup

⬆ back to top

Local LLM Configuration

Local LLMs require custom parsing functions: streaming output uses a custom streaming_handler, and AI tools that return their results in one go use a custom parse_handler.

Below is an example of ollama running llama3.2:1b.

local function local_llm_streaming_handler(chunk, ctx, F)
  -- ctx keeps parsing state across chunks; F provides helpers such as WriteContent.
  if not chunk then
    return ctx.assistant_output
  end
  -- ollama streams one JSON object per line, so accumulate until the
  -- buffered line ends with a closing brace before trying to decode it.
  ctx.line = ctx.line .. chunk
  if chunk:sub(-1) == "}" then
    local status, data = pcall(vim.fn.json_decode, ctx.line)
    if not status or not (data.message and data.message.content) then
      return ctx.assistant_output
    end
    ctx.assistant_output = ctx.assistant_output .. data.message.content
    F.WriteContent(ctx.bufnr, ctx.winid, data.message.content)
    ctx.line = ""
  end
  return ctx.assistant_output
end

-- Non-streaming responses arrive as a single decoded table; just extract the text.
local function local_llm_parse_handler(chunk)
  return chunk.message.content
end

return {
  {
    "Kurama622/llm.nvim",
    dependencies = { "nvim-lua/plenary.nvim", "MunifTanjim/nui.nvim" },
    cmd = { "LLMSessionToggle", "LLMSelectedTextHandler", "LLMAppHandler" },
    config = function()
      local tools = require("llm.tools")
      require("llm").setup({
        url = "http://localhost:11434/api/chat", -- your url
        model = "llama3.2:1b",

        streaming_handler = local_llm_streaming_handler,
        app_handler = {
          WordTranslate = {
            handler = tools.flexi_handler,
            prompt = "Translate the following text to Chinese, please only return the translation",
            opts = {
              parse_handler = local_llm_parse_handler,
              exit_on_move = true,
              enter_flexible_window = false,
            },
          },
        }
      })
    end,
    keys = {
      { "<leader>ac", mode = "n", "<cmd>LLMSessionToggle<cr>" },
    },
  }
}
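
With this configuration, visually selecting text and running :LLMAppHandler WordTranslate should translate the selection and display the result in a flexible window, assuming ollama is serving on localhost:11434.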

⬆ back to top

TODO List

[Screenshot: todo-list]

⬆ back to top

Author's configuration

plugins/llm

Acknowledgments

We would like to express our heartfelt gratitude to the contributors of the open-source projects listed in ACKNOWLEDGMENTS, whose code has provided invaluable inspiration and reference for the development of llm.nvim.

Special thanks

ACKNOWLEDGMENTS
