gpt_server is an open-source framework for production-grade deployment of LLMs, Embedding, Reranker, ASR, TTS, text-to-image, image-editing, and text-to-video models.

Python 242 21 Updated Dec 25, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,356 334 Updated Dec 29, 2025

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 17,804 1,369 Updated Dec 30, 2025

This repository is the official implementation of the ECAI 2024 conference paper SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM

Python 68 4 Updated Aug 13, 2024

MTEB: Massive Text Embedding Benchmark

Python 3,053 528 Updated Jan 3, 2026

Question and Answer based on Anything.

Python 13,803 1,328 Updated Mar 24, 2025

Production-ready platform for agentic workflow development.

Python 124,432 19,350 Updated Jan 2, 2026

Generating fake data for the JVM (Java, Kotlin, Groovy) has never been easier!

Java 1,719 226 Updated Dec 31, 2025

Building a MiniLLM from 0 to 1 (pretrain + SFT + DPO, work in progress)

Python 511 60 Updated Mar 23, 2025

ChatPilot: Chat Agent Web UI, a chat front end that supports Google search, chat over files and URLs (RAG), and a code interpreter; it reproduces Kimi Chat (drag in files, send URLs).

Svelte 588 60 Updated Jun 23, 2025

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 53,738 5,889 Updated Dec 30, 2025

HuatuoGPT2, One-stage Training for Medical Adaption of LLMs. (An Open Medical GPT)

Python 397 60 Updated Aug 30, 2024

Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)

Python 2,696 294 Updated Aug 14, 2024

Text similarity, semantic vectors, text vectors, text-similarity, similarity, sentence-similarity, BERT, SimCSE, BERT-Whitening, Sentence-BERT, PromCSE, SBERT

Python 75 13 Updated Nov 26, 2024

A Chinese NLP Preprocessing & Parsing Package: accurate, efficient, and easy to use. www.jionlp.com

Python 3,790 446 Updated Nov 27, 2025

RAG for Local LLM, chat with PDF/doc/txt files, ChatPDF. A pure native RAG implementation built on a local LLM, an embedding model, and a reranker model; supports GraphRAG and requires no third-party agent library.

Python 824 143 Updated Apr 2, 2025

RAG for Local LLM, chat with PDF/doc/txt files, ChatPDF

Python 1 Updated Jan 9, 2024

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Python 46,121 6,677 Updated Jan 2, 2026

OpenAI-style API for open large language models, using LLMs just like ChatGPT! Support for LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3 etc.…

Python 2,462 280 Updated Sep 26, 2024

A sentence-embedding approach that is more effective than Sentence-BERT

Python 375 25 Updated Nov 9, 2022

"ChatGPT Principles and Practice: Algorithms, Techniques, and Private Deployment of Large Language Models"

Python 369 77 Updated Dec 9, 2023

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Python 6,467 789 Updated Nov 7, 2025

Retrieval and Retrieval-augmented LLMs

Python 11,083 820 Updated Dec 15, 2025

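Similarities: a toolkit for similarity calculation and semantic search. Supports text-to-text, text-to-image, and image-to-image search over hundreds of millions of records; developed in Python 3 and usable out of the box.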

Python 891 90 Updated Oct 29, 2024

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Python 16,915 1,359 Updated Oct 6, 2025

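Llama Chinese community: a real-time roundup of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and available for commercial use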

Python 14,752 1,304 Updated Apr 6, 2025

A series of large language models developed by Baichuan Intelligent Technology

Python 4,123 292 Updated Nov 8, 2024

Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 6,625 588 Updated Oct 24, 2024

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large models, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.

Python 4,508 659 Updated Aug 30, 2025

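A repository for pretraining from scratch and then SFT-ing a small-parameter Chinese LLaMa2; it runs on a single 24 GB GPU and yields a chat-llama2 with basic Chinese question-answering ability.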

Python 2,887 348 Updated May 21, 2024