Lists (19)
C++/Linux
course
go
NLP-dedicated repos
A collection of interesting NLP-related experiments, plus NLP and development interview write-ups
Miscellaneous NLP picks
scratch
Distributed systems
Large models
Large models, resources, and data/development work that emerged after ChatGPT
Reinforcement learning
Emotion cause extraction
Inference optimization
Repos related to LLM inference optimization and deployment, such as TensorRT and similar tools
Dedicated to starring interesting open-source software around datasets
Web crawlers
A special collection of web crawlers, small applications, and corpus-scraping tools
NLP odds and ends that may be needed later but don't yet fit any other list
Training/inference engines
A collection of training-engine and inference-engine frameworks, papers, and repos
Stars
This project aims to replicate mainstream open-source model architectures with limited computational resources, implementing mini models with 100-200M parameters.
Official style files for papers submitted to venues of the Association for Computational Linguistics
Official implementation of Log-linear Sparse Attention (LLSA).
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
A collection of GPU experiments and benchmarks for my personal understanding and research.
Distributed MoE in a Single Kernel [NeurIPS '25]
https://bbuf.github.io/gpu-glossary-zh/
GRID: Generative Recommendation with Semantic IDs
Triton implementation of FlashAttention2 that adds Custom Masks.
Source-code walkthrough of the lightweight LLM MiniMind, covering the complete pipeline: tokenizer, RoPE, MoE, KV Cache, pretraining, SFT, LoRA, DPO, and more
Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical LLMs, implementing continued pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.
LLM training framework supporting the full modern training pipeline: base pretraining, long-context pretraining, chain-of-thought fine-tuning, reinforcement learning, instruction-mix fine-tuning, direct preference optimization, model weight merging, and web inference serving
Efficient Triton Kernels for LLM Training
Benchmarking code for running quantized kernels from vLLM and other libraries
Implement a reasoning LLM in PyTorch from scratch, step by step
🤗 Optimum ONNX: Export your model to ONNX and run inference with ONNX Runtime
[ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization"
QQQ is an innovative and hardware-optimized W4A8 quantization solution for LLMs.
A minimal PyTorch re-implementation of Qwen3 VL with a fancy CLI