duyuankai1992

Stars

50 stars written in Python

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 154,582 31,631 Updated Jan 5, 2026

meta-llama / llama

Inference code for Llama models

Python 59,019 9,812 Updated Jan 26, 2025

opendatalab / MinerU

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

Python 51,542 4,281 Updated Jan 4, 2026

xai-org / grok-1

Grok open release

Python 50,573 8,369 Updated Aug 30, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 38,466 4,183 Updated Dec 3, 2025

open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark

Python 32,243 9,835 Updated Aug 21, 2024

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,254 2,694 Updated Aug 12, 2024

PromtEngineer / localGPT

Chat with your documents on your local device using GPT models. No data leaves your device, so it stays 100% private.

Python 22,044 2,453 Updated Oct 2, 2025

huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 20,396 2,148 Updated Dec 18, 2025

datalab-to / surya

OCR, layout analysis, reading order, table recognition in 90+ languages

Python 19,067 1,309 Updated Oct 21, 2025

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 14,303 1,487 Updated Jan 4, 2026

zai-org / CogVideo

Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 12,290 1,238 Updated Nov 4, 2025

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 9,668 750 Updated Sep 22, 2025

YaoFANGUK / video-subtitle-extractor

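A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework covering subtitle region detection and subtitle content extraction.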

Python 8,254 850 Updated Aug 21, 2025

Morizeyao / GPT2-Chinese

Chinese version of GPT2 training code, using BERT tokenizer.

Python 7,607 1,699 Updated Apr 25, 2024

DA-southampton / NLP_ability

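A summary of the knowledge a natural language processing (NLP) engineer needs to build up, including interview questions, fundamentals, and engineering skills, to strengthen core competitiveness.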

Python 7,438 1,208 Updated Aug 24, 2022

kohya-ss / sd-scripts

Python 6,828 1,155 Updated Dec 21, 2025

zai-org / CogVLM

A state-of-the-art open visual language model | multimodal pretrained model

Python 6,713 450 Updated May 29, 2024

yangjianxin1 / Firefly

Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 6,626 588 Updated Oct 24, 2024