Starred repositories
tukuaiai / vibe-coding-cn
Forked from EnzeD/vibe-coding. My development experience + prompt dictionary = Vibecoding workstation.
Official inference library for pre-processing of Mistral models
🤗A PyTorch-native and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for DiTs.
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
Light Image Video Generation Inference Framework
A book for Learning the Foundations of LLMs
Roo Code gives you a whole dev team of AI agents in your code editor.
High-performance safetensors model loader
A framework for efficient model inference with omni-modality models
Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy.
An early-research-stage expert-parallel load balancer for MoE models based on linear programming.
ModelScope: bring the notion of Model-as-a-Service to life.
A unified inference and post-training framework for accelerated video generation.
Open Source Continuous Inference Benchmarking - GB200 NVL72 vs MI355X vs B200 vs H200 vs MI325X & soon™ TPUv6e/v7/Trainium2/3/GB300 NVL72 - DeepSeek 670B MoE, GPTOSS
The Intelligent Inference Scheduler for Large-scale Inference Services.
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
This is the code repository of paper "LightDSA: Enabling Efficient DSA Through Hardware-Aware Transparent Optimization"
Offline optimization of your disaggregated Dynamo graph
Distributed Compiler based on Triton for Parallel Systems
Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"
An extremely fast, scalable memory engine and app. The Memory API for the AI era.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels.