Stars
Implementing scalable LLMs in pure JAX (no third-party libraries)
OpenTUI is a library for building terminal user interfaces (TUIs)
Professional Antigravity account manager and switcher: one-click, seamless account switching for Antigravity Tools. Built with Tauri v2 + React (Rust).
A general and accurate MACs / FLOPs profiler for PyTorch models
A template for modern C++ projects using CMake, Clang-Format, CI, unit testing and more, with support for downstream inclusion.
🚀 Kick-start your C++! A template for modern C++ projects using CMake, CI, code coverage, clang-format, reproducible dependency management and much more.
git worktrees + tmux windows for zero-friction parallel dev
Pipeline Parallelism Emulation and Visualization
Accelerating MoE with IO and Tile-aware Optimizations
An implementation of the Debug Adapter Protocol for Python
Muon is an optimizer for hidden layers in neural networks
Tiny-FSDP, a minimalistic re-implementation of the PyTorch FSDP
[HPCA 2026] A GPU-optimized system for efficient long-context LLMs decoding with low-bit KV cache.
A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.
Ring attention implementation with flash attention
A cinematic Git commit replay tool for the terminal, turning your Git history into a living, animated story.
Remote vanilla PDB (over TCP sockets).
📰 Must-read papers and blogs on Speculative Decoding ⚡️
A tiny debugger implementing the GDB Remote Serial Protocol. Works on i386, x86_64, ARM, and PowerPC.
An early research stage expert-parallel load balancer for MoE models based on linear programming.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
Ideas for projects related to Tinker
Tutorial Exercises and Code for GPU Communications Tutorial at HOT Interconnects 2025
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
Allow torch tensor memory to be released and resumed later