Stars
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle ("飞桨") core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)
Optimized primitives for collective multi-GPU communication
"Hello Algorithms" (《Hello 算法》): a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. Simplified and Traditional Chinese editions are updated in sync; English version in translation
An easy-to-understand TensorOp Matmul tutorial
Distributed Compiler based on Triton for Parallel Systems
AKG (Auto Kernel Generator) is an optimizer for operators in deep learning networks; it can automatically fuse ops that match specific patterns.
vincentloechner / polylib
Forked from harenome/polylib. PolyLib official git.
An open-source AI agent that brings the power of Gemini directly into your terminal.
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…
Shared Middle-Layer for Triton Compilation
SGLang is a fast serving framework for large language models and vision language models.
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop, and server. TNN is distinguished by several outstanding features, including its…
High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
A model compilation solution for various hardware
Adlik: Toolkit for Accelerating Deep Learning Inference
Tile primitives for speedy kernels
Efficient Triton Kernels for LLM Training
Building blocks for foundation models.
GPU programming related news and material links