Google's Operations Research tools:

C++ 12,269 2,252 Updated Jul 26, 2025

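PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)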

C++ 23,065 5,774 Updated Jul 26, 2025

Optimized primitives for collective multi-GPU communication

C++ 3,889 974 Updated Jul 24, 2025
HTML 80 11 Updated Dec 2, 2022

Deep learning at the speed of light.

Rust 2,046 129 Updated Jul 27, 2025

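Hello 算法 (Hello Algo): a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart code. The Simplified and Traditional Chinese editions are updated in sync; English version in translation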

Java 114,757 14,198 Updated Jul 21, 2025

An Easy-to-understand TensorOp Matmul Tutorial

C++ 368 48 Updated Sep 21, 2024

The Modular Platform (includes MAX & Mojo)

Mojo 24,558 2,672 Updated Jul 26, 2025

Distributed Compiler based on Triton for Parallel Systems

Python 916 80 Updated Jul 25, 2025
C 5 2 Updated May 23, 2019

AKG (Auto Kernel Generator) is an optimizer for operators in Deep Learning Networks, which provides the ability to automatically fuse ops with specific patterns.

Python 230 41 Updated Jul 26, 2025

AMD ROCm™ Software - GitHub Home

Shell 5,494 457 Updated Jul 25, 2025

PolyLib official git.

C 9 4 Updated Jul 25, 2025

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 64,479 6,121 Updated Jul 27, 2025

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…

C++ 12,629 2,015 Updated Jul 25, 2025

Shared Middle-Layer for Triton Compilation

MLIR 259 71 Updated Jul 24, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 16,316 2,441 Updated Jul 27, 2025

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 17,312 3,359 Updated Jul 27, 2025

TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop, and server. TNN is distinguished by several outstanding features, including its…

C++ 4,555 771 Updated May 9, 2025

High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle

Python 3,424 574 Updated Jul 25, 2025

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.

C 6,879 1,577 Updated Jul 25, 2025

A model compilation solution for various hardware

MLIR 439 54 Updated Jul 24, 2025

Personal homepage of the Advanced Compiler Lab

C++ 117 10 Updated Apr 21, 2025

Adlik: Toolkit for Accelerating Deep Learning Inference

C++ 804 82 Updated Dec 27, 2023

Tile primitives for speedy kernels

Cuda 2,532 161 Updated Jul 27, 2025

Efficient Triton Kernels for LLM Training

Python 5,420 371 Updated Jul 24, 2025

Building blocks for foundation models.

519 25 Updated Jan 3, 2024

GPU programming related news and material links

1,630 90 Updated Jan 6, 2025