13 results for sponsorable starred repositories

JSON for Modern C++

C++ 48,426 7,279 Updated Jan 1, 2026

PyTorch Quantization Aware Training Example

Python 149 36 Updated May 18, 2024

CUTLASS and CuTe Examples

Cuda 115 13 Updated Nov 30, 2025

ONNX Runtime Inference C++ Example

C++ 257 57 Updated Apr 3, 2025

ONNX Python Examples

Dockerfile 16 6 Updated Sep 13, 2022

Quick and Self-Contained TensorRT Custom Plugin Implementation and Integration

C++ 77 12 Updated May 26, 2025

CUDA Matrix Multiplication Optimization

Cuda 250 24 Updated Jul 19, 2024

A General-purpose Task-parallel Programming System using Modern C++

C++ 11,571 1,349 Updated Jan 5, 2026

✍🏻 This is where I write my blog: Halfrost-Field (Land of Frost)

Go 13,267 1,892 Updated Dec 28, 2023

Python bindings for llama.cpp

Python 9,877 1,264 Updated Aug 15, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 66,887 12,401 Updated Jan 6, 2026

Accessible large language models via k-bit quantization for PyTorch.

Python 7,866 811 Updated Dec 12, 2025

Get down and dirty with FlashAttention 2.0 in PyTorch: plug and play, no complex CUDA kernels

Python 113 7 Updated Jul 31, 2023