- Official Repo
- Official Tutorial
- Triton-Puzzles
- fla: Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton.
- Attorch: A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
- Trident: A performance library for machine learning applications.
- triton-transformer: Implementation of a Transformer, but completely in Triton.
- unsloth
- Kernl
- Liger-Kernel
- efficient_cross_entropy
- Xformers kernels
- Flash Attention kernels
- FlagGems: An operator library for large language models implemented in the Triton language.
- applied-ai: Applied AI experiments and examples for PyTorch.
- FlagAttention: A collection of memory-efficient attention operators implemented in the Triton language.
- triton-activations: A collection of neural network activation function kernels written in OpenAI's Triton language.
- FLASHNN
- GPTQ-triton
- Accelerating Triton Dequantization Kernels for GPTQ
- Block-sparse matrix multiplication kernels
- bitsandbytes
- flex-attention
- conv
- lightning_attention
- int mm
- low-bit matmul kernels
- linear cross entropy
- optimized_hf_llama_class_for_training
- kernel-hyperdrive
- scattermoe
- triton-dejavu
- Triformer
- σ-MoE layer
- fast linear attn
This list is inspired by Awesome-Triton-Kernels.
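
For orientation, below is a minimal sketch of the kind of kernel the projects above build on, modeled after the element-wise addition example from the official tutorials. The kernel and launcher names are illustrative and not taken from any listed library.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the tail block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)


def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # Illustrative launcher: one program per 1024-element block.
    out = torch.empty_like(x)
    n_elements = out.numel()
    grid = (triton.cdiv(n_elements, 1024),)
    add_kernel[grid](x, y, out, n_elements, BLOCK_SIZE=1024)
    return out
```

The pattern of program IDs, block offsets, and masked loads/stores is the common core; the libraries listed above layer operator fusion, autotuning, and quantization on top of it.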