Stars
Efficient Triton Kernels for LLM Training
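PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)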
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A machine learning compiler for GPUs, CPUs, and ML accelerators
Enabling PyTorch on XLA Devices (e.g. Google TPU)
intelligent-machine-learning / xla
Forked from openxla/xla. A machine learning compiler for GPUs, CPUs, and ML accelerators
Tensors and Dynamic neural networks in Python with strong GPU acceleration
A blazingly fast multi-language serialization framework powered by JIT and zero-copy.
DLRover: An Automatic Distributed Deep Learning System
An Open Source Machine Learning Framework for Everyone
A unified framework for privacy-preserving data analysis and machine learning
DeepRec is a high-performance recommendation deep learning framework based on TensorFlow. It is hosted as an incubation project at the LF AI & Data Foundation.
Mobius is an AI infrastructure platform for distributed online learning, including online sample processing, training and serving.