🔥 AMD #DevDay 2025
🎉 Very glad (and lucky!) to win the #Radeon RX 9070 XT #GPU prize — huge thanks to the #AMD team!
🥥 Summary: From system-level optimization (vLLM, #SGLang) to model orchestration (#gpt-oss, #Gemma Nano, #Ollama) — an inspiring cross-section of how open-source #AI + AMD hardware co-evolve. Modern #LLM #infrastructure converges on a few core engineering patterns across serving, memory, #agentic #reasoning and #kernel stacks.
👉 Long article: https://lnkd.in/gHBHCJXy
🤝 Many thanks to the speakers and hands-on workshop hosts: Ion Stoica (University of California, Berkeley), Lin Qiao (Fireworks AI), Daniel Han (Unsloth AI), Simon Mo (vLLM), Robert Shaw (Red Hat), Linda Yang (Supermicro), Michael Chiang (Ollama), Dominik Kundel (OpenAI), Erwan Mendard (Crusoe), Tris Warkentin (Google DeepMind), Yineng Zhang (sgl-project), Driss Guessous (Meta), Jeffrey Daily (AMD), Zhenyu Gu (AMD), Sina Rafati (AMD), Simran Arora (Together AI, Stanford University), ComfyUI, Hugging Face.
🧩 Summary
📚 Layer/token–aware memory (#vLLM + #Jenga): two-level allocator with LCM-aligned blocks + fine-grained sub-allocator. Handles heterogeneous KV sizes, minimizes fragmentation, boosts GPU utilization (~4.9× throughput gain).
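💡 A minimal sketch of the two-level idea (hypothetical class/field names, not vLLM's actual Jenga code): level 1 hands out LCM-aligned pages, level 2 sub-allocates per-layer slots inside a page.

```python
from math import lcm

class TwoLevelKVAllocator:
    """Toy Jenga-style allocator. Page size = LCM of all per-layer KV-entry
    sizes, so every layer tiles a page exactly and sub-allocation leaves no
    internal fragmentation."""

    def __init__(self, layer_kv_bytes: dict[str, int], total_bytes: int):
        self.page_size = lcm(*layer_kv_bytes.values())     # level 1: LCM-aligned pages
        self.layer_kv_bytes = layer_kv_bytes
        self.free_pages = list(range(total_bytes // self.page_size))
        self.slots: dict[tuple[int, str], list[int]] = {}  # (page, layer) -> free offsets

    def alloc(self, layer: str) -> tuple[int, int]:
        """Return (page_id, byte_offset) for one KV entry of `layer`."""
        # Level 2: prefer a partially used page already dedicated to this layer.
        for (page, lyr), free in self.slots.items():
            if lyr == layer and free:
                return page, free.pop()
        if not self.free_pages:
            raise MemoryError("out of KV pages")
        page = self.free_pages.pop()
        # page_size % entry_size == 0 by construction, so the page tiles exactly.
        free = list(range(0, self.page_size, self.layer_kv_bytes[layer]))
        self.slots[(page, layer)] = free
        return page, free.pop()

# Heterogeneous KV sizes (e.g. full-attention vs. sliding-window layers)
# share one pool without fragmenting it:
alloc = TwoLevelKVAllocator({"full_attn": 512, "sliding": 128}, total_bytes=1 << 20)
print(alloc.alloc("sliding"))   # -> (page_id, byte_offset)
```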
⭐️ Distributed inference (#llm-d): inference gateway + prefix-aware routing + prefill/decode disaggregation. RDMA-based KV transfer, prefix indices (Bloom filters), expert-parallel load balancing (EPLB), and 5-stage decode pipeline enable predictable SLOs and lower TTFT.
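💡 The prefix-routing trick in ~30 lines (a sketch with made-up names; llm-d's inference gateway is far richer): replicas advertise Bloom filters of their cached prefix blocks, and the router sends each request to the replica with the longest contiguous match.

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter; production gateways use tuned implementations."""
    def __init__(self, m: int = 1 << 16, k: int = 4):
        self.m, self.k, self.bits = m, k, bytearray(m // 8)

    def _hashes(self, item: bytes):
        for i in range(self.k):
            digest = hashlib.blake2b(item, salt=bytes([i]), digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add(self, item: bytes):
        for h in self._hashes(item):
            self.bits[h // 8] |= 1 << (h % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[h // 8] & (1 << (h % 8)) for h in self._hashes(item))

def route(prompt_blocks: list[bytes], replicas: dict[str, BloomFilter]) -> str:
    """Pick the replica advertising the longest contiguous cached prefix."""
    def prefix_hits(f: BloomFilter) -> int:
        n = 0
        for block in prompt_blocks:   # prefix reuse must be contiguous from block 0
            if block not in f:
                break
            n += 1
        return n
    return max(replicas, key=lambda name: prefix_hits(replicas[name]))

# Replica "a" cached the first two prompt blocks; "b" cached nothing.
fa, fb = BloomFilter(), BloomFilter()
blocks = [b"block0", b"block1", b"block2"]
for blk in blocks[:2]:
    fa.add(blk)
print(route(blocks, {"a": fa, "b": fb}))  # -> "a"
```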
⏰ #Quantization & model structure (#gpt-oss): Harmony prompt schema and MXFP4 4-bit float quantization of the MoE weights cut the memory footprint (the 120B model fits in 80 GB). Requires ROCm-level kernel tuning (AITER + fused AllReduce).
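💡 The 80 GB claim follows from the format's arithmetic: MXFP4 stores 4-bit (E2M1) elements with one shared 8-bit scale per 32-element block, i.e. ~4.25 bits/weight. A rough back-of-envelope check (attention and embedding layers typically stay at higher precision, so real accounting differs):

```python
params = 120e9                   # ~120B parameters, MoE weights dominating
block = 32                       # MXFP4: one shared 8-bit (E8M0) scale per 32 elements
bits_per_param = 4 + 8 / block   # 4-bit E2M1 element + amortized scale = 4.25 bits
weights_gb = params * bits_per_param / 8 / 1e9
print(f"{weights_gb:.1f} GB")    # ~63.8 GB of weights -> headroom for KV cache on 80 GB
```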
🌉 #Hierarchical KV caching (#SGLang / DeepSeek / #HiCache / #Mooncake): GPU–host–remote tiers with GPU-assisted radix indexing and RDMA prefetch; achieves 6× throughput, 84% TTFT reduction. Deterministic kernels via CUDA graph replay ensure reproducible #RL.
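💡 Conceptually, a tiered lookup with promotion (a toy sketch with illustrative names, not HiCache's API; the real stacks index prefixes with radix trees and move tensors over RDMA rather than dict lookups):

```python
from enum import Enum, auto

class Tier(Enum):
    GPU = auto()
    HOST = auto()
    REMOTE = auto()

class TieredKVCache:
    """Toy GPU -> host -> remote KV cache."""
    def __init__(self):
        self.tiers = {t: {} for t in Tier}

    def get(self, prefix_hash: str):
        # Probe hot to cold; promote cold hits toward the GPU so decode
        # kernels read from device memory (the "prefetch" step).
        for tier in (Tier.GPU, Tier.HOST, Tier.REMOTE):
            kv = self.tiers[tier].get(prefix_hash)
            if kv is not None:
                if tier is not Tier.GPU:
                    self.tiers[Tier.GPU][prefix_hash] = kv
                return kv
        return None  # miss on all tiers: full prefill required

    def put(self, prefix_hash: str, kv, tier: Tier = Tier.GPU):
        self.tiers[tier][prefix_hash] = kv
```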
🏙️ Hardware/software co-design (AITER, Primus-Turbo, CK, Triton): fused GEMM/attention kernels, FP8/FP4 quantization, token-routing fusion, and Uneven PP (pipeline parallelism) for fine-grained #parallelism. Kernel-level tuning is now as critical as model design.
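💡 Fusion in one picture: a minimal Triton kernel (Triton targets ROCm GPUs through the same Python API) doing bias-add + GELU in a single memory pass. A sketch of the technique, not an AITER/CK kernel:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def fused_bias_gelu(x_ptr, b_ptr, out_ptr, n, BLOCK: tl.constexpr):
    # One fused HBM pass: load, bias-add, GELU, store. Unfused, the
    # intermediate (x + b) would round-trip through device memory.
    offs = tl.program_id(0) * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n
    x = tl.load(x_ptr + offs, mask=mask) + tl.load(b_ptr + offs, mask=mask)
    tl.store(out_ptr + offs, x * tl.sigmoid(1.702 * x), mask=mask)  # fast-GELU approx

# ROCm builds of PyTorch expose AMD GPUs under the "cuda" device name.
x, b = torch.randn(4096, device="cuda"), torch.randn(4096, device="cuda")
out = torch.empty_like(x)
fused_bias_gelu[(triton.cdiv(x.numel(), 1024),)](x, b, out, x.numel(), BLOCK=1024)
```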
🍃 AGIKIT & auto-tuning: emerging pipelines that benchmark, modify, and redeploy kernels automatically — agentic optimization loop for workload-specific GPU performance.
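💡 The shape of that loop, stripped to its core (hypothetical helper names; the real pipelines add agentic codegen and correctness checks):

```python
import time

def autotune(kernel_factory, configs, run_workload, iters: int = 20):
    """Benchmark each kernel variant on the target workload, keep the fastest.

    kernel_factory(**cfg) -> callable kernel   (the "modify" step)
    run_workload(kernel)  -> runs one batch    (the "benchmark" step)
    """
    best_cfg, best_t = None, float("inf")
    for cfg in configs:
        kernel = kernel_factory(**cfg)
        run_workload(kernel)                      # warm-up / JIT compile
        t0 = time.perf_counter()
        for _ in range(iters):
            run_workload(kernel)
        t = (time.perf_counter() - t0) / iters
        if t < best_t:
            best_cfg, best_t = cfg, t
    return best_cfg, best_t                       # the "redeploy" candidate

# e.g. autotune(make_gemm, [{"BLOCK": 64}, {"BLOCK": 128}], run_batch)
```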
⚙️ 2025 Topics
Jenga-style allocators + per-layer KV metrics.
Wire vLLM KV connector + IGW prefix routing into serving.
Split prefill/decode with RDMA KV path.
Validate MXFP4/FP8 quant on #CoT/function-calling tests.
Benchmark ROCm + AITER + Primus early; kernel correctness = performance.
Use deterministic kernels for reproducible RL (minimal PyTorch setup below).
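💡 For that last item, the baseline PyTorch knobs are standard (serving stacks layer deterministic attention/sampling kernels on top of these):

```python
import os
import torch

# CUDA builds need this for deterministic cuBLAS GEMMs; harmless elsewhere.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
torch.manual_seed(0)
torch.use_deterministic_algorithms(True)  # raise on any nondeterministic op
torch.backends.cudnn.benchmark = False    # no autotuner run-to-run variance
```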
📈 Takeaway
Modern LLM infrastructure treats kernels, caches, and quantization formats as first-class product surfaces. Next up: observability for KV-cache layers, prefill/decode (P/D) RDMA prototypes, quantization-sensitivity evals, and integrated kernel auto-tuning loops.
Read more: https://lnkd.in/geSMr9G6